Course Overview
Interpreters: on-line (速度慢, 占用内存大)
Compilers: off-line
data:image/s3,"s3://crabby-images/1cb77/1cb77a007839b8d43257083a0e375766fec38ecb" alt=""
Compiler Five Phases:
-
Lexical Analysis: syntactic
recognize words
-
Parsing: syntactic: syntactic
diagramming sentences
-
Semantic Analysis: types and scopes
understand meaning (hard), catch inconsistencies
无法做到真正的解析语义, 所以只是寻找程序中自相矛盾的地方, 比如: 变量重复定义, 变量前后类型不同
-
Optimization
run faster, use less memory, less power, network, database…
-
Code Generation: translation to another languages, machine code or even high level programming language
编译器的提升:
data:image/s3,"s3://crabby-images/80dc8/80dc866aa659926afe50e52a206e5d6d2d8779e6" alt=""
The Course Project
COOL: Classroom Object Oriented Language
COOL
的语法
-
main 的返回值
-
继承
-
递归和递推
-
init 函数
-
一些特性 (用 Object 模拟泛型)
环境配置
这个的环境配置还是很简单的, 官方是提供了一个系统镜像
-
根据自己电脑版本下载Virtual Box 并安装
-
下载系统镜像 (这里可能需要一点魔法)
-
点击镜像中的
.vbox
文件直接运行即可 -
进入到桌面, 打开终端 (左下第四个图标) 进行测试
-
写个 hello.cl,
coolc
编译出 hello.s,spim
执行 hello.s -
测试成功
Lexical Analysis
Lexical Analysis
每个lexeme被分为不同的token classes
data:image/s3,"s3://crabby-images/7ac37/7ac37c7d8e0178d821988804317aa057dc2e016a" alt=""
词法分析器接收, 生成类似于键值对的 token 二元组 <type - value>, 然后发送给解析器
data:image/s3,"s3://crabby-images/134fa/134fa5208c1ff4ac6319c814d05860b8321679f6" alt=""
过程总结:
data:image/s3,"s3://crabby-images/6eaa0/6eaa0b1789b29134ca3aec81bd3e73ed64a255d8" alt=""
Lexical Analysis Examples
Fortran 的一个例子 (因为历史原因, Fortran 中的空格不重要):
DO 5 I = 1,25 // 从此句到有标签 5 的句子做循环, I 是变量, 从 1 加到 25
DO 5 I = 1.25 // 声明一个变量 DO5I, 值为 1.25
在词法分析中虽然会扫描到关键字 DO
, 但只有扫描到后面的 ,
才能确定这是循环的关键字
data:image/s3,"s3://crabby-images/ba50e/ba50e4694a8179cccc10860ff34964d71936fd30" alt=""
lookahead: 根据后面的词语来确定当前词语的作用
Regular Language
data:image/s3,"s3://crabby-images/42a93/42a93bc8550e412bb57b282fd7db79e8bdaf1286" alt=""
总而言之, 正则语言是一个强大的工具
Formal Language
data:image/s3,"s3://crabby-images/b1dc0/b1dc0e56ace7629f632d6a1608792f3c23c61e2a" alt=""
data:image/s3,"s3://crabby-images/f393a/f393ac635cb28dc6b8b31233bd5267483f19a16c" alt=""
将语义和语法分离, 不同语法会有相同语义, 二者之间的映射关系必须是多对一
Lexical Specifications
data:image/s3,"s3://crabby-images/5836f/5836fa8dd45ec5ccc4aec976e3b02f3280a96a39" alt=""
data:image/s3,"s3://crabby-images/198c4/198c472a1126e30c9e22d9b496a7416daa808230" alt=""
data:image/s3,"s3://crabby-images/115f8/115f8e8b939204ec01a9b546d5578e657763bca0" alt=""
data:image/s3,"s3://crabby-images/7cb2e/7cb2e2559b2859958512df8591a8b76512432108" alt=""
Implementation of Lexical Analysis
Lexical Specification
data:image/s3,"s3://crabby-images/af666/af6661308c9d7550611cfaa7cb7eaef7b3ffb998" alt=""
- 对于每个表达式用正则去匹配;
- 如果匹配到了多个正则, 则按照优先级进行识别; 如果没有则判断为一个错误
Finite Automata
基本介绍:
DFA 和 NFA:
Regular Expressions into NFAs
词法分析器:
转换示例:
一个例子:
NFA to DFA
方法:
例子:
Implementing Finite Automata
data:image/s3,"s3://crabby-images/0a7c0/0a7c0a5b123024a0b4d36af9621f408cee3198fd" alt=""
优化一下 (压缩空间):
NFA
需要对每个状态都进行一次搜索, 所以就很慢:
DFA
和 NFA
的权衡:
DFA | faster | less compact |
NFA | slower | concise |
Introduction to Parsing
Introduction to Parsing
Parsing:
data:image/s3,"s3://crabby-images/b7874/b7874f428c9d71534c7ba09cf84780f6fdb31ddb" alt=""
data:image/s3,"s3://crabby-images/a8683/a86838e3ea07f9e37d98bb4477c2ddfad9a09eb4" alt=""
data:image/s3,"s3://crabby-images/68526/6852689a8dd2e9693e7b64b67dedfe9cecebc0ed" alt=""
Context Free Grammars
data:image/s3,"s3://crabby-images/2060f/2060f95776df428497acf662f3f1b0e2b82c4ea9" alt=""
data:image/s3,"s3://crabby-images/ac492/ac492e9c906231e17a3555bc173bad6cefe74039" alt=""
Derivations
Parse Tree:
data:image/s3,"s3://crabby-images/165f2/165f2bd5b228744d979e5e50fe4146c81352c123" alt=""
最左推导:
data:image/s3,"s3://crabby-images/725d4/725d4e6c2e37222a6e3c15dec25e73aaec216967" alt=""
data:image/s3,"s3://crabby-images/57f5d/57f5d4c86c004d9c6e92b49b19b31896ec36299b" alt=""
最右推导:
data:image/s3,"s3://crabby-images/e50a9/e50a9da88ee56e4c70d8cdb3ef5d53a13397ecc6" alt=""
Ambiguity
data:image/s3,"s3://crabby-images/6e33d/6e33d6d2c864272e32a63b0a9ae8f1e1fdee447a" alt=""
Syntax-Directed Translation
Error Handling
error kinds:
error detect:
error correction:
Abstract Syntax Trees
data:image/s3,"s3://crabby-images/74ea8/74ea8ca465c16324626bc97640308d1fdd345cba" alt=""
正常的语法树:
AST(Abstract Syntax Tree):
Recursive Descent Parsing
不匹配, 回溯:
匹配成功:
Recursive Descent Algorithm
从上到下, 从左到右
data:image/s3,"s3://crabby-images/0da59/0da598f3c3f337edddfdeaf29b4b609af4829f1d" alt=""
data:image/s3,"s3://crabby-images/b8dec/b8dec779005b9b449bad7231ff341fc09bbbddbd" alt=""
一个完整的例子, 实际上整个过程就是 dfs:
Recursive Descent Limitations
没有遍历所有情况, 所以可能会有提前识别的风险
如上图, int * int
在被识别到第一个 int
后, 指针后移, 于是变成了识别 int
和 * int
, 识别错误
Left Recursion
Recursive Descent Algorithm
是从左到右进行分析, 而左递归是从右到左产生语言, 无法使用该算法分析, 所以要将左递归转换为右递归
转换方法:
总结:
Top-Down Parsing & Bottom-Up Parsing I
Predictive Parsing
data:image/s3,"s3://crabby-images/c73af/c73af843fffbb611bfebc3346aa7925b441842b3" alt=""
LL(K): 从左到右扫描, 最左生成, 向前看 K 个 tokens(此章节中 K=1)
像之前的 $T \rightarrow T+E | T$ 的例子中, $T+E$ 和 $T$ 的第一个 token 相同, 所以无法用 LL(1) 语法预测, 所以引入了 left-factored
语法 (提取左边的公因式):
data:image/s3,"s3://crabby-images/7127b/7127b90c5d85bc875d130deaa77511b9d65f6610" alt=""
leftmost non-terminal
: 最左非终结符
next input token
: 下一个输入的 token
rhs of production to use
: 生成式的右边部分
data:image/s3,"s3://crabby-images/207fa/207faa2d22c94ce4b1dbc52a11c5a16b1882a0fc" alt=""
使用 LL(1) parsing table 的算法:
一个例子:
First Sets
data:image/s3,"s3://crabby-images/82b6f/82b6f42cdfc62734dcf6de39f4e9dd0de9d39c12" alt=""
data:image/s3,"s3://crabby-images/04910/04910d93bf987a1472a83c95ddc23f617b3c6dc3" alt=""
终结符的 First Set 是他本身, 所以 $First(t) = {t} \dots$
$E$ 根据 LL(1) 会识别到 $T$, 所以 $First(E) \supset First(T) $, 但是因为 $E$ 没有生成新的 (区别于 $First(T)$ ) 终结符, 所以 $First(E) = First(T)$
Follow Sets
data:image/s3,"s3://crabby-images/ddbaf/ddbafa8c276a8ad3d1365d367b7eb83b8e71f2d8" alt=""
data:image/s3,"s3://crabby-images/c2e0d/c2e0d2c0f8088e793ce1612f652e9221975bcb53" alt=""
data:image/s3,"s3://crabby-images/3eaf9/3eaf99903f8ecfd219135212faed9458f4cb1eda" alt=""
LL(1) Parsing Tables
data:image/s3,"s3://crabby-images/ee39f/ee39f64b782e8bdac8118c48fa1c0256054ad4f0" alt=""
data:image/s3,"s3://crabby-images/90b2e/90b2e701e80be3695919f7449f663f282a97ae07" alt=""
data:image/s3,"s3://crabby-images/84f51/84f5153a00af5305533ea8a36e87e352bc02ea51" alt=""
Bottom Up Parsing
rightmost derivation
data:image/s3,"s3://crabby-images/5d03f/5d03f3f4f1299ce9b4a1f5d6203c6e2892c54a78" alt=""
data:image/s3,"s3://crabby-images/904e2/904e28b16ac6a6883e2f80d93c10852c3f6fa8de" alt=""
Shift Reduce Parsing
data:image/s3,"s3://crabby-images/77a98/77a98fbe636037e9cd41d27b439c1a9a1524deeb" alt=""
| 表示输入的位置
data:image/s3,"s3://crabby-images/d9948/d9948cdbf9c0c0bfea8c5659ba1ec40347d7a4bb" alt=""
lhs: left hand side 推导式的左边部分
rhs: right hand side 推导式的右边部分
Bottom-Up Parsing II
Handlers
data:image/s3,"s3://crabby-images/d9724/d97246f0a828aa24f4ff34b254bc0ec3657bc16a" alt=""
data:image/s3,"s3://crabby-images/b7937/b79377ef7cd093f25e63a7597e2479f97112f600" alt=""
Recognizing Handlers
data:image/s3,"s3://crabby-images/db1c2/db1c2ea5f1903f8c2253b849201ed79fd4574045" alt=""
data:image/s3,"s3://crabby-images/8a786/8a78625bffaacda0d2783576901d75695a0915a6" alt=""
data:image/s3,"s3://crabby-images/5dbbf/5dbbf2bcf57aeeff87972c9a496512bcdf3bb297" alt=""
data:image/s3,"s3://crabby-images/01563/01563513b4eb9a91e270435671442a4f07fca0d5" alt=""
data:image/s3,"s3://crabby-images/00ff3/00ff39f7d23f232ecf20bed325c54880e2ea248f" alt=""
Recognizing Viable Prefixes
data:image/s3,"s3://crabby-images/f90af/f90af42361ebc61cf915892547130c94a681927c" alt=""
data:image/s3,"s3://crabby-images/c2964/c2964aa89fcb4b3d3486a8e42b9b4b9dae9da8d0" alt=""
Valid Items
data:image/s3,"s3://crabby-images/8c76a/8c76ab5f7b598b73ee5f8519f74fef661381d5cc" alt=""
data:image/s3,"s3://crabby-images/17800/17800babfa01bc0447bc23887854a6e345458a02" alt=""
SLR Parsing
data:image/s3,"s3://crabby-images/33640/33640e4694843bd117113cfbb14b785d491318d4" alt=""
data:image/s3,"s3://crabby-images/85805/85805c951164a081645075a2aabaf319470dc7b7" alt=""
reduce: 规约, 生成式右到左执行
shift: 移入, 生成式左到右执行
data:image/s3,"s3://crabby-images/b27b1/b27b14c46be56cfdad93fb010bf06af0f264b8b1" alt=""
SLR Parsing Example
data:image/s3,"s3://crabby-images/a5cfb/a5cfb6d9251dc98c28984dc6f86c14acf6d0e9a3" alt=""
data:image/s3,"s3://crabby-images/4cd3c/4cd3cb35fad78d6ac77ded82fd6879a1bbab8b96" alt=""
red : reduce
SLR Improvements
data:image/s3,"s3://crabby-images/cfeda/cfeda26a69cb4a0409ac2171f6533b73b9d911b5" alt=""
data:image/s3,"s3://crabby-images/881fa/881fa45cf398fe96171738dbd03304c7a8e10cb9" alt=""
data:image/s3,"s3://crabby-images/a5d53/a5d5397106ec55d8c588e95736966a1b93b6139c" alt=""
data:image/s3,"s3://crabby-images/78913/789135290d0a42c66c1fd845219d3c9edb48933e" alt=""
data:image/s3,"s3://crabby-images/0f0ad/0f0ad7f3bf1a4bf0ac5292c732138d26bdb86342" alt=""
data:image/s3,"s3://crabby-images/2ef28/2ef281ebe6b9e6fcc11165399d1221e236d7e1dc" alt=""
SLR Examples
data:image/s3,"s3://crabby-images/e77e2/e77e2a5941f7bfa34d18875ce7ad1a1362fc0669" alt=""
data:image/s3,"s3://crabby-images/7add7/7add7719097555dc3edf97fe92934013b223ba63" alt=""