# 跟我一起自己写编译器-5.10.LR语法分析

## 5.10. LR语法分析

LR语法分析本质上为从左到右自底向上算法，从左到右一个一个读入字符，然后按照产生式进行规约，直到规约出文法开始的符号。

## 5.10.1. LR算法

LR语法分析最重要的就是移入和规约。下面举一个例子来理解移入和规约。

``````graph LR
number("number(1)")
``````

``````graph TD
pro("PRODUCTION") --> number("number(1)")
``````

``````graph TD
pro("PRODUCTION") --> number("number(1)")
+
``````

``````graph TD
pro("PRODUCTION") --> number("number(1)")
+
number2("number(2)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
number3("number(3)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
n4("number(4)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
n5("number(5)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
proSum2(PRODUCTION) --> pro4 & add2 & pro5
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
proSum2(PRODUCTION) --> pro4 & add2 & pro5
right(")")
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
proSum2(PRODUCTION) --> pro4 & add2 & pro5
right(")")
proLeftRight(PRODUCTION) --> left & proSum2 & right
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
proSum2(PRODUCTION) --> pro4 & add2 & pro5
right(")")
proLeftRight(PRODUCTION) --> left & proSum2 & right
pro2LeftRight --> proLeftRight & proMum & mul2
``````

``````graph TD
pro1("PRODUCTION") --> number("number(1)")
+
pro2("PRODUCTION") --> number2("number(2)")
*
pro3("PRODUCTION") --> number3("number(3)")
proMum(PRODUCTION) --> pro2 & pro3 & *
mul2("*")
left("(")
pro4(PRODUCTION) --> n4("number(4)")
pro5(PRODUCTION) -->n5("number(5)")
proSum2(PRODUCTION) --> pro4 & add2 & pro5
right(")")
proLeftRight(PRODUCTION) --> left & proSum2 & right
pro2LeftRight --> proLeftRight & proMum & mul2
summm(PRODUCTION) --> pro1 & + & pro2LeftRight
``````

## 5.10.3. SLR算法介绍

SLR算法是最简单的LR算法，首先当然是需要计算FIRST集和FOLLOW集的。然后就是计算增广文法，这里涉及到项集的概念。

## 5.10.4. SLR项集

``````graph TD
I1("SUM -> · number<br/>SUM -> · SUM + number")
I2("SUM -> number ·")
I3("SUM -> SUM · + number")
I4("SUM -> SUM + · number")
I5("SUM -> SUM + number ·")
I1 -->|number| I2
I1 -->|SUM| I3
I3 -->|+| I4
I4 -->|number| I5
``````

``````graph TD
I1("状态1<br/>PRODUCTION -> · SUM<br/>PRODUCTION -> · MUL<br/>SUM -> · PRODUCTION + MUL<br/>MUL -> · MUL * number<br/>MUL -> · number")
I2("状态2<br/>MUL -> number ·")
I3("状态3<br/>PRODUCTION -> SUM · ")
I4("状态4<br/>PRODUCTION -> MUL · <br/>MUL -> MUL · * number")
I5("状态5<br/>SUM -> PRODUCTION · + MUL")
I6("状态6<br/>MUL -> MUL * · number")
I7("状态7<br/>MUL -> MUL * number · ")
I8("状态8<br/>SUM -> PRODUCTION + · MUL<br/>MUL -> · MUL * number<br/>MUL -> · number")
I9("状态9<br/>SUM -> PRODUCTION + MUL ·<br/>MUL -> MUL · * number")
I10("状态10<br/>MUL -> number ·")
I1 -->|number| I2
I1 -->|SUM| I3
I1 -->|MUL| I4
I1 -->|PODUCTION| I5
I4 -->|*| I6
I6 -->|number| I7
I5 -->|+| I8
I8 -->|MUL| I9
I8 -->|number| I10
I9 -->|*| I6

``````

## 5.10.5. SLR表生成算法

1 s2
2
3
4 s6
5 s8
6 s7
7
8 s10
9 s6
10

1 s2 goto3 goto4 goto5
2
3
4 s6
5 s8
6 s7
7
8 s10 goto9
9 s6
10

1 s2 goto3 goto4 goto5
2
3 ACC
4 s6 ACC
5 s8
6 s7
7
8 s10 goto9
9 s6
10

1 s2 goto3 goto4 goto5
2 r(MUL->number) r(MUL->number) r(MUL->number)
3 ACC
4 s6 ACC
5 s8
6 s7
7
8 s10 goto9
9 s6
10

1 s2 goto3 goto4 goto5
2 r(MUL->number) r(MUL->number) r(MUL->number)
3 r(PRODUCTION -> SUM) ACC
4 r(PRODUCTION -> MUL) s6 ACC
5 s8
6 s7
7 r(MUL -> MUL * number ) r(MUL -> MUL * number ) r(MUL -> MUL * number )
8 s10 goto9
9 r(SUM -> PRODUCTION + MUL) s6 r(SUM -> PRODUCTION + MUL)
10 r(MUL->number) r(MUL->number) r(MUL->number)

1 最开始有一个状态栈，其中包含状态1

`[栈底,1]` ` [栈底]`
2 从状态栈顶读取状态

`[栈底,1,2]` `[栈底,number1]`
3 从状态栈顶读取状态

`[栈底,1,4]` `[栈底,MUL]`
4 从状态栈顶读取状态

`[栈底,1,5]` `[栈底,PRODUCTION]`
5 从状态栈顶读取状态

6 从状态栈顶读取状态

7 从状态栈顶读取状态

8 从状态栈顶读取状态

9 从状态栈顶读取状态

,读入字符`number3`， 执行`s7` `[栈底,1,5,8,9,6,7]` ` [栈底,PRODUCTION,+,MUL,*,number3]`
10 从状态栈顶读取状态

`[栈底,1,5,8,9] ` `[栈底,PRODUCTION,+,MUL]`
11 从状态栈顶读取状态

`[栈底,1,3] ` `[栈底,SUM]`
12 从状态栈顶读取状态

## 5.10.6 SLR算法伪代码

1. 构造文法的规范LR(0) 项集簇 C = \${I_0, I_1, … I_n}\$.
2. 对于状态i，
① 如果[A->α·aβ]在状态中，a为终结符且GOTO(i,a)=j， 则ACTION[i,a]为: “移入j”
② 如果[A->α·]在状态中且A不为开始符号，则对于所有的FOLLOW(A)的字符a，将ACTION[i,a]设为: “规约A->α”
③ 如果[A->α·]在状态中且A为开始符号，则将ACTION[i,\$]设为: “接受”
3. 对于状态i，如果A为非终结符且 GOTO(i,A)=j 则GOTO[i,A]=j
4. 剩下的地方就是报错 5. 起点是[S’->·S]

## 5.10.8 LR1 增广文法

LR1在SLR的基础上继续进行增广，继续引入了end集合，下面继续回忆一下文法和项集

``````graph TD
I1("状态1<br/>PRODUCTION -> · SUM<br/>PRODUCTION -> · MUL<br/>SUM -> · PRODUCTION + MUL<br/>MUL -> · MUL * number<br/>MUL -> · number")
I2("状态2<br/>MUL -> number ·")
I3("状态3<br/>PRODUCTION -> SUM · ")
I4("状态4<br/>PRODUCTION -> MUL · <br/>MUL -> MUL · * number")
I5("状态5<br/>SUM -> PRODUCTION · + MUL")
I6("状态6<br/>MUL -> MUL * · number")
I7("状态7<br/>MUL -> MUL * number · ")
I8("状态8<br/>SUM -> PRODUCTION + · MUL<br/>MUL -> · MUL * number<br/>MUL -> · number")
I9("状态9<br/>SUM -> PRODUCTION + MUL ·<br/>MUL -> MUL · * number")
I10("状态10<br/>MUL -> number ·")
I1 -->|number| I2
I1 -->|SUM| I3
I1 -->|MUL| I4
I1 -->|PODUCTION| I5
I4 -->|*| I6
I6 -->|number| I7
I5 -->|+| I8
I8 -->|MUL| I9
I8 -->|number| I10
I9 -->|*| I6

``````

`SUM -> · PRODUCTION + MUL`得出，`PRODUCTION -> · SUM`后可为`+`

``````graph TD
I1("状态1<br/>PRODUCTION -> · SUM, \$/+<br/>PRODUCTION -> · MUL, \$/+<br/>SUM -> · PRODUCTION + MUL, \$/+<br/>MUL -> · MUL * number, \$/+<br/>MUL -> · number, *")
I2("状态2<br/>MUL -> number · , *")
I3("状态3<br/>PRODUCTION -> SUM · , \$/+")
I4("状态4<br/>PRODUCTION -> MUL · , \$/+<br/>MUL -> MUL · * number, \$/+")
I5("状态5<br/>SUM -> PRODUCTION · + MUL, \$/+")
I6("状态6<br/>MUL -> MUL * · number, \$/+")
I7("状态7<br/>MUL -> MUL * number ·, \$/+")
I8("状态8<br/>SUM -> PRODUCTION + · MUL, \$/+<br/>MUL -> · MUL * number, \$/+<br/>MUL -> · number, *")
I9("状态9<br/>SUM -> PRODUCTION + MUL ·, \$/+<br/>MUL -> MUL · * number, \$/+")
I10("状态10<br/>MUL -> number ·, *")
I1 -->|number| I2
I1 -->|SUM| I3
I1 -->|MUL| I4
I1 -->|PODUCTION| I5
I4 -->|*| I6
I6 -->|number| I7
I5 -->|+| I8
I8 -->|MUL| I9
I8 -->|number| I10
I9 -->|*| I6

``````

## 5.10.9 LR1表生成算法伪代码

1. 构造文法的规范LR(0) 项集簇 C = \${I_0, I_1, … I_n}\$.
2. 对于状态i，
① 如果[A->α·aβ]在状态中，a为终结符且GOTO(i,a)=j， 则ACTION[i,a]为: “移入j”
② 如果[A->α·]在状态中且A不为开始符号，则对于此时的增广文法中的end集的字符a，将ACTION[i,a]设为: “规约A->α”
③ 如果[A->α·]在状态中且A为开始符号，则将ACTION[i,\$]设为: “接受”
3. 对于状态i，如果A为非终结符且 GOTO(i,A)=j 则GOTO[i,A]=j
4. 剩下的地方就是报错 5. 起点是[S’->·S]

## 5.10.12 总结

http://fightinggg.github.io/fluid/QV7MPS.html

fightinggg

2021年6月24日