{"id":21,"date":"2020-08-12T11:45:27","date_gmt":"2020-08-12T03:45:27","guid":{"rendered":"http:\/\/47.108.144.50\/?p=21"},"modified":"2020-08-13T20:38:12","modified_gmt":"2020-08-13T12:38:12","slug":"llvm%e5%ae%98%e6%96%b9%e4%b8%87%e8%8a%b1%e7%ad%92%e6%95%99%e7%a8%8b1-lexer%ef%bc%8cparse-%ef%bc%8cast","status":"publish","type":"post","link":"http:\/\/pareto.fun\/?p=21","title":{"rendered":"LLVM Frontend: Official Kaleidoscope Tutorial 1 - Lexer, Parser, AST"},"content":{"rendered":"\n<p><strong>Many tutorials ship passes that have problems, and I don't know how to fix them myself, so the goal of this series is to learn to write LLVM passes, mainly by following the tutorial on the official LLVM site. The official tutorial uses <strong>Kaleidoscope<\/strong>, a home-made language with an LLVM front end, as its example.<\/strong><\/p>\n\n\n\n<h1 class=\"wp-block-heading\">The Kaleidoscope Language<\/h1>\n\n\n\n<p>Kaleidoscope is a procedural language that lets you define functions, use conditionals, do math, and so on. Over the course of the tutorial we extend Kaleidoscope to support the if\/then\/else construct, loops, user-defined operators, JIT compilation with a simple command-line interface, debug info, and more.<\/p>\n\n\n\n<p>We want to keep things simple, so the only data type in Kaleidoscope is a 64-bit floating-point type (double in C). All values are implicitly double precision, and the language needs no type declarations. This gives Kaleidoscope a friendly, simple syntax. For example, the following computes Fibonacci numbers:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># Compute the x'th fibonacci number.<br>def fib(x)<br>  if x &lt; 3 then<br> &nbsp;  1<br>  else<br> &nbsp;  fib(x-1)+fib(x-2)<br><br># This expression will compute the 40th number.<br>fib(40)<\/pre>\n\n\n\n<p>The LLVM JIT also makes it easy for Kaleidoscope to call standard library functions: you can use the &#8216;extern&#8217; keyword to declare a function before you use it. For example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">extern sin(arg);<br>extern cos(arg);<br>extern atan2(arg1 arg2);<br><br>atan2(sin(.4), cos(42))<\/pre>\n\n\n\n<p>A more interesting example appears in Chapter 6, where we write a small Kaleidoscope application that displays the Mandelbrot set at various levels of magnification.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">The Lexer<\/h1>\n\n\n\n<p>When implementing a language, the first thing you need is the ability to process the text and recognize what it says. The traditional way to do this is to use a &#8220;lexer&#8221; to break the input up into &#8220;tokens&#8221;. Each token returned by the lexer includes a token code and some metadata.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">\/\/ The lexer returns tokens [0-255] if it is an unknown character, otherwise one<br>\/\/ of these for known things.<br>enum Token {<br>  tok_eof = -1,<br><br>  \/\/ commands<br>  tok_def = -2,<br>  tok_extern = -3,<br><br>  \/\/ primary<br>  tok_identifier = -4,<br>  tok_number = -5,<br>};<br><br>static std::string IdentifierStr; \/\/ Filled in if tok_identifier<br>static double NumVal; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \/\/ Filled in if tok_number<\/pre>\n\n\n\n<p>Each token returned by the lexer is either one of the Token enum values or an &#8220;unknown&#8221; character such as &#8220;+&#8221;, which is returned as its ASCII value. If the current token is an identifier (tok_identifier, i.e. -4), the global variable IdentifierStr holds the identifier's name. If the current token is a number (tok_number), the global variable NumVal holds its value. We use global variables for simplicity, but that is not a good choice for a real language implementation.<\/p>\n\n\n\n<p>The actual implementation of the lexer is a single simple function named gettok. Each call to gettok returns the next token from standard input. Its definition starts as:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">\/\/ gettok - Return the next token from standard input.<br>static int gettok() {<br>  static int LastChar = ' ';<br><br>  \/\/ Skip any whitespace.<br>  while (isspace(LastChar))<br> &nbsp;  LastChar = getchar();<\/pre>\n\n\n\n<p>gettok works by calling the C function getchar() to read characters one at a time from standard input. It consumes characters as it recognizes them and stores the last character read, but not yet processed, in LastChar. The first thing it has to do is ignore whitespace between tokens, which the loop above accomplishes.<\/p>\n\n\n\n<p>The next thing gettok needs to do is recognize identifiers and specific keywords such as &#8220;def&#8221;. Kaleidoscope does this with the following loop:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">if (isalpha(LastChar)) { \/\/ identifier: [a-zA-Z][a-zA-Z0-9]*<br>  IdentifierStr = LastChar;<br>  while (isalnum((LastChar = getchar())))<br> &nbsp;  IdentifierStr += LastChar;<br><br>  if (IdentifierStr == \"def\")<br> &nbsp;  return tok_def;<br>  if (IdentifierStr == \"extern\")<br> &nbsp;  return tok_extern;<br>  return tok_identifier;<br>}<\/pre>\n\n\n\n<p>Note that this code sets the global &#8220;IdentifierStr&#8221; whenever it lexes an identifier. In addition&#8230;<\/p>\n\n\n\n<p>&#8230; Partway through Chapter 2 I finally couldn't keep reading; going straight to the code is more my style.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">#include \"llvm\/ADT\/STLExtras.h\"<br>#include &lt;algorithm&gt;<br>#include &lt;cctype&gt;<br>#include &lt;cstdio&gt;<br>#include &lt;cstdlib&gt;<br>#include 
&lt;map&gt;<br>#include &lt;memory&gt;<br>#include &lt;string&gt;<br>#include &lt;vector&gt;<br><br>\/\/===----------------------------------------------------------------------===\/\/<br>\/\/ Lexer<br>\/\/===----------------------------------------------------------------------===\/\/<br><br>\/\/ The lexer returns tokens [0-255] if it is an unknown character, otherwise one<br>\/\/ of these for known things.<br>enum Token {<br>  tok_eof = -1,<br><br>  \/\/ commands<br>  tok_def = -2,<br>  tok_extern = -3,<br><br>  \/\/ primary<br>  tok_identifier = -4,<br>  tok_number = -5<br>};<br><br>static std::string IdentifierStr; \/\/ Filled in if tok_identifier<br>static double NumVal; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \/\/ Filled in if tok_number<br><br>\/\/\/ gettok - Return the next token from standard input.<br>\/\/ Reads the input and records each token together with its associated value.<br>static int gettok() {<br>  static int LastChar = ' ';<br><br>  \/\/ Skip any whitespace.<br>  while (isspace(LastChar))<br> &nbsp;  LastChar = getchar();<br><br>  if (isalpha(LastChar)) { \/\/ identifier: [a-zA-Z][a-zA-Z0-9]*<br> &nbsp;  IdentifierStr = LastChar;<br> &nbsp;  while (isalnum((LastChar = getchar())))<br> &nbsp; &nbsp;  IdentifierStr += LastChar;<br><br> &nbsp;  if (IdentifierStr == \"def\")<br> &nbsp; &nbsp;  return tok_def;<br> &nbsp;  if (IdentifierStr == \"extern\")<br> &nbsp; &nbsp;  return tok_extern;<br> &nbsp;  return tok_identifier;<br>  }<br><br>  if (isdigit(LastChar) || LastChar == '.') { \/\/ Number: [0-9.]+<br> &nbsp;  std::string NumStr;<br> &nbsp;  do {<br> &nbsp; &nbsp;  NumStr += LastChar;<br> &nbsp; &nbsp;  LastChar = getchar();<br> &nbsp;  } while (isdigit(LastChar) || LastChar == '.');<br><br> &nbsp;  NumVal = strtod(NumStr.c_str(), nullptr);<br> &nbsp;  return tok_number;<br>  }<br><br>  if (LastChar == '#') {<br> &nbsp;  \/\/ Comment until end of line.<br> 
&nbsp;  do<br> &nbsp; &nbsp;  LastChar = getchar();<br> &nbsp;  while (LastChar != EOF &amp;&amp; LastChar != '\\n' &amp;&amp; LastChar != '\\r');<br><br> &nbsp;  if (LastChar != EOF)<br> &nbsp; &nbsp;  return gettok();<br>  }<br><br>  \/\/ Check for end of file.  Don't eat the EOF.<br>  if (LastChar == EOF)<br> &nbsp;  return tok_eof;<br><br>  \/\/ Otherwise, just return the character as its ascii value.<br>  int ThisChar = LastChar;<br>  LastChar = getchar();<br>  return ThisChar;<br>}<br><br>\/\/===----------------------------------------------------------------------===\/\/<br>\/\/ Abstract Syntax Tree (aka Parse Tree)<br>\/\/===----------------------------------------------------------------------===\/\/<br><br>namespace {<br><br>\/\/\/ ExprAST - Base class for all expression nodes.<br>class ExprAST {<br>public:<br>  virtual ~ExprAST() = default;<br>};<br><br>\/\/\/ NumberExprAST - Expression class for numeric literals like \"1.0\".<br>class NumberExprAST : public ExprAST {<br>  double Val;<br><br>public:<br>  NumberExprAST(double Val) : Val(Val) {}<br>};<br><br>\/\/\/ VariableExprAST - Expression class for referencing a variable, like \"a\".<br>class VariableExprAST : public ExprAST {<br>  std::string Name;<br><br>public:<br>  VariableExprAST(const std::string &amp;Name) : Name(Name) {}<br>};<br><br>\/\/\/ BinaryExprAST - Expression class for a binary operator.<br>class BinaryExprAST : public ExprAST {<br>  char Op;<br>  std::unique_ptr&lt;ExprAST&gt; LHS, RHS;<br><br>public:<br>  BinaryExprAST(char Op, std::unique_ptr&lt;ExprAST&gt; LHS,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;  std::unique_ptr&lt;ExprAST&gt; RHS)<br> &nbsp; &nbsp;  : Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}<br>};<br><br>\/\/\/ CallExprAST - Expression class for function calls.<br>class CallExprAST : public ExprAST {<br>  std::string Callee;<br>  
std::vector&lt;std::unique_ptr&lt;ExprAST&gt;&gt; Args;<br><br>public:<br>  CallExprAST(const std::string &amp;Callee,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;  std::vector&lt;std::unique_ptr&lt;ExprAST&gt;&gt; Args)<br> &nbsp; &nbsp;  : Callee(Callee), Args(std::move(Args)) {}<br>};<br><br>\/\/\/ PrototypeAST - This class represents the \"prototype\" for a function,<br>\/\/\/ which captures its name, and its argument names (thus implicitly the number<br>\/\/\/ of arguments the function takes).<br>class PrototypeAST {<br>  std::string Name;<br>  std::vector&lt;std::string&gt; Args;<br><br>public:<br>  PrototypeAST(const std::string &amp;Name, std::vector&lt;std::string&gt; Args)<br> &nbsp; &nbsp;  : Name(Name), Args(std::move(Args)) {}<br><br>  const std::string &amp;getName() const { return Name; }<br>};<br><br>\/\/\/ FunctionAST - This class represents a function definition itself.<br>class FunctionAST {<br>  std::unique_ptr&lt;PrototypeAST&gt; Proto;<br>  std::unique_ptr&lt;ExprAST&gt; Body;<br><br>public:<br>  FunctionAST(std::unique_ptr&lt;PrototypeAST&gt; Proto,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;  std::unique_ptr&lt;ExprAST&gt; Body)<br> &nbsp; &nbsp;  : Proto(std::move(Proto)), Body(std::move(Body)) {}<br>};<br><br>} \/\/ end anonymous namespace<br><br>\/\/===----------------------------------------------------------------------===\/\/<br>\/\/ Parser<br>\/\/===----------------------------------------------------------------------===\/\/<br><br>\/\/\/ CurTok\/getNextToken - Provide a simple token buffer.  CurTok is the current<br>\/\/\/ token the parser is looking at.  
getNextToken reads another token from the<br>\/\/\/ lexer and updates CurTok with its results.<br>static int CurTok;<br>static int getNextToken() { return CurTok = gettok(); }<br><br>\/\/\/ BinopPrecedence - This holds the precedence for each binary operator that is<br>\/\/\/ defined.<br>static std::map&lt;char, int&gt; BinopPrecedence;<br><br>\/\/\/ GetTokPrecedence - Get the precedence of the pending binary operator token.<br>static int GetTokPrecedence() {<br>  if (!isascii(CurTok))<br> &nbsp;  return -1;<br><br>  \/\/ Make sure it's a declared binop.<br>  int TokPrec = BinopPrecedence[CurTok];<br>  if (TokPrec &lt;= 0)<br> &nbsp;  return -1;<br>  return TokPrec;<br>}<br><br>\/\/\/ LogError* - These are little helper functions for error handling.<br>std::unique_ptr&lt;ExprAST&gt; LogError(const char *Str) {<br>  fprintf(stderr, \"Error: %s\\n\", Str);<br>  return nullptr;<br>}<br>std::unique_ptr&lt;PrototypeAST&gt; LogErrorP(const char *Str) {<br>  LogError(Str);<br>  return nullptr;<br>}<br><br>static std::unique_ptr&lt;ExprAST&gt; ParseExpression();<br><br>\/\/\/ numberexpr ::= number<br>static std::unique_ptr&lt;ExprAST&gt; ParseNumberExpr() {<br>  auto Result = std::make_unique&lt;NumberExprAST&gt;(NumVal);<br>  getNextToken(); \/\/ consume the number<br>  return std::move(Result);<br>}<br><br>\/\/\/ parenexpr ::= '(' expression ')'<br>static std::unique_ptr&lt;ExprAST&gt; ParseParenExpr() {<br>  getNextToken(); \/\/ eat (.<br>  auto V = ParseExpression();<br>  if (!V)<br> &nbsp;  return nullptr;<br><br>  if (CurTok != ')')<br> &nbsp;  return LogError(\"expected ')'\");<br>  getNextToken(); \/\/ eat ).<br>  return V;<br>}<br><br>\/\/\/ identifierexpr<br>\/\/\/ &nbsp; ::= identifier<br>\/\/\/ &nbsp; ::= identifier '(' expression* ')'<br>static std::unique_ptr&lt;ExprAST&gt; ParseIdentifierExpr() {<br>  std::string IdName = IdentifierStr;<br><br>  getNextToken(); \/\/ eat identifier.<br><br>  if 
(CurTok != '(') \/\/ Simple variable ref.<br> &nbsp;  return std::make_unique&lt;VariableExprAST&gt;(IdName);<br><br>  \/\/ Call.<br>  getNextToken(); \/\/ eat (<br>  std::vector&lt;std::unique_ptr&lt;ExprAST&gt;&gt; Args;<br>  if (CurTok != ')') {<br> &nbsp;  while (true) {<br> &nbsp; &nbsp;  if (auto Arg = ParseExpression())<br> &nbsp; &nbsp; &nbsp;  Args.push_back(std::move(Arg));<br> &nbsp; &nbsp;  else<br> &nbsp; &nbsp; &nbsp;  return nullptr;<br><br> &nbsp; &nbsp;  if (CurTok == ')')<br> &nbsp; &nbsp; &nbsp;  break;<br><br> &nbsp; &nbsp;  if (CurTok != ',')<br> &nbsp; &nbsp; &nbsp;  return LogError(\"Expected ')' or ',' in argument list\");<br> &nbsp; &nbsp;  getNextToken();<br> &nbsp;  }<br>  }<br><br>  \/\/ Eat the ')'.<br>  getNextToken();<br><br>  return std::make_unique&lt;CallExprAST&gt;(IdName, std::move(Args));<br>}<br><br>\/\/\/ primary<br>\/\/\/ &nbsp; ::= identifierexpr<br>\/\/\/ &nbsp; ::= numberexpr<br>\/\/\/ &nbsp; ::= parenexpr<br>static std::unique_ptr&lt;ExprAST&gt; ParsePrimary() {<br>  switch (CurTok) {<br>  default:<br> &nbsp;  return LogError(\"unknown token when expecting an expression\");<br>  case tok_identifier:<br> &nbsp;  return ParseIdentifierExpr();<br>  case tok_number:<br> &nbsp;  return ParseNumberExpr();<br>  case '(':<br> &nbsp;  return ParseParenExpr();<br>  }<br>}<br><br>\/\/\/ binoprhs<br>\/\/\/ &nbsp; ::= ('+' primary)*<br>static std::unique_ptr&lt;ExprAST&gt; ParseBinOpRHS(int ExprPrec,<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;  std::unique_ptr&lt;ExprAST&gt; LHS) {<br>  \/\/ If this is a binop, find its precedence.<br>  while (true) {<br> &nbsp;  int TokPrec = GetTokPrecedence();<br><br> &nbsp;  \/\/ If this is a binop that binds at least as tightly as the current binop,<br> &nbsp;  \/\/ consume it, otherwise we are done.<br> &nbsp;  if (TokPrec &lt; 
ExprPrec)<br> &nbsp; &nbsp;  return LHS;<br><br> &nbsp;  \/\/ Okay, we know this is a binop.<br> &nbsp;  int BinOp = CurTok;<br> &nbsp;  getNextToken(); \/\/ eat binop<br><br> &nbsp;  \/\/ Parse the primary expression after the binary operator.<br> &nbsp;  auto RHS = ParsePrimary();<br> &nbsp;  if (!RHS)<br> &nbsp; &nbsp;  return nullptr;<br><br> &nbsp;  \/\/ If BinOp binds less tightly with RHS than the operator after RHS, let<br> &nbsp;  \/\/ the pending operator take RHS as its LHS.<br> &nbsp;  int NextPrec = GetTokPrecedence();<br> &nbsp;  if (TokPrec &lt; NextPrec) {<br> &nbsp; &nbsp;  RHS = ParseBinOpRHS(TokPrec + 1, std::move(RHS));<br> &nbsp; &nbsp;  if (!RHS)<br> &nbsp; &nbsp; &nbsp;  return nullptr;<br> &nbsp;  }<br><br> &nbsp;  \/\/ Merge LHS\/RHS.<br> &nbsp;  LHS = std::make_unique&lt;BinaryExprAST&gt;(BinOp, std::move(LHS),<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; std::move(RHS));<br>  }<br>}<br><br>\/\/\/ expression<br>\/\/\/ &nbsp; ::= primary binoprhs<br>\/\/\/<br>static std::unique_ptr&lt;ExprAST&gt; ParseExpression() {<br>  auto LHS = ParsePrimary();<br>  if (!LHS)<br> &nbsp;  return nullptr;<br><br>  return ParseBinOpRHS(0, std::move(LHS));<br>}<br><br>\/\/\/ prototype<br>\/\/\/ &nbsp; ::= id '(' id* ')'<br>static std::unique_ptr&lt;PrototypeAST&gt; ParsePrototype() {<br>  if (CurTok != tok_identifier)<br> &nbsp;  return LogErrorP(\"Expected function name in prototype\");<br><br>  std::string FnName = IdentifierStr;<br>  getNextToken();<br><br>  if (CurTok != '(')<br> &nbsp;  return LogErrorP(\"Expected '(' in prototype\");<br><br>  std::vector&lt;std::string&gt; ArgNames;<br>  while (getNextToken() == tok_identifier)<br> &nbsp;  ArgNames.push_back(IdentifierStr);<br>  if (CurTok != ')')<br> &nbsp;  return LogErrorP(\"Expected ')' in prototype\");<br><br>  \/\/ success.<br> 
 getNextToken(); \/\/ eat ')'.<br><br>  return std::make_unique&lt;PrototypeAST&gt;(FnName, std::move(ArgNames));<br>}<br><br>\/\/\/ definition ::= 'def' prototype expression<br>static std::unique_ptr&lt;FunctionAST&gt; ParseDefinition() {<br>  fprintf(stdout, \"[+] print %s\\n\", IdentifierStr.c_str());<br>  \/\/ std::cout &lt;&lt; IdentifierStr &lt;&lt; std::endl;<br>  getNextToken(); \/\/ eat def.<br>  fprintf(stdout, \"[+] print %s\\n\", IdentifierStr.c_str());<br>  auto Proto = ParsePrototype();<br>  if (!Proto)<br> &nbsp;  return nullptr;<br><br>  if (auto E = ParseExpression())<br> &nbsp;  return std::make_unique&lt;FunctionAST&gt;(std::move(Proto), std::move(E));<br>  return nullptr;<br>}<br><br>\/\/\/ toplevelexpr ::= expression<br>static std::unique_ptr&lt;FunctionAST&gt; ParseTopLevelExpr() {<br>  if (auto E = ParseExpression()) {<br> &nbsp;  \/\/ Make an anonymous proto.<br> &nbsp;  auto Proto = std::make_unique&lt;PrototypeAST&gt;(\"__anon_expr\",<br> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; std::vector&lt;std::string&gt;());<br> &nbsp;  return std::make_unique&lt;FunctionAST&gt;(std::move(Proto), std::move(E));<br>  }<br>  return nullptr;<br>}<br><br>\/\/\/ external ::= 'extern' prototype<br>static std::unique_ptr&lt;PrototypeAST&gt; ParseExtern() {<br>  getNextToken(); \/\/ eat extern.<br>  return ParsePrototype();<br>}<br><br>\/\/===----------------------------------------------------------------------===\/\/<br>\/\/ Top-Level parsing<br>\/\/===----------------------------------------------------------------------===\/\/<br><br>static void HandleDefinition() {<br>  if (ParseDefinition()) {<br> &nbsp;  fprintf(stderr, \"Parsed a function definition.\\n\");<br>  } else {<br> &nbsp;  \/\/ Skip token for error recovery.<br> &nbsp;  getNextToken();<br>  }<br>}<br><br>static void 
HandleExtern() {<br>  if (ParseExtern()) {<br> &nbsp;  fprintf(stderr, \"Parsed an extern\\n\");<br>  } else {<br> &nbsp;  \/\/ Skip token for error recovery.<br> &nbsp;  getNextToken();<br>  }<br>}<br><br>static void HandleTopLevelExpression() {<br>  \/\/ Evaluate a top-level expression into an anonymous function.<br>  if (ParseTopLevelExpr()) {<br> &nbsp;  fprintf(stderr, \"Parsed a top-level expr\\n\");<br>  } else {<br> &nbsp;  \/\/ Skip token for error recovery.<br> &nbsp;  getNextToken();<br>  }<br>}<br><br>\/\/\/ top ::= definition | external | expression | ';'<br>static void MainLoop() {<br>  while (true) {<br> &nbsp;  fprintf(stderr, \"ready&gt; \");<br> &nbsp;  switch (CurTok) {<br> &nbsp;  case tok_eof:<br> &nbsp; &nbsp;  return;<br> &nbsp;  case ';': \/\/ ignore top-level semicolons.<br> &nbsp; &nbsp;  getNextToken();<br> &nbsp; &nbsp;  break;<br> &nbsp;  case tok_def:<br> &nbsp; &nbsp;  HandleDefinition();<br> &nbsp; &nbsp;  break;<br> &nbsp;  case tok_extern:<br> &nbsp; &nbsp;  HandleExtern();<br> &nbsp; &nbsp;  break;<br> &nbsp;  default:<br> &nbsp; &nbsp;  HandleTopLevelExpression();<br> &nbsp; &nbsp;  break;<br> &nbsp;  }<br>  }<br>}<br><br>\/\/===----------------------------------------------------------------------===\/\/<br>\/\/ Main driver code.<br>\/\/===----------------------------------------------------------------------===\/\/<br><br>int main() {<br>  \/\/ Install standard binary operators.<br>  \/\/ 1 is lowest precedence.<br>  BinopPrecedence['&lt;'] = 10;<br>  BinopPrecedence['+'] = 20;<br>  BinopPrecedence['-'] = 20;<br>  BinopPrecedence['*'] = 40; \/\/ highest.<br><br>  \/\/ Prime the first token.<br>  fprintf(stderr, \"ready&gt; \");<br>  getNextToken();<br><br>  \/\/ Run the main \"interpreter loop\" now.<br>  MainLoop();<br><br>  return 0;<br>}<\/pre>\n\n\n\n<p>Compile command (adjust it for your own setup): <code>$LLVM_HOME\/bin\/clang++ -g -O3 toy.cpp -std=c++14 `llvm-config --ldflags --libs` -lpthread -I$LLVM_HOME\/include -ldl -lz -ltinfo<\/code><\/p>\n\n\n\n<p>What is an abstract syntax tree? My own simple understanding (not necessarily correct) is that it is a tree that stores an expression, or a block of syntax; the more complex the expression, the more complex the tree. Reading the code directly shows that, besides end-of-file, the lexer defines four kinds of tokens: def, extern, identifier, and number. Parsing starts by analyzing a statement, which can be an identifier, a number, or an expression.<\/p>\n\n\n\n<p>Take A(1+1) as an example:<\/p>\n\n\n\n<p>The execution flow is HandleTopLevelExpression -&gt; ParseTopLevelExpr -&gt; ParseExpression -&gt; ParsePrimary -&gt; ParseIdentifierExpr -&gt; ParseExpression -&gt; ParsePrimary -&gt; ParseNumberExpr -&gt; ParseBinOpRHS -&gt; ParsePrimary -&gt; ParseNumberExpr -&gt; ParseBinOpRHS (one iteration). Look at the control flow first, without worrying about how the syntax tree is built. A(1+1) is treated as one expression in which A is an identifier, and 1+1 is then treated as an expression in its own right. After parsing the first 1, the parser meets &#8220;+&#8221;, checks the operator's precedence, and parses the next token, which is 1. It then checks the precedence of the token after that; because that token is &#8220;)&#8221;, GetTokPrecedence returns -1, so the &#8220;1 + 1&#8221; expression is merged and returned. The result is passed back up, level by level, to ParseIdentifierExpr, where it becomes the argument of the call to &#8220;A&#8221;. The assembled syntax tree is returned and parsing of the expression is finished.<\/p>\n\n\n\n<p>Next, look at which AST nodes are involved. ParseIdentifierExpr returns a CallExprAST node, which is constructed from the function name Callee and the list of argument expressions (here a single BinaryExprAST). At first I took the argument to be a NumberExprAST, a class whose only member holds the numeric value. Heh, but a second look shows that &#8220;1+1&#8221; does not come back from ParseNumberExpr as a plain number: because the parser meets &#8220;+&#8221; and the token after it is valid, it returns a BinaryExprAST node instead. The final tree is:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"521\" height=\"267\" src=\"http:\/\/47.108.144.50\/wp-content\/uploads\/2020\/08\/image.png\" alt=\"\" class=\"wp-image-24\" srcset=\"http:\/\/pareto.fun\/wp-content\/uploads\/2020\/08\/image.png 521w, http:\/\/pareto.fun\/wp-content\/uploads\/2020\/08\/image-300x154.png 300w\" sizes=\"(max-width: 521px) 100vw, 521px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p><a href=\"https:\/\/llvm.org\/docs\/tutorial\/MyFirstLanguageFrontend\/index.html\">https:\/\/llvm.org\/docs\/tutorial\/MyFirstLanguageFrontend\/index.html<\/a> My First Language Frontend<\/p><\/blockquote>\n\n\n\n<p>A small aside: printing a std::string raised an error. I forget the exact message, but the cause was a missing debug-info package for libstdc++; installing it fixed the problem.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">sudo apt install libstdc++6-8-dbg<\/pre>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many tutorials ship passes that have problems, and I don't know how to fix them myself, so the goal of this series is to learn to write LLVM passes, mainly [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[4],"_links":{"self":[{"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/posts\/21"}],"collection":[{"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/pareto.fun\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=21"}],"version-history":[{"count":7,"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/posts\/21\/revisions"}],"predecessor-version":[{"id":34,"href":"http:\/\/pareto.fun\/index.php?rest_route=\/wp\/v2\/posts\/21\/revisions\/34"}],"wp:attachment":[{"href":"http:\/\/pareto.fun\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=21"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/pareto.fun\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=21"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/pareto.fun\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=21"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}