Friday, June 17, 2005

A simple evaluating parser

Continuing on from the lexer from the last day, here's a simple parser. The parser grammar is as follows:
expression : simpleExpression EOF ;
simpleExpression : term ( PLUS term | MINUS term )* ;
term : factor ( ASTERISK factor | SLASH factor ) ;
factor :
    MINUS factor
  |  NUMBER
  |  LPAREN simpleExpression RPAREN
  ;
Writing a simple recursive descent parser for this grammar is easy enough. Every rule turns into a function. The function follows the grammar in order to match it, using tokens from the lexer to make decisions. To make this parser useful, every rule will return the evaluation of the expression so far. So, the factor rule will return either a number, or else it will recurse into factor or simpleExpression as appropriate.

Here's the parser:

using System;
using System.Text;

namespace Calc
{
    class Parser
    {
        private Lexer _lexer;
       
        public Parser(Lexer lexer)
        {
            _lexer = lexer;
        }
       
        private Lexer Lexer
        {
            get { return _lexer; }
        }
       
        private void Expect(TokenKind kind)
        {
            if (Lexer.Kind != kind)
                throw new ApplicationException(
                    string.Format("Expected {0}, found {1}.", kind, Lexer.Kind));
        }
       
        private void Eat(TokenKind kind)
        {
            Expect(kind);
            Lexer.NextToken();
        }
       
        public double Expression()
        {
            double result = SimpleExpression(false);
            Eat(TokenKind.Eof);
            return result;
        }
       
        private double SimpleExpression(bool skip)
        {
            double result = Term(skip);
            for (;;)
                switch (Lexer.Kind)
                {
                    case TokenKind.Plus: result += Term(true); break;
                    case TokenKind.Minus: result -= Term(true); break;
                    default:
                        return result;
                }
        }
       
        private double Term(bool skip)
        {
            double result = Factor(skip);
            for (;;)
                switch (Lexer.Kind)
                {
                    case TokenKind.Asterisk: result *= Factor(true); break;
                    case TokenKind.Slash: result /= Factor(true); break;
                    default:
                        return result;
                }
        }
       
        private double Factor(bool skip)
        {
            if (skip)
                Lexer.NextToken();
            switch (Lexer.Kind)
            {
                case TokenKind.Number:
                {
                    double result = Lexer.TokenAsDouble;
                    Lexer.NextToken();
                    return result;
                }
               
                case TokenKind.LParen:
                {
                    double result = SimpleExpression(true);
                    Eat(TokenKind.RParen);
                    return result;
                }
               
                case TokenKind.Minus:
                    return - Factor(true);
               
                default:
                    throw new ApplicationException("Unexpected token: " + Lexer.Kind);
            }
        }
       
        static void Main(string[] args)
        {
            try
            {
                StringBuilder source = new StringBuilder();
                foreach (string arg in args)
                    source.Append(arg).Append(' ');
                if (args.Length > 0)
                    source.Length -= 1;
                Lexer lexer = new Lexer(source.ToString());
                Parser parser = new Parser(lexer);
                Console.WriteLine("Result: {0}", parser.Expression());
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine(ex);
            }
        }
    }
}
Remove the "static void Main" function from the lexer, and recompile with
csc Lexer.cs Parser.cs
and execute:
C:\> parser 45
Result: 45

C:\> parser 45 - 24
Result: 21

C:\> parser 2 * (3 + 4)
Result: 14
Next time, I'll talk about how simple stack-based virtual machines work, and after that how the parser can be modified to produce code for virtual machines.

No comments: