Expression Language

Parsers are responsible for enforcing a language’s syntax rules, which are specified in a grammar. There are numerous tools, known as parser generators, which read a grammar and output a corresponding parser. For this project, we have chosen the Jison parser generator (a Bison equivalent for JavaScript) because it outputs JavaScript parsers which can easily be used on the web. In addition to enforcing syntax, a parser is also responsible for constructing an abstract syntax tree (AST) representation of a program.

What is Jison?

Jison is a utility which takes a context-free grammar as input and outputs a JavaScript file capable of parsing the language described by that grammar. You can then use the generated script to parse inputs and accept, reject, or perform actions based on the input. (The syntax is similar to Yacc/Bison.)

Installing Jison

Jison is easy to install.

npm install jison -g

Defining the Grammar

The basic structure of a Jison grammar file is as follows:


%{
Javascript Declarations
%}

Jison Declarations

%%
Grammar Rules
%%

Additional Javascript code
  • Javascript Declarations : This contains all the function declarations to be used in the grammar.

  • Jison Declarations : The Jison declarations section contains declarations that define terminal and nonterminal symbols, specify precedence, and so on.

  • Grammar Rules : The grammar rules section contains one or more Jison grammar rules, and nothing else. There must always be at least one grammar rule, and the first ‘%%’ (which precedes the grammar rules) may never be omitted even if it is the first thing in the file.

  • Additional Javascript Code : The additional Javascript code section is copied verbatim to the end of the parser file, just as the Javascript declarations section is copied to the beginning. This is the most convenient place to put anything that you want to have in the parser file.

Phases of Expression Parsing

There are two phases of parsing :

  • Parsing the expression language syntax and generating the AST.
  • Performing a build pass on the AST to generate valid Javascript.
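
The two phases can be illustrated with a toy example. The sketch below is not the project's parser: phase one is stubbed with a hand-built AST for `1 + 2 < 65`, and phase two is a small build pass that turns the tree into a JavaScript string and evaluates it. The node shapes mirror the `BinaryExpression` and `Literal` nodes in the sample AST later in this document; the `build` function here is a hypothetical stand-in.

```javascript
// Phase 1 (stubbed): the AST a parser could produce for "1 + 2 < 65".
var ast = {
    type: "BinaryExpression",
    operator: "<",
    left: {
        type: "BinaryExpression",
        operator: "+",
        left: { type: "Literal", value: 1 },
        right: { type: "Literal", value: 2 }
    },
    right: { type: "Literal", value: 65 }
};

// Phase 2: a build pass that walks the tree and emits JavaScript source.
function build(node) {
    switch (node.type) {
    case "Literal":
        return JSON.stringify(node.value);
    case "BinaryExpression":
        return "(" + build(node.left) + " " + node.operator + " " + build(node.right) + ")";
    default:
        throw new Error("Unknown node type: " + node.type);
    }
}

var generated = build(ast);                                 // "((1 + 2) < 65)"
var result = new Function("return " + generated + ";")();   // true
console.log(generated, "=>", result);
```

Keeping the two phases separate means the grammar only has to describe syntax, while everything about evaluation lives in the build pass.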

Expression Language Implementation

Implementing the expression language involves the following steps:

  • Defining a grammar file: ‘expression-syntax-parser.jison’.
  • Generating the corresponding JavaScript file with the jison command-line utility: ‘expression-syntax-parser.js’.
  • Parsing the expression to create an AST.
  • Running a build pass over the AST to generate and execute the JavaScript: ‘expression-ast-parser.js’.
  • Utility wrapper: ‘expression-language.js’, which acts as the interface between the application and the expression language.

The in-depth analysis of the files is as follows:

Defining a Grammar file

The grammar file contains the sections described in ‘Defining the Grammar’ above:

  • Jison Declarations: Defining some terminal and non-terminal symbols:

    Symbols in the Jison grammar represent the grammatical classifications of the language.

    A terminal symbol (also known as a token type) represents a class of syntactically equivalent tokens. You use the symbol in grammar rules to mean that a token in that class is allowed.

    A nonterminal symbol stands for a class of syntactically equivalent groupings. The symbol name is used in writing grammar rules. By convention, it should be all lower case.

    Symbol names can contain letters, underscores, periods, and non-initial digits and dashes.

    Sample implementation of the Jison declarations section (lexical rules that map input text to token names):

    "@m"                               return "@M";
    "where"                            return "@W";
    "@i"                               return "@I";
    "like"                             return "LIKE";
    "between"                          return "BETWEEN";
    "and"                              return "and";
    "or"                               return "or";
    "callContext"                      return "@CC";
    "@modelprop"                       return "@MP";
    
  • Grammar Rules: A Jison grammar rule has the following general form:

    result: components…;
    
    where result is the nonterminal symbol that this rule describes, and components are various terminal and nonterminal symbols that are put together by this rule.
    
    For example,
    
    exp: exp '+' exp;
    
    says that two groupings of type exp, with a ‘+’ token in between, can be combined into a larger grouping of type exp.
    
    White space in rules is significant only to separate symbols. You can add extra white space as you wish.
    
    Multiple rules for the same result can be written separately or can be joined with the vertical-bar character ‘|’ as follows:
    
    result:
      rule1-components…
    | rule2-components…
    ;
    
    They are still considered distinct rules even when joined in this way.
    
    *Sample code implementation*:
    

    ModelExpressionLiteral : "@M" "IDENTIFIER" "." "IDENTIFIER" {
    $$=new ModelLiteralNode($2,$4,createSourceLocation(null,@1,@4)); } ;

    In the above code, we define the nonterminal 'ModelExpressionLiteral' as a sequence of four terminal symbols (tokens). Whenever the parser encounters these symbols in the specified order, it runs the action in braces, which creates a new 'ModelLiteralNode' for the AST.
    
    ***$N*** - refers to the value of the Nth symbol in the rule, e.g. $2 refers to the first ***"IDENTIFIER"*** in the above rule. $$ is the value of the rule's result.
    
    ***@N*** - refers to the location of the Nth symbol in the expression, e.g. @1 refers to the ***location of '@M'***.
    
    ***createSourceLocation*** - This is a factory function which takes the start and end location of the expression and returns a loc object.
    
    ***Recursive Rules*** : A rule is called recursive when its result nonterminal appears also on its right hand side. Nearly all Jison grammars need to use recursion, because that is the only way to define a sequence of any number of a particular thing. Consider this recursive definition of a comma-separated sequence of one or more expressions:
    
    expseq1:
      exp
    | expseq1 ',' exp
    ;
    
    Since the recursive use of expseq1 is the leftmost symbol in the right hand side, we call this left recursion. By contrast, here the same construct is defined using right recursion:
    
    expseq1:
      exp
    | exp ',' expseq1
    ;
    
    Any kind of sequence can be defined using either left recursion or right recursion, but you should always use left recursion, because it can parse a sequence of any number of elements with bounded stack space.
    
    
    	*Sample code implementation of recursive grammar rule*:
    

    ModelExpression : ModelExpressionLiteral | ModelExpression "." "IDENTIFIER" {
    $$=new ModelExpression($1,$3,createSourceLocation(null,@1,@3)); } ;

    
    ***Indirect or Mutual Recursion*** : This occurs when the result of the rule does not appear directly on its right hand side, but does appear in rules for other nonterminals which do appear on its right hand side.
    
    expr:
      primary
    | primary '+' primary
    ;
    
    primary:
      constant
    | '(' expr ')'
    ;
    
    The above pseudo-code defines two mutually-recursive nonterminals, since each refers to the other.
    
    *Sample code implementation of mutually recursive grammar rule*:
    
    FilterExpression
     : "IDENTIFIER" FilterNode
      {
         // Note: $1 and $2 are values, not locations; @1 and @2 are probably intended here.
         $$=new FilterExpression($1,$2,createSourceLocation(null,$1,$2));
      }
     ;
    
    FilterNode
     : DBOperator Literal
         {
             $$=new FilterNode($1,$2,createSourceLocation(null,@1,@2));
         }
     ;
    
  • Additional Javascript Code : Any JavaScript code defined here is copied verbatim from the grammar file to the generated JavaScript file.

    This section holds the helper functions that are used throughout the grammar rules.

    In our implementation, all the AST node constructors are defined in this section so that they can be accessed by later passes over the AST.

    Sample code implementation :

    function ProgramNode(body, loc) {
        this.type = "Program";
        this.body = body;
        this.loc = loc;
    }
    
    function EmptyStatementNode(loc) {
        this.type = "EmptyStatement";
        this.loc = loc;
    }
    
    function BlockStatementNode(body, loc) {
        this.type = "BlockStatement";
        this.body = body;
        this.loc = loc;
    }
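
The grammar actions shown earlier call `createSourceLocation` and node constructors such as `ModelLiteralNode`, whose bodies are not reproduced in this document. A minimal sketch of how they could look, assuming Jison's standard location fields (`first_line`, `first_column`, `last_line`, `last_column`) and the `loc` shape seen in the sample AST further down. The `Position` and `SourceLocation` helper names are hypothetical, and the real implementations may differ.

```javascript
// Hypothetical helpers; the real implementations may differ.
function Position(line, column) {
    this.line = line;
    this.column = column;
}

function SourceLocation(source, start, end) {
    this.source = source;
    this.start = start;
    this.end = end;
}

// firstToken and lastToken are Jison location objects (the @N values).
function createSourceLocation(source, firstToken, lastToken) {
    return new SourceLocation(
        source,
        new Position(firstToken.first_line, firstToken.first_column),
        new Position(lastToken.last_line, lastToken.last_column)
    );
}

// Field names match the "ModelLiteralNode" entry in the sample AST.
function ModelLiteralNode(modelName, propertyName, loc) {
    this.type = "ModelLiteralNode";
    this.modelName = modelName;
    this.propertyName = propertyName;
    this.loc = loc;
}

// Simulate what the ModelExpressionLiteral action would do for "(@mCustomer.age".
var at1 = { first_line: 1, first_column: 1, last_line: 1, last_column: 3 };
var at4 = { first_line: 1, first_column: 12, last_line: 1, last_column: 15 };
var node = new ModelLiteralNode("Customer", "age", createSourceLocation(null, at1, at4));
console.log(JSON.stringify(node));
```

With a factory shaped like this, passing anything other than location objects leaves the line and column fields undefined, and `JSON.stringify` then drops them from the output.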
    

Generating Javascript File from Grammar File

Once you have installed Jison, you can run it from the command line as shown below. The resulting parser is placed in a file with the same name as your grammar file and a ‘.js’ extension.

jison grammar_file

Parsing the expression to create an AST

Using the generated expression parser, any valid expression can be converted to an AST.

var parser = require('./expression-syntax-parser');
var source = "(@mCustomer.age where accNo = @i.accountNumber) < 65";
var ast = parser.parse(source);

‘expression-syntax-parser’ is the JavaScript module generated from the grammar file. ‘source’ is a valid expression.
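
By default, `parser.parse` throws when the input does not match the grammar, so callers usually wrap the call. A minimal sketch, assuming only that the parser object exposes `parse(source)` and throws an `Error` on invalid input. `tryParse` is a hypothetical helper, and a stub stands in for the generated parser so that the example runs on its own.

```javascript
// Hypothetical helper: returns a result object instead of throwing.
function tryParse(parser, source) {
    try {
        return { ok: true, ast: parser.parse(source) };
    } catch (err) {
        return { ok: false, message: err.message };
    }
}

// Stub standing in for require('./expression-syntax-parser').
var stubParser = {
    parse: function (source) {
        if (source.trim() === "") {
            throw new Error("Parse error on line 1: unexpected end of input");
        }
        return { type: "Program", body: [] };
    }
};

var good = tryParse(stubParser, "1 < 2");
var bad = tryParse(stubParser, "");
console.log(good.ok);      // true
console.log(bad.message);
```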

Build Pass on the AST

The general concept here is to traverse each node of the generated AST and either generate valid JavaScript or perform some task based on the properties of that node.

In the file ‘expression-ast-parser.js’ we define a build method on the prototype of each constructor function exposed on the ast object of the parser.

The build method is responsible for stitching the properties of a particular node together to perform an action or create valid JavaScript.

Every build function is asynchronous and returns a promise.

The AST parser uses the q promise library extensively.
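
A sketch of what promise-based `build` methods could look like. It uses the native `Promise` instead of q to stay dependency-free, and the two constructors are simplified stand-ins (named after node types in the sample AST), not the project's actual code.

```javascript
function Literal(value) {
    this.type = "Literal";
    this.value = value;
}

function BinaryExpression(operator, left, right) {
    this.type = "BinaryExpression";
    this.operator = operator;
    this.left = left;
    this.right = right;
}

// A literal resolves immediately with its own value.
Literal.prototype.build = function () {
    return Promise.resolve(this.value);
};

// A binary expression builds both children first, then combines the results.
BinaryExpression.prototype.build = function () {
    var operator = this.operator;
    return Promise.all([this.left.build(), this.right.build()]).then(function (values) {
        switch (operator) {
        case "<": return values[0] < values[1];
        case "+": return values[0] + values[1];
        default: throw new Error("Unsupported operator: " + operator);
        }
    });
};

var tree = new BinaryExpression("<", new Literal(42), new Literal(65));
var pending = tree.build();
pending.then(function (result) {
    console.log(result); // true
});
```

Because every node returns a promise, a node that needs asynchronous work (a database query, for instance) can do it inside its own `build` without changing how its parent combines results.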

Utility Wrapper

This is the interface between the application and the expression language.

In the file ‘expression-language.js’ we define a function which takes an expression and an optional instance (more on this in the usage section below).

This function wraps the AST creation and build pass steps, so the caller only has to pass in an expression and gets a promise back.

The utility function returns a thenable promise which is eventually resolved with the required result or rejected with an error.
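
The shape of such a wrapper can be sketched as follows. This is an illustration only: `createExpressionLanguage` is a hypothetical factory, the parser is injected so that a stub can replace the generated one, and the assumption that `build` receives the instance is not confirmed by this document.

```javascript
// Hypothetical factory: wraps "parse, then build" behind one promise-returning function.
function createExpressionLanguage(parser) {
    return function evaluate(expression, instance) {
        return new Promise(function (resolve) {
            // A throw in here (for example a syntax error) rejects the promise.
            var ast = parser.parse(expression);
            resolve(ast.build(instance));
        });
    };
}

// Stub parser: understands only "@i.<property>" and resolves it from the instance.
var stubParser = {
    parse: function (expression) {
        var match = /^@i\.(\w+)$/.exec(expression);
        if (!match) {
            throw new Error("Parse error: " + expression);
        }
        return {
            build: function (instance) {
                return Promise.resolve(instance[match[1]]);
            }
        };
    }
};

var exprLang = createExpressionLanguage(stubParser);
var ok = exprLang("@i.accountNumber", { accountNumber: "12QWE345" });
var failed = exprLang("???", {}).catch(function (err) { return err.message; });
```

Running the parse step inside the promise executor means callers handle syntax errors and evaluation errors through the same rejection path.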

Expression Language Usage

The following sample shows the usage of the expression language. The code (from the model-validations.js mixin file) shows how the oeCloud validation framework uses the expression language to implement validateWhen (more on this in the validation framework section of the wiki).

var exprLang = require('../../lib/expression-language/expression-language.js');

inst.constructor.validationRules.forEach(function(obj) {
    if (obj.args.validateWhen) {
        validateWhenPromises.push(exprLang(obj.args.validateWhen, data));
    } else {
        validateWhenPromises.push(exprLang('true', data));
    }
});

// Wait for every validateWhen expression to settle, then queue the rules whose condition held.
q.allSettled(validateWhenPromises).then(function(results) {
    results.map(function(d) {
        return d.value;
    }).forEach(function(d, i) {
        if (d) {
            var obj = inst.constructor.validationRules[i];
            obj.args.inst = inst;
            obj.args.data = data;
            obj.args.path = path;
            fnArr.push(async.apply(obj.expression, obj.args));
        }
    });
});

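In the above code, we call the expression language utility wrapper for each validation rule (using the literal expression 'true' when a rule has no validateWhen property) to create an array of promises. We then wait for all the promises to settle before queuing the rules whose condition evaluated to a truthy value.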
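
One consequence of reading `d.value` from settled results is that a rejected promise has no `value`, so a validateWhen expression that fails to evaluate is treated the same as one that evaluates to a falsy value: the rule is skipped. The sketch below shows this with the native `Promise.allSettled`, an assumption made to avoid the q dependency (q reports `state` where the native API reports `status`, but both expose `value`). The rule names are hypothetical.

```javascript
// Conditions as they might come back from the expression language:
// two resolve, one rejects (for example because of a syntax error).
var conditions = [
    Promise.resolve(true),
    Promise.resolve(false),
    Promise.reject(new Error("Parse error"))
];
var ruleNames = ["minAge", "maxLoan", "brokenRule"]; // hypothetical rule names

var selected = Promise.allSettled(conditions).then(function (results) {
    return results
        .map(function (r) { return r.value; })   // undefined for rejected entries
        .map(function (value, i) { return value ? ruleNames[i] : null; })
        .filter(function (name) { return name !== null; });
});

selected.then(function (names) {
    console.log(names); // [ 'minAge' ]
});
```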

Expression Parsing (to create AST)

Let’s take an expression and observe the steps:

(@mCustomer.age where accNo = @i.accountNumber) < 65

  • @mCustomer.age - This denotes the property age of the model Customer.
  • where - This creates a where query taking into account the conditions which follow.
  • @i.accountNumber - This takes the account number from the instance parameter (the second parameter of the expression language utility wrapper).

The above expression results in a query to the database to fetch the age of the customer whose accNo is equal to the accountNumber taken from the instance object. The value is then compared to the right-hand side literal, and the comparison returns a boolean value.

This expression produces the AST shown as JSON below:

{
    "type": "Program",
    "body": [
        {
            "type": "ExpressionStatement",
            "expression": {
                "type": "BinaryExpression",
                "operator": "<",
                "left": {
                    "type": "ModelQueryExpression",
                    "modelExpression": {
                        "type": "ModelLiteralNode",
                        "modelName": "Customer",
                        "propertyName": "age",
                        "loc": {
                            "source": null,
                            "start": {
                                "line": 1,
                                "column": 1
                            },
                            "end": {
                                "line": 1,
                                "column": 15
                            }
                        }
                    },
                    "where": {
                        "type": "WhereLiteralNode",
                        "where": {
                            "type": "FilterExpression",
                            "property": "accNo",
                            "filterUnary": {
                                "type": "FilterNode",
                                "operator": "=",
                                "literal": {
                                    "type": "InstanceLiteralNode",
                                    "expression": "accountNumber",
                                    "operation": null,
                                    "loc": {
                                        "source": null,
                                        "start": {
                                            "line": 1,
                                            "column": 30
                                        },
                                        "end": {
                                            "line": 1,
                                            "column": 46
                                        }
                                    }
                                },
                                "loc": {
                                    "source": null,
                                    "start": {
                                        "line": 1,
                                        "column": 28
                                    },
                                    "end": {
                                        "line": 1,
                                        "column": 46
                                    }
                                }
                            },
                            "loc": {
                                "source": null,
                                "start": { },
                                "end": { }
                            }
                        },
                        "loc": {
                            "source": null,
                            "start": {
                                "line": 1,
                                "column": 16
                            },
                            "end": {
                                "line": 1,
                                "column": 46
                            }
                        }
                    },
                    "loc": {
                        "source": null,
                        "start": {
                            "line": 1,
                            "column": 1
                        },
                        "end": {
                            "line": 1,
                            "column": 46
                        }
                    }
                },
                "right": {
                    "type": "Literal",
                    "value": 65,
                    "loc": {
                        "source": null,
                        "start": {
                            "line": 1,
                            "column": 50
                        },
                        "end": {
                            "line": 1,
                            "column": 52
                        }
                    }
                },
                "loc": {
                    "source": null,
                    "start": {
                        "line": 1,
                        "column": 0
                    },
                    "end": {
                        "line": 1,
                        "column": 52
                    }
                }
            },
            "loc": {
                "source": null,
                "start": {
                    "line": 1,
                    "column": 0
                },
                "end": {
                    "line": 1,
                    "column": 52
                }
            }
        }
    ],
    "loc": {
        "source": null,
        "start": {
            "line": 1,
            "column": 0
        },
        "end": {
            "line": 1,
            "column": 52
        }
    }
}
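
A tree like the one above is easiest to understand by walking it. The sketch below uses a trimmed copy of the AST (the `loc` fields are omitted for brevity) and a generic depth-first walker that lists node types in visiting order. `collectTypes` is a hypothetical helper written for this illustration, not part of the project.

```javascript
// Trimmed copy of the AST above ("loc" fields omitted for brevity).
var ast = {
    type: "Program",
    body: [{
        type: "ExpressionStatement",
        expression: {
            type: "BinaryExpression",
            operator: "<",
            left: {
                type: "ModelQueryExpression",
                modelExpression: { type: "ModelLiteralNode", modelName: "Customer", propertyName: "age" },
                where: {
                    type: "WhereLiteralNode",
                    where: {
                        type: "FilterExpression",
                        property: "accNo",
                        filterUnary: {
                            type: "FilterNode",
                            operator: "=",
                            literal: { type: "InstanceLiteralNode", expression: "accountNumber", operation: null }
                        }
                    }
                }
            },
            right: { type: "Literal", value: 65 }
        }
    }]
};

// Depth-first walk: record any object with a string "type", then recurse into its properties.
function collectTypes(node, out) {
    if (Array.isArray(node)) {
        node.forEach(function (child) { collectTypes(child, out); });
    } else if (node !== null && typeof node === "object") {
        if (typeof node.type === "string") {
            out.push(node.type);
        }
        Object.keys(node).forEach(function (key) { collectTypes(node[key], out); });
    }
    return out;
}

var types = collectTypes(ast, []);
console.log(types.join(" > "));
```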

Some useful grammar rules

Let’s take the same expression as above and describe the usage:

(@mCustomer.age where accNo = @i.accountNumber) < 65

  • @mCustomer.age - This denotes the property age of the model Customer.
  • where - This creates a where query taking into account the conditions which follow.
  • @i.accountNumber - aka InstanceLiteralNode - This takes the account number from the instance parameter (the second parameter of the expression language utility wrapper).

The above expression results in a query to the database to fetch the age of the customer whose accNo is equal to the accountNumber taken from the instance object. The value is then compared to the right-hand side literal, and the comparison returns a boolean value.

Other grammar rules:

callContext - callContext.name - This denotes the name property on the callContext and can be used like any other literal node (such as a string, a number or an InstanceLiteralNode).

where - where accNo {DBOperator} ‘12QWE345’

DBOperator can be any one of the following:

">="      : Takes a literal node as right-hand side argument.
"<="      : Takes a literal node as right-hand side argument.
">"       : Takes a literal node as right-hand side argument.
"<"       : Takes a literal node as right-hand side argument.
"="       : Takes a literal node as right-hand side argument.
"!="      : Takes a literal node as right-hand side argument.
"BETWEEN" : Takes an array as right-hand side argument.
"INQ"     : Takes an array as right-hand side argument.
"NOTINQ"  : Takes an array as right-hand side argument.
"LIKE"    : Takes RegExp as right-hand side argument.
"NOTLIKE" : Takes RegExp as right-hand side argument.

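To make the operator list concrete, the sketch below turns a property, a DBOperator and a value into a where-filter object. It assumes that the build pass targets LoopBack-style where filters (`gte`, `inq`, `nlike`, and so on), which this document does not state; the mapping table and the `toWhereFilter` helper are hypothetical.

```javascript
// Assumed mapping from DBOperator tokens to LoopBack-style filter operators.
var OPERATOR_MAP = {
    ">=": "gte", "<=": "lte", ">": "gt", "<": "lt", "!=": "neq",
    "BETWEEN": "between", "INQ": "inq", "NOTINQ": "nin",
    "LIKE": "like", "NOTLIKE": "nlike"
};

// Operators that take an array on the right-hand side, per the list above.
var ARRAY_OPERATORS = ["BETWEEN", "INQ", "NOTINQ"];

function toWhereFilter(property, operator, value) {
    var filter = {};
    if (operator === "=") {
        filter[property] = value;   // equality needs no operator key
        return filter;
    }
    if (!Object.prototype.hasOwnProperty.call(OPERATOR_MAP, operator)) {
        throw new Error("Unknown DBOperator: " + operator);
    }
    if (ARRAY_OPERATORS.indexOf(operator) !== -1 && !Array.isArray(value)) {
        throw new Error(operator + " expects an array");
    }
    filter[property] = {};
    filter[property][OPERATOR_MAP[operator]] = value;
    return filter;
}

console.log(JSON.stringify(toWhereFilter("accNo", "=", "12QWE345")));
console.log(JSON.stringify(toWhereFilter("age", "BETWEEN", [18, 65])));
```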
Return an object/value based on some condition

This is the special case where we would like to return some object, to be consumed by another operation, based on some condition:

if(@i.AccountDetails.salary <= 20){return @i.loanAmount <= @i.AccountDetails.salary * 2}

The above expression compares the ‘AccountDetails.salary’ property on the instance to the numeric value ‘20’. If the condition holds, it returns the result of the boolean expression in the code block, i.e. the comparison of the loanAmount property on the instance to twice the ‘AccountDetails.salary’ property on the instance.
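
In plain JavaScript terms, the expression behaves roughly like the function below. This only illustrates the semantics described above and is not generated output. `evaluateLoanRule` is a hypothetical name, and returning `undefined` when the condition does not hold is an assumption, since this document does not say what the expression language returns in that case.

```javascript
function evaluateLoanRule(instance) {
    if (instance.AccountDetails.salary <= 20) {
        return instance.loanAmount <= instance.AccountDetails.salary * 2;
    }
    return undefined; // assumption: behaviour for a false condition is not documented here
}

console.log(evaluateLoanRule({ loanAmount: 30, AccountDetails: { salary: 15 } })); // true
console.log(evaluateLoanRule({ loanAmount: 40, AccountDetails: { salary: 15 } })); // false
```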