由于自顶向下解析的性质,ANTLR 在到达表达式内部的叶子之前生成具有一些长重复结构的解析树,其中包含许多多余的节点。
例如,在以下代码上使用 C.g4 语法 ( https://github.com/antlr/grammars-v4/tree/master/c ):
main(){
int a=5, b=10;
for(int i =0;i<b;i++){
b=a--;
}
}
生成的树是:
(compilationUnit (translationUnit (externalDeclaration (functionDefinition (declarator (directDeclarator (directDeclarator main) ()))) (compoundStatement { (blockItemList (blockItemList (blockItem (declaration (declarationSpecifiers (declarationSpecifier (typeSpecifier int))))) (initDeclaratorList (initDeclaratorList (initDeclarator (declarator ( directDeclarator a)) = (初始化器 (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression 5))))))))) ))))))))),(initDeclarator (declarator (directDeclarator b)) = (初始化程序 (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression 10)))))) )))))))))))))));( andExpression(equalityExpression(relationalExpression(shiftExpression(additiveExpression(multiplicativeExpression(castExpression(unaryExpression(postfixExpression(primaryExpression 0)))))))))))))))));( addExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression b)))))))))))))))) ;(表达式 (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression (equalityExpression (relationalExpression (shiftExpression (additiveExpression (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (postfixExpression (primaryExpression i)) ++))))))))) )))))))) (语句 (compoundStatement { (blockItemList (blockItem (语句 (expressionStatement (表达式 (assignmentExpression (unaryExpression (postfixExpression (primaryExpression b))))) (assignmentOperator =) (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression) ExclusiveOrExpression(andExpression(equalityExpression(relationalExpression(shiftExpression(additiveExpression)(multiplicativeExpression (castExpression (unaryExpression (postfixExpression (postfixExpression (primaryExpression a)) --)))))))))))))))))))))))))))))))))))))))))) ) )
其中与代码存根“int a=5”匹配的树子结构是:
(declaration (declarationSpecifiers (declarationSpecifier (typeSpecifier int))) (initDeclaratorList (initDeclaratorList (initDeclarator (declarator (directDeclarator a))) = (initializer (assignmentExpression (conditionalExpression (logicalOrExpression (logicalAndExpression (inclusiveOrExpression (exclusiveOrExpression (andExpression) equalityExpression (relationalExpression (shiftExpression (additiveExpression) (multiplicativeExpression (castExpression (unaryExpression (postfixExpression (primaryExpression 5))))))))))))))))))
我们清楚地看到,这大约可以简化为:
(declaration (declarationSpecifiers (declarationSpecifier (typeSpecifier int))) (initDeclaratorList (initDeclaratorList (initDeclarator (declarator (directDeclarator a)) = (initializer (assignmentExpression (postfixExpression (primaryExpression 5))))))
我正在使用解析树来执行某些静态分析,并且由于上面显示的多余节点,我需要在系统的侦听器端执行大量检查以访问正确的感兴趣的树节点。
所以我想知道是否有一种简单的方法可以通过使用一组转换规则来删除多余的节点和/或减少长重复结构来修改解析树。