2009-06-04 5 views
5

Help me find an appropriate ruby/python parser generator

The first parser generator I worked with was Parse::RecDescent. The guides and tutorials available for it were great, but its most useful feature was its debugging tools, specifically the tracing capabilities (activated by setting $RD_TRACE to 1). I am looking for a parser generator that can help you debug its rules.

The thing is, it has to be written in Python or in Ruby, and have a verbose/trace mode or very helpful debugging techniques.

Does anyone know of such a parser generator?

EDIT: when I said debugging, I wasn't referring to debugging Python or Ruby. I was referring to debugging the parser generator: seeing what it's doing at every step, every character it reads, and the rules it's trying to match. I hope you get the point.

BOUNTY EDIT: to win the bounty, please show a parser generator framework and illustrate some of its debugging features. I repeat, I'm not interested in pdb, but in the parser's own debugging framework. Also, please don't mention Treetop. I'm not interested in it.
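
To make the kind of tracing asked for here concrete, the following is a minimal hand-written sketch in plain Python, not taken from any of the libraries discussed in this thread. The class `TracingParser`, its `trace` flag, and the log format are all hypothetical; the flag only plays a role loosely similar to `$RD_TRACE` by reporting each rule attempt, the character being read, and the outcome.

```python
class TracingParser:
    """Tiny recursive-descent parser for sums such as '2+3' (illustrative only)."""

    def __init__(self, text, trace=False):
        self.text = text
        self.pos = 0
        self.trace = trace   # when True, each log line is also printed
        self.depth = 0       # nesting depth, used only to indent the log
        self.log = []

    def _note(self, message):
        line = "  " * self.depth + message
        self.log.append(line)
        if self.trace:
            print(line)

    def _peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def number(self):
        self._note("try number at %d (next char %r)" % (self.pos, self._peek()))
        start = self.pos
        while self._peek() is not None and self._peek() in "0123456789":
            self.pos += 1
        if start == self.pos:
            self._note("number failed")
            raise SyntaxError("expected a digit at position %d" % start)
        value = int(self.text[start:self.pos])
        self._note("number matched %r" % value)
        return value

    def expr(self):
        self._note("try expr at %d" % self.pos)
        self.depth += 1
        total = self.number()
        while self._peek() == "+":
            self._note("consume '+' at %d" % self.pos)
            self.pos += 1
            total += self.number()
        self.depth -= 1
        self._note("expr matched %r" % total)
        return total


parser = TracingParser("2+3", trace=True)
result = parser.expr()
```

With `trace=False` the same lines are still collected in `parser.log`, so the trace can be inspected after the parse instead of being printed during it.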

+0

hey! and what about the Ruby ones? :D – knoopx

Answers

6

Python is a pretty easy language to debug. You can just do import pdb; pdb.set_trace().

However, these parser generators supposedly come with good debugging facilities.

http://www.antlr.org/

http://www.dabeaz.com/ply/

http://pyparsing.wikispaces.com/

In response to the bounty

Here is PLY debugging in action.

Source code

tokens = (
    'NAME','NUMBER', 
    ) 

literals = ['=','+','-','*','/', '(',')'] 

# Tokens 

t_NAME = r'[a-zA-Z_][a-zA-Z0-9_]*' 

def t_NUMBER(t): 
    r'\d+' 
    t.value = int(t.value) 
    return t 

t_ignore = " \t" 

def t_newline(t): 
    r'\n+' 
    t.lexer.lineno += t.value.count("\n") 

def t_error(t): 
    print("Illegal character '%s'" % t.value[0]) 
    t.lexer.skip(1) 

# Build the lexer 
import ply.lex as lex 
lex.lex(debug=1) 

# Parsing rules 

precedence = (
    ('left','+','-'), 
    ('left','*','/'), 
    ('right','UMINUS'), 
    ) 

# dictionary of names 
names = { } 

def p_statement_assign(p): 
    'statement : NAME "=" expression' 
    names[p[1]] = p[3] 

def p_statement_expr(p): 
    'statement : expression' 
    print(p[1]) 

def p_expression_binop(p): 
    '''expression : expression '+' expression 
        | expression '-' expression 
        | expression '*' expression 
        | expression '/' expression''' 
    if p[2] == '+' : p[0] = p[1] + p[3] 
    elif p[2] == '-': p[0] = p[1] - p[3] 
    elif p[2] == '*': p[0] = p[1] * p[3] 
    elif p[2] == '/': p[0] = p[1]/p[3] 

def p_expression_uminus(p): 
    "expression : '-' expression %prec UMINUS" 
    p[0] = -p[2] 

def p_expression_group(p): 
    "expression : '(' expression ')'" 
    p[0] = p[2] 

def p_expression_number(p): 
    "expression : NUMBER" 
    p[0] = p[1] 

def p_expression_name(p): 
    "expression : NAME" 
    try: 
        p[0] = names[p[1]] 
    except LookupError: 
        print("Undefined name '%s'" % p[1]) 
        p[0] = 0 

def p_error(p): 
    if p: 
        print("Syntax error at '%s'" % p.value) 
    else: 
        print("Syntax error at EOF") 

import ply.yacc as yacc 
yacc.yacc() 

import logging 
logging.basicConfig(
    level=logging.INFO, 
    filename="parselog.txt" 
) 

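while True: 
    try: 
        # input() is the Python 3 spelling; the original used raw_input() on Python 2 
        s = input('calc > ') 
    except EOFError: 
        break 
    if not s: 
        continue 
    # debug=1 makes the parser print every state, stack and action 
    yacc.parse(s, debug=1) 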

Output

lex: tokens = ('NAME', 'NUMBER') 
lex: literals = ['=', '+', '-', '*', '/', '(', ')'] 
lex: states = {'INITIAL': 'inclusive'} 
lex: Adding rule t_NUMBER -> '\d+' (state 'INITIAL') 
lex: Adding rule t_newline -> '\n+' (state 'INITIAL') 
lex: Adding rule t_NAME -> '[a-zA-Z_][a-zA-Z0-9_]*' (state 'INITIAL') 
lex: ==== MASTER REGEXS FOLLOW ==== 
lex: state 'INITIAL' : regex[0] = '(?P<t_NUMBER>\d+)|(?P<t_newline>\n+)|(?P<t_NAME>[a-zA-Z_][a-zA-Z0-9_]*)' 
calc > 2+3 
PLY: PARSE DEBUG START 

State : 0 
Stack : . LexToken(NUMBER,2,1,0) 
Action : Shift and goto state 3 

State : 3 
Stack : NUMBER . LexToken(+,'+',1,1) 
Action : Reduce rule [expression -> NUMBER] with [2] and goto state 9 
Result : <int @ 0x1a1896c> (2) 

State : 6 
Stack : expression . LexToken(+,'+',1,1) 
Action : Shift and goto state 12 

State : 12 
Stack : expression + . LexToken(NUMBER,3,1,2) 
Action : Shift and goto state 3 

State : 3 
Stack : expression + NUMBER . $end 
Action : Reduce rule [expression -> NUMBER] with [3] and goto state 9 
Result : <int @ 0x1a18960> (3) 

State : 18 
Stack : expression + expression . $end 
Action : Reduce rule [expression -> expression + expression] with [2,'+',3] and goto state 3 
Result : <int @ 0x1a18948> (5) 

State : 6 
Stack : expression . $end 
Action : Reduce rule [statement -> expression] with [5] and goto state 2 
5 
Result : <NoneType @ 0x1e1ccef4> (None) 

State : 4 
Stack : statement . $end 
Done : Returning <NoneType @ 0x1e1ccef4> (None) 
PLY: PARSE DEBUG END 
calc > 
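
As a reading aid for the trace above: each entry shows the symbols already on the stack, a dot, the lookahead token, and the action taken (shift the token, or reduce by a grammar rule). The sketch below is a simplified illustration in plain Python and is not how PLY is implemented. It has no state table and handles only sums of numbers; the function name `trace_sum` and the step tuples are hypothetical.

```python
def trace_sum(tokens):
    """Shift-reduce over NUMBER ('+' NUMBER)*, recording every step.

    Returns the final stack and a list of
    (action, detail, stack_symbols_before, lookahead) tuples.
    """
    stack = []                          # entries are (symbol, value)
    steps = []
    rest = list(tokens) + ["$end"]      # "$end" marks end of input
    while True:
        lookahead = rest[0]
        symbols = [symbol for symbol, _ in stack]
        if symbols[-1:] == ["NUMBER"]:
            # Reduce: expression -> NUMBER
            _, value = stack.pop()
            stack.append(("expression", value))
            steps.append(("reduce", "expression -> NUMBER", symbols, lookahead))
        elif symbols[-3:] == ["expression", "+", "expression"]:
            # Reduce: expression -> expression + expression
            _, right = stack.pop()
            stack.pop()                 # discard the '+'
            _, left = stack.pop()
            stack.append(("expression", left + right))
            steps.append(("reduce", "expression -> expression + expression",
                          symbols, lookahead))
        elif lookahead == "$end":
            break                       # nothing left to shift or reduce
        else:
            # Shift: move the lookahead token onto the stack
            token = rest.pop(0)
            symbol = "NUMBER" if isinstance(token, int) else token
            stack.append((symbol, token))
            steps.append(("shift", symbol, symbols, lookahead))
    return stack, steps


final_stack, steps = trace_sum([2, "+", 3])
for action, detail, symbols, lookahead in steps:
    print("Stack : %s . %s" % (" ".join(symbols), lookahead))
    print("Action: %s %s" % (action, detail))
```

For the input 2+3 this produces the same order of shifts and reductions as the PLY trace, up to the final `statement -> expression` reduction, which the sketch leaves out.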

Parsing table generated in parser.out

Created by PLY version 3.2 (http://www.dabeaz.com/ply) 

Grammar 

Rule 0  S' -> statement 
Rule 1  statement -> NAME = expression 
Rule 2  statement -> expression 
Rule 3  expression -> expression + expression 
Rule 4  expression -> expression - expression 
Rule 5  expression -> expression * expression 
Rule 6  expression -> expression / expression 
Rule 7  expression -> - expression 
Rule 8  expression -> ( expression ) 
Rule 9  expression -> NUMBER 
Rule 10 expression -> NAME 

Terminals, with rules where they appear 

(     : 8 
)     : 8 
*     : 5 
+     : 3 
-     : 4 7 
/     : 6 
=     : 1 
NAME     : 1 10 
NUMBER    : 9 
error    : 

Nonterminals, with rules where they appear 

expression   : 1 2 3 3 4 4 5 5 6 6 7 8 
statement   : 0 

Parsing method: LALR 

state 0 

    (0) S' -> . statement 
    (1) statement -> . NAME = expression 
    (2) statement -> . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    NAME   shift and go to state 1 
    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 

    expression      shift and go to state 6 
    statement      shift and go to state 4 

state 1 

    (1) statement -> NAME . = expression 
    (10) expression -> NAME . 

    =    shift and go to state 7 
    +    reduce using rule 10 (expression -> NAME .) 
    -    reduce using rule 10 (expression -> NAME .) 
    *    reduce using rule 10 (expression -> NAME .) 
    /    reduce using rule 10 (expression -> NAME .) 
    $end   reduce using rule 10 (expression -> NAME .) 


state 2 

    (7) expression -> - . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 9 

state 3 

    (9) expression -> NUMBER . 

    +    reduce using rule 9 (expression -> NUMBER .) 
    -    reduce using rule 9 (expression -> NUMBER .) 
    *    reduce using rule 9 (expression -> NUMBER .) 
    /    reduce using rule 9 (expression -> NUMBER .) 
    $end   reduce using rule 9 (expression -> NUMBER .) 
    )    reduce using rule 9 (expression -> NUMBER .) 


state 4 

    (0) S' -> statement . 



state 5 

    (8) expression -> (. expression) 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 10 

state 6 

    (2) statement -> expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    $end   reduce using rule 2 (statement -> expression .) 
    +    shift and go to state 12 
    -    shift and go to state 11 
    *    shift and go to state 13 
    /    shift and go to state 14 


state 7 

    (1) statement -> NAME = . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 15 

state 8 

    (10) expression -> NAME . 

    +    reduce using rule 10 (expression -> NAME .) 
    -    reduce using rule 10 (expression -> NAME .) 
    *    reduce using rule 10 (expression -> NAME .) 
    /    reduce using rule 10 (expression -> NAME .) 
    $end   reduce using rule 10 (expression -> NAME .) 
    )    reduce using rule 10 (expression -> NAME .) 


state 9 

    (7) expression -> - expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    +    reduce using rule 7 (expression -> - expression .) 
    -    reduce using rule 7 (expression -> - expression .) 
    *    reduce using rule 7 (expression -> - expression .) 
    /    reduce using rule 7 (expression -> - expression .) 
    $end   reduce using rule 7 (expression -> - expression .) 
    )    reduce using rule 7 (expression -> - expression .) 

    ! +    [ shift and go to state 12 ] 
    ! -    [ shift and go to state 11 ] 
    ! *    [ shift and go to state 13 ] 
    !/    [ shift and go to state 14 ] 


state 10 

    (8) expression -> (expression .) 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    )    shift and go to state 16 
    +    shift and go to state 12 
    -    shift and go to state 11 
    *    shift and go to state 13 
    /    shift and go to state 14 


state 11 

    (4) expression -> expression - . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 17 

state 12 

    (3) expression -> expression + . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 18 

state 13 

    (5) expression -> expression * . expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 19 

state 14 

    (6) expression -> expression/. expression 
    (3) expression -> . expression + expression 
    (4) expression -> . expression - expression 
    (5) expression -> . expression * expression 
    (6) expression -> . expression/expression 
    (7) expression -> . - expression 
    (8) expression -> . (expression) 
    (9) expression -> . NUMBER 
    (10) expression -> . NAME 

    -    shift and go to state 2 
    (    shift and go to state 5 
    NUMBER   shift and go to state 3 
    NAME   shift and go to state 8 

    expression      shift and go to state 20 

state 15 

    (1) statement -> NAME = expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    $end   reduce using rule 1 (statement -> NAME = expression .) 
    +    shift and go to state 12 
    -    shift and go to state 11 
    *    shift and go to state 13 
    /    shift and go to state 14 


state 16 

    (8) expression -> (expression) . 

    +    reduce using rule 8 (expression -> (expression) .) 
    -    reduce using rule 8 (expression -> (expression) .) 
    *    reduce using rule 8 (expression -> (expression) .) 
    /    reduce using rule 8 (expression -> (expression) .) 
    $end   reduce using rule 8 (expression -> (expression) .) 
    )    reduce using rule 8 (expression -> (expression) .) 


state 17 

    (4) expression -> expression - expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    +    reduce using rule 4 (expression -> expression - expression .) 
    -    reduce using rule 4 (expression -> expression - expression .) 
    $end   reduce using rule 4 (expression -> expression - expression .) 
    )    reduce using rule 4 (expression -> expression - expression .) 
    *    shift and go to state 13 
    /    shift and go to state 14 

    ! *    [ reduce using rule 4 (expression -> expression - expression .) ] 
    !/    [ reduce using rule 4 (expression -> expression - expression .) ] 
    ! +    [ shift and go to state 12 ] 
    ! -    [ shift and go to state 11 ] 


state 18 

    (3) expression -> expression + expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    +    reduce using rule 3 (expression -> expression + expression .) 
    -    reduce using rule 3 (expression -> expression + expression .) 
    $end   reduce using rule 3 (expression -> expression + expression .) 
    )    reduce using rule 3 (expression -> expression + expression .) 
    *    shift and go to state 13 
    /    shift and go to state 14 

    ! *    [ reduce using rule 3 (expression -> expression + expression .) ] 
    !/    [ reduce using rule 3 (expression -> expression + expression .) ] 
    ! +    [ shift and go to state 12 ] 
    ! -    [ shift and go to state 11 ] 


state 19 

    (5) expression -> expression * expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    +    reduce using rule 5 (expression -> expression * expression .) 
    -    reduce using rule 5 (expression -> expression * expression .) 
    *    reduce using rule 5 (expression -> expression * expression .) 
    /    reduce using rule 5 (expression -> expression * expression .) 
    $end   reduce using rule 5 (expression -> expression * expression .) 
    )    reduce using rule 5 (expression -> expression * expression .) 

    ! +    [ shift and go to state 12 ] 
    ! -    [ shift and go to state 11 ] 
    ! *    [ shift and go to state 13 ] 
    !/    [ shift and go to state 14 ] 


state 20 

    (6) expression -> expression/expression . 
    (3) expression -> expression . + expression 
    (4) expression -> expression . - expression 
    (5) expression -> expression . * expression 
    (6) expression -> expression ./expression 

    +    reduce using rule 6 (expression -> expression/expression .) 
    -    reduce using rule 6 (expression -> expression/expression .) 
    *    reduce using rule 6 (expression -> expression/expression .) 
    /    reduce using rule 6 (expression -> expression/expression .) 
    $end   reduce using rule 6 (expression -> expression/expression .) 
    )    reduce using rule 6 (expression -> expression/expression .) 

    ! +    [ shift and go to state 12 ] 
    ! -    [ shift and go to state 11 ] 
    ! *    [ shift and go to state 13 ] 
    !/    [ shift and go to state 14 ] 
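
The lines starting with `!` in the states above are actions that were rejected when a shift/reduce conflict was resolved using the `precedence` table. The sketch below illustrates the usual yacc-style rule under stated assumptions: it is not PLY's code, the names `PRECEDENCE` and `resolve` are hypothetical, and `nonassoc` is not handled.

```python
# Levels mirror the precedence tuple in the source: later entries bind tighter.
PRECEDENCE = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "/": (2, "left"),
    "UMINUS": (3, "right"),
}


def resolve(rule_symbol, lookahead):
    """Pick 'shift' or 'reduce' for a completed rule and a lookahead token.

    rule_symbol is the operator that gives the rule its precedence
    (the %prec name, or the operator inside the rule).
    """
    rule_level, _ = PRECEDENCE[rule_symbol]
    token_level, token_assoc = PRECEDENCE[lookahead]
    if token_level > rule_level:
        return "shift"      # the incoming operator binds tighter
    if token_level < rule_level:
        return "reduce"     # the rule on the stack binds tighter
    # Same level: left-associative operators reduce, right-associative shift.
    return "reduce" if token_assoc == "left" else "shift"


print(resolve("-", "*"))       # state 17 with lookahead '*'
print(resolve("-", "+"))       # state 17 with lookahead '+'
print(resolve("UMINUS", "*"))  # state 9 with lookahead '*'
```

These choices agree with the table: in state 17 the parser shifts on `*` and reduces on `+`, and in state 9 the unary minus rule is always reduced.
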
+0

this is very nice! thanks! if no other answers are submitted within a day, you win. – Geo

0

The Python wiki has a list of language parsers written in Python.

+0

I know that. It doesn't help. – Geo

+1

@Geo: Why the downvote (assuming it was you)? How could anyone have known that you had already seen this list? –

+1

The question was pretty specific, I think. I asked for parser generators that have good debugging features. He posted a list, not something specific. – Geo

2

I know the bounty has already been claimed, but here is an equivalent parser written in pyparsing (plus support for function calls with zero or more comma-delimited arguments):

from pyparsing import * 

LPAR, RPAR = map(Suppress,"()") 
EQ = Literal("=") 
name = Word(alphas, alphanums+"_").setName("name") 
number = Word(nums).setName("number") 

expr = Forward() 
operand = Optional('-') + (Group(name + LPAR + 
            Group(Optional(delimitedList(expr))) + 
            RPAR) | 
          name | 
          number | 
          Group(LPAR + expr + RPAR)) 
binop = oneOf("+ - * / **") 
expr << (Group(operand + OneOrMore(binop + operand)) | operand) 

assignment = name + EQ + expr 
statement = assignment | expr 

This test code runs the parser through its basic paces:

tests = """\ 
    sin(pi/2) 
    y = mx+b 
    E = mc ** 2 
    F = m*a 
    x = x0 + v*t +a*t*t/2 
    1 - sqrt(sin(t)**2 + cos(t)**2)""".splitlines() 

for t in tests: 
    print(t.strip()) 
    print(statement.parseString(t).asList()) 
    print() 

giving this output:

sin(pi/2) 
[['sin', [['pi', '/', '2']]]] 

y = mx+b 
['y', '=', ['mx', '+', 'b']] 

E = mc ** 2 
['E', '=', ['mc', '**', '2']] 

F = m*a 
['F', '=', ['m', '*', 'a']] 

x = x0 + v*t +a*t*t/2 
['x', '=', ['x0', '+', 'v', '*', 't', '+', 'a', '*', 't', '*', 't', '/', '2']] 

1 - sqrt(sin(t)**2 + cos(t)**2) 
[['1', '-', ['sqrt', [[['sin', ['t']], '**', '2', '+', ['cos', ['t']], '**', '2']]]]] 

For debugging, we add this code:

# enable debugging for name and number expressions 
name.setDebug() 
number.setDebug() 

And now we reparse the first test (displaying the input string and a simple column ruler):

t = tests[0] 
print(("1234567890"*10)[:len(t)]) 
print(t) 
statement.parseString(t) 
print() 

Giving this output:

123456789
    sin(pi/2) 
Match name at loc 4(1,5) 
Matched name -> ['sin'] 
Match name at loc 4(1,5) 
Matched name -> ['sin'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match number at loc 11(1,12) 
Matched number -> ['2'] 
Match name at loc 4(1,5) 
Matched name -> ['sin'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match number at loc 11(1,12) 
Matched number -> ['2'] 

Pyparsing also supports packrat parsing, a sort of parse-time memoization (read more about packratting here). Here is the same parsing sequence, but with packrat enabled:

same parse, but with packrat parsing enabled 
123456789
    sin(pi/2) 
Match name at loc 4(1,5) 
Matched name -> ['sin'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 8(1,9) 
Matched name -> ['pi'] 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match name at loc 11(1,12) 
Exception raised:Expected name (at char 11), (line:1, col:12) 
Match number at loc 11(1,12) 
Matched number -> ['2'] 
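
The shorter trace reflects what memoization does: when the same expression is tried again at the same location, the earlier result is reused instead of being matched a second time. The sketch below shows the idea in plain Python under stated assumptions. It is not pyparsing's implementation; the `Matcher` class, its `attempts` counter, and the `run` helper are hypothetical.

```python
import re


class Matcher:
    """Matches named regex rules at a position, optionally caching results."""

    def __init__(self, text, memoize):
        self.text = text
        self.memoize = memoize
        self.cache = {}      # (rule_name, pos) -> matched text or None
        self.attempts = 0    # how many times a pattern was actually run

    def match(self, rule_name, pattern, pos):
        key = (rule_name, pos)
        if self.memoize and key in self.cache:
            return self.cache[key]          # reuse the earlier result
        self.attempts += 1
        found = re.compile(pattern).match(self.text, pos)
        result = found.group() if found else None
        if self.memoize:
            self.cache[key] = result
        return result


def run(memoize):
    matcher = Matcher("sin(pi/2)", memoize)
    # Two alternatives that both begin with 'name' at position 0, as a
    # backtracking grammar of the form "function call | name" would try.
    for _ in range(2):
        matcher.match("name", r"[A-Za-z_]\w*", 0)
    return matcher.attempts


print("without memoization:", run(False))
print("with memoization:", run(True))
```

Failed matches are cached as well (stored as `None`), which is why repeated failures at the same position can also disappear from a memoized trace.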

This was an interesting exercise, and it was helpful for me to see the debugging features of other parser libraries.

1

ANTLR, mentioned above, has the advantage of generating human-readable and understandable code, since it is a (very sophisticated and powerful) top-down parser, so you can step through it with a regular debugger and see what it is really doing.

That's why it is my parser generator of choice.

Bottom-up parser generators like PLY have the disadvantage that, for larger grammars, it is almost impossible to understand what the debugging output really means and why the parsing table is the way it is.
