Modify the config file if necessary and then execute:
python demo.py --config=<config> --text=<text>
The image of the output parse tree will be saved in out/parse-tree.svg
. (This behavior can be changed in demo.py
.)
Take the regular expression (a|b)*abb
for example.
-
Create an NFA from the regular expression:
import ptree from ptree.lexer.regex import Regex, RegexEngine regex = Regex(name='(a|b)*abb', pattern='(a|b)*abb') engine = RegexEngine() nfa = engine.parse(regex) ptree.render(nfa, directory='out', name='nfa', output_format='svg')
-
Convert the NFA to a DFA:
dfa = nfa.to_dfa() ptree.render(dfa, directory='out', name='dfa', output_format='svg')
Take the regular expressions a*b+
, a
, abb
for example.
-
Create three NFAs from the regular expressions.
import ptree from ptree.lexer.regex import Regex, RegexEngine regexes = [ Regex(name='a*b+', pattern='a*b+'), Regex(name='a', pattern='a'), Regex(name='abb', pattern='abb') ] engine = RegexEngine() nfas = [engine.parse(regex) for regex in regexes] for i, nfa in enumerate(nfas): ptree.render(nfa, directory='out', name=f'nfa-{i}', output_format='svg')
-
Merge the NFAs.
from ptree.lexer.nfa import NFA nfa = NFA.union(nfas) ptree.render(nfa, directory='out', name='nfa', output_format='svg')
-
Convert the NFA to DFA.
dfa = nfa.to_dfa() ptree.render(dfa, directory='out', name='dfa', output_format='svg')
-
Create a lexer from the config file.
The config file used in this example is:
nonterminal_symbols: terminal_symbols: COMMENT: '(//[^\n]*)|(/\*([^\*]|(\*)*[^\*/])*(\*)*\*/)' KEYWORD: 'auto|short|int|long|float|double|char|struct|union|enum|typedef|const|unsigned|signed|extern|register|static|volatile|void|if|else|switch|case|for|do|while|goto|continue|break|default|sizeof|return|using|namespace' IDENTIFIER: '[A-Za-z_][A-Za-z0-9_]*' INTEGER: '[0-9]+' FLOAT: '[0-9]+\.[0-9]+' COMPARISON: '==|>|<|>=|<=|!=' LSTREAM: '<<' RSTREAM: '>>' LP: '\(' RP: '\)' LB: '{' RB: '}' COMMA: ',' SEMICOLON: ';' LSB: '\[' RSB: '\]' ASSIGN_OP: '=' ADD_OP: '\+' SUB_OP: '\-' MULT_OP: '\*' DIV_OP: '/' MOD_OP: '%' POWER_OP: '\^' AND_OP: '&&' OR_OP: '\|\|' NOT_OP: '!' SPACE: '[ \t\n\r]+' ignored_symbols: ? SPACE ? COMMENT start_symbol: production_rules:
import ptree config = ptree.load_config('config.yaml') grammar = ptree.Grammar(config) lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool)
-
Tokenize the input string.
The input string is:
int main() {int a = a + 1; cout << a << endl; return 0;}
tokens = lexer.tokenize('''int main() {int a = a + 1; cout << a << endl; return 0;}''') ptree.pprint(tokens)
+----+------------+--------+ | | SYMBOL | VALUE | +====+============+========+ | 1 | KEYWORD | int | +----+------------+--------+ | 2 | IDENTIFIER | main | +----+------------+--------+ | 3 | LP | ( | +----+------------+--------+ | 4 | RP | ) | +----+------------+--------+ | 5 | LB | { | +----+------------+--------+ | 6 | KEYWORD | int | +----+------------+--------+ | 7 | IDENTIFIER | a | +----+------------+--------+ | 8 | ASSIGN_OP | = | +----+------------+--------+ | 9 | IDENTIFIER | a | +----+------------+--------+ | 10 | ADD_OP | + | +----+------------+--------+ | 11 | INTEGER | 1 | +----+------------+--------+ | 12 | SEMICOLON | ; | +----+------------+--------+ | 13 | IDENTIFIER | cout | +----+------------+--------+ | 14 | LSTREAM | << | +----+------------+--------+ | 15 | IDENTIFIER | a | +----+------------+--------+ | 16 | LSTREAM | << | +----+------------+--------+ | 17 | IDENTIFIER | endl | +----+------------+--------+ | 18 | SEMICOLON | ; | +----+------------+--------+ | 19 | KEYWORD | return | +----+------------+--------+ | 20 | INTEGER | 0 | +----+------------+--------+ | 21 | SEMICOLON | ; | +----+------------+--------+ | 22 | RB | } | +----+------------+--------+
-
Write a config file (in YAML format).
nonterminal_symbols: ? E ? T ? F terminal_symbols: '+': '\+' '-': '\-' '*': '\*' '/': '/' '(': '\(' ')': '\)' 'num': '[0-9]+' ignored_symbols: start_symbol: E production_rules: - E -> E + T - E -> E - T - E -> T - T -> T * F - T -> T / F - T -> F - F -> num - F -> ( E )
-
Generate the parse table.
import ptree config = ptree.load_config('config.yaml') grammar = ptree.Grammar(config) grammar.init() parser = ptree.Parser(grammar) ptree.pprint(grammar.parse_table)
+----+----------------------------------------------+--------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | | ACTION | GOTO | STATE | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | | + | - | * | / | ( | ) | num | $ | E | T | F | | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 0 | | | | | s1 | | s2 | | 4 | 5 | 3 | {F -> . ( E ), *; F -> . num, *; T -> . F, $; E -> . E - T, -; _S -> . E, $; E -> . T, $; T -> . F, /; F -> . ( E ), $; T -> . T * F, $; T -> . T / F, $; E -> . E + T, -; F -> . num, $; T -> . T * F, /; E -> . E - T, $; T -> . F, +; E -> . E + T, $; F -> . ( E ), /; E -> . T, +; T -> . F, -; T -> . T * F, +; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; E -> . E - T, +; T -> . T * F, -; F -> . num, -; E -> . T, -; E -> . E + T, +; F -> . ( E ), -; T -> . T * F, *; T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 1 | | | | | s6 | | s7 | | 8 | 10 | 9 | {F -> . ( E ), *; F -> . num, *; E -> . E - T, -; T -> . F, /; F -> ( . E ), /; E -> . E + T, -; T -> . T / F, ); E -> . T, ); F -> . ( E ), ); F -> ( . E ), +; T -> . T * F, ); T -> . T * F, /; F -> . num, ); T -> . T * F, *; T -> . F, +; F -> . ( E ), /; E -> . E - T, ); E -> . T, +; E -> . E + T, ); T -> . F, -; T -> . T * F, +; F -> ( . E ), *; F -> ( . E ), -; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; E -> . E - T, +; T -> . T * F, -; F -> . num, -; F -> ( . E ), $; E -> . T, -; E -> . E + T, +; F -> . ( E ), -; T -> . F, ); T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 2 | r7 | r7 | r7 | r7 | | | | r7 | | | | {F -> num . , -; F -> num . , +; F -> num . , /; F -> num . , *; F -> num . , $} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 3 | r6 | r6 | r6 | r6 | | | | r6 | | | | {T -> F . , -; T -> F . , +; T -> F . , /; T -> F . , *; T -> F . , $} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 4 | s12 | s11 | | | | | | acc | | | | {E -> E . - T, -; E -> E . - T, $; E -> E . + T, +; _S -> E . , $; E -> E . + T, -; E -> E . + T, $; E -> E . - T, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 5 | r3 | r3 | s14 | s13 | | | | r3 | | | | {T -> T . / F, -; T -> T . / F, +; E -> T . , +; T -> T . * F, /; T -> T . * F, +; E -> T . , -; E -> T . , $; T -> T . / F, /; T -> T . / F, *; T -> T . / F, $; T -> T . * F, *; T -> T . * F, $; T -> T . * F, -} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 6 | | | | | s6 | | s7 | | 15 | 10 | 9 | {F -> ( . E ), ); F -> . ( E ), *; F -> . num, *; E -> . E - T, -; T -> . F, /; F -> ( . E ), /; E -> . E + T, -; T -> . T / F, ); E -> . T, ); F -> . ( E ), ); F -> ( . E ), +; T -> . T * F, ); T -> . T * F, /; F -> . num, ); T -> . T * F, *; T -> . F, +; F -> . ( E ), /; E -> . E - T, ); E -> . T, +; E -> . E + T, ); T -> . F, -; T -> . T * F, +; F -> ( . E ), *; F -> ( . E ), -; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; E -> . E - T, +; T -> . T * F, -; F -> . num, -; E -> . T, -; E -> . E + T, +; F -> . ( E ), -; T -> . F, ); T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 7 | r7 | r7 | r7 | r7 | | r7 | | | | | | {F -> num . , ); F -> num . , -; F -> num . , +; F -> num . , /; F -> num . , *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 8 | s16 | s18 | | | | s17 | | | | | | {E -> E . + T, ); F -> ( E . ), +; E -> E . - T, ); F -> ( E . ), -; E -> E . - T, -; E -> E . + T, +; F -> ( E . ), *; F -> ( E . ), $; F -> ( E . ), /; E -> E . + T, -; E -> E . - T, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 9 | r6 | r6 | r6 | r6 | | r6 | | | | | | {T -> F . , -; T -> F . , +; T -> F . , ); T -> F . , /; T -> F . , *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 10 | r3 | r3 | s20 | s19 | | r3 | | | | | | {T -> T . / F, -; T -> T . / F, +; E -> T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; E -> T . , -; T -> T . / F, /; T -> T . / F, *; T -> T . * F, *; E -> T . , ); T -> T . * F, ); T -> T . / F, )} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 11 | | | | | s1 | | s2 | | | 21 | 3 | {F -> . ( E ), *; F -> . num, *; T -> . F, $; E -> E - . T, +; T -> . F, /; F -> . ( E ), $; T -> . T * F, $; T -> . T / F, $; F -> . num, $; T -> . T * F, /; E -> E - . T, -; T -> . F, +; F -> . ( E ), /; T -> . F, -; T -> . T * F, +; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; E -> E - . T, $; T -> . T / F, -; F -> . num, /; T -> . F, *; T -> . T * F, -; F -> . num, -; F -> . ( E ), -; T -> . T * F, *; T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 12 | | | | | s1 | | s2 | | | 22 | 3 | {F -> . ( E ), *; F -> . num, *; T -> . F, $; T -> . F, /; F -> . ( E ), $; T -> . T * F, $; T -> . T / F, $; E -> E + . T, +; F -> . num, $; T -> . T * F, /; T -> . F, +; F -> . ( E ), /; E -> E + . T, -; T -> . F, -; T -> . T * F, +; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; T -> . T * F, -; F -> . num, -; E -> E + . T, $; F -> . ( E ), -; T -> . T * F, *; T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 13 | | | | | s1 | | s2 | | | | 23 | {T -> T / . F, +; F -> . ( E ), *; F -> . ( E ), /; T -> T / . F, /; T -> T / . F, *; F -> . num, *; F -> . num, /; F -> . num, $; T -> T / . F, $; F -> . num, -; T -> T / . F, -; F -> . num, +; F -> . ( E ), +; F -> . ( E ), -; F -> . ( E ), $} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 14 | | | | | s1 | | s2 | | | | 24 | {F -> . ( E ), *; T -> T * . F, /; T -> T * . F, *; F -> . num, /; F -> . num, *; F -> . ( E ), /; T -> T * . F, $; F -> . num, $; F -> . ( E ), $; F -> . num, -; T -> T * . F, -; F -> . ( E ), +; F -> . ( E ), -; T -> T * . F, +; F -> . num, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 15 | s16 | s18 | | | | s25 | | | | | | {E -> E . + T, ); F -> ( E . ), +; E -> E . - T, ); F -> ( E . ), -; E -> E . - T, -; E -> E . + T, +; F -> ( E . ), *; F -> ( E . ), ); F -> ( E . ), /; E -> E . + T, -; E -> E . - T, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 16 | | | | | s6 | | s7 | | | 26 | 9 | {F -> . ( E ), *; F -> . num, *; T -> . F, /; E -> E + . T, +; T -> . T / F, ); F -> . ( E ), ); T -> . T * F, ); T -> . T * F, /; F -> . num, ); T -> . T * F, *; T -> . F, +; F -> . ( E ), /; E -> E + . T, -; F -> . ( E ), -; T -> . F, -; T -> . T * F, +; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; T -> . T * F, -; F -> . num, -; E -> E + . T, ); T -> . F, ); T -> . T / F, *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 17 | r8 | r8 | r8 | r8 | | | | r8 | | | | {F -> ( E ) . , $; F -> ( E ) . , /; F -> ( E ) . , +; F -> ( E ) . , -; F -> ( E ) . , *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 18 | | | | | s6 | | s7 | | | 27 | 9 | {F -> . ( E ), *; F -> . num, *; E -> E - . T, +; T -> . F, /; T -> . T / F, ); F -> . ( E ), ); T -> . T * F, ); T -> . T * F, /; F -> . num, ); T -> . T * F, *; E -> E - . T, -; T -> . F, +; F -> . ( E ), /; T -> . F, -; T -> . T * F, +; T -> . T / F, +; T -> . T / F, /; F -> . ( E ), +; F -> . num, +; T -> . T / F, -; F -> . num, /; T -> . F, *; T -> . T / F, *; T -> . T * F, -; F -> . num, -; F -> . ( E ), -; T -> . F, ); E -> E - . T, )} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 19 | | | | | s6 | | s7 | | | | 28 | {T -> T / . F, +; F -> . ( E ), *; F -> . ( E ), /; T -> T / . F, /; T -> T / . F, *; F -> . num, *; F -> . num, /; F -> . ( E ), ); F -> . num, -; T -> T / . F, -; F -> . num, ); T -> T / . F, ); F -> . ( E ), +; F -> . ( E ), -; F -> . num, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 20 | | | | | s6 | | s7 | | | | 29 | {F -> . ( E ), *; T -> T * . F, /; T -> T * . F, *; F -> . num, /; F -> . num, *; F -> . ( E ), /; F -> . ( E ), ); F -> . num, -; T -> T * . F, ); F -> . num, ); T -> T * . F, -; F -> . ( E ), +; F -> . ( E ), -; T -> T * . F, +; F -> . num, +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 21 | r2 | r2 | s14 | s13 | | | | r2 | | | | {T -> T . / F, -; T -> T . / F, +; E -> E - T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . / F, /; T -> T . / F, *; E -> E - T . , -; T -> T . / F, $; T -> T . * F, *; E -> E - T . , $; T -> T . * F, $; T -> T . * F, -} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 22 | r1 | r1 | s14 | s13 | | | | r1 | | | | {T -> T . / F, -; E -> E + T . , -; E -> E + T . , $; T -> T . / F, +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; T -> T . / F, /; T -> T . / F, *; T -> T . / F, $; T -> T . * F, *; T -> T . * F, $; E -> E + T . , +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 23 | r5 | r5 | r5 | r5 | | | | r5 | | | | {T -> T / F . , $; T -> T / F . , *; T -> T / F . , -; T -> T / F . , +; T -> T / F . , /} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 24 | r4 | r4 | r4 | r4 | | | | r4 | | | | {T -> T * F . , *; T -> T * F . , $; T -> T * F . , -; T -> T * F . , /; T -> T * F . , +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 25 | r8 | r8 | r8 | r8 | | r8 | | | | | | {F -> ( E ) . , /; F -> ( E ) . , ); F -> ( E ) . , +; F -> ( E ) . , -; F -> ( E ) . , *} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 26 | r1 | r1 | s20 | s19 | | r1 | | | | | | {T -> T . / F, -; E -> E + T . , -; T -> T . / F, +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; E -> E + T . , ); T -> T . / F, /; T -> T . / F, *; T -> T . * F, *; E -> E + T . , +; T -> T . * F, ); T -> T . / F, )} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 27 | r2 | r2 | s20 | s19 | | r2 | | | | | | {T -> T . / F, -; T -> T . / F, +; E -> E - T . , +; T -> T . * F, /; T -> T . * F, +; T -> T . * F, -; T -> T . / F, /; T -> T . / F, *; E -> E - T . , ); E -> E - T . , -; T -> T . * F, *; T -> T . * F, ); T -> T . / F, )} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 28 | r5 | r5 | r5 | r5 | | r5 | | | | | | {T -> T / F . , *; T -> T / F . , ); T -> T / F . , -; T -> T / F . , +; T -> T / F . , /} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | 29 | r4 | r4 | r4 | r4 | | r4 | | | | | | {T -> T * F . , *; T -> T * F . , -; T -> T * F . , ); T -> T * F . , /; T -> T * F . , +} | +----+-----+-----+-----+-----+----+-----+-----+-----+----+----+----+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-
Tokenize the input string.
The input string is
3*(6+(4/2)-5)+8
.lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool) tokens = lexer.tokenize('3*(6+(4/2)-5)+8') ptree.pprint(tokens)
+----+--------+-------+ | | SYMBOL | VALUE | +====+========+=======+ | 1 | num | 3 | +----+--------+-------+ | 2 | * | * | +----+--------+-------+ | 3 | ( | ( | +----+--------+-------+ | 4 | num | 6 | +----+--------+-------+ | 5 | + | + | +----+--------+-------+ | 6 | ( | ( | +----+--------+-------+ | 7 | num | 4 | +----+--------+-------+ | 8 | / | / | +----+--------+-------+ | 9 | num | 2 | +----+--------+-------+ | 10 | ) | ) | +----+--------+-------+ | 11 | - | - | +----+--------+-------+ | 12 | num | 5 | +----+--------+-------+ | 13 | ) | ) | +----+--------+-------+ | 14 | + | + | +----+--------+-------+ | 15 | num | 8 | +----+--------+-------+
-
Generate the parse tree.
parse_tree = parser.parse(tokens) ptree.render(parse_tree, directory='out', name='parse-tree', output_format='svg')
-
Get the predefined grammar.
1. S'->S 2. S->CβBA 3. A->Aαβ 4. A->αβ 5. B->C 6. B->Dβ 7. C->α 8. D->α
-
Write a config file (in YAML format).
nonterminal_symbols: # ? name ? A ? B ? C ? D ? S terminal_symbols: # name: regex α: a β: b ignored_symbols: start_symbol: S production_rules: # - left part -> right part - S -> C β B A - A -> A α β - A -> α β - B -> C - B -> D β - C -> α - D -> α
-
Generate the parse table.
import ptree config = ptree.load_config('config.yaml') grammar = ptree.Grammar(config) grammar.init() parser = ptree.Parser(grammar) ptree.pprint(grammar.parse_table)
+----+-----------------+-------------------+-----------------------------------------------------------------------------------------+ | | ACTION | GOTO | STATE | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | | α | β | $ | A | B | C | D | S | | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 0 | s1 | | | | | 3 | | 2 | {C -> . α, β; _S -> . S, $; S -> . C β B A, $} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 1 | | r6 | | | | | | | {C -> α . , β} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 2 | | | acc | | | | | | {_S -> S . , $} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 3 | | s4 | | | | | | | {S -> C . β B A, $} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 4 | s5 | | | | 6 | 8 | 7 | | {C -> . α, α; S -> C β . B A, $; D -> . α, β; B -> . D β, α; B -> . C, α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 5 | r6 | r7 | | | | | | | {C -> α . , α; D -> α . , β} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 6 | s10 | | | 9 | | | | | {A -> . A α β, $; A -> . A α β, α; A -> . α β, $; S -> C β B . A, $; A -> . α β, α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 7 | | s11 | | | | | | | {B -> D . β, α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 8 | r4 | | | | | | | | {B -> C . , α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 9 | s12 | | r1 | | | | | | {A -> A . α β, $; A -> A . α β, α; S -> C β B A . , $} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 10 | | s13 | | | | | | | {A -> α . β, $; A -> α . β, α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 11 | r5 | | | | | | | | {B -> D β . , α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 12 | | s14 | | | | | | | {A -> A α . β, $; A -> A α . β, α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 13 | r3 | | r3 | | | | | | {A -> α β . , $; A -> α β . , α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+ | 14 | r2 | | r2 | | | | | | {A -> A α β . , $; A -> A α β . , α} | +----+-----+-----+-----+---+---+---+---+---+-----------------------------------------------------------------------------------------+
-
Tokenize the input string.
The input string is
abababab
.lexer = ptree.Lexer(config, symbol_pool=grammar.symbol_pool) tokens = lexer.tokenize('abababab') ptree.pprint(tokens)
+---+--------+-------+ | | SYMBOL | VALUE | +===+========+=======+ | 1 | α | a | +---+--------+-------+ | 2 | β | b | +---+--------+-------+ | 3 | α | a | +---+--------+-------+ | 4 | β | b | +---+--------+-------+ | 5 | α | a | +---+--------+-------+ | 6 | β | b | +---+--------+-------+ | 7 | α | a | +---+--------+-------+ | 8 | β | b | +---+--------+-------+
-
Generate the parse tree.
parse_tree = parser.parse(tokens) ptree.render(parse_tree, directory='out', name='parse-tree', output_format='svg')