1- # pythonparser
1+ # Pythonparser
2+
23[ ![ elena-lyulina] ( https://circleci.com/gh/elena-lyulina/pythonparser/tree/master.svg?style=shield )] ( https://app.circleci.com/pipelines/github/elena-lyulina/pythonparser?branch=master )
34
45This repository contains parsers from ** python code** to ** xml/json** and vice versa.
5- There are parsers for ** python2** (see [ pythonparser] ( src/main/python/pythonparser-2.py ) , source code from [ this] ( https://github.com/GumTreeDiff/pythonparser ) repository) and ** python3** (see [ pythonparser3] ( src/main/python/pythonparser-3.py ) , source code from [ this] ( https://github.com/Varal7/pythonparser ) repository and [ this] ( https://eth-sri.github.io/py150 ) project).
6+ There are parsers for ** python2** (see [ pythonparser] ( src/main/python/pythonparser-2.py ) , source code from
7+ [ GumTreeDiff pythonparser] ( https://github.com/GumTreeDiff/pythonparser ) repository) and
8+ ** python3** (see [ pythonparser3] ( src/main/python/pythonparser-3.py ) , source code from [ pythonparser] ( https://github.com/Varal7/pythonparser ) repository and [ 150k Python Dataset] ( https://eth-sri.github.io/py150 ) project).
69
710We are going to support Python 3.8 in ** python3** parser:
811- [ ] [ the "walrus" operator] ( https://docs.python.org/3/whatsnew/3.8.html#assignment-expressions ) ;
@@ -12,7 +15,7 @@ We are going to support Python 3.8 in **python3** parser:
1215[ Here] ( https://docs.python.org/3/whatsnew/3.8.html ) you can read about all new features that Python 3.8 provides.
1316
1417
15- ## Installation
18+ ### Installation
1619- python2:
1720 ` pip install -r requirements.txt `
1821
@@ -21,14 +24,201 @@ We are going to support Python 3.8 in **python3** parser:
2124- python3 tests:
2225 ` pip3 install -r requirements-test.txt `
2326
24- ## Run parser
27+ ### Run parser
2528- python2:
2629 ` python pythonparser_2 path_to_src_file.py `
2730
2831- python3:
2932 ` python3 pythonparser_3 path_to_src_file.py `
30- ## Run tests for pythonparser_3
33+
34+ To run tests for pythonparser_3:
35+
3136` python3 -m pytest `
3237
33- ## TODO:
34- - add some info about tree format
38+ ### Examples
39+
40+ This section shows several examples of `pythonparser3` output.
41+
42+ <details ><summary >First example</summary >
43+
44+ <p >
45+
46+ ``` python
47+ a = 5
48+ b = 16.5
49+ print (a + b)
50+ ```
51+
52+ </p >
53+
54+ <p >
55+
56+ ``` xml
57+ <Module lineno =" 1" col =" 0" end_line_no =" 3" end_col =" 12" >
58+ <Assign lineno =" 1" col =" 0" end_line_no =" 1" end_col =" 5" >
59+ <Name_Store value =" a" lineno =" 1" col =" 0" end_line_no =" 1" end_col =" 1" >
60+ </Name_Store >
61+ <Constant value =" 5" value_type =" int" lineno =" 1" col =" 4" end_line_no =" 1" end_col =" 5" >
62+ </Constant >
63+ </Assign >
64+ <Assign lineno =" 2" col =" 0" end_line_no =" 2" end_col =" 8" >
65+ <Name_Store value =" b" lineno =" 2" col =" 0" end_line_no =" 2" end_col =" 1" >
66+ </Name_Store >
67+ <Constant value =" 16.5" value_type =" float" lineno =" 2" col =" 4" end_line_no =" 2" end_col =" 8" >
68+ </Constant >
69+ </Assign >
70+ <Expr lineno =" 3" col =" 0" end_line_no =" 3" end_col =" 12" >
71+ <Call lineno =" 3" col =" 0" end_line_no =" 3" end_col =" 12" >
72+ <Name_Load value =" print" lineno =" 3" col =" 0" end_line_no =" 3" end_col =" 5" >
73+ </Name_Load >
74+ <BinOp_Add lineno =" 3" col =" 6" end_line_no =" 3" end_col =" 11" >
75+ <Name_Load value =" a" lineno =" 3" col =" 6" end_line_no =" 3" end_col =" 7" >
76+ </Name_Load >
77+ <Name_Load value =" b" lineno =" 3" col =" 10" end_line_no =" 3" end_col =" 11" >
78+ </Name_Load >
79+ </BinOp_Add >
80+ </Call >
81+ </Expr >
82+ </Module >
83+ ```
84+
85+ </p >
86+
87+ </details >
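
The position attributes in the XML above appear to line up with what Python's own `ast` module reports (`lineno`, `col_offset`, `end_lineno`, `end_col_offset`; the end positions are available since Python 3.8). The sketch below is not part of pythonparser. It uses only the standard library to list those positions for the same three-line program, so they can be compared with the XML by hand. The helper name `positions` is hypothetical.

```python
import ast

# The same program as in the first example.
SOURCE = "a = 5\nb = 16.5\nprint(a + b)\n"


def positions(source):
    """Return (node type, lineno, col, end_lineno, end_col) for every positioned AST node."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        # Module, operator and context nodes carry no position, so skip them.
        if hasattr(node, "lineno"):
            found.append(
                (
                    type(node).__name__,
                    node.lineno,
                    node.col_offset,
                    node.end_lineno,
                    node.end_col_offset,
                )
            )
    return found


for row in positions(SOURCE):
    print(row)
```

Note that the standard `ast.Module` node has no position of its own, so the position shown on the `Module` element in the XML is presumably computed by the parser.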
88+
89+
90+ <details ><summary >Second example</summary >
91+
92+ <p >
93+
94+ ``` python
95+ # Test example
96+
97+ from ast import NodeVisitor
98+
99+
100+ class Example (NodeVisitor ):
101+ def generic_visit (self , node ):
102+ print (type (node).__name__ )
103+ NodeVisitor.generic_visit(self , node)
104+ ```
105+
106+ </p >
107+
108+ <p >
109+
110+ ``` xml
111+ <Module lineno =" 1" col =" 0" end_line_no =" 9" end_col =" 45" >
112+ <ImportFrom value =" ast" lineno =" 3" col =" 0" end_line_no =" 3" end_col =" 27" import_level =" 0" >
113+ <alias value =" NodeVisitor" lineno =" 3" col =" 0" end_line_no =" 3" end_col =" 4" >
114+ </alias >
115+ </ImportFrom >
116+ <ClassDef value =" Example" lineno =" 6" col =" 0" end_line_no =" 9" end_col =" 45" >
117+ <bases lineno =" 6" col =" 0" end_line_no =" 9" end_col =" 45" >
118+ <Name_Load value =" NodeVisitor" lineno =" 6" col =" 14" end_line_no =" 6" end_col =" 25" >
119+ </Name_Load >
120+ </bases >
121+ <keywords lineno =" 6" col =" 0" end_line_no =" 9" end_col =" 45" >
122+ </keywords >
123+ <body lineno =" 6" col =" 0" end_line_no =" 9" end_col =" 45" >
124+ <FunctionDef value =" generic_visit" lineno =" 7" col =" 4" end_line_no =" 9" end_col =" 45" >
125+ <arguments lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
126+ <posonlyargs lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
127+ </posonlyargs >
128+ <args lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
129+ <arg value =" self" lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 26" >
130+ </arg >
131+ <arg value =" node" lineno =" 7" col =" 28" end_line_no =" 7" end_col =" 32" >
132+ </arg >
133+ </args >
134+ <kwonlyargs lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
135+ </kwonlyargs >
136+ <kw_defaults lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
137+ </kw_defaults >
138+ <defaults lineno =" 7" col =" 22" end_line_no =" 7" end_col =" 32" >
139+ </defaults >
140+ </arguments >
141+ <body lineno =" 7" col =" 4" end_line_no =" 9" end_col =" 45" >
142+ <Expr lineno =" 8" col =" 8" end_line_no =" 8" end_col =" 34" >
143+ <Call lineno =" 8" col =" 8" end_line_no =" 8" end_col =" 34" >
144+ <Name_Load value =" print" lineno =" 8" col =" 8" end_line_no =" 8" end_col =" 13" >
145+ </Name_Load >
146+ <Attribute_Load lineno =" 8" col =" 14" end_line_no =" 8" end_col =" 33" >
147+ <Call lineno =" 8" col =" 14" end_line_no =" 8" end_col =" 24" >
148+ <Name_Load value =" type" lineno =" 8" col =" 14" end_line_no =" 8" end_col =" 18" >
149+ </Name_Load >
150+ <Name_Load value =" node" lineno =" 8" col =" 19" end_line_no =" 8" end_col =" 23" >
151+ </Name_Load >
152+ </Call >
153+ <attr value =" __name__" lineno =" 8" col =" 14" end_line_no =" 8" end_col =" 33" >
154+ </attr >
155+ </Attribute_Load >
156+ </Call >
157+ </Expr >
158+ <Expr lineno =" 9" col =" 8" end_line_no =" 9" end_col =" 45" >
159+ <Call lineno =" 9" col =" 8" end_line_no =" 9" end_col =" 45" >
160+ <Attribute_Load lineno =" 9" col =" 8" end_line_no =" 9" end_col =" 33" >
161+ <Name_Load value =" NodeVisitor" lineno =" 9" col =" 8" end_line_no =" 9" end_col =" 19" >
162+ </Name_Load >
163+ <attr value =" generic_visit" lineno =" 9" col =" 8" end_line_no =" 9" end_col =" 33" >
164+ </attr >
165+ </Attribute_Load >
166+ <Name_Load value =" self" lineno =" 9" col =" 34" end_line_no =" 9" end_col =" 38" >
167+ </Name_Load >
168+ <Name_Load value =" node" lineno =" 9" col =" 40" end_line_no =" 9" end_col =" 44" >
169+ </Name_Load >
170+ </Call >
171+ </Expr >
172+ </body >
173+ <decorator_list lineno =" 7" col =" 4" end_line_no =" 9" end_col =" 45" >
174+ </decorator_list >
175+ </FunctionDef >
176+ </body >
177+ <decorator_list lineno =" 6" col =" 0" end_line_no =" 9" end_col =" 45" >
178+ </decorator_list >
179+ </ClassDef >
180+ </Module >
181+ ```
182+
183+ </p >
184+
185+ </details >
186+
187+ ### Tree format
188+
189+ This section describes the format of the tree that pythonparser-3 produces.
190+
191+ The produced tree is a valid XML document. Each element in the document corresponds to a node
192+ of the Python AST. Several details of the format are worth noting:
193+ 1. Operations are included directly in the node tag, after an underscore (`_`).
194+
195+ <details ><summary >Example</summary >
196+
197+ <p >
198+
199+ A node with the `BinOp_Add` tag is a `BinOp` (binary operation) node
200+ whose operation is addition.
201+
202+ </p >
203+
204+ </details >
205+ 2. The [expression context](https://greentreesnakes.readthedocs.io/en/latest/nodes.html#Load)
206+ is included directly in the node tag, after an underscore (`_`).
207+
208+ <details><summary>Example</summary>
209+
210+ <p>
211+
212+ A node with the `Name_Load` tag is a `Name` node
213+ whose context is `Load`, which means that we "load" (read) the
214+ value held by that `Name` node.
215+
216+ </p>
217+
218+ </details>
219+ 3. The attributes `lineno`, `col`, `end_line_no`, and `end_col` give the position of the token in the source code.
220+ 4. Nodes that represent constants (`Constant`, `Num`, `Str`) have
221+ a `value_type` attribute, which stores the type of the constant.
222+ 5. The `ImportFrom` node has an `import_level` attribute, which stores the integer
223+ [level of the import](https://greentreesnakes.readthedocs.io/en/latest/nodes.html#ImportFrom).
224+
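
The points above can be illustrated with a small reader for the XML. This is a minimal sketch that assumes only the standard library's `xml.etree.ElementTree`; it is not part of pythonparser, and the names `split_tag` and `describe` are hypothetical. It also assumes, based only on the examples in this README, that tags starting with an uppercase letter are AST node types (which may carry an `_Operation` or `_Context` suffix), while lowercase tags such as `decorator_list` are field names and are left whole.

```python
import xml.etree.ElementTree as ET

# XML from the first example, with whitespace normalized.
SAMPLE = """
<Module lineno="1" col="0" end_line_no="3" end_col="12">
  <Assign lineno="1" col="0" end_line_no="1" end_col="5">
    <Name_Store value="a" lineno="1" col="0" end_line_no="1" end_col="1"></Name_Store>
    <Constant value="5" value_type="int" lineno="1" col="4" end_line_no="1" end_col="5"></Constant>
  </Assign>
  <Assign lineno="2" col="0" end_line_no="2" end_col="8">
    <Name_Store value="b" lineno="2" col="0" end_line_no="2" end_col="1"></Name_Store>
    <Constant value="16.5" value_type="float" lineno="2" col="4" end_line_no="2" end_col="8"></Constant>
  </Assign>
  <Expr lineno="3" col="0" end_line_no="3" end_col="12">
    <Call lineno="3" col="0" end_line_no="3" end_col="12">
      <Name_Load value="print" lineno="3" col="0" end_line_no="3" end_col="5"></Name_Load>
      <BinOp_Add lineno="3" col="6" end_line_no="3" end_col="11">
        <Name_Load value="a" lineno="3" col="6" end_line_no="3" end_col="7"></Name_Load>
        <Name_Load value="b" lineno="3" col="10" end_line_no="3" end_col="11"></Name_Load>
      </BinOp_Add>
    </Call>
  </Expr>
</Module>
"""


def split_tag(tag):
    """Split 'BinOp_Add' into ('BinOp', 'Add'); return (tag, None) when there is no suffix.

    Assumption: only tags that start with an uppercase letter carry a suffix.
    """
    if not tag[:1].isupper():
        return tag, None
    base, sep, suffix = tag.partition("_")
    return (base, suffix) if sep else (tag, None)


def describe(xml_text):
    """Return (type, suffix, value, value_type, start, end) for each element, in document order."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for elem in root.iter():
        node_type, suffix = split_tag(elem.tag)
        start = (int(elem.get("lineno")), int(elem.get("col")))
        end = (int(elem.get("end_line_no")), int(elem.get("end_col")))
        rows.append((node_type, suffix, elem.get("value"), elem.get("value_type"), start, end))
    return rows


for row in describe(SAMPLE):
    print(row)
```

The suffix is returned separately so that a consumer can treat every `Name` node alike, whatever its context, and still recover the context or operation when it matters.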