Commit 78948fe

Merge pull request #5 from elena-lyulina/inverse-parser
Added CLI for inverse parser
2 parents cb4842e + 0e0934e commit 78948fe

2 files changed

Lines changed: 221 additions & 8 deletions

README.md

Lines changed: 197 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -1,8 +1,11 @@
1-
# pythonparser
1+
# Pythonparser
2+
23
[![elena-lyulina](https://circleci.com/gh/elena-lyulina/pythonparser/tree/master.svg?style=shield)](https://app.circleci.com/pipelines/github/elena-lyulina/pythonparser?branch=master)
34

45
This repository contains parsers from **python code** to **xml/json** and vice versa.
5-
There are parsers for **python2** (see [pythonparser](src/main/python/pythonparser-2.py), source code from [this](https://github.com/GumTreeDiff/pythonparser) repository) and **python3** (see [pythonparser3](src/main/python/pythonparser-3.py), source code from [this](https://github.com/Varal7/pythonparser) repository and [this](https://eth-sri.github.io/py150) project).
6+
There are parsers for **python2** (see [pythonparser](src/main/python/pythonparser-2.py), source code from
7+
[GumTreeDiff pythonparser](https://github.com/GumTreeDiff/pythonparser) repository) and
8+
**python3** (see [pythonparser3](src/main/python/pythonparser-3.py), source code from [pythonparser](https://github.com/Varal7/pythonparser) repository and [150k Python Dataset](https://eth-sri.github.io/py150) project).
69

710
We are going to support Python 3.8 in **python3** parser:
811
- [ ] [the "walrus" operator](https://docs.python.org/3/whatsnew/3.8.html#assignment-expressions);
@@ -12,7 +15,7 @@ We are going to support Python 3.8 in **python3** parser:
1215
[Here](https://docs.python.org/3/whatsnew/3.8.html) you can read about all new features that Python 3.8 provides.
1316

1417

15-
## Installation
18+
### Installation
1619
- python2:
1720
    `pip install -r requirements.txt`
1821

@@ -21,14 +24,201 @@ We are going to support Python 3.8 in **python3** parser:
2124
- python3 tests:
2225
`pip3 install -r requirements-test.txt`
2326

24-
## Run parser
27+
### Run parser
2528
- python2:
2629
    `python pythonparser_2 path_to_src_file.py`
2730

2831
- python3:
2932
    `python3 pythonparser_3 path_to_src_file.py`
30-
## Run tests for pythonparser_3
33+
34+
To run tests for pythonparser_3:
35+
3136
`python3 -m pytest`
3237

33-
## TODO:
34-
- add some info about tree format
38+
### Examples
39+
40+
This section shows several examples of `pythonparser3` output.
41+
42+
<details><summary>First example</summary>
43+
44+
<p>
45+
46+
``` python
47+
a = 5
48+
b = 16.5
49+
print(a + b)
50+
```
51+
52+
</p>
53+
54+
<p>
55+
56+
``` xml
57+
<Module lineno="1" col="0" end_line_no="3" end_col="12">
58+
<Assign lineno="1" col="0" end_line_no="1" end_col="5">
59+
<Name_Store value="a" lineno="1" col="0" end_line_no="1" end_col="1">
60+
</Name_Store>
61+
<Constant value="5" value_type="int" lineno="1" col="4" end_line_no="1" end_col="5">
62+
</Constant>
63+
</Assign>
64+
<Assign lineno="2" col="0" end_line_no="2" end_col="8">
65+
<Name_Store value="b" lineno="2" col="0" end_line_no="2" end_col="1">
66+
</Name_Store>
67+
<Constant value="16.5" value_type="float" lineno="2" col="4" end_line_no="2" end_col="8">
68+
</Constant>
69+
</Assign>
70+
<Expr lineno="3" col="0" end_line_no="3" end_col="12">
71+
<Call lineno="3" col="0" end_line_no="3" end_col="12">
72+
<Name_Load value="print" lineno="3" col="0" end_line_no="3" end_col="5">
73+
</Name_Load>
74+
<BinOp_Add lineno="3" col="6" end_line_no="3" end_col="11">
75+
<Name_Load value="a" lineno="3" col="6" end_line_no="3" end_col="7">
76+
</Name_Load>
77+
<Name_Load value="b" lineno="3" col="10" end_line_no="3" end_col="11">
78+
</Name_Load>
79+
</BinOp_Add>
80+
</Call>
81+
</Expr>
82+
</Module>
83+
```
84+
85+
</p>
86+
87+
</details>
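
The position attributes in the first example can be cross-checked against Python's own `ast` module. The sketch below assumes (it is not confirmed by this commit) that `lineno`, `col`, `end_line_no`, and `end_col` mirror the standard `lineno`, `col_offset`, `end_lineno`, and `end_col_offset` fields, which are available from Python 3.8. The helper name `position` is hypothetical.

```python
import ast

# Source of the first example above
SOURCE = "a = 5\nb = 16.5\nprint(a + b)\n"


def position(node):
    """Return (lineno, col, end_line_no, end_col) for an AST node (hypothetical helper)."""
    return (node.lineno, node.col_offset, node.end_lineno, node.end_col_offset)


tree = ast.parse(SOURCE)

first_assign = tree.body[0]          # a = 5
bin_op = tree.body[2].value.args[0]  # a + b inside print(...)

print(type(first_assign).__name__, position(first_assign))
print(type(bin_op).__name__, position(bin_op))
```

For these two nodes the tuples agree with the `Assign` and `BinOp_Add` attributes in the XML above, which suggests the XML positions are taken directly from the AST.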
88+
89+
90+
<details><summary>Second example</summary>
91+
92+
<p>
93+
94+
``` python
95+
# Test example
96+
97+
from ast import NodeVisitor
98+
99+
100+
class Example(NodeVisitor):
101+
    def generic_visit(self, node):
102+
        print(type(node).__name__)
103+
        NodeVisitor.generic_visit(self, node)
104+
```
105+
106+
</p>
107+
108+
<p>
109+
110+
``` xml
111+
<Module lineno="1" col="0" end_line_no="9" end_col="45">
112+
<ImportFrom value="ast" lineno="3" col="0" end_line_no="3" end_col="27" import_level="0">
113+
<alias value="NodeVisitor" lineno="3" col="0" end_line_no="3" end_col="4">
114+
</alias>
115+
</ImportFrom>
116+
<ClassDef value="Example" lineno="6" col="0" end_line_no="9" end_col="45">
117+
<bases lineno="6" col="0" end_line_no="9" end_col="45">
118+
<Name_Load value="NodeVisitor" lineno="6" col="14" end_line_no="6" end_col="25">
119+
</Name_Load>
120+
</bases>
121+
<keywords lineno="6" col="0" end_line_no="9" end_col="45">
122+
</keywords>
123+
<body lineno="6" col="0" end_line_no="9" end_col="45">
124+
<FunctionDef value="generic_visit" lineno="7" col="4" end_line_no="9" end_col="45">
125+
<arguments lineno="7" col="22" end_line_no="7" end_col="32">
126+
<posonlyargs lineno="7" col="22" end_line_no="7" end_col="32">
127+
</posonlyargs>
128+
<args lineno="7" col="22" end_line_no="7" end_col="32">
129+
<arg value="self" lineno="7" col="22" end_line_no="7" end_col="26">
130+
</arg>
131+
<arg value="node" lineno="7" col="28" end_line_no="7" end_col="32">
132+
</arg>
133+
</args>
134+
<kwonlyargs lineno="7" col="22" end_line_no="7" end_col="32">
135+
</kwonlyargs>
136+
<kw_defaults lineno="7" col="22" end_line_no="7" end_col="32">
137+
</kw_defaults>
138+
<defaults lineno="7" col="22" end_line_no="7" end_col="32">
139+
</defaults>
140+
</arguments>
141+
<body lineno="7" col="4" end_line_no="9" end_col="45">
142+
<Expr lineno="8" col="8" end_line_no="8" end_col="34">
143+
<Call lineno="8" col="8" end_line_no="8" end_col="34">
144+
<Name_Load value="print" lineno="8" col="8" end_line_no="8" end_col="13">
145+
</Name_Load>
146+
<Attribute_Load lineno="8" col="14" end_line_no="8" end_col="33">
147+
<Call lineno="8" col="14" end_line_no="8" end_col="24">
148+
<Name_Load value="type" lineno="8" col="14" end_line_no="8" end_col="18">
149+
</Name_Load>
150+
<Name_Load value="node" lineno="8" col="19" end_line_no="8" end_col="23">
151+
</Name_Load>
152+
</Call>
153+
<attr value="__name__" lineno="8" col="14" end_line_no="8" end_col="33">
154+
</attr>
155+
</Attribute_Load>
156+
</Call>
157+
</Expr>
158+
<Expr lineno="9" col="8" end_line_no="9" end_col="45">
159+
<Call lineno="9" col="8" end_line_no="9" end_col="45">
160+
<Attribute_Load lineno="9" col="8" end_line_no="9" end_col="33">
161+
<Name_Load value="NodeVisitor" lineno="9" col="8" end_line_no="9" end_col="19">
162+
</Name_Load>
163+
<attr value="generic_visit" lineno="9" col="8" end_line_no="9" end_col="33">
164+
</attr>
165+
</Attribute_Load>
166+
<Name_Load value="self" lineno="9" col="34" end_line_no="9" end_col="38">
167+
</Name_Load>
168+
<Name_Load value="node" lineno="9" col="40" end_line_no="9" end_col="44">
169+
</Name_Load>
170+
</Call>
171+
</Expr>
172+
</body>
173+
<decorator_list lineno="7" col="4" end_line_no="9" end_col="45">
174+
</decorator_list>
175+
</FunctionDef>
176+
</body>
177+
<decorator_list lineno="6" col="0" end_line_no="9" end_col="45">
178+
</decorator_list>
179+
</ClassDef>
180+
</Module>
181+
```
182+
183+
</p>
184+
185+
</details>
186+
187+
### Tree format
188+
189+
This section describes the format of the tree that pythonparser-3 produces.
190+
191+
The produced tree is a valid XML document. Each node in the document corresponds to a node
192+
of the Python AST. Several nuances of the format are worth noting:
193+
1. Operations are included directly in the node tag, after the underscore character (`_`).
194+
195+
<details><summary>Example</summary>
196+
197+
<p>
198+
199+
A node with the `BinOp_Add` tag is a `BinOp` (binary operation) node
200+
whose operation is addition.
201+
202+
</p>
203+
204+
</details>
205+
2. The [expression context](https://greentreesnakes.readthedocs.io/en/latest/nodes.html#Load)
206+
is also included directly in the node tag, after the underscore character (`_`).
207+
208+
<details><summary>Example</summary>
209+
210+
<p>
211+
212+
A node with the `Name_Load` tag is a `Name` node
213+
whose context is `Load`, which means that we "load" or "read" the
214+
content held by the `Name` node.
215+
216+
</p>
217+
218+
</details>
219+
3. The attributes `lineno`, `col`, `end_line_no`, and `end_col` give the position of the token in the source code.
220+
4. Nodes that represent constants (`Constant`, `Num`, `Str`) have
221+
a `value_type` attribute, which stores the type of the constant.
222+
5. The `ImportFrom` node has an `import_level` attribute, which stores the integer
223+
[level of import](https://greentreesnakes.readthedocs.io/en/latest/nodes.html#ImportFrom).
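
The tag conventions above can be read back with the standard library. This is a minimal sketch, not the repository's inverse parser: `split_tag` and `describe` are hypothetical helpers, and the rule of splitting only tags that start with an uppercase letter is an assumption made so that field containers such as `kw_defaults` or `decorator_list` stay whole.

```python
import xml.etree.ElementTree as ET

# Fragment of the first example's output
XML_SNIPPET = """
<Expr lineno="3" col="0" end_line_no="3" end_col="12">
  <Call lineno="3" col="0" end_line_no="3" end_col="12">
    <Name_Load value="print" lineno="3" col="0" end_line_no="3" end_col="5">
    </Name_Load>
    <BinOp_Add lineno="3" col="6" end_line_no="3" end_col="11">
      <Name_Load value="a" lineno="3" col="6" end_line_no="3" end_col="7">
      </Name_Load>
      <Name_Load value="b" lineno="3" col="10" end_line_no="3" end_col="11">
      </Name_Load>
    </BinOp_Add>
  </Call>
</Expr>
"""


def split_tag(tag):
    """Split 'BinOp_Add' into ('BinOp', 'Add'); leave other tags whole."""
    # Assumption: only AST node tags (capitalized) carry an operation or context suffix
    if tag[0].isupper() and "_" in tag:
        node_type, _, suffix = tag.partition("_")
        return node_type, suffix
    return tag, None


def describe(element):
    """Return (node type, operation or context, position tuple) for one XML node."""
    node_type, suffix = split_tag(element.tag)
    keys = ("lineno", "col", "end_line_no", "end_col")
    position = tuple(int(element.get(key)) for key in keys)
    return node_type, suffix, position


root = ET.fromstring(XML_SNIPPET.strip())
described = [describe(element) for element in root.iter()]
for row in described:
    print(row)
```

Each printed row pairs the plain AST node name with its operation or expression context, so `BinOp_Add` is reported as `BinOp` with `Add` and `Name_Load` as `Name` with `Load`.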
224+

src/main/python/inverse_parser/inverse_parser_3.py

100644 → 100755
Lines changed: 24 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,8 @@
1+
#!/usr/bin/env python3
2+
13
# Copyright (c) Aniskov N.
24

5+
import argparse
36
import ast
47
import logging
58
import xml.etree.ElementTree as ET
@@ -8,7 +11,7 @@
811

912
from src.main.python.inverse_parser._node_restorer import _NodeRestorer
1013
from src.main.util.const import LOGGER_NAME
11-
from src.main.util.file_util import get_content_from_file
14+
from src.main.util.file_util import get_content_from_file, create_file
1215
from src.main.util.log_util import log_and_raise_error
1316

1417
logger = logging.getLogger(LOGGER_NAME)
@@ -66,3 +69,23 @@ def get_xml_ast(self) -> ET.Element:
6669
    def __init_xml_ast(self, filename: str) -> None:
6770
        xml_str = get_content_from_file(filename, to_strip_nl=False)
6871
        self.xml_ast_ = ET.fromstring(xml_str)
72+
73+
74+
if __name__ == '__main__':
75+
    parser = argparse.ArgumentParser(description='Parse AST of python3 code in XML format to source code')
76+
    parser.add_argument('filename', type=str, help='XML file with AST')
77+
    parser.add_argument('-o', '--output_file',
78+
                        help='The name of the file '
79+
                             'where the generated code will be written'
80+
                        )
81+
82+
    args = parser.parse_args()
83+
84+
    inverse_parser = InverseParser(args.filename)
85+
    gen_src = inverse_parser.get_source()
86+
87+
    if args.output_file is not None:
88+
        create_file(gen_src, args.output_file)
89+
90+
    else:
91+
        print(gen_src)
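
The argument handling added here can be tried in isolation. The sketch below copies the two `add_argument` calls from the diff into a hypothetical `build_parser` function and feeds it an explicit argument list. The file names `ast.xml` and `restored.py` are placeholders, and `InverseParser` itself is not invoked.

```python
import argparse


def build_parser():
    """Rebuild the CLI parser from the diff (hypothetical wrapper function)."""
    parser = argparse.ArgumentParser(
        description='Parse AST of python3 code in XML format to source code')
    parser.add_argument('filename', type=str, help='XML file with AST')
    parser.add_argument('-o', '--output_file',
                        help='The name of the file '
                             'where the generated code will be written')
    return parser


# Passing a list avoids reading sys.argv, so this runs anywhere
args = build_parser().parse_args(['ast.xml', '-o', 'restored.py'])
print(args.filename, args.output_file)

# Without -o the attribute is None, which is the branch that prints to stdout
default_args = build_parser().parse_args(['ast.xml'])
print(default_args.output_file)
```

Because `--output_file` has no default, omitting it leaves `args.output_file` as `None`, which is what the `is not None` check in the new `__main__` block relies on.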
