bpo-30117: fix lib2to3 ParserIdempotency test#1242

Merged
benjaminp merged 3 commits into python:master from appeltel:fix-parser-idem-test on Jan 30, 2018
Conversation

@appeltel

@appeltel appeltel commented Apr 21, 2017

Contributor

Fix two (in my opinion) spurious failure conditions in the lib2to3.tests.test_parser.TestParserIdempotency test.

  1. Use the same encoding found in the initial file to write a temp file for a diff. This retains the BOM if the encoding was initially utf-8-sig.

  2. If the file cannot be parsed using the normal grammar, try again with the grammar that has no print statement, which should succeed for valid files that use from __future__ import print_function.

For case (1), the driver was correctly handling a BOM in a utf-8 file, but then the test was not writing a comparison file using 'utf-8-sig' to diff against, so the BOM got removed. I don't think that is the fault of the parser, and lib2to3 will retain the BOM.
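As a rough illustration of case (1), here is a minimal sketch that uses only the standard library. It is not the code from this PR: the helper names `detect_source_encoding` and `roundtrip_bytes` are hypothetical, and the sketch only shows that writing the comparison file with the detected encoding keeps a leading BOM, assuming the encoding is detected with `tokenize.detect_encoding`.

```python
import os
import tempfile
import tokenize


def detect_source_encoding(path):
    """Return the encoding that tokenize detects for a Python source file.

    tokenize.detect_encoding reports 'utf-8-sig' when the file starts
    with a UTF-8 BOM.
    """
    with open(path, "rb") as fp:
        encoding, _ = tokenize.detect_encoding(fp.readline)
    return encoding


def roundtrip_bytes(path):
    """Read a file, rewrite it with the same encoding, return the new bytes.

    Hypothetical helper: it stands in for "write a temp file for a diff".
    """
    encoding = detect_source_encoding(path)
    # newline="" disables newline translation on both read and write.
    with open(path, "r", encoding=encoding, newline="") as fp:
        text = fp.read()
    fd, copy_path = tempfile.mkstemp(suffix=".py")
    os.close(fd)
    try:
        # Writing with 'utf-8-sig' re-emits the BOM that reading stripped.
        with open(copy_path, "w", encoding=encoding, newline="") as fp:
            fp.write(text)
        with open(copy_path, "rb") as fp:
            return fp.read()
    finally:
        os.remove(copy_path)
```

If the copy were written with plain 'utf-8' instead, the BOM would be missing from the copy and a byte-level diff against the original would report a difference.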

For case (2), lib2to3 pre-detects the use of from __future__ import print_function, or allows the user to force this interpretation with a -p flag, and then selects a different grammar with the print statement removed. That makes such files unfair cases for this test, as the driver itself doesn't know which grammar to use. As a minimal fix, the test will try a grammar with the print statement and, if that fails, fall back on a grammar without it. A more thorough handling of the idempotency test would be to parse all files using both grammars, ignoring a failure from one of the two but otherwise checking both. I didn't think this was necessary, but I can change it.
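The fallback in case (2) can be sketched roughly as follows. This is an illustration only, not the test code from this PR: `ParseFailure`, `parse_with_fallback`, and the two toy parsers are hypothetical stand-ins so that the example runs without lib2to3. In lib2to3 itself the two grammars would presumably be `pygram.python_grammar` and `pygram.python_grammar_no_print_statement`.

```python
class ParseFailure(Exception):
    """Hypothetical stand-in for the parser's own error type."""


def parse_with_fallback(source, parsers):
    """Try each parser in order and return the first successful result.

    If every parser rejects the source, re-raise the last failure.
    """
    last_error = None
    for parse in parsers:
        try:
            return parse(source)
        except ParseFailure as err:
            last_error = err
    if last_error is None:
        raise ValueError("no parsers supplied")
    raise last_error


def toy_print_statement_parser(source):
    """Toy stand-in for a grammar with the print statement.

    It rejects keyword arguments to print, which are only valid when
    print is a function.
    """
    if "file=" in source:
        raise ParseFailure("keyword argument in print statement")
    return ("print-statement grammar", source)


def toy_print_function_parser(source):
    """Toy stand-in for a grammar with the print statement removed."""
    if source.lstrip().startswith("print "):
        raise ParseFailure("print statement not allowed")
    return ("print-function grammar", source)
```

Calling `parse_with_fallback(source, [toy_print_statement_parser, toy_print_function_parser])` mirrors the order described above: the grammar with the print statement is tried first, and the one without it is used only if the first attempt fails.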

bpo-30117

https://bugs.python.org/issue30117

@mention-bot


@appeltel, thanks for your PR! By analyzing the history of the files in this pull request, we identified @loewis, @benjaminp and @warsaw to be potential reviewers.

Fix two spurious failure conditions in the
lib2to3.tests.test_parser.TestParserIdempotency
test_parser test.

1. Use the same encoding found in the initial file
   to write a temp file for a diff. This retains the BOM
   if the encoding was initially utf-8-sig.

2. If the file cannot be parsed using the normal grammar,
   try again with no print statement which should
   succeed for valid files using __future__ print_function

Force newline character to be used for linebreaks in
generated file for lib2to3 parser idempotency tests. This
prevents spurious diff failures in OS environments where
other linebreak characters are used.
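
A minimal sketch of the newline handling described in this commit message, using only the standard library. The helper name `write_with_lf` is hypothetical and this is not the test code itself. It shows that passing `newline="\n"` to `open` makes the written file use "\n" line endings regardless of the platform default.

```python
import os
import tempfile


def write_with_lf(text, encoding="utf-8"):
    """Write text to a temp file with forced "\\n" line endings.

    Returns the raw bytes that ended up on disk.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\n" turns off translation of "\n" to os.linesep on write.
        with open(path, "w", encoding=encoding, newline="\n") as fp:
            fp.write(text)
        with open(path, "rb") as fp:
            return fp.read()
    finally:
        os.remove(path)
```

With the default `newline=None`, the same write on Windows would produce "\r\n" line endings, which is the kind of spurious diff the commit message refers to.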
@benjaminp benjaminp force-pushed the fix-parser-idem-test branch from 56b6674 to cceff18 Compare January 30, 2018 06:12
@miss-islington

Contributor

Thanks @appeltel for the PR, and @benjaminp for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6.
🐍🍒⛏🤖

@miss-islington

Contributor

Sorry, @appeltel and @benjaminp, I could not cleanly backport this to 3.6 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 14e976e00e65bf343ba0fca016c3c9132a843daf 3.6

benjaminp pushed a commit that referenced this pull request Jan 30, 2018
(cherry picked from commit 14e976e)
@bedevere-bot


GH-5443 is a backport of this pull request to the 3.6 branch.

benjaminp added a commit that referenced this pull request Jan 30, 2018
…H-5443)

(cherry picked from commit 14e976e)
7 participants