Skip to content

bpo-37374: Do not escape quotes in minidom inside text segments#14312

Closed
mitar wants to merge 3 commits into
python:mainfrom
mitar:fix-issue-37374
Closed

bpo-37374: Do not escape quotes in minidom inside text segments#14312
mitar wants to merge 3 commits into
python:mainfrom
mitar:fix-issue-37374

Conversation

@mitar

@mitar mitar commented Jun 22, 2019

Copy link
Copy Markdown

@mangrisano

Copy link
Copy Markdown
Contributor

Hi and thank you for the pull request.
Just one question: did you test the change?

@mitar

mitar commented Jun 22, 2019

Copy link
Copy Markdown
Author

Yes, I tested by running:

from xml.etree import ElementTree
text = ElementTree.Element('text')
text.text = 'f&oo"b<a>r'
xml_string = ElementTree.tostring(text)
import xml.dom.minidom as minidom
xml_tree = minidom.parseString(xml_string)
output = xml_tree.toprettyxml(indent='  ')
assert output.splitlines()[1] == xml_string.decode('utf8')
print(output)

which now returns:

<?xml version="1.0" ?>
<text>f&amp;oo"b&lt;a&gt;r</text>

I also ran whole Python test suite.

@mangrisano

Copy link
Copy Markdown
Contributor

Perhaps you should provide it in the test_xml module.
I'm not a core-dev so feel free to not follow my suggestion. :)

@mitar

mitar commented Jun 23, 2019

Copy link
Copy Markdown
Author

You are right. I should add the test.

This PR in fact makes minidom behave exactly the same as ElementTree.tostring. ElementTree.tostring is also not escaping quotes.

@mitar

mitar commented Oct 21, 2021

Copy link
Copy Markdown
Author

I have added tests.

@scoder scoder left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. However, I've rejected the ticket, so I have to reject the PR, too.

I'll leave my review comments anyway, in case it gets reconsidered at some later point.

Comment thread Lib/test/test_minidom.py
def testQuoteEscape(self):
text = ElementTree.Element('text')
text.text = 'f&oo"b<a>r'
xml_string = ElementTree.tostring(text)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not rely on an unrelated library providing the same text string serialisation. Instead, the test should use a plain string to compare against. Otherwise, changes in ElementTree could make this test fail without need.

Comment thread Lib/xml/dom/minidom.py
Comment on lines 1119 to +1120
def writexml(self, writer, indent="", addindent="", newl=""):
_write_data(writer, "%s%s%s" % (indent, self.data, newl))
_write_text_data(writer, "%s%s%s" % (indent, self.data, newl))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked deeply, but my gut feeling would prefer a special case for attribute values, not for text content. Can't say if that's similarly easy to achieve.

@scoder scoder closed this Aug 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants