Python Forum
Write the XML file from ElementTree with hexadecimal encoding
#1
Hi,

I am new to Python development. I am using ElementTree to parse and manipulate an XML file that contains special characters written as hexadecimal character references (for example, &#x00A0;). After the manipulation, I want to write the XML in the same encoding format. When I write the XML, the special characters are output as decimal numbers by default. Is there an option, similar to choosing an encoding such as UTF-8 or ASCII, to write the special characters as hexadecimal values? Please suggest.

Output:
<p aid:pstyle="TextInd" >In preparing the illustrations for this book, I&#x00A0;have relied on the generosity of private archive owners Aleksandr Lavrent&#x2019;ev, Natalia Galadzheva, Sophia Bogatyreva, Elena Radkovskaia, and Aleksandra Radkovskaia, as well as the helpful, considerate service of the staff at the Russian State Archive of Literature and the Arts (RGALI), The State Museum of V.V. Mayakovsky, The State Central Film Museum in Moscow<?AQ AQ:&#x00A0;Is &#x201C;The&#x201D; part of these names? If so cap ok. If not &#x201C;the&#x201D; should be lowercase?>, the U.S. National Library of Medicine, the Harvard University Archives, Cambridge University Library, Columbia University Library, and F.I.L.M. Archives Inc., New&#x00A0;York.</p>
Thanks,
Dillibabu.
#2
Can you show a minimal reproducible example of what you are doing?
#3
import xml.etree.ElementTree as ET
import codecs

tree = ET.parse('D:\master_config.xml')
docXMl = ET.parse('D:\Agency_Contingency.xml')
root = tree.getroot()
docRoot = docXMl.getroot()
xx=root.findall(".//addattandbreak")

print(len(xx))
for item in xx:
myXPath="."+item.attrib['xpath'].replace("\"","'")
print(myXPath)
elemt=docRoot.findall(myXPath)
print(len(elemt))
for docItem in elemt:
docItem.set(item.attrib['attname'], item.attrib['attvalue'])
docXMl.write("D:\med-9780199361335_out.xml")
#4
Obviously it's not reproducible: we don't have the input files. Also, please fix your indentation.
#5
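import xml.etree.ElementTree as ET

# Raw strings keep the backslashes in the Windows paths literal.
tree = ET.parse(r'D:\master_config.xml')
docXMl = ET.parse(r'D:\Agency_Contingency.xml')
root = tree.getroot()
docRoot = docXMl.getroot()

# Each <addattandbreak> rule supplies an XPath plus an attribute name and value.
xx = root.findall(".//addattandbreak")
print(len(xx))

for item in xx:
    # Make the path relative to the document root and use single quotes in predicates.
    myXPath = "." + item.attrib['xpath'].replace("\"", "'")
    print(myXPath)
    elemt = docRoot.findall(myXPath)
    print(len(elemt))
    for docItem in elemt:
        docItem.set(item.attrib['attname'], item.attrib['attvalue'])

# Written with the default encoding, so non-ASCII characters come out as decimal references.
docXMl.write(r"D:\med-9780199361335_out.xml")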
Input:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book SYSTEM "d:\test.dtd">
<book xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
<index xml:id="Ind1">
<info>
<title>Index</title>
</info>
<indexdiv>
<info>
<title>A</title>
</info>
<indexentry xml:id="ind-001"><primaryie>Abbott, Greg <link linkend="pageC4.P62">C4.P62</link></primaryie>
</indexentry>
<indexentry xml:id="ind-002"><primaryie>Abraham, John <linkgroup><link linkend="pageC2.P52">C2.P52</link>–<link linkend="pageC2.P57">C2.P57</link></linkgroup>, <linkgroup><link linkend="pageC2.P61">C2.P61</link>–<link linkend="pageC2.P67">C2.P67</link></linkgroup>, <linkgroup><link linkend="pageC3.P52">C3.P52</link>–<link linkend="pageC3.P54">C3.P54</link></linkgroup></primaryie>
</indexentry></indexdiv>
</index>
</book>


Output:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book SYSTEM "d:\test.dtd">
<book xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
<index xml:id="Ind1">
<info>
<title>Index</title>
</info>
<indexdiv>
<info>
<title>A</title>
</info>
<indexentry xml:id="ind-001"><primaryie>Abbott, Greg <link linkend="pageC4.P62">C4.P62</link></primaryie>
</indexentry>
<indexentry xml:id="ind-002"><primaryie>Abraham, John <linkgroup><link linkend="pageC2.P52">C2.P52</link>&#x2013;<link linkend="pageC2.P57">C2.P57</link></linkgroup>, <linkgroup><link linkend="pageC2.P61">C2.P61</link>&#x2013;<link linkend="pageC2.P67">C2.P67</link></linkgroup>, <linkgroup><link linkend="pageC3.P52">C3.P52</link>&#x2013;<link linkend="pageC3.P54">C3.P54</link></linkgroup></primaryie>
</indexentry></indexdiv>
</index>
</book>
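
One possible workaround, offered as a sketch and not as a built-in ElementTree option: ElementTree.write() defaults to the us-ascii encoding and replaces non-ASCII characters with decimal references, so the idea here is to serialize to a Unicode string first and then substitute hexadecimal references yourself. The names to_hex_charrefs and serialize_with_hex_refs are hypothetical helpers, and the sample element is made up for illustration. Known limits of this approach: the substitution also touches any comments or processing instructions present in the serialized text (where character references are not interpreted), and ElementTree itself does not keep the DOCTYPE and drops comments and processing instructions with its default parser.

```python
import re
import xml.etree.ElementTree as ET

# Matches any single character outside the 7-bit ASCII range.
_NON_ASCII = re.compile(r"[^\x00-\x7F]")


def to_hex_charrefs(text):
    """Replace every non-ASCII character with a reference such as &#x2013;."""
    return _NON_ASCII.sub(lambda m: "&#x{:04X};".format(ord(m.group())), text)


def serialize_with_hex_refs(root):
    """Serialize an element to str, then convert non-ASCII characters (hypothetical helper)."""
    # encoding="unicode" returns a str with the characters left as-is,
    # so the hexadecimal substitution can be applied afterwards.
    xml_text = ET.tostring(root, encoding="unicode")
    return to_hex_charrefs(xml_text)


# Made-up sample containing a no-break space and an en dash.
sample = ET.fromstring("<p>I\u00a0have pages 52\u201357</p>")
result = serialize_with_hex_refs(sample)
print(result)
```

The resulting string contains only ASCII, so it can be written with open(path, "w", encoding="ascii"), adding the XML declaration by hand if it is needed. For a namespaced document like the one above, calling ET.register_namespace before serializing should help keep the original prefixes instead of generated ones.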