Channel: Active questions tagged utf-8 - Stack Overflow

XML file from ISO-8859-2 to UTF-8 in Python


I need help resolving what seems to be an encoding issue.

I have a lot of input files that follow the same pattern as the one below:

<?xml version='1.0' encoding='iso-8859-1'?><root><Module name="ModuleName"><Parameter Value="Data01$|Data02F1F5$|Data03:$|Data04 : $|"/></Module></root>

I need to be able to parse the file, but it contains a lot of special characters, as you can see below:

[Image: screenshot of the file showing the special characters]

I can't use lxml or Beautiful Soup.

I tried the different options below, but none of them solved the problem:

from xml.etree import ElementTree

file = 'StackOverflow.xml'

# Attempt 1: read the file as ISO-8859-1 and rewrite it as UTF-8
with open(file, 'r', encoding='iso-8859-1') as f:
    string = f.read()
    print(string)
with open(file, 'w', encoding='utf-8') as f:
    f.write(string)

# Attempt 2: parse the raw bytes and let ElementTree write UTF-8
with open(file, 'rb') as f:
    root = ElementTree.fromstring(f.read())
tree = ElementTree.ElementTree(root)
tree.write(file, encoding='utf-8', xml_declaration=True)

# Attempt 3: force the parser encoding, then serialize
with open(file, 'rb') as f:
    parser = etree.XMLParser(encoding="iso-8859-1")
    root = etree.parse(f, parser)
string = etree.tostring(root, xml_declaration=True, encoding="utf-8").decode('utf-8').encode('iso-8859-1')
with open('file', 'wb') as f:
    target.write(string)
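For reference, here is a minimal sketch of one way the conversion could be done with the standard library only. It rests on an assumption the question does not confirm: that the "special characters" are control characters that XML 1.0 forbids, which would explain why the parser rejects the file. The names `convert_to_utf8` and `convert_file` are hypothetical helpers, and dropping the characters loses data, so replace them with a placeholder instead if they carry meaning. `xml_declaration` in `ElementTree.tostring` needs Python 3.8 or later.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the problem characters are C0 control characters, which XML 1.0
# does not allow (everything below 0x20 except tab, line feed, carriage return).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# Leading XML declaration, e.g. <?xml version='1.0' encoding='iso-8859-1'?>
_XML_DECLARATION = re.compile(r"^\s*<\?xml[^>]*\?>")


def convert_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Decode raw XML bytes, drop illegal characters, re-serialize as UTF-8."""
    text = raw.decode(source_encoding)
    # Remove the old declaration so it cannot contradict the already decoded text.
    text = _XML_DECLARATION.sub("", text, count=1)
    # Drop the characters the XML parser would reject.
    text = _ILLEGAL_XML_CHARS.sub("", text)
    root = ET.fromstring(text)
    # Serialize with a fresh declaration that announces UTF-8.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def convert_file(path: str, source_encoding: str = "iso-8859-1") -> None:
    """Hypothetical helper: convert one file in place."""
    with open(path, "rb") as f:
        raw = f.read()
    converted = convert_to_utf8(raw, source_encoding)
    with open(path, "wb") as f:
        f.write(converted)
```

Calling `convert_file('StackOverflow.xml')` would rewrite the file in place, so it is worth trying on a copy first. If the files are really ISO-8859-2, as the title suggests, pass `source_encoding="iso-8859-2"` instead.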
