
Parsing CDATA in XML with Python

I need to parse an XML file that contains a number of CDATA blocks, and I need to retain their contents for later plotting.

Solution 1:

Here are two examples of how to do it:

from lxml import etree
import xml.etree.ElementTree as ElementTree

CONTENT = """
<process id="process1"><log name="name1" device="device1"><![CDATA[timestamp value]]></log><log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log></process>
"""

def parse_with_lxml():
    # lxml: select every <log> element with XPath
    root = etree.fromstring(CONTENT)
    for log in root.xpath("//log"):
        print(log.text)

def parse_with_stdlib():
    # Standard library: walk the tree and visit every <log> element
    root = ElementTree.fromstring(CONTENT)
    for log in root.iter('log'):
        print(log.text)

if __name__ == '__main__':
    parse_with_lxml()
    parse_with_stdlib()

Output:

timestamp value
timestamp value, timestamp value, timestamp
timestamp value
timestamp value, timestamp value, timestamp

In both cases, the element's text attribute returns the contents of the CDATA section as an ordinary string.
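
Since the CDATA contents arrive as a plain string, getting them ready for plotting is mostly a matter of splitting that string. The following is a minimal sketch using only the standard library. It assumes each CDATA block holds comma-separated "timestamp value" pairs with numeric fields, which the question does not confirm; the sample data and the parse_logs helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the numeric "timestamp value" layout is an assumption.
SAMPLE = """
<process id="process1">
  <log name="name1" device="device1"><![CDATA[1 10.5]]></log>
  <log name="name2" device="device2"><![CDATA[1 20.0, 2 21.5, 3 19.0]]></log>
</process>
"""

def parse_logs(xml_text):
    """Return {log name: [(timestamp, value), ...]} built from each CDATA block."""
    root = ET.fromstring(xml_text)
    series = {}
    for log in root.iter("log"):
        pairs = []
        # log.text is the CDATA content; it is None for an empty element.
        for chunk in (log.text or "").split(","):
            fields = chunk.split()
            if len(fields) != 2:
                continue  # skip incomplete entries, e.g. a timestamp with no value
            pairs.append((float(fields[0]), float(fields[1])))
        series[log.get("name")] = pairs
    return series

if __name__ == "__main__":
    for name, pairs in parse_logs(SAMPLE).items():
        print(name, pairs)
```

Keying the result by the name attribute keeps each log's series separate, so each one can later be passed to a plotting library on its own.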
