ElementTree Displaying Elements Out of Order

I'm using Python's ElementTree to parse XML files. I have a 'findall' to find all 'revision' subelements, but when I iterate through the result, they are not in document order. Wha

Solution 1:

The documentation for ElementTree says that findall returns the elements in document order.

A quick test shows the correct behaviour:

import xml.etree.ElementTree as et

xmltext = """
<root>
    <number>1</number>
    <number>2</number>
    <number>3</number>
    <number>4</number>
</root>
"""

tree = et.fromstring(xmltext)

for number in tree.findall('number'):
    print number.text

Result:

1
2
3
4

It would be helpful to see the document you are parsing.
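
For readers on Python 3, the same quick test can be sketched as below. The helper name texts_in_findall_order is made up for this illustration and is not part of ElementTree; it simply collects the text of each match so the order can be checked programmatically rather than by eye.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<root>
    <number>1</number>
    <number>2</number>
    <number>3</number>
    <number>4</number>
</root>
"""


def texts_in_findall_order(xml_text, tag):
    """Return the text of each direct child named `tag`, in the order findall yields them.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(tag)]


numbers = texts_in_findall_order(XML_TEXT, "number")
print(numbers)  # ['1', '2', '3', '4']
```

If the list differs from the order in the file, the reordering is most likely happening somewhere after findall, for example in a later sort or when the results are stored in an unordered container.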


Update:

Using the source data you provided:

from __future__ import with_statement
import xml.etree.ElementTree as et

with open('xmldata.xml', 'r') as f:
    xmldata = f.read()

tree = et.fromstring(xmldata)

for revision in tree.findall('.//{http://www.mediawiki.org/xml/export-0.5/}revision'):
    print revision.find('{http://www.mediawiki.org/xml/export-0.5/}text').text[0:10].encode('utf8')

Result:

‘The Mind 
{{db-spam}
‘The Mind 
'''The Min
<!-- Pleas

This is the same order in which the revisions appear in the document.
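
Repeating the full {http://www.mediawiki.org/xml/export-0.5/} prefix in every path is easy to get wrong. As a rough Python 3 sketch, the prefix can instead be mapped once and passed through the namespaces argument of findall and findtext. The SAMPLE document, the "mw" prefix and the revision_snippets helper are all assumptions made for this illustration; only the namespace URI comes from the code above, and the real export file will contain far more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real export file; only the namespace URI
# is taken from the answer, the element contents are invented.
SAMPLE = """
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.5/">
  <page>
    <revision><id>1</id><text>first revision</text></revision>
    <revision><id>2</id><text>second revision</text></revision>
    <revision><id>3</id><text>third revision</text></revision>
  </page>
</mediawiki>
"""

# The "mw" prefix is an arbitrary local alias; it need not match the document.
NS = {"mw": "http://www.mediawiki.org/xml/export-0.5/"}


def revision_snippets(xml_text, length=10):
    """Return the first `length` characters of each revision's text, in findall order."""
    root = ET.fromstring(xml_text)
    snippets = []
    for revision in root.findall(".//mw:revision", NS):
        # findtext returns the default when the child element is missing.
        text = revision.findtext("mw:text", default="", namespaces=NS)
        snippets.append(text[:length])
    return snippets


snippets = revision_snippets(SAMPLE)
print(snippets)  # ['first revi', 'second rev', 'third revi']
```

The snippets come back in the same order as the revision elements in the sample, matching the behaviour shown above.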
