Processing XML in Python

Several libraries exist for processing XML in Python. The older dom and sax modules seem to be less commonly used, and ElementTree seems to be the way to go.

ElementTree

ElementTree is part of the Python standard library and therefore does not need to be installed separately. For a tutorial see http://effbot.org/zone/element-index.htm and for the official documentation see https://docs.python.org/2/library/xml.etree.elementtree.html

The ElementTree module provides an Element data structure that represents a single XML element. An element can contain links to other elements, its children. All linked elements should share one top element, the root element. The root element is the entry point, and together with its descendants it forms a tree, which the ElementTree class wraps.
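As a minimal sketch of the root-and-tree relationship described above, the following parses a small document and walks the children of its root. The sample XML and its tag names are made up for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
xml_text = "<library><book id='1'>Dune</book><book id='2'>Emma</book></library>"

root = ET.fromstring(xml_text)   # parse the string, returns the root Element
tree = ET.ElementTree(root)      # wrap the root element in an ElementTree

print(tree.getroot().tag)        # the root element is the entry point

# Iterating over an Element yields its direct children.
for book in root:
    print(book.tag, book.get("id"), book.text)
```

When reading from a file instead of a string, ET.parse(filename) returns the ElementTree directly and getroot() gives the root element.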

Sub-elements need to be linked to their parent element. One way is to create the element with Element and then attach it with the parent's append or insert method. The insert method places the element at any position, whereas the append method puts it after the last existing child. Alternatively, the SubElement function does both steps in one: it creates the element and appends it to the parent.
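The three ways of linking a child to its parent can be sketched as follows. The tag names are arbitrary placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")

# Two steps: create the element, then link it to its parent.
first = ET.Element("first")
root.append(first)               # append adds it after the last child

zeroth = ET.Element("zeroth")
root.insert(0, zeroth)           # insert places it at the given index

# One step: SubElement creates the element and appends it to the parent.
last = ET.SubElement(root, "last")
last.text = "some text"

print([child.tag for child in root])   # ['zeroth', 'first', 'last']
print(ET.tostring(root))
```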

To work with Etree see http://effbot.org/zone/element.htm

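A C implementation, cElementTree, is also available for Python. Since Python 3.3 the standard xml.etree.ElementTree module uses the C accelerator automatically whenever it is available.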

lxml

lxml is a Python library for working with XML. It extends ElementTree and aims to stay compatible with the ElementTree API.

Some examples:

ElementTree writes XML data to files without line breaks, so the output is hard to read when opened in an editor. With lxml the data can be written to the file using the same write command, but additional parameters such as pretty_print=True and xml_declaration=True can be set to produce a nicely indented XML file with an XML declaration.
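A minimal sketch of such a write call with lxml could look like this. It assumes lxml is installed; the file name example.xml and the tag names are placeholders for illustration.

```python
from lxml import etree  # third-party package, must be installed separately

root = etree.Element("root")
etree.SubElement(root, "child").text = "text"

tree = etree.ElementTree(root)

# Same write method as in ElementTree, with lxml's extra pretty_print
# parameter; "example.xml" is a hypothetical output file name.
tree.write("example.xml", pretty_print=True,
           xml_declaration=True, encoding="UTF-8")

with open("example.xml", "rb") as f:
    data = f.read()
print(data.decode("UTF-8"))
```

The encoding is given explicitly here so that the declaration and the bytes written to the file agree.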

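lxml preserves comments when parsing a document. The standard ElementTree parser drops them by default, although ElementTree can create comments with its Comment function.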

See http://lxml.de/ for the complete documentation and solutions to common questions (including the use of XPath to access the data).

To use it, import the etree module:

from lxml import etree
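Starting from that import, the following sketch shows a comment being added and the data being accessed with XPath. It assumes lxml is installed; the element and attribute names are invented for the example.

```python
from lxml import etree

root = etree.Element("root")
root.append(etree.Comment("a comment"))        # comment node inside the tree

item = etree.SubElement(root, "item", name="a")
item.text = "1"
etree.SubElement(root, "item", name="b").text = "2"

# XPath queries return lists of matching nodes, attributes or strings.
names = root.xpath("//item/@name")
texts = root.xpath("//item[@name='b']/text()")
print(names, texts)

print(etree.tostring(root, pretty_print=True).decode())
```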
