Course Data

Course name: High-Performance XML with Python
Course length: 1 day
Remote: Yes
Open course: Yes
In-house: Yes
Course ID: XME

High-Performance XML with Python

This course is given by the main author of lxml, the leading XML library for Python.

Target Audience

The course targets intermediate to experienced Python programmers who want to generate and/or process XML (and, to some extent, HTML) content efficiently. A basic understanding of XML is helpful but not required.

Motivation

Since its beginnings in 1998, the eXtensible Markup Language, XML, has grown into a standard markup language family for portable data formats. Major document formats, such as the Open Document Format (ODF) known from OpenOffice and Microsoft’s Office Open XML (OOXML) format, are based on XML, as are many application-level networking protocols such as XML-RPC, SOAP and Jabber/XMPP. Many business application interfaces use standardized, proprietary or ad-hoc XML formats, and their configuration files are often written in XML, too. XML has also left its mark on the web through RSS/Atom feeds, Ajax interfaces and configurable browser GUIs (XHTML/XUL).

XML support in programming languages has improved steadily over the last decade. Today, developers have very efficient tools in their toolbox that substantially simplify XML handling. Not surprisingly, the Python programming language offers some very powerful tools that often beat their main contenders from the Java world in terms of performance, and easily in terms of usability.

The objective of this course is to gain an understanding of important XML technologies and to learn, by example, how to use the available tools.

Content

Initially, the course will build up a common understanding of XML (specifically the XML Infoset) and some of its applications. The main theme then deals with efficient processing of XML (and a bit of HTML) in Python.

The tool set presented includes the ElementTree library, which has shipped with Python since version 2.5, and the freely available lxml library, which combines a compatible Python API with a large set of additional XML features.
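As a rough illustration of what a compatible API means in practice, the sketch below uses only calls that both libraries provide, so it runs with lxml when it is installed and with the standard library's ElementTree otherwise. The sample document and its element names are made up for this example and are not taken from the course material.

```python
# Prefer lxml when it is installed; otherwise fall back to the
# ElementTree module from the standard library. Only calls that
# exist in both libraries are used below.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Hypothetical sample document, for illustration only.
SAMPLE = (
    "<courses>"
    "<course id='XME'><title>High-Performance XML with Python</title></course>"
    "</courses>"
)

# Parse the string into an in-memory tree.
root = etree.fromstring(SAMPLE)

# Navigate the tree and read an attribute and some text content.
course = root.find("course")
course_id = course.get("id")
title = course.findtext("title")

# Add a new child element and serialize the tree back to bytes.
length = etree.SubElement(course, "length")
length.text = "1 day"
serialized = etree.tostring(root)

print(course_id, title)
print(serialized)
```

The features that go beyond this shared subset, such as full XPath, schema validation and XSLT, are only available in lxml.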

Introduction to XML

  • XML and the XML Infoset

  • XML Namespaces

  • Dealing with XML formats
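
To give an idea of how namespaces show up in Python code, here is a minimal sketch using the standard library's ElementTree. The Atom feed is a made-up example; the prefix "atom" is chosen locally for the lookup and does not have to appear in the document itself.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical Atom feed that declares a default namespace.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First entry</title></entry>
  <entry><title>Second entry</title></entry>
</feed>"""

root = ET.fromstring(FEED)

# Tag names are stored in "{namespace-uri}localname" form.
print(root.tag)

# Map a locally chosen prefix to the namespace URI for path lookups.
ns = {"atom": ATOM_NS}
entry_titles = [t.text for t in root.findall("atom:entry/atom:title", ns)]
print(entry_titles)

# A lookup without the namespace does not match the namespaced elements.
unqualified = root.findall("entry")
print(len(unqualified))
```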

Fast XML processing

  • Parsing and serializing XML files

  • Extracting information from XML documents (tree navigation, XPath, CSS selectors)

  • Processing and transforming XML documents in main memory

  • Generating XML documents

  • Stream processing of large XML files that do not fit into main memory
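
The last item can be sketched with iterparse() from the standard library's ElementTree, which hands over elements while the input is still being read. The record structure and the helper name iter_record_names are assumptions made for this example; a small in-memory document stands in for a large file.

```python
import io
import xml.etree.ElementTree as ET


def iter_record_names(source):
    """Yield the 'name' attribute of each <record>, one at a time."""
    # "end" events fire once an element has been parsed completely.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "record":
            yield elem.get("name")
            # Drop the children, text and attributes of the finished
            # record so that its content does not accumulate in memory.
            elem.clear()


# Small in-memory stand-in for a large file (hypothetical structure).
records = b"".join(
    b'<record name="r%d"><payload>x</payload></record>' % i for i in range(5)
)
data = io.BytesIO(b"<records>" + records + b"</records>")

names = list(iter_record_names(data))
print(names)
```

Note that the cleared elements still remain attached to the root as empty shells. For very large inputs they may need to be removed from their parent as well.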

Advanced topics

  • Creating proprietary XML formats

  • Validating XML formats with schema languages (e.g. RelaxNG, Schematron)

  • Binding XML documents to Python objects (lxml.objectify)

  • Creating application specific XML APIs with lxml

  • Introduction to stylesheet transformations (XSLT processing)

Note that the advanced topics are subject to time constraints. A choice will be made based on the interest of the participants.

Exercises

Participants can follow all steps directly on their own computers. Exercises at the end of each unit provide ample opportunity to apply what has just been learned.

Course Material

Every participant receives comprehensive materials in PDF format that cover the whole course content as well as all source code.

How to contact us:
Python Academy GmbH & Co. KG
Zur Schule 20
04158 Leipzig / Germany
Tel: +49 341 260 3370
Fax: +49 341 520 4495
Email: info@python-academy.de