Naive XML Bindings for Python

xsData is a complete data binding library for Python that lets developers access and use XML and JSON documents as simple objects rather than through the DOM.

The code generator supports XML schemas, DTDs, WSDL definitions, and XML and JSON documents. It produces simple dataclasses with type hints and simple binding metadata.
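As a rough illustration of what "simple dataclasses with type hints and simple binding metadata" means, the sketch below hand-writes a class in the style the generator emits. The class name, the fields, and the metadata values are assumptions modelled on the purchase order example; real generated output may differ in detail.

```python
from dataclasses import dataclass, field, fields
from decimal import Decimal
from typing import Optional


@dataclass
class Usaddress:
    """Hand-written approximation of a generated model (illustrative only)."""

    class Meta:
        # Assumed: the XML name this class is bound to.
        name = "USAddress"

    name: Optional[str] = field(
        default=None,
        metadata={"type": "Element", "namespace": "", "required": True},
    )
    zip: Optional[Decimal] = field(
        default=None,
        metadata={"type": "Element", "namespace": "", "required": True},
    )
    country: str = field(
        default="US",
        metadata={"type": "Attribute"},
    )


address = Usaddress(name="Robert Smith", zip=Decimal("95819"))

# The binding information is ordinary dataclass field metadata,
# so it can be inspected with the standard library alone.
binding = {f.name: f.metadata["type"] for f in fields(Usaddress)}
```

Because the models are plain dataclasses, they work with the usual tooling (type checkers, `dataclasses.asdict`, `repr`) without any base class.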

The included XML and JSON parsers and serializers are highly optimized and adaptable, with multiple handlers and configuration properties.

xsData is constantly tested against the W3C XML Schema 1.1 test suite.

Getting started

$ # Install all dependencies
$ pip install xsdata[cli,lxml,soap]
$ # Generate models
$ xsdata tests/fixtures/primer/order.xsd --package tests.fixtures.primer
>>> # Parse XML
>>> from pathlib import Path
>>> from tests.fixtures.primer import PurchaseOrder
>>> from xsdata.formats.dataclass.parsers import XmlParser
>>> xml_string = Path("tests/fixtures/primer/sample.xml").read_text()
>>> parser = XmlParser()
>>> order = parser.from_string(xml_string, PurchaseOrder)
>>> order.bill_to
Usaddress(name='Robert Smith', street='8 Oak Avenue', city='Old Town', state='PA', zip=Decimal('95819'), country='US')

Check the documentation for more ✨✨✨
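For contrast, here is roughly what the same lookup looks like with the standard library DOM API, which is the style of access that data binding is meant to replace. The XML snippet is an illustrative assumption, not the actual content of the sample fixture, and `text_of` is a hypothetical helper.

```python
from xml.dom import minidom

# Illustrative document; the structure is assumed, not copied from the fixture.
XML = """<purchaseOrder>
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
</purchaseOrder>"""

doc = minidom.parseString(XML)
bill_to = doc.getElementsByTagName("billTo")[0]


def text_of(parent, tag):
    """Return the text of the first <tag> descendant of parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data


name = text_of(bill_to, "name")
zip_code = text_of(bill_to, "zip")  # still a string; type conversion is manual
country = bill_to.getAttribute("country")
```

With DOM access every lookup is by tag name and every value is a string, whereas the bound `order.bill_to` object above exposes typed attributes directly.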


  • Generate code from:

    • XML Schemas 1.0 & 1.1

    • WSDL 1.1 definitions with SOAP 1.1 bindings

    • DTD external definitions

    • Directly from XML and JSON Documents

    • Extensive configuration to customize output

    • Pluggable code writer for custom output formats

  • Default Output:

    • Pure python dataclasses with metadata

    • Type hints with support for forward references and unions

    • Enumerations and inner classes

    • Support for namespace-qualified elements and attributes

  • Data Binding:

    • XML and JSON parser, serializer

    • PyCode serializer

    • Handlers and writers based on lxml and Python's native xml module

    • Support for wildcard elements and attributes

    • Support for XInclude statements and unknown properties

    • Customize behaviour through config
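To make the "Default Output" items above more concrete, the following sketch shows a hand-written model that combines an enumeration, an inner class, a forward reference, and a union type hint. All names here (`Status`, `Order`, `Order.Item`, `reference`) are hypothetical and chosen for illustration; this is not actual generator output.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Union


class Status(Enum):
    """Enumeration for a restricted set of string values."""

    PENDING = "pending"
    SHIPPED = "shipped"


@dataclass
class Order:
    # Forward reference: Order.Item is defined further down, inside the class.
    items: List["Order.Item"] = field(default_factory=list)
    status: Optional[Status] = field(default=None)
    # Union type hint for a value that may be one of several simple types.
    reference: Optional[Union[int, str]] = field(default=None)

    @dataclass
    class Item:
        """Inner class, scoped to the outer model."""

        name: str = field(default="")
        quantity: int = field(default=1)


order = Order(
    items=[Order.Item(name="Lawnmower")],
    status=Status.SHIPPED,
    reference="A-1",
)
```

Quoting the inner type as a string keeps the annotation valid even though `Item` is declared after the field that refers to it.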

Changelog: 23.5 (2023-05-21)

  • Fixed XML meta var index conflicts.

  • Fixed mixed content handling for DTD elements. (#749, #762)

  • Fixed an issue with required attributes turning into optional ones.

  • Fixed calculation of min/max occurs when parsing XML/JSON documents. (#756)

  • Fixed calculation of min/max occurs when parsing DTD choice content types. (#760)

  • Fixed an issue when parsing tail content for compound wildcard elements.

  • Fixed an issue with the code analyzer not fully processing some classes.

  • Fixed an issue with the code analyzer taking forever to process very large enumerations. (#776)

  • Fixed an issue in the JSON parser with optional choice elements.

  • Updated the transformer to silently ignore malformed JSON files. (#750)

  • Updated the override attribute handler to fix naming conflicts.

  • Updated the override attribute handler to allow wildcard overrides.

  • Updated conditions on extensions flattening (over-flattening). (#754)

  • Updated Group, AttributeGroup handling, skipping a few cases.

  • Updated how min/max occurs are calculated with nested containers.

  • Updated handling of element substitutions to treat them as choices. (#786)

  • Updated PycodeSerializer to skip default field values.

  • Updated flattening restriction base classes when sequence elements are out of order.

  • Updated docformatter to v1.6.5.

  • Added support to override compound fields.

  • Added support for multiple sequential groups in a class.

  • Added support for non-list compound fields.

  • Added support to mix list and non-list fields with sequence groups.

  • Added an option to include headers in generated files. (#746)

  • Added an option to cache the initial load and mapping of resources.

  • Added support for regular expressions in config substitutions. (#755)

  • Added a pretty print indentation option in the serializer config. (#780)

  • Added an option to set the encoding in the SOAP Client. (#773)

  • Added a CLI flag to show debug messages.

  • Added a debug message for possible circular references during code generation.

  • Added support to generate prohibited fields when they restrict parent fields. (#781)

This release is bigger than intended and includes many major changes, which is why it took so long.

Why naive?

The W3C XML Schema is complicated, but with good reason: it needs to support any API design. On the other hand, when you consume XML you don't necessarily care about any of that. This is where xsData comes in, simplifying things by making a lot of assumptions, like the following one that started everything:

All xs:schema elements are classes; everything else is either noise or class properties.
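A toy illustration of that assumption, using only the standard library: every direct child of `xs:schema` is treated as a class, and every named node beneath it as a property of that class. The schema is made up for the example, and this is only a sketch of the idea, not how xsData's analyzer actually works.

```python
import xml.etree.ElementTree as ET

# Made-up schema for illustration.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xs:complexType name="PurchaseOrderType">
    <xs:sequence>
      <xs:element name="shipTo" type="xs:string"/>
      <xs:element name="billTo" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="orderDate" type="xs:date"/>
  </xs:complexType>
</xs:schema>"""

root = ET.fromstring(SCHEMA)

# Direct children of xs:schema become class candidates.
classes = {}
for child in root:
    # Any named node below a top-level component is treated as a property;
    # unnamed containers such as xs:sequence are ignored as "noise".
    props = [
        node.get("name")
        for node in child.iter()
        if node is not child and node.get("name")
    ]
    classes[child.get("name")] = props
```

Here `classes` maps `purchaseOrder` to no properties and `PurchaseOrderType` to `shipTo`, `billTo`, and `orderDate`.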