Dependencies

ttml-validator uses as few dependencies as possible outside what is bundled with Python.

Todo

add url links to the dependencies

Encoding checks: charset-normalizer

One of the issues we have frequently observed is incorrect encoding. Since many workflows produce TTML documents whose content originated in older binary formats such as STL, that use an 8 bit character set, the toolchain needs to convert the character encoding into Unicode, and then into UTF-8. Many tools do not do this properly, with results such as:

  • Documents labelled as UTF-8 but actually contain character content encoded in some other character set

  • Documents with an unexpected byte sequence at the beginning, such as a Byte Order Mark (BOM) that was prepended to a string representation of the documents and was then encoded as UTF-8

  • Documents containing illegal byte sequences, or null bytes, that have no decoding in UTF-8

The charset-normalizer library provides a mechanism for identifying the potential actual encodings of the document, and allowing it to be queried.

XML Schema checks: xmlschema and XSDs

A classical method for validating generic XML documents is to use a schema and process the document against that schema, to see if it conforms. Different schema languages exist; XML Schema Definition (XSD) documents are available for TTML, IMSC, EBU-TT-D and DAPT.

The xmlschema library is able to parse an XSD and use it to validate an XML document parsed as an ElementTree.

Note that XML Schema validation alone is unable to check all potential conformance requirements.

Two externally defined XSDs are copied into this repository:

See the schemas package.

Registries

Where value sets are defined in a managed list, or a “registry”, the list is held in the code as a JSON file. In the case of DAPT, those files are copied from the W3C specification repository at https://github.com/w3c/dapt/tree/main/registries . In the case of the TTML ttm:role registry, a JSON file has been synthesised from the specification and equivalent wiki page at https://www.w3.org/wiki/TTML/RoleRegistry .

See the registries package.

External test suites

An external test suite for DAPT exists at https://github.com/w3c/dapt-tests The DAPT validation code is tested against this suite, which is included as a git submodule in the test package.

Development dependencies

  • Documentation: sphinx.

  • Test coverage: coverage.

  • Python documentation string validation: pydocstyle.

  • Python source code checking/linting: flake8.

  • Debug adaptor: debugpy.