Generating Third-Party Attribution Documents

In this post, I'll show you how to use the qtattributionsscanner tool and Python to generate an attribution document for third-party code in Qt.

Qt and Third-Party Code

Qt is available under both opensource and commercial licenses. While commercial users are free to hide their use of Qt, GPL and LGPL mandate that the use of Qt is properly attributed. In addition, several Qt modules contain third-party code that are bundled under their own licenses. We make sure that the licenses are liberal, so that they do not restrict the use of Qt too much. However, several of them do require attribution, too. Typically, in the documentation of the final product.

We actually go to a great length to make sure that third-party code is correctly attributed in our documentation. We have a process in place that requires any change of third-party code to be carefully reviewed. Additions of new third-party code to a Qt library, or significant updates, need to be approved by Lars Knoll, the chief maintainer of the Qt Project. We use tools like Fossology to double check correct attributions. All third-party code will be listed in the documentation of the respective Qt module, but also in the Licenses used in Qt overview section of the documentation. Significant changes of third-party modules are listed in the respective modules change log. Starting with Qt 5.11, we also will list changes in a separate documentation page.

If you bundle Qt in your application or device, you also need to provide these attributions - usually in the form of an attribution section of your documentation. The most straightforward way to create such a document is to copy the sections in the Qt documentation. However, this is a lot of work and can lead to errors. Luckily, there is now a cleaner way. Since Qt 5.9, the attributions are not written in plain qdoc markup. Instead, they are saved as qt_attribution.json files that are located close to the actual source code. The format of these JSON files is simple and documented (see QUIP-7). Now we just need an application that finds the files and generates attribution documents in a format of your choice.

qtattributionsscanner

It turns out that we already have an application for the collection of qt_attribution.json files: The qtattributionsscanner command line tool. You can find it in the bin directory of Qt.

When you specify a source directory for the tool, it recursively searches for qt_attribution.json files in there, and parses them. In addition, it takes an -output-format parameter, which currently supports qdoc and json as arguments. qdoc output is the default, because the tool was developed for generating Qt documentation. json generates a document in the same format as the input files. As a result, the attributions from the directory tree are placed into one document.

You might ask yourself, how can you generate other formats besides json and qdoc? My first hack, when I had the need to generate a PDF, was to add a markdown argument to qtattributionsscanner. Markdown is easy to read by itself, and there are tools available that convert it to HTML, which in turn can be used to generate a PDF. Anyhow, this is just one format, and the exact layout and content were hard-coded to my needs, so I surely didn't want to merge this. Wouldn't it be more flexible to use a template engine, so that you can generate attribution files with the exact format and layout you want?

qtattributionsformatter

What we want is an application that generates documents out of a JSON document and a template file. In the spirit of "not reinventing the wheel", let's use a language and library that makes this easy. I've been going for Python, using the Jinja2 templating engine. This is the Python file for setting up the engine:

import argparse
import io
import json
import os
import sys
from jinja2 import Environment, FileSystemLoader, Template
 
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file", nargs="?", help="qt_attribution.json file",
                    type=argparse.FileType('r'), default=sys.stdin)
parser.add_argument("-o", "--out", nargs="?", help="file to write to",
                    type=argparse.FileType('w'), default=sys.stdout)
parser.add_argument("-t", "--template", help="template file to use", default="markdown.md")
parser.add_argument("-v", "--verbose", action="store_true")
args = parser.parse_args()
 
# Set up jinja2
env = Environment(
  loader = FileSystemLoader(os.path.dirname(os.path.realpath(__file__)) + "/templates")
)
template = env.get_template(args.template)
 
# Read in qt_attribution.json file
attributions = json.load(args.file)
 
# Load content of license files directly into the structure
for entry in attributions:
  if entry['LicenseFile']:
    if (args.verbose):
      sys.stderr.write("Loading " + entry['LicenseFile'] + "...\n")
    with io.open(entry['LicenseFile'], mode='r', encoding='utf-8', errors='replace') as content:
      entry['LicenseText'] = content.read()
 
# Render
result = template.render(attributions = attributions)
args.out.write(result.encode('UTF8'))

And this is the markdown.md file template:

# Attributions for Qt Libraries
 
{% for entry in attributions -%}
{%- if "libs" in entry.QtParts -%}
## {{entry.Name}}
 
Copyright
 
```
{{entry.Copyright}}
```
 
License: {{entry.License}}
 
```
{{entry.LicenseText}}
```
{%- endif %}
{%- endfor %}

With {%- if "libs" in entry.QtParts -%} ... %{-endif}, we filter out attributions that are not part of Qt libraries, but examples, tools, or tests.

You can also find the sources at https://git.qt.io/kakoehne/qtattributionsformatter.

Putting It Together

To generate a list of attributions in Qt in Markdown, we can now combine the qtattributionsscanner for the collection part and the new qtattributionsformatter.py for the formatting part:


qtattributionsscanner --output-format json qt-everywhere-src-5.11.0-rc1/ | python qtattributionsformatter.py > qt_attributions.md

Outlook

This was easy enough. So you might ask yourself why isn't this part of Qt yet, in this form or in a similar form? That is because there are some shortcomings:

  • Chromium uses its own file format, which qtattributionsscanner does not understand yet. Therefore, the output does not include third-party code inside Qt WebEngine yet.
  • You need a Qt source code checkout. I think we should package the JSON files as part of the build artifacts so that you can also generate documentation for a binary installation of Qt.
  • The attribution text is unnecessarily long, because a lot of license texts are duplicated.
  • When running on the default Qt checkout, the tool will pick up attributions for all Qt libraries, even if your application does not use a particular library. For now, you can limit this by either tweaking the source tree or adding some basic filtering to Python. Int the long run, this should be done automatically for you. The deployqt tools (windeploqt, macdeployqt ...) can determine which libraries to package and similar logic could be used for generating code attribution documentation.
  • Even if you just take the libraries into account, there might be code that your Qt build doesn't use. A lot of the third party code is in fact optional and only used on some platforms or in some setups.

The last issue raises some interesting questions. Should we use the Qt configuration system to select the necessary attributions? Or should we use Quartermaster, which uses a build graph to determine which attributions are needed? How big of a problem do 'false positives' present?

Even though more research is needed before making the generation of attribution files an official feature, I hope anybody struggling with this issue will find this proof-of-concept helpful.


Blog Topics:

Comments