This is an update for the earlier Qt 5.12 performance blog post from last November. Measurements have been done with Qt 5.12.3 release, which contains further improvements to Qt Quick and QML performance compared to the Qt 5.12 beta used in the previous post.
QML Memory usage
Applications use memory for multiple things and to optimize the overall memory usage a good start is typically to look into the graphics assets, leveraging asset conditioning, the Qt Lite feature system to remove unused functionality and other techniques to optimize the memory usage. As it is very difficult for an application developer to affect the memory used by the QML engine itself, we have taken a deep look into how it can be reduced.
One important optimization done with Qt 5.12 LTS are to the data structures used in conjunction with compiled QML. The QML engine creates cache files that contain a dump of internal data structures on the fly when loading a .qml or .js file at run-time (named .qmlc and .jsc). If found, these files are mapped into the application process and used directly without requiring a compilation step. We have an option to create those cache files ahead of time while compiling the application. In that case, we don’t have quite the same degree of visibility into the type system as the QML engine has at run-time. This leads to a staged approach where we can’t resolve quite as much as when generating the data structures at run time, requiring us to do some additional work on top of the pre-generated cache files.
In Qt 5.12 LTS two changes are done to reduce the memory consumption when using such ahead-of-time generated cache files:
- Avoid the re-creation of the indexed string table for the in-memory generated compilation unit and instead fall back to the memory mapped string table provided from the cache file.
- Reduce the size of data structures so that the size of the files as well as their memory consumption is reduced.
In addition to the changes described above, we have done also multiple smaller improvements to reduce the memory consumption of the QML engine, while still keeping the performance on a good level (and actually improving quite a bit in some areas).
Actual memory usage of a Qt Quick applications depends a lot upon the application itself. In the comparison chart below, we have measured the QML engine related memory consumption for two different applications. We have used the Qt Quick Controls example to represent a Qt Quick application.
For the example application we can see that Qt 5.6.3 uses 36.3 MB, Qt 5.9.7 uses 18.5 MB and Qt 5.12.3 uses 13.0 MB RAM (using pre-compiled QML). This means that memory usage with Qt 5.12 is 30% lower than Qt 5.9 and 64% lower than Qt 5.6 with the example application. These savings are available for all users of Qt. For commercial users Qt 5.6 and 5.9 offered a separate tool called Qt Quick Compiler that can be used for pre-compiling QML. Using the Qt Quick Compiler, the memory usage of Qt 5.6 drops to 14.4 MB (60% lower RAM usage than without). So even when using the Qt Quick Compiler with 5.6, upgrading to Qt 5.12 will lead to a very nice 10% improvement in memory usage.
As all applications are different it is best to check how big the benefit is for your application. We compared the QML engine memory use of Qt 5.6 and 5.9 using Qt Quick Compiler and Qt 5.12 using the pre-generated QML with a rather complex real-life Qt Quick application. In this measurement Qt 5.9 used 1.1 MB less RAM than Qt 5.6 (both with Qt Quick Compiler), and Qt 5.12 used a whopping 9.8 MB less RAM than Qt 5.6 (with Qt Quick Compiler). Especially for embedded systems with constrained resources or ones that run multiple applications this type of RAM savings in the QML engine, with no changes to the application code, are very beneficial.
To highlight the improvement with Qt 5.9 and 5.12 we have normalized the result of Qt 5.6 to 100.
Overall improvement (TotalScore) of Qt 5.9.7 compared to Qt 5.6.3 is 1785% and Qt 5.12.3 compared to Qt 5.6.3 the improvement is 2138%. Compared to Qt 5.6.3, the biggest performance improvement is seen due to supporting JIT on 64-bit ARMv8 architecture used in many modern embedded processors. With Qt 5.6 JIT is not supported on 64-bit ARM.
Overall Qt Quick Performance
Because it is so easy to break performance, we regularly run several different kinds of performance tests. These also help in analyzing how the various optimizations work across different platforms and to make sure feature development and error corrections do not have unintended side effects to performance. Some of these results are visible at testsresults.qt.io, for example the QML Bench tests.
To compare the effect of Qt versions, we have run the full QML Bench test set with Qt 5.6.3, Qt 5.9.7 and Qt 5.12.3. To improve the comparability of results, we are running Qt Quick Controls 1 tests on Qt 5.6.3, as it does not have Qt Quick Controls 2 functionality. Otherwise the functionality tested by the QML Bench is same between these Qt versions. For these tests, QML Bench was run on an Nvidia Jetson TX1 development board (64-bit ARM processor) running the Nvidia supported Linux for Tegra Ubuntu 16.04.
QML Bench has been developed with typical application use cases in mind, trying to cover as many of those as possible in a comprehensive suite. But as with any benchmark, this still means that the improvements in your application might vary depending on how QML is being used there – so it is always recommended to gather performance data with your own application to see how much its performance improves.
In this post the focus was in the Qt Quick performance, where we have improved the performance of Qt 5.12 LTS significantly compared to the earlier LTS releases of Qt in three important areas:
- QML memory usage
- Overall Qt Quick performance
In addition to these, Qt 5.12 LTS provides many other performance improvements as well. One important improvement in Qt 5.12 LTS is the possibility to use pre-generated distance fields of fonts. Earlier Qt has created the distance fields during the application startup, which can consume a lot of CPU cycles especially for non-latin fonts (that have many glyphs) and complex latin fonts (that have a lot of different shapes). With Qt 5.12 LTS, we release a tool called “Qt Distance Field Generator”, which allows you to pregenerate the distance fields for either a selection of the glyphs in a font or all of them.
Other areas where we have been improving performance include, for example, Qt 3D CPU usage and Qt 3D memory consumption (especially beneficial for Qt 3D Studio applications). Qt 5.12 LTS also introduces a completely re-designed TableView, which takes the performance of large tables to completely different level than the old TableView implementation of Qt Quick Controls 1. Of course, Qt 5.12 LTS also benefits from the compressed texture support introduced with Qt 5.10 and improved with Qt 5.11, as well as multiple other improvements done since the previous LTS release of Qt.
If you have not yet looked into Qt 5.12 LTS, please take it for a spin.