Tuukka Turunen

Significant Performance Improvements with Qt 3D Studio 2.4

Published Tuesday June 18th, 2019
Posted in Biz Circuit & Dev Loop, Design, Graphics, Performance, Qt 3D Studio

Rendering speed is essential for a 3D engine, as is efficient use of system resources. The upcoming Qt 3D Studio 2.4 release brings a significant boost to rendering performance and further savings in CPU and RAM utilization. With our example high-end embedded 3D application, rendering speed improves by a whopping 565%, while RAM use and CPU load are down 20% and 51% respectively.

Performance is a key driver for Qt and especially important for running complex 3D applications on embedded devices. We have been steadily improving resource efficiency in earlier releases of Qt 3D Studio, and the upcoming Qt 3D Studio 2.4 takes a major step forward in rendering performance. The exact performance increase depends heavily on the application and the hardware used, so we have taken two example applications and two embedded hardware platforms for a closer look in this blog post. The example applications used in this post are automotive instrument clusters, but similar improvements can be seen in any application using the Qt 3D Studio runtime.

Entry-level embedded example with Renesas R-Car D3

The entry-level embedded device used in the measurement is the Renesas R-Car D3, which has the Imagination PowerVR GE8300 entry-class GPU (https://www.imgtec.com/powervr-gpu/ge8300/) and one ARM Cortex-A53 CPU core. The operating system is Linux.

The example application used is the low-end cluster, available at https://git.qt.io/public-demos/qt3dstudio/tree/master/LowEndCluster. The low-end cluster example is well optimized, as described in a detailed blog post about optimizing 3D applications.

[Image: low-end cluster example]

To make the application as lightweight as possible, only the ADAS view is created as a real-time 3D user interface. Other parts of the instrument cluster are created with Qt Quick. This allows having a real-time 3D user interface even on entry-class hardware like the Renesas R-Car D3.

High-end embedded example with NVIDIA Tegra X2

The high-end embedded device used in the measurement is the NVIDIA Jetson TX2 development board equipped with the Tegra X2 SoC, which has a 256-core NVIDIA Pascal™ GPU, a dual-core NVIDIA Denver 2 64-bit CPU and a quad-core ARM Cortex-A57 MPCore CPU. The operating system is Linux.

The example application used is the Kria cluster, available at https://git.qt.io/public-demos/qt3dstudio/tree/master/kria-cluster-3d-demo. The Kria cluster example is made intentionally heavy with large and not fully optimized textures, high resolution etc.

[Image: Kria cluster example]

In the high-end example all the gauges and other elements are real-time 3D, rendered with the Qt 3D Studio runtime. There are very few Qt Quick parts and these are brought into the 3D user interface using texture sharing via QML streams.

Rendering performance improvement

The biggest improvement in the new Qt 3D Studio 2.4 release is in rendering performance: getting the same application to render more Frames Per Second (FPS) on the same hardware. As always with Qt, we aim to run at a steady 60 FPS, but on embedded devices raw performance is not enough. With concerns like heat management and varying usage scenarios, it typically pays off not to run at the very edge of the SoC’s graphics capabilities. In an application such as an instrument cluster, rendering needs to be smooth in all operating conditions, including under maximum system load. For measurement purposes we have disabled vsync with the high-end example, allowing the system to draw as many frames as it can. A typical real-life application always has vsync enabled, so any capacity beyond 60 FPS translates into saved processing resources.

The graphs below show the measured Frames Per Second with the high-end example on NVIDIA TX2 (vsync off) and with the low-end example on Renesas R-Car D3 (vsync on):

[Graph: measured FPS, Qt 3D Studio 2.3 vs. 2.4]

High-end example: With the new Qt 3D Studio 2.4 we see a whopping 565% improvement in rendering performance. With Qt 3D Studio 2.3 the application was running at only 20 FPS, but the new Qt 3D Studio 2.4 allows the application to run at 133 FPS. This is measured with vsync turned off, just to measure the capability of the new runtime. In practice running at 60 FPS is enough, and the additional capacity of the processor can be used to drive a larger screen (or another screen) or a more complex application, or simply left unused to save power.

Low-end example: The improvement is 46% because the maximum FPS is capped to 60 FPS by Qt Quick. With Qt 3D Studio 2.3 the application achieved 41 FPS, and with the new 2.4 runtime it reaches 60 FPS easily. Just like with the more powerful high-end hardware the excess capacity of the SoC can be used for running a more complex 3D user interface, or simply left unused.
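
As a quick check of the figures above, the percentages follow directly from the measured frame rates. The short Python sketch below recomputes them and also converts each frame rate into an average time per frame. The helper names are illustrative only and are not part of any Qt API.

```python
def percent_increase(before, after):
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


def frame_time_ms(fps):
    """Average time spent per frame at a given frame rate, in milliseconds."""
    return 1000.0 / fps


# FPS figures quoted in this post (runtime 2.3 -> runtime 2.4).
high_end = percent_increase(20, 133)  # NVIDIA TX2, vsync off
low_end = percent_increase(41, 60)    # Renesas R-Car D3, vsync on

print(f"High-end: {high_end:.0f}% faster, "
      f"{frame_time_ms(20):.1f} ms -> {frame_time_ms(133):.1f} ms per frame")
print(f"Low-end:  {low_end:.0f}% faster, "
      f"{frame_time_ms(41):.1f} ms -> {frame_time_ms(60):.1f} ms per frame")
```

Seen this way, a frame at 133 FPS takes less than half of the roughly 16.7 ms available per frame on a 60 FPS display, which is the headroom described above.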

CPU load improvement

The overall CPU load of an application is the sum of multiple things, one of them being the load caused by the 3D engine. In embedded applications it is important that using 3D does not cause excessive CPU load. If the application needs more CPU capacity than is available, it will not be able to render at the target FPS, and stuttering or other artefacts may appear on the screen.

The graphs below show the measured CPU load with the high-end example on NVIDIA TX2 and with the low-end example on Renesas R-Car D3:

[Graph: measured CPU load, Qt 3D Studio 2.3 vs. 2.4]

High-end example: With the new Qt 3D Studio 2.4 we see a hefty 51% improvement in the CPU load compared to Qt 3D Studio 2.3 while at the same time the FPS goes from 20 FPS to 133 FPS. The overall load with the Runtime 2.3 is 167% (of total 400%) and with the Runtime 2.4 the load drops to 81%. Note that the increased rendering speed has its effect on the CPU load as well. With the vsync on and FPS capped to 60 FPS, the CPU load is 74%.

Low-end example: We see only a modest 5% improvement in the CPU load, mainly because the application is mostly Qt Quick. But this comes with FPS going from 41 FPS up to 60 FPS at the same time. It should also be noted that the CPU of the R-Car D3 is not very powerful, so the increased FPS of the overall application has its effect on the overall CPU load.
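
The high-end CPU figures can be put in perspective with a little arithmetic. The Python sketch below recomputes the relative reduction from the quoted loads, expresses each load as a share of the 400% total mentioned above, and divides load by frame rate as a rough per-frame cost. That last normalization is our own back-of-the-envelope assumption, not a measured quantity, since the load includes everything the application does and not only rendering.

```python
TOTAL_CPU_PERCENT = 400.0  # total capacity as quoted in the post


def percent_reduction(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100.0


def load_per_frame(cpu_load_percent, fps):
    """Rough normalization: CPU load divided by frames rendered per second."""
    return cpu_load_percent / fps


# High-end example with vsync off (runtime 2.3 -> runtime 2.4).
load_23, fps_23 = 167.0, 20
load_24, fps_24 = 81.0, 133

reduction = percent_reduction(load_23, load_24)
share_23 = load_23 / TOTAL_CPU_PERCENT * 100.0
share_24 = load_24 / TOTAL_CPU_PERCENT * 100.0

print(f"CPU load reduction: {reduction:.1f}%")
print(f"Share of total capacity: {share_23:.2f}% -> {share_24:.2f}%")
print(f"Load per frame: {load_per_frame(load_23, fps_23):.2f} -> "
      f"{load_per_frame(load_24, fps_24):.2f}")
```

Because the frame rate rises while the load falls, the per-frame figure drops far more sharply than the headline percentage suggests.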

Memory usage improvement

For any graphics, and especially 3D, it is the assets that typically take most of the RAM. There are ways to optimize, most notably avoiding unnecessary levels of detail and leveraging texture compression. For the purposes of this blog post, we do not apply any specific optimization methods. The measurements are done with exactly the same application, with no changes other than using a different version of the Qt 3D Studio runtime.

The graphs below show the measured RAM use with the high-end example on NVIDIA TX2 and with the low-end example on Renesas R-Car D3:

[Graph: measured RAM use, Qt 3D Studio 2.3 vs. 2.4]

High-end example: With the new Qt 3D Studio 2.4 we see a reduction of 48 MB compared to Qt 3D Studio 2.3. This is a 20% reduction in the overall RAM usage of the application.

Low-end example: In the simpler example the reduction in RAM use is 9 MB when using the new 2.4 runtime. Percentage-wise this is a 15% reduction in the overall RAM usage of the application.
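
The post gives the savings both in megabytes and as a percentage, so the approximate totals can be derived from the two. The Python sketch below does this; the resulting totals are estimates only, because the quoted percentages are rounded and the totals themselves are not stated here. The function name is illustrative.

```python
def implied_totals(saved_mb, reduction_percent):
    """Estimate RAM use before and after from the absolute and relative saving."""
    before = saved_mb / (reduction_percent / 100.0)
    after = before - saved_mb
    return before, after


# Savings quoted in this post (runtime 2.3 -> runtime 2.4).
high_before, high_after = implied_totals(48, 20)  # NVIDIA TX2
low_before, low_after = implied_totals(9, 15)     # Renesas R-Car D3

print(f"High-end: about {high_before:.0f} MB -> {high_after:.0f} MB")
print(f"Low-end:  about {low_before:.0f} MB -> {low_after:.0f} MB")
```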

How was this achieved?

The improvements are really big, especially on embedded, so one may wonder what was changed in the new version. What we did was return to the same runtime architecture as in the Qt 3D Studio 1.x releases instead of running on top of Qt 3D. The core logic of the 3D engine is still the same as before, but it now runs directly on top of OpenGL instead of using Qt 3D. This provides significantly improved performance, especially on embedded devices but also on more powerful desktop systems. By running Studio’s 3D engine directly on top of OpenGL we avoid overhead in rendering and simplify the architecture. The simpler architecture translates to less internal signalling, fewer objects in memory and reduced synchronization needs between multiple rendering threads. All this has allowed us to make further optimizations over Qt 3D Studio 1.x, and of course to bring the new features developed in the Qt 3D Studio 2.x releases on top of the OpenGL-based runtime.

The change in the 3D runtime does not require any changes for most projects. Just change the import statement (import QtStudio3D.OpenGL 2.4 instead of import QtStudio3D 2.3) and recompile with the new Qt 3D Studio 2.4. As the API and the parts of the 3D engine relevant for the application are the same as earlier, all the same materials, shaders etc. work just like before. In the rare cases where some changes are needed, e.g. for some custom material, these are rather small.

Get Qt 3D Studio 2.4

If you have not yet tried out the Qt 3D Studio 2.4 pre-releases, you should take them for a spin. They are available with the online installer under the preview node. Currently we have the third Beta release out and will soon provide the Release Candidate. The final release is targeted to be out before the end of June. Qt 3D Studio is available under both commercial and open-source licenses.

16 comments

Eli says:

Cool!
Can we expect Qt3d to leverage these changes as well? The current performance of the Qt3d is bad. The renderer is nowhere near the performance of other 3d engines like Godot for big scenes with many objects.

Cheers

@Eli: Unfortunately it is not possible to leverage these improvements with Qt 3D due to the differences in structure.

DEDIU IONUT says:

“What we did is to use the same runtime architecture as with Qt 3D Studio 1.x releases instead of running on top of Qt 3D. The core logic of the 3D engine is still the same as before, but it is running directly on top of OpenGL instead of using Qt 3D.”

Does this mean the end of Qt3D? Can’t we have a clear intuitive declarative 3d layer that strikes the right balance between ease of use and performance? Working directly with OpenGL is like programming in assembly when you could do it in Qt C++… In the programming languages analogy: I am not advocating with the ease of use obtained through Java or worse javascript (the spineless language so unstructured so agile so edgy that the best framework for it is called BACKBONE :)))) Is Qt3D not IT, not that magical balance? So sad…

@Dediu: No, this has no implications for Qt 3D. It continues to be a fully supported module with its own programming API. It is just that the Qt 3D Studio 2.4 runtime no longer uses Qt 3D, but a direct adaptation to OpenGL. There is no effect on any other users of Qt 3D.

Eli says:

Hi Tuukka Turunen thanks for answering our questions 🙂

It is a bit strange to me that Qt3d studio now has a separate rendering backend completely decoupled from qt3d instead of fixing the current design. From my point of view this can only mean that the current qt3d architecture has some flaws. Having to maintain two separate rendering backends is a bit strange.

I hope I do not sound too harsh

Cheers

TTGil says:

It seems like the dev priorities in Qt3D were upside down for a while, with features like rigged character animations being added while the fundamentals were buggy and in many cases unusable. It has gotten a lot better lately (Qt 5.13 has noticeable quality improvements). I hope the investment in Qt3D continues. Even if the performance is not equal to using raw OpenGL, the long term benefit of having one code base that runs on GL, Metal, Vulkan, etc. is worth pursuing. Would love to see more examples of Qt3D used for AR/MR, especially around efficient processing of frames coming from the likes of ARKit with minimal performance penalty.

@TTGil: Yes, Qt 3D is receiving improvements and new features. For example the soon-to-be-released Qt 5.13 brings support for glTF 2.0 scene import to Qt 3D.

Christian Feldbacher says:

Hi TTGil,
you can find an AR solution for Qt provided by Felgo here:
https://felgo.com/cross-platform-development/qt-ar-why-and-how-to-add-augmented-reality-to-your-mobile-app

Cheers, Chris

York says:

I don’t know why the Nvidia engine does not have better performance in 2D rendering. Many customers are using 2D clusters.

Laszlo Agocs says:

This (Qt 3D Studio 2.4) is the NVIDIA engine (more or less). 2D interfaces should continue using Qt Quick, as shown in https://blog.qt.io/blog/2019/04/02/optimizing-real-time-3d-entry-level-hardware/

M3GG1D0 says:

I thought Qt3D is going to benefit from these improvements. Then I studied the blog and comments, and realized that it is not the case. Qt3D API is nice and has lots of potential.

Uwe says:

Is there anything special about Qt 3D Studio or can we expect having a similar boost for any other application simply by dropping Qt 3D ? And why is the performance of Qt 3D so bad ?

@Uwe: The Studio 2.4 engine is created and optimized for its purpose, and especially on embedded that typically pays off. Note that these numbers do not compare Qt 3D as such, just the case where the Qt 3D Studio 2.3 engine was run on top of it. If your application is using Qt 3D directly, and it is a good fit for your use case and meets your performance goals, it is completely fine to continue with it.

Uwe says:

@Tuukka: sure if performance does not matter the more convenient programming API of Qt 3D makes it an interesting option.

But what about applications, where performance matters. Every application has a specific purpose, where it needs to be optimized for and your blog gives the impression, that Qt 3D prevents you from doing these optimizations.

Actually it is a similar conclusion one gets from the fact, that you solved the performance problems of your controls being implemented in QML by a new implementation in C++ ( Quick Controls 1 -> Quick Controls 2 ).

In both cases you solve performance problems by avoiding the technology you recommend to your users.

@Uwe: I do not see the logic behind the claim of “In both cases you solve performance problems by avoiding the technology you recommend to your users.” Qt Quick Controls 2 is a new version and certainly recommended to be used rather than Qt Quick Controls 1 (which is deprecated). Using the new runtime of Qt 3D Studio 2.4 is a simple and recommended update for all Qt 3D Studio users. As for Qt 3D, it continues unchanged and can be used just like before. As already pointed out, the optimizations done for Qt 3D Studio 2.4 are not applicable to Qt 3D due to differences in structure.

Uwe says:

@Tuukka: the recommended technology I was talking about is QML – not Controls 1 vs. 2.

The main reason behind the bad statistics of Controls 1 is, that it is completely implemented in QML. That’s why the Qt development decided to do most of Controls 2 in C++.
