Eskil Abrahamsen Blomfeldt

Insanity is shaping the same text again and expecting a different result

Published Monday March 1st, 2010
Posted in C++, OpenGL, Painting, Performance, Qt

Albert Einstein has been quoted as saying that “insanity is doing the same thing over and over again and expecting a different result.” Apparently this is a misquote, and the original quote actually belongs to Rita Mae Brown, but that’s not important right now. What’s important is that most Qt applications are crazy.

I’ll explain. Some readers may remember Gunnar’s excellent blog series about graphics performance and how to get the most out of it in Qt. He mentioned a few times that text rendering in Qt is slower than we’d like.

To see why text rendering is so slow, we need to look at what happens when you pass a QString into QPainter::drawText() and ask it to display it on screen. A QString is just an array of integer values which are defined to signify specific symbols in specific writing systems. How these symbols should actually look on the screen is defined by the font you have selected on your painter.

So the first step of drawText() is to take the code points and turn them into index values which reference an internal table in the font. The indices are specific to each font, and have no meaning outside the context of the current font.

The second step of drawText() is to collect data from the font which describes how each glyph should be positioned in relation to the surrounding glyphs. This positioning step is potentially very complex. Several different tables in the font file need to be consulted, with programs and instructions that handle things like kerning (allowing parts of certain glyphs to “hang over” or “stretch underneath” other glyphs) and placing one or more diacritical marks on the same character. Some writing systems also allow complex reordering of glyphs based on the context of the surrounding characters, as explained by Simon in his blog from 2007. This complex shaping of the text is currently handled by the Harfbuzz library in Qt.

The third step applies only if the text has a layout applied to it. The layout would be the part which breaks text into nicely formatted lines. In Qt, this could be based on HTML code, using QTextDocument or WebKit, or it could be a simpler layout, just making the text wrap and align within a bounding rectangle. The former isn’t supported by QPainter::drawText(), so I’ll focus on the latter. Using information from the shaping step, the text layout calculates the width of unbreakable portions of the text and tries to format the text in a way which looks nice on screen but which does not expand beyond the bounds set by the user.

In the fourth and final step, the paint engine takes over. Its job is to draw the symbols retrieved in the first step at the positions calculated in the second and third step. In most of Qt’s performance-sensitive paint engines, this is done by caching a pixmap representation of the glyph the first time it is drawn, and then just redrawing this pixmap for every call. This is potentially very quick.

While these four steps may be slightly intertwined in Qt today, this is in principle what happens every single time you call drawText() and pass in a QString and a bounding QRect. Yet, in very many cases, the text, the font and the rectangle all remain completely static for the duration of your application, or at least for the bulk of it. And this is the insane part: a lot of time is wasted repeating the same work. Qt already provides QTextLayout as a way to cache the results of the first three steps and push them directly into the paint engine. However, QTextLayout is somewhat complicated to use, it has overhead related to its other use cases, and it stores a lot more information than is needed just for putting the symbols on the screen, making it unsatisfactory in very memory-sensitive settings.
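
To make the cost difference concrete, here is a minimal sketch of the "shape once, draw many" idea in plain C++. It is not Qt code and not how Qt implements anything: ToyFont, PositionedGlyph, drawUncached() and CachedText are hypothetical names invented for illustration, and the "shaping" is a trivial fixed-advance placeholder standing in for steps one to three.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a positioned glyph: a font-specific index plus an x offset.
struct PositionedGlyph {
    std::uint32_t glyphIndex;
    double x;
};

// Toy "font" with a fixed advance. A real font consults several tables here;
// this placeholder only exists so we can count how often the work is repeated.
struct ToyFont {
    double advance = 8.0;
    mutable std::size_t shapeCalls = 0; // number of times the expensive steps ran

    std::vector<PositionedGlyph> shape(const std::u32string &text) const {
        assert(advance > 0.0);
        ++shapeCalls;
        std::vector<PositionedGlyph> glyphs;
        glyphs.reserve(text.size());
        double x = 0.0;
        for (char32_t codePoint : text) {
            // Placeholder mapping from code point to glyph index.
            glyphs.push_back({static_cast<std::uint32_t>(codePoint) % 256u, x});
            x += advance;
        }
        return glyphs;
    }
};

// Stand-in for step four: "paint" already positioned glyphs.
// Returns the number of glyphs that would be drawn.
std::size_t paintGlyphs(const std::vector<PositionedGlyph> &glyphs) {
    return glyphs.size();
}

// drawText()-style: the text is shaped again on every call.
std::size_t drawUncached(const ToyFont &font, const std::u32string &text) {
    return paintGlyphs(font.shape(text));
}

// Static-text-style: shape once at construction, reuse the result in every draw.
class CachedText {
public:
    CachedText(const ToyFont &font, const std::u32string &text)
        : glyphs_(font.shape(text)) {}

    std::size_t draw() const { return paintGlyphs(glyphs_); }

private:
    std::vector<PositionedGlyph> glyphs_;
};
```

In this model, calling drawUncached() in every paint event increments shapeCalls each time, while a CachedText object increments it once no matter how often draw() is called.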

We decided there was a need for a specialized class to solve this problem. We named it QStaticText, and it will be available in Qt 4.7. QStaticText has been optimized specifically for the use case of redrawing text which does not change from one paint event to another. We’ve tried to keep the memory footprint to a minimum, and currently it has an overhead of approximately 14 bytes per glyph (including the 2 bytes per Unicode character in the string, which would presumably already be part of the application), as well as about 200 bytes of constant overhead.
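
As a rough back-of-the-envelope sketch, the figures above can be turned into a small estimate. The function names are hypothetical, the calculation assumes one glyph per character, and the pixmap comparison assumes a 32-bit-per-pixel image of a made-up size (400 by 16 pixels), which is an illustration rather than a measurement.

```cpp
#include <cstddef>

// Approximate figures quoted in the post.
constexpr std::size_t kBytesPerGlyph = 14;
constexpr std::size_t kConstantBytes = 200;

// Estimated footprint of a static text item, assuming one glyph per character.
constexpr std::size_t estimateStaticTextBytes(std::size_t glyphCount) {
    return kConstantBytes + kBytesPerGlyph * glyphCount;
}

// Estimated footprint of a cached pixmap, assuming 4 bytes per pixel (assumption).
constexpr std::size_t estimatePixmapBytes(std::size_t width, std::size_t height) {
    return width * height * 4;
}
```

Under these assumptions a fifty-glyph string costs about 900 bytes as static text, while a 400 by 16 pixel pixmap at 4 bytes per pixel would take 25,600 bytes.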

In the rest of this blog, I’ll show some graphs to illustrate the benefits of using QStaticText for drawing text. QStaticText is supported by the raster engine (the software renderer used by default on Windows), the OpenGL engine and the OpenVG engine. For now, I’ll focus on the raster engine and the OpenGL engine, and on the following platforms: Windows/desktop, Linux/desktop and the N900 (also running Linux, of course). Note that the hardware on the Windows and Linux machines is different, so the results are not comparable from platform to platform.

Benchmarks for fifty character, single-line text
The benchmark I’m running is this: drawing the same 50 character string over and over again in each paint event and measuring how many “glyphs per second” we can achieve using different techniques to draw the text. I am testing the following text drawing mechanisms:

  • A call to QPainter::drawText() with no bounding rectangle.
  • A call to QPainter::drawStaticText() with no bounding rectangle.
  • Caching the entire string in a pixmap before-hand and drawing this in each paint event using QPainter::drawPixmap().
  • When testing on the OpenGL paint engine, the graph will also contain results for QStaticText with the performance hint QStaticText::AggressiveCaching. This is a hint to the paint engine that it is allowed to cache its own data, trading some memory for speed. It is currently used by the OpenGL engine to cache the vertex and texture coordinate arrays that are passed to the GPU when drawing the glyphs.

    On Windows
    Let’s start off with the results for the raster engine on Windows. As I said, the measurement is in “glyphs per second”, i.e. the number of symbols we can put on the screen during one second of running the test. The measurement is based on the frame rate of the test, which is taken as the average over nine seconds of execution per test case. Note that ClearType rendering was turned off in the OS during the test. The difference between a drawPixmap() result and a drawStaticText() result would be larger with ClearType turned on, but ClearType is not generally supported when caching the text in a pixmap, since the pixmap will inevitably need to have a transparent background, and you can’t do subpixel antialiasing on top of a transparent background. Therefore all the benchmarks are run without subpixel antialiasing to get a better comparison.
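
For reference, a "glyphs per second" figure of this kind can be derived from a frame count roughly as sketched below. The function name and the drawsPerFrame parameter are assumptions: the post does not state how many draw calls each paint event issued, so the numbers in the usage note are made up.

```cpp
#include <cassert>

// Glyphs per second from a frame count measured over a fixed interval.
// drawsPerFrame is a hypothetical parameter (not given in the post).
double glyphsPerSecond(long framesRendered, double seconds,
                       int drawsPerFrame, int glyphsPerDraw)
{
    assert(seconds > 0.0);
    const double framesPerSecond = static_cast<double>(framesRendered) / seconds;
    return framesPerSecond * drawsPerFrame * glyphsPerDraw;
}
```

For example, 540 frames in nine seconds with a hypothetical 100 draw calls per frame of a fifty-glyph string works out to 60 × 100 × 50 = 300,000 glyphs per second.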


    As the results show, the fastest way to draw text is to cache it in a pixmap and draw this, as pixmap drawing is extremely fast on modern hardware. However, in many circumstances you don’t have the memory to spare for this kind of extravagance, and drawStaticText() pushes over half as many glyphs per second as the equivalent drawPixmap() call. It is also three times faster than a regular drawText() call.

    Using the OpenGL paint engine instead, performance of drawPixmap() shoots through the roof:


    The other bars look small in comparison, but drawStaticText() using the aggressive caching performance hint in fact pushes out 5.6 million glyphs per second in this benchmark, while a regular drawText() call manages a measly fifth of that.

    On Linux
    Similar numbers occur on Linux:


    Using drawStaticText() gives you more than a 2x performance boost over using drawText(), and drawPixmap() is a little bit less than 1.5 times the speed of drawStaticText(). When using the OpenGL engine, the difference is smaller:


    As you can see, drawing a cached pixmap on Linux desktop is only slightly faster than drawing the static text item when aggressive caching is used. The hardware and the driver both play a part here, but at the very least we can see that both outperform drawText() by seven or eight times.

    On N900
    All the benchmarks so far have been on the desktop, where memory is cheap. Caching a few text items as pixmaps is unlikely to be the proverbial last straw on those platforms, and as we have seen, pixmap caching has the potential to be really fast. On an embedded device, however, we need to be a little bit more careful when we allocate big chunks of memory, so something like QStaticText, which is both lean and fast, can be a great tool on these platforms. So let’s look at a few benchmarks for the N900 as well.

    For the raster engine on the N900, the drawText() baseline performance is currently nothing short of horrible, as you can see from the following chart:


    This is of course a puzzle which will be investigated more closely, as there’s no reason why it should be this much slower to call drawText(), but for now we recommend using the native engine or a QGLWidget viewport on this device. At least it makes the other bars look really large in comparison. A more interesting result is that drawStaticText() can push as much as two thirds the number of glyphs per second as drawing a single pixmap that covers the same area, so we have a pretty good ratio of performance on this device.

    As we see from the following chart, similar numbers can be achieved when using the OpenGL engine:


    The benchmark results displayed here so far are for a single-line piece of text, so the drawText() call can skip the third step outlined at the beginning of the blog, as it does not need to do any high-level text layout. On text which does require a layout, performance will be even worse with drawText(), but approximately the same with drawStaticText() and drawPixmap(), since the layout step has already been done in advance. Another thing to note is that the text is fairly long and fairly dense. For shorter texts, and/or text which has more space (such as a multi-line string might have), the performance of drawStaticText() may very well be greater than that of drawing a pixmap, since the number of pixels touched becomes a greater factor in the equation.
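
The "pixels touched" argument can be illustrated with a deliberately crude model. The function names and all dimensions below are hypothetical, and the model ignores per-call overhead entirely; it only compares how many pixels each approach would have to blit.

```cpp
#include <cstddef>

// Pixels touched when blitting one cached pixmap covering the whole text area.
constexpr std::size_t pixmapPixels(std::size_t width, std::size_t height) {
    return width * height;
}

// Pixels touched when blitting each cached glyph image individually,
// assuming every glyph occupies the same small rectangle (assumption).
constexpr std::size_t glyphPixels(std::size_t glyphCount,
                                  std::size_t glyphWidth,
                                  std::size_t glyphHeight) {
    return glyphCount * glyphWidth * glyphHeight;
}

// Dense single line: 50 glyphs of 8x16 fill a 400x16 area exactly.
static_assert(glyphPixels(50, 8, 16) == pixmapPixels(400, 16),
              "dense text touches as many pixels either way");

// Sparse multi-line text: 20 glyphs spread over a 400x48 area.
static_assert(glyphPixels(20, 8, 16) < pixmapPixels(400, 48),
              "sparse text touches far fewer pixels glyph by glyph");
```

In the sparse case the glyph-by-glyph approach touches 2,560 pixels against 19,200 for the pixmap, which is the situation where drawing the static text could plausibly come out ahead.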

    An interesting measurement which is not included here is the CPU load of the different functions. We don’t have any formal benchmarks for that at the moment, but since less time is spent on CPU-intensive work when using drawStaticText() instead of drawText(), the CPU will have more free time to do other things, which is good. Another pleasant discovery we made while benchmarking QStaticText on the N900 is that you have to increase the number of draw calls made per frame to a pretty high number before it visibly factors into the time spent in the paint event. This means that even with, say, fifty strings, the drawStaticText() calls should not have any considerable impact on the performance of the application. Swapping the front and back buffers will still be the main bottleneck, which is how it should be.

    So the bottom line is: If you are using drawText() in your application to draw text that is never or very rarely updated, then you might consider using QStaticText instead when you start building against Qt 4.7, and we’d love to hear what you think about the API and the performance once you get a chance to try it out.



    domgenest says:

    Is the approximate release date of Qt 4.7 known?

    scorp1us says:

    What about QGraphicsText items?

    scorp1us says:

    What about affine drawing of QGraphicsTextItems? It would be handy if this was supported since scaling is problematic. If you scale QGraphicsTextItems you see pixels when using QGraphicsTextItems. Using paths is better, it keeps it pixel perfect, but how does one ideally cache that? Since scaling is common on graphics view, I really have need for it. I currently cache the painter path and draw that. Is there a better way?

    Tim says:

    Any idea why linux is so much slower than windows? Was it the same hardware?

    André says:

    Interesting development. What is perhaps a bit weird is the maximum size part of the API. Other parts of the Qt API allow you to set a maximum width, and get the required height back for the text. Why does this class work differently? What happens if the size you specify is not big enough to fit the text? Perhaps this should be reviewed before it is finalized. I hope I’m not required to first calculate the height using QTextLayout, and then recalculate that same height using QStaticText?

    IMHO, the API of this class should match the other text layout API’s in Qt as closely as possible.

    Anonymous says:

    The quote definitely fits Einstein, because that’s exactly what happens in experiments where quantum effects dominate: one does the same experiment again and again, and get different results. Consequently, Einstein argued, it can’t be exactly the same experiment that’s repeated. There must be a difference somewhere. Well, actually his argument was more complicated, but this is already too off topic as it is 🙂

    Fazer says:

    > Posted by Tim
    > Any idea why linux is so much slower than windows? Was it the same hardware?

    I’ll just quote the article:
    “Note that the hardware on the Windows and Linux machines is different, so the results will not be comparable from platform to platform.”

    Eskil Abrahamsen Blomfeldt says:

    André: The API of the maximum size is meant to emulate the drawText() case where you pass a bounding and clipping rectangle for the text. Since QStaticText was meant as an alternative to drawText(), we wanted to match the behavior and API as closely as possible. The bounding rectangle passed into the drawText() call is usually not the size you want the text to be, but its upper limit (hence the name “maximumSize”), such as the boundaries of the widget. The function is there to allow the text to wrap if it extends e.g. beyond the edge of the window.

    Edit: On second thought, setMaximumWidth() would probably serve the same purpose and make more sense in the API. We’ll consider this.

    Philippe says:

    Very interesting. Could that be used to improve existing controls such as QLabel? (without changing our code).

    Strahinja Markovic says:

    Will QStaticText be used in the QtWebkit port? I’m hoping for a rendering speed increase, it’s currently damn slow…

    lainwir3d says:

    What about QListView? Is there a way to make it use either drawStaticText or draw a CachedPixmap? Scrolling a QListView on my s3c2440 arm9 board is really slow at the moment so I thought that maybe this could improve things.

    dubik says:

    Would be nice if you would check how quickly those platforms can recreate the cached image.

    Eskil Abrahamsen Blomfeldt says:

    dubik: We haven’t done any very thorough benchmarks for that yet, but we did test it, and recreating the cached image was measured as about 3/4 the speed of altering the static text on the N900. Altering the static text should be similar in performance to the difference between the drawText() and drawStaticText() bars in the graphs, although that’s just my educated guess, as I do not have numbers to back it up at the moment.

    To everyone who asked: I do think it makes sense to use QStaticText in several locations in Qt, but at the moment it has neither been tested nor discussed in any detail. The only downside I can see would be the slight increase in memory use for the widgets. It will most likely not happen for Qt 4.7.

    Strahinja: WebKit needs to have more low-level control over the text layout, so it will probably use a different but related approach. See for further details. I don’t know if they will be able to use QStaticText in the mean time.

    sdm says:

    What results are to be expected on Mac OSX ?

    espenr says:

    Cool! Could you write another post that goes a bit more in depth on how it actually works? Also, what are our plans on optimizing existing Qt widgets to make use of this? (yes I could just ask you, but you weren’t in the office :D)

    Someone says:

    Lets gossip about the class name: The class QStaticText does not sound like any existing class ? Is it a graphicsitem – then name it QGraphicsStaticTextItem. But then, there is already one (and the feature is actually only worth a parameter) – lets keep it:

    QGraphicsTextItem( …, bool static = false )

    as should be

    QPainter::drawText( QString, bool static = false )

    In my humble opinion its more C++/Qt’ish..

    Eskil Abrahamsen Blomfeldt says:

    sdm: The results on Mac OS X (raster and OpenGL) should be similar to the ones in Linux.

    espenr: Yep, I’ll try to get around to that later =) Currently plans to optimize existing Qt widgets are pending until higher priority matters are settled, but the class was definitely written with those optimizations in mind. (I’m in my office now, if you have further questions 😉

    Someone: We require an object to keep the cache in, so a drawText() with an optional bool would not suffice (the Qt-way would probably be to have an enum instead of a bool, since a true/false at the end of an arbitrary function call is just mysterious). It is not a graphics item, either. It is a piece of text which should remain mostly static for the duration of the application. “Text” in Qt usually indicates that it’s a visual representation of text, and we prepended “static” to indicate its usage. Seeing as how it is not a graphics item (in which case it would of course have a name that included this information), I don’t understand how the name is different from the names of other stand-alone classes in Qt. We considered a range of names, but the great majority favored QStaticText over the alternatives.

    Someone says:

    Ok, ok – I hail to anybody who voted in oslo for “QGraphicsStaticTextItem” ?
    a) It’ll be in line with QGraphicsScene when start typing into assistant-qt4 “QGraphicsS..”.
    b) It would establish a QGraphicsStaticItem as a base for ..expensive drawing *’s… whatever …

    …great anyway that q*scene performance is so up in the priorities !

    Eskil Abrahamsen Blomfeldt says:

    Someone: Actually, the functionality provided by QStaticText is required in other use cases than just the Graphics View framework. Therefore it is currently not a QGraphicsItem, but a more low-level class. QGraphicsStaticTextItem would potentially be an additional class which used the functionality provided by QStaticText in the Graphics View framework. Personally, I would prefer it to be a property of QGraphicsTextItem instead. However, this class or function is currently not on the road map for Qt 4.7.

    Romain says:

    Very interesting article, raises a couple of questions:

    – I’m (positively) surprised to see that the OpenGL version of drawText() outperforms the Raster version on both Windows and Linux, whereas it’s always been assumed that text drawing in OpenGL had very bad performance, supposedly even worse than Raster. Is that a consequence of other improvements in the OpenGL paint engine maybe ?

    – I guess the numbers depend a lot of the OpenGL driver and version used. Could you mention this as well ? (OpenGL1 or OpenGL2 for windows/linux ? N900 is likely OpenGL/ES2) and which gfx card and driver on linux ?

    Anyway thanks, that’s very good news – and perfect timing 😉

    Julien says:

    This class looks very interesting ! However, looking at the API, it seems like the only way to draw formatted text is to use rich text (HTML). Have you considered an API similar to QTextLayout::setAdditionalFormats() ?

    Also, something like QTextLine::xToCursor() would be useful. It doesn’t need to be fast, but this is something that’s very hard to do manually, and is required to make parts of the text “clickable” for example.

    Basically the more of the QTextLayout API you support (as long as it doesn’t go against performance or memory use), the more people will be able to switch away from QTextLayout, and enjoy the performance benefits !
