Posted by gunnar
 in Qt, Painting, Graphics Dojo
 on Monday, September 22, 2008 @ 13:07

Sorry guys…

I know you have spent a lot of money on buying faster machines to get Qt to run fast enough for the software you are creating, but I’m now sad to inform you that Qt won’t make full use of these machines. In fact it will use your CPU a little bit less.

Jokes aside… We’ve been running a few optimizations internally, under the code name “Qt Falcon” (it’s going to make Qt fly!), and I wanted to share some of the current results. These things are not yet integrated into Qt Main, but they should be in place by the time Qt 4.5 goes Tech Preview.

Personally I sit on Windows for most of my daily work, so the benchmarks are from a Windows XP machine, and during the Falcon initiative we took the time to iron out a few things that have bothered us in the Windows code paths. Windows uses a software rendering engine, internally referred to as the “Raster Engine”. This is also the engine used for embedded and for QImage drawing on X11 and Mac.

Let’s start at the beginning: QPainter::begin(), a function that is called at the start of every single paint event in the history of Qt. When we originally designed the paint engines, we aimed at them being shared amongst different instances of the same subclass; for example, all QWidget painting was done using a single paint engine. For this reason, we put most of the initialization logic into QPaintEngine::begin(), because the actual device changed all the time. With the introduction of the backingstore in 4.2, all widgets are actually drawn to the same device, so this begin() initialization started to make less sense, but the design stayed. With 4.5, there is one raster engine per image and one per backingstore. The initialization is done outside of begin(), in the constructor if you can believe that.
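
To make the begin() change a bit more concrete, here is a minimal sketch in plain C++ (no Qt involved). The names Surface and PerSurfaceEngine are made up for illustration and are not Qt classes; the only point is that the device-dependent setup runs once in the constructor, so begin() and end() have almost nothing left to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an image or a backing store (not a Qt class).
struct Surface {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Surface(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0u) {}
};

// Hypothetical engine that is bound to one surface for its whole lifetime.
class PerSurfaceEngine {
public:
    explicit PerSurfaceEngine(Surface &surface)
        : surface_(surface),
          scanlineScratch_(static_cast<std::size_t>(surface.width), 0u)
    {
        assert(surface.width > 0 && surface.height > 0);
        // All device-dependent setup happens here, exactly once.
        ++setupRuns_;
    }

    // begin() only flips a flag; nothing is re-initialized per paint event.
    bool begin() {
        if (active_)
            return false;
        active_ = true;
        return true;
    }

    bool end() {
        if (!active_)
            return false;
        active_ = false;
        return true;
    }

    int setupRuns() const { return setupRuns_; }
    std::size_t scratchSize() const { return scanlineScratch_.size(); }
    const Surface &surface() const { return surface_; }

private:
    Surface &surface_;
    std::vector<std::uint32_t> scanlineScratch_;  // allocated once, reused
    int setupRuns_ = 0;
    bool active_ = false;
};
```

However many begin() / end() pairs run against the same surface, setupRuns() stays at one, which is the property the shared-engine design could not offer.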

On Windows, we also checked, in QRasterPaintEngine::begin(), whether the system had switched its cleartype settings since the last time. This check was actually costing ~25% of the call to QPainter::begin() (ouch!). By listening for the system event sent when the user changes the settings, instead of polling the value in the registry, we could kill those 25% and also support the feature that Qt switches cleartype when the user presses “Apply” in the control panel, which we previously didn’t do.
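
The polling-versus-notification idea can be sketched like this. CachedSetting and its callback are hypothetical names for illustration; the slow read stands in for the registry lookup. On Windows a handler like onSettingsChanged() would typically be driven by a settings-change message such as WM_SETTINGCHANGE, but the post does not say which event Qt actually listens for, so treat that part as an assumption.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: caches a system setting and refreshes it only when
// the platform reports a change, instead of re-reading it on every paint.
class CachedSetting {
public:
    explicit CachedSetting(std::function<bool()> slowRead)
        : slowRead_(std::move(slowRead))
    {
        assert(slowRead_ && "a reader callback is required");
        value_ = slowRead_();  // one expensive read up front
    }

    // Hot path: called from every begin(), costs a single member load.
    bool value() const { return value_; }

    // Cold path: call this from the platform's change notification handler.
    void onSettingsChanged() { value_ = slowRead_(); }

private:
    std::function<bool()> slowRead_;
    bool value_ = false;
};
```

The trade-off is that the cached value is only as fresh as the last notification, which is fine as long as the system reliably reports the change.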

A comparison of a plain QPainter::begin() / end() looks like this:

begin_end.png

The graph shows, in microseconds, the time to create a QPainter on a device and call end on it

We had a fair idea that save / restore was very costly when clipping was enabled in 4.4. Part of the reason for this is that the communication between QPainter and QPaintEngine was done via a flat state update. A restore was performed by replaying the previous stack element, so consider this case:

p.setClipRect(rect1, Qt::ReplaceClip);
p.scale(2, 2);
p.setClipRect(rect2, Qt::IntersectClip);
p.rotate(10);
p.setClipRect(rect3, Qt::IntersectClip);
p.save();
p.drawStuff();  // placeholder for the actual drawing calls
p.restore();

In the last line, restore(), QPainter would replay three clip operations to the underlying engine. Horrible, you may think, and it is a piece of code that has been troubling me since I wrote it in the early 4.0 days. With Falcon, the engines can be made aware of QPainter’s stack, making it possible to cache the results on each level. That means that restore() becomes just a stack pop(). The results look like this:
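
A rough model of the cached state stack, in plain C++ rather than Qt code: StatefulEngine, Rect and ClipOp are hypothetical names, and the sketch uses integer translation instead of scale / rotate so the clip stays a simple rectangle. Each stack level stores its already-resolved device clip, so restore() never has to replay anything.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Half-open rectangle: [x1, x2) x [y1, y2).
struct Rect {
    int x1, y1, x2, y2;
};

inline Rect intersect(const Rect &a, const Rect &b) {
    return Rect{std::max(a.x1, b.x1), std::max(a.y1, b.y1),
                std::min(a.x2, b.x2), std::min(a.y2, b.y2)};
}

enum class ClipOp { Replace, Intersect };

// Hypothetical engine that keeps one resolved state per save() level.
class StatefulEngine {
public:
    explicit StatefulEngine(const Rect &deviceBounds) {
        stack_.push_back(State{deviceBounds, 0, 0});
    }

    void translate(int dx, int dy) {
        stack_.back().dx += dx;
        stack_.back().dy += dy;
    }

    // Resolves the clip to device coordinates immediately and caches it.
    void setClipRect(const Rect &r, ClipOp op) {
        State &s = stack_.back();
        const Rect device{r.x1 + s.dx, r.y1 + s.dy, r.x2 + s.dx, r.y2 + s.dy};
        s.clip = (op == ClipOp::Replace) ? device : intersect(s.clip, device);
        ++clipComputations_;
    }

    // save() duplicates the resolved state; restore() just pops it.
    void save() {
        const State copy = stack_.back();
        stack_.push_back(copy);
    }

    void restore() {
        assert(stack_.size() > 1 && "restore() without matching save()");
        stack_.pop_back();
    }

    const Rect &clip() const { return stack_.back().clip; }
    int clipComputations() const { return clipComputations_; }

private:
    struct State {
        Rect clip;  // cached, already in device coordinates
        int dx;
        int dy;
    };
    std::vector<State> stack_;
    int clipComputations_ = 0;
};
```

With three clips set before save(), a restore() in this model costs one pop_back() and zero clip computations, where the replay approach would redo all three.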

save_restore.png

The graph shows, in microseconds, the time it takes to run save, followed by a state change, followed by restore

Another Windows thing that had bugged me for a while is the text drawing. Two separate things came together in QRasterPaintEngine::drawTextItem() as a bit of a mess. Again, I’m largely to blame and the code has been troubling me for some time, but I didn’t have time to get back and fix these things. Until now, anyway… Point one was that the only way to draw nice fonts on Windows is using GDI. It does (in my opinion, I know people disagree ;) ) the nicest font rendering of all the systems, so using FreeType or another method would not be acceptable visually, as Qt would look worse than other apps on the platform. So we had to mix GDI and our own raster engine to do clipping and textured / gradient text. The other point is that any pixel touched by GDI will have 0 as its alpha channel. The raster engine relies on premultiplied alpha, so this basically destroys all rendering done afterwards. *sigh*
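
For readers who haven’t dealt with premultiplied alpha, here is a small self-contained sketch of why a zeroed alpha byte is a problem. The 0xAARRGGBB layout and all the function names are assumptions made for this example, not the raster engine’s actual code. In a premultiplied pixel no color channel may exceed alpha, so a pixel with color but alpha 0 is simply invalid until the alpha is patched back in.

```cpp
#include <cassert>
#include <cstdint>

using Argb = std::uint32_t;  // assumed layout: 0xAARRGGBB, premultiplied

inline unsigned alphaOf(Argb p) { return p >> 24; }

// A premultiplied pixel is valid only if every color channel <= alpha.
inline bool isValidPremultiplied(Argb p) {
    const unsigned a = alphaOf(p);
    return ((p >> 16) & 0xFFu) <= a && ((p >> 8) & 0xFFu) <= a && (p & 0xFFu) <= a;
}

// Restores opacity on a pixel written by an API that left alpha at zero.
inline Argb patchOpaque(Argb p) { return p | 0xFF000000u; }

// Source-over for premultiplied pixels, per channel:
//   out = src + dst * (255 - srcAlpha) / 255
inline Argb sourceOver(Argb src, Argb dst) {
    assert(isValidPremultiplied(src) && isValidPremultiplied(dst));
    const unsigned inverse = 255u - alphaOf(src);
    Argb out = 0;
    for (int shift = 0; shift <= 24; shift += 8) {
        const unsigned s = (src >> shift) & 0xFFu;
        const unsigned d = (dst >> shift) & 0xFFu;
        unsigned c = s + (d * inverse + 127u) / 255u;  // rounded division
        if (c > 255u)
            c = 255u;
        out |= static_cast<Argb>(c) << shift;
    }
    return out;
}
```

A pixel such as 0x00204060 fails the validity check, while patchOpaque() turns it into 0xFF204060, which blends as a normal opaque pixel.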

The solution to both was to use GDI to render into a buffer and sample the values back into the raster engine’s buffer, patching the alpha channel at the same time. For cleartype this was even worse, as the separate buffer first needed to be filled with the background (cleartype pixels depend on the background as well as the foreground). It ran “ok”, but we were quite aware that it could be done better. With Falcon, we introduce a mask-texture for each font engine, which generates the glyphs once using GDI. The approach supports both normal and cleartype text drawing, and the cleartype approach even does the full RGB blend with gamma correction (thanks to Samuel, who spent an entire day with me getting the gamma correction of three-component alpha blending right). The speedups are quite noticeable. The results are measured in milliseconds:
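
Here is a simplified sketch of the two ideas in that paragraph: generating each glyph mask once and reusing it, and blending a color component through a coverage value with gamma correction. GlyphMaskCache, the rasterizer callback and the gamma value of 2.2 are all assumptions for illustration; the real mask-texture lives inside Qt’s font engines and gets its glyphs from GDI.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Coverage mask for one glyph, one byte per pixel. A cleartype-style mask
// would carry three coverage values per pixel (one each for R, G and B).
struct GlyphMask {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> coverage;
};

// Hypothetical cache: the expensive rasterizer runs once per glyph.
class GlyphMaskCache {
public:
    using Rasterizer = std::function<GlyphMask(std::uint32_t glyphId)>;

    explicit GlyphMaskCache(Rasterizer rasterize)
        : rasterize_(std::move(rasterize))
    {
        assert(rasterize_ && "a rasterizer callback is required");
    }

    const GlyphMask &mask(std::uint32_t glyphId) {
        auto it = masks_.find(glyphId);
        if (it == masks_.end())
            it = masks_.emplace(glyphId, rasterize_(glyphId)).first;
        return it->second;
    }

private:
    Rasterizer rasterize_;
    std::unordered_map<std::uint32_t, GlyphMask> masks_;
};

// Blends one color component through a coverage value in linear light.
// The gamma of 2.2 is an assumed default, not a value taken from Qt.
inline std::uint8_t blendComponent(std::uint8_t src, std::uint8_t dst,
                                   std::uint8_t coverage, double gamma = 2.2) {
    const double s = std::pow(src / 255.0, gamma);
    const double d = std::pow(dst / 255.0, gamma);
    const double c = coverage / 255.0;
    const double mixed = s * c + d * (1.0 - c);
    return static_cast<std::uint8_t>(
        std::lround(std::pow(mixed, 1.0 / gamma) * 255.0));
}
```

For the three-component case, blendComponent() would be applied to red, green and blue separately, each with its own coverage value from the mask.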

text_windows.png

The graph shows, in milliseconds, the time it takes to draw the text “abcdefg” on a QPixmap using the default font with either cleartype or non-cleartype

We also removed some of the overhead of drawPixmap() on the raster engine. The problem was that there was a bit of set-up before we could get into the actual pixel-by-pixel blending. Instead of going through the generic rendering pipeline, we introduced a faster path for unclipped pixmaps (or pixmaps that fit inside a rectangular clip). I won’t bore you with the details, as this blog is already twice as long as I intended it to be. The results for 4.4 and Falcon converge and become identical, speed-wise, at around 200×200, but for icon-size pixmap drawing there is quite a visible difference.
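
Without going into the real details, the shape of such a fast path can be sketched as follows. Everything here (Image, blitOpaque, BlitPath) is a hypothetical model, and it only does an opaque copy rather than real blending: if the target rectangle lies completely inside the clip rectangle, whole rows are copied with no per-pixel checks, otherwise each pixel is tested against the clip.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rect {
    int x, y, w, h;
};

inline bool contains(const Rect &outer, const Rect &inner) {
    return inner.x >= outer.x && inner.y >= outer.y &&
           inner.x + inner.w <= outer.x + outer.w &&
           inner.y + inner.h <= outer.y + outer.h;
}

struct Image {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Image(int w, int h, std::uint32_t fill)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), fill) {}

    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width) +
               static_cast<std::size_t>(x);
    }
};

enum class BlitPath { Fast, Generic };

// Copies src onto dst at (dx, dy), honoring a rectangular clip.
inline BlitPath blitOpaque(Image &dst, const Rect &clip, const Image &src,
                           int dx, int dy) {
    assert(src.width > 0 && src.height > 0);
    assert(contains(Rect{0, 0, dst.width, dst.height}, clip));

    const Rect target{dx, dy, src.width, src.height};
    if (contains(clip, target)) {
        // Fast path: one containment test, then straight row copies.
        for (int y = 0; y < src.height; ++y)
            std::copy_n(src.pixels.data() + src.index(0, y), src.width,
                        dst.pixels.data() + dst.index(dx, dy + y));
        return BlitPath::Fast;
    }

    // Generic path: every pixel is tested against the clip.
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const int tx = dx + x;
            const int ty = dy + y;
            const bool inside = tx >= clip.x && tx < clip.x + clip.w &&
                                ty >= clip.y && ty < clip.y + clip.h;
            if (inside)
                dst.pixels[dst.index(tx, ty)] = src.pixels[src.index(x, y)];
        }
    }
    return BlitPath::Generic;
}
```

The saving is a fixed per-call cost, which is why it shows up for icon-sized pixmaps and disappears once the pixel work itself dominates.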

draw_pixmap.png

The graph shows, in microseconds, how long it takes to draw a QPixmap, solid or semi-transparent, at the specified sizes

What I’ve mentioned above is a bunch of separate small things, and you may be wondering how this affects you in the real world. You probably don’t do for-loops of begin() / end() or save / restore. So I’ll finish up with some real-world examples. The numbers below are taken from a benchmark of a few widgets where we simply run “repaint()” a bunch of times. The QLabel numbers don’t have much relation to QComboBox etc., so don’t pay too much attention to that, but rather to how each widget changes from 4.4 to Falcon.
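
If you want to try this kind of measurement yourself, a loop-and-average helper is enough for a first look. This is a generic sketch, not the harness used for the graphs in this post; averageMicroseconds is a made-up name, and the callable you pass in would wrap whatever you want to time, for example a repaint() call on a widget.

```cpp
#include <cassert>
#include <chrono>

// Runs op() `iterations` times and returns the mean time per call in
// microseconds, measured with a monotonic clock.
template <typename Op>
double averageMicroseconds(Op op, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        op();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

Averaging over many iterations smooths out timer resolution, but it also hides warm-up effects, so compare like with like, as in the per-widget 4.4 versus Falcon comparison.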

overall.png

The graph shows, in milliseconds, how long a repaint of a few widgets takes

Now, much of the overall speedup here, probably over 50% of it, is due to some awesome work that Bjørn Erik has done in the backingstore. The rest is due to small things here and there, like the ones mentioned above.

There has been quite a bit of refactoring in the works here, and some work is still ahead of us (like re-enabling GDI-based glyph generation for transformed text, which will be vector-path based in the upcoming TP), but I hope that the changes we’ve made will benefit most users, and that those who see problems with the new approaches will let us know so we can look at those too.

Cheers!

17 Responses to “Sorry guys…”

» Posted by Shriramana Sharma
 on Monday, September 22, 2008 @ 15:35

Hello all this is about Qt/Win. Is there any improvement in the speed of rendering on Qt/X11? I find that KDE 4’s graphics rendering (which is done by Qt 4.4, if I am not mistaken) is much slower and less clean than KDE 3’s rendering done by Qt 3. And this is on Linux/X11.

» Posted by Pete
 on Monday, September 22, 2008 @ 16:18

Sounds exciting. Can’t wait to hear about X11 performance.

» Posted by David W
 on Monday, September 22, 2008 @ 23:14

As awesome as this all is, my reading of this post hit multiple speedbumps as I tried to parse graphs with no Y-axis label. :)

» Posted by Björn
 on Tuesday, September 23, 2008 @ 07:54

That sounds great. Any chance that this also helps with Stylesheet performance?

» Reply from gunnar
 on Tuesday, September 23, 2008 @ 08:04

@Shriramana: Qt is responsible for the interior of all the widgets in KDE 3 and KDE 4. The window compositing, top-level transparency and blurring are done by the KDE window manager, X and XRender. From my limited use of KDE 4, it feels like the top-level window composition is a bit sluggish, but I’ve only had minimal exposure. In addition, KDE 4 makes use of many of the more advanced QPainter features for the window interior rendering, like gradients and transparency, and these will naturally be more costly than the solid colors used in KDE 3. Gradients are especially bad on Qt / X11 currently, but I think this topic merits a separate blog entry… stay tuned…

@David W: Thanks for the comment about the graphs. It’s the first time I did graphs in Open Office, and I only found the y-axis label setter halfway into the process. I’ve added a small comment to all the graphs now, so I hope that clarifies the numbers.

» Posted by damien m
 on Tuesday, September 23, 2008 @ 17:21

Yes, what about style sheet performance, which is “not so fast”?

» Reply from gunnar
 on Wednesday, September 24, 2008 @ 06:35

@Björn/damien: I haven’t looked at style sheet performance specifically, but the state change and pixmap optimizations should make a bit of a difference.

» Posted by Frédéric COIFFIER
 on Wednesday, September 24, 2008 @ 08:53

Will these optimizations improve performance on a remote connection (like the standard “export display” or with a FreeNX/NX protocol)?

» Posted by Tsiolkovsky
 on Thursday, September 25, 2008 @ 14:03

I’m also interested to see these benchmarks from Linux/X11 to see how much improvement we can expect there. Also looking forward to reading about the gradients on X11. Are there any improvements for this being prepared for Qt 4.5? Anyway, I’m very excited about all those graphics performance improvements and the animation API that are coming with Qt 4.5. Keep up the excellent work, people!

» Reply from gunnar
 on Friday, September 26, 2008 @ 06:24

@Frédéric: All of the above mentioned optimizations are minimizing work done in the raster engine, our software backend for QPainter, so it is very unlikely that this will affect exported displays.
@Tsiolkovsky: Thanks, we’ll have more information on Linux / Mac in a while…

» Posted by Guest
 on Monday, September 29, 2008 @ 00:33

When you say Win, do you mean XP? What about Vista? I guess they have different desktop rendering engines. Will Vista benefit as well?
And what about X11 performance? X11 is the “platform” that really needs this framework, but KDE 4.1.1 is insanely sluggish.

» Posted by Michael "Windows" Howell
 on Monday, September 29, 2008 @ 02:02

@Guest: “Will Vista benefit as well?”
I’m pretty sure that Qt uses the same stuff on XP as it does on Vista. IMO, Microsoft needs to do some optimization work on Vista more than QSW does…

» Posted by Guest
 on Monday, September 29, 2008 @ 10:57

@Michael: IMO, Microsoft needs to do some optimization work on Vista more than QSW does…
Hmm, why then does the Vista UI work flawlessly compared to KDE?

» Posted by Michael "Mouth" Howell
 on Friday, October 03, 2008 @ 04:20
» Posted by Guest
 on Saturday, October 04, 2008 @ 00:35

@Michael: Vista takes advantage of HW acceleration KDE does not.
And what does the integrated composition manager do then? And why can’t KDE use the same hardware acceleration?

» Posted by anonymous
 on Tuesday, October 07, 2008 @ 20:45

yeah, I want to see benches on linux/x11/kde4, too! time for another blog entry ;)

» Posted by Michael "Pain in the butt" Howell
 on Tuesday, October 07, 2008 @ 22:17

@Guest: Actually, it is the use of “accelerated” rendering (XRender) that makes Qt/X11 run slower on certain hardware (where the drivers absolutely SUCK at XRender). On drivers that properly support XRender (think Intel), the rendering is equal or faster.



© 2008 Nokia Corporation and/or its subsidiaries. Nokia, Qt and their respective logos are trademarks of Nokia Corporation in Finland and/or other countries worldwide.
All other trademarks are property of their respective owners.