Last October I attended a meeting in Boston where various parties dealing with text layout met. This included members of the Pango/Gtk+, KOffice, OpenOffice/ICU and Scribus projects. Right now we duplicate a fair amount of work to support complex scripts and exotic languages. Only Pango and Qt already share some code, which originates from the FreeType project, for interpreting OpenType tables. At that meeting we agreed that it would be nice to share more of the work and to build a common layer that can support not only OpenType but also SIL Graphite or Apple’s AAT. Most importantly, such a layer would provide a single place in the free desktop software stack to add support for new complex languages, and it would give consistent text shaping behavior across applications.
As a first important step, Trolltech decided to relicense and contribute their existing code. Lars and I have been working on this over the past months, and we have now finally got around to pushing our changes to a git repository on freedesktop.org. Based on the existing HarfBuzz code we’ve created a first version of a common API and ported our shaping engines, which operate on top of OpenType and provide some fallbacks if a font does not contain the tables needed for shaping.
So now it’s screenshot time to demonstrate the beauty of shaped complex text. Thanks to frequent recent visits to a Sri Lankan restaurant here in Oslo, and with Girish’s help, I decided to use some Tamil words for the demonstration. We usually start the meal with a sort of savory doughnut called Vadai (வடை). The “main” meal then consists of a kind of crêpe called Dosa (தோசை), from which you tear off pieces with your hands and dip them into some chutney or Sambar (சாம்பார்). You’d be surprised to see how well Trolltech engineers manage to eat with just their hands! Here’s a picture linked in from Wikipedia that shows this kind of pancake:
With our changes HarfBuzz now shapes the word Dosa like this (rendered using FreeType):

If a toolkit simply renders the characters one by one, the result is incorrect and looks like this:

As you can see by comparing the rendered glyphs, they sometimes need to be swapped. This reordering can happen on the level of characters (think QChars in a QString) as well as on the level of glyph indices. Without it the resulting glyphs are just garbage and don’t make any sense. (Well, they don’t make sense to me either way because I unfortunately can’t read Tamil, but Girish confirmed that only the first form is correct.)
The delicious Sambar that we dip Dosa pieces in looks like this when shaped with HarfBuzz:
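To make the character-level reordering a bit more concrete, here is a minimal Python sketch for the word Dosa. It is not how HarfBuzz or Qt implement shaping; it only assumes that the three Tamil vowel signs U+0BC6, U+0BC7 and U+0BC8 are drawn to the left of their consonant, and that a two-part vowel sign can be split with Unicode canonical decomposition. The function name visual_order is made up for this illustration, and real shapers work on whole syllable clusters and then continue on glyph indices using the font’s OpenType tables.

```python
import unicodedata

# Tamil vowel signs that are drawn to the LEFT of the consonant they follow
# in memory (TAMIL VOWEL SIGN E, EE and AI).
PRE_BASE_VOWEL_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}


def visual_order(text):
    """Toy reordering: return the characters in left-to-right drawing order.

    Simplification for illustration only: every pre-base vowel sign is moved
    in front of the single character that precedes it.
    """
    # NFD splits two-part vowel signs, e.g. U+0BCB becomes U+0BC7 + U+0BBE.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch in PRE_BASE_VOWEL_SIGNS and out:
            out.insert(len(out) - 1, ch)  # place before the previous character
        else:
            out.append(ch)
    return out


# "Dosa" as stored in memory: TA, VOWEL SIGN OO, CA, VOWEL SIGN AI
dosa = "\u0BA4\u0BCB\u0B9A\u0BC8"
logical = [f"U+{ord(c):04X}" for c in dosa]
visual = [f"U+{ord(c):04X}" for c in visual_order(dosa)]

print("logical:", " ".join(logical))
print("visual: ", " ".join(visual))
```

The four characters stored in memory turn into five units in drawing order, with both left-side vowel parts ending up in front of their consonants. This is the kind of swap visible when comparing the two renderings.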

This is the incorrect unshaped rendering for comparison:

… where you can see that the dot marks are not placed nicely above the letters where they belong.
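The dots in question are combining marks (in Tamil the sign is called pulli, encoded as U+0BCD), which a renderer has to attach to the preceding letter instead of drawing them side by side like ordinary letters. The following Python sketch only locates those marks in the word Sambar using the Unicode character database; it does no positioning, which in a real shaper comes from the font’s tables or from a fallback. The helper name nonspacing_marks is hypothetical.

```python
import unicodedata

# "Sambar" as stored in memory: CA, AA, MA, PULLI, PA, AA, RA, PULLI
sambar = "\u0B9A\u0BBE\u0BAE\u0BCD\u0BAA\u0BBE\u0BB0\u0BCD"


def nonspacing_marks(text):
    """Return (index, code point, Unicode name) for every nonspacing mark.

    Characters in general category "Mn" take no horizontal space of their
    own, so they must be positioned relative to the character before them.
    """
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Mn"
    ]


for index, code_point, name in nonspacing_marks(sambar):
    print(index, code_point, name)
```

Both marks found here follow a consonant (MA and RA), which matches the two dots that end up misplaced in the unshaped rendering.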
Now none of this is really a new feature: the free software desktops as well as Mac OS X and Windows have supported this kind of advanced text rendering for a while (although it’s sad to see that, for example, the Firefox shipped by default in Ubuntu seems to render this correctly only if I set my locale to an Indic one, while Konqueror always gets it right). But because scripts change and the software sometimes needs to be adapted, it is all the more important that this can be done in one central place. For example, we’ve recently received patches for Qt that adapt tables in the shaper to recent developments in the Bengali script, permitting certain previously disallowed combinations of vowels and consonants. If we succeed with HarfBuzz, it will only be necessary to adjust the software in one place instead of patching Qt, OpenOffice, Pango and perhaps others.
6 Responses to “Working towards a unified text layout engine for the free desktop software stack”
On my Windows box, IE gets it right but Firefox and Opera do not.
On my Fedora Core 6 system, all 3 of these get it right:
Konqueror from kdebase-3.5.6-0.1.fc6
Konqueror from kdebase4-3.80.3-4.fc6 (my own package I built for kde-redhat unstable)
firefox-2.0.0.3-1.fc6.remi (from Rémi Collet’s repository)
The reason Firefox gets it right for me and not for others is probably that the Firefox builds you’re using are not built with Pango support.
Lynx in Konsole gets it almost right, but the last dot in “Sambar” is in the wrong position (below instead of above) for some reason, while the first one is correctly placed. I’m not sure whether Lynx or Konsole is at fault there.
I am glad to see this finally happen. I study a number of “exotic” languages that use Indic and Arabic scripts, and I have run into a variety of problems on different distributions and with different programs over the last years. Most issues have been straightened out and now Linux is actually the far superior system (compared to Windows and Mac) for me, since X.org’s kbd allows for absolutely free mapping of characters to the keyboard. Therefore, I can write an infinite number of scripts on the standard US keyboard, perfectly mapped onto it for easy typing. The problems that are left, however, are all related to the rendering of Urdu’s highly complex version of Arabic (called Nastaliq) in which words appear cursive (see e.g. http://en.wikipedia.org/wiki/Image:Nastaliq.png ). This still does not work on my Linux system, even though a number of fonts have appeared over the last years that render well on Windows machines. I really hope that this will be fixed in the future, not only for my sake, but also for the sake of over 160 million (!) Pakistanis who cannot properly use their language (well, one of them) in Linux.
Mutlu, one of the best ways to get the free desktop to support your local region is to actively help out. Many developers are from countries where they aren’t aware of the font issues that you have, so it is probably not something that will magically happen without someone who is aware of them helping out.
For example, you could pick your favourite application or DE that doesn’t render your particular font correctly. Then contact the various maintainers and subsystem maintainers to see what issues prevent them from being able to render the font. Then you can submit a bug report to the appropriate project, or in some cases you may have to go further up the tree. Eventually you will find the project to which you can report the bug, along with a font or some other way to test it.
There are obviously people who want to make the desktop accessible to everyone; sometimes you just need to meet them part of the way. 1.6 million people is a lot, but if you look at how many people in that 1.6 million a) have a computer and b) use Linux, you will probably find that very few of them are developers, and even fewer are developers capable of hacking on font subsystems. So whoever is willing to do it is most definitely going to need some local help.
In Ubuntu, Firefox is built with Pango support, but Pango gets disabled if the locale is not set to one of the languages that may need it. This is due to performance complaints (Pango rendering being much slower).
See the /usr/bin/firefox script for details (you can set the environment variable MOZ_DISABLE_PANGO=0 to enable it again).