In digital systems, an N-bit adder can be implemented with N-1 full adders and one half adder. When accumulating numbers, a carry-save adder is an interesting alternative since it is faster. As explained somewhere else already, the same technique is also useful if we want the average of 16-bit numbers packed in 32-bit numbers. Or, for that matter, 8-bit in 32-bit. This of course fits nicely when we average two colors in ARGB color space, where each component takes 8 bits. The code is as simple as this (the 0xfefefefe mask clears the lowest bit of each component, so that the shift does not push it into the neighboring 8-bit component and falsify it):
quint32 avg = (((c1 ^ c2) & 0xfefefefeUL) >> 1) + (c1 & c2);
This is faster to compute than taking the alpha, red, green, and blue of the first and second colors, averaging each component individually, and then combining them again to find the final result, like the messy lines below:
quint32 avg = qRgba((qRed(c1) + qRed(c2)) >> 1,
                    (qGreen(c1) + qGreen(c2)) >> 1,
                    (qBlue(c1) + qBlue(c2)) >> 1,
                    (qAlpha(c1) + qAlpha(c2)) >> 1);
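To see that the two versions above agree, here is a minimal sketch in plain C++ without Qt. The names avgPacked, avgPerChannel and checkAveraging are made up for this illustration, and std::uint32_t stands in for quint32. The trick rests on the identity a + b = (a ^ b) + 2 * (a & b), which holds for each 8-bit component separately.

```cpp
#include <cassert>
#include <cstdint>

// Packed average of two ARGB32 pixels, same expression as the one-liner above.
// Per component: a + b == (a ^ b) + 2 * (a & b),
// hence (a + b) >> 1 == ((a ^ b) >> 1) + (a & b).
// The 0xfefefefe mask clears the lowest bit of every component before the
// shift, so that bit cannot slide into the component below it.
inline std::uint32_t avgPacked(std::uint32_t c1, std::uint32_t c2)
{
    return (((c1 ^ c2) & 0xfefefefeU) >> 1) + (c1 & c2);
}

// Reference version: unpack each 8-bit component, average it, repack.
// (Hypothetical helper, stands in for the qRed()/qGreen()/... variant.)
inline std::uint32_t avgPerChannel(std::uint32_t c1, std::uint32_t c2)
{
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const std::uint32_t a = (c1 >> shift) & 0xffU;
        const std::uint32_t b = (c2 >> shift) & 0xffU;
        result |= ((a + b) >> 1) << shift;
    }
    return result;
}

// Spot checks: opaque black averaged with opaque white, plus a pair of
// values chosen so that several components would carry if left unmasked.
inline void checkAveraging()
{
    assert(avgPacked(0xff000000U, 0xffffffffU) == 0xff7f7f7fU);
    assert(avgPacked(0x01fe03fcU, 0xff0001ffU)
           == avgPerChannel(0x01fe03fcU, 0xff0001ffU));
}
```

Both functions round each component down, so they should return the same value for any pair of pixels; the packed version simply handles all four components in one go.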
But how fast is faster? I decided to write an example that uses the above-mentioned trick to speed up downscaling an image to half its original size. Usually you do this using the QImage::scaled() function. If you pass Qt::SmoothTransformation as the transformation mode for this function, then halfscaling the image is the same as taking every 2×2 block of pixels, averaging their color values, and using the result as the final color. On the other hand, Qt::FastTransformation will just sample one of those 4 pixels. Surely that means it is faster (the name implies as much), but the lack of a box filter also means the quality is not as good as with Qt::SmoothTransformation. Here comes the trick of ARGB32 pixel averaging, which allows us to write QImage halfSized(const QImage&) that is really, really fast compared to QImage::scaled() with Qt::SmoothTransformation, but still gives the same visual quality. Using the new benchmark feature in our beloved test framework, here is the speed comparison (longer is better).
"Normal" refers to scaling with Qt::SmoothTransformation, whereas "Optimized" is our custom halfSized() function. The numbers represent the iterations for every 10e12 CPU ticks in order to halfscale a 10-megapixel image. As you can see, the improvement is about an order of magnitude. Impressed?
The code is still fresh in the Graphics Dojo repository under the subdirectory halfscale. Take a look and give it a try. Do not forget the catches, though: potential round-off error and the need for an even number of columns and/or rows. If you can get away with the loss of up to two bits and with cutting the last (or a middle) column and row of pixels, then this halfSized() function is your new friend.
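For illustration only, here is a rough sketch of how such a half-scaling loop could look. This is not the code from the repository: it works on a plain pixel buffer instead of a QImage, and the Argb32Image struct and the names avgPacked and halfSizedSketch are invented for this example. It also makes both catches visible: every averaging step rounds down, and an odd last column or row is simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packed average of two ARGB32 pixels (see the one-liner earlier in the post).
inline std::uint32_t avgPacked(std::uint32_t c1, std::uint32_t c2)
{
    return (((c1 ^ c2) & 0xfefefefeU) >> 1) + (c1 & c2);
}

// Hypothetical stand-in for an ARGB32 QImage: row-major, no padding.
struct Argb32Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint32_t> pixels; // width * height entries
};

// Assumed approach: average each 2x2 block with three packed averages.
inline Argb32Image halfSizedSketch(const Argb32Image &src)
{
    assert(src.pixels.size()
           == static_cast<std::size_t>(src.width) * static_cast<std::size_t>(src.height));

    Argb32Image dst;
    dst.width = src.width / 2;   // an odd last column is dropped
    dst.height = src.height / 2; // an odd last row is dropped
    dst.pixels.resize(static_cast<std::size_t>(dst.width)
                      * static_cast<std::size_t>(dst.height));

    for (int y = 0; y < dst.height; ++y) {
        const std::uint32_t *top =
            src.pixels.data() + static_cast<std::size_t>(2 * y) * src.width;
        const std::uint32_t *bottom = top + src.width;
        std::uint32_t *out =
            dst.pixels.data() + static_cast<std::size_t>(y) * dst.width;

        for (int x = 0; x < dst.width; ++x) {
            // Average horizontally first, then vertically. Each step rounds
            // down, which is where the round-off error comes from.
            const std::uint32_t upper = avgPacked(top[2 * x], top[2 * x + 1]);
            const std::uint32_t lower = avgPacked(bottom[2 * x], bottom[2 * x + 1]);
            out[x] = avgPacked(upper, lower);
        }
    }
    return dst;
}
```

With a real QImage you would read the rows through scanLine() rather than assuming a tightly packed buffer, since scanlines may carry padding.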
Now carefully examine the following screenshot. Just like in the previous examples, you can always drag and drop an image from the file manager or web browser. For this one, I used Gianni’s Urban solitude picture (Creative Commons NC ND). As you can see, when you stick with FastTransformation, there are jagged lines and some effects like Moiré patterns in the downscaled image. This problem disappears when you use SmoothTransformation. In addition, the optimized half-scaling method presented here gives a result just like the one you get with SmoothTransformation.
If you start to ask why all this halfscaling seems to be important at all, just watch this blog and see what will come next. Hint: you might guess it already if you were at my last DevDays talk.
8 Responses to “50% scaling of (A)RGB32 image”
Apparently your code does not consider gamma. You can check this with the picture of the Dalai Lama from http://www.4p8.com/eric.brasseur/gamma.html. This page also explains what goes wrong when one uses simple averaging for scaling of RGB values.
@Christoph: Nor does the QImage::scaled() function. I did not present a different approach, just a faster one.
Well there is no reason why this shouldn’t be a painter option.
Given a scale of any reciprocal power of two (50%, 25%, 12.5%, 6.25%, etc.) you should be able to call halfSized() any consecutive number of times.
Scaling to any image size less than 50% should use a modified “russian peasant method of multiplication”/division to obtain fast results. To scale to 1/3rd (33%), you first take a 50% fast scale, then work out the remaining scale from the two denominators. Given a picture of 100×100,
100pix/3 = 33 pix (final desired size)
100pix/2= 50pix (50% scale operation)
final scale = (1/3)/(1/2) = 2/3 = .66
50 pix scaled to 66% = 33pix
While you have to do two (or more) scale operations, the first is extremely fast and takes pixels out of the subsequent scales. The first pass reduces the number of pixels to 25%. For our sample 100×100 image, that’s 10,000 pixels down to 2,500 pixels. Then the remaining 25% get SmoothScaled.
Now, I simplified by using ideal numbers, but you get the point.
OK. However it would be nice to have a correct routine within Qt.
@Christoph: a more-than-8-bit-per-channel image format is something we would like to do. However, I have no idea when this will be accomplished. After that, it will be easier to take the gamma into account.
@ariya: I regularly work with 16, 12, and 10 bit channels, but not in Qt. We have to pay for a special library for those bit depths. It certainly would be nice, and would allow Qt into more niche places. I’d like to see it, but it is no small undertaking.
Ariya, how do those benchmarks compare to “CheatTransformation” mode you talked about in Munich?

