Another day, another HDR rendering trick and some hope for the future.

Today I’m going to talk about an idea I came up with in Boston, at Siggraph 2006, while attending a couple of very inspiring lectures given by Jason Mitchell, Gary McTaggart and Chris Green (Valve Software).

Over the years they have played with a few different HDR rendering schemes, and one of the key insights from their work is that we can happily decouple exposure and tone mapping computations: the exposure measured on one frame is only applied by the tone mapping of the next one (actually this idea was first suggested to me by Simon Brown, that’s another story though..).

This simple concept allowed the Valve guys to remove the classic full screen tone mapping pass and to embed tone mapping directly in their single pass shaders, using as exposure a value computed in the previous frames, thus completely eliminating the need to output HDR pixels.

Since at that time there was basically no hardware around that could handle MSAA on floating point render targets (oh gosh, just a few years ago!), they also got their HDR rendering implementation running with MSAA on relatively old hardware! Moreover, their method executes tone mapping and MSAA resolve in the correct order (tone mapping first, followed by AA resolve) at no extra performance cost, something that a lot of modern games still can’t get right today.
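To see why the order matters, here is a small worked example. It is only a sketch: the Reinhard-style operator x / (1 + x) and the sample values are my own assumptions for illustration, not the operator Valve ships.

```python
def reinhard(x):
    """Simple x / (1 + x) tone mapping curve, used here only as a stand-in."""
    return x / (1.0 + x)


def resolve(samples):
    """Box-filter MSAA resolve: a plain average of the sub-samples."""
    return sum(samples) / len(samples)


# Four sub-samples of one pixel sitting on a high-contrast edge
# (linear HDR luminance): two hit a very bright light, two hit a dark wall.
samples = [100.0, 100.0, 0.05, 0.05]

# Correct order: tone map every sub-sample, then resolve.
tone_map_then_resolve = resolve([reinhard(s) for s in samples])

# Wrong order: resolve the HDR sub-samples, then tone map the average.
resolve_then_tone_map = reinhard(resolve(samples))

print(tone_map_then_resolve, resolve_then_tone_map)
```

Tone mapping first gives an edge pixel of roughly 0.52, halfway between the two surfaces as we expect from antialiasing. Resolving first gives roughly 0.98: the bright samples swamp the average and the edge looks as aliased as it would without MSAA.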

If you were not aware of Valve’s method you are now probably asking yourself how they managed to compute an exposure value to be used in a tone mapping operator if no HDR data is ever dumped to the frame buffer. Through image segmentation techniques they ‘simply’ try to determine whether the previous frame has been under or over exposed, and the exposure value is adjusted to compensate for problems with the previous frame(s).

While this method is very clever I have some problems with it. For instance, many tone mapping operators derive exposure from the average logarithmic luminance of a relevant portion of the image, but it’s not possible to reliably determine this value using Valve’s approach. HDR data is lost, and while in theory we might be able to compute a plausible exposure value by performing a search over multiple frames, in practice this is not easy at all. We might need to change the search direction over the exposure space to get closer to the exposure value we are looking for, and this would make the overall brightness of our image swing back and forth for a few frames, like a pendulum around its rest position. Monotonic searches are possible too, but they can only get us so close to the value we are looking for, especially if the image content is constantly changing!
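To make the pendulum effect concrete, here is a toy sketch. It is not Valve’s image segmentation method: it only assumes that each frame tells us whether the previous exposure was too low or too high, and that we correct by a fixed step in log2 exposure space. The function name, the step size and the frame count are all hypothetical.

```python
def feedback_search(target_log2_exposure, start=0.0, step=0.75, frames=8):
    """Toy exposure search driven only by an under/over-exposed signal.

    Returns the log2 exposure used on each frame. The target is never
    visible to the search; it only drives the comparison below.
    """
    exposure = start
    history = []
    for _ in range(frames):
        history.append(exposure)
        if exposure < target_log2_exposure:
            exposure += step  # previous frame looked under-exposed
        else:
            exposure -= step  # previous frame looked over-exposed
    return history


history = feedback_search(target_log2_exposure=2.0)
print(history)
```

With a target of 2.0 and a step of 0.75 the search climbs through 0.0, 0.75, 1.5 and 2.25, then keeps bouncing between 1.5 and 2.25 without ever settling. A smaller step reduces the swing but takes more frames to get there.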

Having debated this issue with current and former colleagues I know this is a controversial point: some agree with me, some think it is not a big deal (and who knows, maybe they are even right!). On the other hand, while playing Valve’s masterpieces (this method was first introduced in HL2 Lost Coast) I can’t help noticing how sometimes portions of the image look flat and seem to have lost their color detail, giving me an overall flat and over or under saturated feeling (again, this is just a very personal and subjective opinion, feel free to disagree with me). This problem might be caused by a poor/overly simplified tone mapping operator (Valve games run great on not so powerful hardware and trade-offs have to be made) and/or by an incorrect exposure (gotcha!).

After this long introduction I wouldn’t be surprised if you have already had the same idea as me: get rid of the exposure search through feedback from previous frames and compute exposure the proper way!

The feedback/image segmentation method has been adopted because no HDR data is available, but even without re-introducing a floating point buffer (or some funky color space technique, see Christer Ericson’s blog entry about some of the work I did on Heavenly Sword and his very clever take on it) we can still generate the data we need using destination alpha. The idea is simple: compute logarithmic luminance on a per pixel basis, encode it in some special format and output it to the alpha channel. 

If we decide to support a certain luminance range [2^-minLogLum, 2^maxLogLum] we can compress and encode logarithmic luminance in our single pass shaders using some fairly simple math:

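// Both constants depend only on the chosen range, so they can be precomputed.
float invLogLumRange = 1.0 / (maxLogLum + minLogLum);
float logLumOffset = minLogLum * invLogLumRange;
// Assuming get_log_luminance returns log2 of luminance, this maps [-minLogLum, maxLogLum] to [0, 1].
float log_luminance = get_log_luminance( HDR_color ) * invLogLumRange + logLumOffset;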

invLogLumRange and logLumOffset are constants that can be precomputed, so we just need a 3-way dot product, a scalar MADD and a logarithm to evaluate this formula. Explicitly clamping this expression between 0.0 and 1.0 is not necessary as the ROPs will do it for free anyway.
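The encoding and its inverse can be checked on the CPU with a few lines of Python. This is a minimal sketch under a few assumptions of mine: the logarithm is base 2 (to match the range above), luminance uses Rec. 709 weights, and the explicit clamp stands in for what the ROPs do on write. The function names are hypothetical.

```python
import math


def luminance(rgb):
    """3-way dot product with Rec. 709 weights (an assumption, pick your own)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def encode_log_luminance(rgb, min_log_lum, max_log_lum):
    """Map log2 luminance from [-min_log_lum, max_log_lum] to [0, 1]."""
    inv_log_lum_range = 1.0 / (max_log_lum + min_log_lum)
    log_lum_offset = min_log_lum * inv_log_lum_range
    lum = max(luminance(rgb), 1e-8)  # guard against log2(0) on black pixels
    encoded = math.log2(lum) * inv_log_lum_range + log_lum_offset
    return min(max(encoded, 0.0), 1.0)  # emulates the clamp done by the ROPs


def decode_log_luminance(encoded, min_log_lum, max_log_lum):
    """Invert the affine encoding and return log2 luminance."""
    return encoded * (max_log_lum + min_log_lum) - min_log_lum


encoded = encode_log_luminance((4.0, 4.0, 4.0), 8.0, 8.0)
print(encoded, decode_log_luminance(encoded, 8.0, 8.0))
```

With a range of 8 stops on each side, a grey pixel of luminance 4 encodes to about 0.625 and decodes back to a log2 luminance of 2. Keep in mind that an 8 bit alpha channel quantizes this value, so the wider the range, the coarser each step becomes.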

Since we only applied an affine transform to encode our log luminance, it is still correct to compute its average with multiple reduction passes, as we do when we generate a mip map chain, down to a 1×1 render target, as long as we remember to invert the encoding to retrieve a proper average logarithmic luminance value. Actually it’s a good idea to do this last step on the CPU (since this computation can be deferred one or two frames we should be able to lock this specific resource and read it back with the CPU without stalling either processor) so that we can set our exposure for the next frame’s color pass as a pixel shader constant, removing from our shaders any extra math and the need to sample the 1×1 log luminance texture.
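The reduction and the final CPU step can be sketched as follows. The assumptions are mine: the image is square with a power-of-two size, each pass is a plain 2×2 box filter, and exposure is derived Reinhard-style as key / average luminance with a hypothetical key of 0.18. Any operator that consumes the average log luminance could replace that last line.

```python
def downsample_2x2(image):
    """One reduction pass: average every 2x2 block of encoded values."""
    size = len(image)
    return [[(image[y][x] + image[y][x + 1] +
              image[y + 1][x] + image[y + 1][x + 1]) * 0.25
             for x in range(0, size, 2)]
            for y in range(0, size, 2)]


def reduce_to_1x1(image):
    """Repeat the reduction until a single value is left (square, power-of-two input)."""
    while len(image) > 1:
        image = downsample_2x2(image)
    return image[0][0]


def exposure_from_encoded_average(avg_encoded, min_log_lum, max_log_lum, key=0.18):
    """CPU step: invert the affine encoding, then turn the result into an exposure."""
    avg_log2_lum = avg_encoded * (max_log_lum + min_log_lum) - min_log_lum
    return key / (2.0 ** avg_log2_lum)


# 4x4 image of encoded values with min_log_lum = max_log_lum = 8:
# 0.5 decodes to log2 luminance 0, 0.75 decodes to log2 luminance 4.
image = [[0.5, 0.75, 0.5, 0.75] for _ in range(4)]
avg_encoded = reduce_to_1x1(image)
exposure = exposure_from_encoded_average(avg_encoded, 8.0, 8.0)
print(avg_encoded, exposure)
```

The average of the encoded values is 0.625, which decodes to an average log2 luminance of 2, exactly the mean of 0 and 4. Averaging before or after the affine transform gives the same result, which is why the decode can wait until the 1×1 value has been read back.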

Unfortunately almost no trick comes for free: if we use destination alpha to encode logarithmic luminance we can’t use it for other useful operations such as alpha blending and alpha to coverage (alpha test is still doable as long as we implement it in our shaders invoking kill() or discard() ). I’m not particularly worried about alpha blending: we can simply compute our average logarithmic luminance before we render transparent objects. Those won’t contribute to the exposure computation, but I suspect this is not a big deal in many cases. The same trick can be applied to alpha to coverage objects, though I wouldn’t advocate it if we know we are going to render a lot of alpha to coverage stuff on screen (for example think about lots of trees, it’s probably not going to work well if we are working on a Robin Hood game..)

Now we are free to implement a lot of different tone mapping operators in our single pass shaders, even if we are working on a deferred renderer, as long as its architecture can shade an opaque pixel for an arbitrary number of lights in a single pass, like in the ingenious scheme proposed by Pål-Kristian Engstad at Naughty Dog.

One last note: while I love (and I always will..) finding new and unexpected ways to use graphics hardware, it’s clear to me that things are going to change soon, very soon. Shaders allow us to do almost anything, but they are still encapsulated in a rendering pipeline that dates back to the late 80s and that has gone almost unchanged for the last twenty years. When I was a student I used to write my own rendering pipelines (my beloved Amiga didn’t have a GPU..), which weren’t always based on z-buffer and rasterization (though I wrote so many rasterizers I lost count of them..), and I’m glad of the cyclical nature of hardware development, as we are now about to go back to the future and once again develop our own custom rendering architectures on top of recent years’ advancements. Only this time it is going to be even more fun!

5 Responses to “Another day, another HDR rendering trick and some hope for the future.”

  1. Vincent Scheib Says:

    Marco, this is a great idea. I’ve been considering giving the Valve technique a try and was lamenting the more involved process of exposure measuring. I’d like to experiment with this approach.

  2. Marco Salvi Says:

    Thanks Vincent!
    I also tried to use this approach in the late Heavenly Sword development stages and well, it was simply too late to make it work properly with the rest of our pipeline. I battled with it for a few days and then I had to abandon it (though it was only 2-3% faster than my logluv framebuffer encoding technique, it had the advantage of resolving our 4x MSAA HDR frame buffer in the proper order, giving the image outstanding quality in high contrast areas).
    My other major problem with Valve’s HDR technique is that if your renderer is gamma aware (and if it’s not, you’d better go back to the drawing board!) and it performs shading in a linear RGB space, you also need to apply gamma correction before writing out your tone mapped pixels. This is not a big problem per se (and some next gen console out there does it for free) but it might be a problem if you need to composite your opaque pixels with other layers or special effects.
    In theory your single pass material should also perform tone mapping, colour correction, DOF, motion blur, lens flares, gamma writes and all sorts of stuff. DOF and motion blur obviously can’t be done at that stage, so you have to sacrifice some correctness again. Trying to add some full screen passes for special effects and to move gamma correction to the end of the rendering pipeline won’t work, as you have already tone mapped your fragments out to a 4 bytes per pixel render target which doesn’t have enough dynamic range to avoid the awful banding introduced by gamma correction (well, that’s the reason why some clever clogs invented gamma encoding in the first place!).

  3. Gary McTaggart Says:

    One of the things that we tried before settling on what we have for Lost Coast is to store another range in another render target via MRT. Worked well, except for killing MSAA. We couldn’t afford to render again to get the other range, which is what I believe Halo 3 ended up doing.

    We also tried storing various flavors of luminance in dest alpha, but aborted on that due to problems with alpha-blending. If the blend units were a bit more general with respect to what happens with alpha, it would have been a nice solution. It also didn’t help that we were already using dest alpha for refraction factors for water, etc.

    I wish we had been able to end up with HDR data for blooming when using LDR render targets, but we thought what we ended up with was a reasonable compromise.

  4. Marco Salvi Says:

    Hi Gary,
    Thanks for your post!

    Ultimately we had the same concerns for Heavenly Sword: it wasn’t possible to adopt the idea I came up with because it was clashing with other ‘creative’ uses of various render targets and obscure hardware features. It’s the kind of route one has to take in early development stages.

    I found myself in this situation so many times, it’s definitely a recurring pattern according to my personal experience. That’s why I’m looking forward to the second (or the third?) coming of software renderers that should allow us to develop more creative and exotic solutions that are also more robust and less hacky..

  5. chris green Says:

    .. and yet another thing we tried in lost coast was using two eight bit render targets as MRT’s. All shaders would write both (r,g,b) and (1/16)*(r,g,b) into the two render targets. A final shader would combine the two frame buffers at a target exposure value, including hdr blooming.

    This would work on hardware which supported mrts but didn’t fully support high-bit render targets (ATI at the time). Of course, MSAA was also the drawback with this approach.
