Tuesday, July 7, 2009

Light Prepass: Dual Paraboloid Shadow Mapping

Shadows are essential in helping us perceive depth and distance. Some games even deliberately exaggerate them to deliver emotion, intrigue and drama to the player (e.g. Bioshock, Dead Space, etc). Unfortunately, simulating shadows is an expensive luxury, though to some it is a necessity. Shadow mapping is one of the common shadow rendering techniques. A shadow map is rendered from the light's point of view, storing depth values in a texture that is then used for the depth test in the shadow projection pass. Simple as it may sound, the complexity increases drastically when it comes to omni-directional lights (imagine a sun, lighting in every direction).
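To make the depth test concrete, here is a minimal sketch of the comparison done in the shadow projection pass. The function names and the bias value are assumptions for illustration only, not taken from any particular engine.

```python
def in_shadow(receiver_depth, stored_depth, bias=0.005):
    """True when the receiver is farther from the light than the nearest occluder.

    receiver_depth: depth of the shaded point as seen from the light.
    stored_depth:   depth read from the shadow map at the same texel.
    bias:           small offset (assumed value) to reduce self-shadowing.
    """
    return receiver_depth > stored_depth + bias


def shadow_term(receiver_depth, stored_depth, bias=0.005):
    """Return 0.0 for a shadowed point and 1.0 for a lit one."""
    return 0.0 if in_shadow(receiver_depth, stored_depth, bias) else 1.0
```

The returned term is what gets multiplied into that light's contribution.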

One solution is to render shadow maps along the 6 primary 3D axis directions (up, down, left, right, forward and backward). This means rendering the light frustum 6 times, once per direction, which is 600% of the time spent on a single shadow map for one light source, not to mention 6 shadow map textures: heavy on both speed and memory (EDIT: Mei de Koh, a friend of mine, added that it's possible to use a virtual cube map shadow map so as not to render the scene 6 times... I haven't studied this one though). Enter Dual Paraboloid Shadow Mapping.

Dual Paraboloid Shadow Mapping is a way of optimizing omni-light shadows. Instead of 6 maps, it uses only 2. How? Imagine curving the lens to the point that if you render the scene pointing forward and then pointing backward, you get a close to perfect spherical vista of the scene. This can also be used for environment mapping (or reflections), but that is for another time. The image you see above uses this technique. As you may notice, the shadow penumbra differs based on position and distance to the light source. According to the STALKER developers' article in GPU Gems 2, they avoided this technique because they were using deferred rendering (very similar to the Light Prepass which I am using). I don't really know why they stated that, but it seems to work on my side. Hopefully, this will be just enough for our game requirements.
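For reference, the "curved lens" can be written down as the standard paraboloid mapping. The sketch below is not the post's actual vertex shader; it assumes the point is already in light space, that +z selects the front map and -z the back map, and that `near` and `far` are a hypothetical light range used to normalize depth. Whether x is mirrored for the back map is a convention that varies between implementations.

```python
import math


def paraboloid_project(p_light, near, far):
    """Map a light-space point to (hemisphere, u, v, depth).

    u and v land in [-1, 1]; depth is the distance to the light
    normalized to the [near, far] range.
    """
    x, y, z = p_light
    dist = math.sqrt(x * x + y * y + z * z)
    if dist == 0.0:
        raise ValueError("point coincides with the light position")

    # Pick which of the two maps this point belongs to.
    hemisphere = "front" if z >= 0.0 else "back"
    if hemisphere == "back":
        z = -z  # mirror so the same formula serves both maps

    # Normalize to a direction, then apply the paraboloid divide.
    dx, dy, dz = x / dist, y / dist, z / dist
    u = dx / (1.0 + dz)
    v = dy / (1.0 + dz)

    depth = (dist - near) / (far - near)
    return hemisphere, u, v, depth
```

A point straight ahead of the light maps to the center of the front map, while a point at 90 degrees lands on the rim of the unit circle. Since this is evaluated only per vertex, long edges of coarsely tessellated meshes stay straight where the true projection would be curved.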

(Technical clue: It does really pay off when your native space is the View space.)

There is one technical drawback in my implementation though. As I curve the scene into a paraboloid when rendering from the light's perspective, some meshes don't bend well (I'm doing this in the vertex shader). An example is the plane in the image above. I spent a bit of time trying to solve this. When the plane was bent, weird black artifacts appeared on its edges. My solution was to clear the shadow map to WHITE first, and use multiplicative blending (src=DestColor; dest=Zero) when rendering the shadow depth. Then presto! No more black edges.
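The blend setup above can be checked on paper. The following sketch only evaluates the fixed-function blend equation for src=DestColor, dest=Zero on a single-channel texel; the function names are made up for illustration.

```python
def blend(src, dst, src_factor, dst_factor):
    """Generic blend equation: result = src * src_factor + dst * dst_factor."""
    return src * src_factor + dst * dst_factor


def multiplicative_write(src_depth, dst):
    """SrcBlend = DestColor, DestBlend = Zero, so result = src_depth * dst."""
    return blend(src_depth, dst, src_factor=dst, dst_factor=0.0)


WHITE = 1.0  # clear value: the farthest possible depth

texel = multiplicative_write(0.4, WHITE)  # first write keeps the depth as-is
untouched = WHITE                         # texels never drawn stay at far depth
```

With a white clear, the first write to a texel stores the incoming depth unchanged, and texels that receive no geometry keep the far value instead of black. Note that overlapping writes to the same texel multiply together under this blend.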

Next, I need to optimize my shadow implementation. As you may notice, there is still some tweaking to do on the shadow edges. Cheers!


Alejandro said...

I have been reading your blog, impressive work you have been doing, I must say!

I have some questions regarding Light PrePass and ShadowMapping in general (I'm very very confused):

1. Is the shadow occlusion term multiplied after all the lighting calculations? (i.e. after the final lighted pixel is retrieved at the forward pass) or before, that is, starting with a shadowed color (if it was in shadow) and then build the final lighted pixel color. My confusion is because any light affecting a pixel that was shadowed by another light may as well bring the color back, but this seems to go against the shadow map where a pixel in shadow is in shadow and hence black (or any color you may like).

2. Do you calculate the shadow occlusion at the same time the geometry is forward rendered for lighting? Or in a deferred manner (to another shadow occlusion render target)? Or maybe (I don't know if it's possible) after (or before, as in the first question) the lighting accumulation pass? (i.e. render all the lights to the accumulation buffer, then multiply all those values by the shadow, so that when the forward render samples the light buffer, some of them will already be darkened).

3. Or am I wrong about the multiplication, and the shadow should instead "subtract" color so that the order would not matter?

Thanks in advance!
Also, I would really love a more in depth post about your screen space gi method! It looks impressive! (I guess you use the trick you came up with 2D SSAO)

vidextreme said...

Hi Alejandro,

Thanks for the compliment. Lately, I've been busy with other 'non-graphics' related programming.

Anyway, here are my answers to your questions...

Q: Shadow occlusion term multiplied after or before?
A: Neither. You compute the shadow occlusion term while processing the lighting, and you do an ADDITIVE blend each time (this is assuming you are working with an HDR setup). With ADDITIVE blending, when a pixel that was in shadow (say color (0,0,0)) gets lit, it receives the light-shadow value of the current light. A pixel that is already lit and receives additional lighting just has the values added together, which gives a pseudo increase in intensity.

An additional trick is to clear the Light-Shadow buffer with your AMBIENT COLOR before processing your first light. That way, adding the ambient color is free.

Q: When do you do the shadow occlusion and which render target?

A: As in my previous answer, you compute the shadow occlusion term while writing to the light accumulation buffer, all in the same pass. You process the light and its shadow together. The beauty of this is that, since deferred lighting only renders the light meshes instead of the whole screen in the light accumulation pass, processing the shadow term is also regional. This saves a lot of fill rate with multiple lights casting shadows.

Q: Does the order matter?
A: No. You just accumulate the light-shadow values with ADDITIVE blending, not by subtracting.

Note that since it's ADDITIVE blending, alpha blending is enabled. It may incur some penalty, but it is still lighter than sampling.
SrcBlend = One;
DestBlend = One;
BlendOp = ADD;
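
As a rough illustration of the answers above, the sketch below accumulates several lights into one buffer texel with the One/One additive blend, starting from a buffer cleared to the ambient color. The function name and the `(light_rgb, shadow)` pairs are assumptions made up for this example.

```python
def accumulate_lights(ambient, contributions):
    """Accumulate shadowed lights into one texel of the light buffer.

    ambient:       RGB clear color of the buffer.
    contributions: list of (light_rgb, shadow) pairs, where shadow is
                   1.0 for fully lit and 0.0 for fully shadowed.
    """
    buffer = list(ambient)  # clearing to ambient makes the ambient term free
    for light_rgb, shadow in contributions:
        # The shadow term is applied to this light only, inside its own pass.
        src = [c * shadow for c in light_rgb]
        # SrcBlend = One, DestBlend = One, BlendOp = ADD
        buffer = [s * 1.0 + d * 1.0 for s, d in zip(src, buffer)]
    return buffer
```

A light that is fully shadowed at a pixel simply adds zero, so it cannot darken what other lights have already contributed, and the order in which lights are processed does not change the sum.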

I hope I answered your questions. If you have more questions, just ask away.


Alejandro said...

Thanks for the quick response!

> You compute the shadow occlusion term while processing the lighting.

Oh my, I don't know how or where I decided to treat the shadow maps incorrectly. The shadow map that is being processed belongs to the light that generated it in the first place! So everything is clearer now: shadow occlusion is calculated inside the shadow-enabled light that is being processed, with its corresponding shadow map.

> (...)You just accumulate lights-shadows by ADDITIVE(...) (this is assuming you are working with an HDR setup) (...)

You mean to avoid saturating too early?
I read your first LPP post, which pointed to a trick used to avoid saturating too fast: writing exp2(-lightValue) and retrieving it with -log2, using an LDR surface (in this case with MULTIPLICATIVE rather than ADDITIVE blending). That leads me to my next question.

> Additional trick is you flash/clear the Light-Shadow buffer with your AMBIENT COLOR before process your first light.

What would the alpha channel of this light be(the specular component)? Should I assume zero? (since this light won't contribute to specular).

And also, if using the exp2() method, everything remains unchanged except that the device clear color will be 2^-AMBIENTCOLOR, right?
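
The arithmetic behind this question can be checked with a small single-channel sketch. It assumes the encoding is exactly exp2(-x) with -log2 as the decode, as described in this thread; the function names are made up for illustration.

```python
import math


def encode(light):
    """Store a non-negative light value as 2^-light, which lies in (0, 1]."""
    return 2.0 ** (-light)


def decode(stored):
    """Recover the accumulated light value with -log2."""
    return -math.log2(stored)


def accumulate_encoded(ambient, lights):
    """Clear to the encoded ambient, then blend each light multiplicatively."""
    stored = encode(ambient)     # clear color is 2^-ambient
    for light in lights:
        stored *= encode(light)  # 2^-a * 2^-b == 2^-(a + b)
    return decode(stored)
```

Under these assumptions, clearing to 2^-AMBIENTCOLOR and multiplying in each encoded light decodes to the ambient plus the sum of the lights, and a light contributing zero encodes to 1.0 and leaves the buffer unchanged.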

> I hope I answered your questions.

Absolutely! They were all answered and pointed me in the right direction! Thank you very much. Have a nice day!

PS: I see that you have put a lot of effort into working with shadows and shrinking filtering operations (even the 2D SSAO that I can't get my head around!). Maybe this is of interest (in case you haven't seen it yet). http://gpupro.blogspot.com/2009/10/fast-conventional-shadow-filtering.html . It says that an 8x8 visibility sample would be carried out by 49 PCF operations, but by only 16 texture samplings with their approach.

vidextreme said...

Hi Alejandro,

The exp2/-log2/multiplicative blend is just another way of storing/accumulating the light-shadow values. Though I remember I hit a roadblock with this regarding pure black shadows. I dropped this approach because of my heavy use of the light accum buffer. Also, with this technique, you have to decode each time you need to sample from this format. Plus, AFAIK, adding the AMBIENT COLOR is no longer free here, as you also need to encode it.

Though I'm still impressed by how well this technique retains light color blending. Yes, it avoids early color saturation and thus preserves color blends better. Here's the link....

Look for Drillian's posts. Also, you can check out Pat Wilson's article in ShaderX7, which takes a different approach by storing the light in another color space.

Regarding the ALPHA channel of the Light-Shadow Accumulation, Mr. Wolfgang Engel's blog has something about this. In my case, I just store the unprocessed specularity, meaning without POW function.

This is the minor problem of the Light Prepass approach. You have no way of identifying the material type once you are processing your lights. Some store the material id (or material specularity level) in the ZNormal buffer.

What I did is POW the unprocessed specularity when I'm rendering the forward pass and applying the lights, as during this pass I have access to the material id/specularity map.
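
A rough sketch of the idea, not the author's actual shader code: it assumes the "unprocessed specularity" is a clamped N.H term summed in the light buffer's alpha channel, and that the specular power comes from the material in the forward pass. Both function names are hypothetical.

```python
def light_pass_specular(n_dot_h_values):
    """Light pass: sum the raw, clamped N.H terms (no POW, no material data)."""
    return sum(max(0.0, value) for value in n_dot_h_values)


def forward_pass_specular(accumulated, spec_power):
    """Forward pass: apply the per-material power to the accumulated term."""
    return accumulated ** spec_power
```

With a single light this matches the usual pow(N.H, power). With several lights it is only an approximation, because raising a sum to a power is not the same as summing the individual powers: (0.5 + 0.5)^2 is 1.0, while 0.5^2 + 0.5^2 is 0.5.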

There are different approaches on this, so far I have only tried this.

Thanks for the link. I hope to get back to graphics once my misc tasks are done. Pls send me some screenshots if it's possible. I would love to see your progress.


Alejandro said...

Hi John,
Thanks for the explanations.

Guess I'll likely go with recovering the specular at the forward geometry pass too. That's one of the best benefits I can see in that type of usage for LightPrePass: total material freedom per geometry. It may even be possible to use precalculated texture maps to map the light response from the light buffer to some crazy custom ones...

I'll owe you the screenshots! It's still too early to show some progress, most of the design is still on "paper" (read as: in my head :p).
I'm hopping/leaping between shaders, game logic, reusable classes that are as simple and readable as I can make them, reading books, etc. I'm still at the shadow map/rendering part of the process, and there are just too many algorithms: ESM, VSM, PCF, the cascaded versions of all of them, etc. And then stochastic/jittered grid sampling for them... a lot to do, a lot to learn!

I'm using XNA by the way, and it has a lot of constraints, some of them directly chop off the best benefits for these types of renderings... But since I decided to learn to code and make games for the first time, XNA/C# was the most accessible choice at the time (and I think it still is!). I would be totally lost in another environment.

As you can see, I'm in no way trying to reinvent the wheel, but oh my!, have I spent time trying to understand how it rolls! (And that wheel rolls more complicated each passing day).

As soon as I have something to show, I'll post back to you so you can see some progress!

Regarding color spaces I'm trying not to go too deep until I finally have something working on standard spaces ("standard" is debatable, let's just say RGB). However notes taken! Thanks for the tips.

For the notebook: roughly what I found is that LUV space is how it would be possible to reconstruct the specular term in the forward pass. Guess I'll have to get my hands on that ShaderX7 book!

Reading about NAO32 (an HDR format with 8 bits per channel), it's basically a LogLuv space, and it seems to make more sense in the color pass than in the light/shadow buffer pass.

My question to you is: is blending the lights in LUV space straightforward (ADD)? In LogLuv it seems to be a hassle (I guess only because of the log conversion).
Is that how you implemented it? The Luminance value is close enough to N.L * Att, so a per-object specular value is recoverable? Then you POW it in the forward pass...

The problem with those spaces is that I still don't understand the linearity treatment, and then comes DX9 applying gamma curves to the output, or the XBOX applying a piecewise-linear gamma curve that can't be linearized in a straightforward way (plus you can't choose what will be gamma corrected on XBOX; it depends on the source and target surface formats, I don't know which yet).

So I guess when I finish there, I may be knocking back here! Hopefully to show the ending result and not questions...

Thanks again for all the insights.
Can't wait to see what you will come up with when you take on graphics again!