Friday, August 21, 2009

Screen Space Global Illumination: Screen Space Gone Mad

It seems that once SSAO (Screen Space Ambient Occlusion) was discovered, the SS goodness kept coming. Back in the day, to get AO or GI (global illumination) one would either pre-process it and store it in vertex colors or lightmaps (or go RNM, Radiosity Normal Mapping), or use ray-tracing techniques, which are still too framerate-heavy to be used in realtime applications.

Global Illumination, in simple summary, is an approximation of the light that reaches surfaces indirectly, by bouncing off other surfaces. As light rays hit an object, they bounce from surface to surface. With each bounce, a ray changes color and intensity based on the material or inherent color of the surface. It's the same reason why we can still see inside a house (with windows) even if the sun is directly above our rooftops, and the same reason why rooms with light-colored paint on the walls tend to be brighter. (GI is actually the proper value of the ambient color we usually add to our lighting.)

There are several algorithms for computing GI in 3D graphics, but all have the same concept in mind: it's either an approximation of an approximation or just a plain approximation (hehehehhe).

Enter Screen Space Global Illumination. I've read through one implementation, aptly named SSDO or Screen Space Directional Occlusion, which is an image-based approximation of GI. I must say, it's really impressive, minus some maths that goes over my head, but that's my fault. Mr always-have-something-brilliant-in-mind Wolfgang Engel wrote a short but interesting post on his blog regarding a simpler extension of SSAO into an SSGI implementation. And since it fits sooooo well in the Light Prepass rendering design, trying it out.... was inevitable (with Agent Smith's echoes).

Once you understand how SSAO works, it's easy to extend it to get at least a single light bounce of indirect illumination. By doing the same thing for GI as in SSAO, it is safe to say that a particular pixel on a surface is close enough to receive radiated color from another surface if the occlusion test succeeds. Hence, by sampling the albedo color of that surface and averaging it, you get the average radiosity that pixel receives. Now the question is, where do we get the pre-bounce intensity?

I work at Zealot Digital here in Singapore, where my Lead Programmer is Alberto Demichelis, the author and maker of the Squirrel scripting language (AAA games are using it now btw, one big dead-walking title). He gave me a great idea that I had overlooked in Mr Engel's post: using the Light Accumulation buffer as the intensity of the bounce. By combining (this is tricky) the projected shadow term and the light accumulation and transforming them into black and white, we can use the result as the radiosity intensity. Right now, what I use is a lerp, driven by this value, between the sampled albedo (the bounced color) and the original albedo of the receiving pixel.


// SS Magic:
for(int i=0; i<NUM_SAMP; i++)
{
    // here u do the AO generation stuffs
    if(occNorm > occ_thres)
    {
        // albedo of the surface that bounces light onto this pixel
        float3 sampleAlbedo = tex2D(albSamp, uv + offsetnoise).rgb;
        // light accumulation collapsed to a single "black and white" value
        float intensity = dot(tex2D(lightAccum, uv + offsetnoise).rgb, 1);
        resultRad += lerp(sampleAlbedo, curAlbedo, intensity);
    }
}
resultRad /= NUM_SAMP;
// pls note that I'm just recoding this through recollection but the idea is here.
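
For anyone who wants to poke at the loop outside a shader, here is a minimal CPU-side sketch in Python. It only mirrors the listing above (same test, same lerp argument order, same divide by the full sample count); the function names, the tuple layout of a sample and the clamp on the intensity are my own assumptions for illustration, not part of the actual shader.

```python
# CPU-side sketch of the single-bounce gather. All names are hypothetical.

def lerp(a, b, t):
    """Component-wise linear interpolation, like HLSL lerp()."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def gather_bounce(cur_albedo, samples, occ_thres):
    """Average the bounced colour received by one pixel.

    samples: list of (occ_norm, sample_albedo, light_accum_rgb) tuples.
    Samples that fail the occlusion test add nothing, and the sum is
    still divided by the full sample count, as in the shader loop.
    """
    result = (0.0, 0.0, 0.0)
    for occ_norm, sample_albedo, light_accum in samples:
        if occ_norm > occ_thres:
            # dot(lightAccum.rgb, 1) is just the sum of the channels.
            # The clamp to 1.0 is an assumption added here so the lerp
            # factor stays in range; the listing does not clamp.
            intensity = min(sum(light_accum), 1.0)
            bounced = lerp(sample_albedo, cur_albedo, intensity)
            result = tuple(r + b for r, b in zip(result, bounced))
    count = len(samples)
    return tuple(r / count for r in result)
```

Feeding it one accepted red sample and one rejected sample gives half of the red contribution, which is the averaging behaviour described above.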

We can further extend this by going through the pass again, but this time using the current GI as the intensity, to simulate multiple light bounces. Buuuuuuut I didn't bother to try; I think a single bounce will suffice in a game application.


After implementing this, I realized that it's possible to 'fit' this in the SSAO pass. This would save the extra blur passes needed to remove the graininess (the screenshot isn't the combined SSAO and SSGI implementation yet). (A screenshot of that implementation will follow... hopefully.)


Tuesday, July 28, 2009

SSAO Blurring: Making It Less Smart But Low In Carbohydrates

As the title points out... making a slimmer, less-of-a-genius SSAO blur.

Screen Space Ambient Occlusion commonly uses two passes: first the ambient occlusion generation (see 'my' Accurate 2D SSAO) and then a blur pass to remove the graininess of the AO. Unfortunately, the blur pass is not your average toolbox blur. It's all because of the 'edges' of the models or of the relief normals. The blur must be 'smart' enough not to blur over edges, otherwise bleeding will occur. The common idea around the game dev community is to make a smart blur by using a delta depth/normal check similar to the one in the AO generation. (If it's beyond a threshold, it's an edge.) This means, however, doing the check for every sample, and there is typically more than one sample to get that proper smoothness. The result is that the blur pass is more complicated and heavier than the actual AO generation.


Hence, I came up with a simple solution: cutting the calories of the SSAO blur pass by reusing data already computed by the AO pass. How? The delta (depth or normal comparison). Take that delta and compare it with an edge_threshold, which means we are doing the edge detection while we are still in the AO sampling. Let me explain.

// SSAO: AO generation pass
for(int i=0; i<NUM_SAMP; i++)
{
    // here u do the AO generation stuffs
    // use if(deltaN > edge_threshold) if u want finer details
    // deltaN = 1-dot(N, Nsample) or deltaZ = depth - depthSamp
    if(deltaZ > edge_threshold) { edge++; }
}
// fraction of samples that landed across an edge (0..1)
edge /= NUM_SAMP;

The result is that you have gradient data for the edges. The one price we pay is that the AO has to be encoded in two channels... one for the occlusion and one for the edge data (encode it as 1-edge). Now for the kicker... we will use this data NOT as a toggle flag for whether to blur or not to blur... but as a size factor for the blur radius.
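
To make the edge data concrete, here is a small Python sketch of the counting step and the 1-edge encoding, run on plain lists of depths instead of textures. The function names are made up for the example, and it uses the signed deltaZ from the listing above; taking an absolute value instead would be a variation, not what the post describes.

```python
# CPU-side sketch of the edge gradient. Function names are hypothetical.

def edge_fraction(depth, sample_depths, edge_threshold):
    """Fraction of samples whose depth delta exceeds the threshold."""
    edge = 0
    for depth_samp in sample_depths:
        delta_z = depth - depth_samp  # signed delta, as in the AO pass
        if delta_z > edge_threshold:
            edge += 1
    return edge / len(sample_depths)

def encode_edge(edge):
    """Store 1 - edge, so 0 means 'fully on an edge' and 1 means 'flat'."""
    return 1.0 - edge
```

A pixel where half of the samples cross the threshold ends up with an encoded value of 0.5, so it would later get half of the full blur radius.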

// SSAO: Blur pass
float2 origSamp = tex2D(AOSamp, IN.uv).xy;
// x=occlusion; y=edge_data

float2 blurKern = InvTextureSize * radius * origSamp.y;
// the edge data resizes the kernel as it gets closer to/further away from the edge
// when origSamp.y=ZERO, this means there's no offset
// therefore there's no blur, no edge bleed!
float ret = 0; // accumulated occlusion
for(int i=0; i<NUM_SAMP; i++)
{
    float2 offsetUV = IN.uv + (samples[i] * blurKern);
    ret += tex2D(AOSamp, offsetUV).x;
...

If you notice, this just uses one extra sample to read the channel where the edge data is stored, plus one multiplication.... that's it! We have just saved tons of operations compared to the common smart blur pass. Less smart but low in carbs!
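
Here is a CPU-side Python sketch of that blur, just to show the kernel collapsing on an edge. It works in texel units on a tiny 2D list instead of UVs (so InvTextureSize is implicit), uses nearest-texel lookups with clamping, and averages the taps at the end. The averaging, the clamping and all the names are assumptions of the sketch, since the listing above stops before that part.

```python
# CPU-side sketch of the edge-scaled blur. Names are hypothetical.

def blur_ao(ao, x, y, radius, taps):
    """Blur the occlusion at texel (x, y).

    ao:   2D list of (occlusion, edge_data) pairs, indexed ao[y][x].
    taps: list of (dx, dy) kernel offsets, in units of the blur radius.
    """
    height, width = len(ao), len(ao[0])
    _, edge_data = ao[y][x]
    # The stored edge data scales the kernel: 0 on an edge, 1 on flat areas.
    kern = radius * edge_data
    total = 0.0
    for dx, dy in taps:
        # Nearest-texel lookup, clamped to the image (an assumption here).
        sx = min(max(int(round(x + dx * kern)), 0), width - 1)
        sy = min(max(int(round(y + dy * kern)), 0), height - 1)
        total += ao[sy][sx][0]
    return total / len(taps)
```

With edge_data at zero every tap lands back on the center texel, so the pixel keeps its own occlusion and nothing bleeds across the edge.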

Friday, July 17, 2009

Accurate 2D SSAO: A new implementation?


Well, I hope it is...lol. SSAO or Screen Space Ambient Occlusion is a way to approximate global illumination light and shadow. Basically, it's like the shadow created by indirect lighting or bounced light. SSAO was first presented by Crytek a few years ago with their CryEngine 2 for the game Crysis (too much Cry lol). I first implemented the pure 2D SSAO (depth compare) by Arkano22, which is quite straightforward. It basically compares the depth of a randomly offset sample. I also did the one from the GameRendering website, which uses randomized offset normals that are projected and then transformed back to image space. I find these two implementations very interesting.

The projected technique is, I would say, the correct computation of SSAO, as it compares occlusion using the projected normals. But the price of projecting and transforming back to texture space is just too heavy, especially since this is done for every sample (8 or 16).

In terms of speed nothing beats pure 2D SSAO of course, but it is only an estimate because the angle of the normals is not taken into account... in short, it will not work in extreme cases. This becomes obvious when the scene is rotated about an axis: the AO shrinks and expands.

Hence, I came up with a different approach to computing the SSAO. Here is the sucker-punch question: why do I need to project the normals back to image space for each sample if I'm already plotting data in a 2D space? Pure 2D is correct there; I agree with that technique's assumption. Projecting normals is also correct, as it is the proper estimation of occlusion.

My implementation in the screenshot above works as a pure 2D SSAO but with the normals taken into account WITHOUT projecting to texture space. We know normals are directions. By simply extending a normal a few units, what we get is the expanded version of the normal, as if we were scaling the vertices up. With that in mind, even if this is 3D space, if you imagine looking at it on screen, the offset is really just two-dimensional information (as in deferred, UV as data). Then we divide it by the current depth so we offset properly and PRESTO, we've got ourselves the UV offset of the sample based on the extended normals. Now do an original normal vs offset normal comparison when sampling and you get the same occlusion as with the projected normals. The unique part here is that I compute the offsets from the normal prior to sampling, which means I am doing a projected normal comparison in pure 2D space. (My hunch is that this is also possible with depth comparison instead of normals; I'll probably test this tomorrow.)

I don't know if this is a new implementation, but the cost is as if I were just doing a 16-sample filter or a simple Depth of Field, or something even simpler. It is more optimized than projected-normal approaches such as Mr Iñigo's implementation. No matrix multiplication, no reflect function, no sign function, etc., and the result is theoretically the same. With a simple multiplication of the normals, and using that as the offset position prior to sampling, it achieves a similar result.
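
As a rough illustration of the paragraph above, here is a Python sketch of the two pieces it describes: turning an extended view-space normal into a UV offset by dividing by depth, and the original-vs-offset normal comparison (the 1-dot(N, Nsample) delta). The function names and the `extend` parameter are placeholders I made up; how the random noise gets mixed into the offset is not shown because the post doesn't spell it out.

```python
# CPU-side sketch of the normal-driven 2D offset. Names are hypothetical.

def normal_uv_offset(normal, depth, extend):
    """UV offset taken from the normal's x/y, extended a few units and
    divided by the current depth so the offset shrinks with distance."""
    nx, ny, _ = normal
    return (nx * extend / depth, ny * extend / depth)

def normal_delta(normal, normal_sample):
    """1 - dot(N, Nsample): 0 when the normals agree, larger as they diverge."""
    return 1.0 - sum(a * b for a, b in zip(normal, normal_sample))
```

Note there is no matrix multiply anywhere: the offset is one multiply and one divide per axis, which is where the saving over re-projecting each sample comes from.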

ADD:

I looked into NVIDIA's sample of their own SSAO implementation, but the mathematics is way beyond me so I didn't bother trying it out. Plus, considering it has a lot of math instructions to go through, I bet it's heavier.

ADD: 20/07/09 Accurate 2D SSAO with 'not so good' noise texture

Tuesday, July 7, 2009

Light Prepass: Dual Paraboloid Shadow Mapping

Shadows are quite essential in helping us perceive depth and distance. Some games even deliberately exaggerate them in order to deliver emotion, intrigue and/or drama to the player (e.g. Bioshock, Dead Space, etc). Unfortunately, simulating shadows is one of those expensive luxuries, though to some it is a necessity. Shadow mapping is one of the common techniques for shadow rendering. A shadow map is rendered from the perspective of the light's eye, storing the depth value in a texture which is then used for the depth test in the shadow projection pass. Simple as it may sound, when it comes to omni-directional lights (imagine a sun, lighting in every direction) the complexity increases drastically.

One solution is to render shadow maps along the 6 primary 3D axis directions (up, down, left, right, forward and backward). This means rendering the scene through the light frustum 6 times, once per direction: 600% of the time spent on a single-direction light source, not to mention 6 shadow map textures. Heavy on speed and memory. (EDIT: Mei de Koh, a friend of mine, added that it's possible to use a virtual cube map shadow map so as not to render the scene 6 times... I haven't studied this one though.) Enter Dual Paraboloid Shadow Mapping.

Dual Paraboloid Shadow Mapping is a way of optimizing omni-light shadows. Instead of 6 maps, it uses only 2. How? Imagine curving the lens to the point that if you render the scene pointing forward and then pointing backward, you get a close to perfect spherical vista of the scene. This can also be used for environment mapping (or reflections), but that is for another time. The image you see above is using this technique. As you may notice, the shadow penumbra differs based on position and distance to the light source. According to the STALKER developers' article in GPU Gems 2, they avoided this technique because they were using deferred rendering (very similar to the Light Prepass which I am using). I don't really know why they stated that, but it seems to work on my side. Hopefully, this will be just enough for our game requirements.
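
For reference, the 'curved lens' is usually written as a simple formula: normalize the light-space position, pick the front or back map from the sign of z, then divide x and y by (1 + z). The Python sketch below follows that commonly published formulation; it is not a copy of my vertex shader, the function name is made up, and storing depth as length / far is a simplifying assumption.

```python
import math

# CPU-side sketch of the dual paraboloid projection. Names are hypothetical.

def paraboloid_project(p, far):
    """Map a light-space position to one of the two paraboloid maps.

    Returns (is_front, (u, v), depth) with u and v in [-1, 1] and
    depth as the distance to the light divided by the far range.
    """
    x, y, z = p
    length = math.sqrt(x * x + y * y + z * z)
    dx, dy, dz = x / length, y / length, z / length
    is_front = dz >= 0.0
    if not is_front:
        dz = -dz  # the back map looks down the negative z axis
    # The paraboloid warp: directions near the rim (dz close to 0)
    # land near the unit circle, the axis lands at the center.
    u = dx / (1.0 + dz)
    v = dy / (1.0 + dz)
    return is_front, (u, v), length / far
```

A point straight ahead of the light maps to the center of the front map, and a point exactly sideways maps to the rim of the circle, which is the seam where the two maps meet.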

(Technical clue: It really does pay off when your native space is View space.)

There is one technical drawback in my implementation though. As I curve the scene into a paraboloid when rendering from the light's perspective, some meshes don't bend well (I'm doing this in the vertex shader). An example is the plane in the image above. I spent a bit of time trying to solve this. When the plane was bent, a weird black area appeared on the edges of the plane. My solution was to clear the shadow map to WHITE first, and do multiplicative blending (src=DestColor; dest=Zero) when rendering the shadow depth. Then presto! No more black edges.

Next, I need to optimize my shadow implementation. If you notice, there is still some tweaking to do on the shadow edges. Cheers!