Friday, August 21, 2009

Screen Space Global Illumination: Screen Space Gone Mad

It seems that when SSAO (Screen Space Ambient Occlusion) was discovered, the SS goodness just kept coming. Back in the day, to get AO or GI (Global Illumination) one would either pre-process it and store it in vertex colors or lightmaps (or go RNM - Radiosity Normal Maps), or use ray-tracing techniques, which are still too framerate-heavy to be used in a realtime application.

Global Illumination, in simple summary, is an approximation of light bounced onto surfaces by indirect lighting. As light rays hit an object, they bounce from surface to surface. With each bounce, a ray changes color and intensity based on the material or inherent color of the surfaces it hits. It's the same reason we can still see inside a house (with windows) even if the sun is directly above our rooftops, and the same reason rooms with light-colored paint on the walls tend to be brighter. (GI is actually the proper value of the ambient color we usually add to our lighting.)

There are several algorithms for computing GI in 3D graphics, but all have the same concept in mind... it's either an approximation of an approximation or just plain approximation (hehehehhe).

Enter Screen Space Global Illumination. I've read through one implementation aptly named SSDO, or Screen Space Directional Occlusion, which is an image-based approximation of GI. I must say it's really impressive, minus some maths that goes over my head, but that's my fault. Mr. always-have-something-brilliant-in-mind Wolfgang Engel wrote a short but interesting post on his blog about a simpler extension of SSAO into an SSGI implementation. And since it fits sooooo well into the Light Prepass rendering design, trying it out... was inevitable (with Agent Smith's echoes).

Understanding how SSAO works, it's easy to extend it to get at least a single light bounce of indirect illumination. By doing the same thing as SSAO but for GI, it's safe to say that a particular pixel on a surface is close enough to receive radiated color from another surface if the occlusion test succeeds. Hence, by sampling the albedo color of that surface and averaging it, you get the average radiosity that pixel receives. Now the question is, where do we get the pre-bounce intensity?

Working at Zealot Digital here in Singapore, my Lead Programmer is Alberto Demichelis, the author and maker of the Squirrel scripting language (AAA games are using it now, btw, one big dead-walking title). He gave me a great idea that I had overlooked in Mr. Engel's post: using the Light Accumulation buffer as the intensity of the bounce. By combining (this is tricky) the projected shadow term with the light accumulation and converting the result to black and white, we can use it as the radiosity intensity. Right now, what I do is lerp between the sampled bounced color and the original albedo using this value.


// SS Magic:
float3 resultRad = 0;
for(int i = 0; i < NUM_SAMP; i++)
{
    // here u do the AO generation stuffs
    if(occNorm > occ_thres)
    {
        float3 sampleAlbedo = tex2D(albSamp, uv + offsetnoise).rgb;
        // collapse the light accumulation sample into a single grey intensity
        float intensity = saturate(dot(tex2D(lightAccum, uv + offsetnoise).rgb, float3(1, 1, 1)));
        resultRad += lerp(sampleAlbedo, curAlbedo, intensity);
    }
}
resultRad /= NUM_SAMP;
// pls note that I'm just recoding this through recollection but the idea is here.

We can further extend this by going through the pass again, but this time using the current GI as the intensity, to simulate multiple light bounces. Buuuuuuut I didn't bother to try; I think a single bounce will suffice for a game application.


After implementing this, I realized that it's possible to 'fit' this into the SSAO pass. That would save blur passes needed to remove the graininess (the screenshot isn't the combined SSAO and SSGI implementation yet). (A screenshot of this implementation will follow... hopefully.)


Tuesday, July 28, 2009

SSAO Blurring: Making It Less Smart But Low In Carbohydrates

As the title points out... making a slimmer and less-of-a-genius SSAO blur.

Screen Space Ambient Occlusion commonly uses two passes: first the ambient occlusion generation (see 'my' Accurate 2D SSAO), and then a blur pass to remove the graininess of the AO. Unfortunately, the blur pass is not your average toolbox blur. It's all because of the 'edges' of the models or of the relief normals. The blurring must be 'smart' enough not to blur over edges, otherwise bleeding will occur. The common idea around the game dev community is to make a smart blur using a delta depth/normal check similar to the one in the AO generation (if it's beyond a threshold, it's an edge). This means, however, doing that check for every blur sample, which typically happens more than once to get proper smoothness. The result is a blur pass that is more complicated and heavier than the actual AO generation.
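
Just to illustrate, the conventional 'smart' blur looks roughly like the sketch below (my own variable names; depthSamp is assumed here, not from any particular engine). Notice that the edge test runs on EVERY blur sample.

// Sketch of the usual 'smart' blur: every tap re-reads the depth and is
// rejected if it crosses an edge.
float centerDepth = tex2D(depthSamp, IN.uv).r;
float ret = 0;
float weight = 0;
for(int i = 0; i < NUM_SAMP; i++)
{
    float2 offsetUV = IN.uv + samples[i] * InvTextureSize * radius;
    float sampleDepth = tex2D(depthSamp, offsetUV).r;
    if(abs(centerDepth - sampleDepth) < edge_threshold) // not an edge, safe to blur over
    {
        ret += tex2D(AOSamp, offsetUV).x;
        weight += 1;
    }
}
ret = (weight > 0) ? ret / weight : tex2D(AOSamp, IN.uv).x;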


Hence, I came up with a simple solution: lessening the calories of the SSAO blur pass by reusing data already computed by the AO pass. How? The delta (the depth or normal comparison). We compare that delta against an edge_threshold while we are still doing the AO sampling. Let me explain.

// SSAO: AO generation pass
float edge = 0;
for(int i = 0; i < NUM_SAMP; i++)
{
    // here u do the AO generation stuffs
    // use if(deltaN > edge_threshold) if u want finer details
    // deltaN = 1 - dot(N, Nsample) or deltaZ = depth - depthSamp
    if(deltaZ > edge_threshold) { edge++; }
}
edge /= NUM_SAMP;

The result is a gradient of the edge data. One price we pay, though, is encoding the AO into two channels... one for the occlusion and one for the edge data (encoded as 1 - edge). Now for the kicker... we will use this data NOT as a toggle flag for whether to blur or not... but as a scale factor on the blur radius.

// SSAO: Blur pass
float2 origSamp = tex2D(AOSamp, IN.uv).xy;
// x = occlusion; y = edge_data

float2 blurKern = InvTextureSize * radius * origSamp.y;
// the edge data resizes the kernel as it gets closer to / further from an edge
// when origSamp.y = ZERO, that means there's no offset
// therefore there's no blur, and no edge bleed!
float ret = 0;
for(int i = 0; i < NUM_SAMP; i++)
{
    float2 offsetUV = IN.uv + (samples[i] * blurKern);
    ret += tex2D(AOSamp, offsetUV).x;
    ...

If you notice, this uses just one extra channel, where the edge data is stored, and one multiplication... that's it! We have just saved tons of operations compared to the common smart blur pass. Less smart but low in carbs!

Friday, July 17, 2009

Accurate 2D SSAO: A new implementation?


Well, I hope it is... lol. SSAO, or Screen Space Ambient Occlusion, is a way to approximate global illumination light and shadow. Basically, it's like the shadowing created by indirect lighting or bounced light. SSAO was first presented by Crytek a few years back with their CryEngine 2 for the game Crysis (too much Cry, lol). I first implemented the pure 2D SSAO (depth compare) by Arkano22, which is quite straightforward: it basically compares the depth of a randomly offsetted sample. I also did the one from the GameRendering website, which uses randomized offsetted normals projected back into image space. I find these two implementations very interesting.

The projected technique is, I would say, the correct computation of SSAO, as it compares occlusion using the projected normals. But the price of projecting and transforming back to texture space is just too heavy, especially since this is done on every sample (8 or 16).

In terms of speed nothing beats pure 2D SSAO, of course, but it is only an estimate because the angle of the normals is not taken into account... in short, it does not hold up in extreme cases. This becomes obvious when the scene is rotated on an axis: the AO shrinks and expands.
Hence, I came up with a different approach to computing the SSAO. Here's the sucker-punch question: why do I need to project the normals back to image space on every sample if I'm already plotting data in 2D space? Pure 2D was correct; I agree with that technique's assumption. Projecting normals is also correct, as it is the proper estimation of occlusion.

My implementation in the screenshot above works as pure 2D SSAO but with the normals taken into account WITHOUT projecting to texture space. We know normals are directions. By simply extending a normal a few units, what we get is an expanded version of it, as if we were scaling the vertices up. With that in mind, even though this is 3D space, if you look at the offset, it's really just two-dimensional information (as in deferred UVs as data). Then divide it by the current depth so we offset properly and PRESTO, we've got ourselves the UV offset of the sample based on the extended normal. Now do an original-normal vs. offsetted-normal comparison when sampling and you get the same occlusion as with the projected normals. The unique part here is that I compute the offsets from the normal prior to sampling, which means I am doing a projected-normal comparison in pure 2D space. (My hunch is that this is also possible with a depth comparison instead of normals; I'll probably test that tomorrow.)
I don't know if this is a new implementation, but the cost is as if I were just doing a 16-sample filter, a simple Depth of Field, or simpler. It is more optimized than the projected normals of Mr. Iñigo's implementation: no matrix multiplication, no reflect function, no sign function, etc., and the result is theoretically the same. With a simple multiplication of the normal, and using that as the offset position prior to sampling, it achieves a similar result.
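
Roughly in code, the core of the idea looks something like this. It's only a sketch, not my exact shader, and extent, normalSamp and depthSamp are made-up names here.

// Sketch only: offset the sample UV along the view-space normal, scaled by 1/depth,
// then compare the original normal against the normal found at that offset.
float3 N = tex2D(normalSamp, IN.uv).xyz * 2 - 1;   // current pixel's normal
float  depth = tex2D(depthSamp, IN.uv).r;          // current pixel's depth

float occ = 0;
for(int i = 0; i < NUM_SAMP; i++)
{
    // 'extend' the normal a few units (plus a noise jitter), divide by depth,
    // and use the result directly as a 2D texture-space offset
    float2 offsetUV = IN.uv + (N.xy * extent + offsetnoise * samples[i].xy) / depth;
    float3 Nsample = tex2D(normalSamp, offsetUV).xyz * 2 - 1;
    occ += saturate(1 - dot(N, Nsample));          // accumulates occlusion like deltaN
}
occ /= NUM_SAMP;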

ADD:

I looked into NVIDIA's sample of their own SSAO implementation, but the mathematics is way beyond me, so I didn't bother trying it out. Plus, considering they have a lot of mathematical instructions to go through, I bet it's heavier.

ADD: 20/07/09 Accurate 2D SSAO with 'not so good' noise texture

Tuesday, July 7, 2009

Light Prepass: Dual Paraboloid Shadow Mapping

Shadows are quite essential in helping us perceive depth and distance. Some games even deliberately exaggerate them in order to deliver emotion, intrigue and/or drama to the player (e.g. Bioshock, Dead Space, etc.). Unfortunately, simulating shadows is one of those expensive luxuries, or to some a necessity. Shadow mapping is one of the most common techniques for shadow rendering: a shadow map is rendered from the perspective of the light's eye, storing the depth values in a texture, which is then used for the depth test in the shadow projection pass. Simple as it may sound, when it comes to omni-directional lights (imagine a sun, lighting in every direction) the complexity drastically increases.

One solution is to render shadow maps along the 6 primary 3D axes (up, down, left, right, forward and backward). This means rendering the light frustum 6 times, once per axis, 600% of the time spent on a single light source, not to mention 6 shadow map textures, heavy on both speed and memory. (EDIT: Mei de Koh, a friend of mine, added that it's possible to use a virtual cube map shadow map so as not to render the scene 6 times... I haven't studied that one though.) Enter Dual Paraboloid Shadow Mapping.

Dual Paraboloid Shadow Mapping is a way of optimizing omni-light shadows. Instead of 6 maps, it only uses 2. How? Imagine curving the lens up to the point that if you render the scene pointing forward and then pointing backward, you get a close-to-perfect spherical vista of the scene. This can also be used for environment mapping (or reflections), but that's for another time. The image you see above uses this technique. As you may notice, the shadow penumbra differs based on position and distance to the light source. The S.T.A.L.K.E.R. developers' article in GPU Gems 2 says they avoided this technique because they were using deferred rendering (very similar to the Light Prepass I am using). I don't really know why they stated that, but it seems to work on my side. Hopefully, this will be just enough for our game requirements.

(Technical clue: it really does pay off when your native space is View space.)

There is one technical drawback in my implementation though. As I curve the scene when rendering the light's perspective into a paraboloid, some meshes don't bend well (I'm doing this in the vertex shader). An example is the plane in the image above. I spent a bit of time trying to solve this: when the plane was bent, weird black artifacts appeared on its edges. My solution was to clear the shadow map to WHITE first, and use multiplicative blending (src=DestColor; dest=Zero) when rendering the shadow depth. Then presto! No more black edges.
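
For reference, the vertex-shader bending itself looks roughly like this. It's a sketch of the standard paraboloid projection with my own names (WorldLightView, nearPlane, farPlane), not my exact shader.

// Paraboloid projection sketch for one of the two hemispheres.
// WorldLightView is assumed to put the vertex in the light's view space, looking down +z.
float4 pos = mul(IN.position, WorldLightView);
float dist = length(pos.xyz);                        // distance from the light

pos.xyz /= dist;                                     // project onto the unit sphere
pos.z   += 1.0f;                                     // push onto the paraboloid
pos.x   /= pos.z;                                    // divide through: paraboloid 'screen' position
pos.y   /= pos.z;

pos.z = (dist - nearPlane) / (farPlane - nearPlane); // store normalized distance as depth
pos.w = 1.0f;
OUT.position = pos;

The bending only happens per vertex, while triangle interiors are still interpolated linearly, which is exactly why large, sparsely tessellated meshes like that plane misbehave.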

Next, I need to optimize my shadow implementation. As you may notice, there's still some tweaking to do on the shadow edges. Cheers!

Tuesday, June 30, 2009

Light Prepass Cascaded Shadow Mapping, continued

Ah... shadow map filtering, I didn't know before that there are so many of them. (The screenshots use a 512x512 shadow map to make it easy to compare the various filters I implemented.) In my older screenshots I'm using 5x5 PCF (middle), short for Percentage Closer Filtering, which is of course better at smoothing out the shadows but the heaviest of all the implementations I did. The 4-tap PCF (left) is the fastest but the ugliest, not very nice if your shadow map resolution is low. I also tried Gaussian blur and random filters (not in the screenshot). I tried non-PCF filtering too, like Variance Shadow Mapping, but considering I'm doing Cascaded Shadow Mapping, VSM has a bigger memory appetite since it needs two channels to store depth and depth*depth. Which brings me to my own implementation.
Similar to what I've been doing in the past, and sometimes I'm good at it... I dub or name stuff. I dubbed my implementation 8-Tap O-PCF, or Occasional-PCF. It may sound funny, but that's only half the point. The real reason I named it that is how it self-optimizes. Let me explain: (inhales intensely)
The first square of 4 taps out of the 8 gives enough information to decide whether to proceed with the other 4 taps. Dissecting the texel into a 3x3 grid, the center is literally ignored, because the size and number of texels sampled are already enough. Sampling is basically a third of a texel around the texel, and should not go outside the texel. The wonderful thing about this is that I'm sampling 8 times at most, but only the first 4 if the shadow test fails. If you closely examine the inner penumbra of the shadow, you'll notice its smoothness, almost to the point that it appears to be using a higher shadow map resolution. Of course the outer penumbra will still be jagged, but hey, considering the 3 tones I added, with the edge being the lightest, and a good Depth of Field, this can even look like 2048x2048 shadows. I like the flexibility of this filtering, so much so that it can already fake shadow bleeding, beating the soft PCFs in terms of speed and control. So there you have it, 8-Tap O-PCF... sounds like those nifty items in Monkey Island like Mug O'Grog or Spit O'Matic. Just say, 8-Tap-O'Pac-eF... (wind blowing, tumbleweed) ...nah, that was corny. Ha!
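
Reconstructing it from memory, the 'Occasional' part looks roughly like this (texelSize, tapsA/tapsB, shadowUV and lightDepth are made-up names for this sketch, not the actual shader):

// 8-Tap O-PCF sketch: take the first square of 4 taps, stepping a third of a
// texel around the centre; pay for the other 4 taps only when occlusion shows up.
float tapStep = texelSize / 3.0f;
float lit = 0;
for(int i = 0; i < 4; i++)        // the first, square 4 taps
    lit += (tex2D(shadowMap, shadowUV + tapsA[i] * tapStep).r > lightDepth) ? 1.0f : 0.0f;

if(lit < 4)                       // some taps are shadowed: refine with the extra 4 taps
{
    for(int j = 0; j < 4; j++)
        lit += (tex2D(shadowMap, shadowUV + tapsB[j] * tapStep).r > lightDepth) ? 1.0f : 0.0f;
    lit /= 8.0f;
}
else
{
    lit = 1.0f;                   // shadow test failed (fully lit): early out after 4 taps
}
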
I wouldn't say I'm finished with this topic... because definitely, I'll be revisiting it once the renderer is formalized. Next stop, Dual Paraboloid Shadow Mapping for our indoor and outdoor shadowing daily needs.

Wednesday, June 24, 2009

Light Prepass Cascaded Shadow Mapping


Updates, updates, updates (actually, it's only one update. Singular). The image you see above is Light Prepass rendering with Cascaded Shadow Mapping. Each of the color changes in the image represents one of the splits of the CSM, based on distance from the camera.

I first implemented NVIDIA's CSM (which I intentionally didn't post results of here, because how I implemented it was too embarrassing to show); I couldn't seem to stabilize that implementation. Then I tried Wolfgang Engel's [ShaderX5] CSM, which I then simplified based on Michal Valient's [ShaderX6]. I didn't implement it exactly as he did, though, primarily the MEC (minimal enclosing circle) part. I instead used a transformed-axis bounding box on each frustum split. With the bounding box, I don't have to recompute the split even if I rotate or translate the camera. The depth can also stay constant, depending on the requirements. What I did is just use the radius of the BB plus the length of the difference between the center of the BB and the original light position. The center of the BB is in the untransformed axis. I call it Camera-Frustum-Split Bound Depth. It's just my fancy way of saying 'getting enough precision from an orthographic shadow', kind of. In simple terms, imagine a light on a tripod attached on top of a helmet, beaming in a certain direction downward, with a special gimbal that ignores the panning, yawing and rolling of the head (rolling heads... o_O). The distance of the light from the eyes stays constant. Though I think this is only applicable to an outdoor, sun-type scene.

For the shadow view matrix, I preserve the light direction in the BB (transformed) space, so whenever I turn or move the camera the depth and the shadow view angle stay constant.

I still have to work on the filtering and on fading into the next shadow split. The initial part I've done, the transition between cascades, is based on the pixel's view-space depth versus the far distance of each split. My plan is to gradually fade between splits. I chose this approach because it makes sense, at least to me, that transitions are based on the field of view of the eye. I read somewhere, I can't remember where, that this also avoids popping at the shadow split transitions.
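
In pseudo-shader terms, the plan is something like this (SampleCascade, splitFar and fadeRange are hypothetical names, just to show the idea):

// Pick the cascade by comparing view-space depth against each split's far distance,
// then fade toward the next split as we approach that far distance.
int split = 0;
for(int i = 0; i < NUM_SPLITS - 1; i++)
    if(viewDepth > splitFar[i]) split = i + 1;

float shadow = SampleCascade(split, viewPos);           // hypothetical helper
float fadeStart = splitFar[split] - fadeRange;
if(split + 1 < NUM_SPLITS && viewDepth > fadeStart)
{
    float t = saturate((viewDepth - fadeStart) / fadeRange);
    shadow = lerp(shadow, SampleCascade(split + 1, viewPos), t);
}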

Btw, I store the splits one per channel rather than splitting a single-channel texture. I don't know how this will affect my rendering, or whether it's better or worse, but it's good enough... for now.

Friday, June 12, 2009

Light Prepass Shadow Mapping



Ok, an update on my Light Prepass journey. Shadow mapping is actually new to me, let alone using it in a deferred renderer. In fact, this is probably the first time I've nailed this thing on the head. Currently, it's simple shadow mapping... no magic... no fancy footwork. The image you see here has a point light (with attenuation) casting a shadow with 5x5 PCF (Percentage Closer Filtering). At this point it's all raw, brute-force coding, definitely screaming for optimization. And obviously, I'm hiding all the shadow errors with a neat camera angle.
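
For the curious, the 5x5 PCF itself is nothing exotic. It boils down to something like this (a bare-bones sketch with my own names for shadowMap, shadowUV, texelSize, bias and lightDepth, not the exact shader):

// 5x5 PCF sketch: average 25 depth comparisons around the projected shadow-map coord.
float shadow = 0;
for(int y = -2; y <= 2; y++)
{
    for(int x = -2; x <= 2; x++)
    {
        float stored = tex2D(shadowMap, shadowUV + float2(x, y) * texelSize).r;
        shadow += (stored + bias > lightDepth) ? 1.0f : 0.0f;
    }
}
shadow /= 25.0f;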

The model rendering, however, exhibits sound optimization. I've managed to remove all matrix computation from the pixel shader in the light passes (full-screen quad and light convex mesh - I call these guys 'light blobs'). The mathematics here was such a nosebleed. The key is making View Space your battleground. Good thing the graphics gurus are around... MJP, Drilian, Engel, etc. Here's the link.

In terms of prepass packing, specifically the normals, the CryEngine 3 suggestion seems to produce some inaccuracies when I error-check it against the fresh, untouched normals. I added a weird value just to remove the error. Here's the code.

float4 PackDepthNormal(in float z, in float3 normal)
{
    float4 output;
    normal = normalize(normal);
    normal.x *= 1.000000000000001f; //<--- my nasty mod
    output.xy = normalize(normal.xy) * sqrt(normal.z * .5f + .5f);
    return PackDepth(output, z);
}

Anyway, I need to investigate this further. I find Pat Wilson's idea of converting the normals to spherical coordinates better. I haven't profiled it, but it seems to be a more optimized approach.
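
For reference, the generic spherical-coordinate encode/decode looks roughly like this (this is the textbook version, not necessarily Pat Wilson's exact code):

// Pack a unit normal into two channels as spherical coordinates, and back.
float2 EncodeNormalSpherical(float3 n)
{
    return float2(atan2(n.y, n.x) / 3.14159265f, n.z) * 0.5f + 0.5f;
}

float3 DecodeNormalSpherical(float2 enc)
{
    float2 ang = enc * 2.0f - 1.0f;
    float2 scTheta;
    sincos(ang.x * 3.14159265f, scTheta.x, scTheta.y);   // sin and cos of the angle
    float2 scPhi = float2(sqrt(1.0f - ang.y * ang.y), ang.y);
    return float3(scTheta.y * scPhi.x, scTheta.x * scPhi.x, scPhi.y);
}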

Back to shadows, my target is to use Cascaded Shadow Maps. Hopefully, in a few days time I can post the results.

Thursday, May 28, 2009

My First Deferred Shading: Light Prepass Rendering

Ahh, the teapot mesh... the always-used but one of the most useful meshes for rendering objects, especially in prototypes, which is what I am currently doing. Now, without further ado, I give you MY FIRST DEFERRED RENDERING (drum roll).

I'm doing Light PrePass rendering by good olde Wolfgang. The process is remarkably simple once you understand it. Similar to Deferred Rendering (well, this IS deferred rendering), it only renders normal/depth in the first pass. The lights are then rendered using the first pass's normal and depth buffers. The light passes are accumulated and then applied in the gather pass.

The image above is just a teapot rendered with two point lights. Not much I can show right now. But the key thing here is how you pack the data in the buffers. Currently, I've tested 3 ways of packing the normal/depth and 2 ways of doing the light accumulation passes. I find that with Pat Wilson's suggestion of transforming the normals to spherical coordinates you can mind-blowingly pack the 3 floats into 1 and a partial half (just enough to store the sign of normal.z). I also find Reltham's suggestion, via Drilian (on http://www.gamedev.net/), interesting for how the light accumulation pass is done: a multiplicative blend instead of the standard alpha blend. The colors are sharper, and the mid-blend of two lights looks more 'realistic'. I haven't done any tone mapping or normal mapping yet; I'm quite excited about that once I nail the packing of the buffers.

Monday, May 18, 2009

Advent Rising: A Game That Took Me By Surprise

When Nazarene (my loving other half) and I went to Sim Lim Square to buy some stuff, we saw this game store which sells a wide variety of PC games. Much to my delight, they also sell old games which I missed playing. Best of all, they were on sale (and my GF said she'd pay for it): buy 2, get 1 free. A good deal, I should say. I took FEAR 1 and Broken Sword 3. Then I needed to choose the free game.

Then I saw Advent Rising. I was a bit hesitant, at first. I only heard about this game from my ex-colleague in Emerging Entertainment, Charles, a game designer who's a frantic RPG-er. Honestly, I only chose it to be the free game because of its intriguing orange-green DVD casing cover.

When I got home, I installed FEAR and Broken Sword 3 first. Installing Advent Rising was just an afterthought to me. A week later, I popped Advent Rising into my laptop. And after a few hours of playing it, I was taken aback... this game is a hidden gem!

I have to admit, the game is not easy to get into. The controls were quite complicated, but the next thing I knew, they had become second nature to me. Its story... one word... SURPRISINGLY AWESOME! (ok, that was 2 words)

What intrigues me about this game is the way its experience changes as it progresses. If I could put it in a title, I'd dub it "Game Evolved Within A Game". The game also has a final trick up its sleeve after the credits, but I won't spoil it for you.

Advent Rising was intended to be a trilogy, and the game ends on a cliffhanger. But like Back to the Future, with its famous 'To be continued...', the story satisfies its objectives within itself. Unfortunately, news says the sequels were cancelled due to poor sales. But for now, I'll be positive about this: they WILL make the sequels.

Why didn't it sell that well? Maybe a poor marketing strategy on the part of its publisher, Majesco or THQ. Kudos to the developer GlyphX, though. Maybe it was because of some technical hiccups the game has. But as for me, I was sold.

I read some of the reviews of this game. Most of them shout 'rip-off' or 'cliche'. But in reality, no idea is new under the sun. Even if it's similar to other stories, I don't see it intentionally ripping off other games. Uniqueness shouldn't be a standard but a plus; otherwise, anything would be a rip-off of something.

Underrated games are out there. Most of them were overshadowed by big titles when they came out (example: Psychonauts). My advice to everyone: don't just be interested in overly-hyped games. There are hidden gems out there. In this 'case', it's an orange-green diamond.

Wednesday, February 25, 2009

Cooking Programmer (ver 0.01) "Home Super-duper Burger"

After having much free time at home, I made this unhealthy but perfect
hunger-quenching meal. Enjoy...

Cooking Programmer's
Home Made Super-duper Burger
with Spicy Cajun Fries

Burger
- beef burger patties
- Smoked Sliced Cheese
- finely chopped Onion mixed with tomato catsup
- cucumber pickles
Grill the burger patties (brush with butter to enhance the taste). Slightly grill the burger bun.
Place the cheese on top of the burger patties while on grill to slightly melt it. Assemble the burger!

Fries
- 1 - 2 pcs (julienned) potatoes soaked in slightly salted water
- 1 tbsp Cajun powder
- 1/2 tsp chili powder
- 1/2 - 1 tsp salt
Fry potatoes in hot oil. Mix spices with potatoes after.

Dip
- 1-2 tbsp mayonnaise (low-fat will do)
- dash of pepper
- crushed fresh Basil (or dry-powdered)

Tuesday, January 13, 2009

New Wolverine Game... Awesome Screenshot!

X-Men Origins: Wolverine



Well... at least that's how it was presented. I just hope the game (and even the movie of the same title) lives up to its hype.