Monday, July 18, 2011

Obligatory FXAA Post

I did a quick search through AltDev and I don't think anyone else has really talked about FXAA on AltDevBlogADay yet, but the time is long overdue. I'm a little late to the game in trying out Timothy Lottes' post-process anti-aliasing technique, but I had been noticing people saying great things about it when I finally decided to give it a serious look over the 4th of July weekend. I had already been thinking about writing up my experiences with it when I saw that Eric Haines had posted a glowing review of it over on realtimerendering.com. So if you don't trust the judgement of some intern who hasn't even finished a college degree, please refer to all the more qualified people who are having similar experiences to mine.

The Problem at Hand

For those of you who might be reading this and haven't drunk the graphics programming Kool-Aid, I'm going to fill you in a little bit as to why we care about anti-aliasing.

In our line of work, the end result of a rendered scene is typically a 2-dimensional array of colors that is displayed on the user's monitor. Just as audio (which is originally analog) is quantized when it is made digital, the image is broken into a discrete number of pixels during the process of rasterization (i.e. filling in each triangle). How many pixels are used to display the image is of course known as the resolution, which is why things become increasingly blocky as you lower a game's resolution: there are fewer dots per inch.

Even at high resolutions, individual pixels can still often be identified by the viewer, because the lines and edges of triangles form distracting "jaggies". This is most noticeable along the outline of an object due to rasterization, but it can also occur on interior surfaces for other reasons, such as shadows and texture map resolution. The user typically picks the resolution their monitor is set to, but most console games provide 720p or 1080p images to the TV, and monitors tend to follow suit (my laptop is set to ~720p). Other than relying on hardware to pack more pixels into the same physical area by increasing the dots per inch (which is what Apple is claiming with its Retina displays, but check out this post for an interesting look at that), we have to find ways to smooth the transitions between pixels.

Perhaps the simplest strategy for dealing with aliasing is supersampling: render into a texture twice the width and height of the target resolution, then downsample it to the actual output resolution. This lets you take the average of every four pixels, softening the boundaries by averaging the edge colors together. However, this also just straight up sucks for performance. You end up processing 4x as many fragments at the higher resolution, and you have 4x the memory usage for the buffer. This is way too high a cost to pay for some smooth edges.
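To make that downsample concrete, here's a minimal sketch of the resolve pass in Cg/HLSL (the kind of shader code the rest of this post deals with). The names (_SuperTex, _SuperTex_TexelSize) and the bare fragment function are my own illustration; in Unity this would sit inside a ShaderLab pass.

    // Hypothetical 2x supersample resolve: average the 2x2 block of
    // high-resolution texels that covers each output pixel.
    sampler2D _SuperTex;          // assumed: the 2x-width, 2x-height render target
    float4 _SuperTex_TexelSize;   // assumed: x = 1/width, y = 1/height of _SuperTex

    float4 frag_downsample(float2 uv : TEXCOORD0) : COLOR
    {
        float2 o = _SuperTex_TexelSize.xy * 0.5;
        float4 c = tex2D(_SuperTex, uv + float2(-o.x, -o.y));
        c += tex2D(_SuperTex, uv + float2( o.x, -o.y));
        c += tex2D(_SuperTex, uv + float2(-o.x,  o.y));
        c += tex2D(_SuperTex, uv + float2( o.x,  o.y));
        return c * 0.25; // 4x the fragments and 4x the memory just to get this average
    }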

The Hardware Option

There is a hardware-accelerated method, MSAA (Multi-Sample Anti-Aliasing). Instead of rendering additional pixels and downsampling, it computes additional coverage samples while rendering the frame, and the fragment shader is only run once for each group of samples. The problem is that it doesn't work with the increasingly popular deferred rendering techniques, because you can't really take more samples of a buffer you've already rendered. In short, the damage is already done; the data has been discretized to a particular resolution. Furthermore, I've always found that MSAA is still pretty expensive (but then again, the only way to make your rendering take 0 ms is to quit rendering and switch to a job in finance).

New Maps of AA-land

The desire to use deferred rendering has pushed alternate forms of anti-aliasing to get a lot of attention (I mean, who doesn't need more acronyms, right?). Perhaps the most prevalent one you may have heard of is MLAA, but most of the techniques I'm referring to here are post-processing techniques that rely on detecting and softening edges. By doing anti-aliasing as a post-process, it works seamlessly with deferred rendering, and pretty much anything else for that matter. If the technique only needs access to the color buffer, it can even be applied to video/screenshots/whatever of existing games, which I think researchers in this area have found to be a great method of showing off their work.

MLAA was originally a CPU-based technique developed by Intel that has since been adapted to run on the GPU. This has been outlined in GPU Pro 2, Game Developer Magazine, and around the net, but if you're not familiar with it I'll make a few brief points. It essentially boils down to storing edges in a texture, with different colors indicating which side of the pixel the edge is located on, and an additional buffer is used to calculate the blending weights for the blurring.
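To give a feel for the first stage, here is a hedged sketch of the kind of edge-detection pass an MLAA-style pipeline starts with, using luma differences against the left and top neighbors. The names, threshold, and luma weights are my own illustration, not the GPU Pro 2 implementation (which, as noted below, actually recommends depth-based edges).

    sampler2D _MainTex;          // assumed: the rendered color buffer
    float4 _MainTex_TexelSize;   // assumed: x = 1/width, y = 1/height

    float Luma(float3 rgb) { return dot(rgb, float3(0.299, 0.587, 0.114)); }

    float4 frag_edges(float2 uv : TEXCOORD0) : COLOR
    {
        const float threshold = 0.1; // illustrative edge threshold
        float lumaC = Luma(tex2D(_MainTex, uv).rgb);
        float lumaL = Luma(tex2D(_MainTex, uv - float2(_MainTex_TexelSize.x, 0)).rgb);
        float lumaT = Luma(tex2D(_MainTex, uv - float2(0, _MainTex_TexelSize.y)).rgb);

        // Red flags an edge against the left neighbor, green against the top one;
        // later passes read these flags to compute the blending weights.
        float2 diff = abs(float2(lumaC - lumaL, lumaC - lumaT));
        float2 edges = step(float2(threshold, threshold), diff);
        return float4(edges, 0.0, 0.0);
    }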

Impressively, they found the quality falls somewhere between 4x and 8x MSAA, while being 11x faster than 8x MSAA. The memory footprint of the technique is 1.5x or 2x the size of the back buffer depending on the hardware (2x for the Xbox 360, for all you console devs who have probably already heard everything I'm saying). That's pretty good for something that solves the problems encountered with deferred rendering at the same time. This is undoubtedly why the technique has garnered so much attention, and I really recommend the article in GPU Pro 2 if you want a clear view of all the details.

Enter: FXAA

As you may know from reading other posts, my big side-project/hobby/thing is that when I give a new technique a try, I do it in Unity, because a) it's usually not straightforward, so I have to actually understand the technique to get it working, and b) I can evaluate it using the many projects I've done in Unity previously. From looking over the details of MLAA, I knew it would probably take a full weekend to get it right, and I had been procrastinating quite a bit about getting around to it.

When the third iteration of FXAA rolled out, I decided to take a look at what it entailed. I knew in the back of my mind that FXAA is a strictly luminosity-based technique, which is interesting to me. The authors of MLAA recommend using depth to determine edges for the best results and performance. However, luminosity-based techniques offer the advantage/disadvantage of smoothing boundaries that exist in places other than depth discontinuities, such as aliasing on texture maps and in shadows. The downside is that this can produce results that are too blurry in places you don't want blur, such as text on a prop in 3D space. I once tried a very, very simple luminosity-based AA filter that had too many cons (especially blurry text) for me to seriously use it. I was curious whether FXAA would give me similar problems.

With the intention of just looking over the code briefly, I suddenly found myself staring at a very simple and easy-to-use code base offering a ton of well explained pre-processor options for target platform and quality. It was around midnight when I started looking, and I quickly decided that porting the higher-quality PC version of the HLSL code to Cg/Unity would be fun. There were two steps involved:

1) At the end of all the other post-processing, calculate luminosity and slam it into the alpha channel (super easy to do; a sketch of this pass is below).
2) Perform the FXAA pass. Porting mostly involved fixing texture look-up syntax.
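For reference, here's roughly what step 1 looks like as a Cg fragment function. This is a minimal sketch under my own naming, not Lottes' code; the luma weights are the usual Rec. 601-style ones, and the FXAA pass then reads that alpha channel.

    sampler2D _MainTex; // assumed: output of the last regular post-process

    float4 frag_luma(float2 uv : TEXCOORD0) : COLOR
    {
        float4 color = tex2D(_MainTex, uv);
        // Store luminosity in alpha so the FXAA pass can fetch color + luma
        // with a single texture read.
        color.a = dot(color.rgb, float3(0.299, 0.587, 0.114));
        return color;
    }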

No extra buffers involved. This cuts out the extra memory needed for MLAA, and the code seemed simple enough that it would probably be pretty fast. I ported all the code within 2 hours and then went to sleep. The next morning I finished setting it up for use in Unity... and was blown away by the speed and the results. Here's a breakdown of what I got running in the Unity editor at ~720p, using Dust, a previous project of mine, to test it. These are PNGs cropped at native resolution:

Shot 1: No Anti-aliasing



Shot 2: FXAA3



Shot 3: 6x MSAA



It takes FXAA3 only ~1 ms [Note: corrected from an erroneous order-of-magnitude typo when I first posted this article] on my laptop (MacBook Pro with an NVIDIA GeForce 320M) to complete (including the luminosity calculation), and, as I mentioned, there's no additional memory either. I'll pay that millisecond any day of the week for that quality of anti-aliasing. Furthermore, the blurriness on text was much more acceptable than with the fast blur I had tried previously. Note that this text is not really meant to be read, but rather recognized to match the voice-over, so the player can understand that the voice is that of the journal's author. I wish I could credit the sources I put the fast blur together from, but it's been more than a few months since I implemented/ditched it. Here's a comparison:

Shot 1: No Anti-aliasing



Shot 2: FXAA3



Shot 3: Fast Blur



This is getting a bit more toward the personal-opinion end of things, but even though FXAA3 does blur the text, I feel like the fast blur makes it almost uncomfortable to look at. I get the sensation that I'm being tested for a new prescription of glasses and have to recognize out-of-focus letters. To me, that type of blurring seems like a pretty good indication that your luminosity-based AA technique isn't up to snuff. FXAA3, on the other hand, seems acceptable enough that it'll definitely be enabled in future builds of Dust that get pushed up to my website.

What did we learn?

Hopefully we learned that FXAA is fast, doesn't require additional memory, and is trivially simple to implement or port. There are versions for the 360, PS3, and PC, as well as HLSL and GLSL variants. It's a luminosity-based solution and comes with the pros/cons associated with those techniques, but I've found FXAA to minimize the cons. It should literally take you at most a few hours to get it up and running in your game and evaluate whether it's a good fit for what you're doing. Furthermore, it is in the public domain (credit to the post by Eric Haines for this twitter snippet):



Although at this point, I'd say that for the amount of quality you're getting for the effort of implementation, it might as well be under the beer license. Finally, I would point out that Timothy Lottes is still improving on the code, and has already released version 3.9 in the time since I first touched it. You can find the latest source links and updates on his blog: http://timothylottes.blogspot.com/.