SMAA: Subpixel Morphological Antialiasing

SMAA is a very efficient GPU-based MLAA implementation (DX9, DX10, DX11 and OpenGL), capable of handling subpixel features seamlessly, and featuring an improved and advanced pattern detection & handling mechanism.

The technique focuses on handling each pattern in a very specific way (via look-up tables), in order to minimize false positives in the pattern detection. Ultimately, this prevents antialiasing features that are not produced by jaggies, like texture details. Furthermore, this conservative morphological approach, together with correct subsample area estimation, makes it possible to accurately combine MLAA with multi/supersampling techniques. Finally, the technique has been specifically designed to clone (to a reasonable extent) multisampling reference results.

This code is licensed under the MIT license, with a clarification to avoid copyright notices on binary releases (see below).

Check out the paper for more info!

Thanks To

Stephen Hill ‒ for his invaluable support.

Alex Fry ‒ for his priceless help with the devkit.

Naty Hoffman ‒ for helping us to touch base with the game developer community.

Jean-Francois St-Amour ‒ for providing us with great images for testing.

Johan Andersson ‒ for providing the fantastic BF3 image and clearing up important questions.

Andrej Dudenhenfer ‒ for creating the SMAA injector.

Dmitriy Jdone ‒ for porting the code to GLSL.

Weibo Xie ‒ for the suggested optimizations.

Alexander Reshetov ‒ for creating MLAA, and opening our minds.

Everyone on the SIGGRAPH course ‒ for the incredible inspiration.

Usage

See SMAA.hlsl for integration info (despite the extension, note that it's OpenGL compatible).

You'll also need some precomputed textures, which can be found as C++ headers (Textures/AreaTex.h and Textures/SearchTex.h), or as regular DDS files (see Textures directory). If you want to see where they came from, you can check out the Scripts directory.

The directories DX9 and DX10 contain integration examples for DirectX 9 and 10 respectively.

Bug Tracker

Found a bug? Please create an issue here on GitHub!

https://github.com/iryoku/smaa/issues

Authors

Jorge Jimenez http://www.iryoku.com/

Jose I. Echevarria http://cheveone.blogspot.com/

Tiago Sousa https://twitter.com/#!/CRYTEK_TIAGO

Belen Masia

Fernando Navarro

Diego Gutierrez http://giga.cps.unizar.es/~diegog/

Copyright and License

Copyright © 2013 Jorge Jimenez ([email protected])

Copyright © 2013 Jose I. Echevarria ([email protected])

Copyright © 2013 Belen Masia ([email protected])

Copyright © 2013 Fernando Navarro ([email protected])

Copyright © 2013 Diego Gutierrez ([email protected])

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. As clarification, there is no requirement that the copyright notice and permission be included in binary distributions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Comments
  • add defines to deal with GL vs. DX v-coordinate direction

    There is probably another nicer way to fix this, but when implementing this algorithm in GL in my engine it wasn't working correctly. It seemed that adjusting for the fact that +V is up in GL vs. +V is down in DX could fix the issue. Doing A/B comparisons with the sample app and my GL app along the way, it did indeed fix the issue. Maybe flipping the Area/Search textures and flipping my inputs could also fix it, but doing it this way keeps all the inputs and outputs consistent with the existing engine code.
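The equivalence mentioned at the end of this comment (flipping the lookup coordinate versus flipping the stored texture) can be illustrated with a minimal sketch. The helper names `flip_v`, `flip_rows` and `sample_nearest` are hypothetical and are not part of SMAA.hlsl; the 2x2 "texture" is made-up data, and nearest-texel sampling is assumed for simplicity.

```python
# Hypothetical helpers for illustration; none of these names exist in SMAA.hlsl.

def flip_v(uv):
    """Convert a texture coordinate between the +V-down and +V-up conventions."""
    u, v = uv
    return (u, 1.0 - v)


def flip_rows(texels):
    """Flip a texture stored as a list of rows, so the first row becomes the last."""
    return texels[::-1]


def sample_nearest(texels, uv):
    """Nearest-texel lookup, with row 0 located at v = 0."""
    u, v = uv
    height = len(texels)
    width = len(texels[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texels[y][x]


# Made-up 2x2 texture standing in for a lookup texture.
area = [
    ["a", "b"],
    ["c", "d"],
]

uv = (0.25, 0.25)
# Flipping the coordinate and flipping the stored rows select the same texel.
assert sample_nearest(area, flip_v(uv)) == sample_nearest(flip_rows(area), uv)
```

Either approach yields the same texel, so the choice mostly comes down to which side of the engine is easier to keep consistent.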

  • GLSL Compilation Errors

    I tried to compile the last SMAA.hlsl with the following:

        #define SMAA_GLSL_3 1
        #define SMAA_PRESET_MEDIUM 1
        #define SMAA_PREDICATION 1

    My (Nvidia Geforce GT430) GLSL compiler complains:

        0(614) : error C7011: implicit cast from "float" to "bool"
        0(615) : error C7011: implicit cast from "float" to "bool"
        0(1269) : error C1115: unable to find compatible overloaded function "SMAAMovc(bool, vec4, vec4)"
        0(1270) : error C1115: unable to find compatible overloaded function "SMAAMovc(bool, vec2, vec2)"
        0(1271) : error C7011: implicit cast from "float" to "vec2"

    EDIT: corrected the line numbers; these are basically simple cast errors.

    These errors are easily fixable:

        614 : SMAA_FLATTEN if (bool(cond.x)) variable.x = value.x;
        615 : SMAA_FLATTEN if (bool(cond.y)) variable.y = value.y;
        1269: SMAAMovc(vec4(horizontal), blendingOffset, float4(a.x, 0.0, a.z, 0.0));
        1270: SMAAMovc(vec2(horizontal), blendingWeight, a.xz);
        1271: blendingWeight /= dot(blendingWeight, vec2(1.0));

    EDIT2: there are many more cast errors when compiling in ULTRA or HIGH (const float to vec2), but once they are fixed I get a very strange link error:

    (0) : fatal error C9999: *** exception during compilation ***

    This error pops up only in ULTRA or HIGH mode, and I don't know how I can possibly fix it.

  • Added SMAA_CG mode for compiling with NVIDIA CG Toolkit

    Luckily it was only a few lines, thanks to your defines!

    Note that despite my commit message, I have tested it now. It compiles on OpenGL using profiles arbfp/arbvp1 and up.

  • OpenGL + SMAA 4X

    Dear All,

    I've integrated the SMAA S2X and 4X modes for OpenGL. Getting the integration going was relatively easy: just follow the integration notes in SMAA.hlsl, and don't forget to add some missing defines to SMAA.hlsl in the #if defined(SMAA_GLSL_3) || defined(SMAA_GLSL_4) section:

        #define SMAATexture2DMS2(tex) sampler2DMS tex
        #define SMAALoad(tex, pos, sample) texelFetch(tex, pos, sample)

    Once I got it running, the 4X mode didn't look as good as expected compared to T2X - some edges look better, but others look worse. I think the reason for this is the difference in MSAA2x subsample positions between DirectX and OpenGL.

    The integration notes in SMAA.hlsl call for the scene to be rendered with D3D10_STANDARD_MULTISAMPLE_PATTERN for SMAA 4X. This allows the subsamples to match the order in the @SUBSAMPLE_INDICES table. When the scene is rendered this way, you have the following subsample positions in DirectX:

           * Sample positions DirectX:
           *   _______
           *  | S1    |  S0:  0.25    -0.25
           *  |       |  S1: -0.25     0.25
           *  |____S0_|
           *
    

    On the other hand, for MSAA2X in OpenGL you get the following subsample positions (positions adjusted for pixel centre that is reported at (0.5,0.5) - see below):

            * Sample positions OpenGL:
            *   _______
            *  |    S0 |  S0:  0.25     0.25
            *  |       |  S1: -0.25    -0.25
            *  |_S1____|
            *
    

    I've been puzzling over how to adjust the subsampleIndices to make SMAA 4X work in OpenGL. Does anyone have some insight here? In the DX10 demo it is mentioned that the indices have the following layout: indices[4] = { |, --, /, \ }. How should that be interpreted?

    I suspect I also need to adjust the camera jitter, since with the recommended jitter of (0.125, 0.125) and (-0.125, -0.125) for SMAA4X I would end up with a net jitter in OpenGL of:

               *   ________
               *  |      S0|  S0:  0.3750    0.3750
               *  |    S2  |  S1: -0.1250   -0.1250
               *  |  S1    |  S2:  0.1250    0.1250
               *  |S3______|  S3: -0.3750   -0.3750
               *
    

    As a test, I modified the camera jitter offset to use the original, unadjusted subsample positions of MSAA2X instead: (0.75, 0.75) and (0.25, 0.25) (obtained with glGetMultisamplefv, (0.5, 0.5) is the pixel centre). That improved things quite a bit, but I can't help but think I'm still missing something?
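The arithmetic behind the diagrams in this comment can be reproduced with a minimal sketch. The subsample offsets and jitter values are the ones quoted in the comment; the function names `centre_relative` and `net_positions` are hypothetical and exist only for illustration.

```python
# MSAA 2x subsample offsets relative to the pixel centre, as quoted in the comment.
DIRECTX_MSAA2X = [(0.25, -0.25), (-0.25, 0.25)]
OPENGL_MSAA2X = [(0.25, 0.25), (-0.25, -0.25)]

# Recommended SMAA 4X camera jitter, one offset per frame.
JITTER = [(0.125, 0.125), (-0.125, -0.125)]


def centre_relative(position, centre=(0.5, 0.5)):
    """Convert a glGetMultisamplefv-style position in [0, 1] to a centre-relative offset."""
    return (position[0] - centre[0], position[1] - centre[1])


def net_positions(subsamples, jitter):
    """Add each frame's jitter to each subsample offset (frames outer, subsamples inner)."""
    return [(sx + jx, sy + jy) for jx, jy in jitter for sx, sy in subsamples]


# The raw OpenGL positions (0.75, 0.75) and (0.25, 0.25) map to the offsets above.
assert [centre_relative(p) for p in [(0.75, 0.75), (0.25, 0.25)]] == OPENGL_MSAA2X

print("OpenGL :", net_positions(OPENGL_MSAA2X, JITTER))
print("DirectX:", net_positions(DIRECTX_MSAA2X, JITTER))
```

With the OpenGL offsets, all four net positions fall on the same diagonal (x equals y for every sample), which matches the last diagram. With the DirectX offsets, every sample has a distinct x and a distinct y, so the four samples are spread across the pixel.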

  • Fix incorrect calculation of local contrast adaption in color edge detection

    In SMAAColorEdgeDetectionPS(), the calculations of left-left and top-top deltas are performed as:

        t = abs(C - Cleftleft);
    

    and

        t = abs(C - Ctoptop);
    

    However, these are not differences of neighboring pixels, so should be modified as follows:

        t = abs(Cleft - Cleftleft);
    

    and

        t = abs(Ctop - Ctoptop);
    

    Thanks!
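To see how the two formulas differ, here is a minimal sketch on made-up colour values. The function names are hypothetical, and reducing the per-channel difference to its largest component is an assumption about how the shader collapses `t` into a single delta; the sketch is not a port of SMAAColorEdgeDetectionPS().

```python
def max_channel_delta(a, b):
    """Largest per-channel absolute difference between two RGB colours (assumed reduction)."""
    return max(abs(x - y) for x, y in zip(a, b))


def left_left_delta_current(c, c_left, c_left_left):
    """Delta as described for the existing code: abs(C - Cleftleft)."""
    return max_channel_delta(c, c_left_left)


def left_left_delta_proposed(c, c_left, c_left_left):
    """Delta as proposed in this fix: abs(Cleft - Cleftleft)."""
    return max_channel_delta(c_left, c_left_left)


# Made-up values: a smooth horizontal grey ramp with a step of 0.25 per pixel.
c = (0.5, 0.5, 0.5)
c_left = (0.25, 0.25, 0.25)
c_left_left = (0.0, 0.0, 0.0)

print(left_left_delta_current(c, c_left, c_left_left))   # spans two pixels
print(left_left_delta_proposed(c, c_left, c_left_left))  # spans neighbouring pixels
```

On this ramp the existing formula measures the difference across two pixels (0.5), while the proposed one measures the difference between neighbouring pixels (0.25), which is the quantity the comment argues should be used.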

  • Support WebGL 1 / OpenGL ES 2

    This adds support for OpenGL ES 2 and WebGL 1 via SMAA_GLSL_ES2 as well as support for OpenGL 2 via SMAA_GLSL_2 (only tested via OpenGL ES - the difference here is that support for while loops is not required in ES 2 and is disallowed by WebGL). The output is incorrect when using medium-precision float in the fragment shader, so highp should be used.

  • Add mingw support

    Allows building the demos with mingw. Supports both 32- and 64-bit versions. By default it's set up to cross-compile on Linux but should also work if building natively on Windows.

  • OpenGL integration lookup textures confusion

    Hi there,

    I am integrating this great SMAA in my own OpenGL test framework. I am a bit confused about whether I am handling the lookup textures correctly.

    First I had to apply this patch to get the edge detection itself working: https://github.com/pushrax/smaa/commit/cd8f88cd1548a2966eb1c4bfe0677427fa885140

    I also noticed that OpenGL and DirectX place the origin of the uv-coordinates differently, so I inverted the Y axis of the fullscreen triangle texture coordinates to get bottom-up rendering. But then my confusion starts. I use BMP files dumped by the provided Python scripts. After reading them in, I mirror them so that the Y coordinate is flipped. After that step I get reasonable results, but still different ones compared to the DX10 demo.

    So I have no idea what causes the differences in my edge, weight and result passes. I adjusted the images to fit the lower left corner, so a direct comparison should work.

    The edges seem to be shifted a little bit, but only some of them. (Images: dx10demoultraedges, oglportedges)

    The weights seem to have flipped red and green parts as well as different intensity. This is the point where I am most confused. Is this flipping of red and green parts a result of incorrect usage of the lookup textures, or is it a result of the very small differences from the first pass? (Images: dx10demoultraweight, oglportweight)

    The results themselves look reasonable and quite good, but in some parts they are still different. Everything seems thicker in the OpenGL results. This is clearly visible at the wire, for example. (Images: dx10demoultraresult, oglportresult)

    So it seems to be related to the questions I asked at the old OpenGL port project: https://github.com/scrawl/smaa-opengl/issues/3

  • Custom decode velocity function compilation error

    When defining SMAA_DECODE_VELOCITY to point to a function that takes a float4 vector, an error is generated at line 1323, because the input there is a float2 obtained by swizzling the rg components of the texture sample, whereas in the other three cases there is no swizzling, so a float4 is passed.
