Optimized GPU noise functions and utilities

//
//	Code repository for GPU noise development blog
//	http://briansharpe.wordpress.com
//	https://github.com/BrianSharpe
//
//	I'm not one for copyrights.  Use the code however you wish.
//	All I ask is that credit be given back to the blog or myself when appropriate.
//	And also to let me know if you come up with any changes, improvements, thoughts or interesting uses for this stuff. :)
//	Thanks!
//
//	Brian Sharpe
//	brisharpe CIRCLE_A yahoo DOT com
//	http://briansharpe.wordpress.com
//	https://github.com/BrianSharpe
//
Comments
  • Syntax errors

    Hi there,

    I had to make a few modifications to make this work on my ATI GPU.

    Smearing floats (swizzling a scalar literal) isn't supported, so things like 0.0.xxx needed to be changed to vec3(0.0). Casting a boolean vec3 to a float vec3 needs to be explicit, so I had to wrap all greaterThan(...) and lessThan(...) calls as vec3(greaterThan(...)) etc.
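
    A minimal sketch of the two changes, assuming a compiler that rejects both forms as described above. `wrap_inc1` is a hypothetical helper name used only for illustration, and the 287.5 threshold is borrowed from the SGPP hash code:

    ```glsl
    // Hypothetical helper for illustration only; it is not part of the noise library.
    // Returns coord + 1, wrapped back to 0 at the top of a 289-wide domain.
    vec3 wrap_inc1( vec3 coord )
    {
        // Rejected on the GPU described above (swizzle on a scalar literal):
        //     vec3 zero = 0.0.xxx;
        vec3 zero = vec3( 0.0 );

        // Rejected on the GPU described above (bvec3 used where a vec3 is expected):
        //     vec3 mask = greaterThan( coord, vec3( 287.5 ) );
        vec3 mask = vec3( greaterThan( coord, vec3( 287.5 ) ) );    // true -> 1.0, false -> 0.0

        return mix( coord + vec3( 1.0 ), zero, mask );
    }
    ```

    The explicit constructors cost nothing at runtime on most compilers, so this form should be safe to use everywhere rather than only on the affected card.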

    The code below is edited to compile on this GPU.

    Thanks

    //
    //  NoiseLib TODO
    //
    //  1) Ensure portability across different cards
    //  2) 16bit and 24bit implementations of hashes and noises
    //  3) Lift various noise implementations out to individual self-contained files
    //  4) Implement texture-based versions
    //  5) 4D noises
    //

    //
    //  Permutation polynomial idea is from Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise
    //  http://www.itn.liu.se/~stegu/GLSL-cellular
    //
    //  http://briansharpe.wordpress.com/2011/10/01/gpu-texture-free-noise/
    //
    vec4 SGPP_coord_prepare(vec4 x) { return x - floor(x * ( 1.0 / 289.0 )) * 289.0; }
    vec3 SGPP_coord_prepare(vec3 x) { return x - floor(x * ( 1.0 / 289.0 )) * 289.0; }
    vec4 SGPP_permute(vec4 x) { return fract( x * ( ( 34.0 / 289.0 ) * x + ( 1.0 / 289.0 ) ) ) * 289.0; }
    vec4 SGPP_resolve(vec4 x) { return fract( x * ( 7.0 / 288.0 ) ); }
    vec4 SGPP_hash_2D( vec2 gridcell )      //  generates a random number for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        vec4 hash_coord = SGPP_coord_prepare( vec4( gridcell.xy, gridcell.xy + vec2(1.0) ) );
        return SGPP_resolve( SGPP_permute( SGPP_permute( hash_coord.xzxz ) + hash_coord.yyww ) );
    }
    void SGPP_hash_2D( vec2 gridcell, out vec4 hash_0, out vec4 hash_1 )    //  generates 2 random numbers for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        vec4 hash_coord = SGPP_coord_prepare( vec4( gridcell.xy, gridcell.xy + vec2(1.0) ) );
        hash_0 = SGPP_permute( SGPP_permute( hash_coord.xzxz ) + hash_coord.yyww );
        hash_1 = SGPP_resolve( SGPP_permute( hash_0 ) );
        hash_0 = SGPP_resolve( hash_0 );
    }
    void SGPP_hash_3D( vec3 gridcell, out vec4 lowz_hash, out vec4 highz_hash )     //  generates a random number for each of the 8 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        gridcell = SGPP_coord_prepare( gridcell );
        vec3 gridcell_inc1 = mix( gridcell + vec3(1.0), vec3(0.0), vec3(greaterThan( gridcell, vec3(287.5) )) );
        highz_hash = SGPP_permute( SGPP_permute( vec2( gridcell.x, gridcell_inc1.x ).xyxy ) + vec2( gridcell.y, gridcell_inc1.y ).xxyy );
        lowz_hash = SGPP_resolve( SGPP_permute( highz_hash + gridcell.zzzz ) );
        highz_hash = SGPP_resolve( SGPP_permute( highz_hash + gridcell_inc1.zzzz ) );
    }
    void SGPP_hash_3D(  vec3 gridcell,
                        vec3 v1_mask,       //  user definable v1 and v2.  ( 0's and 1's )
                        vec3 v2_mask,
                        out vec4 hash_0,
                        out vec4 hash_1,
                        out vec4 hash_2 )   //  generates 3 random numbers for each of the 4 3D cell corners.  cell corners:  v0=0,0,0  v3=1,1,1  the other two are user definable
    {
        vec3 coords0 = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / 289.0 )) * 289.0;
        vec3 coords3 = mix( coords0 + vec3(1.0), vec3(0.0), vec3(greaterThan( coords0, vec3(287.5) )) );
        vec3 coords1 = mix( coords3, coords0, vec3(lessThan( v1_mask, vec3(0.5) )) );
        vec3 coords2 = mix( coords3, coords0, vec3(lessThan( v2_mask, vec3(0.5) )) );   //  v2 corner uses v2_mask ( the pasted code repeated v1_mask here )
        hash_2 = SGPP_permute( SGPP_permute( SGPP_permute( vec4( coords0.x, coords1.x, coords2.x, coords3.x ) ) + vec4( coords0.y, coords1.y, coords2.y, coords3.y ) ) + vec4( coords0.z, coords1.z, coords2.z, coords3.z ) );
        hash_0 = SGPP_resolve( hash_2 );
        hash_1 = SGPP_resolve( hash_2 = SGPP_permute( hash_2 ) );
        hash_2 = SGPP_resolve( SGPP_permute( hash_2 ) );
    }
    void SGPP_hash_3D(  vec3 gridcell,
                        out vec4 lowz_hash_0,
                        out vec4 lowz_hash_1,
                        out vec4 lowz_hash_2,
                        out vec4 highz_hash_0,
                        out vec4 highz_hash_1,
                        out vec4 highz_hash_2 )     //  generates 3 random numbers for each of the 8 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        gridcell = SGPP_coord_prepare( gridcell );
        vec3 gridcell_inc1 = mix( gridcell + vec3(1.0), vec3(0.0), vec3(greaterThan( gridcell, vec3(287.5) )) );
        highz_hash_2 = SGPP_permute( SGPP_permute( vec2( gridcell.x, gridcell_inc1.x ).xyxy ) + vec2( gridcell.y, gridcell_inc1.y ).xxyy );
        lowz_hash_0 = SGPP_resolve( lowz_hash_2 = SGPP_permute( highz_hash_2 + gridcell.zzzz ) );
        highz_hash_0 = SGPP_resolve( highz_hash_2 = SGPP_permute( highz_hash_2 + gridcell_inc1.zzzz ) );
        lowz_hash_1 = SGPP_resolve( lowz_hash_2 = SGPP_permute( lowz_hash_2 ) );
        highz_hash_1 = SGPP_resolve( highz_hash_2 = SGPP_permute( highz_hash_2 ) );
        lowz_hash_2 = SGPP_resolve( SGPP_permute( lowz_hash_2 ) );
        highz_hash_2 = SGPP_resolve( SGPP_permute( highz_hash_2 ) );
    }

    //
    //  implementation of the blumblumshub hash
    //  as described in MNoise paper http://www.cs.umbc.edu/~olano/papers/mNoise.pdf
    //
    //  http://briansharpe.wordpress.com/2011/10/01/gpu-texture-free-noise/
    //
    vec4 BBS_coord_prepare(vec4 x) { return x - floor(x * ( 1.0 / 61.0 )) * 61.0; }
    vec3 BBS_coord_prepare(vec3 x) { return x - floor(x * ( 1.0 / 61.0 )) * 61.0; }
    vec4 BBS_permute(vec4 x) { return fract( x * x * ( 1.0 / 61.0 )) * 61.0; }
    vec4 BBS_permute_and_resolve(vec4 x) { return fract( x * x * ( 1.0 / 61.0 ) ); }
    vec4 BBS_hash_2D( vec2 gridcell )   //  generates a random number for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        vec4 hash_coord = BBS_coord_prepare( vec4( gridcell.xy, gridcell.xy + vec2(1.0) ) );
        vec4 p = BBS_permute( hash_coord.xzxz /* * 7.0 */ );    //  * 7.0 will increase variance close to origin
        return BBS_permute_and_resolve( p + hash_coord.yyww );
    }
    vec4 BBS_hash_hq_2D( vec2 gridcell )    //  generates a hq random number for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        vec4 hash_coord = BBS_coord_prepare( vec4( gridcell.xy, gridcell.xy + vec2(1.0) ) );
        vec4 p = BBS_permute( hash_coord.xzxz /* * 7.0 */ );    //  * 7.0 will increase variance close to origin
        p = BBS_permute( p + hash_coord.yyww );
        return BBS_permute_and_resolve( p + hash_coord.xzxz );
    }
    void BBS_hash_3D( vec3 gridcell, out vec4 lowz_hash, out vec4 highz_hash )  //  generates a random number for each of the 8 cell corners
    {
    //  gridcell is assumed to be an integer coordinate

    //  was having precision issues here with 61.0.  60.0 fixes it.  need to test on other cards.
    const float DOMAIN = 60.0;
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + vec3(1.0), vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 ) )) );
    
    vec4 p = BBS_permute( vec2( gridcell.x, gridcell_inc1.x ).xyxy /* * 7.0 */ );  // * 7.0 will increase variance close to origin
    p = BBS_permute( p + vec2( gridcell.y, gridcell_inc1.y ).xxyy );
    lowz_hash = BBS_permute_and_resolve( p + gridcell.zzzz );
    highz_hash = BBS_permute_and_resolve( p + gridcell_inc1.zzzz );
    

    }

    //
    //  FAST32_hash
    //  A very fast hashing function.  Requires 32bit support.
    //  http://briansharpe.wordpress.com/2011/11/15/a-fast-and-simple-32bit-floating-point-hash-function/
    //
    //  The hash formula takes the form....
    //  hash = mod( coord.x * coord.x * coord.y * coord.y, SOMELARGEFLOAT ) / SOMELARGEFLOAT
    //  We truncate and offset the domain to the most interesting part of the noise.
    //  SOMELARGEFLOAT should be in the range of 400.0->1000.0 and needs to be hand picked.  Only some give good results.
    //  3D Noise is achieved by offsetting the SOMELARGEFLOAT value by the Z coordinate
    //
    vec4 FAST32_hash_2D( vec2 gridcell )    //  generates a random number for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        const vec2 OFFSET = vec2( 26.0, 161.0 );
        const float DOMAIN = 71.0;
        const float SOMELARGEFLOAT = 951.135664;
        vec4 P = vec4( gridcell.xy, gridcell.xy + vec2(1.0) );
        P = P - floor(P * ( 1.0 / DOMAIN )) * DOMAIN;   //  truncate the domain
        P += OFFSET.xyxy;                               //  offset to interesting part of the noise
        P *= P;                                         //  calculate and return the hash
        return fract( P.xzxz * P.yyww * vec4( 1.0 / SOMELARGEFLOAT ) );
    }
    void FAST32_hash_2D( vec2 gridcell, out vec4 hash_0, out vec4 hash_1 )  //  generates 2 random numbers for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        const vec2 OFFSET = vec2( 26.0, 161.0 );
        const float DOMAIN = 71.0;
        const vec2 SOMELARGEFLOATS = vec2( 951.135664, 642.949883 );
        vec4 P = vec4( gridcell.xy, gridcell.xy + vec2(1.0) );
        P = P - floor(P * ( 1.0 / DOMAIN )) * DOMAIN;
        P += OFFSET.xyxy;
        P *= P;
        P = P.xzxz * P.yyww;
        hash_0 = fract( P * vec4( 1.0 / SOMELARGEFLOATS.x ) );
        hash_1 = fract( P * vec4( 1.0 / SOMELARGEFLOATS.y ) );
    }
    void FAST32_hash_2D( vec2 gridcell, out vec4 hash_0, out vec4 hash_1, out vec4 hash_2 )     //  generates 3 random numbers for each of the 4 cell corners
    {
        //  gridcell is assumed to be an integer coordinate
        const vec2 OFFSET = vec2( 26.0, 161.0 );
        const float DOMAIN = 71.0;
        const vec3 SOMELARGEFLOATS = vec3( 951.135664, 642.949883, 803.202459 );
        vec4 P = vec4( gridcell.xy, gridcell.xy + vec2(1.0) );
        P = P - floor(P * ( 1.0 / DOMAIN )) * DOMAIN;
        P += OFFSET.xyxy;
        P *= P;
        P = P.xzxz * P.yyww;
        hash_0 = fract( P * vec4( 1.0 / SOMELARGEFLOATS.x ) );
        hash_1 = fract( P * vec4( 1.0 / SOMELARGEFLOATS.y ) );
        hash_2 = fract( P * vec4( 1.0 / SOMELARGEFLOATS.z ) );
    }
    vec4 FAST32_hash_2D_Cell( vec2 gridcell )   //  generates 4 different random numbers for the single given cell point
    {
        //  gridcell is assumed to be an integer coordinate
        const vec2 OFFSET = vec2( 26.0, 161.0 );
        const float DOMAIN = 71.0;
        const vec4 SOMELARGEFLOATS = vec4( 951.135664, 642.949883, 803.202459, 986.973274 );
        vec2 P = gridcell - floor(gridcell * ( 1.0 / DOMAIN )) * DOMAIN;
        P += OFFSET.xy;
        P *= P;
        return fract( vec4(P.x * P.y) * ( 1.0 / SOMELARGEFLOATS.xyzw ) );
    }
    vec4 FAST32_hash_3D_Cell( vec3 gridcell )   //  generates 4 different random numbers for the single given cell point
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const vec4 SOMELARGEFLOATS = vec4( 635.298681, 682.357502, 668.926525, 588.255119 );
    const vec4 ZINC = vec4( 48.500388, 65.294118, 63.934599, 63.279683 );
    
    //  truncate the domain
    gridcell.xyz = gridcell - floor(gridcell * ( 1.0 / DOMAIN )) * DOMAIN;
    gridcell.xy += OFFSET.xy;
    gridcell.xy *= gridcell.xy;
    return fract( vec4( gridcell.x * gridcell.y ) * ( (1.0) / ( SOMELARGEFLOATS + gridcell.zzzz * ZINC ) ) );
    

    }
    void FAST32_hash_3D( vec3 gridcell, out vec4 lowz_hash, out vec4 highz_hash )   //  generates a random number for each of the 8 cell corners
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const float SOMELARGEFLOAT = 635.298681;
    const float ZINC = 48.500388;
    
    //  truncate the domain
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + 1.0, vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 ) )) );
    
    //  calculate the noise
    vec4 P = vec4( gridcell.xy, gridcell_inc1.xy ) + OFFSET.xyxy;
    P *= P;
    P = P.xzxz * P.yyww;
    highz_hash.xy = vec2( 1.0 / ( SOMELARGEFLOAT + vec2( gridcell.z, gridcell_inc1.z ) * ZINC ) );
    lowz_hash = fract( P * highz_hash.xxxx );
    highz_hash = fract( P * highz_hash.yyyy );
    

    }
    void FAST32_hash_3D(    vec3 gridcell,
                            vec3 v1_mask,       //  user definable v1 and v2.  ( 0's and 1's )
                            vec3 v2_mask,
                            out vec4 hash_0,
                            out vec4 hash_1,
                            out vec4 hash_2 )   //  generates 3 random numbers for each of the 4 3D cell corners.  cell corners:  v0=0,0,0  v3=1,1,1  the other two are user definable
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const vec3 SOMELARGEFLOATS = vec3( 635.298681, 682.357502, 668.926525 );
    const vec3 ZINC = vec3( 48.500388, 65.294118, 63.934599 );
    
    //  truncate the domain
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + 1.0, vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 ) )) );
    
    //  compute x*x*y*y for the 4 corners
    vec4 P = vec4( gridcell.xy, gridcell_inc1.xy ) + OFFSET.xyxy;
    P *= P;
    vec4 V1xy_V2xy = mix( P.zwzw, P.xyxy, vec4(lessThan( vec4( v1_mask.xy, v2_mask.xy ), vec4(0.5) )) );        //  apply mask for v1 and v2
    P = vec4( P.x, V1xy_V2xy.xz, P.z ) * vec4( P.y, V1xy_V2xy.yw, P.w );
    
    //  get the lowz and highz mods
    vec3 lowz_mods = vec3( 1.0 / ( SOMELARGEFLOATS.xyz + gridcell.zzz * ZINC.xyz ) );
    vec3 highz_mods = vec3( 1.0 / ( SOMELARGEFLOATS.xyz + gridcell_inc1.zzz * ZINC.xyz ) );
    
    //  apply mask for v1 and v2 mod values
    v1_mask = mix( highz_mods, lowz_mods, vec3(lessThan( v1_mask.zzz, vec3(0.5))) );
    v2_mask = mix( highz_mods, lowz_mods, vec3(lessThan( v2_mask.zzz, vec3(0.5))) );
    
    //  compute the final hash
    hash_0 = fract( P * vec4( lowz_mods.x, v1_mask.x, v2_mask.x, highz_mods.x ) );
    hash_1 = fract( P * vec4( lowz_mods.y, v1_mask.y, v2_mask.y, highz_mods.y ) );
    hash_2 = fract( P * vec4( lowz_mods.z, v1_mask.z, v2_mask.z, highz_mods.z ) );
    

    }
    vec4 FAST32_hash_3D(    vec3 gridcell,
                            vec3 v1_mask,       //  user definable v1 and v2.  ( 0's and 1's )
                            vec3 v2_mask )      //  generates 1 random number for each of the 4 3D cell corners.  cell corners:  v0=0,0,0  v3=1,1,1  the other two are user definable
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const float SOMELARGEFLOAT = 635.298681;
    const float ZINC = 48.500388;
    
    //  truncate the domain
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + 1.0, vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 )) ) );
    
    //  compute x*x*y*y for the 4 corners
    vec4 P = vec4( gridcell.xy, gridcell_inc1.xy ) + OFFSET.xyxy;
    P *= P;
    vec4 V1xy_V2xy = mix( P.zwzw, P.xyxy, vec4(lessThan( vec4( v1_mask.xy, v2_mask.xy ), vec4(0.5) )) );        //  apply mask for v1 and v2
    P = vec4( P.x, V1xy_V2xy.xz, P.z ) * vec4( P.y, V1xy_V2xy.yw, P.w );
    
    //  get the z mod vals
    vec2 V1z_V2z = mix( gridcell_inc1.zz, gridcell.zz, vec2(lessThan( vec2( v1_mask.z, v2_mask.z ), vec2(0.5) )) );
    vec4 mod_vals = vec4( 1.0 / ( SOMELARGEFLOAT + vec4( gridcell.z, V1z_V2z, gridcell_inc1.z ) * ZINC ) );
    
    //  compute the final hash
    return fract( P * mod_vals );
    

    }
    void FAST32_hash_3D(    vec3 gridcell,
                            out vec4 lowz_hash_0,
                            out vec4 lowz_hash_1,
                            out vec4 lowz_hash_2,
                            out vec4 highz_hash_0,
                            out vec4 highz_hash_1,
                            out vec4 highz_hash_2 )     //  generates 3 random numbers for each of the 8 cell corners
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const vec3 SOMELARGEFLOATS = vec3( 635.298681, 682.357502, 668.926525 );
    const vec3 ZINC = vec3( 48.500388, 65.294118, 63.934599 );
    
    //  truncate the domain
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + 1.0, vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 ))) );
    
    //  calculate the noise
    vec4 P = vec4( gridcell.xy, gridcell_inc1.xy ) + OFFSET.xyxy;
    P *= P;
    P = P.xzxz * P.yyww;
    lowz_hash_2.xyzw = vec4( 1.0 / ( SOMELARGEFLOATS.xyzx + vec2( gridcell.z, gridcell_inc1.z ).xxxy * ZINC.xyzx ) );
    highz_hash_2.xy = vec2( 1.0 / ( SOMELARGEFLOATS.yz + gridcell_inc1.zz * ZINC.yz ) );
    lowz_hash_0 = fract( P * lowz_hash_2.xxxx );
    highz_hash_0 = fract( P * lowz_hash_2.wwww );
    lowz_hash_1 = fract( P * lowz_hash_2.yyyy );
    highz_hash_1 = fract( P * highz_hash_2.xxxx );
    lowz_hash_2 = fract( P * lowz_hash_2.zzzz );
    highz_hash_2 = fract( P * highz_hash_2.yyyy );
    

    }
    void FAST32_hash_3D(    vec3 gridcell,
                            out vec4 lowz_hash_0,
                            out vec4 lowz_hash_1,
                            out vec4 lowz_hash_2,
                            out vec4 lowz_hash_3,
                            out vec4 highz_hash_0,
                            out vec4 highz_hash_1,
                            out vec4 highz_hash_2,
                            out vec4 highz_hash_3 )     //  generates 4 random numbers for each of the 8 cell corners
    {
    //  gridcell is assumed to be an integer coordinate

    //  TODO:   these constants need tweaked to find the best possible noise.
    //          probably requires some kind of brute force computational searching or something....
    const vec2 OFFSET = vec2( 50.0, 161.0 );
    const float DOMAIN = 69.0;
    const vec4 SOMELARGEFLOATS = vec4( 635.298681, 682.357502, 668.926525, 588.255119 );
    const vec4 ZINC = vec4( 48.500388, 65.294118, 63.934599, 63.279683 );
    
    //  truncate the domain
    gridcell.xyz = gridcell.xyz - floor(gridcell.xyz * ( 1.0 / DOMAIN )) * DOMAIN;
    vec3 gridcell_inc1 = mix( gridcell + 1.0, vec3(0.0), vec3(greaterThan( gridcell, vec3( DOMAIN - 1.5 )) ) );
    
    //  calculate the noise
    vec4 P = vec4( gridcell.xy, gridcell_inc1.xy ) + OFFSET.xyxy;
    P *= P;
    P = P.xzxz * P.yyww;
    lowz_hash_3.xyzw = vec4( 1.0 / ( SOMELARGEFLOATS.xyzw + gridcell.zzzz * ZINC.xyzw ) );
    highz_hash_3.xyzw = vec4( 1.0 / ( SOMELARGEFLOATS.xyzw + gridcell_inc1.zzzz * ZINC.xyzw ) );
    lowz_hash_0 = fract( P * lowz_hash_3.xxxx );
    highz_hash_0 = fract( P * highz_hash_3.xxxx );
    lowz_hash_1 = fract( P * lowz_hash_3.yyyy );
    highz_hash_1 = fract( P * highz_hash_3.yyyy );
    lowz_hash_2 = fract( P * lowz_hash_3.zzzz );
    highz_hash_2 = fract( P * highz_hash_3.zzzz );
    lowz_hash_3 = fract( P * lowz_hash_3.wwww );
    highz_hash_3 = fract( P * highz_hash_3.wwww );
    

    }
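
    //
    //  Illustrative sketch only, not part of the original library.
    //  FAST32_hash_2D reduced to a single grid point, to show the hash formula
    //      hash = mod( x*x*y*y, SOMELARGEFLOAT ) / SOMELARGEFLOAT
    //  in isolation.  The name FAST32_hash_2D_point is hypothetical; the constants
    //  are copied from FAST32_hash_2D.  The result should match the .x component
    //  of FAST32_hash_2D( gridcell ), assuming identical float precision.
    //
    float FAST32_hash_2D_point( vec2 gridcell )
    {
        //  gridcell is assumed to be an integer coordinate
        const vec2 OFFSET = vec2( 26.0, 161.0 );
        const float DOMAIN = 71.0;
        const float SOMELARGEFLOAT = 951.135664;
        vec2 P = gridcell - floor(gridcell * ( 1.0 / DOMAIN )) * DOMAIN;    //  truncate the domain to 0.0->DOMAIN
        P += OFFSET;                                                        //  offset to interesting part of the noise
        P *= P;                                                             //  P is now ( x*x, y*y )
        //  fract( v / S ) is the same value as mod( v, S ) / S
        return fract( P.x * P.y * ( 1.0 / SOMELARGEFLOAT ) );
    }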

    //
    //  Interpolation functions
    //  ( smoothly increase from 0.0 to 1.0 as x increases linearly from 0.0 to 1.0 )
    //  http://briansharpe.wordpress.com/2011/11/14/two-useful-interpolation-functions-for-noise-development/
    //
    float Interpolation_C1( float x ) { return x * x * (3.0 - 2.0 * x); }   //  3x^2-2x^3  ( Hermite Curve.  Same as SmoothStep().  As used by Perlin in Original Noise. )
    vec2 Interpolation_C1( vec2 x ) { return x * x * (3.0 - 2.0 * x); }
    vec3 Interpolation_C1( vec3 x ) { return x * x * (3.0 - 2.0 * x); }
    vec4 Interpolation_C1( vec4 x ) { return x * x * (3.0 - 2.0 * x); }

    float Interpolation_C2( float x ) { return x * x * x * (x * (x * 6.0 - 15.0) + 10.0); }   //  6x^5-15x^4+10x^3  ( Quintic Curve.  As used by Perlin in Improved Noise.  http://mrl.nyu.edu/~perlin/paper445.pdf )
    vec2 Interpolation_C2( vec2 x ) { return x * x * x * (x * (x * 6.0 - 15.0) + 10.0); }
    vec3 Interpolation_C2( vec3 x ) { return x * x * x * (x * (x * 6.0 - 15.0) + 10.0); }
    vec4 Interpolation_C2( vec4 x ) { return x * x * x * (x * (x * 6.0 - 15.0) + 10.0); }
    vec4 Interpolation_C2_InterpAndDeriv( vec2 x ) { return x.xyxy * x.xyxy * ( x.xyxy * ( x.xyxy * ( x.xyxy * vec4( vec2(6.0), vec2(0.0) ) + vec4( vec2(-15.0), vec2(30.0) ) ) + vec4( vec2(10.0), vec2(-60.0) ) ) + vec4( vec2(0.0), vec2(30.0) ) ); }
    vec3 Interpolation_C2_Deriv( vec3 x ) { return x * x * (x * (x * 30.0 - 60.0) + 30.0); }

    float Interpolation_C2_Fast( float x ) { float x3 = x*x*x; return ( 7.0 + ( x3 - 7.0 ) * x ) * x3; }   //  7x^3-7x^4+x^7  ( Faster than Perlin Quintic.  Not quite as good shape. )
    vec2 Interpolation_C2_Fast( vec2 x ) { vec2 x3 = x*x*x; return ( 7.0 + ( x3 - 7.0 ) * x ) * x3; }
    vec3 Interpolation_C2_Fast( vec3 x ) { vec3 x3 = x*x*x; return ( 7.0 + ( x3 - 7.0 ) * x ) * x3; }
    vec4 Interpolation_C2_Fast( vec4 x ) { vec4 x3 = x*x*x; return ( 7.0 + ( x3 - 7.0 ) * x ) * x3; }

    float Interpolation_C3( float x ) { float xsq = x*x; float xsqsq = xsq*xsq; return xsqsq * ( 25.0 - 48.0 * x + xsq * ( 25.0 - xsqsq ) ); }   //  25x^4-48x^5+25x^6-x^10  ( C3 Interpolation function.  If anyone ever needs it... :) )
    vec2 Interpolation_C3( vec2 x ) { vec2 xsq = x*x; vec2 xsqsq = xsq*xsq; return xsqsq * ( 25.0 - 48.0 * x + xsq * ( 25.0 - xsqsq ) ); }
    vec3 Interpolation_C3( vec3 x ) { vec3 xsq = x*x; vec3 xsqsq = xsq*xsq; return xsqsq * ( 25.0 - 48.0 * x + xsq * ( 25.0 - xsqsq ) ); }
    vec4 Interpolation_C3( vec4 x ) { vec4 xsq = x*x; vec4 xsqsq = xsq*xsq; return xsqsq * ( 25.0 - 48.0 * x + xsq * ( 25.0 - xsqsq ) ); }

    //
    //  Falloff defined in XSquared
    //  ( smoothly decrease from 1.0 to 0.0 as xsq increases from 0.0 to 1.0 )
    //  http://briansharpe.wordpress.com/2011/11/14/two-useful-interpolation-functions-for-noise-development/
    //
    float Falloff_Xsq_C1( float xsq ) { xsq = 1.0 - xsq; return xsq*xsq; }   //  ( 1.0 - x*x )^2  ( Used by Humus for lighting falloff in Just Cause 2.  GPUPro 1 )
    float Falloff_Xsq_C2( float xsq ) { xsq = 1.0 - xsq; return xsq*xsq*xsq; }   //  ( 1.0 - x*x )^3.  NOTE: 2nd derivative is 0.0 at x=1.0, but non-zero at x=0.0
    vec4 Falloff_Xsq_C2( vec4 xsq ) { xsq = 1.0 - xsq; return xsq*xsq*xsq; }

    //
    //  Value Noise 2D
    //  Return value range of 0.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/valuesample1.jpg
    //
    float Value2D( vec2 P )
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec2 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash = FAST32_hash_2D( Pi );
    //vec4 hash = BBS_hash_2D( Pi );
    //vec4 hash = SGPP_hash_2D( Pi );
    //vec4 hash = BBS_hash_hq_2D( Pi );
    
    //  blend the results and return
    vec2 blend = Interpolation_C2( Pf );
    vec2 res0 = mix( hash.xy, hash.zw, blend.y );
    return mix( res0.x, res0.y, blend.x );
    

    }

    //
    //  Value Noise 3D
    //  Return value range of 0.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/valuesample1.jpg
    //
    float Value3D( vec3 P )
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_lowz, hash_highz;
    FAST32_hash_3D( Pi, hash_lowz, hash_highz );
    //BBS_hash_3D( Pi, hash_lowz, hash_highz );
    //SGPP_hash_3D( Pi, hash_lowz, hash_highz );
    
    //  blend the results and return
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( hash_lowz, hash_highz, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    return mix( res1.x, res1.y, blend.x );
    

    }

    //
    //  Perlin Noise 2D  ( gradient noise )
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/perlinsample.jpg
    //
    float Perlin2D( vec2 P )
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec4 Pf_Pfmin1 = P.xyxy - vec4( Pi, Pi + vec2(1.0) );

    #if 1

    //
    //  classic noise looks much better than improved noise in 2D, and with an efficient hash function runs at about the same speed.
    //  requires 2 random numbers per point.
    //
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_x, hash_y );
    //SGPP_hash_2D( Pi, hash_x, hash_y );
    
    //  calculate the gradient results
    vec4 grad_x = hash_x - 0.49999;
    vec4 grad_y = hash_y - 0.49999;
    vec4 grad_results = inversesqrt( grad_x * grad_x + grad_y * grad_y ) * ( grad_x * Pf_Pfmin1.xzxz + grad_y * Pf_Pfmin1.yyww );
    

    #if 1

    //  Classic Perlin Interpolation
    grad_results *= 1.4142135623730950488016887242097;      //  (optionally) scale things to a strict -1.0->1.0 range    *= 1.0/sqrt(0.5)
    vec2 blend = Interpolation_C2( Pf_Pfmin1.xy );
    vec2 res0 = mix( grad_results.xy, grad_results.zw, blend.y );
    return mix( res0.x, res0.y, blend.x );
    

    #else

    //  Classic Perlin Surflet
    //  http://briansharpe.wordpress.com/2012/03/09/modifications-to-classic-perlin-noise/
    grad_results *= 2.3703703703703703703703703703704;     //  (optionally) scale things to a strict -1.0->1.0 range    *= 1.0/cube(0.75)
    vec4 vecs_len_sq = Pf_Pfmin1 * Pf_Pfmin1;
    vecs_len_sq = vecs_len_sq.xzxz + vecs_len_sq.yyww;
    return dot( Falloff_Xsq_C2( min( vec4(1.0), vecs_len_sq ) ), grad_results );
    

    #endif

    #else

    //
    //  2D improved perlin noise.
    //  requires 1 random value per point.
    //  does not look as good as classic in 2D due to only a small number of possible cell types.  But can run a lot faster than classic perlin noise if the hash function is slow
    //
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash = FAST32_hash_2D( Pi );
    //vec4 hash = BBS_hash_2D( Pi );
    //vec4 hash = SGPP_hash_2D( Pi );
    //vec4 hash = BBS_hash_hq_2D( Pi );
    
    //
    //  evaluate the gradients
    //  choose between the 4 diagonal gradients.  ( slightly slower than choosing the axis gradients, but shows less grid artifacts )
    //  NOTE:  diagonals give us a nice strict -1.0->1.0 range without additional scaling
    //  [1.0,1.0] [-1.0,1.0] [1.0,-1.0] [-1.0,-1.0]
    //
    hash -= vec4(0.5);
    vec4 grad_results = Pf_Pfmin1.xzxz * sign( hash ) + Pf_Pfmin1.yyww * sign( abs( hash ) - vec4(0.25) );
    
    //  blend the results and return
    vec2 blend = Interpolation_C2( Pf_Pfmin1.xy );
    vec2 res0 = mix( grad_results.xy, grad_results.zw, blend.y );
    return mix( res0.x, res0.y, blend.x );
    

    #endif

    }

    //
    //  Perlin Noise 3D  ( gradient noise )
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/perlinsample.jpg
    //
    float Perlin3D( vec3 P )
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;
    vec3 Pf_min1 = Pf - 1.0;

    #if 1

    //
    //  classic noise.
    //  requires 3 random values per point.  with an efficient hash function will run faster than improved noise
    //
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hashx0, hashy0, hashz0, hashx1, hashy1, hashz1;
    FAST32_hash_3D( Pi, hashx0, hashy0, hashz0, hashx1, hashy1, hashz1 );
    //SGPP_hash_3D( Pi, hashx0, hashy0, hashz0, hashx1, hashy1, hashz1 );
    
    //  calculate the gradients
    vec4 grad_x0 = hashx0 - 0.49999;
    vec4 grad_y0 = hashy0 - 0.49999;
    vec4 grad_z0 = hashz0 - 0.49999;
    vec4 grad_x1 = hashx1 - 0.49999;
    vec4 grad_y1 = hashy1 - 0.49999;
    vec4 grad_z1 = hashz1 - 0.49999;
    vec4 grad_results_0 = inversesqrt( grad_x0 * grad_x0 + grad_y0 * grad_y0 + grad_z0 * grad_z0 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x0 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y0 + Pf.zzzz * grad_z0 );
    vec4 grad_results_1 = inversesqrt( grad_x1 * grad_x1 + grad_y1 * grad_y1 + grad_z1 * grad_z1 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x1 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y1 + Pf_min1.zzzz * grad_z1 );
    

    #if 1

    //  Classic Perlin Interpolation
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( grad_results_0, grad_results_1, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    float final = mix( res1.x, res1.y, blend.x );
    final *= 1.1547005383792515290182975610039;     //  (optionally) scale things to a strict -1.0->1.0 range    *= 1.0/sqrt(0.75)
    return final;
    

    #else

    //  Classic Perlin Surflet
    //  http://briansharpe.wordpress.com/2012/03/09/modifications-to-classic-perlin-noise/
    Pf *= Pf;
    Pf_min1 *= Pf_min1;
    vec4 vecs_len_sq = vec4( Pf.x, Pf_min1.x, Pf.x, Pf_min1.x ) + vec4( Pf.yy, Pf_min1.yy );
    float final = dot( Falloff_Xsq_C2( min( vec4(1.0), vecs_len_sq + Pf.zzzz ) ), grad_results_0 ) + dot( Falloff_Xsq_C2( min( vec4(1.0), vecs_len_sq + Pf_min1.zzzz ) ), grad_results_1 );
    final *= 2.3703703703703703703703703703704;     //  (optionally) scale things to a strict -1.0->1.0 range    *= 1.0/cube(0.75)
    return final;
    

    #endif

    #else

    //
    //  improved noise.
    //  requires 1 random value per point.  Will run faster than classic noise if a slow hashing function is used
    //
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_lowz, hash_highz;
    FAST32_hash_3D( Pi, hash_lowz, hash_highz );
    //BBS_hash_3D( Pi, hash_lowz, hash_highz );
    //SGPP_hash_3D( Pi, hash_lowz, hash_highz );
    

    #if 0

    //
    //  this will implement Ken Perlin's "improved" classic noise using the 12 mid-edge gradient points.
    //  NOTE:  mid-edge gradients give us a nice strict -1.0->1.0 range without additional scaling
    //  [1,1,0] [-1,1,0] [1,-1,0] [-1,-1,0]
    //  [1,0,1] [-1,0,1] [1,0,-1] [-1,0,-1]
    //  [0,1,1] [0,-1,1] [0,1,-1] [0,-1,-1]
    //
    hash_lowz *= 3.0;
    vec4 grad_results_0_0 = mix( vec2( Pf.y, Pf_min1.y ).xxyy, vec2( Pf.x, Pf_min1.x ).xyxy, lessThan( hash_lowz, 2.0.xxxx ) );
    vec4 grad_results_0_1 = mix( Pf.zzzz, vec2( Pf.y, Pf_min1.y ).xxyy, lessThan( hash_lowz, 1.0.xxxx ) );
    hash_lowz = fract( hash_lowz ) - 0.5;
    vec4 grad_results_0 = grad_results_0_0 * sign( hash_lowz ) + grad_results_0_1 * sign( abs( hash_lowz ) - 0.25.xxxx );
    
    hash_highz *= 3.0;
    vec4 grad_results_1_0 = mix( vec2( Pf.y, Pf_min1.y ).xxyy, vec2( Pf.x, Pf_min1.x ).xyxy, lessThan( hash_highz, 2.0.xxxx ) );
    vec4 grad_results_1_1 = mix( Pf_min1.zzzz, vec2( Pf.y, Pf_min1.y ).xxyy, lessThan( hash_highz, 1.0.xxxx ) );
    hash_highz = fract( hash_highz ) - 0.5;
    vec4 grad_results_1 = grad_results_1_0 * sign( hash_highz ) + grad_results_1_1 * sign( abs( hash_highz ) - 0.25.xxxx );
    
    //  blend the gradients and return
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( grad_results_0, grad_results_1, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    return mix( res1.x, res1.y, blend.x );
    

    #else

    //
    //  "improved" noise using 8 corner gradients.  Faster than the 12 mid-edge point method.
    //  Ken mentions using diagonals like this can cause "clumping", but we'll live with that.
    //  [1,1,1]  [-1,1,1]  [1,-1,1]  [-1,-1,1]
    //  [1,1,-1] [-1,1,-1] [1,-1,-1] [-1,-1,-1]
    //
    hash_lowz -= 0.5.xxxx;
    vec4 grad_results_0_0 = vec2( Pf.x, Pf_min1.x ).xyxy * sign( hash_lowz );
    hash_lowz = abs( hash_lowz ) - 0.25.xxxx;
    vec4 grad_results_0_1 = vec2( Pf.y, Pf_min1.y ).xxyy * sign( hash_lowz );
    vec4 grad_results_0_2 = Pf.zzzz * sign( abs( hash_lowz ) - 0.125.xxxx );
    vec4 grad_results_0 = grad_results_0_0 + grad_results_0_1 + grad_results_0_2;
    
    hash_highz -= 0.5.xxxx;
    vec4 grad_results_1_0 = vec2( Pf.x, Pf_min1.x ).xyxy * sign( hash_highz );
    hash_highz = abs( hash_highz ) - 0.25.xxxx;
    vec4 grad_results_1_1 = vec2( Pf.y, Pf_min1.y ).xxyy * sign( hash_highz );
    vec4 grad_results_1_2 = Pf_min1.zzzz * sign( abs( hash_highz ) - 0.125.xxxx );
    vec4 grad_results_1 = grad_results_1_0 + grad_results_1_1 + grad_results_1_2;
    
    //  blend the gradients and return
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( grad_results_0, grad_results_1, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    return mix( res1.x, res1.y, blend.x ) * (2.0 / 3.0);    //  (optionally) mult by (2.0/3.0) to scale to a strict -1.0->1.0 range
    

    #endif

    #endif

    }
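
    //
    //  Sketch ( not part of the original library ):  the 3-step corner blend that the
    //  interpolated noises in this file repeat inline, pulled out into a helper so the
    //  corner packing is easier to see.  The name Blend_Corners_3D_Sketch is hypothetical.
    //  Assumes the packing used above:  .x = (x0,y0)  .y = (x1,y0)  .z = (x0,y1)  .w = (x1,y1)
    //
    float Blend_Corners_3D_Sketch( vec4 corners_lowz, vec4 corners_highz, vec3 blend )
    {
    vec4 res0 = mix( corners_lowz, corners_highz, blend.z );    //  collapse z:  8 corners -> 4
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );               //  collapse y:  4 corners -> 2
    return mix( res1.x, res1.y, blend.x );                      //  collapse x:  2 corners -> 1
    }
    //  possible usage inside a noise function:
    //  return Blend_Corners_3D_Sketch( grad_results_0, grad_results_1, Interpolation_C2( Pf ) );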

    //
    //  ValuePerlin Noise 2D  ( value gradient noise )
    //  A uniform blend between value and perlin noise
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/valueperlinsample.jpg
    //
    float ValuePerlin2D( vec2 P, float blend_val )
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec4 Pf_Pfmin1 = P.xyxy - vec4( Pi, Pi + vec2(1.0) );

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_value, hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_value, hash_x, hash_y );
    
    //  calculate the gradient results
    vec4 grad_x = hash_x - 0.49999;
    vec4 grad_y = hash_y - 0.49999;
    vec4 grad_results = inversesqrt( grad_x * grad_x + grad_y * grad_y ) * ( grad_x * Pf_Pfmin1.xzxz + grad_y * Pf_Pfmin1.yyww );
    grad_results *= 1.4142135623730950488016887242097;      //  scale the perlin component to a -1.0->1.0 range    *= 1.0/sqrt(0.5)
    grad_results = mix( (hash_value * 2.0 - 1.0), grad_results, blend_val );
    
    //  blend the results and return
    vec2 blend = Interpolation_C2( Pf_Pfmin1.xy );
    vec2 res0 = mix( grad_results.xy, grad_results.zw, blend.y );
    return mix( res0.x, res0.y, blend.x );
    

    }
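
    //
    //  Sketch ( not part of the original library ):  the two ends of blend_val.
    //  In ValuePerlin2D each corner result is mix( value, gradient, blend_val ),
    //  so 0.0 selects pure value noise and 1.0 selects pure perlin noise.
    //  The wrapper names below are hypothetical.
    //
    float ValuePerlin2D_ValueOnly_Sketch( vec2 P )   { return ValuePerlin2D( P, 0.0 ); }
    float ValuePerlin2D_PerlinOnly_Sketch( vec2 P )  { return ValuePerlin2D( P, 1.0 ); }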

    //
    //  ValuePerlin Noise 3D  ( value gradient noise )
    //  A uniform blend between value and perlin noise
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2011/11/valueperlinsample.jpg
    //
    float ValuePerlin3D( vec3 P, float blend_val )
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;
    vec3 Pf_min1 = Pf - 1.0;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_value0, hashx0, hashy0, hashz0, hash_value1, hashx1, hashy1, hashz1;
    FAST32_hash_3D( Pi, hash_value0, hashx0, hashy0, hashz0, hash_value1, hashx1, hashy1, hashz1 );
    
    //  calculate the gradients
    vec4 grad_x0 = hashx0 - 0.49999;
    vec4 grad_y0 = hashy0 - 0.49999;
    vec4 grad_z0 = hashz0 - 0.49999;
    vec4 grad_x1 = hashx1 - 0.49999;
    vec4 grad_y1 = hashy1 - 0.49999;
    vec4 grad_z1 = hashz1 - 0.49999;
    vec4 grad_results_0 = inversesqrt( grad_x0 * grad_x0 + grad_y0 * grad_y0 + grad_z0 * grad_z0 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x0 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y0 + Pf.zzzz * grad_z0 );
    vec4 grad_results_1 = inversesqrt( grad_x1 * grad_x1 + grad_y1 * grad_y1 + grad_z1 * grad_z1 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x1 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y1 + Pf_min1.zzzz * grad_z1 );
    grad_results_0 *= 1.1547005383792515290182975610039;        //  scale the perlin component to a -1.0->1.0 range    *= 1.0/sqrt(0.75)
    grad_results_1 *= 1.1547005383792515290182975610039;
    grad_results_0 = mix( (hash_value0 * 2.0 - 1.0), grad_results_0, blend_val );
    grad_results_1 = mix( (hash_value1 * 2.0 - 1.0), grad_results_1, blend_val );
    
    //  blend the gradients and return
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( grad_results_0, grad_results_1, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    return mix( res1.x, res1.y, blend.x );
    

    }

    //
    //  Cubist Noise 2D
    //  http://briansharpe.files.wordpress.com/2011/12/cubistsample.jpg
    //
    //  Generates a noise which resembles a cubist-style painting pattern.  Final Range 0.0->1.0
    //  NOTE:  contains discontinuities.  best used only for texturing.
    //  NOTE:  Any serious game implementation should hard-code these parameter values for efficiency.
    //
    float Cubist2D( vec2 P, vec2 range_clamp )  // range_clamp.x = low, range_clamp.y = 1.0/(high-low).  suggest value low=-2.0  high=1.0
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec4 Pf_Pfmin1 = P.xyxy - vec4( Pi, Pi + vec2(1.0) );

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y, hash_value;
    FAST32_hash_2D( Pi, hash_x, hash_y, hash_value );
    
    //  calculate the gradient results
    vec4 grad_x = hash_x - 0.49999;
    vec4 grad_y = hash_y - 0.49999;
    vec4 grad_results = inversesqrt( grad_x * grad_x + grad_y * grad_y ) * ( grad_x * Pf_Pfmin1.xzxz + grad_y * Pf_Pfmin1.yyww );
    
    //  invert the gradient to convert from perlin to cubist
    grad_results = ( hash_value - 0.5 ) * ( 1.0 / grad_results );
    
    //  blend the results and return
    vec2 blend = Interpolation_C2( Pf_Pfmin1.xy );
    vec2 res0 = mix( grad_results.xy, grad_results.zw, blend.y );
    float final = mix( res0.x, res0.y, blend.x );
    
    //  the 1.0/grad calculation can push the result towards +-infinity.  Need to clamp to keep things sane
    return clamp( ( final - range_clamp.x ) * range_clamp.y, 0.0, 1.0 );
    //return smoothstep( 0.0, 1.0, ( final - range_clamp.x ) * range_clamp.y );     //  experiments.  smoothstep doesn't look as good, but does remove some discontinuities....
    

    }
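
    //
    //  Sketch ( not part of the original library ):  building the range_clamp argument.
    //  Cubist2D expects range_clamp.x = low and range_clamp.y = 1.0/(high-low).
    //  Cubist_MakeRangeClamp_Sketch is a hypothetical helper name;  high must not equal low.
    //
    vec2 Cubist_MakeRangeClamp_Sketch( float low, float high )
    {
    return vec2( low, 1.0 / ( high - low ) );
    }
    //  possible usage with the suggested values  ( low=-2.0  high=1.0 ):
    //  float n = Cubist2D( P, Cubist_MakeRangeClamp_Sketch( -2.0, 1.0 ) );     //  same as Cubist2D( P, vec2( -2.0, 1.0/3.0 ) )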

    //
    //  Cubist Noise 3D
    //  http://briansharpe.files.wordpress.com/2011/12/cubistsample.jpg
    //
    //  Generates a noise which resembles a cubist-style painting pattern.  Final Range 0.0->1.0
    //  NOTE:  contains discontinuities.  best used only for texturing.
    //  NOTE:  Any serious game implementation should hard-code these parameter values for efficiency.
    //
    float Cubist3D( vec3 P, vec2 range_clamp )  // range_clamp.x = low, range_clamp.y = 1.0/(high-low).  suggest value low=-2.0  high=1.0
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;
    vec3 Pf_min1 = Pf - 1.0;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hashx0, hashy0, hashz0, hash_value0, hashx1, hashy1, hashz1, hash_value1;
    FAST32_hash_3D( Pi, hashx0, hashy0, hashz0, hash_value0, hashx1, hashy1, hashz1, hash_value1 );
    
    //  calculate the gradients
    vec4 grad_x0 = hashx0 - 0.49999;
    vec4 grad_y0 = hashy0 - 0.49999;
    vec4 grad_z0 = hashz0 - 0.49999;
    vec4 grad_x1 = hashx1 - 0.49999;
    vec4 grad_y1 = hashy1 - 0.49999;
    vec4 grad_z1 = hashz1 - 0.49999;
    vec4 grad_results_0 = inversesqrt( grad_x0 * grad_x0 + grad_y0 * grad_y0 + grad_z0 * grad_z0 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x0 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y0 + Pf.zzzz * grad_z0 );
    vec4 grad_results_1 = inversesqrt( grad_x1 * grad_x1 + grad_y1 * grad_y1 + grad_z1 * grad_z1 ) * ( vec2( Pf.x, Pf_min1.x ).xyxy * grad_x1 + vec2( Pf.y, Pf_min1.y ).xxyy * grad_y1 + Pf_min1.zzzz * grad_z1 );
    
    //  invert the gradient to convert from perlin to cubist
    grad_results_0 = ( hash_value0 - 0.5 ) * ( 1.0 / grad_results_0 );
    grad_results_1 = ( hash_value1 - 0.5 ) * ( 1.0 / grad_results_1 );
    
    //  blend the gradients and return
    vec3 blend = Interpolation_C2( Pf );
    vec4 res0 = mix( grad_results_0, grad_results_1, blend.z );
    vec2 res1 = mix( res0.xy, res0.zw, blend.y );
    float final = mix( res1.x, res1.y, blend.x );
    
    //  the 1.0/grad calculation can push the result towards +-infinity.  Need to clamp to keep things sane
    return clamp( ( final - range_clamp.x ) * range_clamp.y, 0.0, 1.0 );
    //return smoothstep( 0.0, 1.0, ( final - range_clamp.x ) * range_clamp.y );     //  experiments.  smoothstep doesn't look as good, but does remove some discontinuities....
    

    }

    //  convert a 0.0->1.0 sample to a -1.0->1.0 sample weighted towards the extremes
    vec4 Cellular_weight_samples( vec4 samples )
    {
    samples = samples * 2.0 - 1.0;
    //return (1.0 - samples * samples) * sign(samples);     // square
    return (samples * samples * samples) - sign(samples);   // cubic (even more variance)
    }

    //
    //  Cellular Noise 2D
    //  Based off Stefan Gustavson's work at http://www.itn.liu.se/~stegu/GLSL-cellular
    //  http://briansharpe.files.wordpress.com/2011/12/cellularsample.jpg
    //
    //  Speed up by using 2x2 search window instead of 3x3
    //  produces a range of 0.0->1.0
    //
    float Cellular2D(vec2 P)
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec2 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_x, hash_y );
    //SGPP_hash_2D( Pi, hash_x, hash_y );
    
    //  generate the 4 random points
    

    #if 1

    //  restrict the random point offset to eliminate artifacts
    //  we'll improve the variance of the noise by pushing the points to the extremes of the jitter window
    const float JITTER_WINDOW = 0.25;   // 0.25 will guarantee no artifacts.  0.25 is the intersection on x of graphs f(x)=( (0.5+(0.5-x))^2 + (0.5-x)^2 ) and f(x)=( (0.5+x)^2 + x^2 )
    hash_x = Cellular_weight_samples( hash_x ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y = Cellular_weight_samples( hash_y ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    

    #else

    //  non-weighted jitter window.  jitter window of 0.4 will give results similar to Stefan's original implementation
    //  nicer looking, faster, but has minor artifacts.  ( discontinuities in signal )
    const float JITTER_WINDOW = 0.4;
    hash_x = hash_x * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, 1.0-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW);
    hash_y = hash_y * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW);
    

    #endif

    //  return the closest squared distance
    vec4 dx = Pf.xxxx - hash_x;
    vec4 dy = Pf.yyyy - hash_y;
    vec4 d = dx * dx + dy * dy;
    d.xy = min(d.xy, d.zw);
    return min(d.x, d.y) * ( 1.0 / 1.125 ); //  scale return value from 0.0->1.125 to 0.0->1.0  ( 0.75^2 * 2.0  == 1.125 )
    

    }
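
    //
    //  Sketch ( not part of the original library ):  Cellular2D returns the closest *squared*
    //  distance, rescaled to 0.0->1.0.  If a falloff that is linear in distance is wanted,
    //  taking the square root of that result keeps it in the 0.0->1.0 range.
    //  The function name is hypothetical.
    //
    float Cellular2D_Distance_Sketch( vec2 P )
    {
    return sqrt( Cellular2D( P ) );
    }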

    //
    //  Cellular Noise 3D
    //  Based off Stefan Gustavson's work at http://www.itn.liu.se/~stegu/GLSL-cellular
    //  http://briansharpe.files.wordpress.com/2011/12/cellularsample.jpg
    //
    //  Speed up by using 2x2x2 search window instead of 3x3x3
    //  produces range of 0.0->1.0
    //
    float Cellular3D(vec3 P)
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1;
    FAST32_hash_3D( Pi, hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1 );
    //SGPP_hash_3D( Pi, hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1 );
    
    //  generate the 8 random points
    

    #if 1

    //  restrict the random point offset to eliminate artifacts
    //  we'll improve the variance of the noise by pushing the points to the extremes of the jitter window
    const float JITTER_WINDOW = 0.166666666;    // 0.166666666 will guarantee no artifacts. It is the intersection on x of graphs f(x)=( (0.5 + (0.5-x))^2 + 2*((0.5-x)^2) ) and f(x)=( 2 * (( 0.5 + x )^2) + x * x )
    hash_x0 = Cellular_weight_samples( hash_x0 ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y0 = Cellular_weight_samples( hash_y0 ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    hash_x1 = Cellular_weight_samples( hash_x1 ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y1 = Cellular_weight_samples( hash_y1 ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    hash_z0 = Cellular_weight_samples( hash_z0 ) * JITTER_WINDOW + vec4(0.0, 0.0, 0.0, 0.0);
    hash_z1 = Cellular_weight_samples( hash_z1 ) * JITTER_WINDOW + vec4(1.0, 1.0, 1.0, 1.0);
    

    #else

    //  non-weighted jitter window.  jitter window of 0.4 will give results similar to Stefan's original implementation
    //  nicer looking, faster, but has minor artifacts.  ( discontinuities in signal )
    const float JITTER_WINDOW = 0.4;
    hash_x0 = hash_x0 * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, 1.0-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW);
    hash_y0 = hash_y0 * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW);
    hash_x1 = hash_x1 * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, 1.0-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW);
    hash_y1 = hash_y1 * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, -JITTER_WINDOW, 1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW);
    hash_z0 = hash_z0 * JITTER_WINDOW * 2.0 + vec4(-JITTER_WINDOW, -JITTER_WINDOW, -JITTER_WINDOW, -JITTER_WINDOW);
    hash_z1 = hash_z1 * JITTER_WINDOW * 2.0 + vec4(1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW, 1.0-JITTER_WINDOW);
    

    #endif

    //  return the closest squared distance
    vec4 dx1 = Pf.xxxx - hash_x0;
    vec4 dy1 = Pf.yyyy - hash_y0;
    vec4 dz1 = Pf.zzzz - hash_z0;
    vec4 dx2 = Pf.xxxx - hash_x1;
    vec4 dy2 = Pf.yyyy - hash_y1;
    vec4 dz2 = Pf.zzzz - hash_z1;
    vec4 d1 = dx1 * dx1 + dy1 * dy1 + dz1 * dz1;
    vec4 d2 = dx2 * dx2 + dy2 * dy2 + dz2 * dz2;
    d1 = min(d1, d2);
    d1.xy = min(d1.xy, d1.wz);
    return min(d1.x, d1.y) * ( 9.0 / 12.0 );    //  scale return value from 0.0->1.333333 to 0.0->1.0   (2/3)^2 * 3  == (12/9) == 1.333333
    

    }

    /*
    //
    //  SparseConvolution2D
    //
    //  Very crude approximation of sparse convolution noise.  ( derived from the Cellular2D implementation )
    //  return value scaling to 0.0->1.0 range TODO
    //
    float SparseConvolution2D(vec2 P)
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec2 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_x, hash_y );
    //SGPP_hash_2D( Pi, hash_x, hash_y );
    
    //  generate the 4 random points
    //  restrict the random point offset to eliminate artifacts
    //  we'll improve the variance of the noise by pushing the points to the extremes of the jitter window
    const float JITTER_WINDOW = 0.25;   // 0.25 will guarantee no artifacts.  0.25 is the intersection on x of graphs f(x)=( (0.5+(0.5-x))^2 + (0.5-x)^2 ) and f(x)=( (0.5+x)^2 + x^2 )
    hash_x = Cellular_weight_samples( hash_x ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y = Cellular_weight_samples( hash_y ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    
    //  find the squared distance to each point
    vec4 dx = Pf.xxxx - hash_x;
    vec4 dy = Pf.yyyy - hash_y;
    vec4 d = dx * dx + dy * dy;
    
    //  sum kernels and return
    const float RADIUS = 1.0 - JITTER_WINDOW;
    d *= ( ( 1.0 / RADIUS ) * ( 1.0 / RADIUS ) );
    return dot( Falloff_Xsq_C2( min( d, 1.0.xxxx ) ), 1.0.xxxx );
    

    }

    //
    //  SparseConvolution3D
    //
    //  Very crude approximation of sparse convolution noise.  ( derived from the Cellular3D implementation )
    //  return value scaling to 0.0->1.0 range TODO
    //
    float SparseConvolution3D(vec3 P)
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1;
    FAST32_hash_3D( Pi, hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1 );
    //SGPP_hash_3D( Pi, hash_x0, hash_y0, hash_z0, hash_x1, hash_y1, hash_z1 );
    
    //  generate the 8 random points
    //  restrict the random point offset to eliminate artifacts
    //  we'll improve the variance of the noise by pushing the points to the extremes of the jitter window
    const float JITTER_WINDOW = 0.166666666;    // 0.166666666 will guarantee no artifacts. It is the intersection on x of graphs f(x)=( (0.5 + (0.5-x))^2 + 2*((0.5-x)^2) ) and f(x)=( 2 * (( 0.5 + x )^2) + x * x )
    hash_x0 = Cellular_weight_samples( hash_x0 ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y0 = Cellular_weight_samples( hash_y0 ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    hash_x1 = Cellular_weight_samples( hash_x1 ) * JITTER_WINDOW + vec4(0.0, 1.0, 0.0, 1.0);
    hash_y1 = Cellular_weight_samples( hash_y1 ) * JITTER_WINDOW + vec4(0.0, 0.0, 1.0, 1.0);
    hash_z0 = Cellular_weight_samples( hash_z0 ) * JITTER_WINDOW + vec4(0.0, 0.0, 0.0, 0.0);
    hash_z1 = Cellular_weight_samples( hash_z1 ) * JITTER_WINDOW + vec4(1.0, 1.0, 1.0, 1.0);
    
    //  find the squared distance to each point
    vec4 dx1 = Pf.xxxx - hash_x0;
    vec4 dy1 = Pf.yyyy - hash_y0;
    vec4 dz1 = Pf.zzzz - hash_z0;
    vec4 dx2 = Pf.xxxx - hash_x1;
    vec4 dy2 = Pf.yyyy - hash_y1;
    vec4 dz2 = Pf.zzzz - hash_z1;
    vec4 d1 = dx1 * dx1 + dy1 * dy1 + dz1 * dz1;
    vec4 d2 = dx2 * dx2 + dy2 * dy2 + dz2 * dz2;
    
    //  sum kernels and return
    const float RADIUS = ( 1.0 - JITTER_WINDOW );
    d1 *= ( ( 1.0 / RADIUS ) * ( 1.0 / RADIUS ) );
    d2 *= ( ( 1.0 / RADIUS ) * ( 1.0 / RADIUS ) );
    return dot( Falloff_Xsq_C2( min( d1, 1.0.xxxx ) ) + Falloff_Xsq_C2( min( d2, 1.0.xxxx ) ), 1.0.xxxx );
    

    } */

    //
    //  PolkaDot Noise 2D
    //  http://briansharpe.files.wordpress.com/2011/12/polkadotsample.jpg
    //  http://briansharpe.files.wordpress.com/2012/01/polkaboxsample.jpg
    //  TODO, these images have random intensity and random radius.  This noise now has intensity as proportion to radius.  Images need updated.  TODO
    //
    //  Generates a noise of smooth falloff polka dots.
    //  Allow for control on radius.  Intensity is proportional to radius
    //  Return value range of 0.0->1.0
    //
    float PolkaDot2D(   vec2 P,
                        float radius_low,       //  radius range is 0.0->1.0
                        float radius_high )
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec2 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash = FAST32_hash_2D_Cell( Pi );
    
    //  user variables
    float RADIUS = max( 0.0, radius_low + hash.z * ( radius_high - radius_low ) );
    float VALUE = RADIUS / max( radius_high, radius_low );  //  new:  keep value in proportion to radius.  Behaves better when used for bumpmapping, distortion and displacement
    
    //  calc the noise and return
    RADIUS = 2.0/RADIUS;
    Pf *= RADIUS;
    Pf -= ( RADIUS - 1.0 );
    Pf += hash.xy * ( RADIUS - 2.0 );
    //Pf *= Pf;     //  this gives us a cool box looking effect
    return Falloff_Xsq_C2( min( dot( Pf, Pf ), 1.0 ) ) * VALUE;
    

    }
    //  PolkaDot2D_FixedRadius, PolkaDot2D_FixedValue, PolkaDot2D_FixedRadius_FixedValue TODO
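
    //
    //  Sketch ( not the author's planned FixedRadius variants, just one way to get a similar effect with the existing function ):
    //  passing the same value for radius_low and radius_high makes RADIUS constant,
    //  and VALUE = RADIUS / max( radius_high, radius_low ) then becomes 1.0 for every dot.
    //  Assumes radius > 0.0.  The function name is hypothetical.
    //
    float PolkaDot2D_SameRadius_Sketch( vec2 P, float radius )
    {
    return PolkaDot2D( P, radius, radius );
    }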

    //
    //  PolkaDot Noise 3D
    //  http://briansharpe.files.wordpress.com/2011/12/polkadotsample.jpg
    //  http://briansharpe.files.wordpress.com/2012/01/polkaboxsample.jpg
    //  TODO, these images have random intensity and random radius.  This noise now has intensity as proportion to radius.  Images need updated.  TODO
    //
    //  Generates a noise of smooth falloff polka dots.
    //  Allow for control on radius.  Intensity is proportional to radius
    //  Return value range of 0.0->1.0
    //
    float PolkaDot3D(   vec3 P,
                        float radius_low,       //  radius range is 0.0->1.0
                        float radius_high )
    {
    //  establish our grid cell and unit position
    vec3 Pi = floor(P);
    vec3 Pf = P - Pi;

    //  calculate the hash.
    vec4 hash = FAST32_hash_3D_Cell( Pi );
    
    //  user variables
    float RADIUS = max( 0.0, radius_low + hash.w * ( radius_high - radius_low ) );
    float VALUE = RADIUS / max( radius_high, radius_low );  //  new:  keep value in proportion to radius.  Behaves better when used for bumpmapping, distortion and displacement
    
    //  calc the noise and return
    RADIUS = 2.0/RADIUS;
    Pf *= RADIUS;
    Pf -= ( RADIUS - 1.0 );
    Pf += hash.xyz * ( RADIUS - 2.0 );
    //Pf *= Pf;     //  this gives us a cool box looking effect
    return Falloff_Xsq_C2( min( dot( Pf, Pf ), 1.0 ) ) * VALUE;
    

    }
    //  PolkaDot3D_FixedRadius, PolkaDot3D_FixedValue, PolkaDot3D_FixedRadius_FixedValue TODO

    //
    //  Stars2D
    //  http://briansharpe.files.wordpress.com/2011/12/starssample.jpg
    //
    //  procedural texture for creating a starry background.  ( looks good when combined with a nebula/space-like colour texture )
    //  NOTE:  Any serious game implementation should hard-code these parameter values for efficiency.
    //
    //  Return value range of 0.0->1.0
    //
    float Stars2D(  vec2 P,
                    float probability_threshold,    //  probability a star will be drawn  ( 0.0->1.0 )
                    float max_dimness,              //  the maximal dimness of a star ( 0.0->1.0   0.0 = all stars bright,  1.0 = maximum variation )
                    float two_over_radius )         //  fixed radius for the stars.  radius range is 0.0->1.0.  shader requires 2.0/radius as input.
    {
    //  establish our grid cell and unit position
    vec2 Pi = floor(P);
    vec2 Pf = P - Pi;

    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash = FAST32_hash_2D_Cell( Pi );
    //vec4 hash = FAST32_hash_2D( Pi * 2.0 );       //  Need to multiply by 2.0 here because we want to use all 4 corners once per cell.  No sharing with other cells.  It helps if the hash function has an odd domain.
    //vec4 hash = BBS_hash_2D( Pi * 2.0 );
    //vec4 hash = SGPP_hash_2D( Pi * 2.0 );
    //vec4 hash = BBS_hash_hq_2D( Pi * 2.0 );
    
    //  user variables
    float VALUE = 1.0 - max_dimness * hash.z;
    
    //  calc the noise and return
    Pf *= two_over_radius;
    Pf -= ( two_over_radius - 1.0 );
    Pf += hash.xy * ( two_over_radius - 2.0 );
    return ( hash.w < probability_threshold ) ? ( Falloff_Xsq_C1( min( dot( Pf, Pf ), 1.0 ) ) * VALUE ) : 0.0;
    

    }
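
    //
    //  Sketch ( not part of the original library ):  Stars2D takes 2.0/radius rather than the radius itself.
    //  A hypothetical wrapper that accepts the radius directly  ( radius range 0.0->1.0, must be > 0.0 ).
    //  As the NOTE above says, a real implementation would hard-code 2.0/radius instead of dividing per call.
    //
    float Stars2D_FromRadius_Sketch( vec2 P, float probability_threshold, float max_dimness, float radius )
    {
    return Stars2D( P, probability_threshold, max_dimness, 2.0 / radius );
    }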

    //
    //  SimplexPerlin2D  ( simplex gradient noise )
    //  Perlin noise over a simplex (triangular) grid
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexperlinsample.jpg
    //
    //  Implementation originally based off Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise
    //
    float SimplexPerlin2D( vec2 P )
    {
    //  simplex math constants
    const float SKEWFACTOR = 0.36602540378443864676372317075294;            // 0.5*(sqrt(3.0)-1.0)
    const float UNSKEWFACTOR = 0.21132486540518711774542560974902;          // (3.0-sqrt(3.0))/6.0
    const float SIMPLEX_TRI_HEIGHT = 0.70710678118654752440084436210485;    // sqrt( 0.5 )  height of simplex triangle
    const vec3 SIMPLEX_POINTS = vec3( 1.0-UNSKEWFACTOR, -UNSKEWFACTOR, 1.0-2.0*UNSKEWFACTOR );      //  vertex info for simplex triangle

    //  establish our grid cell.
    P *= SIMPLEX_TRI_HEIGHT;        // scale space so we can have an approx feature size of 1.0  ( optional )
    vec2 Pi = floor( P + dot( P, vec2(SKEWFACTOR) ) );
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_x, hash_y );
    //SGPP_hash_2D( Pi, hash_x, hash_y );
    
    //  establish vectors to the 3 corners of our simplex triangle
    vec2 v0 = Pi - dot( Pi, vec2(UNSKEWFACTOR) ) - P;
    vec4 v1pos_v1hash = (v0.x < v0.y) ? vec4(SIMPLEX_POINTS.xy, hash_x.y, hash_y.y) : vec4(SIMPLEX_POINTS.yx, hash_x.z, hash_y.z);
    vec4 v12 = vec4( v1pos_v1hash.xy, SIMPLEX_POINTS.zz ) + v0.xyxy;
    
    //  calculate the dotproduct of our 3 corner vectors with 3 random normalized vectors
    vec3 grad_x = vec3( hash_x.x, v1pos_v1hash.z, hash_x.w ) - 0.49999;
    vec3 grad_y = vec3( hash_y.x, v1pos_v1hash.w, hash_y.w ) - 0.49999;
    vec3 grad_results = inversesqrt( grad_x * grad_x + grad_y * grad_y ) * ( grad_x * vec3( v0.x, v12.xz ) + grad_y * vec3( v0.y, v12.yw ) );
    
    const float FINAL_NORMALIZATION = 99.204310604478759765467803137703;    //  scales the final result to a strict -1.0->1.0 range
    
    //  evaluate the surflet, sum and return
    vec3 m = vec3( v0.x, v12.xz ) * vec3( v0.x, v12.xz ) + vec3( v0.y, v12.yw ) * vec3( v0.y, v12.yw );
    m = max(0.5 - m, 0.0);      //  The 0.5 here is SIMPLEX_TRI_HEIGHT^2
    m = m*m;
    m = m*m;
    return dot(m, grad_results) * FINAL_NORMALIZATION;
    

    }

    //
    //  SimplexPolkaDot2D
    //  polkadots over a simplex (triangular) grid
    //  Return value range of 0.0->1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexpolkadotsample.jpg
    //
    float SimplexPolkaDot2D(    vec2 P,
                                float radius,       //  radius range is 0.0->1.0
                                float max_dimness ) //  the maximal dimness of a dot ( 0.0->1.0   0.0 = all dots bright,  1.0 = maximum variation )
    {
    //  simplex math based off Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise

    //  simplex math constants
    const float SKEWFACTOR = 0.36602540378443864676372317075294;            // 0.5*(sqrt(3.0)-1.0)
    const float UNSKEWFACTOR = 0.21132486540518711774542560974902;          // (3.0-sqrt(3.0))/6.0
    const float SIMPLEX_TRI_HEIGHT = 0.70710678118654752440084436210485;    // sqrt( 0.5 )  height of simplex triangle
    const float INV_SIMPLEX_TRI_HALF_EDGELEN = 2.4494897427831780981972840747059;   // (2.0*sqrt( 0.75 ))/sqrt( 0.5 )
    const vec3 SIMPLEX_POINTS = vec3( 1.0-UNSKEWFACTOR, -UNSKEWFACTOR, 1.0-2.0*UNSKEWFACTOR );      //  vertex info for simplex triangle
    
    //  establish our grid cell.
    P *= SIMPLEX_TRI_HEIGHT;        // scale space so we can have an approx feature size of 1.0  ( optional )
    vec2 Pi = floor( P + dot( P, vec2(SKEWFACTOR) ) );
    
    //  establish vectors to the 4 corners of our skewed grid cell  ( shared by its 2 simplex triangles )
    vec2 v0 = ( Pi - dot( Pi, vec2(UNSKEWFACTOR) ) - P );
    vec4 v0123_x = vec4( 0.0, SIMPLEX_POINTS.xyz ) + v0.x;
    vec4 v0123_y = vec4( 0.0, SIMPLEX_POINTS.yxz ) + v0.y;
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash = FAST32_hash_2D( Pi );
    //vec4 hash = BBS_hash_2D( Pi );
    //vec4 hash = SGPP_hash_2D( Pi );
    //vec4 hash = BBS_hash_hq_2D( Pi );
    
    //  apply user controls
    radius = INV_SIMPLEX_TRI_HALF_EDGELEN/radius;       //  INV_SIMPLEX_TRI_HALF_EDGELEN here is to scale to a nice 0.0->1.0 range
    v0123_x *= radius;
    v0123_y *= radius;
    
    //  return a smooth falloff from the closest point.  ( we use a f(x)=(1.0-x*x)^3 falloff )
    vec4 point_distance = max( vec4(0.0), 1.0 - ( v0123_x*v0123_x + v0123_y*v0123_y ) );
    point_distance = point_distance*point_distance*point_distance;
    return dot( 1.0 - hash * max_dimness, point_distance );
    

    }

    //
    //  SimplexCellular2D
    //  cellular noise over a simplex (triangular) grid
    //  Return value range of 0.0->~1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexcellularsample.jpg
    //
    //  TODO:  scaling of return value to strict 0.0->1.0 range
    //
    float SimplexCellular2D( vec2 P )
    {
    //  simplex math based off Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise

    //  simplex math constants
    const float SKEWFACTOR = 0.36602540378443864676372317075294;            // 0.5*(sqrt(3.0)-1.0)
    const float UNSKEWFACTOR = 0.21132486540518711774542560974902;          // (3.0-sqrt(3.0))/6.0
    const float SIMPLEX_TRI_HEIGHT = 0.70710678118654752440084436210485;    // sqrt( 0.5 )  height of simplex triangle.
    const float INV_SIMPLEX_TRI_HEIGHT = 1.4142135623730950488016887242097; //  1.0 / sqrt( 0.5 )
    const vec3 SIMPLEX_POINTS = vec3( 1.0-UNSKEWFACTOR, -UNSKEWFACTOR, 1.0-2.0*UNSKEWFACTOR ) * INV_SIMPLEX_TRI_HEIGHT;     //  vertex info for simplex triangle
    
    //  establish our grid cell.
    P *= SIMPLEX_TRI_HEIGHT;        // scale space so we can have an approx feature size of 1.0  ( optional )
    vec2 Pi = floor( P + dot( P, vec2(SKEWFACTOR) ) );
    
    //  calculate the hash.
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x, hash_y;
    FAST32_hash_2D( Pi, hash_x, hash_y );
    //SGPP_hash_2D( Pi, hash_x, hash_y );
    
    //  push hash values to extremes of jitter window
    const float JITTER_WINDOW = ( 0.10566243270259355887271280487451 * INV_SIMPLEX_TRI_HEIGHT );        // this will guarantee no artifacts.
    hash_x = Cellular_weight_samples( hash_x ) * JITTER_WINDOW;
    hash_y = Cellular_weight_samples( hash_y ) * JITTER_WINDOW;
    
    //  calculate sq distance to closest point
    vec2 p0 = ( ( Pi - dot( Pi, vec2(UNSKEWFACTOR) ) ) - P ) * INV_SIMPLEX_TRI_HEIGHT;
    hash_x += p0.xxxx;
    hash_y += p0.yyyy;
    hash_x.yzw += SIMPLEX_POINTS.xyz;
    hash_y.yzw += SIMPLEX_POINTS.yxz;
    vec4 distsq = hash_x*hash_x + hash_y*hash_y;
    vec2 tmp = min( distsq.xy, distsq.zw );
    return min( tmp.x, tmp.y );
    

    }

    //
    //  Given an arbitrary 3D point this calculates the 4 vectors from the corners of the simplex pyramid to the point
    //  It also returns the integer grid index information for the corners
    //
    void Simplex3D_GetCornerVectors(    vec3 P,             //  input point
                                        out vec3 Pi,        //  integer grid index for the origin
                                        out vec3 Pi_1,      //  offsets for the 2nd and 3rd corners.  ( the 4th = Pi + 1.0.xxx )
                                        out vec3 Pi_2,
                                        out vec4 v1234_x,   //  vectors from the 4 corners to the input point
                                        out vec4 v1234_y,
                                        out vec4 v1234_z )
    {
    //
    //  Simplex math from Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise
    //

    //  simplex math constants
    const float SKEWFACTOR = 1.0/3.0;
    const float UNSKEWFACTOR = 1.0/6.0;
    const float SIMPLEX_CORNER_POS = 0.5;
    const float SIMPLEX_PYRAMID_HEIGHT = 0.70710678118654752440084436210485;    // sqrt( 0.5 )  height of simplex pyramid.
    
    P *= SIMPLEX_PYRAMID_HEIGHT;        // scale space so we can have an approx feature size of 1.0  ( optional )
    
    //  Find the vectors to the corners of our simplex pyramid
    Pi = floor( P + dot(P, vec3(SKEWFACTOR)) );
    vec3 x0 = P - Pi + dot(Pi, vec3(UNSKEWFACTOR));
    vec3 g = step(x0.yzx, x0.xyz);
    vec3 l = 1.0 - g;
    Pi_1 = min( g.xyz, l.zxy );
    Pi_2 = max( g.xyz, l.zxy );
    vec3 x1 = x0 - Pi_1 + UNSKEWFACTOR;
    vec3 x2 = x0 - Pi_2 + SKEWFACTOR;
    vec3 x3 = x0 - SIMPLEX_CORNER_POS;
    
    //  pack them into a parallel-friendly arrangement
    v1234_x = vec4( x0.x, x1.x, x2.x, x3.x );
    v1234_y = vec4( x0.y, x1.y, x2.y, x3.y );
    v1234_z = vec4( x0.z, x1.z, x2.z, x3.z );
    

    }
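
As a CPU-side cross-check of the corner selection above, here is a minimal Python sketch of the same skew / step / min / max logic. It is an illustrative port, not part of the library: the function name `simplex3d_corners` is hypothetical, and the optional feature-size scaling by SIMPLEX_PYRAMID_HEIGHT is left out.

```python
import math

SKEWFACTOR = 1.0 / 3.0
UNSKEWFACTOR = 1.0 / 6.0
SIMPLEX_CORNER_POS = 0.5


def simplex3d_corners(p):
    """Return (Pi, Pi_1, Pi_2, [x0, x1, x2, x3]) for a 3D point p.

    Hypothetical helper mirroring Simplex3D_GetCornerVectors, minus the
    optional scaling of P.
    """
    # skew the point onto the integer grid and take the cell origin
    skew = sum(p) * SKEWFACTOR
    pi = [math.floor(c + skew) for c in p]

    # vector from the (unskewed) cell origin to the point
    unskew = sum(pi) * UNSKEWFACTOR
    x0 = [p[i] - pi[i] + unskew for i in range(3)]

    # g = step( x0.yzx, x0.xyz ): 1.0 where x0[i] >= x0[i+1], else 0.0
    g = [1.0 if x0[i] >= x0[(i + 1) % 3] else 0.0 for i in range(3)]
    l = [1.0 - v for v in g]
    l_zxy = [l[2], l[0], l[1]]

    # integer offsets of the 2nd and 3rd corners
    pi_1 = [min(g[i], l_zxy[i]) for i in range(3)]
    pi_2 = [max(g[i], l_zxy[i]) for i in range(3)]

    # vectors from each of the 4 corners to the point
    x1 = [x0[i] - pi_1[i] + UNSKEWFACTOR for i in range(3)]
    x2 = [x0[i] - pi_2[i] + SKEWFACTOR for i in range(3)]
    x3 = [x0[i] - SIMPLEX_CORNER_POS for i in range(3)]
    return pi, pi_1, pi_2, [x0, x1, x2, x3]
```

For a point such as (0.3, 0.2, 0.1), where x is the largest component and z the smallest, the second corner steps along x and the third along x and y.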

    //
    //  Calculate the weights for the 3D simplex surflet
    //
    vec4 Simplex3D_GetSurfletWeights(   vec4 v1234_x,
                                        vec4 v1234_y,
                                        vec4 v1234_z )
    {
    //  Perlin's original implementation uses the surflet falloff formula of (0.6-x*x)^4.
    //  This is buggy as it can cause discontinuities along simplex faces.  (0.5-x*x)^3 solves this and gives an almost identical curve

    //  evaluate surflet. f(x)=(0.5-x*x)^3
    vec4 surflet_weights = v1234_x * v1234_x + v1234_y * v1234_y + v1234_z * v1234_z;
    surflet_weights = max(0.5 - surflet_weights, 0.0);      //  0.5 here represents the closest distance (squared) of any simplex pyramid corner to any of its planes.  ie, SIMPLEX_PYRAMID_HEIGHT^2
    return surflet_weights*surflet_weights*surflet_weights;
    

    }
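
The discontinuity mentioned in the comments above can be checked numerically. The Python sketch below is only an illustration with hypothetical function names. It takes from the source comment that 0.5 is the squared distance from a simplex corner to its nearest plane, and compares the two falloffs at that distance.

```python
def surflet_weight(dist_sq):
    """(0.5 - d^2)^3 falloff, clamped at zero, as in Simplex3D_GetSurfletWeights."""
    w = max(0.5 - dist_sq, 0.0)
    return w * w * w


def classic_weight(dist_sq):
    """(0.6 - d^2)^4 falloff that the comment attributes to Perlin's original."""
    w = max(0.6 - dist_sq, 0.0)
    return w ** 4


# squared distance at which a corner stops influencing the point (per the source comment)
FACE_DIST_SQ = 0.5

print(surflet_weight(FACE_DIST_SQ))   # weight has reached exactly zero at the face
print(classic_weight(FACE_DIST_SQ))   # small but non-zero weight remains at the face
```

A corner whose weight is still non-zero when the point crosses into a neighbouring simplex is dropped abruptly, which is the source of the visible seam. The 0.5 cubic falloff reaches zero exactly at that boundary.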

    //
    //  SimplexPerlin3D  ( simplex gradient noise )
    //  Perlin noise over a simplex (triangular) grid
    //  Return value range of -1.0->1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexperlinsample.jpg
    //
    //  Implementation originally based off Stefan Gustavson's and Ian McEwan's work at...
    //  http://github.com/ashima/webgl-noise
    //
    float SimplexPerlin3D(vec3 P)
    {
    //  calculate the simplex vector and index math
    vec3 Pi;
    vec3 Pi_1;
    vec3 Pi_2;
    vec4 v1234_x;
    vec4 v1234_y;
    vec4 v1234_z;
    Simplex3D_GetCornerVectors( P, Pi, Pi_1, Pi_2, v1234_x, v1234_y, v1234_z );

    //  generate the random vectors
    //  ( various hashing methods listed in order of speed )
    vec4 hash_0;
    vec4 hash_1;
    vec4 hash_2;
    FAST32_hash_3D( Pi, Pi_1, Pi_2, hash_0, hash_1, hash_2 );
    //SGPP_hash_3D( Pi, Pi_1, Pi_2, hash_0, hash_1, hash_2 );
    hash_0 -= 0.49999;
    hash_1 -= 0.49999;
    hash_2 -= 0.49999;
    
    //  evaluate gradients
    vec4 grad_results = inversesqrt( hash_0 * hash_0 + hash_1 * hash_1 + hash_2 * hash_2 ) * ( hash_0 * v1234_x + hash_1 * v1234_y + hash_2 * v1234_z );
    
    const float FINAL_NORMALIZATION = 37.837217149891986479046334729594;    //  scales the final result to a strict 1.0->-1.0 range
    
    //  sum with the surflet and return
    return dot( Simplex3D_GetSurfletWeights( v1234_x, v1234_y, v1234_z ), grad_results ) * FINAL_NORMALIZATION;
    

    }

    //
    //  SimplexCellular3D
    //  cellular noise over a simplex (triangular) grid
    //  Return value range of 0.0->~1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexcellularsample.jpg
    //
    //  TODO:  scaling of return value to strict 0.0->1.0 range
    //
    float SimplexCellular3D( vec3 P )
    {
    //  calculate the simplex vector and index math
    vec3 Pi;
    vec3 Pi_1;
    vec3 Pi_2;
    vec4 v1234_x;
    vec4 v1234_y;
    vec4 v1234_z;
    Simplex3D_GetCornerVectors( P, Pi, Pi_1, Pi_2, v1234_x, v1234_y, v1234_z );

    //  generate the random vectors
    //  ( various hashing methods listed in order of speed )
    vec4 hash_x;
    vec4 hash_y;
    vec4 hash_z;
    FAST32_hash_3D( Pi, Pi_1, Pi_2, hash_x, hash_y, hash_z );
    //SGPP_hash_3D( Pi, Pi_1, Pi_2, hash_x, hash_y, hash_z );
    
    //  push hash values to extremes of jitter window
    const float INV_SIMPLEX_PYRAMID_HEIGHT = 1.4142135623730950488016887242097; //  1.0 / sqrt( 0.5 )   This scales things to a nice 0.0->1.0 range
    const float JITTER_WINDOW = ( 0.0597865779345250670558198111 * INV_SIMPLEX_PYRAMID_HEIGHT );        // this will guarantee no artifacts.
    hash_x = Cellular_weight_samples( hash_x ) * JITTER_WINDOW;
    hash_y = Cellular_weight_samples( hash_y ) * JITTER_WINDOW;
    hash_z = Cellular_weight_samples( hash_z ) * JITTER_WINDOW;
    
    //  offset the vectors.
    v1234_x *= INV_SIMPLEX_PYRAMID_HEIGHT;
    v1234_y *= INV_SIMPLEX_PYRAMID_HEIGHT;
    v1234_z *= INV_SIMPLEX_PYRAMID_HEIGHT;
    v1234_x += hash_x;
    v1234_y += hash_y;
    v1234_z += hash_z;
    
    //  calc the distance^2 to the closest point
    vec4 distsq = v1234_x*v1234_x + v1234_y*v1234_y + v1234_z*v1234_z;
    return min( min( distsq.x, distsq.y ), min( distsq.z, distsq.w ) );
    

    }

    //
    //  SimplexPolkaDot3D
    //  polkadots over a simplex (triangular) grid
    //  Return value range of 0.0->1.0
    //  http://briansharpe.files.wordpress.com/2012/01/simplexpolkadotsample.jpg
    //
    float SimplexPolkaDot3D(    vec3 P,
                                float radius,       //  radius range is 0.0->1.0
                                float max_dimness ) //  the maximal dimness of a dot ( 0.0->1.0   0.0 = all dots bright,  1.0 = maximum variation )
    {
    //  calculate the simplex vector and index math
    vec3 Pi;
    vec3 Pi_1;
    vec3 Pi_2;
    vec4 v1234_x;
    vec4 v1234_y;
    vec4 v1234_z;
    Simplex3D_GetCornerVectors( P, Pi, Pi_1, Pi_2, v1234_x, v1234_y, v1234_z );

    //  calculate the hash
    vec4 hash = FAST32_hash_3D( Pi, Pi_1, Pi_2 );
    
    //  apply user controls
    const float INV_SIMPLEX_TRI_HALF_EDGELEN = 2.3094010767585030580365951220078;   // scale to a 0.0->1.0 range.  2.0 / sqrt( 0.75 )
    radius = INV_SIMPLEX_TR
    
  • Gradients for Cellular2D / Cellular3D

    Hey Brian! Awesome library you have here :) I've added two functions, Cellular2D_Deriv and Cellular3D_Deriv, which return the noise value plus the gradients (similarly to your other *_Deriv functions). You probably know this already from Worley's paper, but the gradient has the same direction as the vector going from the nearest feature point (the one used to compute the noise value) to the input position, which makes the implementation rather trivial with your code. Voilà! Let me know if these functions are useful to you too. In the meantime, I'll be looking forward to reading some new articles from your blog :) Jonathan
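
    The observation about gradient direction can be illustrated with a small CPU-side sketch. This is not the code of Cellular2D_Deriv or Cellular3D_Deriv; the function name and the explicit list of feature points are assumptions made for the example, and it assumes the noise value is the squared distance to the nearest feature point, so the gradient is twice the vector from that point to the input position.

```python
def cellular_with_gradient(p, feature_points):
    """Return (squared distance to nearest feature point, gradient w.r.t. p).

    Hypothetical helper: works for 2D or 3D tuples of floats.
    """
    best_d2 = None
    best_delta = None
    for f in feature_points:
        # vector from the feature point to the input position
        delta = [pc - fc for pc, fc in zip(p, f)]
        d2 = sum(c * c for c in delta)
        if best_d2 is None or d2 < best_d2:
            best_d2, best_delta = d2, delta
    # d/dp |p - f|^2 = 2 * (p - f), i.e. same direction as (p - f)
    gradient = [2.0 * c for c in best_delta]
    return best_d2, gradient


value, grad = cellular_with_gradient((0.25, 0.0), [(0.0, 0.0), (1.0, 1.0)])
print(value, grad)
```

    If the noise value were the plain (non-squared) distance instead, the gradient would be the same vector normalized to unit length.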

Bash math utilities

Bashmash Bash math utilities What is Bashmash? Bashmash is a set of math utilities for the Bash language. It simplifies common mathematical operations

Mar 11, 2021
📽 Highly Optimized Graphics Math (glm) for C

OpenGL Mathematics (glm) for C Documentation Almost all functions (inline versions) and parameters are documented inside the corresponding headers.

Nov 27, 2022
Software ray tracer written from scratch in C that can run on CPU or GPU with emphasis on ease of use and trivial setup

A minimalist and platform-agnostic interactive/real-time raytracer. Strong emphasis on simplicity, ease of use and almost no setup to get started with

Oct 5, 2022
Legion Low Level Rendering Interface provides a graphics API agnostic rendering interface with minimal CPU overhead and low level access to verbose GPU operations.

Legion-LLRI Legion-LLRI, or “Legion Low Level Rendering Interface” is a rendering API that aims to provide a graphics API agnostic approach to graphic

Aug 13, 2022
SMAA is a very efficient GPU-based MLAA implementation (DX9, DX10, DX11 and OpenGL)

SMAA is a very efficient GPU-based MLAA implementation (DX9, DX10, DX11 and OpenGL), capable of handling subpixel features seamlessly, and featuring an improved and advanced pattern detection & handling mechanism.

Nov 26, 2022
Plot dynamic 3d functions z = f(x, y, t)

A tool for plotting 3D functions Controls Button Description Arrows Rotate view F Enable/Disable filling surface faces G Enable/Disable showing coordi

Oct 15, 2021
Dissecting the M1's GPU for 3D acceleration

Asahi GPU Research for an open source graphics stack for Apple M1. wrap Build with the included makefile make wrap.dylib, and insert in any Metal appl

Nov 26, 2022
nsfminer is an Ethash GPU mining application: with nsfminer you can mine every coin which relies on an Ethash Proof of Work.

nsfminer (no stinkin' fees) Ethereum (ethash) miner with OpenCL, CUDA and stratum support nsfminer is an Ethash GPU mining application: with nsfminer

Sep 2, 2022
2D GPU renderer for dynamic UIs

vger vger is a vector graphics renderer which renders a limited set of primitives, but does so almost entirely on the GPU. Works on iOS and macOS. API

Nov 26, 2022
This is a openGL cube demo program. It was made as a tech demo using PVR_PSP2 Driver layer GPU libraries.

OpenGL Cube Demo using PVR_PSP2 Driver layer GPU libraries This is a openGL cube demo program. It was made as a tech demo using PVR_PSP2 Driver layer

Oct 31, 2021