The render path consists of rendertarget definitions and commands. The commands are executed in order to yield the rendering result. Each command outputs either to the destination rendertarget & viewport (default if output definition is omitted), or one of the named rendertargets. MRT output is also possible. If the rendertarget is a cube map, the face to render to (0-5) can also be specified.
A rendertarget's size can be either absolute, or a multiple or fraction of the destination viewport size, specified with a multiplier or a divisor. The multiplier or divisor does not need to be an integer. Furthermore, a rendertarget can be declared "persistent" so that it will not be mixed with other rendertargets of the same size and format, and its contents can be assumed to remain available on subsequent frames.
Note that if you have already created a named rendertarget texture in code and stored it into the resource cache by using AddManualResource(), you can use it directly as an output (by referring to its name) without requiring a rendertarget definition for it.
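The three sizing modes described above can be sketched as follows. This is a minimal illustration, not the engine's own code: the names SizeMode, IntSize and ResolveRenderTargetSize are hypothetical, rounding to the nearest pixel is an assumption, and the divisor is assumed to be non-zero.

```cpp
#include <cmath>

// Hypothetical names for illustration; these are not the engine's own types.
enum class SizeMode { Absolute, ViewportMultiplier, ViewportDivisor };

struct IntSize
{
    int width;
    int height;
};

// Resolve a rendertarget's pixel size from its definition and the destination
// viewport size. x and y hold the absolute size, the multiplier or the divisor,
// depending on the mode. A divisor is assumed to be non-zero.
inline IntSize ResolveRenderTargetSize(SizeMode mode, float x, float y, IntSize viewport)
{
    float width = x;
    float height = y;

    if (mode == SizeMode::ViewportMultiplier)
    {
        width = viewport.width * x;
        height = viewport.height * y;
    }
    else if (mode == SizeMode::ViewportDivisor)
    {
        width = viewport.width / x;
        height = viewport.height / y;
    }

    // Rounding to the nearest pixel is an assumption of this sketch.
    return {static_cast<int>(std::lround(width)), static_cast<int>(std::lround(height))};
}
```

With a 1280x720 viewport, a divisor of 2 and a multiplier of 0.5 both resolve to the same half-size target, and a non-integer divisor such as 1.5 is equally valid.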
The available commands are:
- clear: Clear any of color, depth and stencil. Color clear can optionally use the fog color from the Zone visible at the far clip distance.
- scenepass: Render scene objects whose material technique contains the specified pass. Will either be front-to-back ordered with state sorting, or back-to-front ordered with no state sorting. For deferred rendering, object lightmasks can be optionally marked to the stencil buffer. Vertex lights can optionally be handled during a pass, if it has the necessary shader combinations. Textures global to the pass can be bound to free texture units; these can either be the viewport, a named rendertarget, or a texture resource identified with its pathname.
- quad: Render a viewport-sized quad using the specified shaders. The blend mode (default=replace) can be optionally specified.
- forwardlights: Render per-pixel forward lighting for opaque objects with the specified pass name. Shadow maps are also rendered as necessary.
- lightvolumes: Render deferred light volumes using the specified shaders. G-buffer textures can be bound as necessary.
- renderui: Render the UI into the output rendertarget. Using this will cause the default UI render to the backbuffer to be skipped.
- sendevent: Send an event with a specified string parameter ("event name"). This can be used to call custom code, typically custom low-level rendering, in the middle of the renderpath execution.
The scenepass, quad, forwardlights and lightvolumes commands all allow command-global shader compilation defines, shader parameters and textures to be defined. For example, in deferred rendering the lightvolumes command would bind the G-buffer textures to be able to calculate the lighting. Note that command-global textures are (for optimization) bound only once, at the beginning of the command. If the texture binding is overwritten by an object's material, it is "lost" until the end of the command. Therefore the command-global textures should be in units that are not used by materials.
Note that only one forwardlights command or one lightvolumes command may legally exist in the renderpath.
A render path can be loaded from a main XML file by calling Load(), after which other XML files (for example one for each post-processing effect) can be appended to it by calling Append(). Rendertargets and commands can be enabled or disabled by calling SetEnabled(), e.g. to switch a post-processing effect on or off. To aid in this, both can be identified by tag names; for example, the bloom effect uses the tag "Bloom" for all of its rendertargets and commands.
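The append-then-toggle workflow can be pictured with a small self-contained model. This is only a sketch of the idea, not the engine's RenderPath class: PathItem and RenderPathModel are hypothetical types, and exact (case-sensitive) tag matching is an assumption made here for simplicity.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a rendertarget definition or a command.
struct PathItem
{
    std::string tag;
    bool enabled;
};

// Hypothetical model of a render path: two ordered lists of tagged items.
struct RenderPathModel
{
    std::vector<PathItem> renderTargets;
    std::vector<PathItem> commands;

    // Append another path's rendertargets and commands after the existing ones.
    void Append(const RenderPathModel& other)
    {
        renderTargets.insert(renderTargets.end(), other.renderTargets.begin(), other.renderTargets.end());
        commands.insert(commands.end(), other.commands.begin(), other.commands.end());
    }

    // Enable or disable every rendertarget and command carrying the tag.
    // Exact string comparison is an assumption of this sketch.
    void SetEnabled(const std::string& tag, bool enabled)
    {
        for (PathItem& rt : renderTargets)
            if (rt.tag == tag)
                rt.enabled = enabled;
        for (PathItem& cmd : commands)
            if (cmd.tag == tag)
                cmd.enabled = enabled;
    }
};
```

Because an effect's rendertargets and commands share one tag, a single SetEnabled("Bloom", false) call switches the whole effect off while leaving the untagged main path untouched.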
It is legal to both write to the destination viewport and sample from it during the same command: pingpong copies of its contents will be made automatically. If the viewport has hardware multisampling on, the multisampled backbuffer will be resolved to a texture before sampling it.
The render path XML definition looks like this:
For examples of renderpath definitions, see the default forward, deferred and light pre-pass renderpaths in the bin/CoreData/RenderPaths directory, and the postprocess renderpath definitions in the bin/Data/PostProcess directory.
Normally, the depth-stencil surfaces that are needed are allocated automatically when the render path is executed.
The special "lineardepth" (synonym "depth") format is intended for storing scene depth in deferred rendering. It is not an actual hardware depth-stencil texture, but a 32-bit single channel (R) float rendertarget. (On OpenGL2 it's RGBA instead, due to the limitation of all color buffers having to be the same format. The shader include file Samplers.glsl in bin/CoreData/Shaders/GLSL provides functions to encode and decode linear depth to RGB.)
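To illustrate what encoding linear depth into color channels involves, here is a minimal C++ sketch of one common packing scheme. It is an assumption for illustration and is not taken from Samplers.glsl; the Rgb type and both function names are hypothetical, and the channels are kept as floating point values rather than quantized to 8 bits.

```cpp
#include <cmath>

// Hypothetical color type; each channel is in the range [0, 1].
struct Rgb
{
    double r;
    double g;
    double b;
};

// Split a linear depth in [0, 1] across three channels: r holds the coarse
// part in steps of 1/255, g the next finer part, and b the remainder.
inline Rgb EncodeDepth(double depth)
{
    Rgb out;
    depth *= 255.0;
    out.r = std::floor(depth);
    depth = (depth - out.r) * 255.0;
    out.g = std::floor(depth);
    out.b = depth - out.g;
    out.r /= 255.0;
    out.g /= 255.0;
    return out;
}

// Recombine the three channels into the original linear depth.
inline double DecodeDepth(const Rgb& color)
{
    return color.r + color.g / 255.0 + color.b / (255.0 * 255.0);
}
```

In a real 8-bit rendertarget the last channel would also be quantized, so the decoded value would only approximate the original depth.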
Writing depth manually to a rendertarget while using a non-readable depth-stencil surface ensures the best compatibility and prevents any conflicts from using both depth test and manual depth sampling at the same time.
It is also possible to define a readable hardware depth texture and instruct the render path to use it instead. Availability for this must first be checked with the function GetReadableDepthSupport(). On Direct3D9 this will use the INTZ "hack" format. To define a readable depth-stencil texture, use the format "readabledepth" (synonym "hwdepth") and set it as the depth-stencil by using the "depthstencil" attribute in render path commands. Note that you must set it in every command where you want to use it; otherwise an automatically allocated depth-stencil will be used. Note also that the existence of a stencil channel is not guaranteed, so stencil masking optimizations for lights normally used by the Renderer & View classes will be disabled.
In the special case of a depth-only rendering pass you can set the readable depth texture directly as the "output" and don't need to specify the "depthstencil" attribute at all.
After the readable depth texture has been filled, it can be bound to a texture unit in any subsequent commands. Pixel shaders should use the ReconstructDepth() helper function to reconstruct a linear depth value between 0 and 1 from the nonlinear hardware depth value. When the readable depth texture is bound for sampling, depth write is automatically disabled, as both modifying and sampling the depth would be undefined.
An example render path for readable hardware depth exists in bin/CoreData/RenderPaths/ForwardHWDepth.xml:
The render path starts by allocating a readable depth-stencil texture the same size as the destination viewport, clearing its depth, then rendering a depth-only pass to it. Next the destination color rendertarget is cleared normally, while the readable depth texture is used as the depth-stencil for that and all subsequent commands. Any command after the depth render pass could now bind the depth texture to a unit for sampling, for example for smooth particle or SSAO effects.
The ForwardDepth.xml render path does the same, but using a linear depth rendertarget instead of a hardware depth texture. The advantage is better compatibility (guaranteed to work without checking GetReadableDepthSupport()) but it has worse performance as it will perform an additional full scene rendering pass.
Soft particle rendering is a practical example of utilizing scene depth reading. The default renderpaths that expose a readable depth bind the depth texture in the alpha pass. This is utilized by the UnlitParticle & LitParticle shaders when the SOFTPARTICLES shader compilation define is included. The particle techniques containing "Soft" in their name in bin/CoreData/Techniques use this define. Note that they expect a readable depth and will not work with the plain forward renderpath!
Soft particles can be implemented using two contrasting approaches: "shrinking" and "expanding". In the shrinking approach (default) depth test can be left on, and the soft particle shader starts to reduce particle opacity when the particle geometry approaches solid geometry. In the expanding approach the particles should have depth test off, and the shader instead starts to reduce the particle opacity when the particle geometry overshoots the solid geometry.
For the expanding mode, see the "SoftExpand" family of particle techniques. Their downside is that performance can be lower due to not being able to use hardware depth test.
Finally, note the SoftParticleFadeScale shader parameter, which is needed to control the distance over which the fade will take effect. This is defined in the example materials using soft particles (SmokeSoft.xml & LitSmokeSoft.xml).
Otherwise fully customized scene render passes can be specified, but there are a few things to remember related to forward lighting:
- The opaque base pass must be tagged with metadata "base". When forward lighting logic does the lit base pass optimization, it will search for a pass with the word "lit" prepended, i.e. if your custom opaque base pass is called "custombase", the corresponding lit base pass would be "litcustombase".
- The transparent base pass must be tagged with metadata "alpha". For lit transparent objects, the forward lighting logic will look for a pass with the word "lit" prepended, i.e. if the custom alpha base pass is called "customalpha", the corresponding lit pass is "litcustomalpha". The lit drawcalls will be interleaved with the transparent base pass, and the scenepass command should have back-to-front sorting enabled.
- If forward and deferred lighting are mixed, the G-buffer writing pass must be tagged with metadata "gbuffer" to prevent geometry from being lit a second time by forward lights.
- Remember to mark the lighting mode (per-vertex / per-pixel) into the techniques which define custom passes, as the lighting mode can be guessed automatically only for the known default passes.
- The forwardlights command can optionally disable the lit base pass optimization without having to touch the material techniques, if a separate opaque ambient-only base pass is needed. By default the optimization is enabled.
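The naming rule for lit passes in the list above amounts to a simple string prefix. A one-function sketch (LitPassName is a hypothetical helper, not an engine function):

```cpp
#include <string>

// Derive the lit pass name that the forward lighting logic would look for,
// following the "lit" prefix rule described above.
inline std::string LitPassName(const std::string& basePassName)
{
    return "lit" + basePassName;
}
```

For example, LitPassName("custombase") gives "litcustombase" and LitPassName("customalpha") gives "litcustomalpha", matching the two cases in the list.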
Post-processing effects are usually implemented by using the quad command. When using intermediate rendertargets that are of different size than the viewport rendertarget, it is often necessary in shaders to reference their (inverse) size and the half-pixel offset for Direct3D9. The renderpath automatically attempts to assign these shader uniforms for named rendertargets. For an example, look at the bloom postprocess shaders: because there is a rendertarget called BlurH, each quad command in the renderpath will attempt to set the shader uniforms cBlurHInvSize and cBlurHOffsets (both Vector2). Note that setting shader uniforms is case insensitive.
In OpenGL post-processing shaders it is important to distinguish between sampling a rendertarget texture and a regular texture resource, because intermediate rendertargets (such as the G-buffer) may be vertically inverted. Use the GetScreenPos() or GetQuadTexCoord() functions to get rendertarget UV coordinates from the clip coordinates; this takes flipping into account automatically. For sampling a regular texture, use the GetQuadTexCoordNoFlip() function, which requires world coordinates instead of clip coordinates.
Texture2D and TextureCube support multisampling. Programmatically, multisampling is enabled through the SetSize() function when defining the dimensions and format. Multisampling can also be set in a renderpath's rendertarget definition.
The normal operation is that a multisampled rendertarget will be automatically resolved to 1-sample before being sampled as a texture. This is denoted by the autoResolve parameter, whose default value is true. On OpenGL (when supported) and Direct3D11, it's also possible to access the individual samples of a Texture2D in shader code by defining a multisampled sampler and using specialized functions (texelFetch on OpenGL, Texture2DMS.Load on Direct3D11). In this case the "autoResolve" parameter should be set to false. Note that accessing individual samples is not possible for cube textures, or when using Direct3D9.
By accessing the individual samples of multisampled G-buffer textures, a deferred MSAA renderer could be implemented. This has some performance considerations / complexities (you should avoid running the lighting calculations per sample when not on triangle edges) and is not implemented by default.
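The edge-only per-sample lighting idea can be sketched on the CPU as below. This is purely illustrative and is not part of the engine: the Sample type, the edge test (a depth difference or a diverging normal between samples) and its thresholds are all assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical G-buffer sample: linear depth and a unit normal.
struct Sample
{
    double depth;
    double nx, ny, nz;
};

// Treat the pixel as a triangle edge if any sample differs noticeably from
// the first one in depth or in normal direction.
inline bool IsEdgePixel(const std::vector<Sample>& samples, double depthEps, double minNormalDot)
{
    for (std::size_t i = 1; i < samples.size(); ++i)
    {
        const double dot = samples[0].nx * samples[i].nx + samples[0].ny * samples[i].ny +
                           samples[0].nz * samples[i].nz;
        if (std::fabs(samples[i].depth - samples[0].depth) > depthEps || dot < minNormalDot)
            return true;
    }
    return false;
}

// Light a pixel: once for interior pixels, once per sample (then averaged)
// for edge pixels. lightingCalls counts how often the lighting function ran.
template <typename LightFn>
double ShadePixel(const std::vector<Sample>& samples, double depthEps, double minNormalDot,
                  LightFn light, int& lightingCalls)
{
    if (samples.empty())
        return 0.0;

    if (!IsEdgePixel(samples, depthEps, minNormalDot))
    {
        ++lightingCalls;
        return light(samples[0]);
    }

    double sum = 0.0;
    for (const Sample& sample : samples)
    {
        ++lightingCalls;
        sum += light(sample);
    }
    return sum / static_cast<double>(samples.size());
}
```

With 4x multisampling, an interior pixel costs one lighting evaluation instead of four, which is where most of the saving comes from since edge pixels are usually a small fraction of the screen.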