10.1 Sampling and Antialiasing
The sampling task from Chapter 7 was a frustrating one since the aliasing problem was known to be unsolvable from the start. The infinite frequency content of geometric edges and hard shadows guarantees aliasing in the final images, no matter how high the image sampling rate. (Our only consolation is that the visual impact of this remaining aliasing can be reduced to unobjectionable levels with a sufficient number of well-placed samples.)
Fortunately, for textures things are not this difficult from the start: either there is often a convenient analytic form of the texture function available, which makes it possible to remove excessively high frequencies before sampling it, or it is possible to be careful when evaluating the function so as not to introduce high frequencies in the first place. When this problem is carefully addressed in texture implementations, as is done through the rest of this chapter, there is usually no need for more than one sample per pixel in order to render an image without texture aliasing.
Two problems must be addressed in order to remove aliasing from texture functions:
- The sampling rate in texture space must be computed. The screen space sampling rate is known from the image resolution and pixel sampling rate, but here we need to determine the resulting sampling rate on a surface in the scene in order to find the rate at which the texture function is being sampled.
- Given the texture sampling rate, sampling theory must be applied to guide the computation of a texture value that doesn’t have higher frequency variation than can be represented by the sampling rate (e.g., by removing excess frequencies beyond the Nyquist limit from the texture function).
These two issues will be addressed in turn throughout the rest of this section.
10.1.1 Finding the Texture Sampling Rate
Consider an arbitrary texture function that is a function of position, $T(\mathrm{p})$, defined on a surface in the scene. If we ignore the complications introduced by visibility issues—the possibility that another object may occlude the surface at nearby image samples or that the surface may have a limited extent on the image plane—this texture function can also be expressed as a function over points $(x, y)$ on the image plane, $T(f(x, y))$, where $f(x, y)$ is the function that maps image points to points on the surface. Thus, $T(f(x, y))$ gives the value of the texture function as seen at image position $(x, y)$.
As a simple example of this idea, consider a 2D texture function $T(s, t)$ applied to a quadrilateral that is perpendicular to the $z$ axis and has corners at the world space points $(0, 0, 0)$, $(1, 0, 0)$, $(1, 1, 0)$, and $(0, 1, 0)$. If an orthographic camera is placed looking down the $z$ axis such that the quadrilateral precisely fills the image plane and if points $\mathrm{p}$ on the quadrilateral are mapped to 2D $(s, t)$ texture coordinates by
$$ s = \mathrm{p}_x, \qquad t = \mathrm{p}_y, $$
then the relationship between $(s, t)$ and screen $(x, y)$ pixels is straightforward:
$$ s = \frac{x}{x_r}, \qquad t = \frac{y}{y_r}, $$
where the overall image resolution is $(x_r, y_r)$ (Figure 10.2). Thus, given a sample spacing of one pixel in the image plane, the sample spacing in $(s, t)$ texture parameter space is $(1/x_r, 1/y_r)$, and the texture function must remove any detail at a higher frequency than can be represented at that sampling rate.
This relationship between pixel coordinates and texture coordinates, and thus the relationship between their sampling rates, is the key bit of information that determines the maximum frequency content allowable in the texture function. As a slightly more complex example, given a triangle with $(u, v)$ texture coordinates at the vertices and viewed with a perspective projection, it’s possible to analytically find the differences in $u$ and $v$ across the sample points on the image plane. This is the basis of basic texture map antialiasing in graphics processors.
For more complex scene geometry, camera projections, and mappings to texture coordinates, it is much more difficult to precisely determine the relationship between image positions and texture parameter values. Fortunately, for texture antialiasing, we don’t need to be able to evaluate $f(x, y)$ for arbitrary $(x, y)$ but just need to find the relationship between changes in pixel sample position and the resulting change in texture sample position at a particular point on the image. This relationship is given by the partial derivatives of this function, $\partial f/\partial x$ and $\partial f/\partial y$. For example, these can be used to find a first-order approximation to the value of $f$,
$$ f(x', y') \approx f(x, y) + (x' - x)\,\frac{\partial f}{\partial x} + (y' - y)\,\frac{\partial f}{\partial y}. $$
If these partial derivatives are changing slowly with respect to the distances $(x' - x)$ and $(y' - y)$, this is a reasonable approximation. More importantly, the values of these partial derivatives give an approximation to the change in texture sample position for a shift of one pixel in the $x$ and $y$ directions, respectively, and thus directly yield the texture sampling rate. For example, in the previous quadrilateral example, $\partial s/\partial x = 1/x_r$, $\partial s/\partial y = 0$, $\partial t/\partial x = 0$, and $\partial t/\partial y = 1/y_r$.
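As a concrete check, plugging the quadrilateral example’s derivatives into the first-order approximation above shows how a pixel offset maps into texture space:
$$ \big(s(x', y'),\, t(x', y')\big) \approx \left( s + \frac{x' - x}{x_r},\; t + \frac{y' - y}{y_r} \right), $$
so a one-pixel shift in $x$ moves the texture sample by exactly $1/x_r$ in $s$ and leaves $t$ unchanged.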
The key to finding the values of these partial derivatives in the general case lies in the RayDifferential structure, which was defined in Section 2.5.1. This structure is initialized for each camera ray by the Camera::GenerateRayDifferential() method; it contains not only the ray actually being traced through the scene but also two additional rays, one offset horizontally one pixel sample from the camera ray and the other offset vertically by one pixel sample. All of the geometric ray intersection routines use only the main camera ray for their computations; the auxiliary rays are ignored (this is easy to do because RayDifferential is a subclass of Ray).
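For reference, the parts of that structure used in this section look roughly like this (see Section 2.5.1 for the full definition):

```cpp
// Relevant members of RayDifferential; comments paraphrased.
class RayDifferential : public Ray {
  public:
    // ... constructors and other methods omitted ...
    bool hasDifferentials;              // are the offset rays valid for this ray?
    Point3f rxOrigin, ryOrigin;         // origins of the +x and +y offset rays
    Vector3f rxDirection, ryDirection;  // directions of the +x and +y offset rays
};
```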
Here we will use the offset rays to estimate the partial derivatives of the mapping $\mathrm{p}(x, y)$ from image position to world space position and the partial derivatives of the mappings $u(x, y)$ and $v(x, y)$ from image position to $(u, v)$ parametric coordinates, giving the partial derivatives of world space positions $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$ and the partial derivatives of $(u, v)$ parametric coordinates $\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, and $\partial v/\partial y$. In Section 10.2, we will see how these can be used to compute the screen space derivatives of arbitrary quantities based on $\mathrm{p}$ or $(u, v)$ and consequently the sampling rates of these quantities. The values of these partial derivatives at the intersection point are stored in the SurfaceInteraction structure. They are declared as mutable, since they are set in a method that takes a const instance of that object.
The SurfaceInteraction::ComputeDifferentials() method computes these values. It is called by SurfaceInteraction::ComputeScatteringFunctions() before the Material’s ComputeScatteringFunctions() method is called so that these values will be available for any texture evaluation routines that are called by the material. Because ray differentials aren’t available for all rays traced by the system (e.g., rays starting from light sources traced for photon mapping or bidirectional path tracing), the hasDifferentials field of the RayDifferential must be checked before these computations are performed. If the differentials are not present, then the derivatives are all set to zero (which will eventually lead to unfiltered point sampling of textures).
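A minimal sketch of this control flow, using pbrt-style member names for the derivatives (dpdx, dudx, and so on); this is an outline of the structure rather than the verbatim implementation:

```cpp
// Outline of SurfaceInteraction::ComputeDifferentials(); the fragment marked
// below is developed in the remainder of this subsection.
void SurfaceInteraction::ComputeDifferentials(const RayDifferential &ray) const {
    if (ray.hasDifferentials) {
        // <<Estimate screen space change in p and (u, v)>>
    } else {
        // No differentials: fall back to (unfiltered) point sampling of textures.
        dudx = dvdx = 0;
        dudy = dvdy = 0;
        dpdx = dpdy = Vector3f(0, 0, 0);
    }
}
```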
The key to computing these estimates is the assumption that the surface is locally flat with respect to the sampling rate at the point being shaded. This is a reasonable approximation in practice, and it is hard to do much better. Because ray tracing is a point-sampling technique, we have no additional information about the scene in between the rays we have traced. For highly curved surfaces or at silhouette edges, this approximation can break down, though this is rarely a source of noticeable error in practice.
For this approximation, we need the plane through the point intersected by the main ray that is tangent to the surface. This plane is given by the implicit equation
$$ ax + by + cz + d = 0, $$
where $a = \mathrm{n}_x$, $b = \mathrm{n}_y$, $c = \mathrm{n}_z$, and $d = -(\mathrm{n} \cdot \mathrm{p})$. We can then compute the intersection points $\mathrm{p}_x$ and $\mathrm{p}_y$ between the auxiliary rays $r_x$ and $r_y$ and this plane (Figure 10.3). These new points give an approximation to the partial derivatives of position on the surface, $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$, based on forward differences:
$$ \frac{\partial\mathrm{p}}{\partial x} \approx \mathrm{p}_x - \mathrm{p}, \qquad \frac{\partial\mathrm{p}}{\partial y} \approx \mathrm{p}_y - \mathrm{p}. $$
Because the differential rays are offset one pixel sample in each direction, there’s no need to divide these differences by a $\Delta$ value, since $\Delta = 1$.
The ray–plane intersection algorithm described in Section 3.1.2 gives the $t$ value where a ray described by origin $\mathbf{o}$ and direction $\mathbf{d}$ intersects a plane described by $ax + by + cz + d = 0$:
$$ t = \frac{-d - \big((a, b, c) \cdot \mathbf{o}\big)}{(a, b, c) \cdot \mathbf{d}}. $$
To compute this value for the two auxiliary rays, the plane’s $d$ coefficient is computed first. It isn’t necessary to compute the $a$, $b$, and $c$ coefficients, since they’re available in n. We can then apply the formula directly.
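The following sketch shows how these pieces fit together to estimate $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$, assuming pbrt’s Point3f/Vector3f types and the RayDifferential members rxOrigin, rxDirection, ryOrigin, and ryDirection; it is an approximation of the corresponding code, not a verbatim excerpt:

```cpp
// Compute auxiliary intersection points with the tangent plane and use
// forward differences to approximate dp/dx and dp/dy.
// Note: d here stores n . p, the negation of the plane's d coefficient, so the
// ray-plane intersection simplifies to t = (d - n . o) / (n . dir).
Float d = Dot(n, Vector3f(p.x, p.y, p.z));
Float tx = (d - Dot(n, Vector3f(ray.rxOrigin.x, ray.rxOrigin.y, ray.rxOrigin.z))) /
           Dot(n, ray.rxDirection);
Point3f px = ray.rxOrigin + tx * ray.rxDirection;
Float ty = (d - Dot(n, Vector3f(ray.ryOrigin.x, ray.ryOrigin.y, ray.ryOrigin.z))) /
           Dot(n, ray.ryDirection);
Point3f py = ray.ryOrigin + ty * ray.ryDirection;
// One-pixel sample spacing means no division by a delta is needed.
dpdx = px - p;
dpdy = py - p;
```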
Using the positions $\mathrm{p}_x$ and $\mathrm{p}_y$, an approximation to their respective $(u, v)$ coordinates can be found by taking advantage of the fact that the surface’s partial derivatives $\partial\mathrm{p}/\partial u$ and $\partial\mathrm{p}/\partial v$ form a (not necessarily orthogonal) coordinate system on the plane and that the coordinates of the auxiliary intersection points in terms of this coordinate system are their coordinates with respect to the $(u, v)$ parameterization (Figure 10.4).
Given a position $\mathrm{p}'$ on the plane, we can compute its position with respect to the coordinate system by
$$ \mathrm{p}' = \mathrm{p} + \Delta u\,\frac{\partial\mathrm{p}}{\partial u} + \Delta v\,\frac{\partial\mathrm{p}}{\partial v}, $$
or, equivalently,
$$ \begin{pmatrix} \mathrm{p}'_x - \mathrm{p}_x \\ \mathrm{p}'_y - \mathrm{p}_y \\ \mathrm{p}'_z - \mathrm{p}_z \end{pmatrix} = \begin{pmatrix} \partial\mathrm{p}_x/\partial u & \partial\mathrm{p}_x/\partial v \\ \partial\mathrm{p}_y/\partial u & \partial\mathrm{p}_y/\partial v \\ \partial\mathrm{p}_z/\partial u & \partial\mathrm{p}_z/\partial v \end{pmatrix} \begin{pmatrix} \Delta u \\ \Delta v \end{pmatrix}. $$
A solution to this linear system of equations for one of the auxiliary points $\mathrm{p}' = \mathrm{p}_x$ or $\mathrm{p}' = \mathrm{p}_y$ gives the corresponding screen space partial derivatives $(\partial u/\partial x, \partial v/\partial x)$ or $(\partial u/\partial y, \partial v/\partial y)$, respectively.
This linear system has three equations with two unknowns—that is, it’s overconstrained. We need to be careful since one of the equations may be degenerate—for example, if $\partial\mathrm{p}/\partial u$ and $\partial\mathrm{p}/\partial v$ are in the $xy$ plane such that their $z$ components are both zero, then the third equation will be degenerate. Therefore, we’d like to solve the system of equations using two equations that don’t give a degenerate system. An easy way to do this is to take the cross product of $\partial\mathrm{p}/\partial u$ and $\partial\mathrm{p}/\partial v$, see which coordinate of the result has the largest magnitude, and use the other two. Their cross product is already available in n, so using this approach is straightforward. Even after all this, it may happen that the linear system has no solution (usually due to the partial derivatives not forming a coordinate system on the plane). In that case, all that can be done is to return arbitrary values.
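A sketch of this dimension selection and the subsequent solve, assuming a small 2x2 linear solver along the lines of pbrt’s SolveLinearSystem2x2(); again an approximation rather than the exact code:

```cpp
// Choose two dimensions for the solve: discard the dimension in which the
// normal has the largest magnitude, since that equation is the most likely
// to be degenerate.
int dim[2];
if (std::abs(n.x) > std::abs(n.y) && std::abs(n.x) > std::abs(n.z)) {
    dim[0] = 1; dim[1] = 2;
} else if (std::abs(n.y) > std::abs(n.z)) {
    dim[0] = 0; dim[1] = 2;
} else {
    dim[0] = 0; dim[1] = 1;
}

// Initialize the 2x2 system A [du dv]^T = b for each auxiliary point and solve;
// if a system has no solution, return arbitrary (zero) derivatives.
Float A[2][2] = { { dpdu[dim[0]], dpdv[dim[0]] },
                  { dpdu[dim[1]], dpdv[dim[1]] } };
Float Bx[2] = { px[dim[0]] - p[dim[0]], px[dim[1]] - p[dim[1]] };
Float By[2] = { py[dim[0]] - p[dim[0]], py[dim[1]] - p[dim[1]] };
if (!SolveLinearSystem2x2(A, Bx, &dudx, &dvdx)) dudx = dvdx = 0;
if (!SolveLinearSystem2x2(A, By, &dudy, &dvdy)) dudy = dvdy = 0;
```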
10.1.2 Filtering Texture Functions
It is necessary to remove frequencies in texture functions that are past the Nyquist limit for the texture sampling rate. The goal is to compute, with as few approximations as possible, the result of the ideal texture resampling process, which says that in order to evaluate $T(f(x, y))$ without aliasing, we must first band-limit it, removing frequencies beyond the Nyquist limit by convolving it with the sinc filter:
$$ T_b(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \operatorname{sinc}(x')\,\operatorname{sinc}(y')\, T\big(f(x + x', y + y')\big)\, \mathrm{d}x'\, \mathrm{d}y'. $$
The band-limited function $T_b$ in turn should then be convolved with the pixel filter $g(x, y)$ centered at the $(x, y)$ point on the screen at which we want to evaluate the texture function:
$$ T_f(x, y) = \int \!\! \int g(x', y')\, T_b(x + x', y + y')\, \mathrm{d}x'\, \mathrm{d}y', $$
where the integrals range over the extent of the pixel filter.
This gives the theoretically perfect value for the texture as projected onto the screen.
In practice, there are many simplifications that can be made to this process, with little reduction in visual quality. For example, a box filter may be used for the band-limiting step, and the second step is usually ignored completely, effectively acting as if the pixel filter were a box filter, which makes it possible to do the antialiasing work completely in texture space and simplifies the implementation significantly. The EWA filtering algorithm in Section 10.4.5 is a notable exception in that it assumes a Gaussian pixel filter.
The box filter is easy to use, since it can be applied analytically by computing the average of the texture function over the appropriate region. Intuitively, this is a reasonable approach to the texture filtering problem, and it can be computed directly for many texture functions. Indeed, through the rest of this chapter, we will often use a box filter to average texture function values between samples and informally use the term filter region to describe the area being averaged over. This is the most common approach when filtering texture functions.
Even the box filter, with all of its shortcomings, gives acceptable results for texture filtering in many cases. One factor that helps is the fact that a number of samples are usually taken in each pixel. Thus, even if the filtered texture values used in each one are sub-optimal, once they are filtered by the pixel reconstruction filter, the end result generally doesn’t suffer too much.
An alternative to using the box filter to filter texture functions is to use the observation that the effect of the ideal sinc filter is to let frequency components below the Nyquist limit pass through unchanged but to remove frequencies past it. Therefore, if we know the frequency content of the texture function (e.g., if it is a sum of terms, each one with known frequency content), then if we replace the high-frequency terms with their average values, we are effectively doing the work of the sinc prefilter. This approach is known as clamping and is the basis for antialiasing in the textures based on the noise function in Section 10.6.
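As a hypothetical illustration of clamping (not the noise-based implementation from Section 10.6), consider a texture defined as a sum of sinusoids of doubling frequency; terms above the Nyquist limit for the local sampling rate are replaced by their average value:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical clamped texture: a sum of sinusoids where each term whose
// frequency exceeds the Nyquist limit is replaced by its average value (zero
// for a sine), which is what the ideal sinc prefilter would do to it.
float ClampedSineSum(float s, float dsdx, float dsdy, int nTerms) {
    // Largest spacing between adjacent texture samples, from the screen space
    // derivatives of s.
    float spacing = std::max(std::abs(dsdx), std::abs(dsdy));
    // Nyquist limit for that spacing, in cycles per unit of s.
    float nyquist = 0.5f / std::max(spacing, 1e-8f);
    const float Pi = 3.14159265f;
    float value = 0;
    for (int i = 0; i < nTerms; ++i) {
        float freq = std::pow(2.f, float(i));
        if (freq <= nyquist)
            value += std::sin(2 * Pi * freq * s) / freq;
        // Higher-frequency terms average to zero and so contribute nothing.
    }
    return value;
}
```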
Finally, for texture functions where none of these techniques is easily applied, a final option is supersampling—the function is evaluated and filtered at multiple locations near the main evaluation point, thus increasing the sampling rate in texture space. If a box filter is used to filter these sample values, this is equivalent to averaging the value of the function. This approach can be expensive if the texture function is complex to evaluate, and as with image sampling a very large number of samples may be needed to remove aliasing. Although this is a brute-force solution, it is still more efficient than increasing the image sampling rate, since it doesn’t incur the cost of tracing more rays through the scene.
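A minimal sketch of that brute-force approach, written around a hypothetical point-evaluated texture function Texture(u, v); the filter region implied by the screen space derivatives is sampled on a small grid and box-averaged:

```cpp
// Hypothetical point-sampled texture function being antialiased.
float Texture(float u, float v);

// Supersample Texture() over the (parallelogram-shaped) filter region given by
// the screen space derivatives of (u, v), using an n x n grid and a box filter.
float SupersampleTexture(float u, float v,
                         float dudx, float dvdx, float dudy, float dvdy, int n) {
    float sum = 0;
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < n; ++i) {
            // Offsets within [-0.5, 0.5) pixels in the x and y screen directions.
            float dx = (i + 0.5f) / n - 0.5f;
            float dy = (j + 0.5f) / n - 0.5f;
            // First-order mapping of the pixel offsets into (u, v).
            sum += Texture(u + dx * dudx + dy * dudy,
                           v + dx * dvdx + dy * dvdy);
        }
    return sum / (n * n);  // box filter: the average of the samples
}
```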
10.1.3 Ray Differentials for Specular Reflection and Transmission
Given the effectiveness of ray differentials for finding filter regions for texture antialiasing for camera rays, it is useful to extend the method to make it possible to determine texture space sampling rates for objects that are seen indirectly via specular reflection or refraction; objects seen in mirrors, for example, should not have texture aliasing any more than directly visible objects do. Igehy (1999) developed an elegant solution to the problem of how to find the appropriate differential rays for specular reflection and refraction, which is the approach used in pbrt.
Figure 10.5 illustrates the difference that proper texture filtering for specular reflection and transmission can make: it shows a glass ball and a mirrored ball on a plane with a texture map containing high-frequency components. Ray differentials ensure that the images of the texture seen via reflection and refraction from the balls are free of aliasing artifacts. Here, ray differentials eliminate aliasing without excessively blurring the texture.
In order to compute the reflected or transmitted ray differentials at a surface intersection point, we need an approximation to the rays that would have been traced at the intersection points for the two offset rays in the ray differential that hit the surface (Figure 10.6). The new ray for the main ray is computed by the BSDF, so here we only need to compute the outgoing rays for the $r_x$ and $r_y$ differentials.
For both reflection and refraction, the origin of each differential ray is easily found. The SurfaceInteraction::ComputeDifferentials() method previously computed approximations for how much the surface position changes with respect to position on the image plane, $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$. Adding these offsets to the intersection point of the main ray gives approximate origins for the new rays. If the incident ray doesn’t have differentials, then it’s impossible to compute reflected ray differentials and this step is skipped.
Finding the directions of these rays is slightly trickier. Igehy (1999) observed that if we know how much the reflected direction $\omega_i$ changes with respect to a shift of a pixel sample in the $x$ and $y$ directions on the image plane, we can use this information to approximate the direction of the offset rays. For example, the direction for the ray offset in $x$ is
$$ \omega \approx \omega_i + \frac{\partial \omega_i}{\partial x}. $$
Recall from Equation (8.5) that for a general world space surface normal $\mathrm{n}$ and outgoing direction $\omega_o$, the direction for perfect specular reflection is
$$ \omega_i = -\omega_o + 2(\omega_o \cdot \mathrm{n})\,\mathrm{n}. $$
Fortunately, the partial derivatives of this expression are easily computed:
$$ \frac{\partial \omega_i}{\partial x} = -\frac{\partial \omega_o}{\partial x} + 2\left( \frac{\partial (\omega_o \cdot \mathrm{n})}{\partial x}\,\mathrm{n} + (\omega_o \cdot \mathrm{n})\,\frac{\partial \mathrm{n}}{\partial x} \right). $$
Using the properties of the dot product, it can be shown that
$$ \frac{\partial (\omega_o \cdot \mathrm{n})}{\partial x} = \frac{\partial \omega_o}{\partial x} \cdot \mathrm{n} + \omega_o \cdot \frac{\partial \mathrm{n}}{\partial x}. $$
The value of $\partial \omega_o/\partial x$ can be found from the difference between the direction of the ray differential’s main ray and the direction of the offset ray, and all of the other necessary quantities are readily available from the SurfaceInteraction, so the implementation of this computation for the partial derivatives in $x$ and $y$ is straightforward.
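Putting the pieces together, the $x$ component of a reflected ray differential can be computed along the following lines, assuming pbrt-style quantities (wo and wi for the outgoing and reflected directions, the shading normal ns, and the derivatives stored in the SurfaceInteraction) and that the normal derivative dndx has already been formed as dndu * dudx + dndv * dvdx; this is a sketch of the approach rather than the exact code in pbrt’s integrators:

```cpp
// rd is the reflected RayDifferential whose main ray leaves p in direction wi.
rd.hasDifferentials = true;
rd.rxOrigin = p + dpdx;                       // offset origin by dp/dx
Vector3f dwodx = -ray.rxDirection - wo;       // d(wo)/dx from the incoming differential
Float dDNdx = Dot(dwodx, ns) + Dot(wo, dndx); // d(wo . n)/dx via the product rule
// wi + d(wi)/dx, with d(wi)/dx expanded from the specular reflection equation.
rd.rxDirection = wi - dwodx +
                 2.f * Vector3f(Dot(wo, ns) * dndx + dDNdx * ns);
```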
A similar process of differentiating the equation for the direction of a specularly transmitted ray, Equation (8.8), gives the equation to find the differential change in the transmitted direction. We won’t include the derivation or our implementation here, but refer the interested reader to the original paper and to the pbrt source code, respectively.