10.1 Texture Sampling and Antialiasing
The sampling task from Chapter 8 was a frustrating one since the aliasing problem was known to be unsolvable from the start. The infinite frequency content of geometric edges and hard shadows guarantees aliasing in the final images, no matter how high the image sampling rate. (Our only consolation is that the visual impact of this remaining aliasing can be reduced to unobjectionable levels with a sufficient number of well-placed samples.)
Fortunately, things are not this difficult from the start for textures: either there is often a convenient analytic form of the texture function available, which makes it possible to remove excessively high frequencies before sampling it, or it is possible to be careful when evaluating the function so as not to introduce high frequencies in the first place. When this problem is carefully addressed in texture implementations, as is done through the rest of this chapter, there is usually no need for more than one sample per pixel in order to render an image without texture aliasing. (Of course, sufficiently reducing Monte Carlo noise from lighting calculations may be another matter.)
Two problems must be addressed in order to remove aliasing from texture functions:
- The sampling rate in texture space must be computed. The screen-space sampling rate is known from the image resolution and pixel sampling rate, but here we need to determine the resulting sampling rate on a surface in the scene in order to find the rate at which the texture function is being sampled.
- Given the texture sampling rate, sampling theory must be applied to guide the computation of a texture value that does not have higher frequency variation than can be represented by the sampling rate (e.g., by removing excess frequencies beyond the Nyquist limit from the texture function).
These two issues will be addressed in turn throughout the rest of this section.
10.1.1 Finding the Texture Sampling Rate
Consider an arbitrary texture function that is a function of position, T(p), defined on a surface in the scene. If we ignore the complications introduced by visibility—the possibility that another object may occlude the surface at nearby image samples or that the surface may have a limited extent on the image plane—this texture function can also be expressed as a function over points (x, y) on the image plane, T(f(x, y)), where f(x, y) is the function that maps image points to points on the surface. Thus, T(f(x, y)) gives the value of the texture function as seen at image position (x, y).
As a simple example of this idea, consider a 2D texture function T(s, t) applied to a quadrilateral that is perpendicular to the z axis and has corners at the world-space points (0, 0, 0), (1, 0, 0), (1, 1, 0), and (0, 1, 0). If an orthographic camera is placed looking down the z axis such that the quadrilateral precisely fills the image plane and if points p on the quadrilateral are mapped to 2D (s, t) texture coordinates by

s = p_x,   t = p_y,

then the relationship between (s, t) and screen (x, y) pixels is straightforward:

s = x / x_r,   t = y / y_r,

where the overall image resolution is (x_r, y_r) (Figure 10.2). Thus, given a sample spacing of one pixel in the image plane, the sample spacing in texture parameter space is (1/x_r, 1/y_r), and the texture function must remove any detail at a higher frequency than can be represented at that sampling rate.
This relationship between pixel coordinates and texture coordinates, and thus the relationship between their sampling rates, is the key bit of information that determines the maximum frequency content allowable in the texture function. As a slightly more complex example, given a triangle with (s, t) texture coordinates at its vertices and viewed with a perspective projection, it is possible to analytically find the differences in s and t across the sample points on the image plane. This approach was the basis of texture antialiasing in graphics processors before they became programmable.
For more complex scene geometry, camera projections, and mappings to texture coordinates, it is much more difficult to precisely determine the relationship between image positions and texture parameter values. Fortunately, for texture antialiasing, we do not need to be able to evaluate f(x, y) for arbitrary (x, y) but just need to find the relationship between changes in pixel sample position and the resulting change in texture sample position at a particular point on the image. This relationship is given by the partial derivatives of this function, ∂f/∂x and ∂f/∂y. For example, these can be used to find a first-order approximation to the value of f,

f(x′, y′) ≈ f(x, y) + (x′ − x) ∂f/∂x + (y′ − y) ∂f/∂y.
If these partial derivatives are changing slowly with respect to the distances (x′ − x) and (y′ − y), this is a reasonable approximation. More importantly, the values of these partial derivatives give an approximation to the change in texture sample position for a shift of one pixel in the x and y directions, respectively, and thus directly yield the texture sampling rate. For example, in the previous quadrilateral example, ∂s/∂x = 1/x_r, ∂s/∂y = 0, ∂t/∂x = 0, and ∂t/∂y = 1/y_r.
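The quadrilateral example and the first-order approximation can be sketched in a few lines of C++. The names here are hypothetical illustrations, not pbrt code:

```cpp
#include <cassert>
#include <cmath>

// Texture coordinates for the unit quad seen by the orthographic camera:
// f(x, y) maps a pixel position to (s, t).
struct STCoords { float s, t; };

STCoords QuadST(float x, float y, float xRes, float yRes) {
    return {x / xRes, y / yRes};
}

// First-order approximation to f(x', y') using the constant derivatives of
// this mapping, ds/dx = 1/xRes and dt/dy = 1/yRes (ds/dy and dt/dx are 0).
STCoords QuadSTApprox(float x, float y, float xPrime, float yPrime,
                      float xRes, float yRes) {
    STCoords st = QuadST(x, y, xRes, yRes);
    float dsdx = 1 / xRes, dtdy = 1 / yRes;
    return {st.s + (xPrime - x) * dsdx, st.t + (yPrime - y) * dtdy};
}
```

Because this mapping is linear, the first-order approximation happens to be exact; for general mappings it only holds locally around (x, y).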
The key to finding the values of these derivatives in the general case lies in values from the RayDifferential structure, which was defined in Section 3.6.1. This structure is initialized for each camera ray by the Camera::GenerateRayDifferential() method; it contains not only the ray being traced through the scene but also two additional rays, one offset horizontally one pixel sample from the camera ray and the other offset vertically by one pixel sample. All the geometric ray intersection routines use only the main camera ray for their computations; the auxiliary rays are ignored (this is easy to do because RayDifferential is a subclass of Ray).
We can use the offset rays to estimate the partial derivatives of the mapping p(x, y) from image position to world-space position and the partial derivatives of the mappings u(x, y) and v(x, y) from image position to (u, v) parametric coordinates, giving the partial derivatives of rendering-space positions ∂p/∂x and ∂p/∂y and the derivatives of parametric coordinates ∂u/∂x, ∂v/∂x, ∂u/∂y, and ∂v/∂y. In Section 10.2, we will see how these can be used to compute the screen-space derivatives of arbitrary quantities based on p or (u, v) and consequently the sampling rates of these quantities. The values of these derivatives at the intersection point are stored in the SurfaceInteraction structure.
The SurfaceInteraction::ComputeDifferentials() method computes these values. It is called by SurfaceInteraction::GetBSDF() before the Material’s GetBxDF() method is called so that these values will be available for any texture evaluation routines that are called by the material.
Ray differentials are not available for all rays traced by the system—for example, rays starting from light sources traced for photon mapping or bidirectional path tracing. Further, although we will see how to compute ray differentials after rays undergo specular reflection and transmission in Section 10.1.3, how to compute ray differentials after diffuse reflection is less clear. In those cases, as well as in the corner case where one of the differentials' directions is perpendicular to the surface normal (which leads to undefined numerical values in the computations that follow), an alternative approach based on approximating the ray differentials of a ray from the camera to the intersection point is used.
The key to estimating the derivatives is the assumption that the surface is locally flat with respect to the sampling rate at the point being shaded. This is a reasonable approximation in practice, and it is hard to do much better. Because ray tracing is a point-sampling technique, we have no additional information about the scene in between the rays we have traced. For highly curved surfaces or at silhouette edges, this approximation can break down, though this is rarely a source of noticeable error.
For this approximation, we need the plane through the point p intersected by the main ray that is tangent to the surface. This plane is given by the implicit equation

ax + by + cz + d = 0,

where a = n_x, b = n_y, c = n_z, and d = −(n · p). We can then compute the intersection points p_x and p_y between the auxiliary rays r_x and r_y and this plane (Figure 10.3). These new points give an approximation to the partial derivatives of position on the surface, ∂p/∂x and ∂p/∂y, based on forward differences:

∂p/∂x ≈ p_x − p,   ∂p/∂y ≈ p_y − p.
Because the differential rays are offset one pixel sample in each direction, there is no need to divide these differences by a Δ value, since Δ = 1.
The ray–plane intersection algorithm described in Section 6.1.2 gives the t value where a ray described by origin o and direction d intersects a plane described by ax + by + cz + d = 0:

t = (−((a, b, c) · o) − d) / ((a, b, c) · d).
To compute this t value for the two auxiliary rays, the plane's d coefficient is computed first. It is not necessary to compute the a, b, and c coefficients, since they are available in n. We can then apply the formula directly.
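A compact sketch of this computation, using hypothetical stand-ins for pbrt's vector types rather than its actual fragments, might look like the following:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Intersect a ray (o, dir) with the tangent plane through p with normal n.
// The plane's d coefficient is -(n . p); its a, b, c coefficients are the
// components of n, so the ray-plane t formula can be applied directly.
Vec3 IntersectTangentPlane(Vec3 o, Vec3 dir, Vec3 p, Vec3 n) {
    float d = -Dot(n, p);
    float t = (-Dot(n, o) - d) / Dot(n, dir);
    return o + t * dir;
}

// Forward difference for the position derivative; the auxiliary ray is
// offset by one pixel sample, so no division by a delta is needed.
Vec3 EstimateDpDx(Vec3 p, Vec3 n, Vec3 rxOrigin, Vec3 rxDir) {
    Vec3 px = IntersectTangentPlane(rxOrigin, rxDir, p, n);
    return px - p;
}
```

The computation for the y auxiliary ray is identical with the other offset ray's origin and direction.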
For cases where ray differentials are not available, we will add a method to the Camera interface that returns approximate values for ∂p/∂x and ∂p/∂y at a point on a surface in the scene. These should be a reasonable approximation to the differentials of a ray from the camera that found an intersection at the given point. Cameras' implementations of this method must return reasonable results even for points outside of their viewing volumes for which they cannot actually generate rays.
CameraBase provides an implementation of an approach to approximating these differentials that is based on the minimum of the camera ray differentials across the entire image. Because all of pbrt’s current camera implementations inherit from CameraBase, the following method takes care of all of them.
This method starts by orienting the camera so that the camera-space z axis is aligned with the vector from the camera position to the intersection point. It then uses lower bounds on the spread of rays over the image that are provided by the camera to find approximate differential rays, and finally intersects those rays with the tangent plane at the intersection point. (See Figure 10.4.)
There are a number of sources of error in this approximation. Beyond the fact that it does not account for how light was scattered at intermediate surfaces for multiple-bounce ray paths, it is based on the minimum of the camera's differentials over all rays and thus tends to underestimate the derivatives rather than overestimate them. That choice errs on the side of aliasing rather than blurring, which is preferable here since aliasing error can at least be addressed with additional pixel samples. In order to give a sense of the impact of some of these approximations, Figure 10.5 has a visualization that compares the local area estimated by those derivatives at intersections to the area computed using the actual ray differentials generated by the camera.
For the first step of the algorithm, we have an intersection point in rendering space p that we would like to transform into a coordinate system where it is along the z axis with the camera at the origin. Transforming to camera space gets us started, and an additional rotation that transforms the vector from the origin to the intersection point to be aligned with (0, 0, 1) finishes the job. The d coefficient of the plane equation can then be found by taking the dot product of the transformed point and surface normal. Because the x and y components of the transformed point are equal to 0, the dot product can be optimized to be a single multiply.
Camera implementations that inherit from CameraBase and use this method must initialize the following member variables with values that are lower bounds on each of the respective position and direction differentials over all the pixels in the image.
The main ray in this coordinate system has origin (0, 0, 0) and direction (0, 0, 1). Adding the position and direction differential vectors to those gives the origin and direction of each differential ray. Given those, the same calculation as earlier gives us the t values for the ray–plane intersections for the differential rays and thence the intersection points.
For an orthographic camera, these differentials can be computed directly. There is no change in the direction vector, and the position differentials are the same at every pixel. Their values are already computed in the OrthographicCamera constructor, so they can be used directly to initialize the base class's member variables.
All the other cameras call FindMinimumDifferentials(), which estimates these values by sampling at many points across the diagonal of the image and storing the minimum of all the differentials encountered. That function is not very interesting, so it is not included here.
Given the intersection points px and py, ∂p/∂x and ∂p/∂y can now be estimated by taking their differences with the main intersection point. To get final estimates of the partial derivatives, these vectors must be transformed back out into rendering space and, as with the initial ray differentials generated in the <<Scale camera ray differentials based on image sampling rate>> fragment, scaled to account for the actual pixel sampling rate.
A call to this method takes care of computing the ∂p/∂x and ∂p/∂y differentials in the ComputeDifferentials() method.
We now have both the partial derivatives ∂p/∂u and ∂p/∂v as well as, one way or another, ∂p/∂x and ∂p/∂y. From them, we would now like to compute ∂u/∂x, ∂v/∂x, ∂u/∂y, and ∂v/∂y. Using the chain rule, we can find that

∂p/∂x = ∂p/∂u ∂u/∂x + ∂p/∂v ∂v/∂x.   (10.1)
(∂p/∂y has a similar expression with ∂u/∂x replaced by ∂u/∂y and ∂v/∂x replaced by ∂v/∂y.)
Equation (10.1) can be written as a matrix equation in which the two matrices that include p have three rows, one for each of p's x, y, and z components:

( ∂p/∂u  ∂p/∂v ) ( ∂u/∂x  ∂v/∂x )^T = ( ∂p/∂x ).   (10.2)
This is an overdetermined linear system since there are three equations but only two unknowns, ∂u/∂x and ∂v/∂x. An effective solution approach in this case is to apply linear least squares, which says that for a linear system of the form A x = b with A and b known, the least-squares solution for x is given by

x = (A^T A)^{-1} (A^T b).
In this case, A = ( ∂p/∂u  ∂p/∂v ), x = ( ∂u/∂x  ∂v/∂x )^T, and b = ∂p/∂x.
A^T A is a 2 × 2 matrix with elements given by dot products of partial derivatives of position:

A^T A = ( (∂p/∂u · ∂p/∂u)  (∂p/∂u · ∂p/∂v) )
        ( (∂p/∂v · ∂p/∂u)  (∂p/∂v · ∂p/∂v) ).
Its inverse is

(A^T A)^{-1} = 1 / ( (∂p/∂u · ∂p/∂u)(∂p/∂v · ∂p/∂v) − (∂p/∂u · ∂p/∂v)² ) ×
    (  (∂p/∂v · ∂p/∂v)  −(∂p/∂u · ∂p/∂v) )
    ( −(∂p/∂v · ∂p/∂u)   (∂p/∂u · ∂p/∂u) ).   (10.3)
Note that in both matrices the two off-diagonal entries are equal. Thus, the fragment that computes the entries of A^T A only needs to compute three values. The inverse of the matrix determinant is computed here as well. If its value is infinite, the linear system cannot be solved; setting invDet to 0 causes the subsequently computed derivatives to be 0, which leads to point-sampled textures, the best remaining option in that case.
The A^T b portion of the solution is easily computed. For the derivatives with respect to screen-space x, we have the two-element matrix

A^T ∂p/∂x = ( ∂p/∂u · ∂p/∂x )
            ( ∂p/∂v · ∂p/∂x ).   (10.4)
The solution for screen-space y is analogous.
The solution to Equation (10.2) for each partial derivative can be found by taking the product of Equations (10.3) and (10.4). We will gloss over the algebra; its result can be directly expressed in terms of the values computed so far.
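That result can be sketched as follows, using hypothetical helper types rather than pbrt's actual fragment:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Solve the overdetermined system (dp/du dp/dv)(du/dx dv/dx)^T = dp/dx by
// linear least squares: x = (A^T A)^{-1} A^T b.
void SolveUVDerivs(Vec3 dpdu, Vec3 dpdv, Vec3 dpdx,
                   float *dudx, float *dvdx) {
    // A^T A is symmetric, so only three of its four entries are distinct.
    float ata00 = Dot(dpdu, dpdu), ata01 = Dot(dpdu, dpdv),
          ata11 = Dot(dpdv, dpdv);
    float invDet = 1 / (ata00 * ata11 - ata01 * ata01);
    if (!std::isfinite(invDet)) invDet = 0;  // degenerate: derivatives become 0
    // A^T b for the screen-space x direction.
    float atb0 = Dot(dpdu, dpdx), atb1 = Dot(dpdv, dpdx);
    // Clamp to guard against huge values at silhouettes or under highly
    // distorted parameterizations (the bound here is an arbitrary choice).
    *dudx = std::clamp((ata11 * atb0 - ata01 * atb1) * invDet, -1e8f, 1e8f);
    *dvdx = std::clamp((ata00 * atb1 - ata01 * atb0) * invDet, -1e8f, 1e8f);
}
```

The same solve with ∂p/∂y in place of ∂p/∂x gives the derivatives with respect to screen-space y.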
In certain tricky cases (e.g., with highly distorted parameterizations or at object silhouette edges), the estimated partial derivatives may be infinite or have very large magnitudes. It is worth clamping them to reasonable values in that case to prevent overflow and not-a-number values in subsequent computations that are based on them.
10.1.2 Ray Differentials at Medium Transitions
Now is a good time to take care of another detail related to ray differentials: recall from Section 9.1.5 that materials may return an unset BSDF to indicate an interface between two scattering media that does not itself scatter light. In this case, it is necessary to spawn a new ray in the same direction, past the intersection point on the surface, and we would like the effect of the ray differentials to be the same as if no scattering had occurred. This can be achieved by setting the differential origins to the points given by evaluating the ray equation at the intersection t (see Figure 10.6).
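In simplified form, with a hypothetical struct standing in for pbrt's RayDifferential, advancing the differentials through such an interface might look like this:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

struct RayDiff {
    Vec3 o, d;
    bool hasDifferentials = false;
    Vec3 rxOrigin, ryOrigin, rxDirection, ryDirection;
};

// Spawn a continuation ray past a non-scattering interface at distance t.
// Directions are unchanged; the differential origins are moved to the
// points found by evaluating each ray's equation at t, so the net effect
// is as if no scattering event had occurred.
RayDiff SpawnThroughInterface(const RayDiff &r, float t) {
    RayDiff rs = r;
    rs.o = r.o + t * r.d;
    if (r.hasDifferentials) {
        rs.rxOrigin = r.rxOrigin + t * r.rxDirection;
        rs.ryOrigin = r.ryOrigin + t * r.ryDirection;
    }
    return rs;
}
```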
10.1.3 Ray Differentials for Specular Reflection and Transmission
Given the effectiveness of ray differentials for finding filter regions for texture antialiasing for camera rays, it is useful to extend the method to make it possible to determine texture-space sampling rates for objects that are seen indirectly via specular reflection or refraction; objects seen in mirrors, for example, should not have texture aliasing, identical to the case for directly visible objects. Igehy (1999) developed an elegant solution to the problem of how to find the appropriate differential rays for specular reflection and refraction, which is the approach used in pbrt.
Figure 10.7 illustrates the difference that proper texture filtering for specular reflection and transmission can make: it shows a glass ball and a mirrored ball on a plane with a texture map containing high-frequency components. Ray differentials ensure that the images of the texture seen via reflection and refraction from the balls are free of aliasing artifacts. Here, ray differentials eliminate aliasing without excessively blurring the texture.
To compute the reflected or transmitted ray differentials at a surface intersection point, we need an approximation to the rays that would have been traced at the intersection points for the two offset rays in the ray differential that hit the surface (Figure 10.8). The new ray for the main ray is found by sampling the BSDF, so here we only need to compute the outgoing rays for the x and y differentials. This task is handled by another SurfaceInteraction::SpawnRay() variant that takes an incident ray differential as well as information about the BSDF and the type of scattering that occurred.
It is not well defined what the ray differentials should be in the case of non-specular scattering. Therefore, this method handles the two types of specular scattering only; for all other types of rays, approximate differentials will be computed at their subsequent intersection points with Camera::Approximate_dp_dxy().
A few variables will be used for both types of scattering, including the partial derivatives of the surface normal with respect to x and y on the image, ∂n/∂x and ∂n/∂y, which are computed using the chain rule.
For both reflection and transmission, the origin of each differential ray can be found using the already-computed approximations of how much the surface position changes with respect to position on the image plane, ∂p/∂x and ∂p/∂y.
Finding the directions of these rays is slightly trickier. If we know how much the reflected direction changes with respect to a shift of a pixel sample in the x and y directions on the image plane, we can use this information to approximate the direction of the offset rays. For example, the direction for the ray offset in x is

ω ≈ ω_i + ∂ω_i/∂x.
Recall from Equation (9.1) that for a surface normal n and outgoing direction ω_o the direction for perfect specular reflection is

ω_i = −ω_o + 2 (ω_o · n) n.
The partial derivatives of this expression are easily computed:

∂ω_i/∂x = −∂ω_o/∂x + 2 ( (∂(ω_o · n)/∂x) n + (ω_o · n) ∂n/∂x ).
Using the properties of the dot product, it can further be shown that

∂(ω_o · n)/∂x = (∂ω_o/∂x) · n + ω_o · (∂n/∂x).
The value of ∂ω_o/∂x has already been computed from the difference between the direction of the ray differential's main ray and the direction of the x offset ray, and all the other necessary quantities are readily available from the SurfaceInteraction.
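Putting these formulas together in code, again with hypothetical helpers paraphrasing the computation rather than reproducing pbrt's fragment:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Differential of the specularly reflected direction wi = -wo + 2 (wo . n) n:
// dwi/dx = -dwo/dx + 2 [ (d(wo . n)/dx) n + (wo . n) dn/dx ],
// with d(wo . n)/dx = dwo/dx . n + wo . dn/dx.
Vec3 ReflectDirDifferential(Vec3 wo, Vec3 dwodx, Vec3 n, Vec3 dndx) {
    float dDNdx = Dot(dwodx, n) + Dot(wo, dndx);
    return -1.f * dwodx + 2.f * (dDNdx * n + Dot(wo, n) * dndx);
}
```

For a flat surface (∂n/∂x = 0) with n = (0, 0, 1), shifting ω_o by (δ, 0, 0) shifts the reflected direction by (−δ, 0, 0), which the formula reproduces.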
A similar process of differentiating the equation for the direction of a specularly transmitted ray, Equation (9.4), gives the equation to find the differential change in the transmitted direction. pbrt computes refracted rays as

ω_t = −ω_o / η + [ (ω_o · n)/η − cos θ_t ] n,

where n is flipped if necessary to lie in the same hemisphere as ω_o, and where η is the relative index of refraction from ω_o's medium to ω_t's medium.
If we denote the term in brackets by μ, then we have ω_t = −ω_o/η + μ n. Taking the partial derivative in x, we have

∂ω_t/∂x = −(1/η) ∂ω_o/∂x + μ (∂n/∂x) + (∂μ/∂x) n.
Using some of the values found from computing specularly reflected ray differentials, we find that we already know how to compute all of these values except for ∂μ/∂x.
Before we get to the computation of μ's partial derivatives, we will start by reorienting the surface normal if necessary so that it lies on the same side of the surface as ω_o. This matches pbrt's computation of refracted ray directions.
Returning to μ and considering ∂μ/∂x, we have

∂μ/∂x = (1/η) ∂(ω_o · n)/∂x − ∂cos θ_t/∂x.
Its first term can be evaluated with already known values. For the second term, we will start with Snell's law, which gives

sin θ_o = η sin θ_t.
If we square both sides of the equation and take the partial derivative ∂/∂x, we find

−2 cos θ_o (∂cos θ_o/∂x) = −2 η² cos θ_t (∂cos θ_t/∂x).
We now can solve for ∂cos θ_t/∂x, using the fact that cos θ_o = ω_o · n:

∂cos θ_t/∂x = ((ω_o · n) / (η² cos θ_t)) ∂(ω_o · n)/∂x.
Putting it all together and simplifying, we have

∂μ/∂x = ( 1/η − (ω_o · n)/(η² cos θ_t) ) ∂(ω_o · n)/∂x.
The partial derivative in y is analogous, and the implementation follows.
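The resulting derivative of the bracketed term translates directly into code. This is a hypothetical helper for illustration, not pbrt's fragment; cos θ_t is recomputed from Snell's law:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Derivative of mu = (wo . n)/eta - cosThetaT with respect to x, given
// d(wo . n)/dx. Snell's law gives sin^2(thetaT) = (1 - (wo . n)^2)/eta^2;
// differentiating yields
// dmu/dx = (1/eta - (wo . n)/(eta^2 cosThetaT)) d(wo . n)/dx.
float DMuDx(float woDotN, float eta, float dwoDotN_dx) {
    float sin2ThetaT = (1 - woDotN * woDotN) / (eta * eta);
    float cosThetaT = std::sqrt(std::max(0.f, 1 - sin2ThetaT));
    return (1 / eta - woDotN / (eta * eta * cosThetaT)) * dwoDotN_dx;
}
```

With η = 1, μ is identically zero and the derivative correctly vanishes; a finite-difference check of μ itself is an easy sanity test for other values of η.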
If a ray undergoes many specular bounces, ray differentials sometimes drift off to have very large magnitudes, which can leave a trail of infinite and not-a-number values in their wake when they are used for texture filtering calculations. Therefore, the final fragment in this SpawnRay() method computes the squared length of all the differentials. If any is greater than a large fixed threshold, the ray differentials are discarded and the RayDifferential hasDifferentials value is set to false. The fragment that handles this, <<Squash potentially troublesome differentials>>, is simple and thus not included here.
10.1.4 Filtering Texture Functions
To eliminate texture aliasing, it is necessary to remove frequencies in texture functions that are past the Nyquist limit for the texture sampling rate. The goal is to compute, with as few approximations as possible, the result of the ideal texture resampling process, which says that in order to evaluate a texture function T at a point on the image without aliasing, we must first band-limit it, removing frequencies beyond the Nyquist limit by convolving it with the sinc filter:

T_b(x, y) = ∫∫ sinc(x′) sinc(y′) T(f(x − x′, y − y′)) dx′ dy′,
where, as in Section 10.1.1, f(x, y) maps pixel locations to points in the texture function's domain. The band-limited function in turn should then be convolved with the pixel filter g(x, y) centered at the (x, y) point on the screen at which we want to evaluate the texture function:

T_f(x, y) = ∫∫ g(x′, y′) T_b(x − x′, y − y′) dx′ dy′.
This gives the theoretically perfect value for the texture as projected onto the screen.
In practice, there are many simplifications that can be made to this process. For example, a box filter may be used for the band-limiting step, and the second step is usually ignored completely, effectively acting as if the pixel filter were a box filter, which makes it possible to do the antialiasing work completely in texture space. (The EWA filtering algorithm in Section 10.4.4 is a notable exception in that it assumes a Gaussian pixel filter.)
Assuming box filters, if, for example, the texture function is defined over (u, v) parametric coordinates, the filtering task is to average it over a region [u0, u1] × [v0, v1]:

T_avg = 1 / ((u1 − u0)(v1 − v0)) ∫_{u0}^{u1} ∫_{v0}^{v1} T(u′, v′) dv′ du′.
The extent of the filter region can be determined using the derivatives from the previous sections—for example, setting

u0 = u − max(|∂u/∂x|, |∂u/∂y|),   u1 = u + max(|∂u/∂x|, |∂u/∂y|),

and similarly for v0 and v1 to conservatively specify the box's extent.
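In code, computing such a conservative box might look like the following sketch (the names are hypothetical; pbrt's texture classes derive similar quantities internally):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct UVBox { float u0, u1, v0, v1; };

// Conservative (u, v) filter extents from the screen-space derivatives:
// the half-width in each parameter is the larger of the magnitudes of its
// derivatives with respect to x and y.
UVBox FilterRegion(float u, float v, float dudx, float dudy,
                   float dvdx, float dvdy) {
    float uw = std::max(std::abs(dudx), std::abs(dudy));
    float vw = std::max(std::abs(dvdx), std::abs(dvdy));
    return {u - uw, u + uw, v - vw, v + vw};
}
```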
The box filter is easy to use, since it can be applied analytically by computing the average of the texture function over the appropriate region. Intuitively, this is a reasonable approach to the texture filtering problem, and it can be computed directly for many texture functions. Indeed, through the rest of this chapter, we will often use a box filter to average texture function values between samples and informally use the term filter region to describe the area being averaged over. This is the most common approach when filtering texture functions.
Even the box filter, with all of its shortcomings, gives acceptable results for texture filtering in many cases. One factor that helps is the fact that a number of samples are usually taken in each pixel. Thus, even if the filtered texture values used in each one are suboptimal, once they are filtered by the pixel reconstruction filter, the end result generally does not suffer too much.
An alternative to using the box filter to filter texture functions is to use the observation that the effect of the ideal sinc filter is to let frequency components below the Nyquist limit pass through unchanged but to remove frequencies past it. Therefore, if we know the frequency content of the texture function (e.g., if it is a sum of terms, each one with known frequency content), then if we replace the high-frequency terms with their average values, we are effectively doing the work of the sinc prefilter.
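For example, if a texture is a sum of sinusoids with known frequencies, each term past the local Nyquist limit can be replaced by its average value, which for a sine is zero. A hypothetical sketch of that idea:

```cpp
#include <cassert>
#include <cmath>

// A texture built as a sum of sines with known frequencies (in cycles per
// unit of u). Terms whose frequency exceeds the Nyquist limit implied by
// the local sampling rate du/dx are replaced by their average value, 0,
// which plays the role of the sinc prefilter for this function.
float SumOfSinesTexture(float u, float dudx) {
    const float Pi = 3.14159265358979f;
    float nyquist = 1 / (2 * std::abs(dudx));  // highest representable frequency
    const float freqs[] = {1, 16, 256};
    float value = 0;
    for (float f : freqs)
        if (f <= nyquist)
            value += std::sin(2 * Pi * f * u);
    return value;
}
```

With du/dx = 0.01, the Nyquist limit is 50 cycles per unit, so the 256-cycle term is dropped while the others pass through unchanged.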
Finally, for texture functions where none of these techniques is easily applied, a final option is supersampling—the function is evaluated and filtered at multiple locations near the main evaluation point, thus increasing the sampling rate in texture space. If a box filter is used to filter these sample values, this is equivalent to averaging the value of the function. This approach can be expensive if the texture function is complex to evaluate, and as with image sampling, a very large number of samples may be needed to remove aliasing. Although this is a brute-force solution, it is still more efficient than increasing the image sampling rate, since it does not incur the cost of tracing more rays through the scene.
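A brute-force supersampling sketch that box-filters an arbitrary texture function over the estimated filter footprint might look like this (the interface is hypothetical):

```cpp
#include <cassert>
#include <cmath>

// Average tex over an n x n grid of stratified offsets covering one sample
// spacing in texture space; with a box filter, filtering is just averaging.
template <typename TexFn>
float Supersample(TexFn tex, float u, float v,
                  float dudx, float dvdy, int n) {
    float sum = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float du = ((i + 0.5f) / n - 0.5f) * dudx;
            float dv = ((j + 0.5f) / n - 0.5f) * dvdy;
            sum += tex(u + du, v + dv);
        }
    return sum / (n * n);
}
```

Because the offsets are symmetric about the evaluation point, a texture that is locally linear averages to its center value; the benefit appears for textures with frequency content near or above the sampling rate.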