2.3 Sampling Using the Inversion Method
To evaluate the Monte Carlo estimator in Equation (2.7), it is necessary to be able to draw random samples from a chosen probability distribution. There are a variety of techniques for doing so, but one of the most important for rendering is the inversion method, which maps uniform samples from to a given 1D probability distribution by inverting the distribution’s CDF. (In Section 2.4.2 we will see how this approach can be applied to higher-dimensional functions by considering a sequence of 1D distributions.) When used with well-distributed samples such as those generated by the samplers that are defined in Chapter 8, the inversion method can be particularly effective. Throughout the remainder of the book, we will see the application of the inversion method to generate samples from the distributions defined by BSDFs, light sources, cameras, and scattering media.
2.3.1 Discrete Case
Equation (2.2) leads to an algorithm for sampling from a set of discrete probabilities using a uniform random variable. Suppose we have a process with four possible outcomes where the probabilities of each of the four outcomes are given by , , , and , with . The corresponding PMF is shown in Figure 2.3.
There is a direct connection between the sums in Equation (2.2) and the definition of the CDF. The discrete CDF is given by
which can be interpreted graphically by stacking the bars of the PMF on top of each other, starting at the left. This idea is shown in Figure 2.4.
The sampling operation of Equation (2.2) can be expressed as finding such that
which can be interpreted as inverting the CDF , and thus, the name of the technique. Continuing the graphical interpretation, this sampling operation can be considered in terms of projecting the events’ probabilities onto the vertical axis where they cover the range and using a random variable to select among them (see Figure 2.5). It should be clear that this draws from the correct distribution—the probability of the uniform sample hitting any particular bar is exactly equal to the height of that bar.
The SampleDiscrete() function implements this algorithm. It takes a not-necessarily normalized set of nonnegative weights, a uniform random sample u, and returns the index of one of the weights with probability proportional to its weight. The sampling operation it performs corresponds to finding such that
which corresponds to multiplying Equation (2.19) by . (Not requiring a normalized PMF is a convenience for calling code and not much more work in the function’s implementation.) Two optional parameters are provided to return the value of the PMF for the sample as well as a new uniform random sample that is derived from u.
This function is designed for the case where only a single sample needs to be generated from the weights’ distribution; if multiple samples are required, the AliasTable, which will be introduced in Section A.1, should generally be used instead: it generates samples in time after an preprocessing step, whereas SampleDiscrete() requires time for each sample generated.
The case of weights being empty is handled first so that subsequent code can assume that there is at least one weight.
The discrete probability of sampling the ith element is given by weights[i] divided by the sum of all weight values. Therefore, the function computes that sum next.
Following Equation (2.20), the uniform sample u is scaled by the sum of the weights to get a value that will be used to sample from them. Even though the provided u value should be in the range , it is possible that u * sumWeights will be equal to sumWeights due to floating-point round-off. In that rare case, up is bumped down to the next lower floating-point value so that subsequent code can assume that up < sumWeights.
We would now like to find the last offset in the weights array where the random sample up is greater than the sum of weights up to . Sampling is performed using a linear search from the start of the array, accumulating a sum of weights until the sum would be greater than .
After the while loop terminates, the randomness in the provided sample u has only been used to select an element of the array—a discrete choice. The offset of a sample between the CDF values that bracket it is itself a uniform random value that can easily be remapped to . This value is returned to the caller in uRemapped, if requested.
One might ask: why bother? It is not too difficult to generate uniform random variables, so the benefit of providing this option may seem marginal. However, for some of the high-quality sample generation algorithms in Chapter 8, it can be beneficial to reuse samples in this way rather than generating new ones—thus, this option is provided.
2.3.2 Continuous Case
In order to generalize this technique to continuous distributions, consider what happens as the number of discrete possibilities approaches infinity. The PMF from Figure 2.3 becomes a PDF, and the CDF from Figure 2.4 becomes its integral. The projection process is still the same, but it has a convenient mathematical interpretation—it represents inverting the CDF and evaluating the inverse at .
More precisely, we can draw a sample from a PDF with the following steps:
- Integrate the PDF to find the CDF .
- Obtain a uniformly distributed random number .
- Generate a sample by solving for ; in other words, find .
We will illustrate this algorithm with a simple example; see Section A.4 for its application to a number of additional functions.
Sampling a Linear Function
The function defined over linearly interpolates between at and at . Here we will assume that ; an exercise at the end of the chapter discusses the more general case.
The function’s integral is , which gives the normalization constant to define its PDF,
Integrating the PDF gives the CDF, which is the quadratic function
Inverting gives a sampling recipe
though note that in this form, the case gives an indeterminate result. The more stable formulation
computes the same result and is implemented here.
One detail to note is the std::min call in the return statement, which ensures that the returned value is within the range . Although the sampling algorithm generates values in that range given , round-off error may cause the result to be equal to 1. Because some of the code that calls the sampling routines depends on the returned values being in the specified range, the sampling routines must ensure this is so.
In addition to providing functions that sample from a distribution and compute the PDF of a sample, pbrt usually also provides functions that invert sampling operations, returning the random sample that corresponds to a value . In the 1D case, this is equivalent to evaluating the CDF.