13.3 Sampling Random Variables
In order to evaluate the Monte Carlo estimator in Equation (13.3), it is necessary to be able to draw random samples from the chosen probability distribution. This section will introduce the basics of this process and demonstrate it with some straightforward examples. The next two sections will introduce more complex approaches to sampling before Section 13.6 develops the approach for the general multidimensional case. In Chapters 14, 15, and 16, we’ll then see how to use these techniques to generate samples from the distributions defined by BSDFs, light sources, cameras, and scattering media.
13.3.1 The Inversion Method
The inversion method uses one or more uniform random variables and maps them to random variables from the desired distribution. To explain how this process works in general, we will start with a simple discrete example. Suppose we have a process with four possible outcomes. The probabilities of each of the four outcomes are given by $p_1$, $p_2$, $p_3$, and $p_4$, respectively, with the requirement that $\sum_{i=1}^{4} p_i = 1$. The corresponding PDF is shown in Figure 13.1.
In order to draw a sample from this distribution, we first find the CDF $P(x)$. In the continuous case, $P$ is the indefinite integral of $p$. In the discrete case, we can directly construct the CDF by stacking the bars on top of each other, starting at the left. This idea is shown in Figure 13.2. Notice that the height of the rightmost bar must be 1 because of the requirement that all probabilities sum to 1.
To draw a sample from the distribution, we then take a uniform random number $\xi$ and use it to select one of the possible outcomes using the CDF, doing so in a way that chooses a particular outcome with probability equal to the outcome's own probability. This idea is illustrated in Figure 13.3, where the events' probabilities are projected onto the vertical axis and a random variable $\xi$ selects among them. It should be clear that this draws from the correct distribution: the probability of the uniform sample hitting any particular bar is exactly equal to the height of that bar.
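As an illustration, here is a small sketch (not from the book; the function name is hypothetical) that selects among the four outcomes by accumulating the stacked probabilities until the running sum passes the uniform sample:

```cpp
// Select one of four outcomes with probabilities p[0..3] (assumed to
// sum to 1), given a uniform random sample u in [0, 1).
int SampleAmongFour(const float p[4], float u) {
    float cdf = 0;
    for (int i = 0; i < 3; ++i) {
        cdf += p[i];
        if (u < cdf) return i;  // u fell inside bar i
    }
    return 3;  // otherwise u fell inside the last bar
}
```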
In order to generalize this technique to continuous distributions, consider what happens as the number of discrete possibilities approaches infinity. The PDF from Figure 13.1 becomes a smooth curve, and the CDF from Figure 13.2 becomes its integral. The projection process is still the same, although if the function is continuous, the projection has a convenient mathematical interpretation: it represents inverting the CDF and evaluating the inverse at $\xi$. This technique is thus called the inversion method.
More precisely, we can draw a sample $X_i$ from an arbitrary PDF $p(x)$ with the following steps:
- Compute the CDF $P(x) = \int_0^x p(x')\,\mathrm{d}x'$.
- Compute the inverse $P^{-1}(x)$.
- Obtain a uniformly distributed random number $\xi$.
- Compute $X_i = P^{-1}(\xi)$.
Example: Power Distribution
As an example of how this procedure works, consider the task of drawing samples from a power distribution, $p(x) \propto x^n$. The PDF of the power distribution is
$$p(x) = c\,x^n,$$
for the constant $c$ that normalizes the PDF. The first task to tackle is to find the PDF. In most cases, this simply involves computing the value of the proportionality constant $c$, which can be found using the constraint that $\int_0^1 p(x)\,\mathrm{d}x = 1$:
$$\int_0^1 c\,x^n\,\mathrm{d}x = \frac{c}{n+1} = 1 \quad\Longrightarrow\quad c = n + 1.$$
Therefore, $p(x) = (n+1)\,x^n$. We can integrate this PDF to get the CDF:
$$P(x) = \int_0^x p(x')\,\mathrm{d}x' = x^{n+1},$$
and inversion is simple: $P^{-1}(x) = \sqrt[n+1]{x}$. Therefore, given a uniform random variable $\xi$, samples can be drawn from the power distribution as
$$X = \sqrt[n+1]{\xi}.$$
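In code, this amounts to a single call to std::pow(); the following one-line sketch (the function name is hypothetical, not a listing from the book) implements the inversion:

```cpp
#include <cmath>

// Inversion method for the power distribution p(x) = (n+1) x^n on [0, 1]:
// given a uniform sample u in [0, 1), return X = u^(1/(n+1)).
float SamplePower(float u, int n) {
    return std::pow(u, 1.0f / (n + 1));
}
```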
Another approach is to use a sampling trick that works only for the power distribution, selecting $X = \max(\xi_1, \xi_2, \ldots, \xi_{n+1})$. This random variable is distributed according to the power distribution as well. To see why, note that $\Pr\{X \le x\}$ is the probability that all the $\xi_i \le x$. But the $\xi_i$ are independent, so
$$\Pr\{X \le x\} = \prod_{i=1}^{n+1} \Pr\{\xi_i \le x\} = x^{n+1},$$
which is exactly the desired CDF. Depending on the speed of your random number generator, this technique can be faster than the inversion method for small values of $n$.
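A sketch of this alternative (again with hypothetical names), which trades the std::pow() call for $n+1$ uniform samples:

```cpp
#include <algorithm>
#include <random>

// Sample p(x) = (n+1) x^n by taking the maximum of n+1 independent
// uniform random variables; no call to std::pow() is needed.
float SamplePowerMax(std::mt19937 &rng, int n) {
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    float x = 0;
    for (int i = 0; i < n + 1; ++i)
        x = std::max(x, uniform(rng));
    return x;
}
```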
Example: Exponential Distribution
When rendering images with participating media, it is frequently useful to draw samples from a distribution $p(x) \propto e^{-ax}$. As before, the first step is to normalize this distribution so that it integrates to one. In this case, we'll assume for now that the range of values we'd like the generated samples to cover is $[0, \infty)$ rather than $[0, 1]$, so
$$\int_0^\infty c\,e^{-ax}\,\mathrm{d}x = -\frac{c}{a}\,e^{-ax}\Big|_0^\infty = \frac{c}{a} = 1.$$
Thus we know that $c = a$, and our PDF is $p(x) = a\,e^{-ax}$. Now, we integrate to find $P(x)$:
$$P(x) = \int_0^x a\,e^{-ax'}\,\mathrm{d}x' = 1 - e^{-ax}.$$
This function is easy to invert:
$$P^{-1}(x) = -\frac{\ln(1 - x)}{a},$$
and we can draw samples thus:
$$X = -\frac{\ln(1 - \xi)}{a}.$$
It may be tempting to simplify the log term from $\ln(1 - \xi)$ to $\ln \xi$, under the theory that because $\xi \in [0, 1)$, these are effectively the same and a subtraction can thus be saved. The problem with this idea is that $\xi$ may have the value 0 but never has the value 1. With the simplification, it's possible that we'd try to take the logarithm of 0, which is undefined; this danger is avoided with the first formulation. While a value of 0 may seem very unlikely, it does happen (especially in the world of floating-point arithmetic, rather than the real numbers). Some of the low-discrepancy sampling patterns introduced in Chapter 7 are particularly prone to generating the value 0.
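A minimal sketch of the resulting sampling routine (hypothetical name), using the safer $\ln(1 - \xi)$ formulation:

```cpp
#include <cmath>

// Sample p(x) = a e^{-ax} on [0, infinity) via CDF inversion:
// X = -ln(1 - u) / a. Using 1 - u rather than u avoids log(0),
// since a uniform sample u may be exactly 0 but is never 1.
float SampleExponential(float u, float a) {
    return -std::log(1 - u) / a;
}
```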
Example: Piecewise-Constant 1D Functions
An interesting exercise is to work out how to sample from 1D piecewise-constant functions (step functions). Without loss of generality, we will just consider piecewise-constant functions defined over $[0, 1]$.
Assume that the 1D function's domain is split into $N$ equal-sized pieces of size $\Delta = 1/N$. These regions start and end at points $x_i = i\,\Delta$, where $i$ ranges from 0 to $N$, inclusive. Within each region, the value of the function $f(x)$ is a constant (Figure 13.4(a)).
The value of $f(x)$ is
$$f(x) = \begin{cases} v_0 & x_0 \le x < x_1 \\ v_1 & x_1 \le x < x_2 \\ \vdots & \end{cases}$$
The integral $\int f(x)\,\mathrm{d}x$ is
$$c = \int_0^1 f(x)\,\mathrm{d}x = \sum_{i=0}^{N-1} \frac{v_i}{N} = \sum_{i=0}^{N-1} v_i\,\Delta, \tag{13.5}$$
and so it is easy to construct the PDF $p(x)$ for $f(x)$ as $f(x)/c$. By direct application of the relevant formulae, the CDF $P(x)$ is a piecewise linear function defined at the points $x_i$ by
$$\begin{aligned} P(x_0) &= 0, \\ P(x_1) &= \int_{x_0}^{x_1} p(x)\,\mathrm{d}x = P(x_0) + \frac{v_0}{cN}, \\ P(x_i) &= P(x_{i-1}) + \frac{v_{i-1}}{cN}. \end{aligned}$$
Between two points $x_i$ and $x_{i+1}$, the CDF is linearly increasing with slope $v_i/c$.
Recall that in order to sample $f(x)$ we need to invert the CDF to find the value $x$ such that
$$\xi = P(x).$$
Because the CDF is monotonically increasing, the value of $x$ must lie between the $x_i$ and $x_{i+1}$ such that $P(x_i) \le \xi$ and $\xi \le P(x_{i+1})$. Given an array of CDF values, this pair of values can be efficiently found with a binary search.
Distribution1D is a small utility class that represents a piecewise-constant 1D function’s PDF and CDF and provides methods to perform this sampling efficiently.
The Distribution1D constructor takes n values of a piecewise-constant function f. It makes its own copy of the function values, computes the function's CDF, and also stores the integral of the function, funcInt. Note that the constructor allocates n+1 Floats for the cdf array because if $f(x)$ has $n$ step values, then we need to store the value of the CDF at each of the $n+1$ values $x_i$. Storing the CDF value of 1 at the end of the array is redundant but simplifies the sampling code later.
This constructor computes the integral of $f(x)$ using Equation (13.5). It stores the result in the cdf array for now so that it doesn't need to allocate additional temporary space for it.
Now that the integral of $f(x)$ over $[0, 1]$ is stored in cdf[n], this value can be copied into funcInt and the CDF can be normalized by dividing each entry by this value.
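Put together, the constructor might look like the following sketch. This is written to be consistent with the description above rather than as a verbatim listing from the book; Float is pbrt's floating-point typedef, defined here for self-containment.

```cpp
#include <vector>

using Float = float;  // pbrt defines Float as float or double

struct Distribution1D {
    // Piecewise-constant function values, its CDF, and its integral
    std::vector<Float> func, cdf;
    Float funcInt;

    Distribution1D(const Float *f, int n) : func(f, f + n), cdf(n + 1) {
        // Compute the running integral of the step function at each
        // x_i, following Equation (13.5).
        cdf[0] = 0;
        for (int i = 1; i < n + 1; ++i)
            cdf[i] = cdf[i - 1] + func[i - 1] / n;
        // cdf[n] now holds the integral of f over [0, 1]; copy it into
        // funcInt and normalize the CDF so that cdf[n] == 1.
        funcInt = cdf[n];
        if (funcInt == 0) {
            // Degenerate all-zero function: fall back to a uniform CDF.
            for (int i = 1; i < n + 1; ++i) cdf[i] = Float(i) / Float(n);
        } else {
            for (int i = 1; i < n + 1; ++i) cdf[i] /= funcInt;
        }
    }
    int Count() const { return (int)func.size(); }
    // Sampling methods are discussed below...
};
```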
The Distribution1D::SampleContinuous() method uses the given random sample u to sample from its distribution. It returns the corresponding value $x \in [0, 1)$ and the value of the PDF $p(x)$. If the optional off parameter is not nullptr, it returns the offset into the array of function values of the largest index where the CDF was less than or equal to u. (In other words, cdf[*off] <= u < cdf[*off+1].)
Mapping u to an interval matching the above criterion is carried out using the efficient binary search implemented in FindInterval() (see Appendix A for details).
Given the pair of CDF values that straddle u, we can compute $x$. First, we determine how far u is between cdf[offset] and cdf[offset+1]: du, which is 0 if u == cdf[offset] and goes up to 1 if u == cdf[offset+1]. Because the CDF is piecewise linear, the sample value $x$ is at the same offset between $x_i$ and $x_{i+1}$ (Figure 13.4(b)).
The PDF for this sample is easily computed, since we have the function's integral in funcInt. (Note that the offset offset into the CDF array has been computed in such a way that func[offset] gives the value of the function in the CDF range that the sample landed in.)
Finally, the appropriate value of $x$ is computed and returned.
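The method body might look like the following sketch, again consistent with the description rather than verbatim; FindInterval() is the binary-search helper mentioned above.

```cpp
Float SampleContinuous(Float u, Float *pdf, int *off = nullptr) const {
    // Binary search for the largest index with cdf[offset] <= u.
    int offset = FindInterval((int)cdf.size(),
                              [&](int i) { return cdf[i] <= u; });
    if (off) *off = offset;
    // Compute du in [0, 1], the position of u within its CDF segment.
    Float du = u - cdf[offset];
    if (cdf[offset + 1] - cdf[offset] > 0)
        du /= cdf[offset + 1] - cdf[offset];
    // The PDF is the function value normalized by the integral.
    if (pdf) *pdf = func[offset] / funcInt;
    // Map (offset, du) back to the sample value x in [0, 1).
    return (offset + du) / Count();
}
```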
In a small overloading of semantics, Distribution1D can also be used for discrete 1D probability distributions where there are some number of buckets $N$, each with some weight, and we'd like to sample among the buckets with probability proportional to their relative weights. This functionality is used, for example, by some of the Integrators that compute a discrete distribution for the light sources in the scene with weights given by the lights' powers. Sampling from the discrete distribution just requires figuring out which pair of CDF values the sample value lies between; the PDF is computed as the discrete probability of sampling the corresponding bucket.
It’s also useful to be able to compute the PDF for sampling a given value from the discrete PDF.
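Sketches of the corresponding methods follow, with the same caveat that the exact signatures in the book may differ. Note that the discrete probability of bucket i is its function value divided by the sum of all the values, which equals funcInt * Count().

```cpp
int SampleDiscrete(Float u, Float *pdf = nullptr) const {
    // Find the bucket whose CDF range contains u.
    int offset = FindInterval((int)cdf.size(),
                              [&](int i) { return cdf[i] <= u; });
    // funcInt * Count() is the sum of all the bucket weights.
    if (pdf) *pdf = func[offset] / (funcInt * Count());
    return offset;
}

Float DiscretePDF(int index) const {
    // Probability of sampling a given bucket with SampleDiscrete().
    return func[index] / (funcInt * Count());
}
```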
13.3.2 The Rejection Method
For some functions $f(x)$, it may not be possible to integrate them in order to find their PDFs, or it may not be possible to analytically invert their CDFs. The rejection method is a technique for generating samples according to a function's distribution without needing to do either of these steps; it is essentially a dart-throwing approach. Assume that we want to draw samples from some such function $f(x)$ but we do have a PDF $p(x)$ that satisfies $f(x) < c\,p(x)$ for some scalar constant $c$, and suppose that we do know how to sample from $p$. The rejection method is then:
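```
loop forever:
    sample X from p's distribution
    if ξ < f(X) / (c p(X)) then
        return X
```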
This procedure repeatedly chooses a pair of random variables $(X, \xi)$. If the point $(X, \xi\,c\,p(X))$ lies under $f(X)$, then the sample $X$ is accepted. Otherwise, it is rejected and a new sample pair is chosen. This idea is illustrated in Figure 13.5. Without going into too much detail, it should be clear that the efficiency of this scheme depends on how tightly $c\,p(x)$ bounds $f(x)$. This technique works in any number of dimensions.
Rejection sampling isn't actually used in any of the Monte Carlo algorithms currently implemented in pbrt. We will normally prefer to find distributions that are similar to $f(x)$ and can be sampled directly, so that well-distributed points on $[0, 1)^n$ can be mapped to sample points that are in turn well-distributed, as will be discussed in Section 13.8. Nevertheless, rejection sampling is an important technique to be aware of, particularly when debugging Monte Carlo implementations. For example, if one suspects the presence of a bug in code that draws samples from some distribution using the inversion method, then one can replace it with a straightforward implementation based on the rejection method and see if the Monte Carlo estimator computes the same result. Of course, it's necessary to take many samples in situations like these, so that variance in the estimates doesn't mask errors.
Example: Rejection Sampling a Unit Circle
Suppose we want to select a uniformly distributed point inside a unit circle. Using the rejection method, we simply select a random position inside the circumscribed square and return it if it falls inside the circle. This process is shown in Figure 13.6.
The function RejectionSampleDisk() implements this algorithm. A similar approach will work to generate uniformly distributed samples on the inside of any complex shape as long as it has an inside–outside test.
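A sketch of RejectionSampleDisk(), consistent with the description (RNG and Point2f are pbrt types; the exact listing may differ slightly):

```cpp
Point2f RejectionSampleDisk(RNG &rng) {
    Point2f p;
    do {
        // Choose a candidate point uniformly inside [-1, 1]^2.
        p.x = 1 - 2 * rng.UniformFloat();
        p.y = 1 - 2 * rng.UniformFloat();
        // Reject and retry if the point lies outside the unit circle.
    } while (p.x * p.x + p.y * p.y > 1);
    return p;
}
```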
In general, the efficiency of rejection sampling depends on the percentage of samples that are expected to be rejected. For the problem of finding uniform points in the 2D case, this is easy to compute: it is the area of the circle divided by the area of the square, $\pi/4 \approx 78.5\%$. If the method is applied to generate samples in hyperspheres in the general $n$-dimensional case, however, the volume of an $n$-dimensional hypersphere actually goes to 0 as $n$ increases, and this approach becomes increasingly inefficient.