8.6 Halton Sampler
The underlying goal of the StratifiedSampler is to generate a well-distributed but randomized set of sample points, with no two sample points too close together and no excessively large regions of the sample space that have no samples. As Figure 8.24 showed, a jittered stratified pattern is better at this than an independent uniform random pattern, although its quality can suffer when samples in adjacent strata happen to be close to the shared boundary of their two strata.
This section introduces the HaltonSampler, which is based on algorithms that directly generate low-discrepancy sample points that are simultaneously well distributed over all the dimensions of the sample—not just one or two dimensions at a time, as the StratifiedSampler did.
8.6.1 Hammersley and Halton Points
Hammersley and Halton points are two closely related types of low-discrepancy points that are constructed using the radical inverse. The radical inverse is based on the fact that a positive integer value $a$ can be expressed in a base $b$ with a sequence of digits $d_m(a) \cdots d_2(a)\, d_1(a)$ uniquely determined by

$$a = \sum_{i=1}^{m} d_i(a)\, b^{\,i-1},$$

where all digits $d_i(a)$ are between $0$ and $b-1$.

The radical inverse function $\Phi_b$ in base $b$ converts a nonnegative integer $a$ to a fractional value in $[0, 1)$ by reflecting these digits about the radix point:

$$\Phi_b(a) = 0.d_1(a)\, d_2(a) \cdots d_m(a) = \sum_{i=1}^{m} d_i(a)\, b^{-i}. \tag{8.19}$$
One of the simplest low-discrepancy sequences is the van der Corput sequence, which is a 1D sequence given by the radical inverse function in base 2:

$$x_a = \Phi_2(a),$$

with $a = 0, 1, \ldots$. Note that van der Corput points are a point sequence because an arbitrary number of them can be generated in succession; the total number need not be specified in advance. (However, if the number of points is not a power of 2, then the gaps between points will be of different sizes.)
Table 8.2 shows the first few values of the van der Corput sequence. Notice how it recursively splits the intervals of the 1D line in half, generating a sample point at the center of each interval.
**Table 8.2:** The radical inverse $\Phi_2(a)$ of the first few nonnegative integers, computed in base 2.

| $a$ | Base 2 | $\Phi_2(a)$ |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1 | 0.1 = 1/2 |
| 2 | 10 | 0.01 = 1/4 |
| 3 | 11 | 0.11 = 3/4 |
| 4 | 100 | 0.001 = 1/8 |
| 5 | 101 | 0.101 = 5/8 |
The discrepancy of this sequence is

$$D_N^*(P) = O\!\left(\frac{\log N}{N}\right),$$

which is optimal.
The $n$-dimensional Halton sequence is defined using the radical inverse in base $b$, with a different base for each dimension. The bases used must all be relatively prime to each other, so a natural choice is to use the first $n$ prime numbers $(p_1, p_2, \ldots, p_n)$:

$$x_a = \left(\Phi_2(a), \Phi_3(a), \Phi_5(a), \ldots, \Phi_{p_n}(a)\right).$$
Like the van der Corput sequence, the Halton sequence can be used even if the total number of samples needed is not known in advance; all prefixes of the sequence are well distributed, so as additional samples are added to the sequence, low discrepancy will be maintained. (However, its distribution is best when the total number of samples is a product of powers of the bases, $(p_i)^{k_i}$ for integer $k_i$.)
The discrepancy of an $n$-dimensional Halton sequence is

$$D_N^*(x_a) = O\!\left(\frac{(\log N)^n}{N}\right),$$

which is asymptotically optimal.
If the number of samples $N$ is fixed, the Hammersley point set can be used, giving slightly lower discrepancy. Hammersley point sets are defined by

$$x_a = \left(\frac{a}{N}, \Phi_{b_1}(a), \Phi_{b_2}(a), \ldots, \Phi_{b_{n-1}}(a)\right),$$

again with $a = 0, 1, \ldots$, where $N$ is the total number of samples to be taken, and as before all the bases $b_i$ are relatively prime. Figure 8.27(a) shows a plot of the first 216 points of the 2D Halton sequence and Figure 8.27(b) shows a set of 256 Hammersley points. (216 Halton points were used in this figure, since they are based on the radical inverses in base 2 and 3, and $2^3 \cdot 3^3 = 216$.)
The RadicalInverse() function computes the radical inverse for a given number a using the baseIndexth prime number as the base. (It and related functions are defined in the files util/lowdiscrepancy.h and util/lowdiscrepancy.cpp.) It does so by computing the digits $d_i$ starting with $d_1$ and then computing a series of values $v_i$, where $v_1 = d_1$ and $v_i = b\, v_{i-1} + d_i$, such that

$$v_m = \sum_{i=1}^{m} d_i\, b^{\,m-i}.$$

(For example, with base 10, it would convert the value 1234 to 4321.) The value of $v_m$ can be found entirely using integer arithmetic, without accumulating any round-off error.
The final value of the radical inverse is then found by converting $v_m$ to floating-point and multiplying by $1/b^m$, where $m$ is the number of digits in the value, to get the value in Equation (8.19). The factor for this multiplication is built up in invBaseM as the digits are processed.
The value of a for the next loop iteration is found by dividing by the base; the remainder is the least significant digit of the current value of a.
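Putting these pieces together, the computation might be sketched as follows. This is an illustration rather than pbrt's implementation: pbrt's RadicalInverse() takes a baseIndex into a table of primes, whereas this hypothetical RadicalInverseSketch() takes the base directly.

```cpp
#include <cstdint>

// Sketch of a radical inverse computation, assuming the base is passed
// directly rather than looked up from a prime table via baseIndex as
// pbrt's RadicalInverse() does.
double RadicalInverseSketch(int base, uint64_t a) {
    double invBase = 1.0 / base, invBaseM = 1.0;
    uint64_t reversedDigits = 0;
    while (a) {
        // Peel off the least significant digit of a and append it to
        // reversedDigits, building the v_i series in integer arithmetic.
        uint64_t next = a / base;
        uint64_t digit = a - next * base;
        reversedDigits = reversedDigits * base + digit;
        invBaseM *= invBase;
        a = next;
    }
    // Scale by 1/b^m to move the reversed digits behind the radix point.
    return reversedDigits * invBaseM;
}
```

In base 2, for example, the indices $a = 1, 2, 3$ map to $1/2$, $1/4$, and $3/4$, matching Table 8.2.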
It will also be useful to be able to compute the inverse of the radical inverse function; the InverseRadicalInverse() function takes the reversed integer digits in a given base, corresponding to the final value of reversedDigits in the RadicalInverse() function, and returns the index a that corresponds to them. Note that in order to be able to compute the inverse correctly, the total number of digits in the original value must be provided: for example, both 1234 and 123400 are converted to 4321 after the integer-only part of the radical inverse algorithm; trailing zeros become leading zeros, which are lost.
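A sketch of this inversion follows, with the hedge that it is a simplified stand-in for pbrt's InverseRadicalInverse() (which is templated on the base) and takes the base as a runtime parameter:

```cpp
#include <cstdint>

// Sketch of inverting the digit reversal: reversedDigits holds the digits
// of the original index in reverse order, and nDigits says how many digits
// the original value had, so that trailing zeros are recovered.
uint64_t InverseRadicalInverseSketch(uint64_t reversedDigits, int base,
                                     int nDigits) {
    uint64_t index = 0;
    for (int i = 0; i < nDigits; ++i) {
        uint64_t digit = reversedDigits % base;
        reversedDigits /= base;
        index = index * base + digit;
    }
    return index;
}
```

With base 10, the reversed digits 4321 recover 1234 when nDigits is 4, but 123400 when nDigits is 6, which illustrates why the digit count must be supplied.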
8.6.2 Randomization via Scrambling
One disadvantage of the fact that the Hammersley set and Halton sequence are both fully deterministic is that it is not possible to estimate variance by computing multiple independent estimates of an integral with them. Furthermore, they both have the shortcoming that as the base increases, lower-dimensional projections of sample values can exhibit regular patterns (see Figure 8.28(a)). Because, for example, 2D projections of these points are used for sampling points on light sources, these patterns can lead to visible error in rendered images.
These issues can be addressed using techniques that randomize the points that are generated by these algorithms while still maintaining low discrepancy. A family of such techniques is based on randomizing the digits of each sample coordinate with random permutations. Over all permutations, each coordinate value is then uniformly distributed over $[0, 1)$, unlike the original point. These techniques are often referred to as scrambling.
Scrambling can be performed by defining a set of permutations $\pi_i$ for each base $b$, where each digit index $i$ has a distinct permutation of $\{0, 1, \ldots, b-1\}$ associated with it. (In the following, we will consider scrambling a single dimension of an $n$-dimensional sample point and thus drop the base $b$ from our notation, leaving it implicit. In practice, all dimensions are independently scrambled.)

Given such a set of permutations, we can define the scrambled radical inverse $\Psi_b$, where the corresponding permutation is applied to each digit:

$$\Psi_b(a) = 0.\pi_1(d_1(a))\, \pi_2(d_2(a)) \cdots = \sum_{i=1}^{\infty} \pi_i(d_i(a))\, b^{-i}.$$

Note that the same permutations $\pi_i$ must be used for generating all the sample points for a given base.
There are a few subtleties related to the permutations. First, with the regular radical inverse, computation of a sample dimension’s value can stop once the remaining digits are 0, as they will have no effect on the final result. With the scrambled radical inverse, the zero digits must continue to be processed. If they are not, then scrambling only corresponds to a permutation of the unscrambled sample values in each dimension, which does not give a uniform distribution over $[0, 1)$. (In practice, it is only necessary to consider enough digits so that any more digits make no difference to the result given the limits of floating-point precision.)
Second, it is important that each digit has its own permutation. One way to see why this is important is to consider the trailing 0 digits: if the same permutation is used for all of them, then all scrambled values will have the same digit value repeating infinitely at their end. Once again, $[0, 1)$ would not be sampled uniformly.
The choice of permutations can affect the quality of the resulting points. In the following implementation, we will use random permutations. That alone is enough to break up the structure of the points, as shown in Figure 8.28(b). However, carefully constructed deterministic permutations have been shown to reduce error for some integration problems. See the “Further Reading” section for more information.
The DigitPermutation utility class manages allocation and initialization of a set of digit permutations for a single base $b$.
All the permutations are stored in a single flat array: the first base elements of it are the permutation for the first digit, the next base elements are the second digit’s permutation, and so forth. The DigitPermutation constructor’s two tasks are to determine how many digits must be handled and then to generate a permutation for each one.
To save a bit of storage, unsigned 16-bit integers are used for the digit values. As such, the maximum base allowed is $65{,}535$. pbrt only supports up to 1,000 dimensions for Halton points, which corresponds to a maximum base of 7,919, the 1,000th prime number, comfortably below that limit.
The trailing zero-valued digits must be processed until the digit index $i$ is reached where $b^{-i}$ is small enough that if the product of $b^{-i}$ with the largest digit value is subtracted from 1 using floating-point arithmetic, the result is still 1. At this point, no subsequent digits matter, regardless of the permutation. The DigitPermutation constructor performs this check using precisely the same logic as the (soon to be described) ScrambledRadicalInverse() function does, to be sure that they are in agreement about how many digits need to be handled.
The permutations are computed using PermutationElement(), which is provided with a different seed for each digit index so that the permutations are independent.
The Permute() method takes care of indexing into the permutations array to return the permuted digit value for a given digit index and the unpermuted value of the digit.
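The flat-array layout and the corresponding indexing can be sketched as follows. This is an illustration only: it assumes a fixed digit count rather than the precision-based bound described above, and it uses a std::mt19937-driven shuffle in place of pbrt's PermutationElement()-based construction.

```cpp
#include <cstdint>
#include <vector>
#include <numeric>
#include <random>
#include <algorithm>

// Sketch of a DigitPermutation-style table. All permutations live in one
// flat array: entries [i*base, (i+1)*base) hold digit index i's permutation.
struct DigitPermutationSketch {
    int base, nDigits;
    std::vector<uint16_t> permutations;

    DigitPermutationSketch(int base, int nDigits, uint32_t seed)
        : base(base), nDigits(nDigits), permutations(base * nDigits) {
        std::mt19937 rng(seed);
        for (int i = 0; i < nDigits; ++i) {
            // Give each digit index its own independent random permutation
            // of {0, ..., base-1}.
            auto begin = permutations.begin() + i * base;
            std::iota(begin, begin + base, uint16_t(0));
            std::shuffle(begin, begin + base, rng);
        }
    }

    // Return the permuted value of digitValue for the given digit index.
    int Permute(int digitIndex, int digitValue) const {
        return permutations[digitIndex * base + digitValue];
    }
};
```

Note that each digit's slice of the array is a bijection on $\{0, \ldots, b-1\}$, which is what makes the scrambled values cover $[0, 1)$ uniformly over all permutations.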
Finally, the ComputeRadicalInversePermutations() utility function returns a vector of DigitPermutations, one for each base up to the maximum.
With DigitPermutations available, we can implement the ScrambledRadicalInverse() function. Its structure is generally the same as RadicalInverse(), though here we can see that it uses a different termination criterion, as was discussed with the implementation of <<Compute number of digits needed for base>> above.
Each digit is handled the same way as in RadicalInverse(), with the only change being that it is permuted using the provided DigitPermutation.
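The different termination criterion is made explicit in the following sketch. To keep it self-contained, the permutations are supplied as a hypothetical callable perm(digitIndex, digitValue) rather than pbrt's DigitPermutation, and a double-precision one-minus-epsilon clamp stands in for pbrt's OneMinusEpsilon.

```cpp
#include <cstdint>
#include <functional>
#include <algorithm>

// Sketch of a scrambled radical inverse. Unlike the plain radical inverse,
// the loop cannot stop when a reaches 0: trailing zero digits must also be
// permuted, so it runs until further digits cannot change the result given
// floating-point precision.
double ScrambledRadicalInverseSketch(int base, uint64_t a,
        const std::function<int(int, int)> &perm) {
    double invBase = 1.0 / base, invBaseM = 1.0;
    uint64_t reversedDigits = 0;
    int digitIndex = 0;
    // Stop once subtracting (b-1) * b^-i from 1 no longer changes it.
    while (1 - (base - 1) * invBaseM < 1) {
        uint64_t next = a / base;
        int digit = int(a - next * base);
        // Permute the current digit (including trailing zeros) and append it.
        reversedDigits = reversedDigits * base + perm(digitIndex, digit);
        invBaseM *= invBase;
        ++digitIndex;
        a = next;
    }
    // Clamp so that permuted all-ones digit strings stay strictly below 1.
    return std::min(reversedDigits * invBaseM, 0x1.fffffffffffffp-1);
}
```

With identity permutations the result matches the unscrambled radical inverse, which is a useful sanity check; a permutation that maps every base-2 zero digit to one pushes the result arbitrarily close to (but below) 1, confirming that the trailing zeros really are processed.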
An even more effective scrambling approach defines digit permutations that not only depend on the index $i$ of the current digit, but that also depend on the values of the previous digits $d_1 d_2 \cdots d_{i-1}$. This approach is known as Owen scrambling, after its inventor. Remarkably, it can be shown that for a class of smooth functions, the integration error with this scrambling technique decreases at a rate

$$O\!\left(N^{-3/2} (\log N)^{(n-1)/2}\right),$$

which is a substantial improvement over the $O(N^{-1/2})$ error rate for regular Monte Carlo.
The reason for this benefit can be understood in terms of Owen scrambling being more effective at breaking up structure in the sample values while still maintaining their low discrepancy. Its effect is easiest to see when considering the trailing zero digits that are present in all sample values: if they are all permuted with the same permutation at each digit, they will end up with the same values, which effectively means that there is some structure shared among all the samples. Owen scrambling eliminates this regularity, to the benefit of integration error. (It also benefits the earlier digits in a similar manner, though the connection is less immediately intuitive.)
The challenge with Owen scrambling is that it is infeasible to explicitly store all the permutations, as the number of them that are required grows exponentially with the number of digits. In this case, we can once again take advantage of the PermutationElement() function and its capability of permuting without explicitly representing the full permutation.
The computation for each digit is similar to the two previous radical inverse functions; only the third and fourth lines of code in the following fragment are different. At the third line, the values of the previous digits are available in reversedDigits, so hashing them to get a seed for the random permutation suffices to implement Owen scrambling. (Here we have used MixBits() rather than Hash(), as it takes a 64-bit value (which we have at hand) and is more efficient, which is important here since the hashing operation is performed for each digit.) A call to PermutationElement() then gives the corresponding permuted digit value, which is then processed as before.
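The idea is easiest to see in base 2, where permuting a digit based on the preceding digits reduces to XORing each bit with one bit of a hash of the higher-order bits. The following sketch is not pbrt's implementation: it works only in base 2 on a 32-bit fixed-point value, and a hypothetical MixSketch() finalizer (the "lowbias32" hash) stands in for pbrt's 64-bit MixBits().

```cpp
#include <cstdint>

// Hypothetical 32-bit hash finalizer standing in for pbrt's MixBits().
uint32_t MixSketch(uint32_t x) {
    x ^= x >> 16;  x *= 0x7feb352dU;
    x ^= x >> 15;  x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Owen-scramble a binary fraction stored in the 32 bits of v (MSB = first
// digit). Each bit's flip depends on a hash of the higher-order bits, so
// the per-digit "permutation" varies with the digits that precede it.
uint32_t OwenScrambleBase2(uint32_t v, uint32_t seed) {
    for (int i = 0; i < 32; ++i) {
        uint32_t high = (i == 0) ? 0u : (v >> (32 - i));  // preceding i bits
        uint32_t flip = MixSketch(high ^ MixSketch(seed + uint32_t(i))) & 1u;
        v ^= flip << (31 - i);
    }
    return v;
}
```

Because the flip applied at each level depends only on the preceding bits, every elementary interval $[j\,2^{-k}, (j+1)\,2^{-k})$ maps onto another interval of the same size: the scrambled points keep their stratification and low discrepancy while their shared structure is broken up.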
8.6.3 Halton Sampler Implementation
Given all the capabilities introduced so far in this section, it is not too hard to implement the HaltonSampler, which generates samples using the Halton sequence.
For the pixel samples, the HaltonSampler scales the domain of the first two dimensions of the Halton sequence from $[0, 1)^2$ so that it covers an integral number of pixels in each dimension. In doing so, it ensures that the pixel samples for adjacent pixels are well distributed with respect to each other. (This is a useful property that the stratified sampler does not guarantee.)
Its constructor takes the full image resolution, even if only a subwindow of it is being rendered. This allows it to always produce the same sample values at each pixel, regardless of whether only some of the pixels are being rendered. This is another place where we have tried to ensure that the renderer’s operation is deterministic: rendering a small crop window of an image when debugging does not affect the sample values generated at those pixels if the HaltonSampler is being used.
For this and the following samplers that allow the user to select a randomization strategy, it will be helpful to have an enumeration that encodes them. (Note that the FastOwen option is not supported by the HaltonSampler.)
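Such an enumeration might be sketched as follows; the name RandomizeStrategy is pbrt's, but the exact enumerator set here is an assumption based on the strategies discussed in this section.

```cpp
// Sketch of an enumeration of the randomization strategies discussed in this
// section. FastOwen is included for use by later samplers; the HaltonSampler
// does not support it.
enum class RandomizeStrategy { None, PermuteDigits, FastOwen, Owen };
```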
Some sample generation approaches are naturally pixel-based and fit in easily to the Sampler interface as it has been presented so far. For example, the StratifiedSampler can easily start generating samples in a new pixel after its StartPixelSample() method has been called—it just needs to set RNG state so that it is consistent over all the samples in the pixel.
Others, like the HaltonSampler, naturally generate consecutive samples that are spread across the entire image, visiting completely different pixels if the samples are generated in succession. (Many such samplers are effectively placing each additional sample such that it fills the largest hole in the $n$-dimensional sample space, which leads to subsequent samples landing in different pixels.) These sampling algorithms are somewhat problematic with the Sampler interface as described so far: the StartPixelSample() method must be able to set the sampler’s state so that it is able to generate samples for any requested pixel.
Table 8.3 illustrates the issue for Halton samples. The second column shows 2D Halton sample values in $[0, 1)^2$, which are then multiplied by the image resolution in each dimension to get sample positions in the image plane (here we are considering a $2 \times 3$ image for simplicity). Note that here, each pixel is visited by every sixth sample. If we are rendering an image with three samples per pixel, then to generate all the samples for the pixel $(0, 0)$, we need to generate the samples with indices 0, 6, and 12.
| Sample index | $[0,1)^2$ sample coordinates | Pixel sample coordinates |
|---|---|---|
| 0 | (0.000000, 0.000000) | (0.0000, 0.0000) |
| 1 | (0.500000, 0.333333) | (1.0000, 1.0000) |
| 2 | (0.250000, 0.666667) | (0.5000, 2.0000) |
| 3 | (0.750000, 0.111111) | (1.5000, 0.3333) |
| 4 | (0.125000, 0.444444) | (0.2500, 1.3333) |
| 5 | (0.625000, 0.777778) | (1.2500, 2.3333) |
| 6 | (0.375000, 0.222222) | (0.7500, 0.6667) |
| 7 | (0.875000, 0.555556) | (1.7500, 1.6667) |
| 8 | (0.062500, 0.888889) | (0.1250, 2.6667) |
| 9 | (0.562500, 0.037037) | (1.1250, 0.1111) |
| 10 | (0.312500, 0.370370) | (0.6250, 1.1111) |
| 11 | (0.812500, 0.703704) | (1.6250, 2.1111) |
| 12 | (0.187500, 0.148148) | (0.3750, 0.4444) |
To map the first two dimensions of samples from $[0, 1)^2$ to pixel coordinates, the HaltonSampler finds the smallest scale factor $(2^j, 3^k)$ that is larger than the lower of either the image resolution or MaxHaltonResolution in each dimension. (We will explain shortly how this specific choice of scales makes it easy to see which pixel a sample lands in.) After scaling, any samples outside the image extent are simply ignored.
For images with resolution greater than MaxHaltonResolution in one or both dimensions, a tile of Halton points is repeated across the image. This resolution limit helps maintain sufficient floating-point precision in the computed sample values.
For each dimension, baseScales holds the scale factor, $2^j$ or $3^k$, and baseExponents holds the exponents $j$ and $k$.
To see why the HaltonSampler uses this scheme to map samples to pixel coordinates, consider the effect of scaling a value computed with the radical inverse in base $b$ by a factor $b^n$. If the digits of $a$ expressed in base $b$ are $d_i(a)$, then recall that the radical inverse is the value $0.d_1(a)\,d_2(a)\cdots$ in base $b$. If we multiply this value by $b^2$, for example, we have $d_1(a)\,d_2(a).d_3(a)\cdots$. The first two digits have moved to the left of the radix point, and the fractional component of the value starts with $d_3(a)$.

This operation, scaling by $b^n$, forms the core of being able to determine which sample indices land in which pixels. Considering the first two digits in the above example, we can see that the integer component of the scaled value ranges from $0$ to $b^2 - 1$ and that as $a$ increases, its last two digits in base $b$ take on any particular value once in every $b^2$ successive sample index values.

Given a value $x$, $0 \le x \le b^2 - 1$, we can find the first value $a$ that gives the value $x$ in the integer components. By definition, the digits of $x$ in base $b$ are $d_2(x)\,d_1(x)$. Thus, if $d_1(a) = d_2(x)$ and $d_2(a) = d_1(x)$, then the scaled value of $a$’s radical inverse will have an integer component equal to $x$.
Computing the index of the first sample in a given pixel $(x, y)$, where the samples have been scaled by $(2^j, 3^k)$, involves computing the inverse radical inverse of the last $j$ digits of $x$ in base 2, which we will denote by $x_r$, and of the last $k$ digits of $y$ in base 3, $y_r$. This gives us a system of equations

$$i \bmod 2^j = x_r \quad \text{and} \quad i \bmod 3^k = y_r, \tag{8.21}$$

where the index $i$ that satisfies these equations is the index of a sample that lies within the given pixel, after scaling.
Given this insight, we can now finally implement the StartPixelSample() method. The code that solves Equation (8.21) for $i$ is in the fragment <<Compute Halton sample index for first sample in pixel p>>, which is not included here in the book; see Grünschloß et al. (2012) for details of the algorithm.
Given the index into the Halton sequence that corresponds to the first sample for the pixel, we need to find the index for the requested sample, sampleIndex. Because the bases $b = 2$ and $b = 3$ used in the HaltonSampler for pixel samples are relatively prime, it follows that if the sample values are scaled by some $(2^j, 3^k)$, then any particular pixel will be visited once every $2^j 3^k$ samples. That product is stored in sampleStride, and the final Halton index is found by adding the product of sampleStride and the current sampleIndex to the index of the pixel’s first sample.
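Since the fragment that solves Equation (8.21) is not shown, the following brute-force sketch illustrates the relationship. It simply scans one sample stride for the matching index instead of using the efficient algorithm of Grünschloß et al. (2012), and the function names are hypothetical.

```cpp
#include <cstdint>

// Reverse the low nDigits digits of v in the given base (the inverse
// radical inverse restricted to nDigits digits).
uint64_t ReverseDigitsSketch(uint64_t v, int base, int nDigits) {
    uint64_t r = 0;
    for (int i = 0; i < nDigits; ++i) {
        r = r * base + v % base;
        v /= base;
    }
    return r;
}

// Brute-force sketch of finding the index of the first Halton sample that
// lands in pixel (px, py) when the first two dimensions are scaled by
// (2^j, 3^k).
uint64_t FirstHaltonIndexSketch(int px, int py, int j, int k) {
    uint64_t twoToJ = uint64_t(1) << j, threeToK = 1;
    for (int i = 0; i < k; ++i) threeToK *= 3;
    uint64_t xr = ReverseDigitsSketch(uint64_t(px), 2, j);
    uint64_t yr = ReverseDigitsSketch(uint64_t(py), 3, k);
    // Because 2^j and 3^k are coprime, the Chinese remainder theorem
    // guarantees a solution within one sample stride, 2^j * 3^k.
    for (uint64_t i = 0; i < twoToJ * threeToK; ++i)
        if (i % twoToJ == xr && i % threeToK == yr)
            return i;
    return 0;  // unreachable
}
```

For a $2 \times 3$ image ($j = k = 1$), pixel $(0, 0)$ maps to index 0 and pixel $(1, 1)$ to index 1, consistent with the scaled sample coordinates discussed above.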
The methods that generate Halton sample dimensions are straightforward; they just increment the dimension member variable based on how many dimensions they have consumed and call the appropriate radical inverse function. In the unlikely case that the maximum supported number of dimensions have been used, the implementation wraps around to the start and then skips over the first two dimensions, which are used solely for pixel samples.
The SampleDimension() method takes care of calling the appropriate radical inverse function for the current sample in the current dimension according to the selected randomization strategy.
The Get2D() method is easily implemented using SampleDimension().
GetPixel2D() has to account for two important details in the rest of the HaltonSampler implementation. First, because the computation of the sample index, haltonIndex, in StartPixelSample() does not account for random digit permutations, those must not be included in the samples returned for the first two dimensions: a call to RadicalInverse() is always used here.
Second, because the first baseExponents[i] digits of the first two dimensions’ radical inverses are used to select which pixel is sampled, these digits must be discarded before computing the radical inverse for the first two dimensions of the sample, since the GetPixel2D() method is supposed to return the fractional offset in $[0, 1)^2$ within the pixel being sampled. This is most easily done by removing the trailing digits of the sample index before computing the radical inverse. Because the first dimension is base 2, this can efficiently be done using a shift, though a divide is necessary for base 3 in the second dimension.
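For the base-2 dimension, this digit removal followed by the radical inverse can be sketched as below, assuming j is the value of baseExponents[0]; the function name is hypothetical.

```cpp
#include <cstdint>

// Sketch of computing the in-pixel sample offset for the base-2 dimension:
// the low j digits of haltonIndex selected the pixel, so a shift discards
// them before the radical inverse of the remaining digits is taken.
double PixelOffsetBase2Sketch(uint64_t haltonIndex, int j) {
    uint64_t a = haltonIndex >> j;  // divide out the 2^j pixel-selection digits
    double invBaseM = 1.0;
    uint64_t reversedDigits = 0;
    while (a) {
        reversedDigits = (reversedDigits << 1) | (a & 1);
        invBaseM *= 0.5;
        a >>= 1;
    }
    return reversedDigits * invBaseM;
}
```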
8.6.4 Evaluation
Figure 8.29 shows plots of the power spectra for the HaltonSampler with each of the three randomization strategies. The frequency space perspective is revealing. First, note that all three strategies have low energy along the two axes: this indicates that they all do well with functions that mostly vary in only one dimension. This behavior can be understood from their construction: because each dimension uses an independent radical inverse, 1D projections of the sample points are stratified. (Consider in comparison the jittered sampling pattern’s PSD, which had a radially symmetric distribution around the origin. Given $n$ 2D stratified samples, only $\sqrt{n}$ of them are guaranteed to be stratified along either of the dimensions, whereas with the Halton sampler, all $n$ of them are.)
However, the non-randomized Halton sampler has wide variation in its PSD at higher frequencies. Ideally, those frequencies would all have roughly unit energy, but in this case, some frequencies have over a hundred times more and others a hundred times less. Results will be poor if the frequencies of the function match the ones with high power in the PSD. This issue can be seen in rendered images; Figure 8.30 compares the visual results from sampling a checkerboard texture using a Halton-based sampler to using the stratified sampler from the previous section. Note the unpleasant pattern along edges in the foreground and toward the horizon.
Returning to the power spectra in Figure 8.29, we can see that random digit permutations give a substantial improvement in the power spectrum, though there is still clear structure, with some frequencies having very low power and others still having high power. The benefit of Owen scrambling in this case is striking: it gives a uniform power spectrum at higher frequencies while maintaining low power along the axes.
It can also be illuminating to measure the performance of samplers with simple functions that can be integrated analytically. Figure 8.31 shows plots of mean squared error (MSE) for using the independent, stratified, and Halton samplers for integrating a Gaussian and a checkerboard function (shown in the plots). In this case, using a log–log scale has the effect of causing convergence rates of the form $O(N^{-c})$ to appear as lines with slope $-c$, which makes it easier to compare the asymptotic convergence of various techniques. For both functions, both stratified and Halton sampling have a higher rate of convergence than the $O(1/N)$ of independent sampling, as can be seen by the steeper slopes of their error curves. The Halton sampler does especially well with the Gaussian, achieving nearly two thousand times lower MSE than independent sampling at 4,096 samples.
Figure 8.32 shows the image of a test scene that we will use for comparing samplers. It features a moving camera, defocus blur, illumination from an environment map light source, and multiply scattered light from sources to give an integral with tens of dimensions. Figure 8.33 is a log–log plot of MSE versus number of samples for these samplers with this scene. With a more complex integrand than the simple ones in Figure 8.31, the Halton sampler does not have the enormous benefit it did there. Nevertheless, it still makes a significant improvement to error, with lower MSE than independent sampling at 4,096 samples per pixel.