8.6 Halton Sampler

The underlying goal of the StratifiedSampler is to generate a well-distributed but randomized set of sample points, with no two sample points too close together and no excessively large regions of the sample space that have no samples. As Figure 8.24 showed, a jittered stratified pattern is better at this than an independent uniform random pattern, although its quality can suffer when samples in adjacent strata happen to be close to the shared boundary of their two strata.

This section introduces the HaltonSampler, which is based on algorithms that directly generate low-discrepancy sample points that are simultaneously well distributed over all the dimensions of the sample—not just one or two dimensions at a time, as the StratifiedSampler did.

8.6.1 Hammersley and Halton Points

Hammersley and Halton points are two closely related types of low-discrepancy points that are constructed using the radical inverse. The radical inverse is based on the fact that a positive integer value a can be expressed in a base  b with a sequence of digits d Subscript m Baseline left-parenthesis a right-parenthesis ellipsis d 2 left-parenthesis a right-parenthesis d 1 left-parenthesis a right-parenthesis uniquely determined by

a equals sigma-summation Underscript i equals 1 Overscript m Endscripts d Subscript i Baseline left-parenthesis a right-parenthesis b Superscript i minus 1 Baseline comma

where all digits d Subscript i Baseline left-parenthesis a right-parenthesis are between 0 and b minus 1 .

The radical inverse function normal upper Phi Subscript b in base b converts a nonnegative integer a to a fractional value in left-bracket 0 comma 1 right-parenthesis by reflecting these digits about the radix point:

normal upper Phi Subscript b Baseline left-parenthesis a right-parenthesis equals 0 period d 1 left-parenthesis a right-parenthesis d 2 left-parenthesis a right-parenthesis ellipsis d Subscript m Baseline left-parenthesis a right-parenthesis equals sigma-summation Underscript i equals 1 Overscript m Endscripts d Subscript i Baseline left-parenthesis a right-parenthesis b Superscript negative i Baseline period

One of the simplest low-discrepancy sequences is the van der Corput sequence, which is a 1D sequence given by the radical inverse function in base 2:

x Subscript a Baseline equals normal upper Phi 2 left-parenthesis a right-parenthesis comma

with a equals 0 comma 1 comma ellipsis period Note that van der Corput points are a point sequence because an arbitrary number of them can be generated in succession; the total number need not be specified in advance. (However, if the number of points n is not a power of 2, then the gaps between points will be of different sizes.)

Table 8.2 shows the first few values of the van der Corput sequence. Notice how it recursively splits the intervals of the 1D line in half, generating a sample point at the center of each interval.

Table 8.2: The radical inverse normal upper Phi 2 left-parenthesis a right-parenthesis of the first few nonnegative integers, computed in base 2. Notice how successive values of normal upper Phi 2 left-parenthesis a right-parenthesis are not close to any of the previous values of normal upper Phi 2 left-parenthesis a right-parenthesis . As more and more values of the sequence are generated, samples are necessarily closer to previous samples, although with a minimum distance that is guaranteed to be reasonably good.

a Base 2 normal upper Phi 2 left-parenthesis a right-parenthesis
0 0 0
1 1 0.1 equals 1 slash 2
2 10 0.01 equals 1 slash 4
3 11 0.11 equals 3 slash 4
4 100 0.001 equals 1 slash 8
5 101 0.101 equals 5 slash 8
vertical-ellipsis

The discrepancy of this sequence is

upper D Subscript n Superscript asterisk Baseline left-parenthesis upper P right-parenthesis equals upper O left-parenthesis StartFraction log n Over n EndFraction right-parenthesis comma

which is optimal.

The d -dimensional Halton sequence is defined using the radical inverse base b , with a different base for each dimension. The bases used must all be relatively prime to each other, so a natural choice is to use the first d prime numbers left-parenthesis p 1 comma ellipsis comma p Subscript d Baseline ):

x Subscript a Baseline equals left-parenthesis normal upper Phi 2 left-parenthesis a right-parenthesis comma normal upper Phi 3 left-parenthesis a right-parenthesis comma normal upper Phi 5 left-parenthesis a right-parenthesis comma ellipsis comma normal upper Phi Subscript p Sub Subscript d Subscript Baseline left-parenthesis a right-parenthesis right-parenthesis period

Like the van der Corput sequence, the Halton sequence can be used even if the total number of samples needed is not known in advance; all prefixes of the sequence are well distributed, so as additional samples are added to the sequence, low discrepancy will be maintained. (However, its distribution is best when the total number of samples is the product of powers of the bases normal upper Pi p Subscript i Superscript k Super Subscript i for integer k Subscript i .)

The discrepancy of a d -dimensional Halton sequence is

upper D Subscript n Superscript asterisk Baseline left-parenthesis x Subscript a Baseline right-parenthesis equals upper O left-parenthesis StartFraction left-parenthesis log n right-parenthesis Superscript d Baseline Over n EndFraction right-parenthesis comma

which is asymptotically optimal.

If the number of samples n is fixed, the Hammersley point set can be used, giving slightly lower discrepancy. Hammersley point sets are defined by

x Subscript a Baseline equals left-parenthesis StartFraction a Over n EndFraction comma normal upper Phi Subscript b 1 Baseline left-parenthesis a right-parenthesis comma normal upper Phi Subscript b 2 Baseline left-parenthesis a right-parenthesis comma ellipsis comma normal upper Phi Subscript b Sub Subscript d minus 1 Subscript Baseline left-parenthesis a right-parenthesis right-parenthesis comma

again with a equals 0 comma 1 comma ellipsis comma where n is the total number of samples to be taken, and as before all the bases b Subscript i are relatively prime. Figure 8.27(a) shows a plot of the first 216 points of the 2D Halton sequence and Figure 8.27(b) shows a set of 256 Hammersley points. (216 Halton points were used in this figure, since they are based on the radical inverses in base 2 and 3, and 2 cubed 3 cubed equals 216 .)

Figure 8.27: The First Points of Two Low-Discrepancy Sequences in 2D. (a) Halton (216 points), (b) Hammersley (256 points).

The RadicalInverse() function computes the radical inverse for a given number a using the baseIndexth prime number as the base. (It and related functions are defined in the files util/lowdiscrepancy.h and util/lowdiscrepancy.cpp.) It does so by computing the digits d Subscript i starting with d 1 and then computing a series v Subscript i where v 1 equals d 1 , v 2 equals b d 1 plus d 2 , such that

v Subscript n Baseline equals b Superscript n minus 1 Baseline d 1 plus b Superscript n minus 2 Baseline d 2 plus midline-horizontal-ellipsis plus d Subscript n Baseline period

(For example, with base 10, it would convert the value 1234 to 4321.) The value of v Subscript n can be found entirely using integer arithmetic, without accumulating any round-off error.

The final value of the radical inverse is then found by converting to floating-point and multiplying by 1 slash b Superscript m , where m is the number of digits in the value, to get the value in Equation (8.19). The factor for this multiplication is built up in invBaseM as the digits are processed.

<<Low Discrepancy Inline Functions>>= 
Float RadicalInverse(int baseIndex, uint64_t a) { int base = Primes[baseIndex]; Float invBase = (Float)1 / (Float)base, invBaseM = 1; uint64_t reversedDigits = 0; while (a) { <<Extract least significant digit from a and update reversedDigits>> 
uint64_t next = a / base; uint64_t digit = a - next * base; reversedDigits = reversedDigits * base + digit; invBaseM *= invBase; a = next;
} return std::min(reversedDigits * invBaseM, OneMinusEpsilon); }

The value of a for the next loop iteration is found by dividing by the base; the remainder is the least significant digit of the current value of a.

<<Extract least significant digit from a and update reversedDigits>>= 
uint64_t next = a / base; uint64_t digit = a - next * base; reversedDigits = reversedDigits * base + digit; invBaseM *= invBase; a = next;

It will also be useful to be able to compute the inverse of the radical inverse function; the InverseRadicalInverse() function takes the reversed integer digits in a given base, corresponding to the final value of reversedDigits in the RadicalInverse() function, and returns the index a that corresponds to them. Note that in order to be able to compute the inverse correctly, the total number of digits in the original value must be provided: for example, both 1234 and 123400 are converted to 4321 after the integer-only part of the radical inverse algorithm; trailing zeros become leading zeros, which are lost.

<<Low Discrepancy Inline Functions>>+=  
uint64_t InverseRadicalInverse(uint64_t inverse, int base, int nDigits) { uint64_t index = 0; for (int i = 0; i < nDigits; ++i) { uint64_t digit = inverse % base; inverse /= base; index = index * base + digit; } return index; }

8.6.2 Randomization via Scrambling

One disadvantage of the fact that the Hammersley set and Halton sequence are both fully deterministic is that it is not possible to estimate variance by computing multiple independent estimates of an integral with them. Furthermore, they both have the shortcoming that as the base b increases, lower-dimensional projections of sample values can exhibit regular patterns (see Figure 8.28(a)). Because, for example, 2D projections of these points are used for sampling points on light sources, these patterns can lead to visible error in rendered images.

Figure 8.28: Plot of Halton Sample Values with and without Scrambling. (a) In higher dimensions, projections of sample values start to exhibit regular structure. Here, points from the dimensions left-parenthesis normal upper Phi 29 left-parenthesis a right-parenthesis comma normal upper Phi 31 left-parenthesis a right-parenthesis right-parenthesis are shown. (b) Scrambled sequences based on Equation (8.20) break up this structure by permuting the digits of sample indices.

These issues can be addressed using techniques that randomize the points that are generated by these algorithms while still maintaining low discrepancy. A family of such techniques are based on randomizing the digits of each sample coordinate with random permutations. Over all permutations, each coordinate value is then uniformly distributed over left-bracket 0 comma 1 right-parenthesis , unlike as with the original point. These techniques are often referred to as scrambling.

Scrambling can be performed by defining a set of permutations pi Subscript i for each base b , where each digit has a distinct permutation of StartSet 0 comma 1 comma ellipsis comma b minus 1 EndSet associated with it. (In the following, we will consider scrambling a single dimension of a d -dimensional sample point and thus drop the base b from our notation, leaving it implicit. In practice, all dimensions are independently scrambled.)

Given such a set of permutations, we can define the scrambled radical inverse where a corresponding permutation is applied to each digit:

normal upper Psi Subscript b Baseline left-parenthesis a right-parenthesis equals 0 period pi 1 left-parenthesis d 1 right-parenthesis pi 2 left-parenthesis d 2 right-parenthesis ellipsis pi Subscript m Baseline left-parenthesis d Subscript m Baseline right-parenthesis period

Note that the same permutations pi Subscript i must be used for generating all the sample points for a given base.

There are a few subtleties related to the permutations. First, with the regular radical inverse, computation of a sample dimension’s value can stop once the remaining digits d Subscript i are 0, as they will have no effect on the final result. With the scrambled radical inverse, the zero digits must continue to be processed. If they are not, then scrambling only corresponds to a permutation of the unscrambled sample values in each dimension, which does not give a uniform distribution over left-bracket 0 comma 1 right-parenthesis . (In practice, it is only necessary to consider enough digits so that any more digits make no difference to the result given the limits of floating-point precision.)

Second, it is important that each digit has its own permutation. One way to see why this is important is to consider the trailing 0 digits: if the same permutation is used for all of them, then all scrambled values will have the same digit value repeating infinitely at their end. Once again, left-bracket 0 comma 1 right-parenthesis would not be sampled uniformly.

The choice of permutations can affect the quality of the resulting points. In the following implementation, we will use random permutations. That alone is enough to break up the structure of the points, as shown in Figure 8.28(b). However, carefully constructed deterministic permutations have been shown to reduce error for some integration problems. See the “Further Reading” section for more information.

The DigitPermutation utility class manages allocation and initialization of a set of digit permutations for a single base b .

<<DigitPermutation Definition>>= 
class DigitPermutation { public: <<DigitPermutation Public Methods>> 
DigitPermutation(int base, uint32_t seed, Allocator alloc) : base(base) { <<Compute number of digits needed for base>> 
nDigits = 0; Float invBase = (Float)1 / (Float)base, invBaseM = 1; while (1 - (base - 1) * invBaseM < 1) { ++nDigits; invBaseM *= invBase; }
permutations = alloc.allocate_object<uint16_t>(nDigits * base); <<Compute random permutations for all digits>> 
for (int digitIndex = 0; digitIndex < nDigits; ++digitIndex) { uint64_t dseed = Hash(base, digitIndex, seed); for (int digitValue = 0; digitValue < base; ++digitValue) { int index = digitIndex * base + digitValue; permutations[index] = PermutationElement(digitValue, base, dseed); } }
} int Permute(int digitIndex, int digitValue) const { return permutations[digitIndex * base + digitValue]; } std::string ToString() const;
private: <<DigitPermutation Private Members>> 
int base, nDigits; uint16_t *permutations;
};

All the permutations are stored in a single flat array: the first base elements of it are the permutation for the first digit, the next base elements are the second digit’s permutation, and so forth. The DigitPermutation constructor’s two tasks are to determine how many digits must be handled and then to generate a permutation for each one.

<<DigitPermutation Public Methods>>= 
DigitPermutation(int base, uint32_t seed, Allocator alloc) : base(base) { <<Compute number of digits needed for base>> 
nDigits = 0; Float invBase = (Float)1 / (Float)base, invBaseM = 1; while (1 - (base - 1) * invBaseM < 1) { ++nDigits; invBaseM *= invBase; }
permutations = alloc.allocate_object<uint16_t>(nDigits * base); <<Compute random permutations for all digits>> 
for (int digitIndex = 0; digitIndex < nDigits; ++digitIndex) { uint64_t dseed = Hash(base, digitIndex, seed); for (int digitValue = 0; digitValue < base; ++digitValue) { int index = digitIndex * base + digitValue; permutations[index] = PermutationElement(digitValue, base, dseed); } }
}

To save a bit of storage, unsigned 16-bit integers are used for the digit values. As such, the maximum base allowed is 2 Superscript 16 . pbrt only supports up to 1,000 dimensions for Halton points, which corresponds to a maximum base of 7,919, the 1,000th prime number, which is comfortably below that limit.

<<DigitPermutation Private Members>>= 
int base, nDigits; uint16_t *permutations;

The trailing zero-valued digits must be processed until the digit d Subscript m is reached where b Superscript negative m is small enough that if the product of b Superscript negative m with the largest digit is subtracted from 1 using floating-point arithmetic, the result is still 1. At this point, no subsequent digits matter, regardless of the permutation. The DigitPermutation constructor performs this check using precisely the same logic as the (soon to be described) ScrambledRadicalInverse() function does, to be sure that they are in agreement about how many digits need to be handled.

<<Compute number of digits needed for base>>= 
nDigits = 0; Float invBase = (Float)1 / (Float)base, invBaseM = 1; while (1 - (base - 1) * invBaseM < 1) { ++nDigits; invBaseM *= invBase; }

The permutations are computed using PermutationElement(), which is provided with a different seed for each digit index so that the permutations are independent.

<<Compute random permutations for all digits>>= 
for (int digitIndex = 0; digitIndex < nDigits; ++digitIndex) { uint64_t dseed = Hash(base, digitIndex, seed); for (int digitValue = 0; digitValue < base; ++digitValue) { int index = digitIndex * base + digitValue; permutations[index] = PermutationElement(digitValue, base, dseed); } }

The Permute() method takes care of indexing into the permutations array to return the permuted digit value for a given digit index and the unpermuted value of the digit.

<<DigitPermutation Public Methods>>+= 
int Permute(int digitIndex, int digitValue) const { return permutations[digitIndex * base + digitValue]; }

Finally, the ComputeRadicalInversePermutations() utility function returns a vector of DigitPermutations, one for each base up to the maximum.

<<Low Discrepancy Function Definitions>>= 
pstd::vector<DigitPermutation> * ComputeRadicalInversePermutations(uint32_t seed, Allocator alloc) { pstd::vector<DigitPermutation> *perms = alloc.new_object<pstd::vector<DigitPermutation>>(alloc); perms->resize(PrimeTableSize); for (int i = 0; i < PrimeTableSize; ++i) (*perms)[i] = DigitPermutation(Primes[i], seed, alloc); return perms; }

With DigitPermutations available, we can implement the ScrambledRadicalInverse() function. Its structure is generally the same as RadicalInverse(), though here we can see that it uses a different termination criterion, as was discussed with the implementation of <<Compute number of digits needed for base>> above.

<<Low Discrepancy Inline Functions>>+=  
Float ScrambledRadicalInverse(int baseIndex, uint64_t a, const DigitPermutation &perm) { int base = Primes[baseIndex]; Float invBase = (Float)1 / (Float)base, invBaseM = 1; uint64_t reversedDigits = 0; int digitIndex = 0; while (1 - (base - 1) * invBaseM < 1) { <<Permute least significant digit from a and update reversedDigits>> 
uint64_t next = a / base; int digitValue = a - next * base; reversedDigits = reversedDigits * base + perm.Permute(digitIndex, digitValue); invBaseM *= invBase; ++digitIndex; a = next;
} return std::min(invBaseM * reversedDigits, OneMinusEpsilon); }

Each digit is handled the same way as in RadicalInverse(), with the only change being that it is permuted using the provided DigitPermutation.

<<Permute least significant digit from a and update reversedDigits>>= 
uint64_t next = a / base; int digitValue = a - next * base; reversedDigits = reversedDigits * base + perm.Permute(digitIndex, digitValue); invBaseM *= invBase; ++digitIndex; a = next;

An even more effective scrambling approach defines digit permutations that not only depend on the index of the current digit i , but that also depend on the values of the previous digits d 1 d 2 ellipsis d Subscript i minus 1 . This approach is known as Owen scrambling, after its inventor. Remarkably, it can be shown that for a class of smooth functions, the integration error with this scrambling technique decreases at a rate

upper O left-parenthesis n Superscript negative three-halves Baseline left-parenthesis log n right-parenthesis Superscript left-parenthesis d minus 1 right-parenthesis slash 2 Baseline right-parenthesis comma

which is a substantial improvement over the upper O left-parenthesis n Superscript negative 1 slash 2 Baseline right-parenthesis error rate for regular Monte Carlo.

The reason for this benefit can be understood in terms of Owen scrambling being more effective at breaking up structure in the sample values while still maintaining their low discrepancy. Its effect is easiest to see when considering the trailing zero digits that are present in all sample values: if they are all permuted with the same permutation at each digit, they will end up with the same values, which effectively means that there is some structure shared among all the samples. Owen scrambling eliminates this regularity, to the benefit of integration error. (It also benefits the earlier digits in a similar manner, though the connection is less immediately intuitive.)

The challenge with Owen scrambling is that it is infeasible to explicitly store all the permutations, as the number of them that are required grows exponentially with the number of digits. In this case, we can once again take advantage of the PermutationElement() function and its capability of permuting without explicitly representing the full permutation.

<<Low Discrepancy Inline Functions>>+=  
Float OwenScrambledRadicalInverse(int baseIndex, uint64_t a, uint32_t hash) { int base = Primes[baseIndex]; Float invBase = (Float)1 / (Float)base, invBaseM = 1; uint64_t reversedDigits = 0; int digitIndex = 0; while (1 - invBaseM < 1) { <<Compute Owen-scrambled digit for digitIndex>> 
uint64_t next = a / base; int digitValue = a - next * base; uint32_t digitHash = MixBits(hash ^ reversedDigits); digitValue = PermutationElement(digitValue, base, digitHash); reversedDigits = reversedDigits * base + digitValue; invBaseM *= invBase; ++digitIndex; a = next;
} return std::min(invBaseM * reversedDigits, OneMinusEpsilon); }

The computation for each digit is similar to the two previous radical inverse functions; only the third and fourth lines of code in the following fragment are different. At the third line, the values of the previous digits are available in reversedDigits, so hashing them to get a seed for the random permutation suffices to implement Owen scrambling. (Here we have used MixBits() rather than Hash(), as it takes a 64-bit value (which we have at hand) and is more efficient, which is important here since the hashing operation is performed for each digit.) A call to PermutationElement() then gives the corresponding permuted digit value, which is then processed as before.

<<Compute Owen-scrambled digit for digitIndex>>= 
uint64_t next = a / base; int digitValue = a - next * base; uint32_t digitHash = MixBits(hash ^ reversedDigits); digitValue = PermutationElement(digitValue, base, digitHash); reversedDigits = reversedDigits * base + digitValue; invBaseM *= invBase; ++digitIndex; a = next;

8.6.3 Halton Sampler Implementation

Given all the capabilities introduced so far in this section, it is not too hard to implement the HaltonSampler, which generates samples using the Halton sequence.

<<HaltonSampler Definition>>= 
class HaltonSampler { public: <<HaltonSampler Public Methods>> 
HaltonSampler(int samplesPerPixel, Point2i fullResolution, RandomizeStrategy randomize = RandomizeStrategy::PermuteDigits, int seed = 0, Allocator alloc = {}); PBRT_CPU_GPU static constexpr const char *Name() { return "HaltonSampler"; } static HaltonSampler *Create(const ParameterDictionary &parameters, Point2i fullResolution, const FileLoc *loc, Allocator alloc); int SamplesPerPixel() const { return samplesPerPixel; } PBRT_CPU_GPU RandomizeStrategy GetRandomizeStrategy() const { return randomize; } void StartPixelSample(Point2i p, int sampleIndex, int dim) { haltonIndex = 0; int sampleStride = baseScales[0] * baseScales[1]; <<Compute Halton sample index for first sample in pixel p>> 
if (sampleStride > 1) { Point2i pm(Mod(p[0], MaxHaltonResolution), Mod(p[1], MaxHaltonResolution)); for (int i = 0; i < 2; ++i) { uint64_t dimOffset = (i == 0) ? InverseRadicalInverse(pm[i], 2, baseExponents[i]) : InverseRadicalInverse(pm[i], 3, baseExponents[i]); haltonIndex += dimOffset * (sampleStride / baseScales[i]) * multInverse[i]; } haltonIndex %= sampleStride; }
haltonIndex += sampleIndex * sampleStride; dimension = std::max(2, dim); } Float Get1D() { if (dimension >= PrimeTableSize) dimension = 2; return SampleDimension(dimension++); } Point2f Get2D() { if (dimension + 1 >= PrimeTableSize) dimension = 2; int dim = dimension; dimension += 2; return {SampleDimension(dim), SampleDimension(dim + 1)}; } Point2f GetPixel2D() { return {RadicalInverse(0, haltonIndex >> baseExponents[0]), RadicalInverse(1, haltonIndex / baseScales[1])}; } Sampler Clone(Allocator alloc); std::string ToString() const;
private: <<HaltonSampler Private Methods>> 
static uint64_t multiplicativeInverse(int64_t a, int64_t n) { int64_t x, y; extendedGCD(a, n, &x, &y); return Mod(x, n); } static void extendedGCD(uint64_t a, uint64_t b, int64_t *x, int64_t *y) { if (b == 0) { *x = 1; *y = 0; return; } int64_t d = a / b, xp, yp; extendedGCD(b, a % b, &xp, &yp); *x = yp; *y = xp - (d * yp); } Float SampleDimension(int dimension) const { if (randomize == RandomizeStrategy::None) return RadicalInverse(dimension, haltonIndex); else if (randomize == RandomizeStrategy::PermuteDigits) return ScrambledRadicalInverse(dimension, haltonIndex, (*digitPermutations)[dimension]); else return OwenScrambledRadicalInverse(dimension, haltonIndex, MixBits(1 + (dimension << 4))); }
<<HaltonSampler Private Members>> 
int samplesPerPixel; RandomizeStrategy randomize; pstd::vector<DigitPermutation> *digitPermutations = nullptr; static constexpr int MaxHaltonResolution = 128; Point2i baseScales, baseExponents; int multInverse[2]; int64_t haltonIndex = 0; int dimension = 0;
};

For the pixel samples, the HaltonSampler scales the domain of the first two dimensions of the Halton sequence from left-bracket 0 comma 1 right-parenthesis squared so that it covers an integral number of pixels in each dimension. In doing so, it ensures that the pixel samples for adjacent pixels are well distributed with respect to each other. (This is a useful property that the stratified sampler does not guarantee.)

Its constructor takes the full image resolution, even if only a subwindow of it is being rendered. This allows it to always produce the same sample values at each pixel, regardless of whether only some of the pixels are being rendered. This is another place where we have tried to ensure that the renderer’s operation is deterministic: rendering a small crop window of an image when debugging does not affect the sample values generated at those pixels if the HaltonSampler is being used.

<<HaltonSampler Method Definitions>>= 
HaltonSampler::HaltonSampler(int samplesPerPixel, Point2i fullRes, RandomizeStrategy randomize, int seed, Allocator alloc) : samplesPerPixel(samplesPerPixel), randomize(randomize) { if (randomize == RandomizeStrategy::PermuteDigits) digitPermutations = ComputeRadicalInversePermutations(seed, alloc); <<Find radical inverse base scales and exponents that cover sampling area>> 
for (int i = 0; i < 2; ++i) { int base = (i == 0) ? 2 : 3; int scale = 1, exp = 0; while (scale < std::min(fullRes[i], MaxHaltonResolution)) { scale *= base; ++exp; } baseScales[i] = scale; baseExponents[i] = exp; }
<<Compute multiplicative inverses for baseScales>> 
multInverse[0] = multiplicativeInverse(baseScales[1], baseScales[0]); multInverse[1] = multiplicativeInverse(baseScales[0], baseScales[1]);
}

<<HaltonSampler Private Members>>= 
int samplesPerPixel; RandomizeStrategy randomize; pstd::vector<DigitPermutation> *digitPermutations = nullptr;

For this and the following samplers that allow the user to select a randomization strategy, it will be helpful to have an enumeration that encodes them. (Note that the FastOwen option is not supported by the HaltonSampler.)

<<RandomizeStrategy Definition>>= 
enum class RandomizeStrategy { None, PermuteDigits, FastOwen, Owen };

Some sample generation approaches are naturally pixel-based and fit in easily to the Sampler interface as it has been presented so far. For example, the StratifiedSampler can easily start generating samples in a new pixel after its StartPixelSample() method has been called—it just needs to set RNG state so that it is consistent over all the samples in the pixel.

Others, like the HaltonSampler, naturally generate consecutive samples that are spread across the entire image, visiting completely different pixels if the samples are generated in succession. (Many such samplers are effectively placing each additional sample such that it fills the largest hole in the n -dimensional sample space, which leads to subsequent samples being inside different pixels.) These sampling algorithms are somewhat problematic with the Sampler interface as described so far: the StartPixelSample() method must be able to set the sampler’s state so that it is able to generate samples for any requested pixel.

Table 8.3 illustrates the issue for Halton samples. The second column shows 2D Halton sample values in left-bracket 0 comma 1 right-parenthesis squared , which are then multiplied by the image resolution in each dimension to get sample positions in the image plane (here we are considering a 2 times 3 image for simplicity). Note that here, each pixel is visited by each sixth sample. If we are rendering an image with three samples per pixel, then to generate all the samples for the pixel left-parenthesis 0 comma 0 right-parenthesis , we need to generate the samples with indices 0, 6, and 12.

Table 8.3: The HaltonSampler generates the coordinates in the middle column for the first two dimensions, which are scaled by 2 in the first dimension and 3 in the second dimension so that they cover a 2 times 3 pixel image. To fulfill the Sampler interface, it is necessary to be able to work backward from a given pixel and sample number within that pixel to find the corresponding sample index in the full Halton sequence.

Sample index left-bracket 0 comma 1 right-parenthesis squared sample coordinates Pixel sample coordinates
0 left-parenthesis 0.000000 comma 0.000000 right-parenthesis left-parenthesis 0.000000 comma 0.000000 right-parenthesis
1 left-parenthesis 0.500000 comma 0.333333 right-parenthesis left-parenthesis 1.000000 comma 1.000000 right-parenthesis
2 left-parenthesis 0.250000 comma 0.666667 right-parenthesis left-parenthesis 0.500000 comma 2.000000 right-parenthesis
3 left-parenthesis 0.750000 comma 0.111111 right-parenthesis left-parenthesis 1.500000 comma 0.333333 right-parenthesis
4 left-parenthesis 0.125000 comma 0.444444 right-parenthesis left-parenthesis 0.250000 comma 1.333333 right-parenthesis
5 left-parenthesis 0.625000 comma 0.777778 right-parenthesis left-parenthesis 1.250000 comma 2.333333 right-parenthesis
6 left-parenthesis 0.375000 comma 0.222222 right-parenthesis left-parenthesis 0.750000 comma 0.666667 right-parenthesis
7 left-parenthesis 0.875000 comma 0.555556 right-parenthesis left-parenthesis 1.750000 comma 1.666667 right-parenthesis
8 left-parenthesis 0.062500 comma 0.888889 right-parenthesis left-parenthesis 0.125000 comma 2.666667 right-parenthesis
9 left-parenthesis 0.562500 comma 0.037037 right-parenthesis left-parenthesis 1.125000 comma 0.111111 right-parenthesis
10 left-parenthesis 0.312500 comma 0.370370 right-parenthesis left-parenthesis 0.625000 comma 1.111111 right-parenthesis
11 left-parenthesis 0.812500 comma 0.703704 right-parenthesis left-parenthesis 1.625000 comma 2.111111 right-parenthesis
12 left-parenthesis 0.187500 comma 0.148148 right-parenthesis left-parenthesis 0.375000 comma 0.444444 right-parenthesis
vertical-ellipsis

To map the first two dimensions of samples from left-bracket 0 comma 1 right-parenthesis squared to pixel coordinates, the HaltonSampler finds the smallest scale factor left-parenthesis 2 Superscript j Baseline comma 3 Superscript k Baseline right-parenthesis that is larger than the lower of either the image resolution or MaxHaltonResolution in each dimension. (We will explain shortly how this specific choice of scales makes it easy to see which pixel a sample lands in.) After scaling, any samples outside the image extent will be simply ignored.

For images with resolution greater than MaxHaltonResolution in one or both dimensions, a tile of Halton points is repeated across the image. This resolution limit helps maintain sufficient floating-point precision in the computed sample values.

<<Find radical inverse base scales and exponents that cover sampling area>>= 
for (int i = 0; i < 2; ++i) { int base = (i == 0) ? 2 : 3; int scale = 1, exp = 0; while (scale < std::min(fullRes[i], MaxHaltonResolution)) { scale *= base; ++exp; } baseScales[i] = scale; baseExponents[i] = exp; }

For each dimension, baseScales holds the scale factor, 2 Superscript j or 3 Superscript k , and baseExponents holds the exponents j and k .

<<HaltonSampler Private Members>>+=  
static constexpr int MaxHaltonResolution = 128; Point2i baseScales, baseExponents;

To see why the HaltonSampler uses this scheme to map samples to pixel coordinates, consider the effect of scaling a value computed with the radical inverse base b by a factor b Superscript m . If the digits of a expressed in base b are d Subscript i Baseline left-parenthesis a right-parenthesis , then recall that the radical inverse is the value 0 period d 1 left-parenthesis a right-parenthesis d 2 left-parenthesis a right-parenthesis ellipsis in base b . If we multiply this value by b squared , for example, we have d 1 left-parenthesis a right-parenthesis d 2 left-parenthesis a right-parenthesis period d 3 left-parenthesis a right-parenthesis ellipsis semicolon the first two digits have moved to the left of the radix point, and the fractional component of the value starts with d 3 left-parenthesis a right-parenthesis .

This operation—scaling by b Superscript m —forms the core of being able to determine which sample indices land in which pixels. Considering the first two digits in the above example, we can see that the integer component of the scaled value ranges from 0 to b squared minus 1 and that as a increases, its last two digits in base b take on any particular value after each b squared sample index values.

Given a value x , 0 less-than-or-equal-to x less-than-or-equal-to b squared minus 1 , we can find the first value a that gives the value x in the integer components. By definition, the digits of x in base b are d 2 left-parenthesis x right-parenthesis d 1 left-parenthesis x right-parenthesis . Thus, if d 1 left-parenthesis a right-parenthesis equals d 2 left-parenthesis x right-parenthesis and d 2 left-parenthesis a right-parenthesis equals d 1 left-parenthesis x right-parenthesis , then the scaled value of a ’s radical inverse will have an integer component equal to x .

Computing the index of the first sample in a given pixel left-parenthesis x comma y right-parenthesis where the samples have been scaled by left-parenthesis 2 Superscript j Baseline comma 3 Superscript k Baseline right-parenthesis involves computing the inverse radical inverse of the last j digits of x in base 2, which we will denote by x Subscript r , and of the last k digits of y in base 3, y Subscript r . This gives us a system of equations

StartLayout 1st Row 1st Column x Subscript r 2nd Column identical-to left-parenthesis i normal m normal o normal d 2 Superscript j Baseline right-parenthesis 2nd Row 1st Column y Subscript r 2nd Column identical-to left-parenthesis i normal m normal o normal d 3 Superscript k Baseline right-parenthesis comma EndLayout

where the index i that satisfies these equations is the index of a sample that lies within the given pixel, after scaling.

Given this insight, we can now finally implement the StartPixelSample() method. The code that solves Equation (8.21) for i is in the <<Compute Halton sample index for first sample in pixel p>>, which is not included here in the book; see Grünschloß et al. (2012) for details of the algorithm.

Given the index into the Halton sequence that corresponds to the first sample for the pixel, we need to find the index for the requested sample, sampleIndex. Because the bases b equals 2 and b equals 3 used in the HaltonSampler for pixel samples are relatively prime, it follows that if the sample values are scaled by some left-parenthesis 2 Superscript j Baseline comma 3 Superscript k Baseline right-parenthesis , then any particular pixel in the range left-parenthesis 0 comma 0 right-parenthesis right-arrow left-parenthesis 2 Superscript j Baseline minus 1 comma 3 Superscript k Baseline minus 1 right-parenthesis will be visited once every 2 Superscript j Baseline 3 Superscript k samples. That product is stored in sampleStride and the final Halton index is found by adding the product of that and the current sampleIndex.

<<HaltonSampler Public Methods>>= 
void StartPixelSample(Point2i p, int sampleIndex, int dim) { haltonIndex = 0; int sampleStride = baseScales[0] * baseScales[1]; <<Compute Halton sample index for first sample in pixel p>> 
if (sampleStride > 1) { Point2i pm(Mod(p[0], MaxHaltonResolution), Mod(p[1], MaxHaltonResolution)); for (int i = 0; i < 2; ++i) { uint64_t dimOffset = (i == 0) ? InverseRadicalInverse(pm[i], 2, baseExponents[i]) : InverseRadicalInverse(pm[i], 3, baseExponents[i]); haltonIndex += dimOffset * (sampleStride / baseScales[i]) * multInverse[i]; } haltonIndex %= sampleStride; }
haltonIndex += sampleIndex * sampleStride; dimension = std::max(2, dim); }

<<HaltonSampler Private Members>>+= 
int64_t haltonIndex = 0; int dimension = 0;

The methods that generate Halton sample dimensions are straightforward; they just increment the dimension member variable based on how many dimensions they have consumed and call the appropriate radical inverse function. In the unlikely case that the maximum supported number of dimensions have been used, the implementation wraps around to the start and then skips over the first two dimensions, which are used solely for pixel samples.

<<HaltonSampler Public Methods>>+=  
Float Get1D() { if (dimension >= PrimeTableSize) dimension = 2; return SampleDimension(dimension++); }

The SampleDimension() method takes care of calling the appropriate radical inverse function for the current sample in the current dimension according to the selected randomization strategy.

<<HaltonSampler Private Methods>>= 
Float SampleDimension(int dimension) const { if (randomize == RandomizeStrategy::None) return RadicalInverse(dimension, haltonIndex); else if (randomize == RandomizeStrategy::PermuteDigits) return ScrambledRadicalInverse(dimension, haltonIndex, (*digitPermutations)[dimension]); else return OwenScrambledRadicalInverse(dimension, haltonIndex, MixBits(1 + (dimension << 4))); }

The Get2D() method is easily implemented using SampleDimension().

<<HaltonSampler Public Methods>>+=  
Point2f Get2D() { if (dimension + 1 >= PrimeTableSize) dimension = 2; int dim = dimension; dimension += 2; return {SampleDimension(dim), SampleDimension(dim + 1)}; }

GetPixel2D() has to account for two important details in the rest of the HaltonSampler implementation. First, because the computation of the sample index, haltonIndex, in StartPixelSample() does not account for random digit permutations, those must not be included in the samples returned for the first two dimensions: a call to RadicalInverse() is always used here.

Second, because the first baseExponents[i] digits of the first two dimensions’ radical inverses are used to select which pixel is sampled, these digits must be discarded before computing the radical inverse for the first two dimensions of the sample, since the GetPixel2D() method is supposed to return the fractional offset in left-bracket 0 comma 1 right-parenthesis squared within the pixel being sampled. This is most easily done by removing the trailing digits of the sample index before computing the radical inverse. Because the first dimension is base 2, this can efficiently be done using a shift, though a divide is necessary for base 3 in the second dimension.

<<HaltonSampler Public Methods>>+= 

8.6.4 Evaluation

Figure 8.29 shows plots of the power spectra for the HaltonSampler with each of the three randomization strategies. The frequency space perspective is revealing. First, note that all three strategies have low energy along the two axes: this indicates that they all do well with functions that mostly vary in only one dimension. This behavior can be understood from their construction: because each dimension uses an independent radical inverse, 1D projections of the sample points are stratified. (Consider in comparison the jittered sampling pattern’s PSD, which had a radially symmetric distribution around the origin. Given n 2D stratified samples, only StartRoot n EndRoot are guaranteed to be stratified along either of the dimensions, whereas with the Halton sampler, all n are.)

Figure 8.29: Power Spectra of Points Generated by the HaltonSampler. (a) Using no randomization, with substantial variation in power at the higher frequencies. (b) Using random digit scrambling, which improves the regularity of the PSD but still contains some spikes. (c) Using Owen scrambling, which gives near unit power at the higher frequencies, making it especially effective for antialiasing and integration.

However, the non-randomized Halton sampler has wide variation in its PSD at higher frequencies. Ideally, those frequencies would all have roughly unit energy, but in this case, some frequencies have over a hundred times more and others a hundred times less. Results will be poor if the frequencies of the function match the ones with high power in the PSD. This issue can be seen in rendered images; Figure 8.30 compares the visual results from sampling a checkerboard texture using a Halton-based sampler to using the stratified sampler from the previous section. Note the unpleasant pattern along edges in the foreground and toward the horizon.

Figure 8.30: Comparison of the Stratified Sampler to a Low-Discrepancy Sampler Based on Halton Points on the Image Plane. (a) The stratified sampler with a single sample per pixel and (b) the Halton sampler with a single sample per pixel and no scrambling. Note that although the Halton pattern is able to reproduce the checker pattern farther toward the horizon than the stratified pattern, there is a regular structure to the error that is visually distracting; it does not turn aliasing into less objectionable noise as well as jittering does.

Returning to the power spectra in Figure 8.29, we can see that random digit permutations give a substantial improvement in the power spectrum, though there is still clear structure, with some frequencies having very low power and others still having high power. The benefit of Owen scrambling in this case is striking: it gives a uniform power spectrum at higher frequencies while maintaining low power along the axes.

Figure 8.31: Mean Squared Error When Integrating Two Simple 2D Functions. Both are plotted using a log–log scale so that the asymptotic convergence rate can be seen from the slopes of the lines. For the stratified sampler, only square n times n stratifications are plotted. (a) With the smooth Gaussian function shown, the Halton sampler has a higher asymptotic rate of convergence than both stratified and independent sampling. Its performance is particularly good for sample counts of 2 Superscript i Baseline 3 Superscript i for integer i . (b) With the rotated checkerboard, stratified sampling is initially no better than independent sampling since the strata are not aligned with the checks. However, once the strata start to become smaller than the checks (around 256 samples), its asymptotic rate of convergence improves.

It can also be illuminating to measure the performance of samplers with simple functions that can be integrated analytically. Figure 8.31 shows plots of mean squared error (MSE) for using the independent, stratified, and Halton samplers for integrating a Gaussian and a checkerboard function (shown in the plots). In this case, using a log–log scale has the effect of causing convergence rates of the form upper O left-parenthesis n Superscript c Baseline right-parenthesis to appear as lines with slope c , which makes it easier to compare asymptotic convergence of various techniques. For both functions, both stratified and Halton sampling have a higher rate of convergence than the upper O left-parenthesis 1 slash n right-parenthesis of independent sampling, as can be seen by the steeper slopes of their error curves. The Halton sampler does especially well with the Gaussian, achieving nearly two thousand times lower MSE than independent sampling at 4,096 samples.

Figure 8.32 shows the image of a test scene that we will use for comparing samplers. It features a moving camera, defocus blur, illumination from an environment map light source, and multiply scattered light from sources to give an integral with tens of dimensions. Figure 8.33 is a log–log plot of MSE versus number of samples for these samplers with this scene. With a more complex integrand than the simple ones in Figure 8.31, the Halton sampler does not have the enormous benefit it did there. Nevertheless, it makes a significant improvement to error—for example, MSE is 1.09 times lower than independent sampling at 4,096 samples per pixel.

Figure 8.32: Test Scene for Sampler Evaluation. This scene requires integrating a function of tens of dimensions, including defocus blur, a moving camera, and multiply scattered illumination from an environment map light source. (Dragon model courtesy of the Stanford Computer Graphics Laboratory.)

Figure 8.33: Log–Log Plot of MSE versus Number of Samples for the Scene in Figure 8.32. The Halton sampler gives consistently lower error than both the independent and stratified samplers and converges at a slightly higher rate.