7.5 (0, 2)-Sequence Sampler

Another approach for generating high-quality samples takes advantage of a remarkable property of certain low-discrepancy sequences that allows us to satisfy two desirable properties of samples (only one of which was satisfied with the StratifiedSampler): they generate sample vectors for a pixel’s worth of image samples such that the sample values for each pixel sample are well distributed with respect to each other, and simultaneously such that the aggregate collection of sample values for all of the pixel samples in the pixel are collectively well distributed.

This sequence uses the first two dimensions of a low-discrepancy sequence derived by Sobol prime . This sequence is a special type of low-discrepancy sequence known as a left-parenthesis 0 comma 2 right-parenthesis -sequence. left-parenthesis 0 comma 2 right-parenthesis -sequences are stratified in a very general way. For example, the first 16 samples in a left-parenthesis 0 comma 2 right-parenthesis -sequence satisfy the stratification constraint from stratified sampling in Section 7.3, meaning there is just one sample in each of the boxes of extent left-parenthesis one-fourth comma one-fourth right-parenthesis . However, they also satisfy the Latin hypercube constraint, as only one of them is in each of the boxes of extent left-parenthesis one-sixteenth comma 1 right-parenthesis and left-parenthesis 1 comma one-sixteenth right-parenthesis . Furthermore, there is only one sample in each of the boxes of extent left-parenthesis one-half comma one-eighth right-parenthesis and left-parenthesis one-eighth comma one-half right-parenthesis .

Figure 7.28 shows all of the possibilities for dividing the domain into regions where the first 16 samples of a left-parenthesis 0 comma 2 right-parenthesis -sequence satisfy the stratification properties. Each succeeding sequence of 16 samples from this pattern also satisfies these distribution properties.

Figure 7.28: A sampling pattern that has a single sample in all of the base 2 elementary intervals. It satisfies both the 4 times 4 stratification and Latin hypercube constraints as well as the other two stratification constraints shown.

In general, any sequence of length 2 Superscript l 1 plus l 2 (where l Subscript i is a nonnegative integer) from a left-parenthesis 0 comma 2 right-parenthesis -sequence satisfies this general stratification constraint. The set of elementary intervals in two dimensions, base 2, is defined as

upper E equals StartSet left-bracket StartFraction a 1 Over 2 Superscript l 1 Baseline EndFraction comma StartFraction a 1 plus 1 Over 2 Superscript l 1 Baseline EndFraction right-parenthesis times left-bracket StartFraction a 2 Over 2 Superscript l 2 Baseline EndFraction comma StartFraction a 2 plus 1 Over 2 Superscript l 2 Baseline EndFraction right-parenthesis EndSet comma

where the integer a Subscript i Baseline equals 0 comma 1 comma 2 comma 3 comma ellipsis comma 2 Superscript l Super Subscript i Superscript Baseline minus 1 . One sample from each of the first 2 Superscript l 1 plus l 2 values in the sequence will be in each of the elementary intervals. Furthermore, the same property is true for each subsequent set of 2 Superscript l 1 plus l 2 values.

To understand now how left-parenthesis 0 comma 2 right-parenthesis -sequences can be applied to generating 2D samples, consider a pixel with 2 times 2 image samples, each with an array of 4 times 4 2D samples. The first left-parenthesis 2 times 2 right-parenthesis times left-parenthesis 4 times 4 right-parenthesis equals 2 Superscript 6 values of a left-parenthesis 0 comma 2 right-parenthesis -sequence are well distributed with respect to each other according to the corresponding set of elementary intervals. Furthermore, the first 4 times 4 equals 2 Superscript 4 samples are themselves well distributed according to their corresponding elementary intervals, as are the next 2 Superscript 4 of them, and the subsequent ones, and so on. Therefore, we can use the first 16 left-parenthesis 0 comma 2 right-parenthesis -sequence samples for the samples for the 4 times 4 array for the first image sample for a pixel, then the next 16 for the next image sample, and so forth. The result is an extremely well-distributed set of sample points.

7.5.1 Sampling with Generator Matrices

The Sobol prime sequence is based on a different mechanism for generating sample points than the HaltonSampler, which used the radical inverse in various dimensions. Even with the integer divides in the radical inverse function converted to multiplies and shifts, the amount of computation needed to compute the billions of samples that can be needed for high-quality, high-resolution renderings can still be significant. Most of the computational expense comes from the cost of performing non-base-2 computation on computers that natively operate in base 2. (Consider the contrast between the <<Compute base-2 radical inverse>> fragment and the RadicalInverseSpecialized() template function.)

Given the high cost of non-base-2 operations, it’s natural to try to develop sample generation algorithms that operate entirely in base 2. One such approach that has been effective has been to use generator matrices that allow all computation to be done in the same base. Instead of using a different base in each dimension, as the Halton sampler did, a different generator matrix is used in each dimension. With well-chosen matrices for each sampled dimension, it’s possible to generate very good low-discrepancy distributions of points. For example, left-parenthesis 0 comma 2 right-parenthesis -sequences can be defined using two specific generator matrices in base 2.

To see how generator matrices are used, consider an n -digit number a in base b , where the i th digit of a is d Subscript i Baseline left-parenthesis a right-parenthesis and where we have an n times n generator matrix bold upper C . Then the corresponding sample point x Subscript a Baseline element-of left-bracket 0 comma 1 right-parenthesis is defined by

x Subscript a Baseline equals left-bracket b Superscript negative 1 Baseline b Superscript negative 2 Baseline midline-horizontal-ellipsis b Superscript negative n Baseline right-bracket Start 4 By 4 Matrix 1st Row 1st Column c Subscript 1 comma 1 Baseline 2nd Column c Subscript 1 comma 2 Baseline 3rd Column midline-horizontal-ellipsis 4th Column c Subscript 1 comma n Baseline 2nd Row 1st Column c Subscript 2 comma 1 Baseline 2nd Column down-right-diagonal-ellipsis 3rd Column Blank 4th Column c Subscript 2 comma n Baseline 3rd Row 1st Column vertical-ellipsis 2nd Column Blank 3rd Column down-right-diagonal-ellipsis 4th Column vertical-ellipsis 4th Row 1st Column c Subscript n comma 1 Baseline 2nd Column midline-horizontal-ellipsis 3rd Column midline-horizontal-ellipsis 4th Column c Subscript n comma n Baseline EndMatrix Start 4 By 1 Matrix 1st Row d 1 left-parenthesis a right-parenthesis 2nd Row d 2 left-parenthesis a right-parenthesis 3rd Row vertical-ellipsis 4th Row d Subscript n Baseline left-parenthesis a right-parenthesis EndMatrix comma
(7.9)

where all arithmetic is performed in the ring bold upper Z Subscript b (in other words, when all operations are performed modulo b ). This construction gives a total of b Superscript n points as a ranges from 0 to b Superscript n Baseline minus 1 . If the generator matrix is the identity matrix, then this definition corresponds to the regular radical inverse, base b . (It’s worth pausing to make sure you see this connection between Equations (7.7) and (7.9) before continuing.)

In this section, we will exclusively use b equals 2 and n equals 32 . While introducing a 32 times 32 matrix to the sample generation algorithm may not seem like a step toward better performance, we’ll see that in the end the sampling code can be mapped to an implementation that uses a small number of bit operations to perform this computation in an extremely efficient manner.

The first step toward high performance comes from the fact that we’re working in base 2; as such, all entries of bold upper C are either 0 or 1 and thus we can represent either each row or each column of the matrix with a single unsigned 32-bit integer. We’ll choose to represent columns of the matrix as uint32_ts; this choice leads to a very efficient algorithm for multiplying the d Subscript i column vector by bold upper C .

Now consider the task of computing the bold upper C left-bracket d Subscript i Baseline left-parenthesis a right-parenthesis right-bracket Superscript upper T matrix-vector product; using the definition of matrix-vector multiplication, we have:

Start 4 By 4 Matrix 1st Row 1st Column c Subscript 1 comma 1 Baseline 2nd Column c Subscript 1 comma 2 Baseline 3rd Column midline-horizontal-ellipsis 4th Column c Subscript 1 comma n Baseline 2nd Row 1st Column c Subscript 2 comma 1 Baseline 2nd Column down-right-diagonal-ellipsis 3rd Column Blank 4th Column c Subscript 2 comma n Baseline 3rd Row 1st Column vertical-ellipsis 2nd Column Blank 3rd Column down-right-diagonal-ellipsis 4th Column vertical-ellipsis 4th Row 1st Column c Subscript n comma 1 Baseline 2nd Column midline-horizontal-ellipsis 3rd Column midline-horizontal-ellipsis 4th Column c Subscript n comma n Baseline EndMatrix Start 4 By 1 Matrix 1st Row d 1 left-parenthesis a right-parenthesis 2nd Row d 2 left-parenthesis a right-parenthesis 3rd Row vertical-ellipsis 4th Row d Subscript n Baseline left-parenthesis a right-parenthesis EndMatrix equals d 1 Start 4 By 1 Matrix 1st Row c Subscript 1 comma 1 Baseline 2nd Row c Subscript 2 comma 1 Baseline 3rd Row vertical-ellipsis 4th Row c Subscript n comma 1 Baseline EndMatrix plus midline-horizontal-ellipsis plus d Subscript n Baseline Start 4 By 1 Matrix 1st Row c Subscript 1 comma n Baseline 2nd Row c Subscript 2 comma n Baseline 3rd Row vertical-ellipsis 4th Row c Subscript n comma n Baseline EndMatrix period

In other words, for each digit of d Subscript i that has a value of 1, the corresponding column of bold upper C should be summed. This addition can in turn be performed very efficiently in bold upper Z 2 : in that setting, addition corresponds to the exclusive OR operation. (Consider the combinations of the two possible operand values—0 and 1—and the result of adding them normal m normal o normal d 2 , and compare to the values computed by exclusive OR with the same operand values.) Thus, the multiplication bold upper C left-bracket d Subscript i Baseline left-parenthesis a right-parenthesis right-bracket Superscript upper T is just a matter of exclusive ORing together the columns i of bold upper C where d Subscript i Baseline left-parenthesis a right-parenthesis ’s bit is 1. This computation is implemented in the MultiplyGenerator() function.

<<Low Discrepancy Inline Functions>>+=  
inline uint32_t MultiplyGenerator(const uint32_t *C, uint32_t a) { uint32_t v = 0; for (int i = 0; a != 0; ++i, a >>= 1) if (a & 1) v ^= C[i]; return v; }

Going back to Equation (7.9) now, if we denote the column vector from the product v equals bold upper C left-bracket d Subscript i Baseline left-parenthesis a right-parenthesis right-bracket Superscript upper T , then consider the vector product

x Subscript a Baseline equals left-bracket 2 Superscript negative 1 Baseline 2 Superscript negative 2 Baseline midline-horizontal-ellipsis 2 Superscript negative n Baseline right-bracket Start 4 By 1 Matrix 1st Row v 1 2nd Row v 2 3rd Row vertical-ellipsis 4th Row v Subscript n Baseline EndMatrix equals sigma-summation Underscript i equals 1 Overscript 32 Endscripts 2 Superscript negative i Baseline v Subscript i Baseline period

Because the entries of v are stored in a single uint32_t, their value interpreted as a uint32_t is

v equals v 1 plus 2 v 2 plus midline-horizontal-ellipsis equals sigma-summation Underscript i equals 1 Overscript 32 Endscripts 2 Superscript i minus 1 Baseline v Subscript i Baseline period

If we were to reverse the order of the bits in the uint32_t, then we would have the value

v prime equals sigma-summation Underscript i equals 1 Overscript 32 Endscripts 2 Superscript 32 minus i Baseline v Subscript i Baseline period

This is a more useful value: if we divide this value by 2 Superscript 32 , we get Equation (7.11), which is x Subscript a , the value we’re trying to compute.

Thus, if we take the result of the MultiplyGenerator() function, reverse the order of the bits in the returned value (e.g., by using ReverseBits32()), and then divide that integer value by 2 Superscript 32 to compute a floating-point value in left-bracket 0 comma 1 right-parenthesis , we’ve computed our sample value.

To save the small cost of reversing the bits, we can equivalently reverse the bits in all of the columns of the generator matrix C before passing it to MultiplyGenerator(). We will use that convention in the following.

To make left-parenthesis 0 comma 2 right-parenthesis -sequences useful in practice, we also need to be able to generate multiple different sets of 2D sample values for each image sample, and we would like to generate different sample values for each pixel. One approach to this problem would be to use carefully chosen nonoverlapping subsequences of the left-parenthesis 0 comma 2 right-parenthesis -sequence for each pixel. Another approach is to randomly scramble the left-parenthesis 0 comma 2 right-parenthesis -sequence, giving a new left-parenthesis 0 comma 2 right-parenthesis -sequence built by applying a random permutation to the base- b digits of the values in the original sequence.

The scrambling approach we will use is due to Kollig and Keller (2002). It repeatedly partitions and shuffles the unit square left-bracket 0 comma 1 right-parenthesis squared . In each of the two dimensions, it first divides the square in half and then swaps the two halves with 50% probability. Then it splits each of the intervals left-bracket 0 comma 0.5 right-parenthesis and left-bracket 0.5 comma 1 right-parenthesis in half and randomly exchanges each of those two halves. This process continues recursively until all of the bits of the base-2 representation have been processed. This process was carefully designed so that it preserves the low-discrepancy properties of the set of points; otherwise, the advantages of the left-parenthesis 0 comma 2 right-parenthesis -sequence would be lost from the scrambling. Figure 7.29 shows an unscrambled left-parenthesis 0 comma 2 right-parenthesis -sequence and two randomly scrambled variations of it.

Figure 7.29: (a) A low-discrepancy left-parenthesis 0 comma 2 right-parenthesis -sequence-based sampling pattern and (b, c) two randomly scrambled instances of it. Random scrambling of low-discrepancy patterns is an effective way to eliminate the artifacts that would be present in images if we used the same sampling pattern in every pixel, while still preserving the low-discrepancy properties of the point set being used.

Two things make the scrambling process efficient: first, because we are scrambling two sequences that are computed in base 2, the digits d Subscript i of the sequences are all 0 or 1, and scrambling a particular digit is equivalent to exclusive-ORing it with 0 or 1. Second, the simplification is made such that at each level  l of the recursive scrambling, the same decision will be made as to whether to swap each of the 2 Superscript l minus 1 pairs of subintervals or not. The result of these two design choices is that the scrambling can be encoded as a set of bits stored in a uint32_t and can be applied to the original digits via exclusive-OR operations.

The SampleGeneratorMatrix() function pulls these pieces together to generate sample values.

<<Low Discrepancy Inline Functions>>+=  
inline Float SampleGeneratorMatrix(const uint32_t *C, uint32_t a, uint32_t scramble = 0) { return (MultiplyGenerator(C, a) ^ scramble) * 0x1p-32f; }

The SampleGeneratorMatrix() function is already fairly efficient, performing a handful of arithmetic operations each time through the loop in MultiplyGenerator() that runs for a number of iterations equal to the base-2 logarithm of the value a. Remarkably, it’s possible to do even better by changing the order in which samples are generated, enumerating them in Gray code order.

With Gray codes, successive binary values differ in only a single bit; the third column of Table 7.4 shows the first eight integers in Gray code order. Note that not only does only a single bit change between any pair of values but also that in any power-of-two-sized number of values n starting from 0, the Gray code enumerates all of the values from 0 to n minus 1 , just in a different order than usual.

Table 7.4: The First Eight Integers, in Gray Code Order. Each Gray code value g left-parenthesis n right-parenthesis differs by just a single bit from the previous one, g left-parenthesis n minus 1 right-parenthesis . The index of the bit that changes is given by the number of trailing zeros in the binary value n . Note that within any power-of-two-sized set of n values starting from 0, all of the integers between 0 and n minus 1 are represented, just in a different order than usual.

n (base 10) n (binary) g left-parenthesis n right-parenthesis Changed Bit Index
0 000 000 n/a
1 001 001 0
2 010 011 1
3 011 010 0
4 100 110 2
5 101 111 0
6 110 101 1
7 111 100 0

Computing the n th Gray code value can be done very efficiently.

<<Low Discrepancy Inline Functions>>+=  
inline uint32_t GrayCode(uint32_t n) { return (n >> 1) ^ n; }

By enumerating samples in Gray code order, we can take great advantage of the fact that only a single bit of g left-parenthesis n right-parenthesis changes between subsequent samples. Assume that we have computed the product bold upper C left-bracket d Subscript i Baseline left-parenthesis a right-parenthesis right-bracket Superscript upper T Baseline equals v for some index a : if another value a prime differs by just one bit from a , then we only need to add or subtract one column of bold upper C from v to find v prime equals bold upper C left-bracket d Subscript upper I Baseline left-parenthesis a prime right-parenthesis right-bracket Superscript upper T (recall Equation (7.10)). Even better, both addition and subtraction normal m normal o normal d 2 can be performed with exclusive OR, so it doesn’t matter which operation is needed; we only need to know which bit changed. As can be seen from Table 7.4, the index of the bit that changes going from g left-parenthesis i right-parenthesis to g left-parenthesis i plus 1 right-parenthesis is given by the number of trailing 0s in the binary representation of i plus 1 . Most CPU instruction sets can count trailing 0 bits in a single instruction.

Putting this all together, we can very efficiently generate a series of samples using a generator matrix in Gray code order. GrayCodeSample() takes a generator matrix C, a number of samples to generate n, and stores the corresponding samples in memory at the location pointed to by p.

<<Low Discrepancy Inline Functions>>+=  
inline void GrayCodeSample(const uint32_t *C, uint32_t n, uint32_t scramble, Float *p) { uint32_t v = scramble; for (uint32_t i = 0; i < n; ++i) { p[i] = v * 0x1p-32f; /* 1/2^32 */ v ^= C[CountTrailingZeros(i + 1)]; } }

The x86 assembly code for the heart of the inner loop (with the loop control logic elided) is wonderfully brief:

xorps %xmm1, %xmm1 cvtsi2ssq %rax, %xmm1 mulss %xmm0, %xmm1 movss %xmm1, (%rcx,%rdx,4) incq %rdx bsfl %edx, %eax xorl $31, %eax xorl (%rdi,%rax,4), %esi

Even if one isn’t an x86 assembly language aficionado, one can appreciate that it’s an incredibly short sequence of instructions to generate each sample value.

There is a second version of GrayCodeSample() (not included here) for generating 2D samples; it takes a generator matrix for each dimension and fills in an array of Point2f values with the samples.

7.5.2 Sampler Implementation

The ZeroTwoSequenceSampler generates samples for positions on the film plane, lens, and other 2D samples using scrambled left-parenthesis 0 comma 2 right-parenthesis -sequences, and generates 1D samples with scrambled van der Corput sequences.

<<ZeroTwoSequenceSampler Declarations>>= 
class ZeroTwoSequenceSampler : public PixelSampler { public: <<ZeroTwoSequenceSampler Public Methods>> 
ZeroTwoSequenceSampler(int64_t samplesPerPixel, int nSampledDimensions = 4); void StartPixel(const Point2i &); std::unique_ptr<Sampler> Clone(int seed); int RoundCount(int count) const { return RoundUpPow2(count); }
};

Figure 7.30 compares the result of using a left-parenthesis 0 comma 2 right-parenthesis -sequence for sampling the lens for the depth of field to using a stratified pattern.

Figure 7.30: Comparisons of the Stratified and left-parenthesis 0 comma 2 right-parenthesis -Sequence Samplers for Rendering Depth of Field. (1) A reference image of the blurred edge of an out-of-focus sphere, (2) an image rendered using the StratifiedSampler, and (3) an image using the ZeroTwoSequenceSampler. The ZeroTwoSequenceSampler’s results are better than the stratified image, although the difference is smaller than the difference between stratified and random sampling.

The constructor rounds the number of samples per pixel up to a power of 2 if necessary, since subsets of left-parenthesis 0 comma 2 right-parenthesis -sequences that are not a power of 2 in size are much less well distributed over left-bracket 0 comma 1 right-parenthesis squared than those that are.

<<ZeroTwoSequenceSampler Method Definitions>>= 
ZeroTwoSequenceSampler::ZeroTwoSequenceSampler(int64_t samplesPerPixel, int nSampledDimensions) : PixelSampler(RoundUpPow2(samplesPerPixel), nSampledDimensions) { }

<<ZeroTwoSequenceSampler Public Methods>>= 
int RoundCount(int count) const { return RoundUpPow2(count); }

Since the ZeroTwoSequenceSampler is a PixelSampler, its StartPixel() method must not only generate array sample values for the samples in the pixel but must also generate samples for a number of dimensions of non-array samples.

<<ZeroTwoSequenceSampler Method Definitions>>+= 
void ZeroTwoSequenceSampler::StartPixel(const Point2i &p) { <<Generate 1D and 2D pixel sample components using left-parenthesis 0 comma 2 right-parenthesis -sequence>> 
for (size_t i = 0; i < samples1D.size(); ++i) VanDerCorput(1, samplesPerPixel, &samples1D[i][0], rng); for (size_t i = 0; i < samples2D.size(); ++i) Sobol2D(1, samplesPerPixel, &samples2D[i][0], rng);
<<Generate 1D and 2D array samples using left-parenthesis 0 comma 2 right-parenthesis -sequence>> 
for (size_t i = 0; i < samples1DArraySizes.size(); ++i) VanDerCorput(samples1DArraySizes[i], samplesPerPixel, &sampleArray1D[i][0], rng); for (size_t i = 0; i < samples2DArraySizes.size(); ++i) Sobol2D(samples2DArraySizes[i], samplesPerPixel, &sampleArray2D[i][0], rng);
PixelSampler::StartPixel(p); }

Generating the samples for the non-array dimensions expected by the PixelSampler is a matter of filling in the appropriate vectors with the appropriate number of sample values.

<<Generate 1D and 2D pixel sample components using left-parenthesis 0 comma 2 right-parenthesis -sequence>>= 
for (size_t i = 0; i < samples1D.size(); ++i) VanDerCorput(1, samplesPerPixel, &samples1D[i][0], rng); for (size_t i = 0; i < samples2D.size(); ++i) Sobol2D(1, samplesPerPixel, &samples2D[i][0], rng);

The sample vector dimensions with array samples are similar, though with multiple sample values in each dimension.

<<Generate 1D and 2D array samples using left-parenthesis 0 comma 2 right-parenthesis -sequence>>= 
for (size_t i = 0; i < samples1DArraySizes.size(); ++i) VanDerCorput(samples1DArraySizes[i], samplesPerPixel, &sampleArray1D[i][0], rng); for (size_t i = 0; i < samples2DArraySizes.size(); ++i) Sobol2D(samples2DArraySizes[i], samplesPerPixel, &sampleArray2D[i][0], rng);

The VanDerCorput() function generates a number of scrambled 1D sample values using the Gray code-based sampling machinery. Although a specialized implementation of this function that took advantage of the structure of the identity matrix could be written, here we use the existing Gray code implementation, which is more than sufficiently efficient.

<<Low Discrepancy Inline Functions>>+=  
inline void VanDerCorput(int nSamplesPerPixelSample, int nPixelSamples, Float *samples, RNG &rng) { uint32_t scramble = rng.UniformUInt32(); <<Define CVanDerCorput Generator Matrix>> 
const uint32_t CVanDerCorput[] = { 0b10000000000000000000000000000000, 0b1000000000000000000000000000000, 0b100000000000000000000000000000, 0b10000000000000000000000000000, <<Remainder of Van Der Corput generator matrix entries>> 
0b10000, 0b100000, 0b1000000, 0b10000000, 0b100000000, 0b1000000000, 0b10000000000, 0b100000000000, 0b1000000000000, 0b10000000000000, 0b100000000000000, 0b1000000000000000, 0b10000000000000000, 0b100000000000000000, 0b1000000000000000000, 0b10000000000000000000, 0b100000000000000000000, 0b1000000000000000000000, 0b10000000000000000000000, 0b100000000000000000000000, 0b1000000000000000000000000, 0b10000000000000000000000000, 0b100000000000000000000000000, 0b1000000000000000000000000000, 0b10000000000000000000000000000, 0b100000000000000000000000000000, 0b1000000000000000000000000000000, 0b10000000000000000000000000000000,
};
int totalSamples = nSamplesPerPixelSample * nPixelSamples; GrayCodeSample(CVanDerCorput, totalSamples, scramble, samples); <<Randomly shuffle 1D sample points>> 
for (int i = 0; i < nPixelSamples; ++i) Shuffle(samples + i * nSamplesPerPixelSample, nSamplesPerPixelSample, 1, rng); Shuffle(samples, nPixelSamples, nSamplesPerPixelSample, rng);
}

The generator matrix for the 1D van der Corput sequence is just the identity matrix but with each column’s bits reversed, as per the earlier convention.

<<Define CVanDerCorput Generator Matrix>>= 
const uint32_t CVanDerCorput[] = { 0b10000000000000000000000000000000, 0b1000000000000000000000000000000, 0b100000000000000000000000000000, 0b10000000000000000000000000000, <<Remainder of Van Der Corput generator matrix entries>> 
0b10000, 0b100000, 0b1000000, 0b10000000, 0b100000000, 0b1000000000, 0b10000000000, 0b100000000000, 0b1000000000000, 0b10000000000000, 0b100000000000000, 0b1000000000000000, 0b10000000000000000, 0b100000000000000000, 0b1000000000000000000, 0b10000000000000000000, 0b100000000000000000000, 0b1000000000000000000000, 0b10000000000000000000000, 0b100000000000000000000000, 0b1000000000000000000000000, 0b10000000000000000000000000, 0b100000000000000000000000000, 0b1000000000000000000000000000, 0b10000000000000000000000000000, 0b100000000000000000000000000000, 0b1000000000000000000000000000000, 0b10000000000000000000000000000000,
};

There is a subtle implementation detail that must be accounted for when using scrambled left-parenthesis 0 comma 2 right-parenthesis -sequences. Often, integrators will use samples from more than one of the sampling patterns that the sampler creates in the process of computing the values of particular integrals. For example, they might use a sample from a 1D pattern to select one of the upper N light sources in the scene to sample illumination from and then might use a sample from a 2D pattern to select a sample point on that light source, if it is an area light.

Even if these two patterns are computed with random scrambling with different random scramble values for each one, some correlation can still remain between elements of these patterns, such that the i th element of the 1D pattern and the i th element of the 2D pattern are related. As such, in the earlier area lighting example, the distribution of sample points on each light source would not in general cover the entire light due to this correlation, leading to unusual rendering errors.

This problem can be solved easily enough by randomly shuffling the various dimensions individually after they are generated. After generating a scrambled 1D low-discrepancy sampling pattern, giving a well-distributed set of samples across all of the image samples for this pixel, this function shuffles these samples in two ways. Consider, for example, a pixel with 8 image samples, each of which has 4 1D samples for the integrator (giving a total of 32 integrator samples). First, it shuffles samples within each of the 8 groups of 4 samples, putting each set of 4 into a random order. Next, it shuffles each of the 8 groups of 4 samples as a block, with respect to the other blocks of 4 samples.

<<Randomly shuffle 1D sample points>>= 
for (int i = 0; i < nPixelSamples; ++i) Shuffle(samples + i * nSamplesPerPixelSample, nSamplesPerPixelSample, 1, rng); Shuffle(samples, nPixelSamples, nSamplesPerPixelSample, rng);

The Sobol2D() function follows a similar structure to VanDerCorput() but uses two generator matrices to generate the first two dimensions of Sobol prime points. Its implementation isn’t included here.

<<Low Discrepancy Declarations>>+=  
inline void Sobol2D(int nSamplesPerPixelSample, int nPixelSamples, Point2f *samples, RNG &rng);

Figure 7.31 shows the result of using the left-parenthesis 0 comma 2 right-parenthesis -sequence for the area lighting example scene. Note that not only does it give a visibly better image than stratified patterns, but it also does well with one light sample per image sample, unlike the stratified sampler.

Figure 7.31: When the ZeroTwoSequenceSampler is used for the area light sampling example, similar results are generated (1) with both 1 image sample and 16 light samples as well as (2) with 16 image samples and 1 light sample, thanks to the left-parenthesis 0 comma 2 right-parenthesis -sequence sampling pattern that ensures good distribution of samples over the pixel area in both cases. Compare these images to Figure 7.24, where the stratified pattern generates a much worse set of light samples when only 1 light sample is taken for each of the 16 image samples.