5.4 Film and Imaging

After the camera’s projection or lens system forms an image of the scene on the film, it is necessary to model how the film measures light to create the final image generated by the renderer. This section starts with an overview of the radiometry of how light is measured on the film and then continues with the topic of how spectral energy is converted to tristimulus colors (typically, RGB). This leads to the PixelSensor class, which models that process as well as further processing that is generally performed by cameras. After next considering how image samples on the film are accumulated into the pixels of the final image, we introduce the Film interface and then two implementations of it that put this model into practice.

5.4.1 The Camera Measurement Equation

Given a simulation of the process of real image formation, it is also worthwhile to more carefully define the radiometry of the measurement made by a film or a camera sensor. Rays from the rear of the lens to the film carry radiance from the scene. As considered from a point on the film plane, there is thus a set of directions from which radiance is incident. The distribution of radiance leaving the lens is affected by the amount of defocus blur seen by the point on the film—Figure 5.17 shows two images of the radiance from the lens as seen from two points on the film.

Figure 5.17: The Image of the Scene on the Lens, as Seen from Two Points on the Film Plane. Both are from a rendering of the San Miguel scene. (a) As seen from a point where the scene is in sharp focus; the incident radiance is effectively constant over its area. (b) As seen from a pixel in an out-of-focus area, a small image of part of the scene is visible, with potentially rapidly varying radiance.

Given the incident radiance function, we can define the irradiance at a point on the film plane. If we start with the definition of irradiance in terms of radiance, Equation (4.7), we can then convert from an integral over solid angle to an integral over area (in this case, an area $A_e$ of the plane tangent to the rear lens element) using Equation (4.9). This gives us the irradiance for a point p on the film plane:

$$E(\mathrm{p}) = \int_{A_e} L_i(\mathrm{p}, \mathrm{p}') \, \frac{|\cos\theta \, \cos\theta'|}{\|\mathrm{p}' - \mathrm{p}\|^2} \, \mathrm{d}A_e.$$

Figure 5.18 shows the geometry of the situation.

Figure 5.18: Geometric setting for the irradiance measurement equation, (5.3). Radiance can be measured as it passes through points p′ on the plane tangent to the rear lens element to a point p on the film plane. $z$ is the axial distance from the film plane to the rear element tangent plane, and $\theta$ is the angle between the vector from p′ to p and the optical axis.

Because the film plane is parallel to the lens’s plane, $\theta = \theta'$. We can further take advantage of the fact that the distance between p and p′ is equal to the axial distance from the film plane to the lens (which we will denote here by $z$) divided by $\cos\theta$. Putting this all together, we have

$$E(\mathrm{p}) = \frac{1}{z^2} \int_{A_e} L_i(\mathrm{p}, \mathrm{p}') \, |\cos^4\theta| \, \mathrm{d}A_e.$$
(5.3)

For cameras where the extent of the film is relatively large with respect to the distance $z$, the $\cos^4\theta$ term can meaningfully reduce the incident irradiance—this factor also contributes to vignetting. Most modern digital cameras correct for this effect with preset correction factors that increase pixel values toward the edges of the sensor.
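
To get a sense of the magnitude, consider a hypothetical full-frame sensor (36 × 24 mm, so roughly 21.6 mm from the center to a corner) at an axial distance of $z = 50\,\mathrm{mm}$ from the rear element tangent plane. At the corner,

$$\cos\theta = \frac{z}{\sqrt{z^2 + r^2}} = \frac{50}{\sqrt{50^2 + 21.6^2}} \approx 0.918, \qquad \cos^4\theta \approx 0.71,$$

so the corners receive roughly 30% less irradiance than the center of the film. (These particular dimensions are illustrative and are not tied to any of pbrt's camera models.)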

Integrating irradiance at a point on the film over the time that the shutter is open gives radiant exposure, the radiometric quantity that measures energy per area, $\mathrm{J}/\mathrm{m}^2$:

$$H(\mathrm{p}) = \frac{1}{z^2} \int_{t_0}^{t_1} \int_{A_e} L_i(\mathrm{p}, \mathrm{p}', t') \, |\cos^4\theta| \, \mathrm{d}A_e \, \mathrm{d}t'.$$
(5.4)

(Radiant exposure is also known as fluence.) Measuring radiant exposure at a point captures the effect that the amount of energy received on the film plane is partially related to the length of time the camera shutter is open.

Photographic film (or CCD or CMOS sensors in digital cameras) measures radiant energy over a small area. Taking Equation (5.4) and also integrating over the sensor pixel area, $A_p$, we have

$$J = \frac{1}{z^2} \int_{A_p} \int_{t_0}^{t_1} \int_{A_e} L_i(\mathrm{p}, \mathrm{p}', t') \, |\cos^4\theta| \, \mathrm{d}A_e \, \mathrm{d}t' \, \mathrm{d}A_p,$$
(5.5)

the Joules arriving at a pixel; this is called the camera measurement equation.

Although these factors apply to all of the camera models introduced in this chapter, they are only included in the implementation of the RealisticCamera. The reason is purely pragmatic: most renderers do not model this effect, so omitting it from the simpler camera models makes it easier to compare images rendered by pbrt with those rendered by other systems.

5.4.2 Modeling Sensor Response

Traditional film is based on a chemical process in which silver halide crystals (most commonly silver bromide) form specks of metallic silver when they are exposed to light. Silver halide is mostly sensitive to blue light, but color images can be captured using multiple layers of crystals with color filters between them and dyes that make silver halide more responsive to other wavelengths.

Modern digital cameras use CCD or CMOS sensors where each pixel effectively counts the number of photons it is exposed to by transforming photons into electrical charge. A variety of approaches to capturing color images have been developed, but the most common of them is to have a color filter over each pixel so that each measures red, green, or blue by counting only the photons that make it through the filter. Each pixel is often supplemented with a microlens that increases the amount of light that reaches the sensor.

For both film and digital sensors, color measurements at pixels can be modeled using spectral response curves that characterize the color filter or film’s chemical response to light as a function of wavelength. These functions are defined such that, for example, given an incident spectral distribution $s(\lambda)$, a pixel’s red component is given by

$$r = \int s(\lambda) \, \bar{r}(\lambda) \, \mathrm{d}\lambda.$$
(5.6)

Digital sensor pixels are typically arranged in a mosaic, with twice as many green pixels as red and blue, due to the human visual system’s greater sensitivity to green. One implication of pixel mosaics is that a demosaicing algorithm must be used to convert these sensor pixels to image pixels where the red, green, and blue color components are colocated. The naive approach of taking quads of mosaiced pixels and using their color values as is does not work well, since the constituent sensor pixels are at slightly different locations.
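
As a concrete illustration of the idea (and not code from pbrt), a minimal bilinear demosaicer for the green channel might average the neighboring green samples at each red or blue site. The RGGB layout assumed here is hypothetical, and production demosaicers use more sophisticated, edge-aware interpolation.

// Minimal sketch of bilinear demosaicing for the green channel, assuming an
// RGGB Bayer layout where green samples lie at pixels with (x + y) odd.
float DemosaicGreen(const float *mosaic, int width, int height, int x, int y) {
    auto isGreen = [](int px, int py) { return (px + py) % 2 == 1; };
    if (isGreen(x, y))
        return mosaic[y * width + x];
    // At red and blue sites, all 4-connected neighbors hold green samples;
    // average whichever of them are inside the image.
    float sum = 0;
    int n = 0;
    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    for (int i = 0; i < 4; ++i) {
        int nx = x + dx[i], ny = y + dy[i];
        if (nx >= 0 && nx < width && ny >= 0 && ny < height) {
            sum += mosaic[ny * width + nx];
            ++n;
        }
    }
    return sum / n;
}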

There are many challenges in designing digital sensors, most of them stemming from the small size of pixels, which is a result of demand for high-resolution images. The smaller a pixel is, the fewer photons it is exposed to given a lens and exposure time, and in turn, the harder it is to accurately measure the light. Pixel arrays suffer from a variety of types of noise, of which shot noise is generally the most significant. It is due to the discrete nature of photons: there is random fluctuation in the number of photons that are counted, which matters more the fewer of them that are captured. Shot noise can be modeled using a Poisson distribution.
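
Shot noise is easy to simulate: the number of photons actually counted over an exposure can be modeled by drawing from a Poisson distribution whose mean is the expected photon count. The following standalone sketch (not part of pbrt) shows the idea; note how the relative fluctuation shrinks as the expected count grows.

#include <random>

// Draw a simulated photon count for a pixel given the expected number of
// photons arriving during the exposure. The Poisson distribution's standard
// deviation is the square root of its mean, so the relative noise falls off
// as the expected count increases.
long long SimulatePhotonCount(double expectedPhotons, std::mt19937 &rng) {
    std::poisson_distribution<long long> dist(expectedPhotons);
    return dist(rng);
}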

Each pixel must receive a sufficient amount of light either to cause the necessary chemical reactions or to count enough photons to capture an accurate image. In Equation (5.5), we saw that the energy captured at a pixel depends on the incident radiance, the pixel area, the exit pupil area, and the exposure time. With pixel area fixed for a given camera design, both increasing the lens aperture area and increasing the exposure time may introduce undesired side-effects in return for the additional light provided. A larger aperture reduces depth of field, which may lead to undesired defocus blur. Longer exposures can also cause blur due to moving objects in the scene or due to camera motion while the shutter is open. Sensors and film therefore provide an additional control in the form of an ISO setting.

For physical film, ISO encodes its responsiveness to light (higher ISO values require less light to record an image). In digital cameras, ISO controls the gain—a scaling factor that is applied to pixel values as they are read from the sensor. With physical cameras, increasing gain exacerbates noise, as noise in the initial pixel measurements is amplified. Because pbrt does not model the noise present in readings from physical sensors, the ISO value can be set arbitrarily to achieve a desired exposure.

In pbrt’s sensor model, we model neither mosaicing nor noise, nor other effects like blooming, where a pixel that is exposed to enough light will “spill over” and start increasing the measured value at adjacent pixels. We also do not simulate the process of image readout from the sensor: many cameras use a rolling shutter where scanlines are read in succession. For scenes with rapidly moving objects, this can give surprising results. Exercises at the end of the chapter suggest modifying pbrt in various ways to explore these effects.

The PixelSensor class implements pbrt’s semi-idealized model of pixel color measurement. It is defined in the files film.h and film.cpp.

<<PixelSensor Definition>>= 
class PixelSensor {
  public:
    <<PixelSensor Public Methods>>
    <<PixelSensor Public Members>>
  private:
    <<PixelSensor Private Methods>>
    <<PixelSensor Private Members>>
};

PixelSensor models three components of sensor pixels’ operation:

  1. Exposure controls: These are the user-settable parameters that control how bright or dark the image is.
  2. RGB response: PixelSensor uses spectral response curves that are based on measurements of physical camera sensors to model the conversion of spectral radiance to tristimulus colors.
  3. White balance: Cameras generally process the images they capture, including adjusting initial RGB values according to the color of illumination to model chromatic adaptation in the human visual system. Thus, captured images appear visually similar to what a human observer would remember having seen when taking a picture.

pbrt includes a realistic camera model as well as idealized models based on projection matrices. Because pinhole cameras have apertures with infinitesimal area, we make some pragmatic trade-offs in the implementation of the PixelSensor so that images rendered with pinhole models are not completely black. We leave it to the Camera to model the effect of the aperture size. The idealized models do not account for it at all, while the RealisticCamera does so in the <<Compute weighting for RealisticCamera ray>> fragment. The PixelSensor then only accounts for the shutter time and the ISO setting. These two factors are collected into a single quantity called the imaging ratio.
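
For illustration only, the imaging ratio might be computed along the following lines; the exact normalization constant here is an assumption and may not match pbrt's.

// Hypothetical sketch: fold the shutter time and ISO gain into a single
// scale factor applied to sensor RGB values. The divisor of 100 reflects the
// convention that ISO 100 corresponds to unit gain; pbrt's actual constant
// factors may differ.
Float imagingRatio = exposureTime * ISO / 100;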

The PixelSensor constructor takes the sensor’s RGB matching functions—$\bar{r}$, $\bar{g}$, and $\bar{b}$—and the imaging ratio as parameters. It also takes the color space requested by the user for the final output RGB values as well as the spectrum of an illuminant that specifies what color to consider to be white in the scene; together, these will make it possible to convert spectral energy to RGB as measured by the sensor and then to RGB in the output color space.

Figure 5.19 shows the effect of modeling camera response, comparing rendering a scene using the XYZ matching functions to compute initial pixel colors with rendering with the matching functions for an actual camera sensor.

Figure 5.19: The Effect of Accurately Modeling Camera Sensor Response. (a) Scene rendered using the XYZ matching functions for the PixelSensor. (b) Scene rendered using measured sensor response curves for a Canon EOS 5D camera. Note that the color tones are slightly cooler—they have less orange and more blue to them. (Scene courtesy of Beeple.)

<<PixelSensor Public Methods>>= 
PixelSensor(Spectrum r, Spectrum g, Spectrum b,
            const RGBColorSpace *outputColorSpace, Spectrum sensorIllum,
            Float imagingRatio, Allocator alloc)
    : r_bar(r, alloc), g_bar(g, alloc), b_bar(b, alloc),
      imagingRatio(imagingRatio) {
    <<Compute XYZ from camera RGB matrix>>
}

<<PixelSensor Private Members>>= 
DenselySampledSpectrum r_bar, g_bar, b_bar;
Float imagingRatio;

The RGB color space in which a sensor pixel records light is generally not the same as the RGB color space that the user has specified for the final image. The former is generally specific to a camera and is determined by the physical properties of its pixel color filters, and the latter is generally a device-independent color space like sRGB or one of the other color spaces described in Section 4.6.3. Therefore, the PixelSensor constructor computes a $3 \times 3$ matrix that converts from its RGB space to XYZ. From there, it is easy to convert to a particular output color space.

This matrix is found by solving an optimization problem. It starts with over twenty spectral distributions, representing the reflectance of patches with a variety of colors from a standardized color chart. The constructor computes the RGB colors of those patches under the camera’s illuminant in the camera’s color space as well as their XYZ colors under the illuminant of the output color space. If these colors are respectively denoted by column vectors, then we can consider the problem of finding a $3 \times 3$ matrix $\mathbf{M}$:

$$\mathbf{M} \begin{bmatrix} r_1 & r_2 & \cdots & r_n \\ g_1 & g_2 & \cdots & g_n \\ b_1 & b_2 & \cdots & b_n \end{bmatrix} \approx \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}.$$
(5.7)

As long as there are more than three reflectances, this is an over-constrained problem that can be solved using linear least squares.
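
For reference, if the camera RGB coefficients are collected as the columns of a $3 \times n$ matrix $\mathbf{R}$ and the corresponding XYZ coefficients as the columns of $\mathbf{X}$, the least squares solution has the standard closed form

$$\mathbf{M} = \mathbf{X} \mathbf{R}^{T} \left( \mathbf{R} \mathbf{R}^{T} \right)^{-1},$$

provided $\mathbf{R}\mathbf{R}^{T}$ is invertible; the LinearLeastSquares() function used below encapsulates the details of computing the solution.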

<<Compute XYZ from camera RGB matrix>>= 
<<Compute rgbCamera values for training swatches>>
<<Compute xyzOutput values for training swatches>>
<<Initialize XYZFromSensorRGB using linear least squares>>

Given the sensor’s illuminant, the work of computing the RGB coefficients for each reflectance is handled by the ProjectReflectance() method.

<<Compute rgbCamera values for training swatches>>= 
Float rgbCamera[nSwatchReflectances][3];
for (int i = 0; i < nSwatchReflectances; ++i) {
    RGB rgb = ProjectReflectance<RGB>(swatchReflectances[i], sensorIllum,
                                      &r_bar, &g_bar, &b_bar);
    for (int c = 0; c < 3; ++c)
        rgbCamera[i][c] = rgb[c];
}

For good results, the spectra used for this optimization problem should present a good variety of representative real-world spectra. The ones used in pbrt are based on measurements of a standard color chart.

<<PixelSensor Private Members>>+= 
static constexpr int nSwatchReflectances = 24;
static Spectrum swatchReflectances[nSwatchReflectances];

The ProjectReflectance() utility method takes spectral distributions for a reflectance and an illuminant as well as three spectral matching functions $\bar{b}_i$ for a tristimulus color space. It returns a triplet of color coefficients $c_i$ given by

$$c_i = \int r(\lambda) \, L(\lambda) \, \bar{b}_i(\lambda) \, \mathrm{d}\lambda,$$

where $r$ is the spectral reflectance function, $L$ is the illuminant’s spectral distribution, and $\bar{b}_i$ is a spectral matching function. Under the assumption that the second matching function $\bar{b}_2$ generally corresponds to luminance or at least something green, the color that causes the greatest response by the human visual system, the returned color triplet is normalized by $\int L(\lambda) \, \bar{b}_2(\lambda) \, \mathrm{d}\lambda$. In this way, the linear least squares fit at least roughly weights each RGB/XYZ pair according to visual importance.

The ProjectReflectance() utility function takes the color space triplet type as a template parameter and is therefore able to return both RGB and XYZ values as appropriate. Its implementation follows the same general form as Spectrum::InnerProduct(), computing a Riemann sum over 1 nm spaced wavelengths, so it is not included here.

<<PixelSensor Private Methods>>= 
template <typename Triplet>
static Triplet ProjectReflectance(Spectrum r, Spectrum illum, Spectrum b1,
                                  Spectrum b2, Spectrum b3);
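
Although the implementation is not included in the text, a minimal sketch of such a Riemann sum could look like the following. It is only illustrative; it assumes that Spectrum provides operator()(Float lambda) for point evaluation and that Lambda_min and Lambda_max bound the visible range, and it may differ from pbrt's actual code in its details.

template <typename Triplet>
Triplet PixelSensor::ProjectReflectance(Spectrum refl, Spectrum illum,
                                        Spectrum b1, Spectrum b2, Spectrum b3) {
    Triplet result;
    Float g_integral = 0;
    // Riemann sum over 1 nm spaced wavelengths.
    for (Float lambda = Lambda_min; lambda <= Lambda_max; ++lambda) {
        g_integral += b2(lambda) * illum(lambda);
        result[0] += b1(lambda) * illum(lambda) * refl(lambda);
        result[1] += b2(lambda) * illum(lambda) * refl(lambda);
        result[2] += b3(lambda) * illum(lambda) * refl(lambda);
    }
    // Normalize by the illuminant's response under the second matching
    // function so that swatches are weighted roughly by visual importance.
    return result / g_integral;
}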

The fragment that computes XYZ coefficients in the output color space, <<Compute xyzOutput values for training swatches>>, is generally similar to the one for RGB, with the differences that it uses the output illuminant and the XYZ spectral matching functions and initializes the xyzOutput array. It is therefore also not included here.

Given the two matrices of color coefficients, a call to the LinearLeastSquares() function solves the optimization problem of Equation (5.7).

<<Initialize XYZFromSensorRGB using linear least squares>>= 
pstd::optional<SquareMatrix<3>> m =
    LinearLeastSquares(rgbCamera, xyzOutput, nSwatchReflectances);
if (!m)
    ErrorExit("Sensor XYZ from RGB matrix could not be solved.");
XYZFromSensorRGB = *m;

Because the RGB and XYZ colors are computed using the color spaces’ respective illuminants, the matrix $\mathbf{M}$ also performs white balancing.

<<PixelSensor Public Members>>= 
SquareMatrix<3> XYZFromSensorRGB;

A second PixelSensor constructor uses the XYZ matching functions for the pixel sensor’s spectral response curves. If a specific camera sensor is not specified in the scene description file, this is the default. Note that with this usage, the member variables r_bar, g_bar, and b_bar are misnamed in that they are actually $X$, $Y$, and $Z$.

<<PixelSensor Public Methods>>+=  
PixelSensor(const RGBColorSpace *outputColorSpace, Spectrum sensorIllum,
            Float imagingRatio, Allocator alloc)
    : r_bar(&Spectra::X(), alloc), g_bar(&Spectra::Y(), alloc),
      b_bar(&Spectra::Z(), alloc), imagingRatio(imagingRatio) {
    <<Compute white balancing matrix for XYZ PixelSensor>>
}

By default, no white balancing is performed when PixelSensor converts to XYZ coefficients; that task is left for post-processing. However, if the user does specify a color temperature, white balancing is handled by the XYZFromSensorRGB matrix. (It is otherwise the identity matrix.) The WhiteBalance() function that computes this matrix will be described shortly; it takes the chromaticities of the white points of two color spaces and returns a matrix that maps the first to the second.

<<Compute white balancing matrix for XYZ PixelSensor>>= 
if (sensorIllum) {
    Point2f sourceWhite = SpectrumToXYZ(sensorIllum).xy();
    Point2f targetWhite = outputColorSpace->w;
    XYZFromSensorRGB = WhiteBalance(sourceWhite, targetWhite);
}

The main functionality provided by the PixelSensor is the ToSensorRGB() method, which converts a point-sampled spectral distribution $L(\lambda_i)$ in a SampledSpectrum to RGB coefficients in the sensor’s color space. It does so via Monte Carlo evaluation of the sensor response integral, Equation (5.6), giving estimators of the form

$$r \approx \frac{1}{n} \sum_{i=1}^{n} \frac{L(\lambda_i) \, \bar{r}(\lambda_i)}{p(\lambda_i)},$$
(5.8)

where $n$ is equal to NSpectrumSamples. The associated PDF values are available from the SampledWavelengths, and the sum over wavelengths and division by $n$ are handled by SampledSpectrum::Average(). These coefficients are scaled by the imaging ratio, which completes the conversion.

<<PixelSensor Public Methods>>+= 
RGB ToSensorRGB(SampledSpectrum L, const SampledWavelengths &lambda) const {
    L = SafeDiv(L, lambda.PDF());
    return imagingRatio * RGB((r_bar.Sample(lambda) * L).Average(),
                              (g_bar.Sample(lambda) * L).Average(),
                              (b_bar.Sample(lambda) * L).Average());
}

Chromatic Adaptation and White Balance

One of the remarkable properties of the human visual system is that the color of objects is generally seen as the same, even under different lighting conditions; this effect is called chromatic adaptation. Cameras perform a similar function so that photographs capture the colors that the person taking the picture remembers seeing; in that context, this process is called white balancing.

pbrt provides a WhiteBalance() function that implements a white balancing algorithm called the von Kries transform. It takes two chromaticities: one is the chromaticity of the illumination and the other the chromaticity of the color white. (Recall the discussion in Section 4.6.3 of why white is not usually a constant spectrum but is instead defined as the color that humans perceive as white.) It returns a $3 \times 3$ matrix that applies the corresponding white balancing operation to XYZ colors.

<<White Balance Definitions>>= 
SquareMatrix<3> WhiteBalance(Point2f srcWhite, Point2f targetWhite) {
    <<Find LMS coefficients for source and target white>>
    <<Return white balancing matrix for source and target white>>
}

White balance with the von Kries transform is performed in the LMS color space, which is a color space where the responsivity of the three matching functions is specified to match the three types of cone in the human eye. By performing white balancing in the LMS space, we can model the effect of modulating the contribution of each type of cone in the eye, which is believed to be how chromatic adaptation is implemented in humans. After computing normalized XYZ colors corresponding to the provided chromaticities, the LMSFromXYZ matrix can be used to transform to LMS from XYZ.

<<Find LMS coefficients for source and target white>>= 
XYZ srcXYZ = XYZ::FromxyY(srcWhite), dstXYZ = XYZ::FromxyY(targetWhite);
auto srcLMS = LMSFromXYZ * srcXYZ, dstLMS = LMSFromXYZ * dstXYZ;

$3 \times 3$ matrices that convert between LMS and XYZ are available as constants.

<<Color Space Constants>>= 
extern const SquareMatrix<3> LMSFromXYZ, XYZFromLMS;

Given a color in LMS space, white balancing is performed by dividing out the color of the scene’s illuminant and then multiplying by the color of the desired illuminant, which can be represented by a diagonal matrix. The complete white balance matrix that operates on XYZ colors follows directly.

<<Return white balancing matrix for source and target white>>= 
SquareMatrix<3> LMScorrect = SquareMatrix<3>::Diag(
    dstLMS[0] / srcLMS[0], dstLMS[1] / srcLMS[1], dstLMS[2] / srcLMS[2]);
return XYZFromLMS * LMScorrect * LMSFromXYZ;

Figure 5.20 shows an image rendered with a yellowish illuminant and the image after white balancing with the illuminant’s chromaticity.

Figure 5.20: The Effect of White Balance. (a) Image of a scene with a yellow illuminant that has a similar spectral distribution to an incandescent light bulb. (b) White balanced image, using a color temperature of 3000 K. Due to chromatic adaptation, this image is much closer than (a) to what a human observer would perceive viewing this scene. (Scene courtesy of Wig42 from Blend Swap, via Benedikt Bitterli.)

Sampling Sensor Response

Because the sensor response functions used by a PixelSensor describe the sensor’s wavelength-dependent response to radiance, it is worth at least approximately accounting for their variation when sampling the wavelengths of light that a ray is to carry. At minimum, a wavelength where all of them are zero should never be chosen, as that wavelength will make no contribution to the final image. More generally, applying importance sampling according to the sensor response functions is desirable as it offers the possibility of reducing error in the estimates of Equation (5.8).

However, choosing a distribution to use for sampling is challenging since the goal is minimizing error perceived by humans rather than strictly minimizing numeric error. Figure 5.21(a) shows the plots of both the CIE $Y$ matching function and the sum of the $X$, $Y$, and $Z$ matching functions, both of which could be used. In practice, sampling according to $Y$ alone gives excessive chromatic noise, but sampling by the sum of all three matching functions devotes too many samples to wavelengths between 400 nm and 500 nm, which are relatively unimportant visually.

A parametric probability distribution function that balances these concerns and works well for sampling the visible wavelengths is

$$p_v(\lambda) = \left( \int_{\lambda_{\min}}^{\lambda_{\max}} f(\lambda) \, \mathrm{d}\lambda \right)^{-1} f(\lambda),$$
(5.9)

with

$$f(\lambda) = \frac{1}{\cosh^2\!\big(A(\lambda - B)\big)},$$

$A = 0.0072\,\mathrm{nm}^{-1}$, and $B = 538\,\mathrm{nm}$. Figure 5.21(b) shows a plot of $p_v(\lambda)$.

Figure 5.21: (a) Plot of normalized PDFs corresponding to the CIE $Y$ matching function and the sum of the $X$, $Y$, and $Z$ matching functions. (b) Plot of the parametric distribution $p_v(\lambda)$ from Equation (5.9).

Our implementation samples over the wavelength range from 360 nm to 830 nm. The normalization constant that converts $f$ into a PDF is precomputed.

<<Sampling Inline Functions>>+=  
Float VisibleWavelengthsPDF(Float lambda) {
    if (lambda < 360 || lambda > 830)
        return 0;
    return 0.0039398042f / Sqr(std::cosh(0.0072f * (lambda - 538)));
}

The PDF can be sampled using the inversion method; the result is implemented in SampleVisibleWavelengths().

<<Sampling Inline Functions>>+=  
Float SampleVisibleWavelengths(Float u) {
    return 538 - 138.888889f * std::atanh(0.85691062f - 1.82750197f * u);
}
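
The constants in these two functions follow directly from Equation (5.9). Since $\int \operatorname{sech}^2(A(\lambda - B)) \, \mathrm{d}\lambda = \tanh(A(\lambda - B))/A$, over the range $[360, 830]$ nm the CDF is

$$P(\lambda) = \frac{\tanh\!\big(A(\lambda - B)\big) - \tanh\!\big(A(360 - B)\big)}{\tanh\!\big(A(830 - B)\big) - \tanh\!\big(A(360 - B)\big)} \approx \frac{\tanh\!\big(A(\lambda - B)\big) + 0.8569}{1.8275}.$$

Setting $P(\lambda) = u$ and solving for $\lambda$ gives

$$\lambda = B + \frac{1}{A} \tanh^{-1}\!\big(1.8275\,u - 0.8569\big) = 538 + 138.89 \, \tanh^{-1}\!\big(1.8275\,u - 0.8569\big),$$

which matches SampleVisibleWavelengths() (using $\tanh^{-1}(-x) = -\tanh^{-1}(x)$); similarly, the normalization constant in VisibleWavelengthsPDF() is $A/1.8275 \approx 0.003940$.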

We can now implement another sampling method in the SampledWavelengths class, SampleVisible(), which uses this technique.

<<SampledWavelengths Public Methods>>+= 
static SampledWavelengths SampleVisible(Float u) {
    SampledWavelengths swl;
    for (int i = 0; i < NSpectrumSamples; ++i) {
        <<Compute up for ith wavelength sample>>
        swl.lambda[i] = SampleVisibleWavelengths(up);
        swl.pdf[i] = VisibleWavelengthsPDF(swl.lambda[i]);
    }
    return swl;
}

Like SampledWavelengths::SampleUniform(), SampleVisible() uses a single random sample to generate all wavelength samples. It uses a slightly different approach, taking uniform steps across the $[0, 1)$ sample space before sampling each wavelength.

<<Compute up for ith wavelength sample>>= 
Float up = u + Float(i) / NSpectrumSamples;
if (up > 1)
    up -= 1;

Using this distribution for sampling in place of a uniform distribution is worthwhile. Figure 5.22 shows two images of a scene, one rendered using uniform wavelength samples and the other rendered using SampleVisible(). Color noise is greatly reduced, with only a 1% increase in runtime.

Figure 5.22: (a) Scene rendered with 4 samples per pixel, each with 4 wavelength samples, sampled uniformly over the visible range. (b) Rendered at the same sampling rates but instead sampling wavelengths using SampledWavelengths::SampleVisible(). This image has substantially less color noise, at a negligible cost in additional computation. (Model courtesy of Yasutoshi Mori.)

5.4.3 Filtering Image Samples

The main responsibility of Film implementations is to aggregate multiple spectral samples at each pixel in order to compute a final value for it. In a physical camera, each pixel integrates light over a small area. Its response may have some spatial variation over that area that depends on the physical design of the sensor. In Chapter 8 we will consider this operation from the perspective of signal processing and will see that the details of where the image function is sampled and how those samples are weighted can significantly affect the final image quality.

Pending those details, for now we will assume that some filter function $f$ is used to define the spatial variation in sensor response around each image pixel. These filter functions quickly go to zero, encoding the fact that pixels only respond to light close to them on the film. They also encode any further spatial variation in the pixel’s response. With this approach, if we have an image function $r(x, y)$ that gives the red color at an arbitrary position on the film (e.g., as measured using a sensor response function $\bar{r}(\lambda)$ with Equation (5.6)), then the filtered red value $r_f$ at a position $(x, y)$ is given by

$$r_f(x, y) = \int f(x - x', y - y') \, r(x', y') \, \mathrm{d}x' \, \mathrm{d}y',$$
(5.10)

where the filter function f is assumed to integrate to 1.

As usual, we will estimate this integral using point samples of the image function. The estimator is

$$r_f(x, y) \approx \frac{1}{n} \sum_{i=1}^{n} \frac{f(x - x_i, y - y_i) \, r(x_i, y_i)}{p(x_i, y_i)}.$$
(5.11)

Two approaches have been used in graphics to sample the integrand. The first, which was used in all three previous versions of pbrt, is to sample the image uniformly. Each image sample may then contribute to multiple pixels’ final values, depending on the extent of the filter function being used. This approach gives the estimator

$$r_f(x, y) \approx \frac{A}{n} \sum_{i=1}^{n} f(x - x_i, y - y_i) \, r(x_i, y_i),$$
(5.12)

where $A$ is the film area. Figure 5.23 illustrates the approach; it shows a pixel at location $(x, y)$ that has a pixel filter with extent radius.x in the $x$ direction and radius.y in the $y$ direction. All the samples at positions $(x_i, y_i)$ inside the box given by the filter extent may contribute to the pixel’s value, depending on the filter function’s value for $f(x - x_i, y - y_i)$.

Figure 5.23: 2D Image Filtering. To compute a filtered pixel value for the pixel marked with a filled circle located at $(x, y)$, all the image samples inside the box around $(x, y)$ with extent radius.x and radius.y need to be considered. Each of the image samples $(x_i, y_i)$, denoted by open circles, is weighted by a 2D filter function, $f(x - x_i, y - y_i)$. The weighted average of all samples is the final pixel value.

While Equation (5.12) gives an unbiased estimate of the pixel value, variation in the filter function leads to variance in the estimates. Consider the case of a constant image function $r$: in that case, we would expect the resulting image pixels to all be exactly equal to $r$. However, the sum of filter values $f(x - x_i, y - y_i)$ will not generally be equal to 1: it only equals 1 in expectation. Thus, the image will include noise, even in this simple setting. If the alternative estimator

$$r_f(x, y) \approx \frac{\sum_i f(x - x_i, y - y_i) \, r(x_i, y_i)}{\sum_i f(x - x_i, y - y_i)}$$

is used instead, that variance is eliminated at the cost of a small amount of bias. (This is the weighted importance sampling Monte Carlo estimator.) In practice, this trade-off is worthwhile.

Equation (5.10) can also be estimated independently at each pixel. This is the approach used in this version of pbrt. In this case, it is worthwhile to sample points on the film using a distribution based on the filter function. This approach is known as filter importance sampling. With it, the spatial variation of the filter is accounted for purely via the distribution of sample locations for a pixel rather than scaling each sample’s contribution according to the filter’s value.

If $p \propto f$, then those two factors cancel in Equation (5.11) and we are left with an average of the $r(x_i, y_i)$ sample values scaled by the constant of proportionality. However, here we must handle the rare (for rendering) case of estimating an integral that may be negative: as we will see in Chapter 8, filter functions that are partially negative can give better results than those that are nonnegative. In that case, we have $p \propto |f|$, which gives

$$r_f(x, y) \approx \left( \int |f(x', y')| \, \mathrm{d}x' \, \mathrm{d}y' \right) \left( \frac{1}{n} \sum_{i=1}^{n} \operatorname{sign}\!\big(f(x - x_i, y - y_i)\big) \, r(x_i, y_i) \right),$$

where $\operatorname{sign}(x)$ is 1 if $x > 0$, 0 if it is 0, and $-1$ otherwise. However, this estimator has the same problem as Equation (5.12): even with a constant function $r$, the estimates will have variance depending on how many of the $\operatorname{sign}$ function evaluations give 1 and how many give $-1$.

Therefore, this version of pbrt continues to use the weighted importance sampling estimator, computing pixel values as

$$r_f(x, y) \approx \frac{\sum_i w(x - x_i, y - y_i) \, r(x_i, y_i)}{\sum_i w(x - x_i, y - y_i)}$$
(5.13)

with $w(x, y) = f(x, y)/p(x, y)$.
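
In code, Equation (5.13) amounts to maintaining two running sums per pixel. The following standalone sketch (a simplification of what RGBFilm does later in this section, with filterWeight standing in for $w = f/p$) shows the accumulation:

// Per-pixel state for the weighted importance sampling estimator of
// Equation (5.13): accumulate weighted sample values and the weights
// themselves, and divide at the end.
struct WeightedPixelSum {
    double weightedValueSum = 0, weightSum = 0;

    void AddSample(Float value, Float filterWeight) {
        weightedValueSum += filterWeight * value;
        weightSum += filterWeight;
    }
    double Value() const {
        return weightSum != 0 ? weightedValueSum / weightSum : 0;
    }
};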

The first of these two approaches has the advantage that each image sample can contribute to multiple pixels’ final filtered values. This can be beneficial for rendering efficiency, as all the computation involved in computing the radiance for an image sample can be used to improve the accuracy of multiple pixels. However, using samples generated for other pixels is not always helpful: some of the sample generation algorithms implemented in Chapter 8 carefully position samples in ways that ensure good coverage of the sampling domain in a pixel. If samples from other pixels are mixed in with those, the full set of samples for a pixel may no longer have that same structure, which in turn can increase error. By not sharing samples across pixels, filter importance sampling does not have this problem.

Filter importance sampling has further advantages. It makes parallel rendering easier: if the renderer is parallelized in a way that has different threads working on different pixels, there is never a chance that multiple threads will need to concurrently modify the same pixel’s value. A final advantage is that if there are any samples that are much brighter than the others due to a variance spike from a poorly sampled integrand, then those samples only contribute to a single pixel, rather than being smeared over multiple pixels. It is easier to fix up the resulting single-pixel artifacts than a neighborhood of them that have been affected by such a sample.

5.4.4 The Film Interface

With the foundations of sensor response and pixel sample filtering established, we can introduce the Film interface. It is defined in the file base/film.h.

<<Film Definition>>= 
class Film : public TaggedPointer<RGBFilm, GBufferFilm, SpectralFilm> {
  public:
    <<Film Interface>>
};

SpectralFilm, which is not described here, records spectral images over a specified wavelength range that is discretized into non-overlapping ranges. See the documentation of pbrt’s file format for more information about the SpectralFilm’s use.

Samples can be provided to the film in two ways. The first is from the Sampler selecting points on the film at which the Integrator estimates the radiance. These samples are provided to the Film via the AddSample() method, which takes the following parameters:

  • The sample’s pixel coordinates, pFilm.
  • The spectral radiance of the sample, L.
  • The sample’s wavelengths, lambda.
  • An optional VisibleSurface that describes the geometry at the first visible point along the sample’s camera ray.
  • A weight for the sample to use in computing Equation (5.13) that is returned by Filter::Sample().

Film implementations can assume that multiple threads will not call AddSample() concurrently with the same pFilm location (though they should assume that threads will call it concurrently with different ones). Therefore, it is not necessary to worry about mutual exclusion in this method’s implementation unless some data that is not unique to a pixel is modified.

<<Film Interface>>= 
void AddSample(Point2i pFilm, SampledSpectrum L,
               const SampledWavelengths &lambda,
               const VisibleSurface *visibleSurface, Float weight);

The Film interface also includes a method that returns a bounding box of all the samples that may be generated. Note that this is different than the bounding box of the image pixels in the common case that the pixel filter extents are wider than a pixel.

<<Film Interface>>+=  
Bounds2f SampleBounds() const;

VisibleSurface holds an assortment of information about a point on a surface.

<<VisibleSurface Definition>>= 
class VisibleSurface {
  public:
    <<VisibleSurface Public Methods>>
    <<VisibleSurface Public Members>>
};

In addition to the point, normal, shading normal, and time, VisibleSurface stores the partial derivatives of depth at each pixel, $\partial z/\partial x$ and $\partial z/\partial y$, where $x$ and $y$ are in raster space and $z$ is in camera space. These values are useful in image denoising algorithms, since they make it possible to test whether the surfaces in adjacent pixels are coplanar. The surface’s albedo is its spectral distribution of reflected light under uniform illumination; this quantity can be useful for separating texture from illumination before denoising.

<<VisibleSurface Public Members>>= 
Point3f p;
Normal3f n, ns;
Point2f uv;
Float time = 0;
Vector3f dpdx, dpdy;
SampledSpectrum albedo;

We will not include the VisibleSurface constructor here, as its main function is to copy appropriate values from the SurfaceInteraction into its member variables.

<<VisibleSurface Public Methods>>= 
VisibleSurface(const SurfaceInteraction &si, SampledSpectrum albedo,
               const SampledWavelengths &lambda);

The set member variable indicates whether a VisibleSurface has been initialized.

<<VisibleSurface Public Members>>+= 
bool set = false;

<<VisibleSurface Public Methods>>+= 
operator bool() const { return set; }

Film implementations can indicate whether they use the VisibleSurface * passed to their AddSample() method via UsesVisibleSurface(). Providing this information allows integrators to skip the expense of initializing a VisibleSurface if it will not be used.

<<Film Interface>>+=  
bool UsesVisibleSurface() const;

Light transport algorithms that sample paths starting from the light sources (such as bidirectional path tracing) require the ability to “splat” contributions to arbitrary pixels. Rather than computing the final pixel value as a weighted average of contributing splats, splats are simply summed. Generally, the more splats that are around a given pixel, the brighter the pixel will be. AddSplat() splats the provided value at the given location in the image.

In contrast to AddSample(), this method may be called concurrently by multiple threads that end up updating the same pixel. Therefore, Film implementations must either implement some form of mutual exclusion or use atomic operations in their implementations of this method.

<<Film Interface>>+=  
void AddSplat(Point2f p, SampledSpectrum v, const SampledWavelengths &lambda);
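
For instance, an implementation that keeps per-pixel splat sums in atomic variables could accumulate a splat roughly as follows. This is a simplified sketch, not RGBFilm's actual AddSplat(), which also spreads each splat over the extent of the pixel filter.

// Simplified sketch: convert the splatted value to sensor RGB and add it
// into atomic per-pixel sums, so that concurrent calls from multiple
// threads are safe without any locking.
void AddSplatSketch(Point2f p, SampledSpectrum v,
                    const SampledWavelengths &lambda) {
    RGB rgb = sensor->ToSensorRGB(v, lambda);
    Point2i pi(int(std::floor(p.x)), int(std::floor(p.y)));
    if (!InsideExclusive(pi, pixelBounds))
        return;
    for (int c = 0; c < 3; ++c)
        pixels[pi].rgbSplat[c].Add(rgb[c]);
}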

Film implementations must also provide a SampleWavelengths() method that samples from the range of wavelengths that the film’s sensor responds to (e.g., using SampledWavelengths::SampleVisible()).

<<Film Interface>>+=  
SampledWavelengths SampleWavelengths(Float u) const;

In addition, they must provide a handful of methods that give the extent of the image and the diagonal length of its sensor, measured in meters.

<<Film Interface>>+=  
Point2i FullResolution() const;
Bounds2i PixelBounds() const;
Float Diagonal() const;

A call to the Film::WriteImage() method directs the film to do the processing necessary to generate the final image and store it in a file. In addition to the image metadata, this method takes a scale factor that is applied to the samples provided to the AddSplat() method.

<<Film Interface>>+=  
void WriteImage(ImageMetadata metadata, Float splatScale = 1);

The ToOutputRGB() method allows callers to find the output RGB value that results for given spectral radiance samples from applying the PixelSensor’s model, performing white balancing, and then converting to the output color space. (This method is used by the SPPMIntegrator included in the online edition, which has requirements that cause it to maintain the final image itself rather than using a Film implementation.)

<<Film Interface>>+=  
RGB ToOutputRGB(SampledSpectrum L, const SampledWavelengths &lambda) const;

A caller can also request the entire image to be returned, as well as the RGB value for a single pixel. The latter method is used for displaying in-progress images during rendering.

<<Film Interface>>+=  
Image GetImage(ImageMetadata *metadata, Float splatScale = 1);
RGB GetPixelRGB(Point2i p, Float splatScale = 1) const;

Finally, Film implementations must provide access to a few additional values for use in other parts of the system.

<<Film Interface>>+= 
Filter GetFilter() const;
const PixelSensor *GetPixelSensor() const;
std::string GetFilename() const;

5.4.5 Common Film Functionality

As we did with CameraBase for Camera implementations, we have written a FilmBase class that Film implementations can inherit from. It collects commonly used member variables and is able to provide a few of the methods required by the Film interface.

<<FilmBase Definition>>= 
class FilmBase {
  public:
    <<FilmBase Public Methods>>
  protected:
    <<FilmBase Protected Members>>
};

The FilmBase constructor takes a number of values: the overall resolution of the image in pixels; a bounding box that may specify a subset of the full image; a filter function; a PixelSensor; the length of the diagonal of the film’s physical area; and the filename for the output image. These are all bundled up into a small structure in order to shorten the parameter lists of forthcoming constructors.

<<FilmBaseParameters Definition>>= 
struct FilmBaseParameters {
    Point2i fullResolution;
    Bounds2i pixelBounds;
    Filter filter;
    Float diagonal;
    const PixelSensor *sensor;
    std::string filename;
};

The FilmBase constructor then just copies the various values from the parameter structure, converting the film diagonal length from millimeters (as specified in scene description files) to meters, the unit used for measuring distance in pbrt.

<<FilmBase Public Methods>>= 
FilmBase(FilmBaseParameters p)
    : fullResolution(p.fullResolution), pixelBounds(p.pixelBounds),
      filter(p.filter), diagonal(p.diagonal * .001f), sensor(p.sensor),
      filename(p.filename) {}

<<FilmBase Protected Members>>= 
Point2i fullResolution;
Bounds2i pixelBounds;
Filter filter;
Float diagonal;
const PixelSensor *sensor;
std::string filename;

Having these values makes it possible to immediately implement a number of the methods required by the Film interface.

<<FilmBase Public Methods>>+=  
Point2i FullResolution() const { return fullResolution; }
Bounds2i PixelBounds() const { return pixelBounds; }
Float Diagonal() const { return diagonal; }
Filter GetFilter() const { return filter; }
const PixelSensor *GetPixelSensor() const { return sensor; }
std::string GetFilename() const { return filename; }

An implementation of SampleWavelengths() samples according to the distribution in Equation (5.9).

<<FilmBase Public Methods>>+= 
SampledWavelengths SampleWavelengths(Float u) const {
    return SampledWavelengths::SampleVisible(u);
}

The Film::SampleBounds() method can also be easily implemented, given the Filter. Computing the sample bounds involves both expanding by the filter radius and accounting for half-pixel offsets that come from the conventions used in pbrt for pixel coordinates; these are explained in more detail in Section 8.1.4.

<<FilmBase Method Definitions>>= 
Bounds2f FilmBase::SampleBounds() const {
    Vector2f radius = filter.Radius();
    return Bounds2f(pixelBounds.pMin - radius + Vector2f(0.5f, 0.5f),
                    pixelBounds.pMax + radius - Vector2f(0.5f, 0.5f));
}
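As a concrete example (with illustrative values, not ones taken from pbrt): given pixel bounds running from (0, 0) to (1920, 1080) and a filter whose radius is (2, 2), the sample bounds extend from (-1.5, -1.5) to (1921.5, 1081.5). Samples slightly outside the image therefore still contribute to edge pixels through the filter.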

5.4.6 RGBFilm

RGBFilm records an image represented by RGB color.

<<RGBFilm Definition>>= 
class RGBFilm : public FilmBase {
  public:
    <<RGBFilm Public Methods>>
  private:
    <<RGBFilm::Pixel Definition>>
    <<RGBFilm Private Members>>
};

In addition to the parameters that are passed along to FilmBase, RGBFilm takes a color space to use for the output image, a parameter that allows specifying the maximum value of an RGB color component, and a parameter that controls the floating-point precision in the output image.

<<RGBFilm Method Definitions>>= 
RGBFilm::RGBFilm(FilmBaseParameters p, const RGBColorSpace *colorSpace,
                 Float maxComponentValue, bool writeFP16, Allocator alloc)
    : FilmBase(p), pixels(p.pixelBounds, alloc), colorSpace(colorSpace),
      maxComponentValue(maxComponentValue), writeFP16(writeFP16) {
    filterIntegral = filter.Integral();
    <<Compute outputRGBFromSensorRGB matrix>>
}

The integral of the filter function will be useful to normalize the filter values used for samples provided via AddSplat(), so it is cached in a member variable.

<<RGBFilm Private Members>>= 
const RGBColorSpace *colorSpace;
Float maxComponentValue;
bool writeFP16;
Float filterIntegral;

The color space for the final image is given by a user-specified RGBColorSpace that is unlikely to be the same as the sensor’s RGB color space. The constructor therefore computes a 3×3 matrix that transforms sensor RGB values to the output color space.

<<Compute outputRGBFromSensorRGB matrix>>= 
outputRGBFromSensorRGB = colorSpace->RGBFromXYZ * sensor->XYZFromSensorRGB;
<<RGBFilm Private Members>>+=  
SquareMatrix<3> outputRGBFromSensorRGB;

Given the pixel resolution of the (possibly cropped) image, the constructor allocates a 2D array of Pixel structures, with one for each pixel. The running weighted sums of pixel contributions are represented using RGB colors in the rgbSum member variable. weightSum holds the sum of filter weight values for the sample contributions to the pixel. These respectively correspond to the numerator and denominator in Equation (5.13). Finally, rgbSplat holds an (unweighted) sum of sample splats.

Double-precision floating point is used for all of these quantities. Single-precision floats are almost always sufficient, but when used for reference images rendered with high sample counts they may have insufficient precision to accurately store their associated sums. Although it is rare for this error to be visually evident, it can cause problems with reference images that are used to evaluate the error of Monte Carlo sampling algorithms.

Figure 5.24 shows an example of this problem. We rendered a reference image of a test scene using 4 million samples per pixel, once with 32-bit and once with 64-bit floating-point values for the RGBFilm pixel sums, and then plotted mean squared error (MSE) as a function of sample count. For an unbiased Monte Carlo estimator, MSE is O(1/n) in the number of samples n; on a log–log plot, it should be a straight line with slope -1. However, with the 32-bit float reference image the reduction in MSE appears to flatten out for n > 1000, as if additional samples no longer reduced error. With 64-bit floats, the curve follows its expected path.
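The underlying numerical issue is easy to reproduce outside the renderer. The following standalone program (illustrative only; it is not part of pbrt) accumulates a stream of unit contributions into both a 32-bit float and a 64-bit double. Once the float sum reaches 2^24, each further addition is smaller than the spacing between representable values and is rounded away entirely, so the sum stops growing, while the double sum remains exact.

#include <cstdio>

int main() {
    float sumF = 0.f;
    double sumD = 0.;
    for (int i = 0; i < 20000000; ++i) {
        sumF += 1.f;  // stalls at 2^24 = 16777216, where the float spacing is 2
        sumD += 1.;   // stays exact for counts this small
    }
    std::printf("float sum:  %.0f\n", sumF);   // prints 16777216
    std::printf("double sum: %.0f\n", sumD);   // prints 20000000
    return 0;
}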

Figure 5.24: Mean Squared Error as a Function of Sample Count. When rendering a scene using an unbiased Monte Carlo estimator, we expect MSE to decrease as O(1/n) in the number of samples n; on a log–log plot, this rate corresponds to a straight line with slope -1. For the test scene considered here, using 32-bit floats for the reference image causes the reported error to erroneously stop decreasing after 1,000 or so samples.

<<RGBFilm::Pixel Definition>>= 
struct Pixel {
    double rgbSum[3] = {0., 0., 0.};
    double weightSum = 0.;
    AtomicDouble rgbSplat[3];
};

<<RGBFilm Private Members>>+= 
Array2D<Pixel> pixels;

The RGBFilm does not use the VisibleSurface * passed to AddSample().

<<RGBFilm Public Methods>>= 
bool UsesVisibleSurface() const { return false; }

AddSample() converts spectral radiance to sensor RGB before updating the Pixel corresponding to the point pFilm.

<<RGBFilm Public Methods>>+=  
void AddSample(Point2i pFilm, SampledSpectrum L,
               const SampledWavelengths &lambda,
               const VisibleSurface *, Float weight) {
    <<Convert sample radiance to PixelSensor RGB>>
    <<Optionally clamp sensor RGB value>>
    <<Update pixel values with filtered sample contribution>>
}

The radiance value is first converted to RGB by the sensor.

<<Convert sample radiance to PixelSensor RGB>>= 
RGB rgb = sensor->ToSensorRGB(L, lambda);

Images rendered with Monte Carlo integration can exhibit bright spikes of noise in pixels where the sampling distributions used do not match the integrand well: when the ratio f(x)/p(x) is computed in the Monte Carlo estimator, f(x) may be very large while p(x) is very small. (Such pixels are colloquially called “fireflies.”) Many additional samples may be required to get an accurate estimate for that pixel.

A widely used technique to reduce the effect of fireflies is to clamp all sample contributions to some maximum amount. Doing so introduces error: energy is lost, and the image is no longer an unbiased estimate of the true image. However, when the aesthetics of rendered images are more important than their mathematics, this can be a useful remedy. Figure 5.25 shows an example of its use.
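For a sense of scale (with made-up numbers): if a single sample happens to find a path where f(x) = 10 but the sampling density there is only p(x) = 0.0001, that one sample adds f(x)/p(x) = 100,000 to the pixel’s estimate. Averaged over even 256 samples whose other contributions are all near 1, the pixel still ends up hundreds of times brighter than its neighbors, and a very large number of additional samples would be needed for the average to settle back down.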

Figure 5.25: Image with High Variance in Some Pixels. This scene suffers from variance spikes in pixels due to difficult-to-sample light paths that occasionally intersect the sun. (a) Image rendered normally. (b) Image rendered with clamping, where pixel sample RGB values are clamped to have values no larger than 10. The image looks much better with clamping, though at a cost of some loss of energy. (Model courtesy of Yasutoshi Mori.)

The RGBFilm’s maxComponentValue parameter can be set to a threshold that is used for clamping. It is infinite by default, in which case no clamping is performed.

<<Optionally clamp sensor RGB value>>= 
Float m = std::max({rgb.r, rgb.g, rgb.b});
if (m > maxComponentValue)
    rgb *= maxComponentValue / m;
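For example (with illustrative values), a firefly sample with sensor RGB (30, 3, 0.3) and maxComponentValue = 10 is scaled by 10/30, giving (10, 1, 0.1). Note that the code rescales all three components by the same factor rather than clamping each channel independently, so the sample’s chromaticity is preserved and only its brightness is reduced.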

Given the possibly clamped RGB value, the pixel that the sample lies in can be updated by adding the sample’s contributions to the running sums of the numerator and denominator of Equation (5.13).

<<Update pixel values with filtered sample contribution>>= 
Pixel &pixel = pixels[pFilm];
for (int c = 0; c < 3; ++c)
    pixel.rgbSum[c] += weight * rgb[c];
pixel.weightSum += weight;

The AddSplat() method first reuses the first two fragments from AddSample() to compute the RGB value of the provided radiance L.

<<RGBFilm Method Definitions>>+= 
void RGBFilm::AddSplat(Point2f p, SampledSpectrum L,
                       const SampledWavelengths &lambda) {
    <<Convert sample radiance to PixelSensor RGB>>
    <<Optionally clamp sensor RGB value>>
    <<Compute bounds of affected pixels for splat, splatBounds>>
    for (Point2i pi : splatBounds) {
        <<Evaluate filter at pi and add splat contribution>>
    }
}

Because splatted contributions do not come from pixel samples but from points in the scene that are projected onto the film plane, a single splat may contribute to multiple pixels: each pixel’s reconstruction filter generally extends far enough to cover contributions from nearby points on the film.

First, a bounding box of potentially affected pixels is found using the filter’s radius. See Section 8.1.4, which explains the conventions for indexing into pixels in pbrt and, in particular, the addition of (0.5, 0.5) to the pixel coordinate here.

<<Compute bounds of affected pixels for splat, splatBounds>>= 
Point2f pDiscrete = p + Vector2f(0.5, 0.5);
Vector2f radius = filter.Radius();
Bounds2i splatBounds(Point2i(Floor(pDiscrete - radius)),
                     Point2i(Floor(pDiscrete + radius)) + Vector2i(1, 1));
splatBounds = Intersect(splatBounds, pixelBounds);
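To make the arithmetic concrete (values chosen only for illustration): a splat at p = (3.2, 7.7) with a filter radius of (2, 2) gives pDiscrete = (3.7, 8.2), so Floor(pDiscrete - radius) = (1, 6) and Floor(pDiscrete + radius) + (1, 1) = (6, 11). splatBounds therefore spans that range of pixel coordinates before being clipped against pixelBounds.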

If the filter weight is nonzero, the splat’s weighted contribution is added. Unlike with AddSample(), no sum of filter weights is maintained; normalization is handled later using the filter’s integral, as per Equation (5.10).

<<Evaluate filter at pi and add splat contribution>>= 
Float wt = filter.Evaluate(Point2f(p - pi - Vector2f(0.5, 0.5)));
if (wt != 0) {
    Pixel &pixel = pixels[pi];
    for (int i = 0; i < 3; ++i)
        pixel.rgbSplat[i].Add(wt * rgb[i]);
}

GetPixelRGB() returns the final RGB value for a given pixel in the RGBFilm’s output color space.

<<RGBFilm Public Methods>>+=  
RGB GetPixelRGB(Point2i p, Float splatScale = 1) const {
    const Pixel &pixel = pixels[p];
    RGB rgb(pixel.rgbSum[0], pixel.rgbSum[1], pixel.rgbSum[2]);
    <<Normalize rgb with weight sum>>
    <<Add splat value at pixel>>
    <<Convert rgb to output RGB color space>>
    return rgb;
}

First, the final pixel contribution from the values provided by AddSample() is computed via Equation (5.13).

<<Normalize rgb with weight sum>>= 
Float weightSum = pixel.weightSum;
if (weightSum != 0)
    rgb /= weightSum;

Then Equation (5.10) can be applied to incorporate any splatted values.

<<Add splat value at pixel>>= 
for (int c = 0; c < 3; ++c)
    rgb[c] += splatScale * pixel.rgbSplat[c] / filterIntegral;

Finally, the color conversion matrix brings the RGB value into the output color space.

<<Convert rgb to output RGB color space>>= 
rgb = outputRGBFromSensorRGB * rgb;
ToOutputRGB()’s implementation first uses the sensor to compute a sensor RGB and then converts to the output color space.

<<RGBFilm Public Methods>>+= 
RGB ToOutputRGB(SampledSpectrum L, const SampledWavelengths &lambda) const {
    RGB sensorRGB = sensor->ToSensorRGB(L, lambda);
    return outputRGBFromSensorRGB * sensorRGB;
}

We will not include the straightforward RGBFilm WriteImage() or GetImage() method implementations in the book. The former calls GetImage() before calling Image::Write(), and the latter fills in an image using GetPixelRGB() to get each pixel’s value.
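For readers who would like to see the general shape of those two methods anyway, the following is only a rough sketch: the Image constructor, SetChannel(), and Write() calls shown here, and the ImageMetadata fields that are filled in, are assumptions about the image utility class’s interface rather than code reproduced from pbrt.

Image RGBFilm::GetImage(ImageMetadata *metadata, Float splatScale) {
    // Allocate an RGB image covering the film's pixel bounds, using half- or
    // full-precision floats depending on writeFP16. (Assumed Image API.)
    PixelFormat format = writeFP16 ? PixelFormat::Half : PixelFormat::Float;
    Point2i resolution(pixelBounds.pMax.x - pixelBounds.pMin.x,
                       pixelBounds.pMax.y - pixelBounds.pMin.y);
    Image image(format, resolution, {"R", "G", "B"});

    // Copy each pixel's final value, as computed by GetPixelRGB(), into the
    // image, remapping from film pixel coordinates to image coordinates.
    for (Point2i p : pixelBounds) {
        RGB rgb = GetPixelRGB(p, splatScale);
        Point2i pImage(p.x - pixelBounds.pMin.x, p.y - pixelBounds.pMin.y);
        for (int c = 0; c < 3; ++c)
            image.SetChannel(pImage, c, rgb[c]);
    }

    metadata->pixelBounds = pixelBounds;
    metadata->fullResolution = fullResolution;
    return image;
}

void RGBFilm::WriteImage(ImageMetadata metadata, Float splatScale) {
    Image image = GetImage(&metadata, splatScale);
    image.Write(filename, metadata);
}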

5.4.7 GBufferFilm

The GBufferFilm stores not only RGB at each pixel, but also additional information about the geometry at the first visible intersection point. This additional information is useful for a variety of applications, ranging from image denoising algorithms to providing training data for machine learning applications.

<<GBufferFilm Definition>>= 
class GBufferFilm : public FilmBase {
  public:
    <<GBufferFilm Public Methods>>
    GBufferFilm(FilmBaseParameters p, const AnimatedTransform &outputFromRender,
                bool applyInverse, const RGBColorSpace *colorSpace,
                Float maxComponentValue = Infinity, bool writeFP16 = true,
                Allocator alloc = {});

    static GBufferFilm *Create(const ParameterDictionary &parameters,
                               Float exposureTime,
                               const CameraTransform &cameraTransform,
                               Filter filter, const RGBColorSpace *colorSpace,
                               const FileLoc *loc, Allocator alloc);

    PBRT_CPU_GPU
    void AddSample(Point2i pFilm, SampledSpectrum L,
                   const SampledWavelengths &lambda,
                   const VisibleSurface *visibleSurface, Float weight);

    PBRT_CPU_GPU
    void AddSplat(Point2f p, SampledSpectrum v, const SampledWavelengths &lambda);

    PBRT_CPU_GPU
    RGB ToOutputRGB(SampledSpectrum L, const SampledWavelengths &lambda) const {
        RGB cameraRGB = sensor->ToSensorRGB(L, lambda);
        return outputRGBFromSensorRGB * cameraRGB;
    }

    PBRT_CPU_GPU
    bool UsesVisibleSurface() const { return true; }

    PBRT_CPU_GPU
    RGB GetPixelRGB(Point2i p, Float splatScale = 1) const {
        const Pixel &pixel = pixels[p];
        RGB rgb(pixel.rgbSum[0], pixel.rgbSum[1], pixel.rgbSum[2]);
        // Normalize pixel with weight sum
        Float weightSum = pixel.weightSum;
        if (weightSum != 0)
            rgb /= weightSum;
        // Add splat value at pixel
        for (int c = 0; c < 3; ++c)
            rgb[c] += splatScale * pixel.rgbSplat[c] / filterIntegral;
        // Convert rgb to output RGB color space
        rgb = outputRGBFromSensorRGB * rgb;
        return rgb;
    }

    void WriteImage(ImageMetadata metadata, Float splatScale = 1);
    Image GetImage(ImageMetadata *metadata, Float splatScale = 1);
    std::string ToString() const;

  private:
    <<GBufferFilm::Pixel Definition>>
    <<GBufferFilm Private Members>>
    AnimatedTransform outputFromRender;
    bool applyInverse;
    Array2D<Pixel> pixels;
    const RGBColorSpace *colorSpace;
    Float maxComponentValue;
    bool writeFP16;
    Float filterIntegral;
    SquareMatrix<3> outputRGBFromSensorRGB;
};

We will not include any of the GBufferFilm implementation other than its Pixel structure, which augments the one used in RGBFilm with additional fields that store geometric information. It also stores estimates of the variance of the red, green, and blue color values at each pixel using the VarianceEstimator class, which is defined in Section B.2.11. The rest of the implementation is a straightforward generalization of RGBFilm that also updates these additional values.

<<GBufferFilm::Pixel Definition>>= 
struct Pixel {
    double rgbSum[3] = {0., 0., 0.};
    double weightSum = 0., gBufferWeightSum = 0.;
    AtomicDouble rgbSplat[3];
    Point3f pSum;
    Float dzdxSum = 0, dzdySum = 0;
    Normal3f nSum, nsSum;
    Point2f uvSum;
    double rgbAlbedoSum[3] = {0., 0., 0.};
    VarianceEstimator<Float> rgbVariance[3];
};