Further Reading

Hacker’s Delight (Warren 2006) is a delightful and thought-provoking exploration of bit-twiddling algorithms like those used in some of the utility routines in this appendix. Sean Anderson (2004) has a Web page filled with a collection of bit-twiddling techniques like the ones in IsPowerOf2() and RoundUpPow2() at graphics.stanford.edu/~seander/bithacks.html.

The MurmurHash hashing function that is wrapped by pbrt’s Hash() and HashBuffer() functions is due to Appleby (2011) and the implementation of MixBits() is due to Stafford (2011), who found the various constant values used in the implementation via search.

The inverse bilinear interpolation function implemented in InvertBilinear() is due to Quilez (2010) and SinXOverX() is thanks to Hatch (2003).

The algorithm implemented in TwoSum() is due to Møller (1965) and Knuth (1969), and the FMA-based TwoProd() was developed by Ogita et al. (2005). The approach used in the CompensatedSum class is due to Kahan (1965). The approach used in DifferenceOfProducts() is also attributed to Kahan; its error was analyzed by Jeannerod et al. (2013).

Welford (1962) developed the algorithm that is implemented in the VarianceEstimator class. Its Merge() method is based on an algorithm developed by Chan et al. (1979).

Atkinson’s book (1993) on numerical analysis discusses algorithms for matrix inversion and solving linear systems. See Moore’s book (1966) for an introduction to interval arithmetic.

Farin’s book (2001) is a good introduction to splines. The blossoming approach was introduced by Ramshaw (1987); his report remains a readable introduction to the topic. A subsequent publication drew further connections to polar forms and related work (Ramshaw 1989).

The PCG random number generator was developed by O’Neill (2014). The paper describing its implementation is well written and also features extensive discussion of a range of previous pseudo-random number generators and the challenges that they have faced in passing rigorous tests of their quality (L’Ecuyer and Simard 2007).

The article “UTF-8 Everywhere” by Radzivilovsky et al. (2012) is a good introduction to Unicode and also makes a strong case for adopting the UTF-8 representation. pbrt follows the approach they propose for interoperating with Windows’s UTF-16-based APIs. At over 1,000 pages, the length of the official Unicode specification gives some sense of the complexities in representing multi-lingual text (Unicode Consortium 2020).

Gamma correction has a long history in computer graphics. Poynton (2002a, 2002b) has written comprehensive FAQs on issues related to color representation and gamma correction. The sRGB encoding was described by the International Electrotechnical Commission (1999). See Gritz and d’Eon (2008) for a detailed discussion of the implications of gamma correction for rendering and how to correctly account for it in rendering systems.

McKenney’s book on parallel programming is wonderfully written and has comprehensive coverage of the underlying issues, as well as many useful techniques for high-performance parallel programming on CPUs (2021). Drepper’s paper (2007) is a useful resource for understanding performance issues related to caches, cache coherence, and main memory access, particularly in multicore systems.

Boehm’s paper “Threads Cannot Be Implemented as a Library” (2005) makes the remarkable (and disconcerting) observation that multi-threading cannot be reliably implemented without the compiler having explicit knowledge of the fact that multi-threaded execution is expected. Boehm presented a number of examples that demonstrate the corresponding dangers in 2005-era compilers and language standards like C and C++ that did not have awareness of threading. Fortunately, the C++11 and C11 standards addressed the issues that he identified.

pbrt’s parallel for loop–based approach to multi-threading is a widely used technique for multi-threaded programming; the OpenMP standard supports a similar construct (and much more) (OpenMP Architecture Review Board 2013). A slightly more general model for multi-core parallelism is available from task systems, where computations are broken up into a set of independent tasks that can be executed concurrently. That model is supported through RunAsync(). Blumofe et al. (1996) described the task scheduler in Cilk, and Blumofe and Leiserson (1999) described the work-stealing algorithm that is the mainstay of many current high-performance task systems.

References

  1. Anderson, S. 2004. Bit twiddling hacks. graphics.stanford.edu/ seander/bithacks.html.
  2. Appleby, A. 2011. MurmurHash3. https://sites.google.com/site/murmurhash/.
  3. Atkinson, K. 1993. Elementary Numerical Analysis. New York: John Wiley & Sons.
  4. Blumofe, R., and C. Leiserson. 1999. Scheduling multithreaded computations by work stealing. Journal of the ACM 46 (5), 720–48.
  5. Blumofe, R., C. Joerg, B. Kuszmaul, C. Leiserson, K. Randall, and Y. Zhou. 1996. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing 37 (1), 55–69.
  6. Boehm, H.-J. 2005. Threads cannot be implemented as a library. ACM SIGPLAN Notices 40 (6), 261–68.
  7. Chan, T. F., G. Golub, R. J. LeVeque. 1979. Updating formulae and a pairwise algorithm for computing sample variances. Technical Report STAN-CS-79-773, Department of Computer Science, Stanford University.
  8. Drepper, U. 2007. What every programmer should know about memory. people.redhat.com/drepper/cpumemory.pdf.
  9. Farin, G. 2001. Curves and Surfaces for CAGD: A Practical Guide (5th ed.). San Francisco: Morgan Kaufmann.
  10. Gritz, L., and E. d’Eon. 2008. The importance of being linear. In H. Nguyen (ed.), GPU Gems 3, 529–42. Boston, Massachusetts: Addison-Wesley.
  11. Guthe, S., and P. Heckbert 2005. Non-power-of-two Mipmap creation. NVIDIA Technical Report.
  12. Hatch, D. 2003. The right way to calculate stuff. http://www.plunk.org/ hatch/rightway.html.
  13. International Electrotechnical Commission (IEC). 1999. Multimedia systems and equipment—Colour measurement and management—Part 2-1: Colour management—Default RGB colour space—sRGB. IEC Standard 61966-2-1.
  14. Jeannerod, C.-P., N. Louvet, and J.-M. Muller. 2013. Further analysis of Kahan’s algorithm for the accurate computation of 2 times 2 determinants. Mathematics of Computation 82 (284), 2245–64.
  15. Kahan, W. 1965. Further remarks on reducing truncation errors. Communications of the ACM 8 (1), 40.
  16. Kainz, F., R. Bogart, and D. Hess. 2004. The OpenEXR File Format. In R. Fernando (ed.), GPU Gems, 425–44. Reading, Massachusetts: Addison-Wesley.
  17. Knuth, D. E. 1969. The Art of Computer Programming: Seminumerical Algorithms. Reading, Massachusetts: Addison-Wesley.
  18. L’Ecuyer, P., and R. Simard. 2007. TestU01: A C library for empirical testing of random number generators. ACM Transactions on Mathematical Software 33 (4), 22:1–40.
  19. Møller, O. 1965. Quasi double precision in floating-point arithmetic. BIT Numerical Mathematics 5, 37–50.
  20. McKenney, P. E. 2021. Is Parallel Programming Hard, and, If So, What Can You Do About It? https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html.
  21. Moore, R. E. 1966. Interval Analysis. Englewood Cliffs, New Jersey: Prentice Hall.
  22. O’Neill, M. 2014. PCG: A family of simple fast space-efficient statistically good algorithms for random number generation. Unpublished manuscript. http://www.pcg-random.org/paper.html.
  23. Ogita, T., S. M. Rump, and S. Oishi. 2005. Accurate sum and dot product. SIAM Journal on Scientific Computing 26 (6), 1955–88.
  24. OpenMP Architecture Review Board. 2013. OpenMP Application Program Interface. http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf.
  25. Poynton, C. 2002a. Frequently-asked questions about color. www.poynton.com/ColorFAQ.html.
  26. Poynton, C. 2002b. Frequently-asked questions about gamma. www.poynton.com/GammaFAQ.html.
  27. Quilez, I. 2010. Inverse bilinear interpolation. https://www.iquilezles.org/www/articles/ibilinear/ibilinear.htm.
  28. Radzivilovsky, P., Y. Galka, and S. Novgorodov. 2012. UTF-8 everywhere. http://utf8everywhere.org.
  29. Ramshaw, L. 1987. Blossoming: A connect-the-dots approach to splines. Digital Systems Research Center Technical Report.
  30. Ramshaw, R. 1989. Blossoms are polar forms. Computer Aided Geometric Design 6 (4), 323–58.
  31. Stafford, D. 2011. Better bit mixing—improving on MurmurHash3’s 64-bit finalizer. http://zimbry.blogspot.com/2011/09/better-bit-mixing-improving-on.html.
  32. Unicode Consortium. 2020. The Unicode Standard: Version 13.0. https://www.unicode.org/versions/Unicode13.0.0/UnicodeStandard-13.0.pdf.
  33. Warren, H. 2006. Hacker’s Delight. Reading, Massachusetts: Addison-Wesley.
  34. Welford, B. P. 1962. Note on a method for calculating corrected sums of squares and products. Technometrics 4 (3), 419–20.