Further Reading

Many books have been written on Monte Carlo integration. Hammersley and Handscomb (1964), Spanier and Gelbard (1969), and Kalos and Whitlock (1986) are classic references. More recent books on the topic include those by Fishman (1996) and Liu (2001). Chib and Greenberg (1995) have written an approachable but rigorous introduction to the Metropolis algorithm. The Monte Carlo and Quasi Monte Carlo Web site is a useful gateway to recent work in the field (www.mcqmc.org).

Good general references about Monte Carlo and its application to computer graphics are the theses by Lafortune (1996) and Veach (1997). Dutré’s Global Illumination Compendium (2003) also has much useful information related to this topic. The course notes from the Monte Carlo ray-tracing course at SIGGRAPH have a wealth of practical information (Jensen et al. 2001a, 2003).

The square-to-disk mapping was described by Shirley and Chiu (1997). The implementation here benefits from observations by Cline and “franz” that the logic could be simplified considerably from the original algorithm (Shirley 2011). Marques et al. (2013) note that well-distributed samples on [0, 1)² still suffer some distortion when they are mapped to the sphere of directions and show how to generate low-discrepancy samples on the unit sphere.
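The simplified concentric mapping can be sketched in a few lines; this is a minimal standalone version, with names like ConcentricMapSketch chosen for illustration rather than taken from pbrt. A single branch on which coordinate has larger magnitude replaces the original four-quadrant case analysis.

```cpp
#include <cmath>

struct Point2f { float x, y; };

// Map (u, v) in [0, 1)^2 to a point on the unit disk, preserving
// relative areas (the Shirley-Chiu concentric mapping, simplified form).
Point2f ConcentricMapSketch(float u, float v) {
    // Remap to [-1, 1]^2 and handle the degenerate center point.
    float ox = 2 * u - 1, oy = 2 * v - 1;
    if (ox == 0 && oy == 0) return {0, 0};

    const float PiOver4 = 0.78539816f, PiOver2 = 1.57079633f;
    float r, theta;
    // One comparison selects between the two wedge orientations.
    if (std::abs(ox) > std::abs(oy)) {
        r = ox;
        theta = PiOver4 * (oy / ox);
    } else {
        r = oy;
        theta = PiOver2 - PiOver4 * (ox / oy);
    }
    return {r * std::cos(theta), r * std::sin(theta)};
}
```

The sign of r carries the quadrant information, which is what lets the branchy original collapse to two cases.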

Steigleder and McCool (2003) described an alternative to the multidimensional sampling approach from Section 13.6.7: they linearized 2D and higher dimensional domains into 1D using a Hilbert curve and then sampled using 1D samples over the 1D domain. This leads to a simpler implementation that still maintains desirable stratification properties of the sampling distribution, thanks to the spatial coherence preserving properties of the Hilbert curve.
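The key ingredient of this approach, converting a 1D position along the curve into 2D coordinates, can be sketched with the classic index-to-coordinate conversion for the Hilbert curve (the widely used rot/d2xy routine; Steigleder and McCool's actual implementation differs):

```cpp
#include <algorithm>

// Rotate/flip a quadrant so the curve's orientation is consistent.
void rot(int n, int *x, int *y, int rx, int ry) {
    if (ry == 0) {
        if (rx == 1) {
            *x = n - 1 - *x;
            *y = n - 1 - *y;
        }
        std::swap(*x, *y);
    }
}

// Convert distance d along the Hilbert curve over an n x n grid
// (n a power of two) into grid coordinates (x, y).
void d2xy(int n, int d, int *x, int *y) {
    int rx, ry, t = d;
    *x = *y = 0;
    for (int s = 1; s < n; s *= 2) {
        rx = 1 & (t / 2);
        ry = 1 & (t ^ rx);
        rot(s, x, y, rx, ry);
        *x += s * rx;
        *y += s * ry;
        t /= 4;
    }
}
```

A stratified 1D sample t in [0, 1) can then be mapped to 2D by taking d = floor(t * n * n) and jittering within the resulting cell; because consecutive d values are adjacent grid cells, 1D stratification carries over to 2D.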

Lawrence et al. (2005) described an adaptive representation for CDFs, where the CDF is approximated with a piecewise linear function with fewer, but irregularly spaced, vertices compared to the complete CDF. This approach can substantially reduce storage requirements and improve lookup efficiency, taking advantage of the fact that large ranges of the CDF may be efficiently approximated with a single linear function.

Cline et al. (2009b) observed that the time spent in binary searches for sampling from precomputed distributions (like Distribution1D does) can take a meaningful amount of execution time. (Indeed, pbrt spends nearly 7% of the time when rendering the car scene lit by an InfiniteAreaLight in the Distribution1D::SampleContinuous() method, which is used by InfiniteAreaLight::Sample_Li().) They presented two improved methods for doing this sort of search: the first stores a lookup table with n integer values, indexed by ⌊nξ⌋, which gives the first entry in the CDF array that is less than or equal to ξ. Starting a linear search from this offset in the CDF array can be much more efficient than a complete binary search over the entire array. They also presented a method based on approximating the inverse CDF as a piecewise linear function of ξ and thus enabling constant-time lookups at a cost of some accuracy (and thus some additional variance).
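The first of these methods, the guide table, can be sketched as follows. This is an illustrative implementation, not Cline et al.'s code: guide[k] records where the linear scan should start for any ξ in [k/n, (k+1)/n).

```cpp
#include <vector>
#include <algorithm>

// A discrete CDF augmented with a guide table that shortcuts the search.
struct GuidedCDF {
    std::vector<float> cdf;  // cdf[i] = P(X <= i); cdf.back() == 1
    std::vector<int> guide;  // search start for xi in [k/n, (k+1)/n)

    GuidedCDF(const std::vector<float> &c) : cdf(c), guide(c.size()) {
        int n = (int)cdf.size(), i = 0;
        // guide[k] = first index whose CDF value reaches k/n; since this
        // is monotone in k, the whole table is built in O(n).
        for (int k = 0; k < n; ++k) {
            while (cdf[i] < (float)k / n) ++i;
            guide[k] = i;
        }
    }

    // Sample with xi in [0, 1): start the linear scan at the guide entry
    // instead of binary-searching the whole CDF array.
    int Sample(float xi) const {
        int n = (int)cdf.size();
        int i = guide[std::min((int)(xi * n), n - 1)];
        while (cdf[i] < xi) ++i;
        return i;
    }
};
```

When the distribution is not too far from uniform, the scan from guide[k] typically touches only one or two entries, which is where the speedup over a full binary search comes from.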

The alias method is a technique that makes it possible to sample from discrete distributions in O(1) time (Walker 1974, 1977); this is much better than the O(log n) of the Distribution1D class when it is used for sampling discrete distributions. The downside of this approach is that it does not preserve sample stratification. See Schwarz’s writeup (2011) for information about implementing this approach well.
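A sketch of the alias method, using the numerically stable construction due to Vose that Schwarz's writeup recommends (names here are illustrative): each table entry stores a threshold and an "alias" index, and sampling costs one lookup and one comparison.

```cpp
#include <vector>

struct AliasTable {
    std::vector<float> q;    // acceptance threshold per entry
    std::vector<int> alias;  // fallback index per entry

    AliasTable(const std::vector<float> &p) {
        int n = (int)p.size();
        q.resize(n);
        alias.resize(n);
        // Scale probabilities so their mean is 1, then partition into
        // entries below and above the mean.
        std::vector<float> scaled(n);
        std::vector<int> small, large;
        for (int i = 0; i < n; ++i) {
            scaled[i] = p[i] * n;
            (scaled[i] < 1 ? small : large).push_back(i);
        }
        // Repeatedly pair a small entry with a large one: the large
        // entry donates just enough mass to fill the small one's cell.
        while (!small.empty() && !large.empty()) {
            int s = small.back(); small.pop_back();
            int l = large.back(); large.pop_back();
            q[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1 - scaled[s];
            (scaled[l] < 1 ? small : large).push_back(l);
        }
        // Leftovers (exactly 1 up to round-off) fill their own cells.
        for (int i : large) { q[i] = 1; alias[i] = i; }
        for (int i : small) { q[i] = 1; alias[i] = i; }
    }

    // O(1) sampling from two uniform random numbers in [0, 1).
    int Sample(float u1, float u2) const {
        int i = std::min((int)(u1 * q.size()), (int)q.size() - 1);
        return u2 < q[i] ? i : alias[i];
    }
};
```

Note that the two uniform samples are consumed in a way that scrambles any stratification they carried, which is the drawback mentioned above.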

Arithmetic coding offers another interesting way to approach sampling from distributions (MacKay 2003, p. 118; Piponi 2012). If we have a discrete set of probabilities we’d like to generate samples from, one way to approach the problem is to encode the CDF as a binary tree where each node splits the [0, 1) interval at some point and where, given a random sample ξ, we determine which sample value it corresponds to by traversing the tree until we reach the leaf node for its sample value. Ideally, we’d like to have leaf nodes that represent higher probabilities be higher up in the tree, so that it takes fewer traversal steps to find them (and thus, those more frequently generated samples are found more quickly). Looking at the problem from this perspective, it can be shown that the optimal structure of such a tree is given by Huffman coding, which is normally used for compression.
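This idea can be made concrete with a short sketch: build a Huffman tree over the probabilities, store each subtree's total mass at its node, and descend by repeatedly splitting (and renormalizing) ξ. This is an illustration of the principle, not code from the cited sources.

```cpp
#include <queue>
#include <vector>

struct Node {
    float mass;  // total probability of this subtree
    int value;   // sample value; valid only at leaves
    Node *left = nullptr, *right = nullptr;
};

// Standard Huffman construction: repeatedly merge the two lightest
// subtrees. Nodes live in the caller-provided pool.
Node *BuildHuffmanTree(const std::vector<float> &p, std::vector<Node> &pool) {
    pool.reserve(2 * p.size());  // keep pointers stable
    auto cmp = [](Node *a, Node *b) { return a->mass > b->mass; };
    std::priority_queue<Node *, std::vector<Node *>, decltype(cmp)> pq(cmp);
    for (int i = 0; i < (int)p.size(); ++i) {
        pool.push_back({p[i], i});
        pq.push(&pool.back());
    }
    while (pq.size() > 1) {
        Node *a = pq.top(); pq.pop();
        Node *b = pq.top(); pq.pop();
        pool.push_back({a->mass + b->mass, -1, a, b});
        pq.push(&pool.back());
    }
    return pq.top();
}

// Walk down the tree; renormalizing xi at each branch makes each leaf's
// probability proportional to its original weight.
int SampleTree(const Node *n, float xi) {
    while (n->left) {
        float pLeft = n->left->mass / n->mass;
        if (xi < pLeft) { xi /= pLeft; n = n->left; }
        else            { xi = (xi - pLeft) / (1 - pLeft); n = n->right; }
    }
    return n->value;
}
```

Because the Huffman construction puts high-probability leaves near the root, the expected number of traversal steps is minimized, matching the argument above.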

Mitchell (1996b) wrote a paper that investigates the effects of stratification for integration problems in graphics (including the 2D problem of pixel antialiasing). In particular, he investigated the connection between the complexity of the function being integrated and the effect of stratification. In general, the smoother or simpler the function, the more stratification helps: for pixels with smooth variation over their areas or with just a few edges passing through them, stratification helps substantially, but as the complexity in a pixel is increased, the gain from stratification is reduced. Nevertheless, because stratification never increases variance, there’s no reason not to do it.
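Mitchell's observation is easy to reproduce empirically. The sketch below (our own experiment, not his) estimates the integral of the smooth function √x over [0, 1) with n independent uniform samples versus n jittered samples and reports the empirical variance of the estimator over many trials:

```cpp
#include <random>
#include <cmath>

// Empirical variance of a one-sample-per-trial Monte Carlo estimate of
// the integral of sqrt(x) over [0, 1), using n samples per estimate.
double EstimatorVariance(bool stratified, int n, int trials) {
    std::mt19937 rng(7);  // fixed seed for a repeatable comparison
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double sum = 0, sumSq = 0;
    for (int t = 0; t < trials; ++t) {
        double estimate = 0;
        for (int i = 0; i < n; ++i) {
            // Jittering places one sample in each of n equal strata.
            double x = stratified ? (i + u(rng)) / n : u(rng);
            estimate += std::sqrt(x) / n;  // exact integral is 2/3
        }
        sum += estimate;
        sumSq += estimate * estimate;
    }
    double mean = sum / trials;
    return sumSq / trials - mean * mean;
}
```

For a smooth integrand like this one, the jittered estimator's variance is far below that of independent sampling, consistent with the discussion above; for integrands with more fine-scale structure per stratum, the gap narrows.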

Starting with Durand et al.’s work (2005), a number of researchers have approached the analysis of light transport and variance from Monte Carlo integration for rendering using Fourier analysis. See Pilleboue et al.’s paper (2015) for recent work in this area, including references to previous work. Among other results, they show that Poisson disk patterns give higher variance than simple jittered patterns. They found that the blue noise pattern of de Goes et al. (2012) was fairly effective. Other work in this area includes the paper by Subr and Kautz (2013).

Multiple importance sampling was developed by Veach and Guibas (Veach and Guibas 1995; Veach 1997). Normally, a pre-determined number of samples are taken using each sampling technique; see Pajot et al. (2011) and Lu et al. (2013) for approaches to adaptively distributing the samples over strategies in an effort to reduce variance by choosing those that are the best match to the integrand.


  1. Booth, T. E. 1986. A Monte Carlo learning/biasing experiment with intelligent random numbers. Nuclear Science and Engineering 92, 465–81.
  2. Chib, S., and E. Greenberg. 1995. Understanding the Metropolis–Hastings algorithm. The American Statistician 49 (4), 327–35.
  3. Cline, D., A. Razdan, and P. Wonka. 2009b. A comparison of tabular PDF inversion methods. Computer Graphics Forum 28 (1), 154–60.
  4. Cline, D., D. Adams, and P. Egbert. 2008. Table-driven adaptive importance sampling. Computer Graphics Forum (Proceedings of the 2008 Eurographics Symposium on Rendering) 27 (4), 1115–23.
  5. de Goes, F., K. Breeden, V. Ostromoukhov, and M. Desbrun. 2012. Blue noise through optimal transport. ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia) 31 (6), 171:1–171:11.
  6. Durand, F., N. Holzschuch, C. Soler, E. Chan, and F. X. Sillion. 2005. A frequency analysis of light transport. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2005) 24 (3), 1115–26.
  7. Dutré, P. 2003. Global illumination compendium. www.cs.kuleuven.ac.be/~phil/GI/.
  8. Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms, and Applications. New York: Springer-Verlag.
  9. Hammersley, J., and D. Handscomb. 1964. Monte Carlo Methods. New York: John Wiley.
  10. Jensen, H. W., J. Arvo, M. Fajardo, P. Hanrahan, D. Mitchell, M. Pharr, and P. Shirley. 2001a. State of the art in Monte Carlo ray tracing for realistic image synthesis. In SIGGRAPH 2001 Course 29, Los Angeles.
  11. Jensen, H. W., J. Arvo, P. Dutré, A. Keller, A. Owen, M. Pharr, and P. Shirley. 2003. Monte Carlo ray tracing. In SIGGRAPH 2003 Course, San Diego.
  12. Kalos, M. H., and P. A. Whitlock. 1986. Monte Carlo Methods: Volume I: Basics. New York: Wiley.
  13. Lafortune, E. 1996. Mathematical models and Monte Carlo algorithms for physically based rendering. Ph.D. thesis, Katholieke Universiteit Leuven.
  14. Lawrence, J., S. Rusinkiewicz, and R. Ramamoorthi. 2005. Adaptive numerical cumulative distribution functions for efficient importance sampling. Rendering Techniques 2005: 16th Eurographics Workshop on Rendering, 11–20.
  15. Liu, J. S. 2001. Monte Carlo Strategies in Scientific Computing. New York: Springer-Verlag.
  16. Lu, H., R. Pacanowski, and X. Granier. 2013. Second-order approximation for variance reduction in multiple importance sampling. Computer Graphics Forum 32 (7), 131–36.
  17. MacKay, D. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge: Cambridge University Press.
  18. Marques, R., C. Bouville, M. Ribardière, L. P. Santos, and K. Bouatouch. 2013. Spherical Fibonacci point sets for illumination integrals. Computer Graphics Forum (Proceedings of the 2013 Eurographics Symposium on Rendering) 32 (4), 134–43.
  19. Metropolis, N., A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller. 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21 (6), 1087–92.
  20. Mitchell, D. P. 1996b. Consequences of stratified sampling in graphics. In Proceedings of SIGGRAPH ’96, Computer Graphics Proceedings, Annual Conference Series, New Orleans, Louisiana, 277–80.
  21. Motwani, R., and P. Raghavan. 1995. Randomized Algorithms. Cambridge, U.K.: Cambridge University Press.
  22. Owen, A. B. 1998. Latin supercube sampling for very high-dimensional simulations. Modeling and Computer Simulation 8 (1), 71–102.
  23. Pajot, A., L. Barthe, M. Paulin, and P. Poulin. 2011. Representativity for robust and adaptive multiple importance sampling. IEEE Transactions on Visualization and Computer Graphics 17 (8), 1108–21.
  24. Pilleboue, A., G. Singh, D. Coeurjolly, M. Kazhdan, and V. Ostromoukhov. 2015. Variance analysis for Monte Carlo integration. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2015) 34 (4), 124:1–124:14.
  25. Piponi, D. 2012. Lossless decompression and the generation of random samples. http://blog.sigfpe.com/2012/01/lossless-decompression-and-generation.html.
  26. Ross, S. M. 2002. Introduction to Probability Models (8th ed.). San Diego: Academic Press.
  27. Schwarz, K. 2011. Darts, dice, and coins: sampling from a discrete distribution. http://www.keithschwarz.com/darts-dice-coins/.
  28. Shirley, P. 2011. Improved code for concentric map. http://psgraphics.blogspot.com/2011/01/improved-code-for-concentric-map.html.
  29. Shirley, P., and K. Chiu. 1997. A low distortion map between disk and square. Journal of Graphics Tools 2 (3), 45–52.
  30. Spanier, J., and E. M. Gelbard. 1969. Monte Carlo Principles and Neutron Transport Problems. Reading, Massachusetts: Addison-Wesley.
  31. Steigleder, M., and M. McCool. 2003. Generalized stratified sampling using the Hilbert curve. Journal of Graphics Tools 8 (3), 41–47.
  32. Subr, K., and J. Kautz. 2013. Fourier analysis of stochastic sampling strategies for assessing bias and variance in integration. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2013) 32 (4), 128:1–128:12.
  33. Veach, E. 1997. Robust Monte Carlo methods for light transport simulation. Ph.D. thesis, Stanford University.
  34. Veach, E., and L. J. Guibas. 1995. Optimally combining sampling techniques for Monte Carlo rendering. In Computer Graphics (SIGGRAPH ’95 Proceedings), 419–28.
  35. Walker, A. J. 1974. New fast method for generating discrete random numbers with arbitrary frequency distributions. Electronics Letters 10 (8), 127–28.
  36. Walker, A. J. 1977. An efficient method for generating discrete random variables with general distributions. ACM Transactions on Mathematical Software 3 (3), 253–56.