7.3 Stratified Sampling
The first Sampler implementation that we will introduce subdivides pixel areas into rectangular regions and generates a single sample inside each region. These regions are commonly called strata, and this sampler is called the StratifiedSampler. The key idea behind stratification is that by subdividing the sampling domain into nonoverlapping regions and taking a single sample from each one, we are less likely to miss important features of the image entirely, since the samples are guaranteed not to all be close together. Put another way, it does us no good if many samples are taken from nearby points in the sample space, since each new sample doesn’t add much new information about the behavior of the image function. From a signal processing viewpoint, we are implicitly defining an overall sampling rate such that the smaller the strata are, the more of them we have, and thus the higher the sampling rate.
The stratified sampler places each sample at a random point inside each stratum by jittering the center point of the stratum by a random amount up to half the stratum’s width and height. The nonuniformity that results from this jittering helps turn aliasing into noise, as discussed in Section 7.1. The sampler also offers an unjittered mode, which gives uniform sampling in the strata; this mode is mostly useful for comparisons between different sampling techniques rather than for rendering high quality images.
Direct application of stratification to high-dimensional sampling quickly leads to an intractable number of samples. For example, if we divided the 5D image, lens, and time sample space into four strata in each dimension, the total number of samples per pixel would be . We could reduce this impact by taking fewer samples in some dimensions (or not stratifying some dimensions, effectively using a single stratum), but we would then lose the benefit of having well-stratified samples in those dimensions. This problem with stratification is known as the curse of dimensionality.
We can reap most of the benefits of stratification without paying the price in excessive total sampling by computing lower dimensional stratified patterns for subsets of the domain’s dimensions and then randomly associating samples from each set of dimensions. (This process is sometimes called padding.) Figure 7.16 shows the basic idea: we might want to take just four samples per pixel but still have the samples be stratified over all dimensions. We independently generate four 2D stratified image samples, four 1D stratified time samples, and four 2D stratified lens samples. Then we randomly associate a time and lens sample value with each image sample. The result is that each pixel has samples that together have good coverage of the sample space.
Figure 7.17 shows the improvement in image quality from using stratified lens samples versus using unstratified random samples when rendering depth of field.
Figure 7.18 shows a comparison of a few sampling patterns. The first is a completely random pattern: we generated a number of samples without using the strata at all. The result is terrible; some regions have few samples and other areas have clumps of many samples. The second is a uniform stratified pattern. In the last, the uniform pattern has been jittered, with a random offset added to each sample’s location, keeping it inside its cell. This gives a better overall distribution than the purely random pattern while preserving the benefits of stratification, though there are still some clumps of samples and some regions that are undersampled.
As a PixelSampler subclass, the implementation of StartPixel() must both generate 1D and 2D samples for the number of dimensions nSampledDimensions passed to the PixelSampler constructor as well as samples for the requested arrays.
After the initial stratified samples are generated, they are randomly shuffled; this is the padding approach described at the start of the section. If this shuffling wasn’t done, then the sample dimensions’ values would be correlated in a way that would lead to errors in images—for example, both the first 2D sample used to choose the film location as well as the first 2D lens sample would always both be in the lower left stratum adjacent to the origin.
The 1D and 2D stratified sampling routines are implemented as utility functions. Both loop over the given number of strata in the domain and place a sample point in each one.
StratifiedSample2D() similarly generates samples in the range .
The Shuffle() function randomly permutes an array of count sample values, each of which has nDimensions dimensions. (In other words, blocks of values of size nDimensions are permuted.)
Arrays of samples present us with a quandary: for example, if an integrator asks for an array of 64 2D sample values in the sample vector for each sample in a pixel, the sampler has two different goals to try to fulfill:
- It’s desirable that the samples in the array themselves be well distributed in 2D (e.g., by using an stratified grid). Stratification here will improve the quality of the computed results for each individual sample vector.
- It’s desirable to ensure that each of the samples in the array for one image sample isn’t too similar to any of the sample values for samples nearby in the image. Rather, we’d like the points to be well distributed with respect to their neighbors, so that over the region around a single pixel, there is good coverage of the entire sample space.
Rather than trying to solve both of these problems simultaneously here, the StratifiedSampler only addresses the first one. The other samplers later in this chapter will revisit this issue with more sophisticated techniques and solve both of them simultaneously to various degrees.
A second complication comes from the fact that the caller may have asked for an arbitrary number of samples per image sample, so stratification may not be easily applied. (For example, how do we generate a stratified 2D pattern of seven samples?) We could just generate an or stratified pattern, but this only gives us the benefit of stratification in one dimension and no guarantee of a good pattern in the other dimension. A StratifiedSampler::RoundSize() method could round requests up to the next number that’s the square of integers, but instead we will use an approach called Latin hypercube sampling (LHS), which can generate any number of samples in any number of dimensions with a reasonably good distribution.
LHS uniformly divides each dimension’s axis into regions and generates a jittered sample in each of the regions along the diagonal, as shown on the left in Figure 7.20. These samples are then randomly shuffled in each dimension, creating a pattern with good distribution.
An advantage of LHS is that it minimizes clumping of the samples when they are projected onto any of the axes of the sampling dimensions. This property is in contrast to stratified sampling, where of the samples in a 2D pattern may project to essentially the same point on each of the axes. Figure 7.21 shows this worst-case situation for a stratified sampling pattern.
In spite of addressing the clumping problem, LHS isn’t necessarily an improvement to stratified sampling; it’s easy to construct cases where the sample positions are essentially colinear and large areas of the sampling domain have no samples near them (e.g., when the permutation of the original samples is the identity, leaving them all where they started). In particular, as increases, Latin hypercube patterns are less and less effective compared to stratified patterns.
The general-purpose LatinHypercube() function generates an arbitrary number of LHS samples in an arbitrary dimension. The number of elements in the samples array should thus be nSamples*nDim.
To do the permutation, this function loops over the samples, randomly permuting the sample points in one dimension at a time. Note that this is a different permutation than the earlier Shuffle() routine: that routine does one permutation, keeping all nDim sample points in each sample together, while here nDim separate permutations of a single dimension at a time are done (Figure 7.22).
Given the LatinHypercube() function, we can now write the code to compute sample arrays for the current pixel. 1D samples are stratified and then randomly shuffled, while 2D samples are generated using Latin hypercube sampling.
We’ll use the scene in Figure 7.23 to demonstrate properties of some of the Sampler implementations.
Figure 7.24 shows the improvement from good samples for the DirectLightingIntegrator. The first image was computed with 1 image sample per pixel, each with 16 shadow samples. The second was computed with 16 image samples per pixel, each with 1 shadow sample. Because the StratifiedSampler could generate a good LHS pattern for the first case, the quality of the shadow is much better, even with the same total number of shadow samples taken.