7.2 Sampling Interface
As first introduced in Section 7.1.5, the rendering approach implemented in pbrt involves choosing sample points in additional dimensions beyond 2D points on the image plane. Various algorithms will be used to generate these points, but all of their implementations inherit from an abstract Sampler class that defines their interface. The core sampling declarations and functions are in the files core/sampler.h and core/sampler.cpp. Each of the sample generation implementations is in its own source files in the samplers/ directory.
The task of a Sampler is to generate a sequence of -dimensional samples in , where one such sample vector is generated for each image sample and where the number of dimensions in each sample may vary, depending on the calculations performed by the light transport algorithms. (See Figure 7.13.)
Because sample values must be strictly less than 1, it’s useful to define a constant, OneMinusEpsilon, that represents the largest representable floating-point constant that is less than 1. Later, we will clamp sample vector values to be no larger than this value.
The simplest possible implementation of a Sampler would just return uniform random values in each time an additional component of the sample vector was needed. Such a sampler would produce correct images but would require many more samples (and thus, many more rays traced and much more time) to create images of the same quality achievable with more sophisticated samplers. The run-time expense for using better sampling patterns is approximately the same as that for lower-quality patterns like uniform random numbers; because evaluating the radiance for each image sample is much more expensive than computing the sample’s component values, doing this work pays dividends (Figure 7.14).
A few characteristics of these sample vectors are assumed in the following:
- The first five dimensions generated by Samplers are generally used by the Camera. In this case, the first two are specifically used to choose a point on the image inside the current pixel area; the third is used to compute the time at which the sample should be taken; and the fourth and fifth dimensions give a lens position for depth of field.
- Some sampling algorithms generate better samples in some dimensions of the sample vector than in others. Elsewhere in the system, we assume that in general, the earlier dimensions have the most well-placed sample values.
Note also that the -dimensional samples generated by the Sampler are generally not represented explicitly or stored in their entirety but are often generated incrementally as needed by the light transport algorithm. (However, storing the entire sample vector and making incremental changes to its components is the basis of the MLTSampler in Section 16.4.4, which is used by the MLTIntegrator in Section 16.4.5.)
7.2.1 Evaluating Sample Patterns: Discrepancy
Fourier analysis gave us one way of evaluating the quality of a 2D sampling pattern, but it took us only as far as being able to quantify the improvement from adding more evenly spaced samples in terms of the band-limited frequencies that could be represented. Given the presence of infinite frequency content from edges in images and given the need for -dimensional sample vectors for Monte Carlo light transport algorithms, Fourier analysis alone isn’t enough for our needs.
Given a renderer and a candidate algorithm for placing samples, one way to evaluate the algorithm’s effectiveness is to use that sampling pattern to render an image and to compute the error in the image compared to a reference image rendered with a large number of samples. We will use this approach to compare sampling algorithms later in this chapter, though it only tells us how well the algorithm did for one specific scene, and it doesn’t give us a sense of the quality of the sample points without going through the rendering process.
Outside of Fourier analysis, mathematicians have developed a concept called discrepancy that can be used to evaluate the quality of a pattern of -dimensional sample positions. Patterns that are well distributed (in a manner to be formalized shortly) have low discrepancy values, and thus the sample pattern generation problem can be considered to be one of finding a suitable low-discrepancy pattern of points. A number of deterministic techniques have been developed that generate low-discrepancy point sets, even in high-dimensional spaces. (Most of the sampling algorithms used later in this chapter use these techniques.)
The basic idea of discrepancy is that the quality of a set of points in an -dimensional space can be evaluated by looking at regions of the domain , counting the number of points inside each region, and comparing the volume of each region to the number of sample points inside. In general, a given fraction of the volume should have roughly the same fraction of the total number of sample points inside of it. While it’s not possible for this always to be the case, we can still try to use patterns that minimize the maximum difference between the actual volume and the volume estimated by the points (the discrepancy). Figure 7.15 shows an example of the idea in two dimensions.
To compute the discrepancy of a set of points, we first pick a family of shapes that are subsets of . For example, boxes with one corner at the origin are often used. This corresponds to
where . Given a sequence of sample points , the discrepancy of with respect to is
where is the number of points in and is the volume of .
The intuition for why Equation (7.4) is a reasonable measure of quality is that the value is an approximation of the volume of the box given by the particular points . Therefore, the discrepancy is the worst error over all possible boxes from this way of approximating the volume. When the set of shapes is the set of boxes with a corner at the origin, this value is called the star discrepancy, . Another popular option for is the set of all axis-aligned boxes, where the restriction that one corner be at the origin has been removed.
For a few particular point sets, the discrepancy can be computed analytically. For example, consider the set of points in one dimension
We can see that the star discrepancy of is
For example, take the interval . Then , but . This interval (and the intervals , etc.) is the interval where the largest differences between volume and fraction of points inside the volume are seen.
The star discrepancy of this sequence can be improved by modifying it slightly:
The bounds for the star discrepancy of a sequence of points in one dimension have been shown to be
Thus, the earlier sequence from Equation (7.5) has the lowest possible discrepancy for a sequence in 1D. In general, it is much easier to analyze and compute bounds for the discrepancy of sequences in 1D than for those in higher dimensions. For less simply constructed point sequences and for sequences in higher dimensions and for more irregular shapes than boxes, the discrepancy often must be estimated numerically by constructing a large number of shapes , computing their discrepancy, and reporting the maximum value found.
The astute reader will notice that according to the low-discrepancy measure, this uniform sequence in 1D is optimal, but earlier in this chapter we claimed that irregular jittered patterns were perceptually superior to uniform patterns for image sampling in 2D since they replaced aliasing error with noise. In that framework, uniform samples are clearly not optimal. Fortunately, low-discrepancy patterns in higher dimensions are much less uniform than they are in one dimension and thus usually work reasonably well as sample patterns in practice. Nevertheless, their underlying uniformity means that low-discrepancy patterns can be more prone to visually objectionable aliasing than patterns with pseudo-random variation.
Discrepancy alone isn’t necessarily a good metric: some low-discrepancy point sets exhibit some clumping of samples, where two or more samples may be quite close together. The Sobol sampler in Section 7.7 particularly suffers from this issue—see Figure 7.36, which shows a plot of its first two dimensions. Intuitively, samples that are too close together aren’t a good use of sampling resources: the closer one sample is to another, the less likely it is to give useful new information about the function being sampled. Therefore, computing the minimum distance between any two samples in a set of points has also proved to be a useful metric of sample pattern quality; the higher the minimum distance, the better.
There are a variety of algorithms for generating Poisson disk sampling patterns that score well by this metric. By construction, no two points in a Poisson disk pattern are closer than some distance . Studies have shown that the rods and cones in the eye are distributed in a similar way, which further validates the idea that this distribution is a good one for imaging. In practice, we have found that Poisson disk patterns work very well for sampling 2D images but are less effective than the better low discrepancy patterns for the higher-dimensional sampling done in more complex rendering situations; see the “Further Reading” section for more information.
7.2.2 Basic Sampler Interface
The Sampler base class not only defines the interface to samplers but also provides some common functionality for use by Sampler implementations.
All Sampler implementations must supply the constructor with the number of samples that will be generated for each pixel in the final image. In rare cases, it may be useful for the system to model the film as having only a single “pixel” that covers the entire viewing region. (This overloading of the definition of pixel is something of a stretch, but we allow it to simplify certain implementation aspects.) Since this “pixel” could potentially have billions of samples, we store the sample count using a variable with 64 bits of precision.
When the rendering algorithm is ready to start work on a given pixel, it starts by calling StartPixel(), providing the coordinates of the pixel in the image. Some Sampler implementations use the knowledge of which pixel is being sampled to improve the overall distribution of the samples that they generate for the pixel, while others ignore this information.
The Get1D() method returns the sample value for the next dimension of the current sample vector, and Get2D() returns the sample values for the next two dimensions. While a 2D sample value could be constructed by using values returned by a pair of calls to Get1D(), some samplers can generate better point distributions if they know that two dimensions will be used together.
In pbrt, we don’t support requests for 3D or higher dimensional sample values from samplers because these are generally not needed for the types of rendering algorithms implemented here. If necessary, multiple values from lower dimensional components can be used to construct higher dimensional sample points.
A sharp edge of these interfaces is that code that uses sample values must be carefully written so that it always requests sample dimensions in the same order. Consider the following code:
In this case, the first dimension of the sample vector will always be passed to the function a(); when the code path that calls b() is executed, b() will receive the second dimension. However, if the if test isn’t always true or false, then c() will sometimes receive a sample from the second dimension of the sample vector and otherwise receive a sample from the third dimension. Thus, efforts by the sampler to provide well-distributed sample points in each dimension being evaluated have been thwarted. Code that uses Samplers should therefore be carefully written so that it consistently consumes sample vector dimensions to avoid this issue.
For convenience, the Sampler base class provides a method that initializes a CameraSample for a given pixel.
Some rendering algorithms make use of arrays of sample values for some of the dimensions they sample; most sample-generation algorithms can generate higher quality arrays of samples than series of individual samples by accounting for the distribution of sample values across all elements of the array and across the samples in a pixel.
If arrays of samples are needed, they must be requested before rendering begins. The RequestDArray() methods should be called for each such dimension’s array before rendering begins—for example, in methods that override the SamplerIntegrator::Preprocess() method. For example, in a scene with two area light sources, where the integrator traces four shadow rays to the first source and eight to the second, the integrator would ask for two 2D sample arrays for each image sample, with four and eight samples each. (A 2D array is required because two dimensions are needed to parameterize the surface of a light.) In Section 13.7, we will see how using arrays of samples corresponds to more densely sampling some of the dimensions of the light transport integral using the Monte Carlo technique of “splitting.”
Most Samplers can do a better job of generating some particular sizes of these arrays than others. For example, samples from the ZeroTwoSequenceSampler are much better distributed in quantities that are in powers of 2. The Sampler::RoundCount() method helps communicate this information. Code that needs arrays of samples should call this method with the desired number of samples to be taken, giving the Sampler an opportunity to adjust the number of samples to a better number. The returned value should then be used as the number of samples to actually request from the Sampler. The default implementation returns the given count unchanged.
During rendering, the GetDArray() methods can be called to get a pointer to the start of a previously requested array of samples for the current dimension. Along the lines of Get1D() and Get2D(), these return a pointer to an array of samples whose size is given by the parameter n to the corresponding call to RequestDArray() during initialization. The caller must also provide the array size to the “get” method, which is used to verify that the returned buffer has the expected size.
When the work for one sample is complete, the integrator calls StartNextSample(). This call notifies the Sampler that subsequent requests for sample components should return values starting at the first dimension of the next sample for the current pixel. This method returns true until the number of the originally requested samples per pixel have been generated (at which point the caller should either start work on another pixel or stop trying to use more samples.)
Sampler implementations store a variety of state about the current sample: which pixel is being sampled, how many dimensions of the sample have been used, and so forth. It is therefore natural for it to be unsafe for a single Sampler to be used concurrently by multiple threads. The Clone() method generates a new instance of an initial Sampler for use by a rendering thread; it takes a seed value for the sampler’s random number generator (if any), so that different threads see different sequences of random numbers. Reusing the same pseudo-random number sequence across multiple image tiles can lead to subtle image artifacts, such as repeating noise patterns.
Implementations of the various Clone() methods aren’t generally interesting, so they won’t be included in the text here.
Some light transport algorithms (notably stochastic progressive photon mapping in Section 16.2) don’t use all of the samples in a pixel before going to the next pixel, but instead jump around pixels, taking one sample at a time in each one. The SetSampleNumber() method allows integrators to set the index of the sample in the current pixel to generate next. This method returns false once sampleNum is greater than or equal to the number of originally requested samples per pixel.
7.2.3 Sampler Implementation
The Sampler base class provides implementations of some of the methods in its interface. First, the StartPixel() method implementation records the coordinates of the current pixel being sampled and resets currentPixelSampleIndex, the sample number in the pixel currently being generated, to zero. Note that this is a virtual method with an implementation; subclasses that override this method are required to explicitly call Sampler::StartPixel().
The current pixel coordinates and sample number within the pixel are made available to Sampler subclasses, though they should treat these as read-only values.
When the pixel sample is advanced or explicitly set, currentPixelSampleIndex is updated accordingly. As with StartPixel(), the methods StartNextSample() and SetSampleNumber() are both virtual implementations; these implementations also must be explicitly called by overridden implementations of them in Sampler subclasses.
The base Sampler implementation also takes care of recording requests for arrays of sample components and allocating storage for their values. The sizes of the requested sample arrays are stored in samples1DArraySizes and samples2DArraySizes, and memory for an entire pixel’s worth of array samples is allocated in sampleArray1D and sampleArray2D. The first n values in each allocation are used for the corresponding array for the first sample in the pixel, and so forth.
As arrays in the current sample are accessed by the GetDArray() methods, array1DOffset and array2DOffset are updated to hold the index of the next array to return for the sample vector.
When a new pixel is started or when the sample number in the current pixel changes, these array offsets must be reset to 0.
Returning the appropriate array pointer is a matter of first choosing the appropriate array based on how many have been consumed in the current sample vector and then returning the appropriate instance of it based on the current pixel sample index.
7.2.4 Pixel Sampler
While some sampling algorithms can easily incrementally generate elements of each sample vector, others more naturally generate all of the dimensions’ sample values for all of the sample vectors for a pixel at the same time. The PixelSampler class implements some functionality that is useful for the implementation of these types of samplers.
The number of dimensions of the sample vectors that will be used by the rendering algorithm isn’t known ahead of time. (Indeed, it’s only determined implicitly by the number of Get1D() and Get2D() calls and the requested arrays.) Therefore, the PixelSampler constructor takes a maximum number of dimensions for which non-array sample values will be computed by the Sampler. If all of these dimensions of components are consumed, then the PixelSampler just returns uniform random values for additional dimensions.
For each precomputed dimension, the constructor allocates a vector to store sample values, with one value for each sample in the pixel. These vectors are indexed as sample1D[dim][pixelSample]; while interchanging the order of these indices might seem more sensible, this memory layout—where all of the sample component values for all of the samples for a given dimension are contiguous in memory—turns out to be more convenient for code that generates these values.
The key responsibility of Sampler implementations that inherit from PixelSampler then is to fill in the samples1D and samples2D arrays (in addition to sampleArray1D and sampleArray2D) in their StartPixel() methods.
current1DDimension and current2DDimension store the offsets into the respective arrays for the current pixel sample. They must be reset to 0 at the start of each new sample.
Given sample values in the arrays computed by the PixelSampler subclass, the implementation of Get1D() is just a matter of returning values for successive dimensions until all of the computed dimensions have been consumed, at which point uniform random values are returned.
The PixelSampler::Get2D() follows similarly, so it won’t be included here.
The random number generator used by the PixelSampler is protected rather than private; this is a convenience for some of its subclasses that also need random numbers when they initialize samples1D and samples2D.
7.2.5 Global Sampler
Other algorithms for generating samples are very much not pixel-based but naturally generate consecutive samples that are spread across the entire image, visiting completely different pixels in succession. (Many such samplers are effectively placing each additional sample such that it fills the biggest hole in the -dimensional sample space, which naturally leads to subsequent samples being inside different pixels.) These sampling algorithms are somewhat problematic with the Sampler interface as described so far: consider, for example, a sampler that generates the series of sample values shown in the middle column of Table 7.2 for the first two dimensions. These sample values are multiplied by the image resolution in each dimension to get sample positions in the image plane (here we’re considering a image for simplicity.) Note that for the sampler here (actually the HaltonSampler), each pixel is visited by each sixth sample. If we are rendering an image with three samples per pixel, then to generate all of the samples for the pixel , we need to generate the samples with indices 0, 6, and 12, and so forth.
|Sample index||sample coordinates||Pixel sample coordinates|
Given the existence of such samplers, we could have defined the Sampler interface so that it specifies the pixel being rendered for each sample rather than the other way around (i.e., the Sampler being told which pixel to render).
However, there were good reasons to adopt the current design: this approach makes it easy to decompose the film into small image tiles for multi-threaded rendering, where each thread computes pixels in a local region that can be efficiently merged into the final image. Thus, we must require that such samplers generate samples out of order, so that all samples for each pixel are generated in succession.
The GlobalSampler helps bridge between the expectations of the Sampler interface and the natural operation of these types of samplers. It provides implementations of all of the pure virtual Sampler methods, implementing them in terms of three new pure virtual methods that its subclasses must implement instead.
There are two methods that implementations must provide. The first one, GetIndexForSample(), performs the inverse mapping from the current pixel and given sample index to a global index into the overall set of sample vectors. For example, for the Sampler that generated the values in Table 7.2, if currentPixel was , then GetIndexForSample(0) would return 2, since the corresponding pixel sample coordinates for sample index 2, correspond to the first sample that lands in that pixel’s area.
Closely related, SampleDimension() returns the sample value for the given dimension of the indexth sample vector in the sequence. Because the first two dimensions are used to offset into the current pixel, they are handled specially: the value returned by implementations of this method should be the sample offset within the current pixel, rather than the original sample value. For the example in Table 7.2, SampleDimension(4,1) would return , since the second dimension of the sample with index 4 is that offset into the pixel .
When it’s time to start to generate samples for a pixel, it’s necessary to reset the dimension of the sample and find the index of the first sample in the pixel. As with all samplers, values for sample arrays are all generated next.
The dimension member variable tracks the next dimension that the sampler implementation will be asked to generate a sample value for; it’s incremented as Get1D() and Get2D() are called. intervalSampleIndex records the index of the sample that corresponds to the current sample in the current pixel.
It’s necessary to decide which dimensions of the sample vector to use for array samples. Under the assumption that the earlier dimensions will be better quality than later dimensions, it’s important to set aside the first few dimensions for the CameraSample, since the quality of those sample values often has a large impact on final image quality.
Therefore, the first dimensions up to arrayStartDim are devoted to regular 1D and 2D samples, and then the subsequent dimensions are devoted to first 1D and then 2D array samples. Finally, higher dimensions starting at arrayEndDim are used for further non-array 1D and 2D samples. It isn’t possible to compute arrayEndDim when the GlobalSampler constructor runs, since array samples haven’t been requested yet by the integrators. Therefore, this value is computed (repeatedly and redundantly) in the StartPixel() method.
The total number of array samples for all pixel samples is given by the product of the number of pixel samples and the requested sample array size.
Actually generating the array samples is just a matter of computing the number of needed values in the current sample dimension.
The 2D sample arrays are generated analogously; the <<Compute 2D array samples for GlobalSampler>> fragment isn’t included here.
When the pixel sample changes, it’s necessary to reset the current sample dimension counter and to compute the sample index for the next sample inside the pixel.
Given this machinery, getting regular 1D sample values is just a matter of skipping over the dimensions allocated to array samples and passing the current sample index and dimension to the implementation’s SampleDimension() method.
2D samples follow analogously.