## 13.3 Sampling Random Variables

In order to evaluate the Monte Carlo estimator in Equation (13.3), it is necessary to be able to draw random samples from the chosen probability distribution. This section will introduce the basics of this process and demonstrate it with some straightforward examples. The next two sections will introduce more complex approaches to sampling before Section 13.6 develops the approach for the general multidimensional case. In Chapters 14, 15, and 16, we’ll then see how to use these techniques to generate samples from the distributions defined by BSDFs, light sources, cameras, and scattering media.

### 13.3.1 The Inversion Method

The inversion method uses one or more uniform random variables and maps them to random variables from the desired distribution. To explain how this process works in general, we will start with a simple discrete example. Suppose we have a process with four possible outcomes. The probabilities of each of the four outcomes are given by $p_1$, $p_2$, $p_3$, and $p_4$, respectively, with the requirement that $\sum_{i=1}^{4} p_i = 1$. The corresponding PDF is shown in Figure 13.1.

In order to draw a sample from this distribution, we first find the CDF $P(x)$. In the continuous case, $P$ is the indefinite integral of $p$. In the discrete case, we can directly construct the CDF by stacking the bars on top of each other, starting at the left. This idea is shown in Figure 13.2. Notice that the height of the rightmost bar must be 1 because of the requirement that all probabilities sum to 1.

To draw a sample from the distribution, we then take a uniform random number $\xi$ and use it to select one of the possible outcomes using the CDF, doing so in a way that chooses a particular outcome with probability equal to the outcome’s own probability. This idea is illustrated in Figure 13.3, where the events’ probabilities are projected onto the vertical axis and a random variable $\xi$ selects among them. It should be clear that this draws from the correct distribution—the probability of the uniform sample hitting any particular bar is exactly equal to the height of that bar.

In order to generalize
this technique to continuous distributions, consider what happens as the
number of discrete possibilities approaches infinity. The PDF from
Figure 13.1 becomes a smooth curve, and the CDF from
Figure 13.2 becomes its integral. The projection process
is still the same, although if the
function is continuous, the projection has a convenient mathematical
interpretation—it represents inverting the CDF and evaluating the inverse
at $\xi$. This
technique is thus called the *inversion method*.

More precisely, we can draw a sample $X_i$ from an arbitrary PDF $p(x)$ with the following steps:

- Compute the CDF $P(x) = \int_0^x p(x')\,\mathrm{d}x'$.
- Compute the inverse $P^{-1}(x)$.
- Obtain a uniformly distributed random number $\xi$.
- Compute $X_i = P^{-1}(\xi)$.

#### Example: Power Distribution

As an example of how this procedure works, consider the task of drawing
samples from a *power distribution*, $p(x) \propto x^n$. The PDF of
the power distribution is

$$p(x) = c\,x^n,$$

for the constant $c$ that normalizes the PDF. The first task to tackle is to find the PDF. In most cases, this simply involves computing the value of the proportionality constant $c$, which can be found using the constraint that $\int p(x)\,\mathrm{d}x = 1$:

$$1 = \int_0^1 c\,x^n\,\mathrm{d}x = c\,\frac{x^{n+1}}{n+1}\bigg|_0^1 = \frac{c}{n+1}.$$

Therefore, $c = n + 1$, and $p(x) = (n+1)\,x^n$. We can integrate this PDF to get the CDF:

$$P(x) = \int_0^x (n+1)\,x'^{\,n}\,\mathrm{d}x' = x^{n+1},$$

and inversion is simple: $P^{-1}(x) = \sqrt[n+1]{x}$. Therefore, given a uniform random variable $\xi$, samples can be drawn from the power distribution as

$$X = \sqrt[n+1]{\xi}.$$
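As a concrete check of the four inversion steps applied to this derivation, here is a minimal sketch; the function name `SamplePower` is ours, for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Inversion sampling of the power distribution p(x) = (n+1) x^n on [0,1]:
// the CDF is P(x) = x^(n+1), so X = P^{-1}(xi) = xi^(1/(n+1)).
// (Illustrative sketch; the name is not from pbrt.)
double SamplePower(double xi, int n) {
    return std::pow(xi, 1.0 / (n + 1));
}
```

For example, with $n = 1$ a uniform sample $\xi = 0.25$ maps to $\sqrt{0.25} = 0.5$.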
Another approach is to use a sampling trick that works only for the power
distribution, selecting

$$X = \max(\xi_1, \xi_2, \ldots, \xi_{n+1}).$$

This random variable is distributed according to the power distribution as well. To see why, note that $\Pr\{X < x\}$ is the probability that *all* the $\xi_i < x$. But the $\xi_i$ are independent, so

$$\Pr\{X < x\} = \prod_{i=1}^{n+1} \Pr\{\xi_i < x\} = x^{n+1},$$

which is exactly the desired CDF. Depending on the speed of your random number generator, this technique can be faster than the inversion method for small values of $n$.
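A sketch of this trick follows; the small deterministic LCG standing in for a real random number generator is an illustrative assumption, not pbrt code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Tiny deterministic linear congruential generator (Knuth's MMIX
// constants), used here only so the sketch is self-contained.
struct Lcg {
    uint64_t s = 1;
    double next() {  // uniform in [0, 1)
        s = s * 6364136223846793005ULL + 1442695040888963407ULL;
        return (s >> 11) * (1.0 / 9007199254740992.0);
    }
};

// The max-of-(n+1)-uniforms trick: X = max(xi_1, ..., xi_{n+1})
// has CDF x^(n+1), i.e., the power distribution with exponent n.
double SamplePowerMax(int n, Lcg &rng) {
    double x = 0;
    for (int i = 0; i < n + 1; ++i)
        x = std::max(x, rng.next());
    return x;
}
```

For $n = 2$ the distribution is $p(x) = 3x^2$ with mean $3/4$, which an empirical average of many samples should approach.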

#### Example: Exponential Distribution

When rendering images with participating media, it is frequently useful to draw samples from a distribution $p(x) \propto e^{-ax}$. As before, the first step is to normalize this distribution so that it integrates to one. In this case, we’ll assume for now that the range of values $x$ we’d like the generated samples to cover is $[0, \infty)$ rather than $[0, 1]$, so

$$1 = \int_0^\infty c\,e^{-ax}\,\mathrm{d}x = c\left[-\frac{e^{-ax}}{a}\right]_0^\infty = \frac{c}{a}.$$

Thus we know that $c = a$, and our PDF is $p(x) = a\,e^{-ax}$. Now, we integrate to find $P(x)$:

$$P(x) = \int_0^x a\,e^{-ax'}\,\mathrm{d}x' = 1 - e^{-ax}.$$

This function is easy to invert:

$$P^{-1}(x) = -\frac{\ln(1 - x)}{a},$$

and we can draw samples thus:

$$X = -\frac{\ln(1 - \xi)}{a}.$$
It may be tempting to simplify the log term from $\ln(1 - \xi)$ to $\ln \xi$, under the theory that because $\xi \in [0, 1)$, these are effectively the same and a subtraction can thus be saved. The problem with this idea is that $\xi$ may have the value 0 but never has the value 1. With the simplification, it’s possible that we’d try to take the logarithm of 0, which is undefined; this danger is avoided with the first formulation. While a value of 0 may seem very unlikely, it does happen (especially in the world of floating-point arithmetic, rather than the real numbers). Some of the low-discrepancy sampling patterns introduced in Chapter 7 are particularly prone to generating the value 0.
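The sampling recipe can be sketched in a few lines; the function name `SampleExponential` is illustrative, not from pbrt.

```cpp
#include <cassert>
#include <cmath>

// Inversion sampling of p(x) = a e^{-ax} on [0, inf):
// X = -ln(1 - xi) / a. Note the 1 - xi inside the log, which keeps
// xi == 0 safe, as discussed above (ln(1) = 0, not ln(0)).
double SampleExponential(double xi, double a) {
    return -std::log(1 - xi) / a;
}
```

With $\xi = 1 - e^{-1}$ and $a = 1$, the sample is exactly the point where the CDF reaches $1 - e^{-1}$, namely $x = 1$.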

#### Example: Piecewise-Constant 1D Functions

An interesting exercise is to work out how to sample from 1D piecewise-constant functions (step functions). Without loss of generality, we will just consider piecewise-constant functions defined over $[0, 1]$.

Assume that the 1D function’s domain is split into $N$ equal-sized pieces of size $\Delta = 1/N$. These regions start and end at points $x_i = i\,\Delta$, where $i$ ranges from 0 to $N$, inclusive. Within each region, the value of the function $f(x)$ is a constant $v_i$ (Figure 13.4(a)).

The value of $f(x)$ is

$$f(x) = v_i \quad \text{for } x_i \le x < x_{i+1}.$$

The integral $\int f(x)\,\mathrm{d}x$ is

$$c = \int_0^1 f(x)\,\mathrm{d}x = \sum_{i=0}^{N-1} \frac{v_i}{N} = \sum_{i=0}^{N-1} v_i\,\Delta, \tag{13.5}$$

and so it is easy to construct the PDF $p(x)$ for $f(x)$ as $f(x)/c$. By direct application of the relevant formulae, the CDF $P(x)$ is a piecewise linear function defined at points $x_i$ by

$$P(x_0) = 0, \qquad P(x_i) = P(x_{i-1}) + \int_{x_{i-1}}^{x_i} p(x)\,\mathrm{d}x = P(x_{i-1}) + \frac{v_{i-1}}{N c}.$$

Between two points $x_i$ and $x_{i+1}$, the CDF is linearly increasing with slope $v_i / c$.

Recall that in order to sample $f(x)$ we need to invert the CDF to find the value $x$ such that

$$\xi = \int_0^x p(x')\,\mathrm{d}x' = P(x).$$

Because the CDF is monotonically increasing, the value of $x$ must be between the $x_i$ and $x_{i+1}$ such that $P(x_i) \le \xi$ and $\xi < P(x_{i+1})$. Given an array of CDF values, this pair of values can be efficiently found with a binary search.
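This search can be sketched with `std::upper_bound`; pbrt itself uses its own `FindInterval()` helper for this purpose, so the function below is an illustrative stand-in.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Given the stored CDF values P(x_0)..P(x_N) and a sample xi, find the
// index i such that P(x_i) <= xi < P(x_{i+1}). std::upper_bound returns
// the first entry strictly greater than xi; stepping back one gives the
// largest index whose CDF value is <= xi.
int FindCdfInterval(const std::vector<double> &cdf, double xi) {
    auto it = std::upper_bound(cdf.begin(), cdf.end(), xi);
    int i = static_cast<int>(it - cdf.begin()) - 1;
    // Clamp to a valid interval index for xi == 1 (or xi == 0).
    return std::max(0, std::min(i, static_cast<int>(cdf.size()) - 2));
}
```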

`Distribution1D` is a small utility class that represents a
piecewise-constant 1D function’s PDF and CDF and provides methods to
perform this sampling efficiently.

The `Distribution1D` constructor takes `n` values of a
piecewise-constant function `f`. It makes its own copy of the
function values, computes the function’s CDF, and also stores
the integral of the function, `funcInt`. Note that the constructor
allocates `n+1` `Float`s for the `cdf` array because if $f(x)$ has
$N$ step values, then we need to store the value of the CDF at each of the
$N + 1$ values of $x_i$. Storing the CDF value of 1 at the end of the array
is redundant but simplifies the sampling code later.

This constructor computes the integral of $f(x)$ using
Equation (13.5). It stores the result in the
`cdf` array for now so that it doesn’t need to allocate additional
temporary space for it.

Now that the value of the integral of $f$ over $[0, 1]$ is stored in
`cdf[n]`, this value can be copied into `funcInt` and the CDF can
be normalized by dividing each entry by `funcInt`.
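The constructor’s logic can be sketched as follows. This is a reconstruction from the description above, not necessarily pbrt’s exact code; in particular, the special case of a function whose integral is zero is omitted here.

```cpp
#include <cassert>
#include <vector>

typedef double Float;  // stand-in for pbrt's Float

struct Distribution1D {
    std::vector<Float> func, cdf;
    Float funcInt;

    Distribution1D(const Float *f, int n) : func(f, f + n), cdf(n + 1) {
        // Compute the running integral of the step function at each x_i,
        // following Equation (13.5): each step contributes v_i / N.
        cdf[0] = 0;
        for (int i = 1; i < n + 1; ++i)
            cdf[i] = cdf[i - 1] + func[i - 1] / n;
        // Transform the running integral into a CDF by normalizing
        // (assumes funcInt > 0).
        funcInt = cdf[n];
        for (int i = 1; i < n + 1; ++i)
            cdf[i] /= funcInt;
    }
};
```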

The `Distribution1D::SampleContinuous()` method uses the given random sample
`u` to sample from its distribution. It returns the corresponding
value $x \in [0, 1)$ and the value of the PDF $p(x)$. If the optional
`off` parameter is not `nullptr`, it returns the offset into the
array of function values of the largest index where the CDF was less than
or equal to `u`. (In other words, `cdf[*off] <= u < cdf[*off+1]`.)

Mapping `u` to an interval matching the above criterion is carried out
using the efficient binary search implemented in `FindInterval()` (see
Appendix A for details).

Given the pair of CDF values that straddle `u`, we can compute $x$.
First, we determine how far `u` is between `cdf[offset]` and
`cdf[offset+1]`, `du`, where `du` is 0 if `u ==
cdf[offset]` and goes up to 1 if `u == cdf[offset+1]`. Because the
CDF is piecewise linear, the sample value $x$ is the same offset between
$x_i$ and $x_{i+1}$ (Figure 13.4(b)).

The PDF for this sample is easily computed since we have the
function’s integral in `funcInt`. (Note that the offset `offset`
into the CDF array has been computed in a way so that `func[offset]`
gives the value of the function in the CDF range that the sample landed in.)

Finally, the appropriate value of $x$ is computed and returned.
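Putting these steps together, the `SampleContinuous()` logic can be sketched as a free function over precomputed, normalized CDF values; this is a reconstruction from the description, and pbrt’s member function differs in packaging (it uses `FindInterval()` rather than `std::upper_bound`).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sample x in [0,1) from a piecewise-constant function given its raw
// values func, normalized CDF cdf (size func.size()+1), and integral
// funcInt. Optionally returns the PDF and the CDF interval offset.
double SampleContinuous(const std::vector<double> &func,
                        const std::vector<double> &cdf,
                        double funcInt, double u,
                        double *pdf = nullptr, int *off = nullptr) {
    // Find the largest index with cdf[offset] <= u (binary search).
    int offset = static_cast<int>(
        std::upper_bound(cdf.begin(), cdf.end(), u) - cdf.begin()) - 1;
    offset = std::max(0, std::min(offset, (int)func.size() - 1));
    if (off) *off = offset;
    // How far u lies between the two straddling CDF values.
    double du = u - cdf[offset];
    if (cdf[offset + 1] - cdf[offset] > 0)
        du /= cdf[offset + 1] - cdf[offset];
    // The PDF is the function value over its integral.
    if (pdf) *pdf = func[offset] / funcInt;
    // Map back to x: same fractional offset within the cell.
    return (offset + du) / func.size();
}
```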

In a small overloading of semantics, `Distribution1D` can also be used
for discrete 1D probability distributions where there are some number of
buckets $N$, each with some weight, and we’d like to sample among the
buckets with probability proportional to their relative weights. This
functionality is used, for example, by some of the `Integrator`s
that compute a discrete distribution for
the light sources in the scene with weights given by the lights’ powers.
Sampling from the discrete distribution just requires figuring out which
pair of CDF values the sample value lies between; the PDF is computed as
the discrete probability of sampling the corresponding bucket.

It’s also useful to be able to compute the PDF for sampling a given value from the discrete distribution.
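Discrete sampling can be sketched in the same style; this is a reconstruction from the description above, and pbrt’s `SampleDiscrete()` and `DiscretePDF()` members differ in packaging.

```cpp
#include <cassert>
#include <vector>

// Choose a bucket with probability proportional to its weight, given
// the raw weights func, normalized CDF cdf (size func.size()+1), and
// integral funcInt. The discrete probability of bucket i is
// (v_i / N) / c = func[i] / (funcInt * N).
int SampleDiscrete(const std::vector<double> &func,
                   const std::vector<double> &cdf,
                   double funcInt, double u, double *pdf = nullptr) {
    // Find the pair of CDF values that u lies between (a linear scan
    // here for brevity; a binary search is used in practice).
    int offset = 0;
    while (offset + 1 < (int)func.size() && cdf[offset + 1] <= u)
        ++offset;
    if (pdf) *pdf = func[offset] / (funcInt * func.size());
    return offset;
}
```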

### 13.3.2 The Rejection Method

For some functions $f(x)$, it may not be possible to integrate them in
order to find their PDFs, or it may not be possible to analytically
invert their CDFs. The *rejection method* is a technique for
generating samples according to a function’s distribution without needing
to do either of these steps; it is essentially a dart-throwing approach.
Assume that we want to draw samples from some such function $f(x)$ but we do
have a PDF $p(x)$ that satisfies $f(x) < c\,p(x)$ for some scalar constant
$c$, and suppose that we do know how to sample from $p$. The rejection
method proceeds as follows.

This procedure repeatedly chooses a pair of random variables $(X, \xi)$, where $X$ is sampled from $p$. If the point $(X, \xi\,c\,p(X))$ lies under $f(X)$, then the sample $X$ is accepted. Otherwise, it is rejected and a new sample pair is chosen. This idea is illustrated in Figure 13.5. Without going into too much detail, it should be clear that the efficiency of this scheme depends on how tightly $c\,p(x)$ bounds $f(x)$. This technique works in any number of dimensions.
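The loop can be sketched as follows, specialized to a uniform proposal $p(x) = 1$ on $[0, 1)$. The target function, the bound $c$, and the small LCG standing in for a real random number generator are all illustrative assumptions, not pbrt code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Tiny deterministic LCG so the sketch is self-contained.
struct Lcg {
    uint64_t s = 1;
    double next() {  // uniform in [0, 1)
        s = s * 6364136223846793005ULL + 1442695040888963407ULL;
        return (s >> 11) * (1.0 / 9007199254740992.0);
    }
};

// Rejection sampling with uniform proposal p(x) = 1 on [0,1) and a
// bound f(x) <= c: accept X when the point (X, xi * c) lies under f.
template <typename F>
double RejectionSample(F f, double c, Lcg &rng) {
    for (;;) {
        double x = rng.next();   // X ~ p (uniform here)
        double xi = rng.next();  // vertical test variable
        if (xi * c < f(x))
            return x;
    }
}
```

For instance, sampling $f(x) = x^2$ with $c = 1$ yields draws from the normalized PDF $3x^2$, whose mean $3/4$ an empirical average should approach.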

Rejection sampling isn’t actually used in any of the Monte Carlo
algorithms currently implemented in `pbrt`. We will normally prefer to find
distributions that are similar to $f(x)$ that can be sampled directly, so
that well-distributed sample points on $[0, 1)^n$ can be mapped to sample points
that are in turn well-distributed,
as will be discussed in Section 13.8.
Nevertheless,
rejection sampling is an important technique to be aware of, particularly
when debugging Monte Carlo implementations. For example, if one suspects
the presence of a bug in code that draws samples from some distribution
using the inversion method, then one can replace it with a straightforward
implementation based on the rejection method and see if the Monte Carlo
estimator computes the same result. Of course, it’s necessary to take
many samples in situations like these, so that variance in the estimates
doesn’t mask errors.

#### Example: Rejection Sampling a Unit Circle

Suppose we want to select a uniformly distributed point inside a unit circle. Using the rejection method, we simply select a random position inside the circumscribed square and return it if it falls inside the circle. This process is shown in Figure 13.6.

The function `RejectionSampleDisk()` implements this algorithm. A
similar approach will work to generate uniformly distributed samples on the inside
of any complex shape as long as it has an inside–outside test.
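A sketch of this algorithm follows; the `Point2` struct and the small LCG are illustrative stand-ins for pbrt’s `Point2f` and `RNG` types.

```cpp
#include <cassert>
#include <cstdint>

struct Point2 { double x, y; };

// Tiny deterministic LCG standing in for pbrt's RNG.
struct Lcg {
    uint64_t s = 1;
    double next() {  // uniform in [0, 1)
        s = s * 6364136223846793005ULL + 1442695040888963407ULL;
        return (s >> 11) * (1.0 / 9007199254740992.0);
    }
};

// Pick uniform points in the circumscribed square [-1,1]^2 and return
// the first one that lands inside the unit circle.
Point2 RejectionSampleDisk(Lcg &rng) {
    Point2 p;
    do {
        p.x = 1 - 2 * rng.next();
        p.y = 1 - 2 * rng.next();
    } while (p.x * p.x + p.y * p.y > 1);
    return p;
}
```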

In general, the efficiency of rejection sampling depends on the percentage of samples that are expected to be rejected. For the problem of finding uniform points in the 2D case, this is easy to compute. It is the area of the circle divided by the area of the square: $\pi / 4 \approx 78.5\%$. If the method is applied to generate samples in hyperspheres in the general $n$-dimensional case, however, the volume of an $n$-dimensional hypersphere actually goes to 0 as $n$ increases, and this approach becomes increasingly inefficient.