## 10.1 Texture Sampling and Antialiasing

The sampling task from Chapter 8 was a frustrating one since the aliasing problem was known to be unsolvable from the start. The infinite frequency content of geometric edges and hard shadows guarantees aliasing in the final images, no matter how high the image sampling rate. (Our only consolation is that the visual impact of this remaining aliasing can be reduced to unobjectionable levels with a sufficient number of well-placed samples.)

Fortunately, things are not this difficult from the start for textures: either there is often a convenient analytic form of the texture function available, which makes it possible to remove excessively high frequencies before sampling it, or it is possible to be careful when evaluating the function so as not to introduce high frequencies in the first place. When this problem is carefully addressed in texture implementations, as is done through the rest of this chapter, there is usually no need for more than one sample per pixel in order to render an image without texture aliasing. (Of course, sufficiently reducing Monte Carlo noise from lighting calculations may be another matter.)

Two problems must be addressed in order to remove aliasing from texture functions:

- The sampling rate in texture space must be computed. The screen-space sampling rate is known from the image resolution and pixel sampling rate, but here we need to determine the resulting sampling rate on a surface in the scene in order to find the rate at which the texture function is being sampled.
- Given the texture sampling rate, sampling theory must be applied to guide the computation of a texture value that does not have higher frequency variation than can be represented by the sampling rate (e.g., by removing excess frequencies beyond the Nyquist limit from the texture function).

These two issues will be addressed in turn throughout the rest of this section.

### 10.1.1 Finding the Texture Sampling Rate

Consider an arbitrary texture function that is a function of position, $T(\mathrm{p})$, defined on a surface in the scene. If we ignore the complications introduced by visibility—the possibility that another object may occlude the surface at nearby image samples or that the surface may have a limited extent on the image plane—this texture function can also be expressed as a function over points $(x,y)$ on the image plane, $T(f(x,y))$, where $f(x,y)$ is the function that maps image points to points on the surface. Thus, $T(f(x,y))$ gives the value of the texture function as seen at image position $(x,y)$.

As a simple example of this idea, consider a 2D texture function $T(s,t)$ applied to a quadrilateral that is perpendicular to the $z$ axis and has corners at the world-space points $(0,0,0)$, $(1,0,0)$, $(1,1,0)$, and $(0,1,0)$. If an orthographic camera is placed looking down the $z$ axis such that the quadrilateral precisely fills the image plane and if points $\mathrm{p}$ on the quadrilateral are mapped to 2D $(s,t)$ texture coordinates by

$$s = p_x, \qquad t = p_y,$$

then the relationship between $(s,t)$ and screen $(x,y)$ pixels is straightforward:

$$s = \frac{x}{x_r}, \qquad t = \frac{y}{y_r},$$

where the overall image resolution is $(x_r, y_r)$ (Figure 10.2). Thus, given a sample spacing of one pixel in the image plane, the sample spacing in $(s,t)$ texture parameter space is $(1/x_r, 1/y_r)$, and the texture function must remove any detail at a higher frequency than can be represented at that sampling rate.

This relationship between pixel coordinates and texture coordinates, and thus the relationship between their sampling rates, is the key bit of information that determines the maximum frequency content allowable in the texture function. As a slightly more complex example, given a triangle with $(u,v)$ texture coordinates at its vertices and viewed with a perspective projection, it is possible to analytically find the differences in $u$ and $v$ across the sample points on the image plane. This approach was the basis of texture antialiasing in graphics processors before they became programmable.

For more complex scene geometry, camera projections, and mappings to texture coordinates, it is much more difficult to precisely determine the relationship between image positions and texture parameter values. Fortunately, for texture antialiasing, we do not need to be able to evaluate $f(x,y)$ for arbitrary $(x,y)$ but just need to find the relationship between changes in pixel sample position and the resulting change in texture sample position at a particular point on the image. This relationship is given by the partial derivatives of this function, $\partial f/\partial x$ and $\partial f/\partial y$. For example, these can be used to find a first-order approximation to the value of $f$,

$$f(x', y') \approx f(x, y) + (x' - x)\frac{\partial f}{\partial x} + (y' - y)\frac{\partial f}{\partial y}.$$

If these partial derivatives are changing slowly with respect to the distances $x' - x$ and $y' - y$, this is a reasonable approximation. More importantly, the values of these partial derivatives give an approximation to the change in texture sample position for a shift of one pixel in the $x$ and $y$ directions, respectively, and thus directly yield the texture sampling rate. For example, in the previous quadrilateral example, $\partial s/\partial x = 1/x_r$, $\partial s/\partial y = 0$, $\partial t/\partial x = 0$, and $\partial t/\partial y = 1/y_r$.
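To make the quadrilateral example concrete, here is a small sketch (our own names, not pbrt code) that checks that the first-order approximation built from $\partial s/\partial x = 1/x_r$ and $\partial t/\partial y = 1/y_r$ reproduces the texture coordinates of a neighboring pixel:

```cpp
// Texture coordinates for the orthographic quad example: s = x/xr, t = y/yr.
struct ST { float s, t; };

ST TexCoords(float x, float y, float xr, float yr) {
    return {x / xr, y / yr};
}

// First-order estimate of the texture coordinates (dx, dy) pixels away,
// using ds/dx = 1/xr, ds/dy = 0, dt/dx = 0, dt/dy = 1/yr.
ST FirstOrderTexCoords(float x, float y, float dx, float dy, float xr, float yr) {
    ST f = TexCoords(x, y, xr, yr);
    return {f.s + dx * (1 / xr), f.t + dy * (1 / yr)};
}
```

Because this particular mapping is linear, the approximation is exact here; for general geometry it is only a local estimate.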

The key to finding the values of these derivatives in the general
case lies in values from the `RayDifferential` structure, which was defined in
Section 3.6.1. This structure is initialized for
each camera ray by the `Camera::GenerateRayDifferential()` method; it
contains not only the ray being traced through the scene but also
two additional rays, one offset horizontally one pixel sample from the camera ray
and the other offset vertically by one pixel sample. All the geometric ray
intersection routines use only the main camera ray for their computations;
the auxiliary rays are ignored (this is easy to do because
`RayDifferential` is a subclass of `Ray`).
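In outline, with minimal stand-in types rather than pbrt's actual `Point3f`/`Vector3f`, the structure looks something like this:

```cpp
// Minimal stand-ins for pbrt's point and vector types, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };

struct Ray {
    Vec3 o;  // origin
    Vec3 d;  // direction
};

// The main ray plus two auxiliary rays, offset one pixel sample in x and
// in y. Because RayDifferential derives from Ray, intersection routines
// that take a Ray see only the main ray and ignore the differentials.
struct RayDifferential : Ray {
    bool hasDifferentials = false;
    Vec3 rxOrigin, ryOrigin;        // origins of the offset rays
    Vec3 rxDirection, ryDirection;  // directions of the offset rays
};
```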

We can use the offset rays to estimate the
partial derivatives of the mapping $\mathrm{p}(x,y)$ from image position to
rendering-space position and the partial derivatives of the mappings $u(x,y)$ and
$v(x,y)$ from $(x,y)$ to $(u,v)$ parametric coordinates, giving the partial
derivatives of rendering-space positions $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$ and the
derivatives of parametric coordinates $\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$,
and $\partial v/\partial y$. In Section 10.2, we will see how these can be
used to compute the screen-space derivatives of arbitrary quantities based
on $\mathrm{p}$ or $(u,v)$ and consequently the sampling rates of these
quantities. The values of these derivatives at the intersection
point are stored in the `SurfaceInteraction` structure.

The `SurfaceInteraction::ComputeDifferentials()` method computes these
values. It is called by `SurfaceInteraction::GetBSDF()` before the
`Material`’s `GetBxDF()`
method is called so that these values will be available for any texture
evaluation routines that are called by the material.

Ray differentials are not available for all rays traced by the system—for example, rays starting from light sources traced for photon mapping or bidirectional path tracing. Further, although we will see how to compute ray differentials after rays undergo specular reflection and transmission in Section 10.1.3, it is less clear how to compute them after diffuse reflection. In cases like those, as well as the corner case where one of the differentials' directions is perpendicular to the surface normal (which leads to undefined numerical values in the computations that follow), an alternative approach is used that approximates the ray differentials of a ray from the camera to the intersection point.

The key to estimating the derivatives is the assumption that the surface is locally flat with respect to the sampling rate at the point being shaded. This is a reasonable approximation in practice, and it is hard to do much better. Because ray tracing is a point-sampling technique, we have no additional information about the scene in between the rays we have traced. For highly curved surfaces or at silhouette edges, this approximation can break down, though this is rarely a source of noticeable error.

For this approximation, we need the plane through the point $\mathrm{p}$ intersected by the main ray that is tangent to the surface. This plane is given by the implicit equation

$$ax + by + cz + d = 0,$$

where $a = n_x$, $b = n_y$, $c = n_z$, and $d = -(\mathbf{n}\cdot\mathrm{p})$. We can then compute the intersection points $\mathrm{p}_x$ and $\mathrm{p}_y$ between the auxiliary rays $r_x$ and $r_y$ and this plane (Figure 10.3). These new points give an approximation to the partial derivatives of position on the surface, $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$, based on forward differences:

$$\frac{\partial\mathrm{p}}{\partial x} \approx \mathrm{p}_x - \mathrm{p}, \qquad \frac{\partial\mathrm{p}}{\partial y} \approx \mathrm{p}_y - \mathrm{p}.$$

Because the differential rays are offset one pixel sample in each direction, there is no need to divide these differences by a $\Delta$ value, since $\Delta = 1$.

The ray–plane intersection algorithm described in Section 6.1.2 gives the $t$ value where a ray with origin $\mathrm{o}$ and direction $\mathbf{d}$ intersects a plane described by $ax + by + cz + d = 0$:

$$t = \frac{-d - \left((a, b, c)\cdot\mathrm{o}\right)}{(a, b, c)\cdot\mathbf{d}}.$$

To compute this $t$ value for the two auxiliary rays, the
plane's $d$ coefficient is computed first. It is not necessary to compute the $a$, $b$, and $c$
coefficients, since they are available in `n`. We can then apply
the formula directly.
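A sketch of this computation, with hypothetical helper types rather than pbrt's actual fragments: intersect each auxiliary ray with the tangent plane and form the forward differences.

```cpp
// Minimal vector type and helpers, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

// Intersect an auxiliary ray (o, dir) with the tangent plane through p with
// normal n. The plane's (a, b, c) coefficients are n's components and
// d = -Dot(n, p), so t = (-d - Dot(n, o)) / Dot(n, dir).
Vec3 IntersectTangentPlane(Vec3 o, Vec3 dir, Vec3 p, Vec3 n) {
    float d = -Dot(n, p);
    float t = (-d - Dot(n, o)) / Dot(n, dir);
    return o + t * dir;
}

// Forward difference: dp/dx ~= px - p (and similarly for y).
Vec3 ForwardDifference(Vec3 px, Vec3 p) { return px - p; }
```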

For cases where ray differentials are not available, we will add a method
to the `Camera` interface that returns approximate values for $\partial\mathrm{p}/\partial x$ and
$\partial\mathrm{p}/\partial y$ at a point on a surface in the scene. These
should be a reasonable approximation to the differentials of a ray from the
camera that found an intersection at the given point. Cameras'
implementations of this method must return
reasonable results even for points outside of their viewing volumes for
which they cannot actually generate rays.

`CameraBase` provides an implementation of an approach to
approximating these differentials that is based on the minimum of the camera ray
differentials across the entire image. Because all of `pbrt`’s current camera
implementations inherit from `CameraBase`, the following method takes
care of all of them.

This method starts by orienting the camera so that the camera-space $z$ axis is aligned with the vector from the camera position to the intersection point. It then uses lower bounds on the spread of rays over the image, provided by the camera, to find approximate differential rays, and finally intersects those rays with the tangent plane at the intersection point. (See Figure 10.4.)

`CameraBase::Approximate_dp_dxy()` effectively reorients the camera to point at the provided intersection point. In camera space, the ray to the intersection then has origin $(0,0,0)$ and direction $(0,0,1)$. The extent of ray differentials on the tangent plane defined by the surface normal at the intersection point can then be found.

(Figure 10.5 compares filter areas computed using `CameraBase::Approximate_dp_dxy()` with those computed from the camera's true ray differentials. The filter area is represented as the product $\|\partial\mathrm{p}/\partial x\|\,\|\partial\mathrm{p}/\partial y\|$, and the figure visualizes the base-2 logarithm of the ratio of the areas computed by the two techniques; log ratios greater than 0 indicate that the camera-based approximation estimated a larger filter area.)

There are a number of sources of error in this approximation. Beyond the fact that it does not account for how light was scattered at intermediate surfaces for multiple-bounce ray paths, it is based on the minimum of the camera's differentials across all rays. It thus tends to underestimate the derivatives rather than overestimate them, which is preferable here: underestimation leads to aliasing rather than blurring, and that error can at least be addressed with additional pixel samples. To give a sense of the impact of these approximations, Figure 10.5 includes a visualization that compares the local area estimated by these derivatives at intersections to the area computed using the actual ray differentials generated by the camera.

For the first step of the algorithm, we have an intersection point in
rendering space `p` that we would like to transform into a coordinate
system where it is along the $z$ axis with the camera at the origin.
Transforming to camera space gets us started, and an additional
rotation that aligns the vector from the origin to the intersection
point with $z$ finishes the job. The $d$ coefficient of the
plane equation can then be found by taking the dot product of the
transformed point and surface normal. Because the $x$ and $y$ components
of the transformed point are equal to 0, the dot
product can be optimized to be a single multiply.

Camera implementations that inherit from `CameraBase` and use this
method must initialize the following member variables with values that are
lower bounds on each of the respective position and direction differentials
over all the pixels in the image.

The main ray in this coordinate system has origin $(0,0,0)$ and direction $(0,0,1)$. Adding the position and direction differential vectors to those gives the origin and direction of each differential ray. Given those, the same calculation as earlier gives the $t$ values for the ray–plane intersections for the differential rays and thence the intersection points.

For an orthographic camera, these differentials can be computed directly.
There is no change in the direction vector, and the position differentials
are the same at every pixel. Their values are already computed in the
`OrthographicCamera` constructor, so they can be used directly to initialize
the base class's member variables.

All the other cameras call
`FindMinimumDifferentials()`,
which estimates these values by sampling at many points across the diagonal
of the image and storing the minimum of all the differentials
encountered. That function is not very interesting, so it is not included here.

Given the intersection points `px` and `py`, $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$
can now be estimated by taking their differences with the main intersection
point. To get final estimates of the partial derivatives, these vectors
must be transformed back out into rendering space and, as with the initial
ray differentials generated in the <<Scale camera ray differentials based on
image sampling rate>> fragment, scaled to account for the actual pixel
sampling rate.

A call to this method takes care of computing the $\partial\mathrm{p}/\partial x$ and
$\partial\mathrm{p}/\partial y$ differentials in the `ComputeDifferentials()` method.

We now have both the partial derivatives $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$ as well as, one way or another, $\partial\mathrm{p}/\partial u$ and $\partial\mathrm{p}/\partial v$. From them, we would now like to compute $\partial u/\partial x$, $\partial v/\partial x$, $\partial u/\partial y$, and $\partial v/\partial y$. Using the chain rule, we can find that

$$\frac{\partial\mathrm{p}}{\partial x} = \frac{\partial\mathrm{p}}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial\mathrm{p}}{\partial v}\frac{\partial v}{\partial x}. \tag{10.1}$$

($\partial\mathrm{p}/\partial y$ has a similar expression, with $\partial u/\partial x$ replaced by $\partial u/\partial y$ and $\partial v/\partial x$ replaced by $\partial v/\partial y$.)

Equation (10.1) can be written as a matrix equation,

$$\begin{pmatrix} \dfrac{\partial\mathrm{p}}{\partial x} \end{pmatrix} = \begin{pmatrix} \dfrac{\partial\mathrm{p}}{\partial u} & \dfrac{\partial\mathrm{p}}{\partial v} \end{pmatrix} \begin{pmatrix} \dfrac{\partial u}{\partial x} \\[6pt] \dfrac{\partial v}{\partial x} \end{pmatrix}, \tag{10.2}$$

where the two matrices that include $\mathrm{p}$ have three rows, one for each of $\mathrm{p}$'s $x$, $y$, and $z$ components.

This is an overdetermined linear system since there are three equations but only two unknowns, $\partial u/\partial x$ and $\partial v/\partial x$. An effective solution approach in this case is to apply linear least squares, which says that for a linear system of the form $A\mathbf{x} = \mathbf{b}$ with $A$ and $\mathbf{b}$ known, the least-squares solution for $\mathbf{x}$ is given by

$$\mathbf{x} = (A^T A)^{-1} A^T \mathbf{b}.$$

In this case, $A = \left( \partial\mathrm{p}/\partial u \;\; \partial\mathrm{p}/\partial v \right)$, $\mathbf{b} = \partial\mathrm{p}/\partial x$, and $\mathbf{x} = \left( \partial u/\partial x \;\; \partial v/\partial x \right)^T$.

$A^T A$ is a $2 \times 2$ matrix with elements given by dot products of partial derivatives of position:

$$A^T A = \begin{pmatrix} \dfrac{\partial\mathrm{p}}{\partial u}\cdot\dfrac{\partial\mathrm{p}}{\partial u} & \dfrac{\partial\mathrm{p}}{\partial u}\cdot\dfrac{\partial\mathrm{p}}{\partial v} \\[6pt] \dfrac{\partial\mathrm{p}}{\partial v}\cdot\dfrac{\partial\mathrm{p}}{\partial u} & \dfrac{\partial\mathrm{p}}{\partial v}\cdot\dfrac{\partial\mathrm{p}}{\partial v} \end{pmatrix}. \tag{10.3}$$

Its inverse is

$$(A^T A)^{-1} = \frac{1}{\det(A^T A)} \begin{pmatrix} \dfrac{\partial\mathrm{p}}{\partial v}\cdot\dfrac{\partial\mathrm{p}}{\partial v} & -\dfrac{\partial\mathrm{p}}{\partial u}\cdot\dfrac{\partial\mathrm{p}}{\partial v} \\[6pt] -\dfrac{\partial\mathrm{p}}{\partial v}\cdot\dfrac{\partial\mathrm{p}}{\partial u} & \dfrac{\partial\mathrm{p}}{\partial u}\cdot\dfrac{\partial\mathrm{p}}{\partial u} \end{pmatrix}.$$

Note that in both matrices the two off-diagonal entries are equal. Thus,
the fragment that computes the entries of $A^T A$ only needs to compute three values. The inverse of the
matrix determinant is computed here as well. If its value is infinite, the
linear system cannot be solved; setting `invDet` to 0 causes the
subsequently computed derivatives to be 0, which leads to point-sampled
textures, the best remaining option in that case.

The $A^T \mathbf{b}$ portion of the solution is easily computed. For the derivatives with respect to screen-space $x$, we have the two-element matrix

$$A^T \mathbf{b} = \begin{pmatrix} \dfrac{\partial\mathrm{p}}{\partial u}\cdot\dfrac{\partial\mathrm{p}}{\partial x} \\[6pt] \dfrac{\partial\mathrm{p}}{\partial v}\cdot\dfrac{\partial\mathrm{p}}{\partial x} \end{pmatrix}. \tag{10.4}$$

The solution for screen-space $y$ is analogous.

The solution to Equation (10.2) for each partial derivative can be found by taking the product of Equations (10.3) and (10.4). We will gloss past the algebra; its result can be directly expressed in terms of the values computed so far.
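The whole solve can be sketched as follows (a helper of our own that mirrors the structure of the computation rather than pbrt's exact fragments); the same routine serves for the $y$ derivatives when passed $\partial\mathrm{p}/\partial y$:

```cpp
#include <cmath>

// Minimal vector type and dot product, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Least-squares solve of dp/dx = dp/du * dudx + dp/dv * dvdx for (dudx, dvdx).
void SolveLeastSquares(Vec3 dpdu, Vec3 dpdv, Vec3 dpdx, float *dudx, float *dvdx) {
    // Entries of A^T A; the two off-diagonal entries are equal.
    float ata00 = Dot(dpdu, dpdu), ata01 = Dot(dpdu, dpdv), ata11 = Dot(dpdv, dpdv);
    float det = ata00 * ata11 - ata01 * ata01;
    // A zero invDet yields zero derivatives and thus point-sampled textures.
    float invDet = (det == 0 || std::isinf(1 / det)) ? 0 : 1 / det;
    // A^T b for the x direction.
    float atb0 = Dot(dpdu, dpdx), atb1 = Dot(dpdv, dpdx);
    // x = (A^T A)^{-1} A^T b.
    *dudx = (ata11 * atb0 - ata01 * atb1) * invDet;
    *dvdx = (ata00 * atb1 - ata01 * atb0) * invDet;
}
```

For a degenerate parameterization (e.g., $\partial\mathrm{p}/\partial u$ and $\partial\mathrm{p}/\partial v$ parallel), the determinant is zero and the derivatives fall back to zero.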

In certain tricky cases (e.g., with highly distorted parameterizations or at object silhouette edges), the estimated partial derivatives may be infinite or have very large magnitudes. It is worth clamping them to reasonable values in that case to prevent overflow and not-a-number values in subsequent computations that are based on them.

### 10.1.2 Ray Differentials at Medium Transitions

Now is a good time to take care of another detail related to ray
differentials: recall from Section 9.1.5 that materials may return
an unset `BSDF` to indicate an interface between two scattering media
that does not itself scatter light. In this case, it is necessary to spawn
a new ray in the same direction, but past the intersection on the surface,
and we would like the effect of the ray differentials to
be the same as if no scattering had occurred. This can be achieved by
setting the differential origins to the points given by evaluating the ray
equation at the intersection (see
Figure 10.6).
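A sketch of the idea, using the minimal stand-in types from earlier sketches rather than pbrt's actual `SpawnRay()` code: the differential directions are left unchanged and only the origins are advanced by the main ray's intersection $t$.

```cpp
// Minimal stand-ins, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

struct RayDifferential {
    Vec3 o, d;
    bool hasDifferentials = false;
    Vec3 rxOrigin, ryOrigin, rxDirection, ryDirection;
};

// Advance the ray and its differential origins to the interface so that the
// continued ray behaves as if no scattering occurred there.
void SkipInterface(RayDifferential *rd, float t) {
    rd->o = rd->o + t * rd->d;
    if (rd->hasDifferentials) {
        rd->rxOrigin = rd->rxOrigin + t * rd->rxDirection;
        rd->ryOrigin = rd->ryOrigin + t * rd->ryDirection;
    }
}
```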

### 10.1.3 Ray Differentials for Specular Reflection and Transmission

Given the effectiveness of ray differentials for finding filter regions for
texture antialiasing for camera rays, it is useful to extend the method
to make it possible to determine texture-space sampling rates for objects
that are seen indirectly via specular reflection or refraction; objects
seen in mirrors, for example, should be free of texture aliasing, just as
directly visible objects are.
Igehy (1999) developed an elegant solution to
the problem of how to find the appropriate differential rays for specular
reflection and refraction, which is the approach used in
`pbrt`.

Figure 10.7 illustrates the difference that proper texture filtering for specular reflection and transmission can make: it shows a glass ball and a mirrored ball on a plane with a texture map containing high-frequency components. Ray differentials ensure that the images of the texture seen via reflection and refraction from the balls are free of aliasing artifacts. Here, ray differentials eliminate aliasing without excessively blurring the texture.

`pbrt` estimates the change in reflected direction as a function of image-space position and approximates the ray differential's direction as the main ray's direction added to the estimated change in direction.

To compute the reflected or transmitted ray differentials at a
surface intersection point, we need an approximation to the rays that would
have been traced at the intersection points for the two offset rays in the
ray differential that hit the surface
(Figure 10.8). The new ray for the main
ray is found by sampling the BSDF, so here we only need to compute the
outgoing rays for the and differentials. This task is handled
by another `SurfaceInteraction::SpawnRay()` variant that takes an
incident ray differential as well as information about the BSDF and the
type of scattering that occurred.

It is not well defined what the ray differentials should be in the case of
non-specular scattering. Therefore, this method handles the two types of
specular scattering only; for all other types of rays, approximate
differentials will be computed
at their subsequent intersection points with `Camera::Approximate_dp_dxy()`.

A few variables will be used for both types of scattering, including the partial derivatives of the surface normal with respect to $x$ and $y$ on the image, $\partial\mathbf{n}/\partial x$ and $\partial\mathbf{n}/\partial y$, which are computed using the chain rule:

$$\frac{\partial\mathbf{n}}{\partial x} = \frac{\partial\mathbf{n}}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial\mathbf{n}}{\partial v}\frac{\partial v}{\partial x},$$

and similarly for $\partial\mathbf{n}/\partial y$.

For both reflection and transmission, the origin of each differential ray can be found using the already-computed approximations of how much the surface position changes with respect to position on the image plane, $\partial\mathrm{p}/\partial x$ and $\partial\mathrm{p}/\partial y$.

Finding the directions of these rays is slightly trickier. If we know how much the reflected direction changes with respect to a shift of a pixel sample in the $x$ and $y$ directions on the image plane, we can use this information to approximate the direction of the offset rays. For example, the direction for the ray offset in $x$ is

$$\omega \approx \omega_r + \frac{\partial\omega_r}{\partial x}.$$

Recall from Equation (9.1) that for a normal $\mathbf{n}$ and outgoing direction $\omega_o$ the direction for perfect specular reflection is

$$\omega_r = -\omega_o + 2(\omega_o\cdot\mathbf{n})\mathbf{n}.$$

The partial derivatives of this expression are easily computed:

$$\frac{\partial\omega_r}{\partial x} = -\frac{\partial\omega_o}{\partial x} + 2\left( (\omega_o\cdot\mathbf{n})\frac{\partial\mathbf{n}}{\partial x} + \frac{\partial(\omega_o\cdot\mathbf{n})}{\partial x}\,\mathbf{n} \right).$$

Using the properties of the dot product, it can further be shown that

$$\frac{\partial(\omega_o\cdot\mathbf{n})}{\partial x} = \frac{\partial\omega_o}{\partial x}\cdot\mathbf{n} + \omega_o\cdot\frac{\partial\mathbf{n}}{\partial x}.$$
The value of $\partial\omega_o/\partial x$ has already been computed from the difference
between the direction of the ray differential's main ray and the direction
of the offset ray, and
all the other necessary quantities are readily available from the
`SurfaceInteraction`.
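Putting the pieces together, the reflected differential direction can be sketched like this (our own helper, not pbrt's fragment); it evaluates $\partial\omega_r/\partial x$ from the expressions above:

```cpp
// Minimal vector type and helpers, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

// d(omega_r)/dx for omega_r = -omega_o + 2 (omega_o . n) n, given the
// screen-space derivatives of the outgoing direction and of the normal.
Vec3 ReflectDirectionDifferential(Vec3 wo, Vec3 dwodx, Vec3 n, Vec3 dndx) {
    // d(omega_o . n)/dx from the product rule for dot products.
    float dDotdx = Dot(dwodx, n) + Dot(wo, dndx);
    return -1.0f * dwodx + 2.0f * (Dot(wo, n) * dndx + dDotdx * n);
}
```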

A similar process of differentiating the equation for the direction of a
specularly transmitted ray, Equation (9.4), gives
the equation to find the differential change in the transmitted direction.
`pbrt` computes refracted rays as

$$\omega_t = -\frac{\omega_i}{\eta} + \left[ \frac{\omega_i\cdot\mathbf{n}}{\eta} - \cos\theta_t \right]\mathbf{n},$$

where $\mathbf{n}$ is flipped if necessary to lie in the same hemisphere as $\omega_i$, and where $\eta$ is the relative index of refraction from $\omega_i$'s medium to $\omega_t$'s medium.

If we denote the term in brackets by $\mu$, then we have $\omega_t = -\omega_i/\eta + \mu\mathbf{n}$. Taking the partial derivative in $x$, we have

$$\frac{\partial\omega_t}{\partial x} = -\frac{1}{\eta}\frac{\partial\omega_i}{\partial x} + \mu\frac{\partial\mathbf{n}}{\partial x} + \frac{\partial\mu}{\partial x}\,\mathbf{n}.$$

Using some of the values found from computing specularly reflected ray differentials, we already know how to compute all of these values except for $\partial\mu/\partial x$.

Before we get to the computation of $\mu$'s partial derivatives, we will
start by reorienting the surface normal $\mathbf{n}$ if necessary so that it lies on the
same side of the surface as $\omega_i$. This matches `pbrt`'s computation of
refracted ray directions.

Returning to $\mu$ and considering $\partial\mu/\partial x$, we have

$$\frac{\partial\mu}{\partial x} = \frac{1}{\eta}\frac{\partial(\omega_i\cdot\mathbf{n})}{\partial x} - \frac{\partial\cos\theta_t}{\partial x}.$$

Its first term can be evaluated with already known values. For the second term, we will start with Snell's law, which gives

$$\sin\theta_t = \frac{\sin\theta_i}{\eta}.$$

If we square both sides of the equation and take the partial derivative $\partial/\partial x$, we find

$$-2\cos\theta_t\frac{\partial\cos\theta_t}{\partial x} = -\frac{2\cos\theta_i}{\eta^2}\frac{\partial\cos\theta_i}{\partial x}.$$

We now can solve for $\partial\cos\theta_t/\partial x$:

$$\frac{\partial\cos\theta_t}{\partial x} = \frac{\cos\theta_i}{\eta^2\cos\theta_t}\frac{\partial\cos\theta_i}{\partial x} = \frac{\cos\theta_i}{\eta^2\cos\theta_t}\frac{\partial(\omega_i\cdot\mathbf{n})}{\partial x}.$$

Putting it all together and simplifying, we have

$$\frac{\partial\mu}{\partial x} = \left( \frac{1}{\eta} - \frac{\cos\theta_i}{\eta^2\cos\theta_t} \right)\frac{\partial(\omega_i\cdot\mathbf{n})}{\partial x}.$$

The partial derivative in $y$ is analogous, and the implementation follows.
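As a sketch (with hypothetical names; pbrt's actual fragments differ), the final expression for $\partial\mu/\partial x$ and the resulting transmitted-direction differential might look like:

```cpp
// Minimal vector type and helpers, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 operator*(float s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }

// dmu/dx = (1/eta - cosThetaI / (eta^2 cosThetaT)) * d(wi . n)/dx.
float DMuDx(float eta, float cosThetaI, float cosThetaT, float dwiDotNdx) {
    return (1 / eta - cosThetaI / (eta * eta * cosThetaT)) * dwiDotNdx;
}

// d(omega_t)/dx = -(1/eta) d(omega_i)/dx + mu dn/dx + dmu/dx n.
Vec3 TransmitDirectionDifferential(float eta, Vec3 dwidx, float mu, Vec3 dndx,
                                   float dmudx, Vec3 n) {
    return (-1 / eta) * dwidx + mu * dndx + dmudx * n;
}
```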

If a ray undergoes many specular bounces, ray differentials sometimes drift
off to have very large magnitudes, which can leave a trail of infinite and
not-a-number values in their wake when they are used for texture filtering
calculations. Therefore, the final fragment in this `SpawnRay()`
method computes the squared length of all the differentials. If
any is greater than $10^{16}$, the ray differentials are discarded and the
`RayDifferential` `hasDifferentials` value is set to
`false`. The fragment that handles this, <<Squash potentially
troublesome differentials>>, is simple and thus not included here.
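A sketch of what such a check might look like (the $10^{16}$ threshold is our assumption of a suitably large value, and the types are the minimal stand-ins used in earlier sketches):

```cpp
// Minimal stand-ins, for illustration only.
struct Vec3 { float x = 0, y = 0, z = 0; };
static float LengthSquared(Vec3 v) { return v.x * v.x + v.y * v.y + v.z * v.z; }

struct RayDifferential {
    bool hasDifferentials = false;
    Vec3 rxOrigin, ryOrigin, rxDirection, ryDirection;
};

// Discard differentials whose squared length has blown up, rather than let
// infinities and NaNs propagate into texture filtering.
void SquashTroublesomeDifferentials(RayDifferential *rd) {
    const float kLimit = 1e16f;  // assumed threshold
    if (LengthSquared(rd->rxOrigin) > kLimit || LengthSquared(rd->ryOrigin) > kLimit ||
        LengthSquared(rd->rxDirection) > kLimit || LengthSquared(rd->ryDirection) > kLimit)
        rd->hasDifferentials = false;
}
```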

### 10.1.4 Filtering Texture Functions

To eliminate texture aliasing, it is necessary to
remove frequencies in texture functions that are past the Nyquist limit for
the texture sampling rate. The goal is to compute, with as few approximations as
possible, the result of the *ideal texture resampling* process, which says
that in order to evaluate a texture function $T$ at a point $(x, y)$ on the
image without aliasing, we must first
band-limit it, removing frequencies beyond the Nyquist limit by
convolving it with the sinc filter:

$$T_b(x, y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \operatorname{sinc}(x')\operatorname{sinc}(y')\, T\!\left(f(x + x', y + y')\right) \mathrm{d}x'\,\mathrm{d}y',$$

where, as in Section 10.1.1, $f(x, y)$ maps pixel locations to points in the texture function's domain. The band-limited function $T_b$ in turn should then be convolved with the pixel filter $g(x, y)$ centered at the $(x, y)$ point on the screen at which we want to evaluate the texture function:

$$T_f(x, y) = \int_{-y_w/2}^{y_w/2}\int_{-x_w/2}^{x_w/2} g(x', y')\, T_b(x + x', y + y')\, \mathrm{d}x'\,\mathrm{d}y',$$

where $x_w$ and $y_w$ are the extents of the pixel filter.

This gives the theoretically perfect value for the texture as projected onto the screen.

In practice, there are many simplifications that can be made to this process. For example, a box filter may be used for the band-limiting step, and the second step is usually ignored completely, effectively acting as if the pixel filter were a box filter, which makes it possible to do the antialiasing work completely in texture space. (The EWA filtering algorithm in Section 10.4.4 is a notable exception in that it assumes a Gaussian pixel filter.)

Assuming box filters, if, for example, the texture function is defined over $(s, t)$ parametric coordinates, the filtering task is to average it over a region in $(s, t)$:

$$T_f(s, t) = \frac{1}{(s_1 - s_0)(t_1 - t_0)} \int_{t_0}^{t_1}\int_{s_0}^{s_1} T(s', t')\, \mathrm{d}s'\,\mathrm{d}t'.$$

The extent of the filter region can be determined using the derivatives from the previous sections—for example, setting

$$s_0 = s - \max\left( \left|\frac{\partial s}{\partial x}\right|, \left|\frac{\partial s}{\partial y}\right| \right), \qquad s_1 = s + \max\left( \left|\frac{\partial s}{\partial x}\right|, \left|\frac{\partial s}{\partial y}\right| \right),$$

and similarly for $t_0$ and $t_1$ to conservatively specify the box's extent.
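In code, conservatively turning the screen-space derivatives into a box filter extent might look like this (a sketch with our own names):

```cpp
#include <algorithm>
#include <cmath>

// Conservative (s, t) box-filter extent from the screen-space derivatives:
// the half-width in each parameter is the larger of the magnitudes of its
// x and y screen-space derivatives.
void FilterExtent(float s, float t, float dsdx, float dsdy, float dtdx, float dtdy,
                  float *s0, float *s1, float *t0, float *t1) {
    float ds = std::max(std::abs(dsdx), std::abs(dsdy));
    float dt = std::max(std::abs(dtdx), std::abs(dtdy));
    *s0 = s - ds; *s1 = s + ds;
    *t0 = t - dt; *t1 = t + dt;
}
```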

The box filter is easy to use, since it can be applied
analytically by computing the average of the texture function over the
appropriate region. Intuitively, this is a reasonable approach to the
texture filtering problem, and it can be computed directly for many texture
functions. Indeed, through the rest of this chapter, we will often use a
box filter to average texture function values between samples and
informally use the term *filter region* to describe the area being
averaged over. This is the most common approach when filtering texture
functions.

Even the box filter, with all of its shortcomings, gives acceptable results for texture filtering in many cases. One factor that helps is the fact that a number of samples are usually taken in each pixel. Thus, even if the filtered texture values used in each one are suboptimal, once they are filtered by the pixel reconstruction filter, the end result generally does not suffer too much.

An alternative to using the box filter to filter texture functions is to use the observation that the effect of the ideal sinc filter is to let frequency components below the Nyquist limit pass through unchanged but to remove frequencies past it. Therefore, if we know the frequency content of the texture function (e.g., if it is a sum of terms, each one with known frequency content), then if we replace the high-frequency terms with their average values, we are effectively doing the work of the sinc prefilter.

Finally, for texture functions where none of these techniques is easily
applied, a final option is *supersampling*—the function
is evaluated and filtered at multiple locations near the main evaluation
point, thus increasing the sampling rate in texture space. If a box filter
is used to filter these sample values, this is equivalent to averaging the
value of the function. This approach can be expensive if the texture
function is complex to evaluate, and as with image sampling, a very large
number of samples may be needed to remove aliasing. Although this is a
brute-force solution, it is still more efficient than increasing the image
sampling rate, since it does not incur the cost of tracing more rays through
the scene.
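A minimal supersampling sketch (our own helper; any callable texture function works): evaluate the texture on a stratified grid of offsets spanning the filter extents and box-filter the results by averaging.

```cpp
// Average a texture function over an n x n stratified grid covering a
// (ds x dt) filter extent centered at (s, t); with a box filter this is
// simply the mean of the sample values.
template <typename TextureFn>
float Supersample(TextureFn tex, float s, float t, float ds, float dt, int n) {
    float sum = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            float so = ds * ((i + 0.5f) / n - 0.5f);  // stratified offsets
            float to = dt * ((j + 0.5f) / n - 0.5f);
            sum += tex(s + so, t + to);
        }
    return sum / (n * n);
}
```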