## 10.4 Image Texture

The `ImageTexture` class stores a 2D array of point-sampled values of
a texture function. It uses these samples to reconstruct a continuous
image function that can be evaluated at an arbitrary position.
These sample values are often called *texels*, since they are similar
to pixels in an image but are used in the context of a texture. Image textures are the
most widely used type of texture in computer graphics; digital photographs,
scanned artwork, images created with image-editing programs, and images
generated by renderers are all extremely useful sources of data for this
particular texture representation (Figure 10.8).

The
term *texture map* is often used to refer to this type of texture, although
this usage blurs the distinction between the mapping that computes
texture coordinates and the texture function itself. The implementation of this
texture is in the files `textures/imagemap.h`
and
`textures/imagemap.cpp`.

The `ImageTexture` class is different from other textures in the system
in that it is parameterized by both the data type of the texels it stores in
memory and the data type of the value that it returns. Making this
distinction allows us to create, for example, `ImageTexture`s that
store `RGBSpectrum` values in memory, but always return `Spectrum`
values. In this way, when the system is compiled with full-spectral
rendering enabled, the memory cost to store full `SampledSpectrum`
texels doesn’t need to be paid for source images that only have
RGB components.

The caller provides the `ImageTexture` with the filename of an image
map, parameters that control the filtering of the map for antialiasing, and
parameters that make it possible to scale and gamma-correct the texture
values. The scale parameter will be explained later in this
section, and the texture filtering parameters will be explained in
Section 10.4.3. The contents of the file are used to create an
instance of the `MIPMap` class that stores the texels in memory and
handles the details of reconstruction and filtering to reduce aliasing.

For an `ImageTexture` that stores `RGBSpectrum` values in
memory, its `MIPMap` stores the image data using
three floating-point values for each sample. This is a somewhat
wasteful representation, since a single image map may have millions of
texels and may not need the full 32 bits of accuracy from the `Float`s
used to store RGB values for each of them. Exercise 10.1 at the end of
this chapter discusses this issue further.

### 10.4.1 Texture Memory Management

Each image map may require a meaningful amount of memory, and a complex
scene may have thousands of image maps. Because each image texture may be
reused many times within a scene, `pbrt` maintains a table of image maps
that have been loaded so far, so that they are only loaded into memory once
even if they are used in more than one `ImageTexture`. The
`ImageTexture` constructor calls the `static`
`ImageTexture::GetTexture()` method to get a `MIPMap`
representation of the desired texture.

&lt;&lt;Return MIPMap from texture cache, if present&gt;&gt;
&lt;&lt;Create MIPMap for filename&gt;&gt;

`TexInfo` is a simple structure that holds the
image map’s filename and filtering parameters; all of these must match for
a `MIPMap` to be reused in another `ImageTexture`. Its definition
is straightforward (its members exactly correspond to the parameters of the
`GetTexture()` method) and won’t be included here.
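The caching idea can be sketched in isolation as follows. This is an illustrative stand-in, not pbrt’s code: the stub type and the function name here are invented, and pbrt keys its table on the full `TexInfo` (filename plus filtering parameters), not just the filename.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the MIPMap type.
struct MIPMapStub { std::string filename; };

// Return a cached entry if the image was already loaded; otherwise
// "load" it once and remember it. (Keyed on filename only for brevity.)
MIPMapStub *GetTextureCached(const std::string &filename) {
    static std::map<std::string, std::unique_ptr<MIPMapStub>> textures;
    auto it = textures.find(filename);
    if (it != textures.end()) return it->second.get();  // already loaded
    auto mip = std::make_unique<MIPMapStub>(MIPMapStub{filename});
    MIPMapStub *ptr = mip.get();
    textures[filename] = std::move(mip);
    return ptr;
}
```

Repeated lookups with the same key return the same object, so each image is only resident once no matter how many `ImageTexture`s reference it.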

&lt;&lt;Return MIPMap from texture cache, if present&gt;&gt;=

If the texture hasn’t been loaded yet, a call to `ReadImage()` yields
the image contents.

&lt;&lt;Create MIPMap for filename&gt;&gt;=

    Point2i resolution;
    std::unique_ptr<RGBSpectrum[]> texels = ReadImage(filename, &resolution);
    MIPMap<Tmemory> *mipmap = nullptr;
    if (texels) {
        &lt;&lt;Convert texels to type Tmemory and create MIPMap&gt;&gt;
    } else {
        &lt;&lt;Create one-valued MIPMap&gt;&gt;
    }
    textures[texInfo].reset(mipmap);

Because `ReadImage()` returns an array of `RGBSpectrum`
values for the texels, it is necessary to convert these
values to the particular type `Tmemory` of texel that this
`MIPMap` is storing (e.g., `Float`) if the type of `Tmemory`
isn’t `RGBSpectrum`. The per-texel conversion is handled by the
utility routine `ImageTexture::convertIn()`.

&lt;&lt;Convert texels to type Tmemory and create MIPMap&gt;&gt;=

Per-texel conversion is done using C++ function overloading. For every
type to which we would like to be able to convert these values, a separate
`ImageTexture::convertIn()` function must be provided. In the loop over
texels earlier, C++’s function overloading mechanism will select the
appropriate instance of `ImageTexture::convertIn()` based on the
destination type. Unfortunately, it is not possible to return the
converted value from the function, since C++ doesn’t support overloading by
return type.

In addition to converting types, these functions optionally scale and gamma correct the texel values to map them to a desired range. Gamma correction is particularly important to handle carefully: computer displays generally don’t exhibit a linear relationship between the pixel values to be displayed and the radiance that they emit. Thus, an artist may create a texture map where, as seen on an LCD display, one part of the image appears twice as bright as another. However, the corresponding pixel values won’t in fact have a 2:1 relationship. (Conversely, pixels whose values have a 2:1 relationship don’t lead to a 2:1 brightness ratio.) This discrepancy is a problem for a renderer using such an image as a texture map, since the renderer usually expects a linear relationship between texel values and the quantity that they represent.

`pbrt` follows the sRGB standard, which prescribes a specific transfer curve
matching the typical behavior of CRT displays. This standard is widely
supported on 2015-era devices; other (non-CRT) devices such as LCDs or inkjet
printers typically accept sRGB gamma-corrected values as input and then re-map
them internally to match the device-specific behavior.

The sRGB gamma curve is a piecewise function with a linear term for low values and a power law for medium to large values:

$$\gamma(x) = \begin{cases} 12.92\,x, & x \le 0.0031308 \\ 1.055\,x^{1/2.4} - 0.055, & x > 0.0031308. \end{cases}$$
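Both directions of the sRGB transfer curve can be sketched as follows. The names mirror pbrt’s `GammaCorrect()` and `InverseGammaCorrect()`, but this is a simplified standalone version operating on a single channel in $[0, 1]$.

```cpp
#include <cmath>

// Encode a linear value with the sRGB curve (linear segment for small
// values, 1/2.4 power law above the breakpoint).
float GammaCorrect(float value) {
    if (value <= 0.0031308f) return 12.92f * value;
    return 1.055f * std::pow(value, 1.f / 2.4f) - 0.055f;
}

// Invert the sRGB curve, recovering a linear value from an encoded one.
float InverseGammaCorrect(float value) {
    if (value <= 0.04045f) return value / 12.92f;
    return std::pow((value + 0.055f) / 1.055f, 2.4f);
}
```

The two functions are inverses of each other, which is exactly the property the texture-import path relies on: applying `InverseGammaCorrect()` to sRGB-encoded texels restores a linear relationship between stored values and brightness.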

The `GammaCorrect()` function will be used to write sRGB-compatible 8-bit image
files in the `WriteImage()` function. To import textures into `pbrt`, we are
interested in the opposite direction: removing an existing gamma correction to
reestablish a linear relationship between brightness and pixel values.

Refer to the “Further Reading” section for a more detailed discussion of gamma correction.

`InverseGammaCorrect()` is only applied when indicated via the `gamma`
argument of `convertIn()`. By default, this is the case when the input
image has an 8-bit color depth.

If the texture file wasn’t found or was unreadable, an image map with a single
sample with a value of one is created so that the renderer can continue to
generate an image of the scene without needing to abort execution. The
`ReadImage()` function will issue a warning message in this
case.

&lt;&lt;Create one-valued MIPMap&gt;&gt;=

After the image is rendered and the system is cleaning up, the
`ClearCache()` method is called to free the memory for the entries in
the texture cache.

### 10.4.2 ImageTexture Evaluation

The `ImageTexture::Evaluate()` routine does the usual texture
coordinate computation and then hands the image map lookup to the
`MIPMap`, which does the image filtering work for antialiasing. The
returned value is still of type `Tmemory`; another conversion step
similar to `ImageTexture::convertIn()` above converts to the returned
type `Treturn`.

### 10.4.3 MIP Maps

As always, if the image function has higher frequency detail than can be represented by the texture sampling rate, aliasing will be present in the final image. Any frequencies higher than the Nyquist limit must be removed by prefiltering before the function is evaluated. Figure 10.9 shows the basic problem we face: an image texture has texels that are samples of some image function at a fixed frequency. The filter region for the lookup is given by its $(s, t)$ center point and offsets to the estimated texture coordinate locations for the adjacent image samples. Because these offsets are estimates of the texture sampling rate, we must remove any frequencies higher than twice the distance to the adjacent samples in order to satisfy the Nyquist criterion.

The texture sampling and reconstruction process has a few key differences from the image sampling process discussed in Chapter 7. These differences make it possible to address the antialiasing problem with more effective and less computationally expensive techniques. For example, here it is inexpensive to get the value of a sample—only an array lookup is necessary (as opposed to having to trace a number of rays to compute radiance). Further, because the texture image function is fully defined by the set of samples and there is no mystery about what its highest frequency could be, there is no uncertainty related to the function’s behavior between samples. These differences make it possible to remove detail from the texture before sampling, thus eliminating aliasing.

However, the texture sampling rate will typically change from pixel to pixel. The sampling rate is determined by scene geometry and its orientation, the texture coordinate mapping function, and the camera projection and image sampling rate. Because the texture sampling rate is not fixed, texture filtering algorithms need to be able to filter over arbitrary regions of texture samples efficiently.

The `MIPMap` class implements two methods for efficient texture filtering
with spatially varying filter widths. The first, trilinear interpolation, is
fast and easy to implement and was widely used for texture filtering in
early graphics hardware. The second, elliptically weighted averaging, is slower and
more complex but returns extremely high-quality results.
Figure 10.10
compares the result of texture filtering using trilinear interpolation and the EWA algorithm.

To limit the potential number of texels that need to be accessed, both of these
filtering methods use an *image pyramid* of increasingly lower resolution
prefiltered versions of the original image to accelerate their
operation.
The original
image texels are at the bottom level of the pyramid,
and the image at each level is half the resolution of the previous level,
up to the top level, which has a single texel representing the average of all of
the texels in the original image. This collection of
images needs at most one-third more memory than storing the
most detailed level alone and can be used to quickly find filtered values over
large regions of the original image. The basic idea behind the pyramid is
that if a large area of texels needs to be filtered, a reasonable
approximation is to use a higher level of the pyramid and do the filtering
over the same area there, accessing many fewer texels.
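The one-third bound follows from the geometric series $1 + 1/4 + 1/16 + \cdots = 4/3$: each level has one-quarter as many texels as the level below it. A quick numeric check, using a hypothetical helper that is not part of pbrt:

```cpp
#include <cstddef>

// Total texel count of a square power-of-two pyramid: sum the texel
// counts over all levels, halving the resolution each time down to 1x1.
size_t PyramidTexelCount(int res) {
    size_t total = 0;
    for (int r = res; r >= 1; r /= 2)
        total += (size_t)r * r;
    return total;
}
```

For a 1024×1024 base image the sum is 1,398,101 texels, just under $4/3 \cdot 1024^2 \approx 1{,}398{,}101.3$.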


In the constructor, the `MIPMap` copies the image data provided by the
caller, resizes the image if necessary to ensure that its resolution is a power
of two in each direction, and initializes a lookup table used by the
elliptically weighted average filtering method in Section 10.4.5. It
also records the desired behavior for texture coordinates that fall outside of
the legal range in the `wrapMode` argument.


`MIPMap` is a template class that is parameterized by the data type of
the image texels. `pbrt` creates `MIPMap`s of both `RGBSpectrum` and
`Float` images; `Float` MIP maps are used for representing
directional distributions of intensity from goniometric light sources (Section 12.3.3), for
example. The `MIPMap` implementation requires that the type `T`
support just a few basic operations, including addition and multiplication
by a scalar.

The `ImageWrap` enumerant, passed to the `MIPMap` constructor,
specifies the desired behavior when the supplied
texture coordinates are not in the legal range.

Implementation of an image pyramid is somewhat easier if the resolution of
the original image is an exact power of two in each direction; this ensures
that there is a straightforward relationship between the level of the
pyramid and the number of texels at that level. If the user has provided
an image where the resolution in one or both of the dimensions is not a
power of two, then the `MIPMap` constructor starts by resizing the
image up to the next power-of-two resolution greater than the original
resolution before constructing the pyramid. Exercise 10.5 at the end of
the chapter describes an approach to building image pyramids with
non-power-of-two resolutions.

Image resizing here involves more application of the sampling and reconstruction theory from Chapter 7: we have an image function that has been sampled at one sampling rate, and we’d like to reconstruct a continuous image function from the original samples to resample at a new set of sample positions. Because this represents an increase in the sampling rate from the original rate, we don’t have to worry about introducing aliasing due to undersampling high-frequency components in this step; we only need to reconstruct and directly resample the new function. Figure 10.11 illustrates this task in 1D.

Figure 10.11: The `MIPMap` performs two 1D resampling steps with a separable reconstruction filter. (a) A 1D function reconstructed from four samples, denoted by dots. (b) To represent the same image function with more samples, we just need to reconstruct the continuous function and evaluate it at the new positions.

The `MIPMap` uses a separable reconstruction filter for this task;
recall from Section 7.8 that separable filters can
be written as the product of 1D filters: $f(x, y) = f_1(x) f_2(y)$.
One advantage of using a separable filter is that if we are using one to
resample an image from one resolution $(s, t)$ to another $(s', t')$, then we can
implement the resampling as two 1D resampling steps, first
resampling in $s$ to create an image of resolution $(s', t)$ and then resampling
that image in $t$ to create the final image of resolution $(s', t')$. Resampling the
image via two 1D steps in this manner simplifies implementation and makes the
number of texels accessed for each texel in the final image a linear function
of the filter width, rather than a quadratic one.


Reconstructing the original image function and sampling it at a new texel’s position are mathematically equivalent to centering the reconstruction filter kernel at the new texel’s position and weighting the nearby texels in the original image appropriately. Thus, each new texel is a weighted average of a small number of texels in the original image.

The `MIPMap::resampleWeights()` method determines which original texels
contribute to each new texel and what the values are of the contribution
weights for each new texel. It returns the values in an array of
`ResampleWeight` structures for all of the texels in a 1D row or column
of the image. Because this information is the same for all rows of the
image when resampling in $s$ and all columns when resampling
in $t$, it’s more efficient to compute it once for each of the two passes
and then reuse it many times for each one. Given these weights, the image
is first magnified in the $s$ direction, turning the original image with
resolution `(resolution[0], resolution[1])` into an image with resolution
`(resPow2[0], resolution[1])`, which is stored in `resampledImage`.
The implementation here allocates enough space in `resampledImage` to
hold the final zoomed image with resolution `(resPow2[0], resPow2[1])`, so two
large allocations can be avoided.

&lt;&lt;Resample image in s direction&gt;&gt;=

    std::unique_ptr<ResampleWeight[]> sWeights =
        resampleWeights(resolution[0], resPow2[0]);
    resampledImage.reset(new T[resPow2[0] * resPow2[1]]);
    &lt;&lt;Apply sWeights to zoom in s direction&gt;&gt;

For the reconstruction filter used here, no more than four of the original
texels will contribute to each new texel after zooming, so `ResampleWeight`
only needs to hold four weights. Because the four texels are contiguous, we
only store the offset to the first one.

&lt;&lt;MIPMap Helper Declarations&gt;&gt;=

    struct ResampleWeight {
        int firstTexel;
        Float weight[4];
    };

&lt;&lt;MIPMap Private Methods&gt;&gt;=

    std::unique_ptr<ResampleWeight[]> resampleWeights(int oldRes, int newRes) {
        std::unique_ptr<ResampleWeight[]> wt(new ResampleWeight[newRes]);
        Float filterwidth = 2.f;
        for (int i = 0; i < newRes; ++i) {
            &lt;&lt;Compute image resampling weights for ith texel&gt;&gt;
        }
        return wt;
    }

Just as it was important to distinguish between discrete and continuous
pixel coordinates in Chapter 7, the same
issues need to be addressed with texel coordinates here. We will use the
same conventions as described in Section 7.1.7.
For each new texel, this function starts by computing its
continuous coordinates in terms of the old texel coordinates. This value
is stored in `center`, because it is the center of the reconstruction filter
for the new texel. Next, it is necessary to find the offset to the first texel
that contributes to the new texel. This is a slightly tricky
calculation—after subtracting the filter width to find the start of the
filter’s nonzero range, it is necessary to add an extra 0.5 offset to
the continuous coordinate before taking the floor to find the discrete
coordinate. Figure 10.12 illustrates why this
offset is needed.

Starting from this first contributing texel, this function loops over four texels, computing each one’s offset to the center of the filter kernel and the corresponding filter weight.

&lt;&lt;Compute image resampling weights for ith texel&gt;&gt;=

The reconstruction filter function used to compute the weights,
`Lanczos()`, is the same as the one in `LanczosSincFilter::Sinc()`.

Depending on the filter function used, the four filter weights may not sum to one. Therefore, to ensure that the resampled image won’t be any brighter or darker than the original image, the weights are normalized here.
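Putting the pieces together, the weight computation can be sketched as a standalone function. `ResampleWeights` below is a simplified stand-in for `MIPMap::resampleWeights()` (returning a `std::vector` rather than pbrt’s heap array), and `Lanczos()` uses the windowed-sinc form with $\tau = 2$.

```cpp
#include <cmath>
#include <vector>

struct ResampleWeight { int firstTexel; float weight[4]; };

// Windowed sinc reconstruction filter (zero outside |x| > 1).
float Lanczos(float x, float tau = 2) {
    x = std::abs(x);
    if (x < 1e-5f) return 1;
    if (x > 1.f) return 0;
    x *= 3.14159265358979f;
    float window = std::sin(x * tau) / (x * tau);
    float sinc = std::sin(x) / x;
    return window * sinc;
}

// For each new texel, find the first of the four contributing original
// texels, compute the filter weights, and normalize them to sum to one.
std::vector<ResampleWeight> ResampleWeights(int oldRes, int newRes) {
    std::vector<ResampleWeight> wt(newRes);
    float filterwidth = 2.f;
    for (int i = 0; i < newRes; ++i) {
        // Continuous coordinate of the new texel in the old image's terms.
        float center = (i + .5f) * oldRes / newRes;
        wt[i].firstTexel = (int)std::floor(center - filterwidth + 0.5f);
        float sum = 0;
        for (int j = 0; j < 4; ++j) {
            float pos = wt[i].firstTexel + j + .5f;
            wt[i].weight[j] = Lanczos((pos - center) / filterwidth);
            sum += wt[i].weight[j];
        }
        for (int j = 0; j < 4; ++j) wt[i].weight[j] /= sum;  // normalize
    }
    return wt;
}
```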

Once the weights have been computed, it’s easy to apply them to compute the
zoomed texels. For each of the `resolution[1]` horizontal scan lines
in the original image, a pass is made across the `resPow2[0]` texels
in the $s$-zoomed image using the precomputed weights to compute their
values. Because the computation for each texel is completely independent
of the computation for every other texel, and because this computation
requires a bit of processing, it’s worthwhile to split the image into
sections and work on them in parallel with multiple threads.

&lt;&lt;Apply sWeights to zoom in s direction&gt;&gt;=

The `ImageWrap` parameter to the `MIPMap` constructor determines
the convention to be used for out-of-bounds texel coordinates. It either
remaps them to valid values with a modulus or clamp calculation or uses a
black texel value.

The process for resampling in the $t$ direction is almost the same as for $s$, so we won’t include the implementation here.

Once we have an image with resolutions that are powers of two, the levels of the MIP map can be initialized, starting from the bottom (finest) level. Each higher level is found by filtering the texels from the previous level.

Because image maps use a fair amount of memory, and because 8 to 20 texels are
typically used per image texture lookup to compute a filtered value, it’s
worth carefully considering how the texels are laid out in memory, since
reducing cache misses while accessing the texture map can noticeably
improve the renderer’s performance. Because both of the texture
filtering methods implemented in this section access a set of texels in a
rectangular region of the image map each time a lookup is performed, the
`MIPMap` uses the `BlockedArray` template class to store the 2D
arrays of texel values, rather than a standard C++ array. The
`BlockedArray` reorders the array values in memory in a way that
improves cache coherence when the values are accessed with these kinds of
rectangular patterns; it is described in Section A.4.4 in
Appendix A.

The base level of the MIP map, which holds the original data (or the
resampled data, if it didn’t originally have power-of-two resolutions), is
initialized by the default `BlockedArray` constructor.

Before showing how the rest of the levels are initialized, we will first
define a texel access function that will be used during that process.
`MIPMap::Texel()` returns a reference to the texel value for the given
discrete integer-valued texel position. As described earlier, if an
out-of-range texel coordinate is passed in, then based on the value of `wrapMode`,
this method effectively
repeats the texture over the entire 2D texture coordinate domain by taking
the modulus of the coordinate with respect
to the texture size, clamps the
coordinates to the valid range so that the border pixels are used, or
returns a black texel for out-of-bounds coordinates.
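The three behaviors can be sketched per axis with a hypothetical standalone helper (not pbrt’s `Texel()`; the black case is signaled through a flag so the caller can substitute a zero-valued texel):

```cpp
#include <algorithm>

enum class ImageWrap { Repeat, Black, Clamp };

// Remap a possibly out-of-range coordinate c for an axis of size res.
// The modulus variant keeps the result non-negative for negative inputs.
int RemapTexelCoord(int c, int res, ImageWrap mode, bool *isBlack) {
    *isBlack = false;
    switch (mode) {
    case ImageWrap::Repeat:
        return ((c % res) + res) % res;        // tile the texture
    case ImageWrap::Clamp:
        return std::min(std::max(c, 0), res - 1);  // reuse border texels
    case ImageWrap::Black:
        if (c < 0 || c >= res) *isBlack = true;
        return std::min(std::max(c, 0), res - 1);
    }
    return c;
}
```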

For non-square images, the resolution in one direction must be clamped to
one for the upper levels of the image pyramid, where there is still
downsampling to do in the larger of the two resolutions. This is handled
by the following `std::max()` calls:

The `MIPMap` uses a simple box filter to average four texels from the
previous level to find the value at the current texel. Using the Lanczos
filter here would give a slightly better result for this computation,
although this modification is left for Exercise 10.4 at the end of the
chapter. As with resampling to power-of-two resolutions, doing this
downfiltering using multiple threads rather than with a regular `for`
loop is worthwhile here.
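The box-filtered reduction for one texel of the next level can be sketched as follows; texel access on the previous level is abstracted as a callback here (pbrt’s version calls `Texel()` on the previous level, which also applies the wrap mode):

```cpp
#include <functional>

// Value of texel (s, t) at the next-coarser level: the average of the
// 2x2 block of texels it covers at the previous (finer) level.
float DownsampleTexel(int s, int t, std::function<float(int, int)> prev) {
    return 0.25f * (prev(2 * s,     2 * t) +
                    prev(2 * s + 1, 2 * t) +
                    prev(2 * s,     2 * t + 1) +
                    prev(2 * s + 1, 2 * t + 1));
}
```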

### 10.4.4 Isotropic Triangle Filter

The first of the two `MIPMap::Lookup()` methods uses a triangle filter
over the texture samples to remove high frequencies. Although this filter
function does not give high-quality results, it can be implemented very
efficiently. In addition to the $(s, t)$ coordinates of the evaluation point,
the caller passes this method a filter width for the lookup, giving the extent of
the region of the texture to filter across. This method filters over a square
region in texture space, so the width should be conservatively chosen to avoid
aliasing in both the $s$ and $t$ directions. Filtering techniques like this one
that do not support a filter extent that is non-square or non-axis-aligned are
known as *isotropic*. The primary disadvantage of isotropic filtering
algorithms is that textures viewed at an oblique angle will appear
blurry, since the required sampling rate along one axis will be very
different from the sampling rate along the other in this case.

Because filtering over many texels for wide filter widths would be inefficient, this method chooses a MIP map level from the pyramid such that the filter region at that level would cover four texels at that level. Figure 10.13 illustrates this idea.

Since the resolutions of the levels of the pyramid are all powers of two, the resolution of level $l$ is $2^{\mathtt{nLevels}-1-l}$. Therefore, to find the level with a texel spacing width $w$ requires solving

$$\frac{1}{w} = 2^{\mathtt{nLevels}-1-l}$$

for $l$. In general, this will be a floating-point value between two MIP map levels.
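Solving for $l$ gives $l = \mathtt{nLevels} - 1 + \log_2 w$. A minimal sketch of this computation (as a hypothetical free function; the clamp against a tiny width avoids taking $\log_2$ of zero):

```cpp
#include <algorithm>
#include <cmath>

// Continuous MIP level whose texel spacing matches filter width w,
// for a pyramid with nLevels levels over [0, 1] texture space.
float LevelForWidth(int nLevels, float width) {
    width = std::max(width, 1e-8f);  // guard against log2(0)
    return nLevels - 1 + std::log2(width);
}
```

For a 1024×1024 image (`nLevels == 11`), a width covering the whole texture selects the coarsest level 10, and a width of one texel spacing (1/1024) selects the finest level 0.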

As shown by Figure 10.13, applying a triangle filter to
the four texels around the sample point will either filter over too small a
region or too large a region (except for very carefully selected filter
widths). The implementation here applies the triangle filter at both of
these levels and blends between them according to how close `level` is
to each of them. This helps hide the transitions from one MIP map level to
the next at nearby pixels in the final image. While applying a triangle
filter to four texels at two levels in this manner doesn’t give exactly the
same result as applying it to the original highest resolution texels, the
difference isn’t too bad in practice, and the efficiency of this approach
is worth this penalty. In any case, the elliptically weighted average
filtering in the next section should be used when texture quality is
important.

Given floating-point texture coordinates in $[0, 1]^2$, the
`MIPMap::triangle()` routine uses a triangle filter to interpolate
between the four texels that surround the sample point, as shown in
Figure 10.14. This method first scales the coordinates
by the texture resolution at the given MIP map level in each direction,
turning them into continuous texel coordinates. Because these are
continuous coordinates, but the texels in the image map are defined at
discrete texture coordinates, it’s important to carefully convert into a
common representation. Here, we will do all of our work in discrete
coordinates, mapping the continuous texture coordinates to discrete space.

Figure 10.14: To compute the value of the image texture function at a point $(s, t)$, `MIPMap::triangle()` finds the four texels around $(s, t)$ and weights them according to a triangle filter based on their distance to $(s, t)$. One way to implement this is as a series of linear interpolations, as shown here: First, the two texels below $(s, t)$ are linearly interpolated to find a value at $(s, t_0)$, and the two texels above it are interpolated to find $(s, t_1)$. Then, $(s, t_0)$ and $(s, t_1)$ are linearly interpolated again to find the value at $(s, t)$.

For example, consider the 1D case with a continuous texture coordinate of 2.4: this coordinate is a distance of 0.1 below the discrete texel coordinate 2 (which corresponds to a continuous coordinate of 2.5) and is a distance of 0.9 above the discrete coordinate 1 (continuous coordinate 1.5). Thus, if we subtract 0.5 from the continuous coordinate 2.4, giving 1.9, we can correctly compute the distances to the discrete coordinates 1 and 2 by subtracting coordinates.
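In code, the 0.5 shift makes both distances fall out of plain subtraction (a hypothetical helper for illustration only):

```cpp
#include <cmath>

// Given a continuous texture coordinate c, find the two bracketing
// discrete texel coordinates and the distances to each, using the
// convention that discrete coordinate d sits at continuous d + 0.5.
void BracketingTexels(float c, int *t0, float *d0, int *t1, float *d1) {
    float x = c - 0.5f;            // shift into "discrete" space
    *t0 = (int)std::floor(x);      // lower discrete texel
    *t1 = *t0 + 1;                 // upper discrete texel
    *d0 = x - *t0;                 // distance to lower texel
    *d1 = *t1 - x;                 // distance to upper texel
}
```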

After computing the distances in $s$ and $t$ to the texel at the lower left of
the given coordinates, `ds` and `dt`, `MIPMap::triangle()`
determines weights for the four texels and computes the filtered value. Recall
that the triangle filter is $f(x, y) = (1 - |x|)(1 - |y|)$;
the appropriate weights follow directly. Notice the similarity between
this computation and `BilerpTexture::Evaluate()`.
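A self-contained sketch of this bilinear weighting follows. It is not pbrt’s `triangle()` verbatim: texel access is abstracted as a callback, and wrap-mode handling for coordinates at the image edge is omitted.

```cpp
#include <cmath>
#include <functional>

// Filter the four texels around continuous (s, t) in [0, 1]^2 with
// triangle weights (1 - |x|)(1 - |y|), at a level of resolution
// sRes x tRes. texel(s, t) returns the value at a discrete coordinate.
float TriangleFilter(float s, float t, int sRes, int tRes,
                     std::function<float(int, int)> texel) {
    // Scale to continuous texel coordinates, then shift by 0.5 so that
    // distances to discrete texel coordinates come out directly.
    s = s * sRes - 0.5f;
    t = t * tRes - 0.5f;
    int s0 = (int)std::floor(s), t0 = (int)std::floor(t);
    float ds = s - s0, dt = t - t0;
    return (1 - ds) * (1 - dt) * texel(s0,     t0) +
           (1 - ds) * dt       * texel(s0,     t0 + 1) +
           ds       * (1 - dt) * texel(s0 + 1, t0) +
           ds       * dt       * texel(s0 + 1, t0 + 1);
}
```

Because the four weights always sum to one, a constant texture is returned unchanged, and a linear ramp is interpolated exactly.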

### 10.4.5 Elliptically Weighted Average

The elliptically weighted average (EWA) algorithm fits an ellipse to the
two axes in texture space given by the texture coordinate differentials and
then filters the texture with a Gaussian filter function
(Figure 10.15). It is widely
regarded as one of the
best texture filtering algorithms in graphics and has been carefully
derived from the basic principles of sampling theory. Unlike the triangle
filter in the previous section, it can filter over arbitrarily oriented
regions of the texture, with some flexibility of having different filter extents
in different directions. This type of filter is known as
*anisotropic*. This capability greatly improves the quality of its
results, since it can properly adapt to different sampling rates along the
two image axes.

We won’t show the full derivation of this filter here, although we do note that it is
distinguished by being a *unified resampling filter*: it simultaneously
computes the result of a Gaussian filtered texture function convolved with a
Gaussian reconstruction filter in image space. This is in contrast to many
other texture filtering methods that ignore the effect of the image space
filter or equivalently assume that it is a box. Even if a Gaussian isn’t
being used for filtering the samples for the image being rendered, taking some
account of the spatial variation of the image filter improves the results,
assuming that the filter being used is somewhat similar in shape to the
Gaussian, as the Mitchell and windowed sinc filters are.

The screen space partial derivatives of the texture coordinates define the
axes of the ellipse. The lookup method starts out by determining which of the
two axes is the major axis (the longer of the two) and which is the minor,
swapping them if needed so that `dst0` is the major axis. The
length of the minor axis will be used shortly to select a MIP map level.

Next the *eccentricity* of the ellipse is computed—the ratio of
the length of the major axis to the length of the minor axis. A large
eccentricity indicates a very long and skinny ellipse. Because this method
filters texels from a MIP map level chosen based on the length of the minor
axis, highly eccentric ellipses mean that a large number of texels need to be
filtered. To avoid this expense (and to ensure that any EWA lookup takes a
bounded amount of time), the length of the minor axis may be increased to limit
the eccentricity. The result may be an increase in blurring,
although this effect usually isn’t noticeable in practice.
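The clamp can be sketched as follows (a hypothetical helper; pbrt performs the equivalent scaling on the minor axis vector itself inside the lookup method):

```cpp
// If the axis ratio exceeds maxAnisotropy, scale the minor axis up so
// the ratio equals maxAnisotropy, trading a little extra blur for a
// bounded number of filtered texels.
float ClampMinorAxis(float majorLength, float minorLength,
                     float maxAnisotropy) {
    if (minorLength * maxAnisotropy < majorLength && minorLength > 0) {
        float scale = majorLength / (minorLength * maxAnisotropy);
        minorLength *= scale;
    }
    return minorLength;
}
```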

Like the triangle filter, the EWA filter uses the image pyramid to reduce the number of texels to be filtered for a particular texture lookup, choosing a MIP map level based on the length of the minor axis. Given the limited eccentricity of the ellipse due to the clamping above, the total number of texels used is thus bounded. Given the length of the minor axis, the computation to find the appropriate pyramid level is the same as was used for the triangle filter. Similarly, the implementation here blends between the filtered results at the two levels around the computed level of detail, again to reduce artifacts from transitions from one level to another.

The `MIPMap::EWA()` method actually applies the filter at a particular
level.

This method first converts from texture coordinates in $[0, 1]^2$ to coordinates
and differentials in terms of the resolution of the chosen MIP map level. It
also subtracts 0.5 from the continuous position coordinate to align the
sample point with the discrete texel coordinates, as was done in
`MIPMap::triangle()`.

It next computes the coefficients of the implicit equation for
the ellipse with axes `(ds0,dt0)` and `(ds1,dt1)` and centered at the
origin. Placing the ellipse at the origin rather than at $(s, t)$ simplifies
the implicit equation and the computation of its coefficients and can be
easily corrected for when the equation is evaluated later. The general form of
the implicit equation for all points $(s, t)$ inside such an ellipse is

$$e(s, t) = A s^2 + B s t + C t^2 < F,$$

although it is more computationally efficient to divide through by $F$ and express this as

$$e(s, t) = \frac{A}{F} s^2 + \frac{B}{F} s t + \frac{C}{F} t^2 = A' s^2 + B' s t + C' t^2 < 1.$$

We will not derive the equations that give the values of the coefficients, although the interested reader can easily verify their correctness.

The next step is to find the axis-aligned bounding box in discrete integer texel coordinates of the texels that are potentially inside the ellipse. The EWA algorithm loops over all of these candidate texels, filtering the contributions of those that are in fact inside the ellipse. The bounding box is found by determining the minimum and maximum values that the ellipse takes in the $s$ and $t$ directions. These extrema can be calculated by finding the partial derivatives $\partial e/\partial s$ and $\partial e/\partial t$, finding their solutions for $s$ and $t$, and adding the offset to the ellipse center. For brevity, we will not include the derivation for these expressions here.

Now that the bounding box is known, the EWA algorithm loops over the texels, transforming each one to the coordinate system where the texture lookup point is at the origin with a translation. It then evaluates the ellipse equation to see if the texel is inside the ellipse (Figure 10.16) and computes the filter weight for the texel if so. The final filtered value returned is a weighted sum over texels $T(s', t')$ inside the ellipse, where $f$ is the Gaussian filter function:

$$\frac{\sum f(s' - s, t' - t)\, T(s', t')}{\sum f(s' - s, t' - t)}.$$

A nice feature of the implicit equation is that its value at a particular texel is the squared ratio of the distance from the center of the ellipse to the texel to the distance from the center of the ellipse to the ellipse boundary along the line through that texel (Figure 10.16). This value can be used to index into a precomputed lookup table of Gaussian filter function values.

The lookup table is initialized the first time a `MIPMap` is constructed.
Because it will be indexed with squared distances from the filter center $r^2$,
each entry stores a value $e^{-\alpha r^2}$, rather than $e^{-\alpha r}$.

So that the filter function goes to zero at the end of its extent rather
than having an abrupt step, `std::exp(-alpha)` is subtracted from the
filter values here.
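The table initialization can be sketched as follows (the table size and $\alpha = 2$ follow pbrt’s choices, but the function wrapper is a standalone illustration): entry $i$ stores $e^{-\alpha r^2} - e^{-\alpha}$ for $r^2 = i/(\text{size} - 1)$, so the weight falls to exactly zero at the ellipse boundary.

```cpp
#include <cmath>

static constexpr int WeightLutSize = 128;
static constexpr float Alpha = 2;

// Precompute Gaussian filter weights indexed by squared distance r^2 in
// [0, 1]; subtracting e^{-alpha} makes the last entry exactly zero.
void InitWeightLut(float weightLut[WeightLutSize]) {
    for (int i = 0; i < WeightLutSize; ++i) {
        float r2 = float(i) / float(WeightLutSize - 1);
        weightLut[i] = std::exp(-Alpha * r2) - std::exp(-Alpha);
    }
}
```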