## 10.4 Image Texture

Image textures store 2D arrays of point-sampled values of a texture
function. They use these samples to reconstruct a continuous image
function that can be evaluated at an arbitrary
position.
These sample values are often called *texels*, since they are similar
to pixels in an image but are used in the context of a texture. Image
textures are the most widely used type of texture in computer graphics;
digital photographs, scanned artwork, images created with image-editing
programs, and images generated by renderers are all extremely useful
sources of data for this particular texture representation
(Figure 10.14).

Figure 10.14: Image textures are used throughout the *Watercolor* scene to represent spatially varying surface appearance properties. (a) Scene rendered with image textures. (b) Each image texture has been replaced with its average value. Note how much visual richness is lost.

*(Scene courtesy of Angelo Ferretti.)*

As with most of the other types of texture, `pbrt` provides both
`Float` and spectral variants. Both implementations inherit from
`ImageTextureBase`, which provides some common functionality.


In the following, we will present the implementation of
`SpectrumImageTexture`;
`FloatImageTexture` is analogous and
does not add anything new.

### 10.4.1 Texture Memory Management

The caller of `SpectrumImageTexture`’s constructor provides a texture
mapping function, the filename of an image, various parameters that control
the filtering of the image map, how boundary conditions are managed, and
how colors are converted to spectral samples. All the necessary initialization is handled by
`ImageTextureBase`.

As was discussed in Section 4.6.6, RGB colors are
transformed into spectra differently depending on whether or not they
represent reflectances. The `spectrumType` member variable records which
type of RGB the texture represents.

The contents of the image file are used to create an instance of the
`MIPMap` class that stores the texels in memory and handles the details
of reconstruction and filtering to reduce aliasing.

`<<Return MIPMap from texture cache if present>>`

`<<Create MIPMap for filename and add to texture cache>>`

A floating-point scale can be specified with each texture; it is applied to
the values returned by the `Evaluate()` method. Further, a
`true` value for the `invert` parameter causes the texture value
to be subtracted from 1 before it is returned.
While the same functionality can be achieved with scale and mix textures,
it is easy to also provide that functionality
directly in the texture here. Doing so can lead to more efficient texture
evaluation on GPUs, as is discussed further in Section 15.3.9.

Each MIP map may require a meaningful amount of memory, and a complex scene
may have thousands of image textures. Because an on-disk image may be
reused for multiple textures in a scene, `pbrt` maintains a table of MIP
maps that have been loaded so far so that they are only loaded into memory
once even if they are used in more than one image texture.

`pbrt` loads textures in parallel after the scene description has been
parsed; doing so reduces startup time before rendering begins. Therefore,
a mutex is used here to ensure that only one thread accesses the texture
cache at a time. Note that if the `MIPMap` is not found in the cache,
the lock is released before it is read so that other threads can access the
cache in the meantime.
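The check-release-load-recheck pattern described here can be sketched as follows; `FakeMIPMap`, `LoadFromDisk`, and `GetMIPMap` are hypothetical stand-ins for `pbrt`'s actual types and methods, and a plain filename string stands in for the full cache key.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-ins for pbrt's MIPMap type and its file loader.
struct FakeMIPMap { std::string filename; };

static std::mutex cacheMutex;
static std::map<std::string, FakeMIPMap *> cache;
static int loadCount = 0;  // counts actual (slow) loads from disk

FakeMIPMap *LoadFromDisk(const std::string &filename) {
    ++loadCount;  // in pbrt this role is played by the MIPMap creation code
    return new FakeMIPMap{filename};
}

FakeMIPMap *GetMIPMap(const std::string &filename) {
    // Return cached MIPMap if present.
    {
        std::lock_guard<std::mutex> lock(cacheMutex);
        auto iter = cache.find(filename);
        if (iter != cache.end())
            return iter->second;
    }
    // The lock is released here, so other threads can use the cache
    // while this thread performs the slow file read.
    FakeMIPMap *mipmap = LoadFromDisk(filename);
    std::lock_guard<std::mutex> lock(cacheMutex);
    // Another thread may have loaded the same file in the meantime;
    // prefer the copy already in the cache so only one instance is kept.
    auto [iter, inserted] = cache.try_emplace(filename, mipmap);
    if (!inserted)
        delete mipmap;
    return iter->second;
}
```

Releasing the mutex around the slow load is what lets other threads keep using the cache; the re-check under the lock handles the case where two threads race to load the same file.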

`<<Return MIPMap from texture cache if present>>=`

The texture cache itself is managed with a `std::map`.

`TexInfo` is a simple structure that acts as a key for the texture
cache `std::map`. It holds all the specifics that must match
for a `MIPMap` to be reused in another image texture.

The `TexInfo` constructor, not included here, sets its member
variables with provided values. Its only other method is a comparison
operator, which is required by `std::map`.
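A minimal sketch of such a key type, with hypothetical fields standing in for `TexInfo`'s actual members; `std::tie` is a common way to get the strict weak ordering that `std::map` requires.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Illustrative key struct in the spirit of TexInfo: every field that must
// match for a cached MIPMap to be reused participates in the ordering.
struct TexKey {
    std::string filename;
    int wrapMode;       // stand-in for an enumeration
    bool gammaEncoded;  // stand-in for the color encoding

    bool operator<(const TexKey &o) const {
        // std::tie compares all fields lexicographically, which yields the
        // strict weak ordering that std::map requires of its key type.
        return std::tie(filename, wrapMode, gammaEncoded) <
               std::tie(o.filename, o.wrapMode, o.gammaEncoded);
    }
};
```

With this operator, two lookups that differ in any field (say, the color encoding) map to distinct cache entries, so a `MIPMap` is only shared when every parameter matches.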

If the texture has not yet been loaded, a call to `CreateFromFile()`
yields a `MIPMap` for it. If the file is not found or there is an
error reading it, `pbrt` exits with an error message, so a `nullptr`
return value does not need to be handled here.

`<<Create MIPMap for filename and add to texture cache>>=`

### 10.4.2 Image Texture Evaluation

Before describing the `MIPMap` implementation, we will discuss the
`SpectrumImageTexture` `Evaluate()` method.

`<<Lookup filtered RGB value in MIPMap>>`

`<<Return SampledSpectrum for RGB image texture value>>`

It is easy to compute the texture coordinates and their derivatives
for filtering with the `TextureMapping2D`’s `Map()` method.
However, the $t$ coordinate must be flipped, because `pbrt`’s `Image`
class (and in turn, `MIPMap`, which is based on it) defines $(0, 0)$ to
be the upper left corner of the image, while image textures have $(0, 0)$ at the
lower left. (These are the typical conventions for indexing these entities
in computer graphics.)

The `MIPMap`’s `Filter()` method provides the filtered
value of the image texture over the specified region; any specified
scale or inversion is easily applied to the value it returns. A call to
`ClampZero()` here ensures that no negative values are returned after
inversion.

`<<Lookup filtered RGB value in MIPMap>>=`

As discussed in Section 4.6.2, an RGB color space is
necessary in order to interpret the meaning of an `RGB` color value.
Normally, the code that reads image file formats from disk returns an
`RGBColorSpace` with the read image. Most RGB image formats default to sRGB, and
some allow specifying an alternative color space. (For example, OpenEXR
allows specifying the primaries of an arbitrary RGB color space in the
image file’s metadata.)
A color space and the value of `spectrumType` make it possible to
create the appropriate type of RGB spectrum, and in turn, its
`Spectrum::Sample()` method can be called to get the
`SampledSpectrum` that will be returned.

If the `MIPMap` has no associated color space, the image is
assumed to have the same value in all channels and a constant value is
returned for all the spectrum samples. This assumption is verified by a
`DCHECK()` call in non-optimized builds.

`<<Return SampledSpectrum for RGB image texture value>>=`

### 10.4.3 MIP Maps

As always, if the image texture function has higher frequency detail than can be represented by the texture sampling rate, aliasing will be present in the final image. Any frequencies higher than the Nyquist limit must be removed by prefiltering before the function is evaluated. Figure 10.15 shows the basic problem we face: an image texture has texels that are samples of some image function at a fixed frequency. The filter region for the lookup is given by its center point and offsets to the estimated texture coordinate locations for the adjacent image samples. Because these offsets are estimates of the texture sampling rate, we must remove any detail in the texture function with a period shorter than twice the distance to the adjacent samples in order to satisfy the Nyquist criterion.

The texture sampling and reconstruction process has a few key differences from the image sampling process discussed in Chapter 8. These differences make it possible to address the antialiasing problem with more effective and less computationally expensive techniques. For example, here it is inexpensive to get the value of a sample—only an array lookup is necessary (as opposed to having to trace a number of rays to compute radiance). Further, because the texture image function is fully defined by the set of samples and there is no mystery about what its highest frequency could be, there is no uncertainty related to the function’s behavior between samples. These differences make it possible to remove detail from the texture before sampling, thus eliminating aliasing.

However, the texture sampling rate will typically change from pixel to pixel. The sampling rate is determined by scene geometry and its orientation, the texture coordinate mapping function, and the camera projection and image sampling rate. Because the texture sampling rate is not fixed, texture filtering algorithms need to be able to filter over arbitrary regions of texture samples efficiently.

The `MIPMap` class implements a number of methods for texture filtering
with spatially varying filter widths. It can be found in the files
`util/mipmap.h` and `util/mipmap.cpp`. The filtering
algorithms it offers range from simple point sampling to bilinear
interpolation and trilinear interpolation,
which is fast and easy to implement and was widely used for texture
filtering in early graphics hardware, to elliptically weighted averaging,
which is more complex but returns extremely high-quality results.
Figure 10.16 compares the result of texture
filtering using trilinear interpolation and the EWA algorithm.

If an RGB image is provided to the `MIPMap` constructor, its channels
should be stored in R, G, B order in memory; for efficiency, the following code assumes
that this is the case.
All the code that currently uses `MIPMaps` in `pbrt` ensures that
this is so.

To limit the potential number of texels that need to be accessed, these
filtering methods use an *image pyramid* of increasingly lower
resolution prefiltered versions of the original image to accelerate their
operation.
The original image texels are at the bottom level of the pyramid, and the
image at each level is half the resolution of the previous level, up to the
top level, which has a single texel representing the average of all the
texels in the original image. This collection of images needs at most
$1/3$ more memory than storing the most detailed level alone and can be
used to quickly find filtered values over large regions of the original
image. The basic idea behind the pyramid is that if a large area of texels
needs to be filtered, a reasonable approximation is to use a higher level of
the pyramid and do the filtering over the same area there, accessing many
fewer texels.
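The memory claim can be checked numerically with a short sketch that counts pyramid texels, assuming each level halves the resolution in both dimensions (rounding up) down to a single texel; the function name is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Count the total texels in a MIP pyramid whose levels halve in each
// dimension (rounding up) until a single texel remains.
int64_t PyramidTexels(int64_t w, int64_t h) {
    int64_t total = 0;
    while (true) {
        total += w * h;
        if (w == 1 && h == 1)
            break;
        w = std::max<int64_t>(1, (w + 1) / 2);
        h = std::max<int64_t>(1, (h + 1) / 2);
    }
    return total;
}
```

For a $1024 \times 1024$ image the full pyramid holds $\sum_{i=0}^{10} 4^i = 1398101$ texels, so the overhead over the base level is exactly bounded by the geometric series $1/4 + 1/16 + \cdots < 1/3$.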

The `MIPMap`’s image pyramid is represented by a vector of
`Image`s. See Section B.5 for the implementation of
`Image` and Section B.5.5 for its
`GeneratePyramid()` method, which
generates image pyramids.

The choice of filtering algorithm and a parameter used by the `EWA` method
are represented by `MIPMapFilterOptions`.

A few simple utility methods return information about the image pyramid and
the `MIPMap`’s color space.

Given the image pyramid, we will define some utility `MIPMap` methods
that retrieve the texel value at a specified pyramid level and discrete
integer pixel coordinates. For the `RGB` variant, there is an
implicit assumption that the image channels are laid out in R, G, B (and
maybe A) order.

The `Float` specialization of `Texel()`, not included here, is
analogous.

### 10.4.4 Image Map Filtering

The `MIPMap` `Filter()` method returns a filtered image function
value at the provided $(s, t)$ coordinates. It takes two derivatives that
give the change in $(s, t)$ with respect to image pixel samples.


The `EWA` filtering technique to be described shortly uses both
derivatives of $(s, t)$ to compute an *anisotropic filter*, one that
filters by different amounts in the different dimensions. The other three
use an *isotropic filter* that filters both equally. The isotropic
filters are more
computationally efficient than the anisotropic filter,
though they do not give results that are as good.
For them, only a single value is needed to specify the width of the
filter. The width here is conservatively chosen to avoid aliasing in both
the $s$ and $t$ directions, though this choice means that textures viewed
at an oblique angle will appear blurry, since the required sampling rate
in one direction will be very different from the sampling rate along the
other in this case.

`<<Compute MIP Map level for width and handle very wide filter>>`

Because filtering over many texels for wide filter widths would be inefficient, this method chooses a MIP map level from the pyramid such that the filter region at that level would cover four texels at that level. Figure 10.17 illustrates this idea.

Since the resolutions of the levels of the pyramid are all powers of two, the resolution of level $l$ is $2^{\mathrm{nLevels}-1-l}$. Therefore, to find the level with a texel spacing width $w$ requires solving

$$\frac{1}{2^{\mathrm{nLevels}-1-l}} = w$$

for $l$, which gives $l = \mathrm{nLevels} - 1 + \log_2 w$. In general, this will be a floating-point value between two MIP map levels. Values of $l$ greater than the number of pyramid levels correspond to a filter width wider than the image, in which case the single pixel at the top level is returned.
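Solving for the level of detail just described can be sketched as follows; the small clamp on `width` is an assumption that avoids taking the logarithm of zero for point-sampled lookups, and the function name is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Solve 1 / 2^(nLevels-1-l) = width for the (fractional) pyramid level l,
// giving l = nLevels - 1 + log2(width). Level 0 is the most detailed.
float LevelForWidth(float width, int nLevels) {
    return nLevels - 1 + std::log2(std::max(width, 1e-8f));
}
```

For an 11-level pyramid over a 1024-texel-wide image, a filter width of one texel spacing (1/1024) selects level 0, while a width covering the whole image selects the top, single-texel level.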

`<<Compute MIP Map level for width and handle very wide filter>>=`

For a point-sampled texture lookup, it is only necessary to convert the
continuous texture coordinates over $[0, 1]^2$ to discrete coordinates over
the image resolution and to retrieve the appropriate texel value via
the `MIPMap`’s `Texel()` method.

Bilinear filtering, which is equivalent to filtering using a triangle
filter, is easily implemented via a call to `Bilerp()`.

Bilinear interpolation is provided in a separate method so that it can also be used for trilinear filtering.
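A self-contained sketch of triangle-filter reconstruction over a small single-channel image; the clamped border addressing and the `Bilerp` signature here are illustrative, not `pbrt`'s actual interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Triangle-filter (bilinear) reconstruction at continuous coordinates
// (s, t) in [0,1]^2 over a w x h single-channel image; clamp addressing
// at the borders stands in for configurable wrap modes.
float Bilerp(const std::vector<float> &img, int w, int h, float s, float t) {
    auto texel = [&](int x, int y) {
        x = std::min(std::max(x, 0), w - 1);
        y = std::min(std::max(y, 0), h - 1);
        return img[y * w + x];
    };
    // Subtract 0.5 to align continuous coordinates with texel centers.
    float x = s * w - 0.5f, y = t * h - 0.5f;
    int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
    float dx = x - x0, dy = y - y0;
    return (1 - dx) * (1 - dy) * texel(x0, y0) +
           dx * (1 - dy) * texel(x0 + 1, y0) +
           (1 - dx) * dy * texel(x0, y0 + 1) +
           dx * dy * texel(x0 + 1, y0 + 1);
}
```

At a texel center the weights collapse so that the texel's own value is returned; halfway between four texels, their average results.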

As shown by Figure 10.17, applying a triangle filter to
the four texels around the sample point will either filter over too small a
region or too large a region (except for very carefully selected filter
widths). Therefore, the `Trilinear` filtering option applies the
triangle filter at both of these levels and blends between them according
to how close `level` is to each of them. This helps hide the
transitions from one MIP map level to the next at nearby pixels in the
final image. While applying a triangle filter to four texels at two levels
in this manner does not generally give exactly the same result as applying
a triangle filter to the original pixels, the difference is not too bad in
practice, and the efficiency of this approach is worth this penalty. In
any case, the following elliptically weighted average filtering approach
should be used when texture quality is important.
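The blend between the two bracketing levels can be sketched as follows, with a hypothetical `bilerpAtLevel` callback standing in for the per-level triangle-filter lookup:

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Blend triangle-filtered lookups from the two pyramid levels that bracket
// a fractional level of detail; level 0 is the most detailed level.
float Trilerp(float level, int nLevels,
              const std::function<float(int)> &bilerpAtLevel) {
    if (level <= 0)
        return bilerpAtLevel(0);
    if (level >= nLevels - 1)
        return bilerpAtLevel(nLevels - 1);
    int iLevel = (int)std::floor(level);
    float delta = level - iLevel;  // how close we are to the next level up
    return (1 - delta) * bilerpAtLevel(iLevel) +
           delta * bilerpAtLevel(iLevel + 1);
}
```

Because `delta` varies continuously with the filter width, nearby pixels that land on either side of a level boundary receive smoothly blended values rather than an abrupt switch between pyramid levels.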

The elliptically weighted average (EWA) algorithm fits an ellipse to the two differential vectors in texture space and then filters the texture with a Gaussian filter function (Figure 10.18). It is widely regarded as one of the best texture filtering algorithms in graphics and has been carefully derived from the basic principles of sampling theory. Unlike the triangle filter, it can filter over arbitrarily oriented regions of the texture, with different filter extents in different directions. The quality of its results is improved by it being an anisotropic filter, since it can adapt to different sampling rates along the two image axes.

We will not show the full derivation of this filter here, although we do note
that it is distinguished by being a *unified resampling filter*: it
simultaneously computes the result of a Gaussian filtered texture function
convolved with a Gaussian reconstruction filter in image space. This is in
contrast to many other texture filtering methods that ignore the effect of
the image-space filter or equivalently assume that it is a box. Even if a
Gaussian is not being used for filtering the samples for the image being
rendered, taking some account of the spatial variation of the image filter
improves the results, assuming that the filter being used is somewhat
similar in shape to the Gaussian, as the Mitchell and windowed sinc filters are.

The screen-space partial derivatives of the texture coordinates define the
ellipse. The lookup method starts out by determining which of the
two differential vectors is longer,
swapping them if needed so that `dst0` is the longer vector. The
length of the shorter vector will be used to select a MIP map level.

Next the ratio of the length of the longer vector to the length of the shorter one is considered. A large ratio indicates a very long and skinny ellipse. Because this method filters texels from a MIP map level chosen based on the length of the shorter differential vector, a large ratio means that a large number of texels need to be filtered. To avoid this expense (and to ensure that any EWA lookup takes a bounded amount of time), the length of the shorter vector may be increased to limit this ratio. The result may be an increase in blurring, although this effect usually is not noticeable in practice.
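The clamping step might look like the following sketch; `Vec2`, `ClampEllipse`, and `maxAnisotropy` are illustrative names, not `pbrt`'s actual interface.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

static float Length(Vec2 v) { return std::sqrt(v.x * v.x + v.y * v.y); }

// Scale up the minor axis so that the major:minor ratio never exceeds
// maxAnisotropy. This bounds the number of texels an EWA lookup touches,
// at the cost of some extra blur along the minor axis.
void ClampEllipse(Vec2 &major, Vec2 &minor, float maxAnisotropy) {
    float majorLen = Length(major), minorLen = Length(minor);
    if (minorLen * maxAnisotropy < majorLen && minorLen > 0) {
        float scale = majorLen / (minorLen * maxAnisotropy);
        minor.x *= scale;
        minor.y *= scale;
    }
}
```

Lengthening the minor axis (rather than shortening the major one) errs on the side of blurring instead of aliasing.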

Like the triangle filter, the EWA filter uses the image pyramid to reduce the number of texels to be filtered for a particular texture lookup, choosing a MIP map level based on the length of the shorter vector. Given the limited ratio from the clamping above, the total number of texels used is thus bounded. Given the length of the shorter vector, the computation to find the appropriate pyramid level is the same as was used for the triangle filter. Similarly, the implementation here blends between the filtered results at the two levels around the computed level of detail, again to reduce artifacts from transitions from one level to another.

The `MIPMap::EWA()` method actually applies the filter at a particular
level.

This method first converts from texture coordinates in $[0, 1]^2$ to coordinates
and differentials in terms of the resolution of the chosen MIP map level. It
also subtracts $0.5$ from the continuous position coordinate to align the
sample point with the discrete texel coordinates, as was done in
`MIPMap::Bilerp()`.

It next computes the coefficients of the implicit equation for
the ellipse centered at the origin that is defined by the vectors `(ds0,dt0)` and `(ds1,dt1)`.
Placing the ellipse at the origin rather than at $(s, t)$ simplifies
the implicit equation and the computation of its coefficients and can be
easily corrected for when the equation is evaluated later. The general form of
the implicit equation for all points $(s, t)$ inside such an ellipse is

$$e(s, t) = A s^2 + B s t + C t^2 < F,$$

although it is more computationally efficient to divide through by $F$ and express this as

$$e(s, t) = \frac{A}{F} s^2 + \frac{B}{F} s t + \frac{C}{F} t^2 = A' s^2 + B' s t + C' t^2 < 1.$$
We will not derive the equations that give the values of the coefficients, although the interested reader can easily verify their correctness.
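For the interested reader, one way to compute the normalized coefficients is sketched below; the `+ 1` terms slightly widen the ellipse so that it always covers at least a texel's worth of area, though the function name and signature here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Compute the coefficients of e(s,t) = A s^2 + B s t + C t^2 < 1 for the
// ellipse defined by the differential vectors (ds0,dt0) and (ds1,dt1),
// already divided through by F so the boundary is at e = 1.
void EllipseCoefficients(float ds0, float dt0, float ds1, float dt1,
                         float &A, float &B, float &C) {
    A = dt0 * dt0 + dt1 * dt1 + 1;
    B = -2 * (ds0 * dt0 + ds1 * dt1);
    C = ds0 * ds0 + ds1 * ds1 + 1;
    // Divide through by F = A*C - B^2/4 to normalize the boundary to 1.
    float invF = 1 / (A * C - B * B * 0.25f);
    A *= invF;
    B *= invF;
    C *= invF;
}
```

For axis-aligned unit differentials the cross term vanishes and the ellipse degenerates to a circle, which makes the coefficients easy to verify by hand.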

The next step is to find the axis-aligned bounding box in discrete integer texel coordinates of the texels that are potentially inside the ellipse. The EWA algorithm loops over all of these candidate texels, filtering the contributions of those that are in fact inside the ellipse. The bounding box is found by determining the minimum and maximum values that the ellipse takes in the $s$ and $t$ directions. These extrema can be calculated by finding the partial derivatives $\partial e/\partial s$ and $\partial e/\partial t$, solving for the $s$ and $t$ values where each is zero, and adding the offset to the ellipse center. For brevity, we will not include the derivation for these expressions here.

Now that the bounding box is known, the EWA algorithm loops over the texels, transforming each one to the coordinate system where the texture lookup point $(s, t)$ is at the origin with a translation. It then evaluates the ellipse equation to see if the texel is inside the ellipse (Figure 10.19) and computes the filter weight for the texel if so. The final filtered value returned is a weighted sum over texels $(s', t')$ inside the ellipse, where $f$ is the Gaussian filter function:

$$\frac{\sum_{(s', t')} f(s' - s, t' - t)\, t(s', t')}{\sum_{(s', t')} f(s' - s, t' - t)}.$$
A nice feature of the implicit equation is that its value at a particular texel is the squared ratio of the distance from the center of the ellipse to the texel to the distance from the center of the ellipse to the ellipse boundary along the line through that texel (Figure 10.19). This value can be used to index into a precomputed lookup table of Gaussian filter function values.
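The loop just described can be sketched as follows; for simplicity this version scans the whole image rather than the ellipse's bounding box and evaluates the Gaussian directly instead of using the lookup table, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Weighted sum over the texels inside the ellipse e(s,t) < 1 centered at
// (s0, t0), with normalized coefficients A, B, C and a Gaussian weight.
float EWA(const std::vector<float> &img, int w, int h, float s0, float t0,
          float A, float B, float C) {
    float sum = 0, sumWeights = 0;
    // A real implementation would restrict this loop to the ellipse's
    // axis-aligned bounding box rather than the whole image.
    for (int it = 0; it < h; ++it) {
        for (int is = 0; is < w; ++is) {
            // Translate the lookup point to the origin.
            float ss = is - s0, tt = it - t0;
            float e = A * ss * ss + B * ss * tt + C * tt * tt;
            if (e < 1) {
                // e is the squared relative distance to the ellipse edge,
                // so it can directly drive the (offset) Gaussian weight.
                float weight = std::exp(-2 * e) - std::exp(-2.0f);
                sum += weight * img[it * w + is];
                sumWeights += weight;
            }
        }
    }
    return sumWeights > 0 ? sum / sumWeights : 0;
}
```

Because the weights are normalized by their sum, a constant-valued image is returned unchanged regardless of the ellipse's shape.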

The lookup table is precomputed and available as a constant array.
Similar to the `GaussianFilter` used for image reconstruction,
the filter function is offset so that it goes to zero at the end of its extent rather
than having an abrupt step. It is

$$f(r) = e^{-\alpha r^2} - e^{-\alpha}.$$

A value of $\alpha = 2$ was used for the table in `pbrt`. Because the table is indexed
with squared distances from the filter center $r^2$, each entry stores a
value $e^{-\alpha r} - e^{-\alpha}$, rather than $e^{-\alpha r^2} - e^{-\alpha}$.
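Precomputing such a table might look like the following sketch, with the table size and names chosen for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Precompute an EWA Gaussian lookup table. Entry i corresponds to the
// squared distance r^2 = i / (size - 1), so the stored value is
// exp(-alpha * r^2) - exp(-alpha); the offset makes the last entry zero,
// avoiding an abrupt step at the filter's edge.
std::vector<float> MakeEWATable(int size, float alpha = 2.f) {
    std::vector<float> table(size);
    for (int i = 0; i < size; ++i) {
        float r2 = float(i) / (size - 1);
        table[i] = std::exp(-alpha * r2) - std::exp(-alpha);
    }
    return table;
}
```

At lookup time, the value of the ellipse equation $e < 1$ can then be scaled by the table size and used directly as an index, since it is already a squared relative distance.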