16.2 Stochastic Progressive Photon Mapping
Photon mapping is one of a family of particle-tracing algorithms, which are based on the idea of constructing paths starting from the lights and connecting vertices in these paths to the camera to deposit energy on the film. In this section, we will start by introducing a theory of particle-tracing algorithms and will discuss the conditions that must be fulfilled by a particle-tracing algorithm so that arbitrary measurements can be computed correctly using the particles created by the algorithm. We will then describe an implementation of a photon mapping integrator that uses particles to estimate illumination by interpolating lighting contributions from particles close to but not quite at the point being shaded.
16.2.1 Theoretical Basis for Particle Tracing
Particle-tracing algorithms in computer graphics are often explained in terms of packets of energy being shot from the light sources in the scene that deposit energy at surfaces they intersect before scattering in new directions. This is an intuitive way of thinking about particle tracing, but the intuition that it provides doesn’t make it easy to answer basic questions about how propagation and scattering affect the particles. For example, does their contribution fall off with squared distance like flux density? Or, which terms, if any, affect particles after they scatter from a surface?
In order to give a solid theoretical basis for particle tracing, we will describe it using a framework introduced by Veach (1997, Appendix 4.A), which instead interprets the stored particle histories as samples from the scene’s equilibrium radiance distribution. Under certain conditions on the distribution and weights of the particles, the particles can be used to compute estimates of nearly any measurement based on the light distribution in the scene. In this framework, it is quite easy to answer questions about the details of particle propagation like the ones earlier. After developing this theory here, the remainder of this section will demonstrate its application to photon mapping.
A particle-tracing algorithm generates a set of N samples of illumination at points p_j on surfaces in the scene,

  (p_j, ω_j, β_j),

where each sample records incident illumination from direction ω_j and has some throughput weight β_j associated with it (Figure 16.5). As the notation already indicates, this weight will contain ratios of terms of the throughput function and the associated sampling PDFs, much like the β variable of the path tracer (Section 14.5.4). We would like to determine the conditions on the weights β_j and the distribution of particle positions p_j so that we can use them to correctly compute estimates of arbitrary measurements.
Given an importance function W_e(p, ω) that describes the measurement to be taken, the natural condition we would like to be fulfilled is that the particles should be distributed and weighted such that using them to compute an estimate has the same expected value as the measurement equation for the same importance function:

  E[ (1/N) Σ_j β_j W_e(p_j, ω_j) ] = ∫_A ∫_{S²} W_e(p, ω) L(p, ω) |cos θ| dω dA(p).   (16.7)
For example, we might want to use the particles to compute the total flux incident on a wall. Using the definition of flux,

  Φ = ∫_{A_wall} ∫_{H²(n)} L_i(p, ω) |cos θ| dω dA(p),

the following importance function selects the particles that lie on the wall and arrived from the hemisphere around the normal:

  W_e(p, ω) = 1 if p is on the wall surface and (ω · n) > 0, and 0 otherwise.
If the conditions on the distribution of particle weights and positions are true for arbitrary importance functions such that Equation (16.7) holds, then the flux estimate can be computed directly as a sum of the particle weights for the particles on the wall. If we want to estimate flux over a different wall, a subset of the original wall, and so on, we only need to recompute the weighted sum with an updated importance function. The particles and weights can be reused, and we have an unbiased estimate for all of these measurements. (The estimates will be correlated, however, which is potentially a source of image artifacts.)
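To make this concrete, here is a minimal sketch (not pbrt code; the Particle struct, scalar weights, and the explicit 1/N normalization are simplifying assumptions) of evaluating a measurement as a weighted sum of particles against a caller-supplied importance function:

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical minimal particle record: position p, incident direction w,
// and throughput weight beta (scalar here; spectral in a real renderer).
struct Particle {
    float p[3];   // surface position
    float w[3];   // incident direction
    float beta;   // throughput weight
};

// Estimate a measurement from N particles as (1/N) * sum_j beta_j * We(...),
// where We is the importance function describing the measurement.
float EstimateMeasurement(const std::vector<Particle> &particles,
                          const std::function<float(const Particle &)> &We) {
    double sum = 0;
    for (const Particle &pt : particles) sum += pt.beta * We(pt);
    return float(sum / particles.size());
}
```

With a binary importance function that selects only particles on the wall, the estimate reduces to the average of the selected particle weights; changing the measurement only requires swapping in a different We, with the same particles reused.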
To see how to generate and weight particles that fulfill these conditions, consider the task of evaluating the measurement equation integral
where the vertex densities p(p_i) are expressed as a probability per unit area and where the importance function W_e that describes the measurement is a black box and thus cannot be used to drive the sampling of the integral at all. We can still compute an estimate of the integral with Monte Carlo integration, but we must sample a set of points p_i and directions ω_i from all of the surfaces in the scene, using some sampling distribution that doesn't depend on W_e (e.g., by uniformly sampling points by surface area).
By expanding the LTE in the integrand and applying the standard Monte Carlo estimator for N samples, we can find the estimator for this measurement,
We can further expand out the radiance term into the sum over paths of increasing length, so that for a particular sample the expected value can be written as a finite sum of terms in just the same way that we generated a finite set of weighted path vertices for path tracing. If the sum is truncated with Russian roulette such that the probability of terminating the sum after j terms is q_j, then the jth term of the ith sample has contribution
Note that the path integral framework provides us with the freedom to generate a set of particles in all sorts of different ways—i.e., with different underlying vertex probability densities p(p_i). Although the natural approach is to start from points on lights and incrementally sample paths using the BSDFs at the path vertices, similar to how the path-tracing integrator generates paths (starting here from the light rather than from the camera), we could generate them with any number of different sampling strategies, as long as there is nonzero probability of generating a particle at any point where the numerator of the weight is nonzero and the particle weights are computed using the above definition.
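The Russian roulette termination described above can be illustrated with a small deterministic helper (a toy sketch under simplifying assumptions, not pbrt's implementation: a scalar albedo stands in for the BSDF and a fixed continuation probability q is used at every bounce):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy sketch of Russian roulette along a particle path: at each bounce the
// walk continues only if the supplied uniform sample u[i] is below the
// continuation probability q; a surviving particle's throughput weight is
// divided by q to compensate, keeping the estimator unbiased in expectation.
float ParticleWalkThroughput(const std::vector<float> &u, float albedo, float q) {
    float beta = 1.f;  // throughput weight of the particle
    for (float ui : u) {
        if (ui >= q) return 0.f;  // path terminated by Russian roulette
        beta *= albedo / q;       // scatter: BSDF factor over survival probability
    }
    return beta;
}
```

Terminated paths contribute zero, while survivors are weighted up by 1/q at each bounce; averaged over many runs, the expected throughput matches the untruncated sum.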
If we only had a single measurement to make, it would be better to use information about W_e to compute the estimate more intelligently, since the general particle-tracing approach described here may generate many useless samples if W_e only covers a small subset of the points on scene objects. If we will be computing many measurements, however, the key advantage that particle tracing brings is that we can generate the samples and weights once and then reuse them over a large number of measurements, potentially computing results much more efficiently than if the measurements were all computed from scratch.
16.2.2 Photon Mapping
The photon mapping algorithm is based on tracing particles into the scene and blurring their contribution to approximate the incident illumination at shading points. For consistency with other descriptions of the algorithm, we will refer to particles generated for photon mapping as photons.
In order to compute reflected radiance at a point, we need to estimate the exitant radiance L_o(p, ω_o) at a point p in a direction ω_o, which can equivalently (and cumbersomely) be written as a measurement over all points p′ on surfaces in the scene, where a Dirac delta distribution selects only particles located precisely at p:

  L_o(p, ω_o) = ∫_A ∫_{S²} δ(p′ − p) f(p′, ω_o, ω) L_i(p′, ω) |cos θ| dω dA(p′),

and so, from Equation (16.7), the function that describes the measurement is

  W_e(p′, ω) = δ(p′ − p) f(p′, ω_o, ω).   (16.9)
Unfortunately, because there is a delta distribution in W_e, all of the particle histories that were generated during the particle-tracing step have zero probability of making a nonzero contribution if Equation (16.7) is used to compute the estimate of the measurement value (just as we will never be able to choose a direction from a diffuse surface that intersects a point light source unless the direction is sampled based on the light source’s position).
Here is the point at which bias is introduced into the photon mapping algorithm. Under the assumption that the information about illumination at nearby points gives a reasonable approximation to illumination at the shading point, photon mapping interpolates illumination from nearby photons around the point being shaded; the delta function of position in Equation (16.9) is effectively converted to a filter function. Given Equation (16.7), the more photons there are around the point and the higher their weights, the more radiance is estimated to be incident at the point.
One factor that contributes to photon mapping’s efficiency is this reuse of photons: having gone through the trouble to compute a light-carrying path, allowing it to potentially contribute illumination at multiple points amortizes the cost of generating it. While photon mapping derives some benefit from this efficiency improvement, there’s a subtler but much more important benefit from using nearby photons at a point: some light-carrying paths are impossible to sample with unbiased algorithms based on incremental path construction (including path tracing and bidirectional path tracing), but are handled well with photon mapping. These paths can arise in very common situations.
For example, consider the task of rendering an image of a photograph with a plate of glass in front of it. Assume a pinhole camera model and a point light illuminating the scene, and also assume for simplicity that the glass is only transmissive (Figure 16.6). If we start a path from the camera that passes through the glass, then the point at which it intersects the photograph is completely determined by the effect of the refraction. At this point, none of the direct lighting strategies we have available has any chance of sampling an incident direction that will reach the light: because any sampled incident direction leaving the diffuse surface will be refracted on the way out through the glass, there’s no chance that the refracted ray will hit the point light. Even with an area light source, some refracted rays may be lucky enough to hit the light, but in general, variance will be high since most will miss it.
With photon mapping, we can trace photons leaving the light, let them refract through the glass, and deposit illumination on the diffuse surface. With a sufficient number of photons, the surface will be densely covered, and the photons around a point give a good estimate of the incident illumination.
A statistical technique called density estimation provides the mathematical tools to perform this interpolation. Density estimation constructs a PDF from a set of sample points, under the assumption that the samples are distributed according to some underlying distribution function of interest. Histogramming is a straightforward example of the idea. In 1D, the line is divided into intervals of some width, the number of samples landing in each interval is counted, and the counts are normalized so that the resulting function integrates to one.
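The histogram approach can be sketched in a few lines (an illustrative helper, not pbrt code; the function name and the [x0, x1) domain convention are assumptions):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of 1D histogram density estimation: divide [x0, x1) into nBins
// equal-width intervals, count the samples landing in each, and normalize
// so that the resulting piecewise-constant function integrates to one.
std::vector<double> HistogramDensity(const std::vector<double> &samples,
                                     double x0, double x1, int nBins) {
    std::vector<double> density(nBins, 0.0);
    double width = (x1 - x0) / nBins;
    int n = 0;
    for (double x : samples) {
        if (x < x0 || x >= x1) continue;  // ignore out-of-range samples
        density[int((x - x0) / width)] += 1.0;
        ++n;
    }
    // Each count becomes count / (n * width), so sum_i density[i] * width == 1.
    for (double &d : density) d /= n * width;
    return density;
}
```

The resulting function is a valid PDF over [x0, x1) but is discontinuous at bin boundaries, which motivates the kernel methods described next.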
Kernel methods are a more sophisticated density estimation technique. They generally give better results and smoother PDFs that don’t suffer from the discontinuities that histograms do. Given a kernel function k(x) that integrates to 1,

  ∫ k(x) dx = 1,

the kernel estimator for N samples at locations x_i is

  p̂(x) = (1/(N h)) Σ_{i=1}^{N} k( (x − x_i) / h ),

where h is the window width (also known as the smoothing parameter or kernel bandwidth). Kernel methods can be thought of as placing a series of bumps at the observation points, where the sum of the bumps forms a PDF since they individually integrate to 1 and the sum is normalized. Figure 16.7 shows an example of density estimation in 1D, where a smooth PDF is computed from a set of sample points.

The key question with kernel methods is how the window width h is chosen. If it is too wide, the PDF will blur out relevant detail in parts of the domain with many samples; if it is too narrow, the PDF will be too bumpy in the tails of the distribution where there aren’t many samples. Nearest-neighbor techniques solve this problem by choosing h adaptively based on the local density of samples. Where there are many samples, the width is small; where there are few samples, the width is large. For example, one approach is to pick a number n and use the distance d_n(x) to the nth nearest sample from the point x as the window width. This is the generalized nth nearest-neighbor estimate:

  p̂(x) = (1/(N d_n(x))) Σ_{i=1}^{N} k( (x − x_i) / d_n(x) ).

In d dimensions, this generalizes to

  p̂(x) = (1/(N d_n(x)^d)) Σ_{i=1}^{N} k( (x − x_i) / d_n(x) ).   (16.10)
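A 1D version of the nearest-neighbor kernel estimate might look like the following sketch (illustrative, not pbrt code; the Epanechnikov kernel 0.75(1 − t²) on |t| < 1 is an arbitrary choice of kernel that integrates to one):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// 1D nearest-neighbor kernel density estimate: the bandwidth at x is the
// distance d_k(x) to the k-th nearest sample (counting a sample at x itself),
// and the estimate is (1 / (N d_k)) * sum_i kernel((x - x_i) / d_k).
double NNKernelEstimate(const std::vector<double> &samples, double x, int k) {
    // Find the distance to the k-th nearest sample with a partial sort.
    std::vector<double> d;
    for (double xi : samples) d.push_back(std::abs(x - xi));
    std::nth_element(d.begin(), d.begin() + (k - 1), d.end());
    double h = d[k - 1];
    double sum = 0;
    for (double xi : samples) {
        double t = (x - xi) / h;
        if (std::abs(t) < 1) sum += 0.75 * (1 - t * t);  // Epanechnikov kernel
    }
    return sum / (samples.size() * h);
}
```

Note that samples exactly at the k-th nearest distance fall outside the kernel's open support here; a production implementation would also need to guard against a zero bandwidth when duplicate sample positions occur.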
Substituting this density estimate into the measurement equation, it can be shown that the appropriate estimator for the measurement we’d like to compute, the exitant radiance at the point p in direction ω_o, is given by

  L_o(p, ω_o) ≈ (1/(N_p d_n(p)²)) Σ_j k( (p − p_j) / d_n(p) ) β_j f(p, ω_o, ω_j),   (16.11)

where we’ve switched to using N_p to denote the total number of emitted photons, the sum is over all of the photons, and the scale factors for the photons are computed based on the density estimation equation, (16.10). Because we know that the kernel function is zero for points farther away than the nth nearest-neighbor distance d_n(p), implementations of this sum only need to consider the n nearest photons.
The error introduced by this interpolation can be difficult to quantify. Tracing more photons generally increases photon density and will almost always improve the results. When the photons are more closely spaced, it isn’t necessary to use photons from as far away in the nearest-neighbor estimate. In general, the error at any given point will depend on how quickly the illumination is changing in its vicinity. One can always construct pathological cases where this error is unacceptable, but in practice it usually isn’t too bad. Because the interpolation step tends to blur out illumination, high-frequency changes in lighting are sometimes poorly reconstructed with photon mapping. If traditional methods are used for direct lighting, then this is generally less of a problem since indirect illumination tends to be low frequency.
The original formulation of photon mapping was based on a two-pass algorithm where photons are first traced from the light sources. Photon interactions are recorded on surfaces of the scene and then organized in a spatial data structure (generally a kd-tree) for use at rendering time. A second pass follows paths starting from the camera; at each path vertex, nearby photons are used to estimate indirect illumination.
While this approach is effective, it has the limitation that the number of photons that can be used is limited by available memory since all of the photons must be stored. No improvement is possible once memory is full, even if one wants a higher quality result from using more photons. (In contrast, with path tracing, for example, one can always add more samples per pixel without any incremental storage cost—only the computational expense increases.)
Progressive Photon Mapping
The progressive photon mapping algorithm addressed this issue by restructuring the algorithm: first, a camera pass traces paths starting from the camera. Each pixel stores a representation of all of the non-specular path vertices found when generating paths that started in its extent. (For example, if a camera ray hits a diffuse surface, we might record the geometric information about the intersection point and the diffuse reflectance. If it hit a perfectly specular surface and then a diffuse surface, we would record the diffuse reflectance scaled by the specular BSDF value, and so forth.) We will dub these stored path vertices visible points in the following. A second pass traces photons from the light sources; at each photon–surface intersection, the photon contributes to the reflected radiance estimate for nearby visible points.
To understand how progressive photon mapping works, we consider a decomposition of the LTE into separate integrals over direct and indirect incident radiance at each vertex. (Recall the discussion of partitioning the LTE in Section 14.4.6.)

  L_o(p, ω_o) = L_e(p, ω_o) + ∫_{S²} f(p, ω_o, ω_i) L_d(p, ω_i) |cos θ_i| dω_i + ∫_{S²} f(p, ω_o, ω_i) L_i(p, ω_i) |cos θ_i| dω_i,   (16.12)

where L_d denotes direct incident radiance and L_i denotes indirect incident radiance.
The emitted term is straightforward, and direct lighting can be handled using the usual approaches from Section 14.3. The indirect term, with the integral over the indirect incident radiance L_i, is handled in one of two ways. First, a ray may be sampled from the BSDF’s sampling distribution and traced to find the next vertex in the path, where the same question of how to estimate outgoing radiance is faced—just like a path tracer. Alternatively, the current point can be saved to receive illumination from photons. The final contribution of such a point to radiance at the film plane is found by summing the product of the photons’ weights and the path throughput weight of the sequence of vertices before it in the path.
For a perfectly specular BSDF, the only reasonable choice is to trace another ray: a photon will never arrive at exactly the right direction to match the delta distribution in the BSDF. For highly glossy surfaces, it’s also advisable to trace a ray, since it will take many photons for enough to hit a narrow specular lobe to compute an accurate estimate.
For diffuse surfaces, it’s generally worth using photons, though it can be worthwhile to trace one last bounce. This approach is called final gathering; if photons are used only after a diffuse bounce, then any errors from insufficient photon density are generally less visible, though more camera paths may need to be traced to eliminate noise. (See Exercise 16.8 for further discussion of final gathering.)
With this approach, no photon storage is needed, and an arbitrary number of photons can be traced; the memory limit is instead tied to the storage required for the visible points and their reflection information. For high-resolution images or images that require many samples per pixel to resolve motion blur or depth of field, memory can still be a limitation.
Stochastic Progressive Photon Mapping
Stochastic progressive photon mapping is a modification of progressive photon mapping that doesn’t suffer from either of these memory limits. Like progressive photon mapping, it generates a set of visible points from the camera but at a low sampling rate (e.g., following just one camera path per pixel). Next, a number of photons are shot from lights, accumulating contributions at nearby visible points. This process then repeats: the visible points are discarded, a new set is generated at different positions, another round of photons is traced, and so forth.
SPPM starts with the photon estimation equation, (16.11), and makes two adjustments. First, it uses a constant kernel function; in conjunction with the fact that the estimation is over 2D (the local tangent plane around the visible point), we have

  L_o(p, ω_o) ≈ (1/(N_p π r²)) Σ_j β_j f(p, ω_o, ω_j),

where, as before, N_p is the number of photons emitted from the light sources, and π r² is the surface area of the disk-shaped constant kernel function.
The second adjustment, based on an approach first implemented in progressive photon mapping, is to progressively reduce the photon search radius as more photons contribute to the visible point. The general idea is that as more photons are found within the search radius, we have more evidence that a sufficient density of photons is arriving to estimate the incident radiance distribution well. By reducing the radius, we ensure that future photons that are used will be closer to the point and thus contribute to a more accurate estimate of the incident radiance distribution.
Reducing the radius requires an adjustment to how the reflected radiance estimate is computed, as the photons in the sum in Equation (16.11) now come from different radii. The following three update rules describe how to update the radius and the dependent variables:

  N_{i+1} = N_i + γ M_i
  r_{i+1} = r_i √( N_{i+1} / (N_i + M_i) )
  τ_{i+1} = (τ_i + Φ_i) r_{i+1}² / r_i²,   (16.13)

where N_i is the number of photons that have contributed to the point after the ith iteration, M_i is the number of photons that contributed during the current iteration, r_i is the search radius used for the ith iteration, τ_i maintains the sum of products of photon weights with BSDF values, and Φ_i is computed during the ith iteration as

  Φ_i = Σ_{j=1}^{M_i} β_j f(p, ω_o, ω_j).

The γ parameter, which is typically around 2/3, determines how quickly the contributions from photons from earlier iterations, with wider search radii, are faded out. (Hachisuka and Jensen’s original paper on SPPM (2009) uses the notation α for this quantity; we opt for γ here, having used α for other quantities already.)
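The per-pixel update rules can be sketched directly (an illustrative helper with assumed names, not the SPPMIntegrator's actual code; scalar quantities stand in for the spectral τ and Φ):

```cpp
#include <cassert>
#include <cmath>

// Sketch of the per-pixel SPPM statistics, following the three update rules:
//   N'   = N + gamma * M
//   r'   = r * sqrt(N' / (N + M))
//   tau' = (tau + phi) * r'^2 / r^2
struct SPPMPixelStats {
    double N = 0;    // accumulated photon count
    double r = 0;    // current search radius
    double tau = 0;  // running sum of photon weight times BSDF value
};

// M is the number of photons that contributed this iteration and phi their
// total unnormalized contribution; gamma (typically 2/3) controls how fast
// earlier, wider-radius contributions are faded out.
void UpdateSPPMStats(SPPMPixelStats *s, int M, double phi,
                     double gamma = 2.0 / 3.0) {
    if (M == 0) return;  // nothing contributed this iteration
    double Nnew = s->N + gamma * M;
    double rnew = s->r * std::sqrt(Nnew / (s->N + M));
    s->tau = (s->tau + phi) * (rnew * rnew) / (s->r * s->r);
    s->N = Nnew;
    s->r = rnew;
}
```

Because γ < 1, N grows more slowly than the raw photon count, so the radius shrinks steadily while τ keeps a correspondingly down-weighted share of older contributions.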
Note that the radius is a per-pixel property, not a per-visible-point property. Remarkably, a consistent estimate of the reflected radiance is computed even with this single radius shared over the entire series of visible points in the pixel. We won’t show the derivation here, but with the rules in Equation (16.13), as the number of photons traced goes to infinity, the search radius goes to zero, and the reflected radiance estimates are consistent and converge to the correct values.
Figure 16.8 shows a rendered scene with path tracing and SPPM. SPPM is much more effective at handling light that passes through the glass light fixtures than the path tracing algorithm is.
16.2.3 SPPMIntegrator
The SPPMIntegrator, implemented in the files integrators/sppm.h and integrators/sppm.cpp, implements the SPPM light transport algorithm.
The SPPMIntegrator constructor isn’t particularly interesting; it just sets various member variables with values passed in. We therefore won’t include it here but will discuss the various member variables that configure the SPPMIntegrator’s operation as they appear in the following.
The SPPMIntegrator is not a SamplerIntegrator, so it implements its own Render() method. After some initial setup has been performed, it runs a number of iterations of the SPPM algorithm, finding a set of visible points and then accumulating illumination from photons at them. Each iteration creates a new path starting from the camera in each pixel, which helps with antialiasing geometric edges and sampling motion blur and depth of field well.
The pixels array stores a SPPMPixel (to be defined shortly) for each pixel in the final image.
A user-supplied radius, initialSearchRadius, is used for r_0, the initial search radius for photons. If the supplied radius is too large, too many photons will contribute to visible points during early iterations (before the radius is automatically decreased), which may be inefficient. If it’s too small, not enough photons will be found to estimate incident radiance well. A radius corresponding to a few pixels in the final image is generally a good starting point.
The SPPMPixel structure serves three purposes. First, it stores the current estimated average radiance visible over the extent of a pixel (including the time the shutter is open and accounting for depth of field, if present). Second, it stores parameters related to the photon density estimation for the pixel (e.g., various quantities from Equation (16.13)). Finally, it stores the geometric and reflection information for a visible point in the pixel after the camera pass.
Working with weighted photons allows for two very different overall sampling approaches: on the one hand, we could try to distribute photons uniformly with a weight that approximates the irradiance on surfaces, while also accounting for discontinuities and other important geometric and lighting features. However, this type of photon distribution is challenging to realize for general input; hence we will instead strive to generate photons that have the same (or similar) weights, so that it is their varying density throughout the scene that represents the variation in illumination.
Furthermore, if the photon weights on a surface have substantial variation that is not related to the irradiance, there can be unpleasant image artifacts: if one photon takes on a much larger weight than the others, a bright circular artifact will reveal the region of the scene where that photon contributes to the interpolated radiance.
Therefore, more photons are shot from the brighter lights, so that the initial weights of photons leaving all of the lights are of similar magnitudes; the light to start each photon path from is chosen according to a PDF defined by the lights’ respective powers. It is thus a greater number of photons from the brighter lights that accounts for their greater contribution to illumination in the scene, rather than the same number of photons from all lights with larger weights for the more powerful ones.
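Choosing a light in proportion to its power amounts to inverting a discrete CDF over the lights' powers, which can be sketched as follows (illustrative code with assumed names; pbrt's actual implementation uses its Distribution1D class):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch: choose which light to start a photon path from, with probability
// proportional to each light's power. Builds a running CDF over the powers
// and inverts it with a uniform sample u in [0, 1). Returns the light index
// and, via *pdf, the discrete probability of that choice.
int SampleLightByPower(const std::vector<double> &power, double u, double *pdf) {
    double total = 0;
    for (double p : power) total += p;
    double target = u * total, cdf = 0;
    for (size_t i = 0; i < power.size(); ++i) {
        cdf += power[i];
        if (target < cdf || i + 1 == power.size()) {
            *pdf = power[i] / total;  // discrete probability of this light
            return int(i);
        }
    }
    return -1;  // unreachable for nonempty power arrays
}
```

The returned PDF is what the initial photon weight must be divided by, so photons from all lights end up with similar starting magnitudes.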
Each iteration of the SPPM algorithm traces a new path from the camera at each pixel and then collects incident photons at each path’s endpoint. Figure 16.9 shows the effect of increasing the number of iterations with SPPM. Here, we see the caustic from light passing through the glass becoming increasingly sharp. In general, more iterations improve the sampling of visible edges, motion blur, and depth of field, and, as more photons are accumulated in each pixel, the indirect lighting estimate also becomes more accurate. Note how the artifacts from under-sampling here are low-frequency blotches, a quite different visual artifact from the high-frequency noise from under-sampling seen in path tracing.
The implementation here uses a HaltonSampler to generate camera paths; doing so ensures that well-distributed samples are used over all of the iterations in the aggregate.
16.2.4 Accumulating Visible Points
Similar to the SamplerIntegrator, the SPPMIntegrator decomposes the image into tiles of 16 pixels square and parallelizes the generation of camera paths and visible points over these tiles. The number of tiles is computed in the same manner as in the fragment <<Compute number of tiles, nTiles, to use for parallel rendering>>.
Unlike the path tracer, where the BSDF can be discarded after estimating the direct illumination and sampling the outgoing direction at each vertex, here we need to store the BSDFs for the visible points until the photon pass for the current iteration is done. Therefore, the MemoryArenas used for allocating the BSDFs during camera path tracing aren’t reset at the end of loop iterations here.
Note also that we only allocate one arena per worker thread used to run the parallel for loop and use the ThreadIndex global to index into the vector, rather than allocating a separate arena for each loop iteration. In this way, we avoid the overhead of having many separate MemoryArenas while still ensuring that each arena is not used by more than one processing thread (which would lead to race conditions in the MemoryArena methods).
We also need a unique Sampler for the thread processing the tile; as before, Sampler::Clone() provides one.
Recall that in SamplerIntegrators, the sample vectors in each pixel are all requested in sequence until the last one is consumed, at which point work starts on the next pixel. In contrast, the first SPPM iteration uses the first sample vector for each pixel; later, the second iteration uses the second sample vector, and so forth. (In other words, the loop nesting has been interchanged from “for each pixel, for each sample number,” to “for each sample number, for each pixel.”) It is for just this use case that the Sampler provides the SetSampleNumber() method, which configures the sampler to provide samples from the given sample vector for the pixel.
We can now start the path from the camera, following the usual approach. As with the PathIntegrator, the beta variable holds the current path throughput weight β.
Path tracing can now proceed. As with most other Integrators, the path length is limited by a predefined maximum depth.
To find the SPPMPixel in the pixels array for the current pixel, we need to offset by the minimum pixel coordinate and convert to a linear index.
As described in Section 16.2.2, a regular direct lighting calculation is performed at each vertex of the camera path. Thus, for rays that don’t intersect any scene geometry, infinite area lights must be allowed a chance to contribute direct lighting via the Light::Le() method.
SPPMPixel::Ld records the weighted sum of emitted and reflected direct illumination for all camera path vertices for the pixel (in other words, the first two terms of Equation (16.12)). Note that these terms are also evaluated at vertices found by sampling the BSDF and tracing a ray for the third term; Ld doesn’t just store direct illumination at the first vertex. Because this sum of outgoing radiance includes contributions of all of the samples in a pixel, this value must be divided by SPPMIntegrator::nIterations to get the average direct illumination for the final pixel radiance estimate.
More commonly, the ray intersects a surface, and the fragment <<Process SPPM camera ray intersection>> executes.
First, we need the BSDF at the intersection point. Recall from Section 11.3 that a nullptr-valued BSDF * means that an intersection should be ignored, as the surface intersected is only in the scene to delineate the boundary of participating media. The SPPMIntegrator does not account for participating media; thus we simply skip over the intersection and restart the current loop iteration.
As in other integrators, emitted radiance from the surface is only included for the first intersection from the camera or after a specular bounce, where no direct lighting calculation was possible.
The implementation here creates a visible point at the first diffuse surface found or if the path is about to hit its maximum length and we have a glossy surface. As explained earlier, a visible point at a perfectly specular surface will never respond to incident photons, and a glossy surface may have high variance if the specular lobe is tight and not enough photons have arrived to represent the incident radiance distribution well.
The VisiblePoint structure records a point found along a camera path at which we’ll look for nearby photons during the photon shooting pass. It stores enough information to compute the reflected radiance due to an incident photon, which in turn is scaled by the path throughput weight of the visible point to find the overall contribution to the original image sample.
16.2.5 Visible Point Grid Construction
During the photon pass, whenever a photon intersects a surface, we need to efficiently find the visible points where the distance from the visible point to the photon’s intersection is less than the current search radius for the visible point’s pixel. The implementation here uses a uniform grid over the bounding box of all of the visible points; Exercise 16.7 at the end of the chapter has you implement other data structures in place of the grid and investigate trade-offs.
The grid is usually sparsely filled; many voxels have no visible points inside their extent. (For starters, any voxel in the grid with no surfaces in its volume will never have a visible point in it.) Therefore, rather than allocating storage for all of the voxels, the grid is represented by a hash table where a hash function transforms 3D voxel coordinates to an index into the grid array.
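A hash function of this kind can be sketched as follows (illustrative only; the multipliers are arbitrary large primes, and pbrt's actual hash() may differ):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a hash function mapping 3D voxel coordinates to an index into a
// hash-table-backed grid with hashSize entries. XORing the coordinates each
// multiplied by a large prime scatters nearby voxels across the table.
inline unsigned int VoxelHash(int x, int y, int z, int hashSize) {
    uint64_t h = (uint64_t(x) * 73856093u) ^ (uint64_t(y) * 19349663u) ^
                 (uint64_t(z) * 83492791u);
    return unsigned(h % uint64_t(hashSize));
}
```

Collisions (distinct voxels hashing to the same index) are harmless here: a visible point stored under a colliding index is simply tested against a few extra photons, at some cost in efficiency but not correctness.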
In the following, we’ll construct the grid in parallel using multiple threads. These threads will use atomic operations to update the grid, so a std::atomic is used for the grid voxel element type here.
Each grid cell stores a linked list, where each node points to an SPPMPixel whose visible point’s search volume overlaps the grid cell. A visible point may overlap multiple grid cells, so it’s desirable to keep the node representation compact by only storing a pointer to the SPPMPixel rather than making a copy for each cell it overlaps.
If there’s no visible point for the pixel for the current iteration, the pixel’s visible point will have a path throughput weight of zero, and no attempt should be made to place a visible point in the grid for that pixel. This case can happen if the path from the camera leaves the scene or is terminated before intersecting a diffuse surface.
Otherwise, the implementation here computes the bounding box centered at the visible point with extent ±r_i, the current photon search radius for the pixel. In turn, when we later have a photon intersection point, we will only need to consider the visible points for the grid cell that the photon is in to find the visible points that the photon may contribute to (Figure 16.10). Because different visible points will have different search radii depending on how many photons have contributed to their pixel so far, it would otherwise be unwieldy to find the potentially relevant visible points for a photon if the visible points were stored without accounting for the volume of space they will accept photons from.
Given the overall bound of the grid, we need to decide how large the voxels should be and thus how finely to subdivide space. On one hand, if the voxels are too large, the photon shooting pass will be inefficient, as each photon will have to check many visible points to see if it contributes to each one. If they’re too small, then each visible point will overlap many voxels, and the memory needed to represent the grid will be excessive.
Here, we compute an initial resolution such that the voxel width in the largest grid dimension is roughly equal to the largest current search radius of all of the visible points. This limits the maximum number of voxels any visible point can overlap. In turn, resolutions for the other two dimensions of the grid are set so that voxels have roughly the same width in all dimensions.
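A sketch of this resolution computation, assuming diag is the diagonal vector of the grid bounds and maxRadius is the largest current search radius (Vector3f and the helper name are simplified stand-ins):

```cpp
#include <algorithm>

struct Vector3f { float x, y, z; };

// Choose the grid resolution so that a voxel's width along the largest
// axis is roughly maxRadius, and voxels are near-cubical overall.
void ComputeGridRes(const Vector3f &diag, float maxRadius, int gridRes[3]) {
    float maxDiag = std::max({ diag.x, diag.y, diag.z });
    int baseGridRes = (int)(maxDiag / maxRadius);  // voxels along largest axis
    float d[3] = { diag.x, diag.y, diag.z };
    for (int i = 0; i < 3; ++i)
        gridRes[i] = std::max((int)(baseGridRes * d[i] / maxDiag), 1);
}
```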
The visible points can now be added to the grid. Because both the grid and the BSDFs for visible points from the camera path must be kept around until the end of the photon tracing pass, we can reuse the per-thread MemoryArenas that were created earlier for the BSDFs to allocate SPPMPixelListNodes.
Each visible point is added to all of the grid cells that its bounding box overlaps.
ToGrid() returns the coordinates of the voxel in the grid that the given point lies in. The Boolean return value indicates whether the point p is inside the grid’s bounds; if it isn’t, the returned coordinates pi are clamped to be inside the range of valid coordinates.
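A minimal sketch of this mapping, with simplified Point3f and Bounds3f types standing in for pbrt's:

```cpp
#include <algorithm>

struct Point3f { float x, y, z; };
struct Bounds3f { Point3f pMin, pMax; };

// Map p to voxel coordinates pi; returns false (with pi clamped to the
// valid range) if p lies outside the grid's bounds.
bool ToGrid(const Point3f &p, const Bounds3f &b, const int gridRes[3], int pi[3]) {
    bool inBounds = true;
    // Offset of p within the bounds, in [0,1] per axis when inside.
    float pg[3] = { (p.x - b.pMin.x) / (b.pMax.x - b.pMin.x),
                    (p.y - b.pMin.y) / (b.pMax.y - b.pMin.y),
                    (p.z - b.pMin.z) / (b.pMax.z - b.pMin.z) };
    for (int i = 0; i < 3; ++i) {
        pi[i] = (int)(gridRes[i] * pg[i]);
        inBounds &= (pi[i] >= 0 && pi[i] < gridRes[i]);
        pi[i] = std::clamp(pi[i], 0, gridRes[i] - 1);
    }
    return inBounds;
}
```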
hash() hashes the coordinates of the voxel, returning an index into the grid array defined earlier; it is a straightforward hash function, and its implementation isn’t included here. A new SPPMPixelListNode is allocated and the task now is to add this list node to the head of the linked list in grid[h].
Given the grid index, atomic operations can be used to allow multiple threads to add visible points to the grid concurrently without needing to hold any locks. (See Appendix A.6.2 for further discussion of atomic operations.) In the absence of concurrency, we’d just set node->next to point to the head of the list in grid[h] and then assign node to grid[h]. That approach will not work if multiple threads are updating the lists concurrently: the value stored in node->next may become stale if another thread modifies grid[h] between the time the current thread initializes node->next and the time it tries to assign node to grid[h].
The compare_exchange_weak() method of std::atomic addresses this issue: the first parameter is the value that we expect grid[h] to have, and the second gives the value that we’d like to set it to. It performs the assignment only if our expectation was correct. Thus, in the common case, node->next and grid[h] have the same pointer value, the assignment occurs, and true is returned. The node has been added to the list.
If the pointer stored in grid[h] has been changed by another thread, then compare_exchange_weak() actually updates the first parameter, node->next, with the current value of grid[h] before returning false. We are thus all set to try the atomic compare and exchange again, as node->next points to the new head of the list. This process continues until the assignment is successful.
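This retry loop can be sketched as follows, with a simplified SPPMPixelListNode and a hypothetical PushNode helper standing in for the fragment's inline code:

```cpp
#include <atomic>

struct SPPMPixelListNode {
    void *pixel = nullptr;               // stands in for SPPMPixel *
    SPPMPixelListNode *next = nullptr;
};

// Lock-free push onto the head of the list: compare_exchange_weak()
// installs node only if head still equals node->next; on failure it
// refreshes node->next with the current head and we try again.
void PushNode(std::atomic<SPPMPixelListNode *> &head, SPPMPixelListNode *node) {
    node->next = head.load();
    while (!head.compare_exchange_weak(node->next, node))
        ;
}
```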
16.2.6 Accumulating Photon Contributions
Given the grid of visible points, the SPPMIntegrator can now follow photon paths through the scene. The total of photonsPerIteration photons to be traced for the current iteration are traced using multiple threads. A separate MemoryArena is available for each worker thread; this arena is reset after each photon path, so a new pool of per-thread arenas is allocated rather than reusing the one used for BSDFs and grid linked list nodes.
There’s a balance to strike in choosing the number of photons to trace for each SPPM iteration: too many, and pixels’ radii won’t have a chance to decrease as more photons arrive and too many too-far-away photons are used. Too few, and the overhead of finding visible points and making the grid of them won’t be amortized over enough photons. In practice, a few hundred thousand to a few million per iteration generally works well (see Figure 16.11).
A Halton sequence provides a set of well-distributed sample points for all of the photon paths over all of the iterations. haltonIndex records the index of the Halton sequence (corresponding to the index a in Equation (7.7)) for the current photon; it can also be seen as a global index of photons traced. In other words, it starts at 0 for the first photon and passes through all subsequent integer values for following photons. It’s important to use a 64-bit integer for this value, since an unsigned 32-bit integer would overflow after roughly 4 billion photons; many more photons may be needed for high-quality images.
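The global indexing scheme can be sketched as a small helper (a hypothetical name; pbrt computes this inline from the iteration number and the per-iteration photon index):

```cpp
#include <cstdint>

// Assign each photon a unique 64-bit index into the Halton sequence
// across all iterations; 32 bits would overflow after ~4 billion photons.
std::int64_t GlobalPhotonIndex(std::int64_t iteration, int photonsPerIteration,
                               int photonIndex) {
    return iteration * (std::int64_t)photonsPerIteration + photonIndex;
}
```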
The dimension of the sample, which determines the prime base used in Equation (7.7), is maintained in haltonDim.
Which light to start the path from is determined by sampling from the PDF based on light power computed previously. The first dimension of the Halton sequence is used for the sample value.
The next five dimensions of the sample from the Halton sequence are used for the sample values used to generate the ray leaving the light source.
After the light has been chosen, its Sample_Le() method is used to sample an outgoing ray. Given a light, a ray from the light source is sampled and its β value is initialized based on Equation (16.8):
where p(light) is the probability of sampling this particular light and p(pos) p(dir) is the product of the area and directional densities for sampling this particular ray leaving the light. Intersecting this ray against the scene geometry to obtain the first path vertex also samples part of the geometric term G, except for a cosine factor that must be explicitly incorporated into the particle weight β.
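As a sketch of this initialization, with the Spectrum reduced to a single float and hypothetical parameter names for the sampled emitted radiance and the three densities:

```cpp
// beta = |cos theta| Le / (p(light) p(pos) p(dir)), per Equation (16.8).
// Returns zero if any of the densities is zero (an invalid sample).
float InitialBeta(float Le, float absCosTheta, float lightPdf,
                  float pdfPos, float pdfDir) {
    if (lightPdf == 0 || pdfPos == 0 || pdfDir == 0) return 0;
    return absCosTheta * Le / (lightPdf * pdfPos * pdfDir);
}
```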
Now the integrator can start following the path through the scene, updating β after each scattering event. The photon makes no contribution at the first intersection found after it left the light source, since that intersection represents direct illumination, which was already accounted for when tracing paths starting from the camera. For subsequent intersections, illumination is contributed to nearby visible points.
Given a photon intersection, ToGrid()’s return value indicates if it’s within the extent of the grid. If it isn’t, then by construction, none of the visible points is interested in this photon’s contribution. Otherwise, the visible points in the grid cell all need to be checked to see if the photon is within their radius.
Recall that grid stores std::atomic pointers to SPPMPixelListNodes. Normally, reading from a std::atomic value means that the compiler must be careful not to reorder instructions that read or write memory around the read of the value of grid[h]; this constraint is necessary so that lock-free algorithms will work as expected. In this case, the grid has been constructed and no other threads are concurrently modifying it. Therefore, it’s worthwhile to use the std::atomic load() method, passing it the “relaxed” memory order, which doesn’t impose these constraints, to read the initial grid pointer. This approach has a significant performance benefit: for a simple scene of a few hundred triangles (where not too much time is spent tracing rays), the photon pass runs in 20% less time using this memory model on a 2015-era CPU.
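A sketch of such a traversal, using a simplified node type and a hypothetical CountNodes function to illustrate the relaxed load:

```cpp
#include <atomic>

struct Node { int value; Node *next; };

// Walk a grid cell's list. The relaxed load is safe here because the
// grid is immutable during the photon pass; no synchronization with
// writers is needed.
int CountNodes(const std::atomic<Node *> &head) {
    int n = 0;
    for (Node *node = head.load(std::memory_order_relaxed);
         node != nullptr; node = node->next)
        ++n;
    return n;
}
```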
Given a photon contribution, we need to update the sums for the pixel’s scattered radiance estimate from Equation (16.14). The total number of contributing photons in this pass is stored in M, and the sum of the product of BSDF values with particle weights is stored in Phi.
Each pixel’s M and Phi values are stored using atomic variables, which in turn allows multiple threads to safely concurrently update their values. Because pbrt’s Spectrum class doesn’t allow atomic updates, Phi is instead represented with an array of AtomicFloat coefficients for each spectral sample. This representation will require some manual copying of values to and from Phi to Spectrum-typed variables in the following, but we think that this small awkwardness is preferable to the complexity of, for example, a new AtomicSpectrum type.
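The spirit of an atomic floating-point add can be sketched with std::atomic&lt;float&gt; and a compare-and-swap loop (pbrt's actual AtomicFloat operates on the bit pattern of the float; this is a simplified analogue):

```cpp
#include <atomic>

// Atomically add delta to v: read the current value, then repeatedly
// attempt to install old + delta. On failure, compare_exchange_weak()
// refreshes old with the current value and we retry.
void AtomicAdd(std::atomic<float> &v, float delta) {
    float old = v.load();
    while (!v.compare_exchange_weak(old, old + delta))
        ;
}
```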
Having recorded the photon’s contribution, the integrator needs to choose a new outgoing direction from the intersection point and update the β value to account for the effect of scattering. Equation (16.7) shows how to incrementally update the particle weight after a scattering event: given some weight β_{i,j} that represents the weight for the jth intersection of the ith particle history, after a scattering event where a new vertex p_{j+1} has been sampled, the weight should be set to

β_{i,j+1} = ( f(p_{j+1} → p_j → p_{j−1}) G(p_j ↔ p_{j+1}) / p(p_{j+1}) ) β_{i,j}.
As with the path-tracing integrator, there are a number of reasons to choose the next vertex in the path by sampling the BSDF’s distribution at the intersection point to get a direction and tracing a ray in that direction rather than directly sampling by area on the scene surfaces. Therefore, we again apply the Jacobian that accounts for this change from the area measure to the solid angle measure; all of the terms in the geometric term G except for a single |cos θ| cancel out, and the expression is

β_{i,j+1} = ( f(p, ω_o, ω_i) |cos θ_i| / p(ω_i) ) β_{i,j}.
As before, a nullptr-valued BSDF * indicates an intersection that should be ignored.
Sampling the BSDF to find the scattered photon direction follows the usual model.
The next two dimensions of the Halton sample vector are used for the BSDF sample.
The photon scattering step should be implemented carefully in order to keep the photon weights as similar to each other as possible. A method that gives a distribution of photons where all have exactly equal weights was suggested by Jensen (2001, Section 5.2). First, the reflectance is computed at the intersection point. A random decision is then made whether or not to continue the photon’s path with probability proportional to this reflectance. If the photon continues, its scattered direction is found by sampling from the BSDF’s distribution, but it continues with its weight unchanged except for adjusting the spectral distribution based on the surface’s color. Thus, a surface that reflects very little light will reflect few of the photons that reach it, but those that are scattered will continue on with unchanged contributions and so forth.
This particular approach isn’t possible in pbrt due to a subtle implementation detail (a similar issue held for light source sampling previously as well): in pbrt, the BxDF interfaces are written so that the distribution used for importance sampling BSDFs doesn’t necessarily have to perfectly match the actual distribution of the function being sampled. It is all the better if it does, but for many complex BSDFs, sampling exactly from their distributions is difficult or impossible.
Therefore, here we will use an approach that generally leads to a similar result but offers more flexibility: at each intersection point, an outgoing direction is sampled with the BSDF’s sampling distribution, and the photon’s updated weight betaNew is computed using Equation (16.16). Then the ratio of the luminance of betaNew to the luminance of the photon’s old weight is used to set the probability of continuing the path after applying Russian roulette.
The termination probability is thus set so that if the photon’s weight is significantly decreased at the scattering point, the termination probability will be high and if the photon’s weight is essentially unchanged, the termination probability is low. In particular, the termination probability is chosen in a way such that if the photon continues, after its weight has been adjusted for the possibility of termination, its luminance will be the same as it was before scattering. It is easy to verify this property from the fragment below. (This property actually doesn’t hold for the case where q = 0, as can happen when the ratio of the BSDF’s value and the PDF is greater than 1.)
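A sketch of this Russian roulette step, with luminance reduced to a single float and a hypothetical PhotonRoulette helper (u is a uniform sample in [0,1)):

```cpp
#include <algorithm>

// Terminate the photon with probability q = max(0, 1 - betaNew/beta);
// a survivor is rescaled by 1/(1-q) so that (when q > 0) its luminance
// matches the pre-scattering weight. Returns true if the photon survives;
// beta is updated in place.
bool PhotonRoulette(float &beta, float betaNew, float u) {
    float q = std::max(0.0f, 1.0f - betaNew / beta);
    if (u < q) return false;      // terminated; contribution handled by survivors
    beta = betaNew / (1.0f - q);
    return true;
}
```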
After all of the photons for the iteration have been traced, the estimate of the incident radiance visible in each pixel area can now be updated based on contributions from photons in the current pass.
Equation (16.13) gives the rules to update the search radius and other quantities related to the photon estimate.
Note that the number of photons N that have contributed to the pixel is actually stored as a Float. This quantity must be treated as continuously valued, not a discrete integer, for the progressive radiance estimate to converge to the correct value in the limit.
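The update rules can be sketched as follows, assuming the usual SPPM parameter γ = 2/3 and reducing the spectral quantities Φ and τ to single floats (a hypothetical helper, not pbrt's code):

```cpp
#include <cmath>

// SPPM per-pixel update after a photon pass (Equation (16.13)): grow the
// continuous photon count N by gamma * M, shrink the search radius r
// accordingly, and rescale the accumulated flux tau by the radius ratio.
void UpdatePixel(float &N, float &r, float &tau, int M, float Phi) {
    if (M == 0) return;  // no photons contributed this iteration
    const float gamma = 2.0f / 3.0f;
    float Nnew = N + gamma * M;
    float rNew = r * std::sqrt(Nnew / (N + M));
    tau = (tau + Phi) * (rNew * rNew) / (r * r);
    N = Nnew;
    r = rNew;
}
```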
Before the next SPPM iteration begins, it’s necessary to zero out the visible point in the pixel so that we don’t attempt to reuse it if no visible point and BSDF * are found in the next iteration.
Most of the <<Periodically store SPPM image in film and write image>> fragment is a straightforward matter of allocating an image of Spectrum values to pass to Film::SetImage() and then initializing the pixels in the image before calling Film::WriteImage(). We won’t include that boilerplate here, in order to focus on the last step of the SPPM algorithm, which combines the direct and indirect radiance estimates.
As described earlier, the direct lighting estimate needs to be divided by the number of pixel samples (which in turn is how many iterations have completed at this point) to get its average value. The indirect photon term is computed using Equation (16.11)—the two values then just need to be added together.
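This combination can be sketched as follows, with spectra reduced to single floats and a hypothetical FinalRadiance helper; Np is the total number of photons traced over all iterations:

```cpp
// Average the direct term over the number of completed iterations and add
// the photon estimate of Equation (16.11): tau / (Np * pi * r^2).
float FinalRadiance(float Ld, float tau, float r, int iterations,
                    int photonsPerIteration) {
    const double Pi = 3.14159265358979323846;
    double Np = (double)iterations * (double)photonsPerIteration;
    return Ld / iterations + (float)(tau / (Np * Pi * r * r));
}
```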