② What kinds of scenes are worst-case scenarios for the
two acceleration structures in pbrt? (Consider specific geometric
configurations that the approaches will respectively be unable to handle
well.) Construct scenes with these characteristics, and measure the
performance of pbrt as you add more primitives. How does the worst case
for one behave when rendered with the other?
② Implement a hierarchical grid accelerator where you
refine cells that have an excessive number of primitives overlapping them
to instead hold a finer subgrid to store its geometry. (See, for example,
Jevans and Wyvill (1989) for one approach to this problem and Ize et al. (2007) for effective methods for deciding when refinement is
worthwhile.) Compare both accelerator construction performance and
rendering performance to a non-hierarchical grid as well as to the
accelerators in this chapter.
② Implement smarter overlap tests for building accelerators. Using
objects’
bounding boxes to determine which sides of a
kd-tree split they overlap can hurt performance
by causing unnecessary intersection tests. Therefore, add a bool
Shape::Overlaps(const Bounds3f &) const method to the shape interface that
takes a world space bounding box and determines if the shape truly overlaps
the given bound.
A default implementation could get the world bound from
the shape and use that for the test, and specialized versions could be
written for frequently used shapes. Implement this method for
Spheres and Triangles, and modify KdTreeAccel to call
it. You may find it helpful to read Akenine-Möller’s paper
(2001) on fast triangle-box overlap testing.
Measure the change in pbrt’s overall performance caused by this change,
separately accounting for increased time spent building the acceleration
structure and reduction in ray–object intersection time due to fewer
intersections. For a variety of scenes, determine how many fewer
intersection tests are performed thanks to this improvement.
② Implement “split clipping” in pbrt’s BVH
implementation. Read the papers by Ernst and Greiner (2007), Dammertz and Keller (2008a), Stich et al. (2009), and Karras and Aila (2013), and
implement one of their approaches to subdivide primitives with large
bounding boxes relative to their surface area into multiple subprimitives
for tree construction. (Doing so will probably require modification to the
Shape interface; you will probably want to design a new interface
that allows some shapes to indicate that they are unable to subdivide
themselves, so that you only need to implement this method for triangles,
for example.) Measure the improvement for rendering actual scenes; a
compelling way to gather this data is to do the experiment that Dammertz
and Keller did, where a scene is rotated around an axis over progressive
frames of an animation. Typically, many triangles that are originally axis
aligned will have very loose bounding boxes as they rotate more, leading to
a substantial performance degradation if split clipping isn’t used.
② The 30-bit Morton codes used for the HLBVH construction
algorithm in the BVHAccel may be insufficient for large scenes (note
that they can only represent steps in each dimension.)
Modify the BVHAccel to use 64-bit integers with 63-bit Morton codes
for HLBVHs. Compare the performance of your approach
to the original one with a variety of scenes. Are there scenes where performance
is substantially improved? Are there any where there is a loss of
performance?
② Investigate alternative SAH cost functions for building
BVHs or kd-trees. How much can a poor cost function hurt its performance?
How much improvement can be had compared to the current one? (See the
discussion in the “Further Reading” section for ideas about how the SAH
may be improved.)
③ Construction time for the BVHAccel and particularly
the KdTreeAccel can be a meaningful portion of overall rendering
time yet, other than HLBVHs, the implementations in this chapter do not
parallelize building the acceleration structures. Investigate techniques
for parallel construction of accelerators such as described by Wald (2007)
and Shevtsov et al. (2007b), and implement one of them in pbrt. How much of
a speedup do you achieve in accelerator construction? How does the speedup
scale with additional processors? Measure how much of a speedup your
changes translate to for overall rendering. For what types of scenes does
your implementation have the greatest impact?
③ The idea of using spatial data structures for ray
intersection acceleration can be generalized to include spatial data
structures that themselves hold other spatial data structures rather than
just primitives. Not only could we have a grid that has subgrids inside
the grid cells that have many primitives in them, but we could also have
the scene organized into a hierarchical bounding volume where the leaf
nodes are grids that hold smaller collections of spatially nearby
primitives. Such hybrid techniques can bring the best of a variety of
spatial data structure-based ray intersection acceleration methods. In
pbrt, because both geometric primitives and intersection accelerators
inherit from the Primitive base class and thus provide the same
interface, it’s easy to mix and match in this way.
Modify pbrt to build hybrid acceleration structures—for example, using
a BVH to coarsely sort the scene geometry and then uniform grids at the
leaves of the tree to manage dense, spatially local collections of
geometry. Measure the running time and memory use for rendering schemes
with this method compared to the current accelerators.
② Eisemann et al. (2007) described an even more efficient
ray–box intersection test than is used in the BVHAccel. It does
more computation at the start for each ray but makes up for this work with
fewer computations to do tests for individual bounding boxes. Implement
their method in pbrt, and measure the change in rendering time for a
variety of scenes. Are there simple scenes where the additional upfront
work doesn’t pay off? How does the improvement for highly complex scenes
compare to the improvement for simpler scenes?
② Read the paper by Segovia and Ernst (2010) on
memory-efficient BVHs, and implement their approach in pbrt. How does
memory usage with their approach compare to that for the BVHAccel? Compare
rendering performance with your approach to pbrt’s current
performance. Discuss how your results compare to the results reported in
their paper.
② Modify pbrt to use the “mailboxing” optimization in
the KdTreeAccel to avoid repeated intersections with primitives that
overlap multiple kd-tree nodes. Given that pbrt is multi-threaded, you
will probably do best to consider either the hashed mailboxing approach
suggested by Benthin (2006) or the inverse mailboxing algorithm of Shevtsov et al. (2007a). Measure the performance change compared to the current
implementation for a variety of scenes. How does the change in running
time relate to changes in reported statistics about the number of
ray–primitive intersection tests?
③ It is often possible to introduce some approximation into
the computation of shadows from very complex geometry (consider, e.g.,
the branches and leaves of a tree casting a shadow).
Lacewell et al. (2008) suggested augmenting the acceleration structure with a
prefiltered directionally varying representation of occlusion for regions
of space. As shadow rays pass through these regions, an approximate
visibility probability can be returned rather than a binary result, and the
cost of tree traversal and object intersection tests is reduced. Implement
such an approach in pbrt, and
compare its performance to the current implementation. Do you see any
changes in rendered images?