Implementation of VCM/UPS in Lightmetrica: Part 2

Posted on Dec 18, 2016
Development Lightmetrica

(This article is written for Ray-tracing Advent Calendar 2016)

Introduction of Part 2

Continuing the previous article, in this article we will introduce the implementation part of VCM/UPS in Lightmetrica. Our code resides in src/liblightmetrica/renderer/renderer_vcm.cpp (link) relative to the root directory of the repository. We can refer this renderer from the configuration file with renderer::vcm. From now on we will explain the internal detail of the code. For the sake of explanation we omit some unnecessary code for debugging, so the code snippets in this article are not exactly same as the original code. We sometimes refer to the numbering of the equation in the previous articles, so it would be helpful to peek the previous article in the same time.

VCM/UPS in Lightmetrica

Code Structure

The structure of the code follows basic interface of renderers in Lightmetrica; Renderer_VCM class inherited from Renderer class implements Renderer::Initialize and Renderer::Render interfaces.

 1 class Renderer_VCM final : public Renderer
 2 {
 3 public:
 4 
 5     LM_IMPL_CLASS(Renderer_VCM, Renderer);
 6 
 7 private:
 8 
 9     // Member variables
10 
11 private:
12 
13     LM_IMPL_F(Initialize) = [this](const PropertyNode* p) -> bool
14     {
15         // 1. Initialization
16     };
17 
18     LM_IMPL_F(Render) = [this](const Scene* scene, Random* initRng, const std::string& outputPath) -> void
19     {
20         // 2. Rendering
21     };
22 
23 };

1. Initialization

Renderer::Initialize loads parameters of the renderer. max_num_vertices / min_num_vertices controls the number of path vertices (= path length + 1) to render. This parameter is useful for debugging, e.g., focusing on specific lighting effect only seem in the specific path length. Our implementation based on the progressive photon density estimation by Knaus and Zwicker [2011]. Therefore the quality of the rendering is controlled by the num_iteration_pass parameter, which specifies the number of iteration in rendering. One iteration contains photon tracing and ray tracing parts. num_photon_trace_samples and num_eye_trace_samples controls the granularity of these process. initial_radius specifies the initial merge radius and alpha controls its convergence ratio. As well as the above parameters, the mode parameter can switch several fall-back implementations for BDPT or BDPM. This parameter is mainly for debugging purpose. In this article, we fixed it to vcm. The loaded parameters are stored in the class variables like numIterationPass_ which is referenced in the implementation.

2. Rendering

Renderer::Initialize implements the core logic of the renderer. The iteration loop is composed of four parts.

1 for (long long pass = 0; pass < numIterationPass_; pass++)
2 {
3     // 2-1. Update merge radius
4     // 2-2. Sample light subpaths
5     // 2-3. Construct range query structure
6     // 2-4. Estimate contribution
7 }

2-1. Update merge radius

Starting from the value specified by initial_radius, a merge radius used for the range query is updated according to the formulation in Knaus and Zwicker [2011]:

1 mergeRadius = pass == 0
2     ? initialRadius_
3     : std::sqrt((alpha_ + pass) / (1_f + pass)) * mergeRadius;

2-2. Sample light subpaths

In this part we sample light subpaths by the number specified by num_photon_trace_samples. A subpath is defined as a structure VCMSubpath and sampled by VCMSubpath::SampleSubpath function. The structure contains a vector of VCMPathVertex which contains a path vertex type (VCMPathVertex::type) represented by a bitwised flag defined with SurfaceInteractionType, the surface geometry information such as the position or the normal (VCMPathVertex::geom), and the reference to the primitive (VCMPathVertex::primitive).

VCMSubpath::SampleSubpath deligates the process to PathSamplerUtils::TraceSubpath function, which is a general subpath sampler in our framework. This function calls the given callback function once a new path vertex is generated. Here we just copy the information and push to our vertex structure (line 16-26). Note that we ignores throughput argument which indicates a product of contributions of the sampled subpath so far. This is because of our design choice for our implementation; avoid optimizations.

 1 struct VCMPathVertex
 2 {
 3     int type;
 4     SurfaceGeometry geom;
 5     const Primitive* primitive = nullptr;
 6 };
 7 
 8 struct VCMSubpath
 9 {
10     std::vector<VCMPathVertex> vertices;
11     auto SampleSubpath(const Scene* scene, Random* rng, TransportDirection transDir, int maxNumVertices) -> void
12     {
13         vertices.clear();
14         PathSamplerUtils::TraceSubpath(
15             scene, rng, maxNumVertices, transDir,
16             [&](int numVertices, const Vec2& rasterPos,
17                 const PathSamplerUtils::PathVertex& pv,
18                 const PathSamplerUtils::PathVertex& v, SPD& throughput) -> bool
19             {
20                 VCMPathVertex v_;
21                 v_.type = v.type;
22                 v_.geom = v.geom;
23                 v_.primitive = v.primitive;
24                 vertices.emplace_back(v_);
25                 return true;
26             });
27     }
28 };

2-3. Construct range query structure

Based on the light subpaths generated in 2-2, we construct the range query structure. This structure accelerates the query to obtain a light subpath from a position in the scene surfaces. A light subpath is selected if one of its vertices is within the range of given kernel size from the position. We used kd-tree for the structure implemented as VCMKdTree class.

1 VCMKdTree pm(subpathLs);
2 pm.Build();

2-4. Estimate contribution

Next we trace eye subpaths and estimate intensities contributed to pixels. We used a standard parallelization function Parallel::For in Lightmetrica, which here executes numEyeTraceSamples_ iterations in parallel. Context defines thread-specific storage. The random number generator (Random) must be assigned for each thread because it maintains its own internal states. Context::film accumulates the estimated contribution. Although the most of paths contributes to one pixel, for simplicity of code, we choose a design to accumulate all contribution to one film instead of separating two films as said in Veach’s paper [1997].

After the rendering completes, the films for each of threads are accumulated into one. Here we compute the average of films in numIterationPass_ iterations in the progressive way (line 19-24).

 1 struct Context
 2 {
 3     Random rng;
 4     Film::UniquePtr film{nullptr, nullptr};
 5 };
 6 std::vector<Context> contexts(Parallel::GetNumThreads());
 7 for (auto& ctx : contexts)
 8 {
 9     ctx.rng.SetSeed(initRng->NextUInt());
10     ctx.film = ComponentFactory::Clone<Film>(film);
11     ctx.film->Clear();
12 }
13 Parallel::For(numEyeTraceSamples_, [&](long long index, int threadid, bool init)
14 {
15     // 2-4-1. Sample subpaths
16     // 2-4-2. Estimate contribution by vertex connection
17     // 2-4-3. Estimate contribution by vertex merging
18 });
19 film->Rescale((Float)(pass) / (1_f + pass));
20 for (auto& ctx : contexts)
21 {
22     ctx.film->Rescale(1_f / (1_f + pass));
23     film->Accumulate(ctx.film.get());
24 }

2-4-1. Sample subpaths

First we sample subpaths with both directions. These subpaths are later used for the vertex connection and the vertex merging. Note that specifically for the vertex connection, we can reuse previously sampled light subpaths in 2-2 instead of sampling a new one. This, however, needs to compensate correlation among the reused paths and it is dependent on the number of light subpaths sampled for each iteration. Following the policy of not doing optimizations, here we did not adopt that way.

1 VCMSubpath subpathE;
2 VCMSubpath subpathL;
3 subpathE.SampleSubpath(scene, &ctx.rng, TransportDirection::EL, maxNumVertices_);
4 subpathL.SampleSubpath(scene, &ctx.rng, TransportDirection::LE, maxNumVertices_);

2-4-2. Estimate contribution by vertex connection

In this part we estimate contribution by vertex connection ($\langle I \rangle_{VC}$ in Eq.8). The basic structure follows double loop for s and t, each corresponds to the number of vertices in light and eye subpaths. The ranges of these values are clipped so that the number of vertices in the connected path is in the range of max(2, minNumVertices_) and min(nL+nE, maxNumVertices_), where nE and nL is the number of vertices in the sampled subpaths.

 1 const int nE = (int)(subpathE.vertices.size());
 2 for (int t = 1; t <= nE; t++)
 3 {
 4     const int nL = (int)(subpathL.vertices.size());
 5     const int minS = Math::Max(0, Math::Max(2 - t, minNumVertices_ - t));
 6     const int maxS = Math::Min(nL, maxNumVertices_ - t);
 7     for (int s = minS; s <= maxS; s++)
 8     {
 9         // 2-4-2-1. Connect vertices and create a full path
10         // 2-4-2-2. Evaluate contribution
11         // 2-4-2-3. Evaluate path PDF
12         // 2-4-2-4. Evaluate MIS weight
13         // 2-4-2-5. Accumulate contribution
14     }
15 }

2-4-2-1. Connect vertices and create a full path

After fixing t and s, we connect two subpaths and construct a full path. The actual logic for path connection is implemented in VCMPath::ConnectSubpath function. For both-positive s and t, the connection process needs to check visibility between s-th vertex in the light subpath and t-th vertex in the eye subpath. Scene::Visible function checks this (line 26). We do not need this visibility query if one of s or t is zero, indicating a path sampling strategy that directly hit emitters. For instance, s=0 corresponds to path tracing. In this case, we only need to check the last vertex in the corresponding direction is on an emitter. If the path connection fails, in both cases the evaluation process is terminated because the resulting paths do not carry any contribution.

1 VCMPath fullpath;
2 if (!fullpath.ConnectSubpaths(scene, subpathL, subpathE, s, t)) { continue; }

 1 struct VCMPath
 2 {
 3     ...
 4     auto ConnectSubpaths(const Scene* scene, const VCMSubpath& subpathL, const VCMSubpath& subpathE, int s, int t) -> bool
 5     {
 6         assert(s >= 0);
 7         assert(t >= 0);
 8         vertices.clear();
 9         if (s == 0 && t > 0)
10         {
11             vertices.insert(vertices.end(), subpathE.vertices.rend() - t, subpathE.vertices.rend());
12             if ((vertices.front().primitive->Type() & SurfaceInteractionType::L) == 0) { return false; }
13             vertices.front().type = SurfaceInteractionType::L;
14         }
15         else if (s > 0 && t == 0)
16         {
17             vertices.insert(vertices.end(), subpathL.vertices.begin(), subpathL.vertices.begin() + s);
18             if ((vertices.back().primitive->Type() & SurfaceInteractionType::E) == 0) { return false; }
19             vertices.back().type = SurfaceInteractionType::E;
20         }
21         else
22         {
23             const auto& vL = subpathL.vertices[s - 1];
24             const auto& vE = subpathE.vertices[t - 1];
25             if (vL.geom.infinite || vE.geom.infinite) { return false; }
26             if (!scene->Visible(vL.geom.p, vE.geom.p)) { return false; }
27             vertices.insert(vertices.end(), subpathL.vertices.begin(), subpathL.vertices.begin() + s);
28             vertices.insert(vertices.end(), subpathE.vertices.rend() - t, subpathE.vertices.rend());
29         }
30         return true;
31     }
32     ...
33 };

2-4-2-2. Evaluate contribution

Here we evaluate the measurement contribution ($f$ in Eq.1) for the connected (full) path. This logic is implemented in VCMPath::EvaluateF function. This function takes arguments s and merge. Some might be temped to evaluate this function always with some fixed strategy (e.g., s=0) irrespective to the actual sampling strategies, but we cannot because of the following reason.

First, our code of the BSDF evaluation (BSDF::EvaluateDirection) compensates for non-symmetries for shading normals and specular refractions [Veach 1997]. This makes the measurement contribution function to be dependent on the sampling strategy. This means we must match the strategy of evaluating the same strategy with which the path is sampled.

Second, in the case that a path contains vertices on specular surfaces, we implicitly handles delta functions in the BSDF evaluations. In the sequence of the evaluation of contribution, the delta functions will be canceled out except for the BSDF evaluations related to connecting vertices. In order to compensate for this oddity, our implementation introduces special flag to ignore delta functions when evaluating BSDFs, which again makes it dependent on the sampling strategy.

2-4-2-2-1 and 2-4-2-2-2 computes sequence of $f_s\cdot G$ from the edges of E and L respectably. 2-4-2-2-3 takes care of the remaining terms. Note that we omits the visibility check if the merge flag is enabled.

1 const auto f = fullpath.EvaluateF(s, false);
2 if (f.Black()) { continue; }

 1 struct VCMPath
 2 {
 3     ...
 4     auto EvaluateF(int s, bool merge) const -> SPD
 5     {
 6         const int n = (int)(vertices.size());
 7         const int t = n - s;
 8         assert(n >= 2);
 9 
10         // 2-4-2-2-1. Evaluate product of f_s from L
11         // 2-4-2-2-2. Evaluate product of f_s from E
12         // 2-4-2-2-3. Evaluate c_{s,t}
13 
14         return fL * cst * fE;
15     }
16     ...
17 };

2-4-2-3. Evaluate path PDF

Next we will evaluate PDF for the connected path ($p_{s,t}^{VC}$ in Eq.2 or $p_{s,t}^{VM}$ in Eq.7), which is implemented in VCMPath::EvaluateF function. This function first checks if the given path is samplable with the selected strategy (2-4-2-3-1). If the path is determined not to be samplable, this function returns zero pdf value. For instance, a path originated from the geometrically-degenerated emitters such as a point light source cannot be sampled with path tracing (the strategy with s=0). We need to check this because this function can be called with different strategy as it sampled with, later in the evaluation of MIS weights.

Once the path is checked as samplable with given strategy, the function evaluates sequence of pdf defined for each vertex of the path in (2-4-2-3-2). This evaluate process also depends on the strategy as well as the merge flag. Note that the Primitive::EvaluateDirectionPDF functions returns pdf in projected solid angle measure, so we need to convert it to the area measure with PDFVal::ConvertToArea function because the target density is defined on the product area measure. We also note that when merge flag is enabled, we need to evaluate $\pi r^2$ in Eq.7.

1 const auto p = fullpath.EvaluatePathPDF(scene, s, false, 0_f);
2 if (p.v == 0_f) { return; }

 1 struct VCMPath
 2 {
 3     ...
 4     auto EvaluatePathPDF(const Scene* scene, int s, bool merge, Float radius) const -> PDFVal
 5     {
 6         const int n = (int)(vertices.size());
 7         const int t = n - s;
 8         assert(n >= 2);
 9 
10         // 2-4-2-3-1. Check if the path is samplable by the selected strategy
11         // 2-4-2-3-2. Evaluate path PDF     
12 
13         return pdf;
14     }
15     ...
16 };

2-4-2-4. Evaluate MIS weight

In this part we will evaluate MIS weights ($w_{v,s,t}$ in Eq.9). Here numPhotonTraceSamples corresponds to $N_{VM}$. In line 7, the numerator of Eq.9 evaluates to ps. The double loop in the next part evaluates the reciprocal of the weight, considering numerical goodness. Our implementation adopts the power heuristic with $\beta=2$ so the ratio of pi and ps is multiplied twice in line 19. Some strategies might not be possible to sample the given path, in this case EvaluatePathPDF function evaluates to zero and omits from the evaluation of the weight (line 16).

1 const auto w = fullpath.EvaluateMISWeight_VCM(
2     scene, s, false, mergeRadius, numPhotonTraceSamples_);

 1 struct VCMPath
 2 {
 3     ...
 4     auto EvaluateMISWeight_VCM(const Scene* scene, int s_, bool merge, Float radius, long long numPhotonTraceSamples) const -> Float
 5     {
 6         const int n = static_cast<int>(vertices.size());
 7         const auto ps = EvaluatePathPDF(scene, s_, merge, radius);
 8         assert(ps > 0_f);
 9 
10         Float invw = 0_f;
11         for (int s = 0; s <= n; s++)
12         {
13             for (int type = 0; type < 2; type++)
14             {
15                 const auto pi = EvaluatePathPDF(scene, s, type > 0, radius);
16                 if (pi > 0_f)
17                 {
18                     const auto r = pi.v / ps.v;
19                     invw += r*r*(type > 0 ? (Float)(numPhotonTraceSamples) : 1_f);
20                 }
21             }
22         }
23 
24         return 1_f / invw;
25     }
26     ...
27 };

2-4-2-5. Accumulate contribution

Hooray! We finally set up necessary terms (f,w, and p) to compute the contribution. Here we compute it and accumulate to the film (Film::Splat function). This function takes a raster position in $[0,1]^2$ and the contribution to be accumulated. VCMPath::RasterPosition computes the raster position of the path.

1 const auto C = f * w / p;
2 ctx.film->Splat(
3     fullpath.RasterPosition(),
4     C * (Float)(film->Width() * film->Height()) / (Float)numEyeTraceSamples_);

 1 struct VCMPath
 2 {
 3     ...
 4     auto RasterPosition() const -> Vec2
 5     {
 6         const auto& v = vertices[vertices.size() - 1];
 7         const auto& vPrev = vertices[vertices.size() - 2];
 8         Vec2 rasterPos;
 9         v.primitive->sensor->RasterPosition(Math::Normalize(vPrev.geom.p - v.geom.p), v.geom, rasterPos);
10         return rasterPos;
11     }
12     ...
13 };

2-4-3. Estimate contribution by vertex merging

In this part, we estimate the contribution by vertex merging ($\langle I \rangle_{VM}$ in Eq.8). Note that this has very similar structure to 2-4-2. The first loop for t is same as before, and the second loop is replaced with range query. VCMKdTree::RangeQuery function takes a function that is called when a recorded position within the given radius from the given position is found. si corresponds to the index of the light subpaths in 2-2, and vi points to its vertex. Note that the range query takes responsibility for the operation $\sum_{l=1}^{N_{VM}} \sum_{s,t\geq 2} \chi_{s,t}(\bar{x}_l)$ in Eq.8.

Line 4 skip density estimation for the point geometries, although it rarely happens in the most of the scenes. We omits the explanation for the parts 2-4-3-2 to 2-4-3-5 because it uses same function with changing parameters as in 2-4-2.

 1 for (int t = 1; t <= nE; t++)
 2 {
 3     const auto& vE = subpathE.vertices[t - 1];
 4     if (vE.primitive->IsDeltaPosition(vE.type)) { continue; }
 5     pm.RangeQuery(vE.geom.p, mergeRadius, [&](int si, int vi) -> void
 6     {
 7         const int s = vi + 1;
 8         const int n = s + t - 1;
 9         if (n < minNumVertices_ || maxNumVertices_ < n) { return; }
10         // 2-4-3-1. Merge vertices and create a full path
11         // 2-4-3-2. Evaluate contribution
12         // 2-4-3-3. Evaluate path PDF
13         // 2-4-3-4. Evaluate MIS weight
14         // 2-4-3-5. Accumulate contribution
15     });
16 }

2-4-3-1. Merge vertices and create a full path

Once again after fixing the strategies (s and t), we construct the full path from the light and eye subpaths. We want to be careful of the index of the strategy, the strategy indexed by (s,t) corresponds to the path with s+t-1 vertices, not s+t vertices as in the case of vertex connection. And the number of vertices passed to the following functions must be s-1, not s. VCMPath::MergeSubpaths takes care of the actual process. Note that we no longer needs to check visibility between vertex so the things becomes easier. Again we skip the merging for point geometries in line 11.

1 static thread_local VCMPath fullpath;
2 if (!fullpath.MergeSubpaths(subpathLs[si], subpathE, s - 1, t)) { return; }

 1 struct VCMPath
 2 {
 3     ...
 4     auto MergeSubpaths(const VCMSubpath& subpathL, const VCMSubpath& subpathE, int s, int t) -> bool
 5     {
 6         assert(s >= 1);
 7         assert(t >= 1);
 8         vertices.clear();
 9         const auto& vL = subpathL.vertices[s - 1];
10         const auto& vE = subpathE.vertices[t - 1];
11         if (vL.primitive->IsDeltaPosition(vL.type) || vE.primitive->IsDeltaPosition(vE.type)) { return false; }
12         if (vL.geom.infinite || vE.geom.infinite) { return false; }
13         vertices.insert(vertices.end(), subpathL.vertices.begin(), subpathL.vertices.begin() + s);
14         vertices.insert(vertices.end(), subpathE.vertices.rend() - t, subpathE.vertices.rend());
15         return true;
16     }
17     ...
18 };

Conclusion

Concluding the article, we did not have time to explain all of the code and omits some explanation of the details. So it would be great to read code further for better understands, and of course, feel free to ask me any questions! As said again and again in this article, our implementation does not focus on the optimizations, even on the utterly apparent ones. Please be careful if you adopt (some part of) the code in your productions. We plan to implement the optimized version of VCM/UPS someday, possibly.

Thank you for reading!

References