
3. Real-Time Path Tracing

3.5 Denoising

As stated previously, the computational demands of real-time path tracing are too high to generate a noise-free output image. Therefore, many methods have been proposed to reduce the noise and variance of a path traced image rendered with only a few samples per pixel. The focus of this thesis is the real-time denoising of these samples, so most offline methods are omitted. However, as dedicated machine learning inference hardware has appeared, some of the work on machine learning denoisers is also visited, since using CNNs for denoising in real time is possible.

As halving the error of the output requires quadrupling the number of samples [45, p. 643], there is a point where a denoising filter produces a better quality image at lower cost than simply increasing the number of samples. Moreover, the time budget of real-time path tracing, together with the time already consumed generating the noisy image, places even more demanding restrictions on the denoising algorithm. For example, the time constraints for 60 frames per second (fps) give the path tracing and reconstruction algorithms ∼16 ms per single frame.

One class of methods for denoising the path traced image comprises simple blurring filters such as the bilateral filter [51] and its multi-resolution variant Á Trous [52], which has been successfully applied to real-time path tracing denoising in the Spatio-Temporal Variance Guided Filter (SVGF) [1]. As the name suggests, SVGF uses spatio-temporal sample variance as the edge-avoiding statistic for the bilateral filter. The bilateral filter and the Á Trous filter are introduced more comprehensively in sections 3.5.1 and 3.5.2. In addition, a blockwise linear regression method is used by Koskela et al. in Blockwise Multi-Order Feature Regression (BMFR) [46] for real-time denoising.

A real-time machine learning based solution using neural bilateral grids has been proposed by Meng et al. [48]. As dedicated machine learning inference hardware has appeared, Meng also compares the proposed method with previous machine learning methods such as a Multi-Resolution variant of Kernel Prediction CNN (MR-KP) [53] and the OptiX Neural Network Denoiser (ONND) derived from the work by Chaitanya et al. [3], which have previously been labeled 'interactive' rather than suitable for real-time applications. Meng suggests these as interesting comparison points for real-time path tracing denoising. However, the implementations of MR-KP and ONND run on the order of tens of milliseconds on state-of-the-art GPUs at a frame size of 1280×720, and are therefore not able to produce real-time denoising.

One interesting way to denoise the Monte Carlo approximation is to divide the path tracing in equation 3.1 into separate groups. One way to do this is to separate the direct and indirect illumination components, as done in other work [1, 47]. Another is to separate the diffuse and specular components, as done by Bako et al. in [12]. In these cases the components are reconstructed in separate pipelines and added together afterwards. This enables, for example, using a different scale for the input samples: Bako denoises the specular component in logarithmic scale, with better results than linear scale. However, for real-time use the separation of the components has not been found efficacious in other work [46, 50].
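The component separation described above can be sketched as follows. This is a minimal illustration, not an implementation from the cited work: `denoise` stands in for any reconstruction filter, and the `log1p`/`expm1` pair is one plausible way to realize the logarithmic-scale specular filtering attributed to Bako et al.

```python
import numpy as np

def denoise_separated(diffuse, specular, denoise):
    """Sketch of component-separated reconstruction: the diffuse
    component is filtered in linear space, the specular component
    in logarithmic space, and the results are recombined.
    'denoise' is a placeholder for any reconstruction filter."""
    diffuse_out = denoise(diffuse)
    # log1p/expm1 keep zero-valued samples at zero
    specular_out = np.expm1(denoise(np.log1p(specular)))
    return diffuse_out + specular_out
```

With an identity "filter" the pipeline reproduces the original radiance, which is a quick sanity check that the log transform is inverted correctly.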

3.5.1 Bilateral Filter

The bilateral filter is an edge-preserving nonlinear local filter [51]. It is essentially an extension of a Gaussian blur:

$$w(p) = e^{-\frac{(p-q)^2}{2\sigma^2}}, \qquad (3.3)$$

where p is the filtered pixel coordinate, q is the sample coordinate in the kernel, and σ is the standard deviation of the Gaussian distribution. The bilateral filter extends this by adding a color intensity difference term to the Gaussian filter:

$$w_b(p) = e^{-\frac{(p-q)^2}{2\sigma_d^2} - \frac{|I(p)-I(q)|^2}{2\sigma_l^2}}. \qquad (3.4)$$

Here I(p) and I(q) are the color intensity values of the samples, and σ_d and σ_l are the standard deviations for the spatial distance and the color intensity values respectively.
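Equation 3.4 can be illustrated with a brute-force NumPy sketch for a grayscale image. The function and parameter names are illustrative, and a practical implementation would vectorize the loops:

```python
import numpy as np

def bilateral_filter(img, radius=3, sigma_d=2.0, sigma_l=0.1):
    """Brute-force bilateral filter (equation 3.4) for a 2-D
    grayscale image. Weights combine spatial distance (sigma_d)
    and intensity difference (sigma_l)."""
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            acc, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    qy, qx = y + dy, x + dx
                    if 0 <= qy < h and 0 <= qx < w:
                        # spatial Gaussian term
                        w_d = np.exp(-(dy * dy + dx * dx) / (2 * sigma_d ** 2))
                        # intensity (range) term that stops the blur at edges
                        diff = img[y, x] - img[qy, qx]
                        w_l = np.exp(-(diff * diff) / (2 * sigma_l ** 2))
                        wgt = w_d * w_l
                        acc += wgt * img[qy, qx]
                        norm += wgt
            out[y, x] = acc / norm
    return out
```

Because the weights are normalized, a constant image passes through unchanged, while the intensity term suppresses contributions from samples across an edge.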

In path tracing, for example, the depth buffer and the normals of the first rays offer good information for edge preservation. The usage of these separate features for edge preservation is called cross-bilateral filtering [54, 55]. In path tracing these can be added to the filter kernel, as done in [1, 56], for example as:

$$w_n(p) = \max(0,\, n(p) \cdot n(q))^{\sigma_n}, \qquad (3.5)$$

where n(p) and n(q) are the surface normals at the sample points. The depth can be added as:

$$w_z(p) = e^{-\frac{|Z(p)-Z(q)|}{\sigma_z |\nabla Z(q) \cdot (p-q)| + \epsilon}}, \qquad (3.6)$$

where Z(p) is the depth value of the sample, ∇Z(q) is the gradient of the depth, and ϵ is used to avoid division by zero in cases where the gradient of the depth values is zero.
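The two cross-bilateral weights of equations 3.5 and 3.6 translate directly into code. The sketch below assumes per-sample normals, depths, and a screen-space depth gradient are available from the G-buffer; names and default values are illustrative:

```python
import numpy as np

def normal_weight(n_p, n_q, sigma_n=128.0):
    """Equation 3.5: the clamped dot product of the normals raised
    to sigma_n, so the weight drops sharply across normal edges."""
    return max(0.0, float(np.dot(n_p, n_q))) ** sigma_n

def depth_weight(z_p, z_q, grad_z_q, p, q, sigma_z=1.0, eps=1e-8):
    """Equation 3.6: the depth difference is normalized by the
    expected depth change along (p - q), so slanted surfaces are
    not mistaken for edges; eps avoids division by zero."""
    denom = sigma_z * abs(float(np.dot(grad_z_q, np.subtract(p, q)))) + eps
    return np.exp(-abs(z_p - z_q) / denom)
```

In a full cross-bilateral kernel these factors would simply be multiplied into the weight of equation 3.4.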

The bilateral filter has also been extended to a grid format for real-time image processing by Chen et al. [57]. The bilateral grid extends the 2D image with a third dimension which describes the bilateral filter for the 2D pixel. This idea is used in the work by Meng et al. [48] to combine machine learning filtering with bilateral filtering.

Figure 3.5. Á Trous kernels with step sizes 1, 2 and 3. Each step size increases the receptive field by $2^n$, where n is the step size − 1 [52].

3.5.2 Á Trous

Á Trous ("algorithm with holes") is a specialized bilateral filter. The idea is to run multiple passes of the same bilateral blur filter at different frequencies [52]. Á Trous is also known from the discrete wavelet transform, where the blur filter is run in a dithering pattern. The dithering pattern is used to increase the receptive field of the bilateral filter while considering only a small set of the samples in a larger window. The kernel with the dithering pattern is illustrated in figure 3.5.
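The hole pattern is easiest to see in one dimension: the kernel taps stay the same, but their spacing doubles every iteration. The sketch below uses the 5-tap B3-spline kernel common in à-trous wavelet constructions; for clarity it omits the edge-stopping weights a full bilateral variant would multiply in:

```python
import numpy as np

def atrous_pass(signal, step):
    """One 1-D Á Trous pass: a fixed 5-tap kernel whose taps are
    spread 'step' samples apart (the "holes"), so each pass reads
    the same number of samples over a larger window."""
    kernel = np.array([1/16, 1/4, 3/8, 1/4, 1/16])
    offsets = np.array([-2, -1, 0, 1, 2]) * step
    n = len(signal)
    out = np.zeros(n)
    for i in range(n):
        acc, norm = 0.0, 0.0
        for k, off in zip(kernel, offsets):
            j = i + off
            if 0 <= j < n:          # renormalize at the borders
                acc += k * signal[j]
                norm += k
        out[i] = acc / norm
    return out

def atrous_filter(signal, iterations=3):
    """Successive passes with step sizes 1, 2, 4, ... (2^n)."""
    for n in range(iterations):
        signal = atrous_pass(signal, 2 ** n)
    return signal
```

After i iterations the effective footprint grows roughly as a dense kernel of comparable size would, but each pass still touches only five samples per pixel, which is what makes the scheme attractive for real-time budgets.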

Most notably, the Á Trous algorithm has been used successfully to denoise real-time path tracing in SVGF. For path tracing at low sample counts, SVGF must scale the intensity weight of the samples. It does this by modifying the intensity weight in equation 3.4 to:

$$w_l(p) = e^{-\frac{|I(p)-I(q)|}{\sigma_l \sqrt{g_{3\times 3}(\mathrm{Var}(I(p)))} + \epsilon}}, \qquad (3.7)$$

where $\sqrt{g_{3\times 3}(\mathrm{Var}(I(p)))}$ is the sample variance acquired from the variance buffer, filtered with a 3×3 Gaussian kernel $g_{3\times 3}$.

In addition, SVGF updates the variance estimate $\mathrm{Var}(I(p))$ for every iteration of the Á Trous filter. SVGF also uses fixed values for the standard deviations of the bilateral filter: $\sigma_z = 1$, $\sigma_n = 128$ and $\sigma_l = 4$. These variables control the edge detection weight of the features. For real-time path tracing with one sample per pixel, SVGF uses 5 iterations of Á Trous with step sizes of $2^n$.
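The variance-guided intensity weight of equation 3.7 can be sketched as a small function. This is an illustration of the formula only, not SVGF's actual implementation; the caller is assumed to supply the already Gaussian-filtered variance for pixel p:

```python
import numpy as np

def luminance_weight(i_p, i_q, var_p_filtered, sigma_l=4.0, eps=1e-8):
    """Equation 3.7: the intensity difference is normalized by the
    locally filtered standard deviation. High variance (few
    effective samples) loosens the edge-stopping term so noise is
    blurred away; low variance tightens it so real edges survive."""
    return np.exp(-abs(i_p - i_q) / (sigma_l * np.sqrt(var_p_filtered) + eps))
```

Note how the same intensity difference yields a much larger weight when the variance estimate is high, which is precisely the adaptive behavior SVGF relies on at one sample per pixel.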