
COMPUTATIONAL MULTIFOCAL NEAR-EYE DISPLAY WITH HYBRID REFRACTIVE-DIFFRACTIVE OPTICS

Ugur Akpinar, Erdem Sahin, Atanas Gotchev

Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland

ABSTRACT

We present a multifocal computational near-eye display that employs a static diffractive optical element (DOE) in tandem with a refractive lens. The DOE is co-optimized with convolutional neural network-based preprocessing to achieve the desired multifocal display point spread function in an optimal manner. In simulations, we demonstrate a multifocal display that can deliver sharp images at three distinct depths, sampling the dioptric depth range uniformly from 3 diopters to infinity.

Index Terms— Computational display, Optics, Neural network

1. INTRODUCTION

An ideal three-dimensional (3D) near-eye display (NED) aims at delivering all physiological depth cues of the human visual system (HVS) accurately, to ensure a realistic as well as comfortable viewing experience. In most of the existing NEDs, the stereo cues (binocular disparity and vergence) are simply created by projecting the corresponding two-dimensional (2D) image, with correct perspective, to each eye, and motion parallax is provided via head tracking. However, it has been challenging to deliver the correct focus cues (accommodation and retinal defocus blur). The NEDs that fail to overcome this challenge suffer from the well-known vergence-accommodation conflict (VAC), which has been reported to be an important factor that can cause visual discomfort and fatigue [1].

There exist mainly two broad categories of approaches addressing the VAC problem: enabling the focus cues or making the display accommodation-invariant (AI) [2, 3]. The approaches enabling focus cues include: varifocal, where the focal depth of the display is dynamically altered to match the converged depth, e.g., using a focus-tunable lens or varying the lens-to-sensor distance [4, 5]; multifocal, where a stack of 2D displays is spatially multiplexed or multiple depths are covered by time multiplexing through fast varifocal optics [6, 7]; and light field and holographic, where the complete field of light due to a given 3D scene is created, in the former case geometrically as a collection of all desired rays [8, 9] or, in the latter case, through wavefronts relying on wave diffraction and interference [10]. The AI displays, on the other hand, aim to remove the optical defocus blur cue. In such a case, the accommodation is expected to be (cross-)driven only by the binocular disparity, and thus the VAC is avoided.

In Maxwellian-type displays, the effective entrance pupils of the eyes are reduced by projecting images through small pinholes, so that the retinal images appear sharp over a large depth range without perceivable defocus blur [11]. Although such an approach is simple to implement, it significantly reduces the light throughput. A more efficient approach is to achieve a depth-invariant display point spread function (PSF) at the retina, which is implemented, for instance, by employing a focus-tunable lens that changes the focal depth of the display and scans the scene depth range faster than the temporal resolution of the eye [12]. In this approach, unlike the above-mentioned multifocal technique, the display content is fixed. In particular, the display image is deconvolved with an estimated (time-averaged) display PSF, which results in AI perceived images when used with the focus-tunable optics.

In this paper, we present a computational multifocal NED, which can provide a set of focal planes at intended discrete depths. Unlike most of the existing multifocal NEDs, our approach relies on fixed static optics, in particular an eyepiece consisting of a refractive lens and a diffractive optical element (DOE). This is a significant advantage in terms of form factor and system complexity (e.g., it avoids issues related to adaptive optics, such as the need for very high frame-rate displays). Additionally, such a custom design enables arbitrary PSF engineering, significantly increasing the design degrees of freedom. An additional preprocessing stage further helps achieve the desired system response. We utilize a convolutional neural network (CNN) based preprocessing deconvolution algorithm for this purpose. By combining a differentiable display model with the preprocessing, we are able to engineer and achieve the desired (fixed) multifocal PSF in an optimal manner. Such an end-to-end optimization framework has previously been utilized in various imaging problems, though mostly for image capture, such as extended depth of field imaging [13, 14].

At this point, it is also worth mentioning the multifocal cameras that are the image-capture analogues of multifocal NEDs, especially those employing spatial multiplexing of the imaging lens [15, 16], as well as multifocal contact and intraocular lenses [17] that also combine (multiplex) lenses with different focal powers in a single optical element. It is obvious that, in the latter case, the HVS has to deal with the necessary postprocessing to better interpret the multifocal image, whereas in the former case, such deconvolution can be applied as part of the computational imaging approach. Nevertheless, independently designed (multifocal) optics and deconvolution algorithms, as applied in [15], are likely to be suboptimal.

2. METHOD

Fig. 1: Overall representation of the proposed method.

Fig. 1 schematizes the computational multifocal NED system, illustrating the end-to-end iterative optimization of the optics (i.e., the DOE) and the CNN-based preprocessing deconvolution layer (D-CNN) based on the desired perceived image $I_p$ (modelled through the Blur block) and a defined quality metric (Loss). Here the preprocessing stage can be thought of as analogous to the postprocessing in computational cameras, e.g., the deconvolution operation in extended depth of field imaging [13, 14], which aims to computationally complement the display optics in achieving the desired characteristics in the final visualized image $\hat{I}_p$. During optimization (training), we provide an all-in-focus image $I$ and the accommodation distance $z$, which is picked randomly from a set of discrete depth values within the target scene depth range, as inputs to the network. First, the D-CNN takes the input image $I$ and outputs the image to be driven to the display, $I_d$. Then, a physically realizable optics model simulates the perceived image $\hat{I}_p$ based on the display image $I_d$ and the accommodation distance $z$.

As we aim at creating a multifocal display that outputs sharp (i.e., focused) images at certain depth values, we divide the discrete set of $z$ values into two subsets, namely main depths and intermediate depths. The conditional blurring block (Blur) passes the all-sharp input image if $z$ corresponds to one of the main (focal) depths; otherwise, for an intermediate depth, it blurs the input image based on the amount of defocus with respect to the nearest main depth. Finally, the loss function is calculated between the desired perceived image $I_p$ and the network output $\hat{I}_p$, and the system is optimized in an end-to-end manner using gradient-based optimization. In the following subsections, each component of the end-to-end system is described in detail.
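The end-to-end optimization described above can be summarized in a few lines of training-loop pseudocode. The following PyTorch-style sketch is illustrative only: `dcnn`, `optics`, `conditional_blur`, and `multifocal_loss` stand in for the D-CNN, the differentiable display model of Sec. 2.1, the Blur block, and the loss of Sec. 2.3, and are assumed to be supplied by the caller; the paper does not publish an implementation.

```python
# Minimal sketch of one iteration of the end-to-end optimization of Fig. 1.
# All callables passed in are placeholders for the blocks described in Sec. 2.
import random

def train_step(dcnn, optics, conditional_blur, multifocal_loss, optimizer,
               image, main_depths, intermediate_depths):
    """Sample an accommodation depth z, render the perceived image through the
    differentiable display model, and compare against the conditionally blurred target."""
    z = random.choice(main_depths + intermediate_depths)      # accommodation distance (D)
    target = image if z in main_depths else conditional_blur(image, z)
    display_image = image + dcnn(image)                       # residual D-CNN output I_d
    perceived = optics(display_image, z)                      # blur with PSF h_{lambda,z}(Phi), Eqs. (1)-(4)
    loss = multifocal_loss(target, perceived, display_image)  # quality metric E(I_p, I^_p), Eq. (9)
    optimizer.zero_grad()
    loss.backward()    # gradients reach both the CNN weights and the DOE phase Phi
    optimizer.step()
    return loss.item()
```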

2.1. Optical design

Fig. 2: Proposed computational multifocal NED setup.

Fig. 2 illustrates the optical layout of the multifocal display. In particular, we employ the DOE together with the refractive lens, which is expected to focus the input image at multiple depths while decreasing the chromatic aberrations exhibited by the refractive-lens-only or DOE-only cases. Assuming a thin lens model for the eye and a planar retina, and given the focus distance of the eye from the lens plane, $z$, there exists a reference plane $(x, y)$, which is the conjugate plane of the retina (e.g., the black curve in Fig. 2). In the proposed implementation, we reconstruct images at the reference plane. The image formation can then be described via the PSFs.

Under monochromatic illumination with wavelength $\lambda$, and within the limits of paraxial optics, the incoherent PSF at the reference plane $(x, y)$, $h_{\lambda,z}(x, y)$, is described as [18]

$$h_{\lambda,z}(x, y) \propto \left| \mathcal{F}\{Q(s, t)\} \right|^2_{\left( \frac{x}{\lambda z},\, \frac{y}{\lambda z} \right)}, \qquad (1)$$

where $Q(s, t)$ represents the generalized pupil function at the lens plane, and $\mathcal{F}\{\cdot\}$ is the Fourier transform operator. Assuming a thin lens model for the refractive lens, the generalized pupil function $Q(s, t)$ is derived as

$$Q_{\lambda,z}(s, t) = A(s, t)\, \exp\!\big(j\Phi_{\lambda}(s, t)\big)\, \exp\!\left(j\Psi_{\lambda,z}\, \frac{s^2 + t^2}{r^2}\right), \qquad (2)$$

where $A(s, t)$ is the circular aperture function, $\Phi_{\lambda}(s, t)$ is the phase delay due to the DOE,

$$\Psi_{\lambda,z} = \frac{k}{2} \left( -\frac{1}{z} + \frac{1}{z_d} - \frac{1}{f_{\lambda}} \right) r^2, \qquad (3)$$

is the defocus coefficient, $k = 2\pi/\lambda$ is the wavenumber, $f_{\lambda}$ is the wavelength-dependent effective focal length of the underlying refractive lens, and $r$ is the aperture radius. Finally, the perceived image at $z$, $\hat{I}_p(x, y)$, is the convolution between the display image and the incoherent PSF,

$$\hat{I}^{\lambda}_{p,z}(x, y) = I^{\lambda}_d(x, y) * h_{\lambda,z}(x, y). \qquad (4)$$

In order to enforce the DOE to create the multifocal effect, we employ conditional blurring on the input image to derive the desired (ground-truth) perceived image. That is, we apply blur on input images with a pre-calculated ground-truth PSF $h_b(x, y)$ if the reconstruction depth is one of the predefined intermediate depths. In such a case, the ground-truth image, $I_p(x, y)$, is defined to be

$$I^{\lambda}_p(x, y) = I^{\lambda}(x, y) * h_b(x, y). \qquad (5)$$

The PSF of Eq. 5 is derived using Eq. 1 and Eq. 2, where $\Phi_{\lambda}(s, t)$ is set to zero and $\Psi_{\lambda,z} = 74$ for all $\lambda$. Such a defocus value corresponds to the amount of blur when the object is 0.5 diopters (D) away from the focus plane of the refractive lens for the wavelength $\lambda = 530$ nm.
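As a quick sanity check of this quoted value, the defocus coefficient of Eq. 3 can be evaluated for a 0.5 D focus error at $\lambda = 530$ nm using the 5 mm aperture radius given later in Sec. 3; the short computation below (not from the paper) indeed gives $\Psi \approx 74$.

```python
# Sanity check of the stated defocus value Psi = 74 (Eq. 3) for a 0.5 D focus
# error at lambda = 530 nm, with the aperture radius r = 5 mm used in Sec. 3.
import numpy as np

lam = 530e-9                 # wavelength (m)
r = 5e-3                     # aperture radius (m)
k = 2 * np.pi / lam          # wavenumber
delta_dioptric = 0.5         # |(-1/z + 1/z_d - 1/f)| expressed in diopters (1/m)
psi = (k / 2) * delta_dioptric * r**2
print(round(psi, 1))         # ~74.1, matching the Psi = 74 quoted in the text
```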

In the training process, the phase delay due to the DOE, $\Phi_{\lambda}(s, t)$, is optimized. A physically realizable transparent optical element, such as a DOE, exhibits a phase delay that can be formulated via its thickness function $d(s, t)$ as

$$\Phi_{\lambda}(s, t) = k (n_{\lambda} - 1)\, d(s, t), \qquad (6)$$

where $n_{\lambda}$ is the refractive index of the material at $\lambda$. Then, if the phase delay for a nominal wavelength $\lambda_0$, $\Phi_{\lambda_0}(s, t)$, is to be optimized, the phase delay for a given $\lambda$ can be derived using Eq. 6 as

$$\Phi_{\lambda}(s, t) = \Phi_{\lambda_0}(s, t)\, \frac{\lambda_0 (n_{\lambda} - 1)}{\lambda (n_{\lambda_0} - 1)}. \qquad (7)$$

Similarly, the wavelength-dependent focal length of the refractive lens, $f_{\lambda}$, is modeled through $f_{\lambda_0}$ as

$$f_{\lambda} = f_{\lambda_0}\, \frac{n_{\lambda_0} - 1}{n_{\lambda} - 1}. \qquad (8)$$
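For illustration, the image-formation model of Eqs. 1-3 can be sampled numerically as in the NumPy sketch below. The grid size, the zero DOE phase in the example call, and the approximate 29.8 mm green-channel focal length are assumptions for demonstration; the actual system optimizes the DOE phase via gradient descent and applies the proper coordinate scaling $x/(\lambda z)$ of Eq. 1.

```python
# Minimal NumPy sketch of Eqs. (1)-(3): build the generalized pupil function on a
# sampled aperture and obtain the incoherent PSF as the squared magnitude of its
# Fourier transform (up to the x/(lambda z) coordinate scaling of Eq. 1).
import numpy as np

def incoherent_psf(phi, lam, z, z_d, f_lam, r, n_samples=512):
    """PSF h_{lambda,z} (up to scale) for a DOE phase delay `phi` (n_samples x n_samples)."""
    k = 2 * np.pi / lam
    s = np.linspace(-r, r, n_samples)
    S, T = np.meshgrid(s, s)
    A = (S**2 + T**2 <= r**2).astype(float)                 # circular aperture A(s, t)
    psi = (k / 2) * (-1 / z + 1 / z_d - 1 / f_lam) * r**2   # defocus coefficient, Eq. (3)
    Q = A * np.exp(1j * phi) * np.exp(1j * psi * (S**2 + T**2) / r**2)  # Eq. (2)
    h = np.abs(np.fft.fftshift(np.fft.fft2(Q)))**2           # Eq. (1)
    return h / h.sum()

# Example (all values assumed): lens only (phi = 0), eye accommodated at 1 D,
# display at z_d = 28.57 mm, green-channel focal length ~29.8 mm, r = 5 mm.
psf = incoherent_psf(phi=np.zeros((512, 512)), lam=530e-9,
                     z=1.0, z_d=28.57e-3, f_lam=29.8e-3, r=5e-3)
```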

2.2. Preprocessing CNN

Fig. 3: The preprocessing network (D-CNN), based on the U-net architecture [19] and equipped with skip connections.

The proposed network for the preprocessing step is shown in Fig. 3. In particular, we employ the U-net architecture [19] as a residual network; that is, the output of the network is the difference image between the display image $I_d$ and the input image $I$. U-net is a multi-scale network consisting of encoding and decoding parts. In the encoding stage, $3 \times 3$ convolutions followed by rectified linear unit (ReLU) activation functions are utilized, shown as graded yellow in Fig. 3. At the end of each scale, the output is downsampled by 2 using a max pooling layer (red in Fig. 3). The decoding stage starts with upsampling the output of the lower scale by a transposed convolution (shaded blue) and concatenation with the output of the encoding stage at the corresponding level (skip connections). After that, the data is further processed via $3 \times 3$ convolution and ReLU layers. The channel sizes of the filters are 32, 64, 128, and 256 at each scale, respectively, as given under each block in Fig. 3. The final output of the decoding stage is processed by a $1 \times 1$ convolution layer with the channel size equal to the input channel size of 3. After that, we add a summation layer that takes the all-sharp image $I$ and the output of the U-net and gives the display image $I_d$. Such a connection enables the network to learn the difference image, which is expected to be sparse.
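A compact PyTorch rendering of this residual U-net is sketched below. The per-scale channel widths follow the description above; the number of convolution blocks per scale (one here) and other minor details are assumptions, since the paper does not specify them. With the $256 \times 256$ training patches used in Sec. 3, all skip connections line up.

```python
# Minimal PyTorch sketch of the residual D-CNN: U-net encoder/decoder with 3x3
# conv + ReLU blocks, 2x max pooling, transposed-conv upsampling, skip
# concatenation, a final 1x1 conv, and a global residual connection I_d = I + U(I).
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class DCNN(nn.Module):
    def __init__(self, channels=(32, 64, 128, 256)):
        super().__init__()
        c1, c2, c3, c4 = channels
        self.enc1, self.enc2, self.enc3 = conv_block(3, c1), conv_block(c1, c2), conv_block(c2, c3)
        self.bottleneck = conv_block(c3, c4)
        self.pool = nn.MaxPool2d(2)
        self.up3, self.dec3 = nn.ConvTranspose2d(c4, c3, 2, stride=2), conv_block(2 * c3, c3)
        self.up2, self.dec2 = nn.ConvTranspose2d(c3, c2, 2, stride=2), conv_block(2 * c2, c2)
        self.up1, self.dec1 = nn.ConvTranspose2d(c2, c1, 2, stride=2), conv_block(2 * c1, c1)
        self.out = nn.Conv2d(c1, 3, 1)                        # 1x1 conv back to 3 channels

    def forward(self, image):
        e1 = self.enc1(image)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        b = self.bottleneck(self.pool(e3))
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))  # skip connection
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return image + self.out(d1)                           # residual: display image I_d
```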

2.3. Loss function

We train the network using a regularized loss function, which minimizes the L1 loss and maximizes the structural similarity (SSIM) [20] between the network output $\hat{I}_p$ and the ground truth $I_p$. Such a loss function provides a good compromise between texture detail and perceptual quality. An additional regularization term is employed on the display image $I_d$ in order to keep the pixel values within the display dynamic range of $[0, 1]$. The loss function is mathematically described as

$$E(I_p, \hat{I}_p) = L_{l1}(I_p, \hat{I}_p) + L_{ssim}(I_p, \hat{I}_p) + \alpha R(I_p, \hat{I}_p) + \gamma R_d(I_d), \qquad (9)$$

where $L_{l1}(I_p, \hat{I}_p)$ is the L1 loss and $L_{ssim}(I_p, \hat{I}_p)$ is the SSIM loss [21]

$$L_{ssim}(I_p, \hat{I}_p) = 1 - \mathrm{SSIM}(I_p, \hat{I}_p), \qquad (10)$$

$R(I_p, \hat{I}_p)$ is the regularization on the network output, and $R_d(I_d)$ is the regularization on the display image. $R_d(I_d)$ is the indicator function which gives 0 for the pixels within $[0, 1]$ and 1 elsewhere. By setting $\gamma \to \infty$, it approaches a hard constraint on the display image; in practice we set $\gamma = 150$. We utilize the dark channel prior [22] as the network regularizer $R(I_p, \hat{I}_p)$. The dark channel of a color image, $J$, is defined as the minimum of the color channels within each image patch, i.e.,

$$J(\mathbf{x}) = \min_{\lambda \in \{R, G, B\}} \; \min_{\mathbf{y} \in \Omega(\mathbf{x})} I^{\lambda}(\mathbf{y}), \qquad (11)$$

where $\mathbf{x}$, $\mathbf{y}$ are pixel indices and $\Omega(\mathbf{x})$ is the neighborhood of $\mathbf{x}$. In the proposed method, the regularizer is chosen to be the weighted L1 norm of the dark channel of the network output, that is,

$$R(I_p, \hat{I}_p) = \big| \exp(-\beta J_p)\, \hat{J}_p \big|_1, \qquad (12)$$

where $\exp(-\beta J_p)$ decreases the weights of the pixels within the bright regions of the ground-truth image. During training, we set $\alpha = 0.005$, $\beta = 10$, and a patch size of 17 pixels.
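The loss of Eqs. 9-12 maps to a few tensor operations, as in the hedged PyTorch sketch below. The SSIM term is left as a user-supplied callable, the dark-channel minimum over a 17-pixel patch is implemented via negated max pooling, and the range term is written as a literal indicator; since a hard indicator has zero gradient almost everywhere, a practical implementation may substitute a differentiable surrogate. The mean-based reductions are a simplification of the L1 norms in the text.

```python
# Minimal PyTorch sketch of the regularized loss in Eqs. (9)-(12); hyperparameters
# alpha, beta, gamma and the 17-pixel patch follow the values quoted in the text.
import torch
import torch.nn.functional as F

def dark_channel(img, patch=17):
    """Eq. (11): per-pixel minimum over color channels and a patch x patch neighborhood."""
    min_c = img.min(dim=1, keepdim=True).values                        # min over R, G, B
    return -F.max_pool2d(-min_c, patch, stride=1, padding=patch // 2)  # min-pool via negated max-pool

def multifocal_loss(target, perceived, display_image, ssim_fn,
                    alpha=0.005, beta=10.0, gamma=150.0):
    l1 = (target - perceived).abs().mean()
    l_ssim = 1.0 - ssim_fn(target, perceived)                 # Eq. (10), SSIM supplied externally
    # Eq. (12): weighted L1 norm of the dark channel of the network output.
    j_target, j_out = dark_channel(target), dark_channel(perceived)
    reg = (torch.exp(-beta * j_target) * j_out).abs().mean()
    # Indicator-style penalty keeping the display image inside [0, 1].
    out_of_range = ((display_image < 0) | (display_image > 1)).float().mean()
    return l1 + l_ssim + alpha * reg + gamma * out_of_range
```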


3. SIMULATION RESULTS

During training, we assume a commercially available plano-convex lens as the underlying refractive lens. The effective focal length is $f_{\lambda_s} = 30$ mm at the specification wavelength $\lambda_s = 587.6$ nm, whereas the aperture radius is taken to be $r = 5$ mm. The materials of both the refractive lens and the DOE are assumed to be fused silica, with wavelength-dependent refractive indices of $n_{\lambda_R} = 1.458$, $n_{\lambda_G} = 1.461$, and $n_{\lambda_B} = 1.466$. The network is trained with color images, accounting for the wavelengths $\lambda_R = 600$ nm, $\lambda_G = 530$ nm, and $\lambda_B = 450$ nm for the red, green, and blue channels, respectively. The lens-to-OLED distance is set as $z_d = 28.57$ mm, focusing the refractive lens at 1.5 D for the nominal wavelength $\lambda_0 = \lambda_G$. The pixel pitch of the OLED is 8.7 µm, which corresponds to an angular resolution of 1 arcmin.
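These design values can be cross-checked with the thin-lens relations of Sec. 2.1. The short computation below assumes a refractive index of about 1.458 at the 587.6 nm specification wavelength (not given in the paper) and neglects the eye relief; under those assumptions the virtual image indeed falls near 1.5 D and one display pixel subtends roughly 1 arcmin.

```python
# Consistency check of the quoted design values (assumed index at 587.6 nm, eye
# relief neglected for the angular-resolution estimate).
import numpy as np

f_spec, n_spec, n_green = 30e-3, 1.458, 1.461
f_green = f_spec * (n_spec - 1) / (n_green - 1)   # green-channel focal length via Eq. (8)
z_d = 28.57e-3
image_dist = 1 / (1 / f_green - 1 / z_d)          # thin-lens image distance (negative: virtual)
print(abs(1 / image_dist))                         # ~1.45-1.5 D virtual image depth
print(np.degrees(8.7e-6 / z_d) * 60)               # ~1.05 arcmin per display pixel
```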

The proposed computational multifocal display is optimized using a mixture of natural images from [23] and synthetic images from [24]. The images are divided into patches of $256 \times 256$ pixels, and the batch size is set to 2. The target accommodation range is set as 0-3 D, which is divided into a discrete set of depths with a 0.5 D interval. At each iteration, the reconstruction depth is chosen randomly. We set the intermediate depth values as 0 D, 1 D, 2 D, and 3 D, at which the ground-truth images $I_p$ are created by applying blur on the input image, as described in Sec. 2. For the main reconstruction depth values, i.e., 0.5 D, 1.5 D, and 2.5 D, we set $I_p = I$. Such an arrangement is based on previous studies [25] indicating that a difference of ±0.5 D between the vergence and accommodation depths is within the zone of comfort. To speed up the training, we initialize $\Phi_{\lambda_0}(s, t)$ with the superposition of three diffractive lenses such that the hybrid refractive-diffractive system focuses at the main reconstruction depths for $\lambda_0$.
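The initialization of $\Phi_{\lambda_0}(s, t)$ as a superposition of three diffractive lenses is not detailed further in the paper; one plausible reading, sketched below, takes the wrapped phase of the complex sum of three quadratic lens phases whose added powers (about -1, 0, and +1 D on top of the 1.5 D refractive focus) place the hybrid foci at the main reconstruction depths. This should be read as an assumed realization, not the authors' exact recipe.

```python
# Illustrative DOE phase initialization: wrapped phase of the complex sum of three
# diffractive lenses. Powers, grid size, and wavelength are assumed example values.
import numpy as np

def multifocal_doe_phase(added_powers_diopters=(-1.0, 0.0, 1.0),
                         lam=530e-9, r=5e-3, n_samples=512):
    k = 2 * np.pi / lam
    s = np.linspace(-r, r, n_samples)
    S, T = np.meshgrid(s, s)
    rho2 = S**2 + T**2
    # A diffractive lens of power P (diopters) contributes a quadratic phase -k*P*rho^2/2.
    field = sum(np.exp(-1j * k * P * rho2 / 2) for P in added_powers_diopters)
    return np.angle(field)                                    # wrapped phase Phi_{lambda0}(s, t)

phi_init = multifocal_doe_phase()
```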

Fig. 4: The optimized height map at a fabrication resolution of 3 µm (a); one-dimensional cross-sections of the PSFs across the accommodation range (b).

Using Eq. 3, the limit of the defocus coefficient is $|\Psi| \leq 328$ for the target accommodation range of 0-3 D. Following the discussion in [14], we set the optimum mask sampling rate based on the defocus range, which corresponds to $\Delta s = 6$ µm. Please note, however, that such a sampling rate is utilized only during training.

Fig. 5: One-dimensional cross-sections of the MTFs at the accommodation distance of 0.5 D (a), and the MTFs within the assumed scene depth range for 1 CPD (b), 5 CPD (c), and 10 CPD (d). The MTF of the green channel of a conventional display without a phase mask (lens-only) is also plotted.

During the simulations, we upsample the optimized phase element to 3 µm via bicubic interpolation, which accounts for a typical fabrication resolution of the DOE. Fig. 4 illustrates the optimized DOE at the upsampled resolution, together with the one-dimensional PSFs at varying depths. We also present the modulation transfer function (MTF) analysis in Fig. 5, including the one-dimensional MTFs at 0.5 D, as well as the cross-sections of the MTFs throughout the scene depth range at three different spatial frequencies, namely 1, 5, and 10 CPD. As can be inferred from the PSFs and the MTFs, we observe a sharper response at the aimed focus depths of 0.5 D, 1.5 D, and 2.5 D and broader PSFs at the intermediate depths. However, compared to a conventional refractive lens with a focal depth of 1.5 D (dashed green plots), such improvement comes at the cost of degradation in the MTFs around the focal depth. One main objective of the multifocal display is to relax such a resolution-depth trade-off in an optimal manner. We aim at sacrificing as little as possible of the overall image quality while keeping the conflict between accommodation and convergence inside the zone of comfort [25]. In addition, the preprocessing is intended to further compensate for such degradation. However, it should also be noted that the D-CNN output is limited to the display dynamic range. In particular, we assume the display minimum and maximum brightness values to be 0 and 1, respectively. The preprocessing aims at finding the optimum display image within this range, which may not correspond to the global optimum.


Fig. 6: Comparison of the conventional stereoscopic display with a single refractive lens (top), the AI computational near-eye display proposed by Konrad et al. [12] (middle), and the proposed multifocal computational near-eye display (bottom). The PSNR/SSIM values are given under each image:

Depth    Conventional     AI display [12]   Proposed
2.5 D    20.25 / 0.695    22.18 / 0.783     23.23 / 0.795
2 D      22.70 / 0.735    22.08 / 0.785     23.52 / 0.789
1.5 D    27.18 / 0.849    23.15 / 0.809     24.09 / 0.810
1 D      23.75 / 0.796    22.14 / 0.794     23.30 / 0.784
0.5 D    20.65 / 0.754    22.32 / 0.799     24.73 / 0.827
0 D      19.17 / 0.724    20.25 / 0.754     21.43 / 0.756

It is also worth mentioning here that both the PSFs in Fig. 4 and the MTFs in Fig. 5 are derived by taking into account the finite size of the display pixels. That is, the PSF $h_{\lambda,z}(x, y)$ of Eq. 1 is convolved with the square display pixel. Therefore, the display resolution of 1 arcmin inherently constitutes an upper bound for the MTFs, corresponding to a maximum frequency of 30 CPD. During the simulations, the sampling step of the reconstruction grid is taken to be 0.5 arcmin (corresponding to 60 CPD), which accounts for the assumed maximum retinal resolution.

The delivered image quality of the computational multifocal NED is further compared with the conventional lens-only stereoscopic display focused at 1.5 D and with the AI computational display proposed by Konrad et al. [12], which utilizes focus-tunable lenses to sweep through the scene. We simulate the AI display in the discrete mode, where the lens is focused at 0.5 D, 1.5 D, and 2.5 D, and the preprocessing is done via Wiener deconvolution with the average PSF. The simulations are performed on a synthetic image from the TAU agent data set [24], which is reconstructed at six accommodation depths from 2.5 D to 0 D. The results are illustrated in Fig. 6, with the peak signal-to-noise ratio (PSNR) and SSIM values reported as a quantitative comparison for each reconstruction depth. As can be seen from the figure, the amount of blur significantly increases in the conventional display as the accommodation distance gets further away from the lens image plane. The AI display achieves significantly higher image quality at such distances, especially at the main reconstruction depths, and the proposed method achieves slightly better results than the AI display. The main advantage of the proposed method over the AI display is that the optics is composed of static elements, which significantly reduces the system complexity. However, in both methods we observe degradation in the overall image quality compared to the focal depth (1.5 D) of the conventional display, due to the above-mentioned resolution-depth trade-off. The degradation is especially visible as a milky haze effect due to reduced contrast. One main source of such artifacts is the limitation enforced by the display dynamic range, as previously discussed. That is, the output of the preprocessing step is bounded within [0, 1], which in turn limits the boosting of high-frequency components in the image. Nevertheless, in the quantitative analysis the maximum SSIM value of the proposed method is comparable with the maximum SSIM of the conventional method.

4. CONCLUSION

We present a computational multifocal NED to address the VAC. The proposed method takes advantage of a co-designed (preprocessing) deconvolution and hybrid refractive-diffractive optics to create multiple accommodation distances. In addition, the resolution-depth trade-off inherent to our optical setup is optimized by selecting the focus depths with a 1 D interval, which satisfies the zone-of-comfort condition. A critical advantage of our computational multifocal NED is that we utilize static optical elements, which significantly reduces the system complexity and further eases the integration of the proposed algorithm into commercially available headsets. As future work, we plan to implement a prototype display and perform subjective experiments to rigorously characterize the display, especially in terms of the VAC.


5. REFERENCES

[1] David M. Hoffman, Ahna R. Girshick, Kurt Akeley, and Martin S. Banks, "Vergence–accommodation conflicts hinder visual performance and cause visual fatigue," Journal of Vision, vol. 8, no. 3, pp. 33–33, Mar. 2008.

[2] Hong Hua, "Enabling focus cues in head-mounted displays," Proceedings of the IEEE, vol. 105, no. 5, pp. 805–824, 2017.

[3] Gregory Kramida, "Resolving the vergence-accommodation conflict in head-mounted displays," IEEE Transactions on Visualization and Computer Graphics, vol. 22, pp. 1912–1931, 2015.

[4] Nitish Padmanaban, Robert Konrad, Tal Stramer, Emily A. Cooper, and Gordon Wetzstein, "Optimizing virtual reality for all users through gaze-contingent and adaptive focus displays," Proceedings of the National Academy of Sciences, vol. 114, no. 9, pp. 2183–2188, 2017.

[5] Kaan Akşit, Ward Lopes, Jonghyun Kim, Peter Shirley, and David Luebke, "Near-eye varifocal augmented reality display using see-through screens," ACM Trans. Graph., vol. 36, no. 6, Nov. 2017.

[6] Kurt Akeley, Simon J. Watt, Ahna Reza Girshick, and Martin S. Banks, "A stereo display prototype with multiple focal distances," ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 804–813, 2004.

[7] Gordon D. Love, David M. Hoffman, Philip J. W. Hands, James Gao, Andrew K. Kirby, and Martin S. Banks, "High-speed switchable lens enables the development of a volumetric stereoscopic display," Opt. Express, vol. 17, no. 18, pp. 15716–15725, Aug. 2009.

[8] Douglas Lanman and David Luebke, "Near-eye light field displays," ACM Transactions on Graphics (TOG), vol. 32, no. 6, pp. 1–10, 2013.

[9] F. Huang, K. Chen, and G. Wetzstein, "The light field stereoscope: Immersive computer graphics via factored near-eye light field displays with focus cues," ACM Trans. Graph. (SIGGRAPH), no. 4, 2015.

[10] Andrew Maimone, Andreas Georgiou, and Joel S. Kollin, "Holographic near-eye displays for virtual and augmented reality," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 1–16, 2017.

[11] Takahisa Ando, Koji Yamasaki, Masaaki Okamoto, Toshiaki Matsumoto, and Eiji Shimizu, "Retinal projection display using holographic optical element," in Practical Holography XIV and Holographic Materials VI. International Society for Optics and Photonics, 2000, vol. 3956, pp. 211–216.

[12] Robert Konrad, Nitish Padmanaban, Keenan Molner, Emily A. Cooper, and Gordon Wetzstein, "Accommodation-invariant computational near-eye displays," ACM Transactions on Graphics (TOG), vol. 36, no. 4, pp. 88, 2017.

[13] Vincent Sitzmann, Steven Diamond, Yifan Peng, Xiong Dun, Stephen Boyd, Wolfgang Heidrich, Felix Heide, and Gordon Wetzstein, "End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging," ACM Transactions on Graphics, vol. 37, no. 4, 2018.

[14] U. Akpinar, E. Sahin, and A. Gotchev, "Learning optimal phase-coded aperture for depth of field extension," in 2019 IEEE International Conference on Image Processing (ICIP), Sep. 2019, pp. 4315–4319.

[15] Anat Levin, Samuel W. Hasinoff, Paul Green, Frédo Durand, and William T. Freeman, "4D frequency analysis of computational cameras for depth of field extension," ACM Trans. Graph., vol. 28, no. 3, July 2009.

[16] Eyal Ben-Eliezer, Emanuel Marom, Naim Konforti, and Zeev Zalevsky, "Experimental realization of an imaging system with an extended depth of field," Appl. Opt., vol. 44, no. 14, pp. 2792–2798, May 2005.

[17] Daniel Carson, Warren Hill, Xin Hong, and Mutlu Karakelle, "Optical bench performance of AcrySof IQ ReSTOR, AT LISA tri, and FineVision intraocular lenses," Clinical Ophthalmology (Auckland, N.Z.), vol. 8, pp. 2105–2113, 2014.

[18] J. W. Goodman, Introduction to Fourier Optics, Roberts and Company Publishers, 2005.

[19] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.

[20] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.

[21] Hang Zhao, Orazio Gallo, Iuri Frosio, and Jan Kautz, "Loss functions for neural networks for image processing," arXiv preprint arXiv:1511.08861, 2015.

[22] Kaiming He, Jian Sun, and Xiaoou Tang, "Single image haze removal using dark channel prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 12, pp. 2341–2353, 2010.

[23] Hervé Jégou, Matthijs Douze, and Cordelia Schmid, "Hamming embedding and weak geometric consistency for large scale image search," in European Conference on Computer Vision. Springer, 2008, pp. 304–317.

[24] Harel Haim, Shay Elmalem, Raja Giryes, Alex Bronstein, and Emanuel Marom, "Depth estimation from a single image using deep learned phase coded mask," IEEE Transactions on Computational Imaging, pp. 298–310, 2018.

[25] Takashi Shibata, Joohwan Kim, David M. Hoffman, and Martin S. Banks, "The zone of comfort: Predicting visual discomfort with stereo displays," Journal of Vision, vol. 11, no. 8, pp. 11, 2011.
