
MACHINE LEARNING FOR SNAPSHOT HYPERSPECTRAL CAMERA

Bachelor’s Thesis
Faculty of Information Technology and Communication Sciences
Examiner: MSc. Ugur Akpinar
January 2022

ABSTRACT

Leevi Hietamäki: Machine Learning for Snapshot Hyperspectral Camera
Bachelor’s Thesis
Tampere University
Bachelor’s Degree Programme in Signal Processing and Machine Learning
January 2022

Coded aperture snapshot spectral systems encode a scene into a single compressive patch and then reconstruct a hyperspectral image of the scene from that patch. The reconstruction from the compressive patch to a hyperspectral image presents significant problems because the majority of the scene information is lost. Many reconstruction methods have been created to address this issue. Some of the modern methods use machine learning and data-driven priors to achieve the best reconstruction accuracy.

In this thesis, we present the main theories behind hyperspectral images, compressive sensing, and coded aperture snapshot spectral systems. Then we choose a state-of-the-art learning-based method which uses data-driven priors. We modify the original method and apply coded aperture optimization to it. We introduce the framework for the imaging model and the image reconstruction model of our proposed method. Then we evaluate the results against other state-of-the-art methods. Our proposed method shows superior results compared to the original state-of-the-art method in terms of quantitative metrics and perceptual quality.

Keywords: CASSI, Hyperspectral imaging, Image reconstruction, Coded Aperture Optimization

The originality of this thesis has been checked using the Turnitin OriginalityCheck service.


ABSTRACT (translated from the Finnish TIIVISTELMÄ)

Leevi Hietamäki: Utilizing Machine Learning in a Hyperspectral Snapshot Camera
Bachelor’s Thesis
Tampere University
Bachelor’s Degree Programme in Signal Processing and Machine Learning
January 2022

A coded aperture snapshot spectral imaging system encodes the scene into a single compressed patch and then reconstructs a hyperspectral image from that patch. Reconstructing a hyperspectral image from the compressed patch involves many challenges, because most of the data describing the scene has been lost. Many reconstruction methods have been proposed to address these problems. Some modern methods exploit machine learning and data-driven priors to achieve the best reconstruction accuracy.

In this thesis, we present the main theories behind hyperspectral images, compressive sensing, and coded aperture snapshot spectral imaging systems. We then selected a state-of-the-art machine learning method that uses data-driven priors. We modified this method and added coded aperture optimization to it. We present the structure of the imaging model and the reconstruction model for the modified method. We then compare the reconstruction results of our method with those of other state-of-the-art methods. We achieved good results in both spatial and colour accuracy. Our proposed modified method produces better results than the original state-of-the-art method, both in quantitative metrics and by visual inspection.

Keywords: CASSI, Hyperspectral imaging, Image reconstruction, Coded Aperture Optimization

The originality of this publication has been checked using the Turnitin OriginalityCheck service.


1. Introduction
2. Prerequisites
   2.1 Hyperspectral images
   2.2 Compressive Sensing
   2.3 CASSI
3. Related works
   3.1 Conventional methods
   3.2 Learning-based methods
4. Our method
   4.1 Prior network
   4.2 Reconstruction method
   4.3 Learning parameters
5. Results
   5.1 Side by side comparison
6. Conclusion
References


LIST OF SYMBOLS AND ABBREVIATIONS

CNN     Convolutional Neural Network
CASSI   Coded Aperture Snapshot Spectral Imaging
CS      Compressive Sensing
HS      Hyperspectral
HSI     Hyperspectral Imaging
NN      Neural Network
PSNR    Peak signal-to-noise ratio
RGB     Red Green Blue (refers to main colours)
SSIM    Structural similarity index measure


1. INTRODUCTION

Hyperspectral imaging (HSI) is an imaging technique that captures information in narrow bands across a broad spectral range. Ordinary RGB pictures use only three spectral bands, one for each primary colour, whereas hyperspectral (HS) images consist of tens or even hundreds of individual bands [1]. Each spectral band corresponds to a certain wavelength interval. Because HS images carry more information than regular RGB pictures, they are used in remote sensing, medical imaging, environmental monitoring, and astronomy [2, 3, 4, 5].

An HS image forms a three-dimensional data cube in which the first two dimensions contain the spatial data and the third dimension contains the spectral data of the image. The conventional ways to acquire these data cubes include point scanning, line scanning, and area scanning techniques. Each of these techniques captures a one- or two-dimensional segment of the data cube at a time, so additional scans along the remaining dimensions are required to capture the data cube completely [6].

Conventional techniques are usually time-consuming, and they collect light from incoherent sources inefficiently. To overcome these weaknesses, Wagadarikar et al. [7] proposed a new imager based on the principles of compressive sensing (CS), called coded aperture snapshot spectral imaging (CASSI). These imagers allow the 3D data cube to be captured with as few as one scan [7].

CASSI essentially modulates the incoming HS data with a structured random coded aperture to create a spatio-spectrally multiplexed monochromatic sensor image. The underlying HS image is then recovered using image priors and inverse problem solving methods inherited from the CS literature. In this bachelor’s thesis, we propose an algorithm based on machine learning to efficiently capture and reconstruct HS images within a CASSI system, and compare it against state-of-the-art methods.

In Chapter 2, we present the concepts of HS images, CS, and CASSI systems. Understanding these concepts is an important part of the theory behind the implemented solution. In Chapter 3, we review some of the existing reconstruction methods. Next, in Chapter 4, we propose our method for the reconstruction problem. In Chapter 5, we compare the reconstruction results of our algorithm against those of the other algorithms. We present the conclusions we draw from this project in Chapter 6.


2. PREREQUISITES

This chapter covers the essential foundations of HS images and then turns to compressive sensing theory and the operation of the CASSI system, which together form the basis of the presented work.

2.1 Hyperspectral images

A regular colour image is usually displayed with three channels. The channels usually correspond to the primary colours red, green, and blue, but in some cases they can represent other colours as well, as prescribed by the colour model in use [8]. HS images differ from regular colour images in that they consist of tens or even hundreds of channels, each corresponding to a specific range of wavelengths [1]. The high spectral resolution preserves features of the spectrum that are commonly used to distinguish between different materials or structures [9].

Hyperspectral data is usually represented as an $N$-dimensional vector for each pixel, where $N$ is the number of spectral channels [10]. When these vectors are arranged at the positions of the image pixels, they form a 3D data cube. These data cubes represent the HS images. Spectral imagers are used to capture and save scenes as 3D data cubes. A spectral imager captures the 3D spatio-spectral information, $I(x, y, \lambda)$, of a scene, where for each 2D location $(x, y)$ the light intensity is recorded as a function of wavelength $\lambda$ [11].

2.2 Compressive Sensing

CS is a sampling theory that can be used to reconstruct a signal from fewer samples than the Nyquist–Shannon theorem requires [12]. The Nyquist–Shannon theorem states that, for a signal to be reconstructed completely, the sampling frequency must be at least twice the highest frequency component present in the signal [13].

CS uses a linear measurement model $y = \Phi x$, where $\Phi$ is the measurement matrix and $x$ is the signal. We further represent $x$ in a sparse basis $\Psi$, i.e., $x = \Psi\alpha$, so that the model becomes $y = \Phi\Psi\alpha$. CS then aims at recovering the sparse or compressible set of coefficients $\alpha$ by minimizing the $\ell_0$ norm, which enforces sparsity:

$$\arg\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \Phi\Psi\alpha = y. \tag{2.1}$$

While the optimization above can recover the desired signal from compressive measurements, it is unfortunately an NP-hard problem [14], and the recovery is unreliable if the signal is noisy [12]. However, the problem can be relaxed to $\ell_1$ minimization, which is a convex optimization problem that can be solved using linear programming methods [15]. The $\ell_1$-norm minimization is:

$$\arg\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \Phi\Psi\alpha = y. \tag{2.2}$$
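The step from $\ell_1$ minimization to linear programming can be made explicit with a standard reformulation. The following is a sketch of that textbook construction for real-valued coefficients, not a formulation taken from the cited works; the auxiliary vector $u$ is introduced here purely for illustration and bounds the magnitude of each coefficient:

```latex
\begin{aligned}
\min_{\alpha,\, u} \quad & \sum_{i=1}^{N} u_i \\
\text{s.t.} \quad & -u_i \le \alpha_i \le u_i, \quad i = 1, \ldots, N, \\
& \Phi\Psi\alpha = y.
\end{aligned}
```

At the optimum each $u_i$ equals $|\alpha_i|$, so the objective value equals $\|\alpha\|_1$, while the objective and all constraints are linear in $(\alpha, u)$.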

CS rests on two main principles: sparsity and incoherence. Sparsity pertains to the signals of interest, and incoherence pertains to the sensing modality [16].

Sparsity exploits the fact that many natural signals are compressible or sparse, in the sense that they have concise representations when expressed in a proper basis $\Psi$ [16]. A signal of length $N$ is considered $K$-sparse if only $K \ll N$ of its coefficients are nonzero. The signal can be approximated as $K$-sparse if the coefficients $\alpha$, when sorted by magnitude, decay rapidly towards zero [12]. In that case most of the coefficients $\alpha$ are close to zero and only a few are large. The majority of the signal’s information is captured by the large coefficients, so all the small coefficients can be set to zero without much perceptual loss [16].
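The idea of discarding small coefficients can be illustrated with a minimal sketch. The function names and the coefficient values below are hypothetical and chosen only for illustration; the sketch keeps the $K$ largest-magnitude coefficients and measures how much of the signal is lost.

```python
def k_sparse_approximation(coefficients, k):
    """Keep the k largest-magnitude coefficients and set the rest to zero."""
    order = sorted(range(len(coefficients)),
                   key=lambda i: abs(coefficients[i]), reverse=True)
    kept = set(order[:k])
    return [c if i in kept else 0.0 for i, c in enumerate(coefficients)]


def relative_error(original, approximation):
    """Euclidean norm of the difference relative to the norm of the original."""
    num = sum((a - b) ** 2 for a, b in zip(original, approximation))
    den = sum(a ** 2 for a in original)
    return (num / den) ** 0.5


# Hypothetical compressible coefficient vector: two large entries, the rest small.
alpha = [5.0, -3.0, 0.2, 0.1, -0.05, 0.01]
alpha_k = k_sparse_approximation(alpha, 2)
error = relative_error(alpha, alpha_k)
```

For a vector whose sorted coefficients decay quickly, as in this made-up example, the relative error of the 2-sparse approximation stays small even though four of the six coefficients were dropped.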

Coherence measures the largest correlation between any two elements of the sensing basis $\Phi$ and the representation basis $\Psi$ [17]. The coherence between $\Phi$ and $\Psi$ is large if they contain correlated elements. If $\Phi$ is an $m \times n$ matrix with rows $\Phi_1, \ldots, \Phi_m$ and $\Psi$ is an $n \times n$ matrix with columns $\Psi_1, \ldots, \Psi_n$, the mutual coherence $\mu$ is defined as

$$\mu(\Phi, \Psi) = \sqrt{n} \max_{k, j} |\langle \Phi_k, \Psi_j \rangle| \tag{2.3}$$

for $1 \le j \le n$ and $1 \le k \le m$. As for the range of the coherence, it follows from linear algebra that

$$1 \le \mu(\Phi, \Psi) \le \sqrt{n}. \tag{2.4}$$

The lower the coherence between $\Psi$ and $\Phi$, the fewer samples are required to recover the signal [18]. It is possible to use random matrices as the sensing basis, because random matrices are largely incoherent with any fixed basis $\Psi$ [16].
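The two extremes of the coherence bound can be checked numerically with a small sketch. It assumes the inner-product form of the mutual coherence with unit-norm rows and columns; the choice of the spike (identity) basis and a normalized $4 \times 4$ Hadamard basis is an illustrative assumption, not an example taken from the cited works.

```python
import math


def mutual_coherence(phi_rows, psi_cols):
    """sqrt(n) times the largest |<phi_k, psi_j>| over all row/column pairs."""
    n = len(psi_cols[0])
    largest = max(
        abs(sum(p * q for p, q in zip(row, col)))
        for row in phi_rows
        for col in psi_cols
    )
    return math.sqrt(n) * largest


# Spike basis: each row picks out one sample.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Normalized 4x4 Hadamard basis; it is symmetric, so rows and columns coincide.
hadamard = [[0.5 * s for s in row] for row in (
    (1, 1, 1, 1), (1, -1, 1, -1), (1, 1, -1, -1), (1, -1, -1, 1))]

low = mutual_coherence(identity, hadamard)   # maximally incoherent pair
high = mutual_coherence(identity, identity)  # maximally coherent pair
```

The spike and Hadamard bases reach the lower bound of 1, while pairing a basis with itself reaches the upper bound $\sqrt{n} = 2$.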

The correct basis must be chosen in order to compress the signal of interest. For example, some signals are compressible in a Fourier basis and others in a wavelet basis [12]. If both of the previous conditions are met for the signal, all the information [...] even when the measurements are noisy [12]. The number of measurements can be further decreased by measuring only the large coefficients of the signal. For this to be possible, the locations of the large coefficients must be known a priori [19].

2.3 CASSI

Coded Aperture Snapshot Spectral Imaging (CASSI) is an imager that is used to capture HS images. CASSI is one of the most promising solutions for capturing HS images [20].

Whereas conventional approaches to capturing HS images require at least as many measurements as there are elements in the reconstructed data cube, CASSI is able to capture the scene with a few or, in some cases, a single focal plane array measurement [7, 21]. The main advantage of CASSI is therefore that it captures a scene with far fewer measurements than the number of elements in the reconstructed data cube. To achieve this, CASSI relies on CS to reconstruct the data cube from fewer samples.

A CASSI system consists of a detector array, one or two dispersive elements, a coded aperture, and a few lenses and other optics. The imaging (objective) lenses form an image of the scene in the plane of the coded aperture. The coded aperture then modulates the spatial data over all wavelengths according to the coded design. The coded aperture adheres to the incoherence principle of CS theory, as the coded design is a random matrix and random matrices are mostly incoherent with any fixed basis [16]. After that, the light passes through the dispersive elements, which produce multiple images of the code-modulated data at wavelength-dependent locations on the detector array. The detector captures two-dimensional data that is a multiplexed projection of the 3D data depicting the scene [7]. These detectors are usually the focal plane arrays mentioned in the previous paragraph [21]. The structure of the CASSI imagers is shown in Figure 2.1.

Figure 2.1. Architecture of single and dual disperser CASSI. Single disperser CASSI is on the top and dual disperser CASSI is on the bottom.

The sensing model of the CASSI system can be described as follows. To keep the model simple, it does not take into account the optical distortions introduced by the optics or lenses in the system. The spectral density of a scene can be represented as $F_0(x, y, \lambda)$. The scene’s spectral density is relayed to the coded aperture, which can be represented as $T(x, y)$. The spectral density just after the coded aperture is [7, 21]

$$F_1(x, y, \lambda) = T(x, y) F_0(x, y, \lambda). \tag{2.6}$$

After that, the dispersive elements shear the spectral density along the horizontal axis, following the dispersion function $\phi(\lambda)$. Finally, the light intensity is captured by the detector. In the case of a single disperser CASSI (Figure 2.1, top), the light intensity on the detector, $G(x, y)$, can be represented as an integral over all visible wavelengths [22]:

$$G(x, y) = \int_{\lambda} T(x + \phi(\lambda), y) F_0(x + \phi(\lambda), y, \lambda)\, d\lambda. \tag{2.7}$$

In the case of a dual disperser CASSI (Figure 2.1, bottom), the spectral density is unsheared by the dispersion function of the second dispersive element, which acts in the opposite direction [22]. The light intensity on the detector can in this case be represented as:

$$G(x, y) = \int_{\lambda} T(x - \phi(\lambda), y) F_0(x, y, \lambda)\, d\lambda. \tag{2.8}$$
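A discrete version of the single disperser model in Equation 2.7 can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the integral is replaced by a sum over spectral bands, the dispersion is taken as $\phi(\lambda_k) = -k$ pixels (band $k$ lands $k$ detector columns to the right), and the detector is widened so that nothing is cropped. The function name and data layout are hypothetical.

```python
def cassi_single_disperser(cube, mask):
    """Simulate a single disperser CASSI measurement.

    cube[k][y][x] is band k of the scene and mask[y][x] is the coded aperture.
    Each band is multiplied by the mask, shifted by k pixels along x, and the
    shifted bands are summed on the detector.
    """
    bands, rows, cols = len(cube), len(cube[0]), len(cube[0][0])
    detector = [[0.0] * (cols + bands - 1) for _ in range(rows)]
    for k in range(bands):
        for y in range(rows):
            for x in range(cols):
                detector[y][x + k] += mask[y][x] * cube[k][y][x]
    return detector


# Tiny made-up scene: 2 bands of 2x2 pixels.
cube = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
open_mask = [[1.0, 1.0], [1.0, 1.0]]
coded_mask = [[1.0, 0.0], [0.0, 1.0]]

g_open = cassi_single_disperser(cube, open_mask)
g_coded = cassi_single_disperser(cube, coded_mask)
```

With the fully open mask, neighbouring bands overlap on the detector and cannot be told apart, whereas the coded mask blocks some pixels so that each detector value carries a known pattern of contributions.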


3. RELATED WORKS

The reconstruction methods can be roughly divided into two groups: conventional and learning-based methods. In this chapter, we briefly discuss the existing compressive hyperspectral imaging reconstruction methods from both groups against which we validate our results. We give a more detailed introduction to the method of Wang et al. [20] and to our own method in the next chapter.

3.1 Conventional methods

Conventional methods generally define a data fidelity term and a total variation ($\ell_1$-norm) regularization term that promotes sparsity of gradients, in order to mitigate the spatial-spectral trade-off during reconstruction [22]. These methods usually rely on HS image priors that are handcrafted based on empirical observation [20].

We have chosen SpaRSA and DeSCI to represent the conventional methods [23, 24]. SpaRSA is an algorithmic framework for finding sparse approximate solutions to the basis pursuit denoising problem [25]. Because CS relies on the principle of sparsity, HS images can be recovered as the solution to this problem when the compressive patch is given as input.

The DeSCI algorithm aims at recovering the original data from snapshot compressive imaging measurements using a rank minimization approach. While DeSCI supports two applications, namely video compressive imaging and HS compressive imaging, we are only interested in the HS compressive imaging application, which can be applied to CASSI systems.

3.2 Learning-based methods

Like our method, other learning-based methods use deep learning to reconstruct HS images from the compressive patches. The best learning-based methods reconstruct HS images with better accuracy in both the spatial and spectral dimensions than the conventional computational methods, but training the model requires a diverse dataset and a lot of computing power.

We have chosen the DeepCASSI and HSCNN methods to represent the learning-based methods [22, 26]. DeepCASSI is an image reconstruction algorithm that uses a two-step process to reconstruct HS images. The first step is a convolutional autoencoder, and the second step reconstructs the HS images by solving an optimization problem. The reconstruction step uses a data-driven prior term obtained from the convolutional autoencoder [22].

HSCNN is used to recover HS images from spectrally undersampled projections. These projections can be RGB images or compressed sensing measurements from CASSI systems, but we are only interested in recovery from the CS measurements of CASSI systems. The CASSI measurements must be upsampled with a simple CS reconstruction method before the deep learning enhancement. The original method uses the TwIST [27] algorithm, but we opted for SpaRSA for the upsampling [26].


4. OUR METHOD

This work is based on a previous state-of-the-art HSI algorithm proposed by Wang et al. [20]. We improve the performance of the reconstruction network by simulating and jointly optimizing the measurement model. In this chapter, the original algorithm and the proposed modifications are discussed.

The CASSI system captures the 3D HS scene as a 2D compressive patch. Equation 2.7 represents the image formation as an integral over the spectral wavelength λ. The observation model can also be expressed in matrix-vector form as [20]:

$$g = \Phi f, \tag{4.1}$$

where $g$ and $f$ are the vectorized representations of the compressive image and the HS image, respectively, and $\Phi$ is the measurement matrix of the CASSI system.

Figure 4.1 presents the relationship between a binary coded aperture $T$ and the measurement matrix $\Phi$. In this example, the HS image has a spatial dimension of $3 \times 3$ and 3 spectral channels. As seen in Figure 4.1, the coded aperture is first reshaped into vectorized form and then duplicated in the horizontal direction, each copy with a uniform shift in the vertical direction and without padding. The number of repetitions equals the number of spectral channels in the HS image [28].
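The structure described above can be sketched for the $3 \times 3 \times 3$ example. This is an illustration under stated assumptions rather than the exact construction of the cited works: the aperture and each band are vectorized column by column, each band is dispersed by one detector column relative to the previous one (a vertical shift of `rows` entries in $\Phi$), and the detector has `cols + bands - 1` columns. The function name and the mask values are hypothetical.

```python
def measurement_matrix(mask, bands):
    """Build Phi so that g = Phi f for a single disperser CASSI model.

    Each band contributes one block of columns holding the vectorized
    aperture on a diagonal; block k is shifted down by k * rows rows.
    """
    rows, cols = len(mask), len(mask[0])
    n_pix = rows * cols
    phi = [[0] * (n_pix * bands) for _ in range(rows * (cols + bands - 1))]
    for k in range(bands):
        for x in range(cols):
            for y in range(rows):
                src = k * n_pix + x * rows + y  # position in f (column-major per band)
                dst = (x + k) * rows + y        # position in g, shifted by k columns
                phi[dst][src] = mask[y][x]
    return phi


# Made-up binary coded aperture for the 3x3 example with 3 spectral channels.
mask = [[1, 0, 1],
        [0, 1, 0],
        [1, 1, 0]]
phi = measurement_matrix(mask, bands=3)
nonzeros = sum(value != 0 for row in phi for value in row)
```

Here $\Phi$ has 15 rows and 27 columns, every column holds at most one nonzero entry, and the same aperture pattern appears once per spectral channel, shifted down by three rows each time.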

4.1 Prior network

From a Bayesian perspective, the reconstruction problem can be solved by solving the following minimization problem:

$$\hat{f} = \arg\min_{f} \|g - \Phi f\|^2 + \tau R(f), \tag{4.2}$$

where $\tau$ is the balancing parameter and $R(\cdot)$ is the HS image prior.

Figure 4.1. The relationship between the coded aperture and the measurement matrix, where $M \times N$ is the spatial dimension of the coded aperture and $l$ is the number of spectral channels.

Wang et al. proposed that it is not necessary to model the prior term explicitly in order to solve the minimization problem. Instead, they suggested learning a solver for the image prior $R(\cdot)$. The benefit of modelling the image prior this way is that it introduces nonlinearity and is more accurate than explicit hand-crafted priors [20]. To obtain this solver, a variable splitting technique is used to decouple the data and regularization terms in Equation 4.2. A new auxiliary variable $h$ is introduced:

$$\hat{f} = \arg\min_{f} \|g - \Phi f\|^2 + \tau R(h), \quad \text{s.t.} \quad h = f. \tag{4.3}$$

With the new variable, half quadratic splitting can be used to convert the problem above into an unconstrained optimization problem, where $\eta$ is a penalty parameter:

$$(\hat{f}, \hat{h}) = \arg\min_{f, h} \|g - \Phi f\|^2 + \eta \|h - f\|^2 + \tau R(h). \tag{4.4}$$

The equation above can then be split into two subproblems:

$$f^{(k+1)} = \arg\min_{f} \|g - \Phi f\|^2 + \eta \|h^{(k)} - f\|^2, \tag{4.5}$$

$$h^{(k+1)} = \arg\min_{h} \|h - f^{(k+1)}\|^2 + \frac{\tau}{\eta} R(h). \tag{4.6}$$

Because of the half quadratic splitting, the image prior $R(\cdot)$ and the observation model $\Phi$ are separated from each other. The prior term now appears only in the form of a proximal operator in the $h$-subproblem in Equation 4.6. The proposed solver $S(\cdot)$ for the proximal operator, implemented as an HS image prior network, is:

$$h^{(k+1)} = S(f^{(k+1)}). \tag{4.7}$$

Figure 4.2. Neural network architecture of the method. The upper part shows the structure of the reconstruction network, and the lower part, with the cyan background, shows the prior neural network.

Wang et al. followed two insights when designing their HS image prior network. First, the network should be able to exploit spectral and spatial correlation simultaneously. Second, the network should be as simple as possible to ease training. Following these insights, the image prior network was divided into two parts: a spatial network and a spectral network. The spatial network consists of two convolutional layers with a rectified linear unit between them. The first convolutional layer uses $3 \times 3 \times \Lambda$ filters and yields $L$ features, and the second convolutional layer uses $3 \times 3 \times L$ filters and yields $\Lambda$ features. The spectral network consists of a single convolutional layer that uses $1 \times 1 \times \Lambda$ filters [20]. The prior network architecture can also be seen in Figure 4.2.

4.2 Reconstruction method

To reconstruct the HS images, Wang et al. suggest uniting Equations 4.5 and 4.6 to reconnect the observation model and the image prior. They utilize the conjugate gradient algorithm to solve Equation 4.4 [20]. The $f$-subproblem can now be expressed as:

$$f^{(k+1)} = f^{(k)} - \epsilon\left[\Phi^T(\Phi f^{(k)} - g) + \eta(f^{(k)} - h^{(k)})\right] = \bar{\Phi} f^{(k)} + \epsilon f^{(0)} + \epsilon\eta h^{(k)}, \tag{4.8}$$

where $\epsilon$ is the step size of the gradient descent, $f^{(0)} = \Phi^T g$ is the initialization, and $\bar{\Phi} = (1 - \epsilon\eta)I - \epsilon\Phi^T\Phi$, where $I$ is an identity matrix. The two subproblems are then united by substituting Equation 4.7 into Equation 4.8:

$$f^{(k+1)} = \bar{\Phi} f^{(k)} + \epsilon f^{(0)} + \epsilon\eta S(f^{(k)}). \tag{4.9}$$
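The update in Equation 4.9 can be sketched with plain lists to show how the stages chain together and to check numerically that the two forms of Equation 4.8 agree. This is a simplified illustration, not the network itself: the learned prior network $S(\cdot)$ is replaced by an arbitrary callable, $\epsilon$ and $\eta$ are shared across stages instead of being learned per stage, and all function names are hypothetical.

```python
def matvec(matrix, vector):
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def gradient_step(phi, g, f, h, eps, eta):
    """Gradient form: f - eps * [Phi^T (Phi f - g) + eta * (f - h)]."""
    residual = [r - gi for r, gi in zip(matvec(phi, f), g)]
    grad_data = matvec(transpose(phi), residual)
    return [fi - eps * (gd + eta * (fi - hi))
            for fi, gd, hi in zip(f, grad_data, h)]


def linear_step(phi, g, f, h, eps, eta):
    """Linear form: Phi_bar f + eps * f0 + eps * eta * h."""
    phi_t = transpose(phi)
    f0 = matvec(phi_t, g)
    gram_f = matvec(phi_t, matvec(phi, f))
    phi_bar_f = [(1 - eps * eta) * fi - eps * p for fi, p in zip(f, gram_f)]
    return [a + eps * b + eps * eta * c for a, b, c in zip(phi_bar_f, f0, h)]


def unrolled_reconstruction(phi, g, prior, eps, eta, stages):
    """Start from f0 = Phi^T g and apply the Equation 4.9 update `stages` times."""
    f = matvec(transpose(phi), g)
    for _ in range(stages):
        f = linear_step(phi, g, f, prior(f), eps, eta)
    return f


phi = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
g = [2.0, 3.0]
f, h = [1.0, -1.0, 0.5], [0.0, 1.0, 2.0]
step_a = gradient_step(phi, g, f, h, eps=0.1, eta=0.3)
step_b = linear_step(phi, g, f, h, eps=0.1, eta=0.3)
```

Calling `unrolled_reconstruction(phi, g, prior=lambda f: f, eps=0.1, eta=0.3, stages=9)` runs nine stages with an identity placeholder in place of the prior network; in the actual method each stage would have its own trained prior network and parameters.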


The deep neural network (NN) is designed to solve Equation 4.9 and to reconstruct the HS image. The network consists of several of the previously introduced prior networks, connected to each other in a feed-forward manner. Each section containing one prior network is called a stage. Wang et al. experimented with different numbers of stages and concluded that the reconstruction accuracy does not improve significantly after 9 stages, which is why we also use 9 stages in our tests [20].

The network takes the compressive patch $g$ as input, which is then transformed by the transpose of the measurement matrix, $\Phi^T$, and fed into a linear layer. The output of that layer is reshaped into an oblique parallelepiped cube, which is the initial HS image estimate $f^{(0)} = \Phi^T g$. For every stage $k$, the updated result $f^{(k)}$ is composed of three parts, as Equation 4.9 suggests. The first part is derived from the previous result $f^{(k-1)}$, which is given as input to the prior network and then weighted by $\epsilon\eta$. The second part also comes from the previous result, which is multiplied by $\bar{\Phi}$ and fed into a linear layer. The final part is a skip connection to the initial HS image estimate $f^{(0)}$, weighted by $\epsilon$ [20]. The architecture can also be seen in Figure 4.2.

4.3 Learning parameters

The original algorithm learned the network parameters $\Theta$ and the optimization parameters $\epsilon$ and $\eta$ through end-to-end training. All of these parameters are distinct for each stage. As the stage index increases, the reconstruction quality improves, so both the optimization parameters and the network parameters should adapt accordingly [20]. In our method, we have added an extra optimization parameter to the implementation: the coded aperture $T$ of the CASSI system. In our experiments, this improves the reconstruction quality compared to the original model. The downside of treating the coded aperture as an optimization parameter is that it restricts the reconstructed image to the same size as the coded aperture. In our case, we used a coded aperture of $256 \times 256$ pixels, which is the same size as the reconstructed images.

The coded aperture in the algorithm of Wang et al. was a random binary matrix drawn from a Bernoulli distribution [20]. Alongside the binary coded aperture, we also experimented with a uniformly distributed grayscale coded aperture. To keep the aperture physically realizable, we kept the grayscale coded aperture values between zero and one.
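The two kinds of initial aperture, and one possible way of keeping a trainable aperture within $[0, 1]$, can be sketched as follows. Several details are assumptions made for illustration only: the Bernoulli probability of 0.5, the use of clipping after each update, and a plain gradient step without momentum. The text does not state how the value range was enforced, and the function names are hypothetical.

```python
import random


def binary_aperture(rows, cols, p=0.5, seed=0):
    """Random binary mask: each element is 1 with probability p (assumed 0.5)."""
    rng = random.Random(seed)
    return [[1.0 if rng.random() < p else 0.0 for _ in range(cols)]
            for _ in range(rows)]


def grayscale_aperture(rows, cols, seed=0):
    """Random grayscale mask with values drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]


def update_aperture(mask, grad, learning_rate=0.5):
    """One plain gradient step on the mask, clipped back into [0, 1]."""
    return [[min(1.0, max(0.0, t - learning_rate * g))
             for t, g in zip(mask_row, grad_row)]
            for mask_row, grad_row in zip(mask, grad)]


binary = binary_aperture(4, 4)
gray = grayscale_aperture(4, 4)
# Made-up gradient, only to show that the update stays in the valid range.
fake_grad = [[3.0, -3.0, 0.0, 0.1] for _ in range(4)]
updated = update_aperture(gray, fake_grad)
```

The default step size of 0.5 mirrors the coded aperture learning rate listed in Table 4.1; everything else in the update rule is a placeholder for whatever the training framework actually does.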

The network is trained with an MSE-based loss function. If we express the set of training samples as parallelepiped cubes $f^{(l)}$ and the corresponding compressive [...] where $\delta(\cdot)$ denotes the output of the network given the input and the parameters.

The network was implemented using MatConvNet, and training minimizes the loss function in Equation 4.10. The optimization parameters $\epsilon$ and $\eta$ are initialized to zero. The other training options are listed in Table 4.1.

Option                          Value
Epochs                          150
Batch size                      4
Momentum                        0.9
Learning rate                   $10^{-3}$
Weight decay                    $10^{-4}$
Coded aperture learning rate    0.5

Table 4.1. Training options for our method.


5. RESULTS

In this chapter, we evaluate our reconstruction method against the reconstruction methods mentioned in Chapter 3. We used a single disperser CASSI system for every reconstruction method. The training images are constructed from the KAIST dataset [29], which has 31 spectral channels. The image patch size is set to $256 \times 256$. Further data augmentation is done by rotating the images randomly, which resulted in a total of 1890 HS image patches. We trained all the learning-based methods with the same dataset for a fair comparison.

5.1 Side by side comparison

We compare our method against the original method and the three state-of-the-art methods previously mentioned in Chapter 3. We tried to keep the state-of-the-art methods as close to the originals as possible, but due to computational constraints we had to change some parameters to make training these networks feasible. We were able to use the hand-crafted prior-based methods SpaRSA and DeSCI without any changes, so their results should reflect their reconstruction properties most faithfully. For the two learning-based methods, we had to make some changes to their parameters, which might affect their results.

The changes that had to be made because of the computational limitations are listed here. The original Wang et al. algorithm used a patch size of $64 \times 64$ and a mini-batch size of 64. Due to memory limitations, we changed these to $256 \times 256$ and 4, respectively. These are also the values we use in our method, for a better comparison.

The reconstruction methods were evaluated with two image quality metrics: peak signal-to-noise ratio (PSNR), measured in dB, and structural similarity index measure (SSIM). PSNR and SSIM are measured on every 2D spatial image. These metrics express the spatial fidelity between the reconstructed HS image and the original HS image. Larger values of PSNR and SSIM indicate better performance of the reconstruction method. To inspect the perceptual quality of each reconstruction method, the reconstructed HS images needed to be visualized in the RGB colour model. The HS images were converted to RGB images with a colour matching function [30].
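A per-band PSNR evaluation can be sketched as follows. The sketch uses the standard PSNR definition, $10 \log_{10}(\text{peak}^2 / \text{MSE})$, and makes two assumptions that the text does not state: image values are normalized so that the peak value is 1, and the per-band scores are combined by a plain average. The function names are hypothetical, and SSIM is omitted because it needs considerably more code.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between two 2D images given as nested lists."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_band_psnr(reference_cube, estimate_cube, peak=1.0):
    """Average the per-band PSNR over all spectral bands (cube[k][y][x])."""
    scores = [psnr(r, e, peak) for r, e in zip(reference_cube, estimate_cube)]
    return sum(scores) / len(scores)


# Made-up data: a uniform error of 0.1 in every pixel of a single 2x2 band.
truth = [[[0.0, 0.0], [0.0, 0.0]]]
recon = [[[0.1, 0.1], [0.1, 0.1]]]
score = mean_band_psnr(truth, recon)
```

A uniform error of 0.1 on a unit peak corresponds to an MSE of 0.01 and hence a PSNR of about 20 dB, which gives a sense of scale for the values reported in Figure 5.1.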


(26.35, 0.901) (27.51, 0.894) (31.97, 0.953) (27.10, 0.884) (27.79, 0.910)

(23.40, 0.724) (19.59, 0.752) (29.11, 0.867) (25.42, 0.763) (27.04, 0.817)

Figure 5.1. Visual quality comparison of the reconstruction methods. Each column corresponds to the algorithm named at the top. The PSNR (dB) and SSIM values are shown in parentheses for each method.

It can be seen from Figure 5.1 that the learning-based methods outperform the computational methods in both spatial and spectral accuracy. It must be noted that the computational methods use handcrafted priors whereas the learning-based methods use data-driven priors, which indicates the superiority of data-driven priors over handcrafted ones. Unfortunately, the method of Wang et al. was outperformed by DeepCASSI. We attribute this to the different training settings that we had to use, because according to the corresponding publications [20, 22] their results should be very close to each other. It is also clear that our proposed method outperforms the original method of Wang et al.: we achieved better results in both image quality metrics.

The spectral reflectances of the reconstruction methods were also evaluated, and they are shown in Figure 5.2. The learning-based methods show their superiority in spectral accuracy as well. To better compare our proposed method with the original method, we calculated the mean squared error (MSE) of the reflectances. For Figure 5.2 (a), our proposed method had an MSE of $5.6833 \cdot 10^{-4}$ and the method of Wang et al. an MSE of $3.9264 \cdot 10^{-4}$. For Figure 5.2 (b), our proposed method had an MSE of $2.5160 \cdot 10^{-4}$ and the method of Wang et al. an MSE of $1.5918 \cdot 10^{-4}$. The original method thus achieved better results than ours in this comparison.

Figure 5.2. Spectral accuracy comparison. Panels (a) and (b) plot reflectance against wavelength (400–700 nm) for GT, DeepCASSI, Our, DeSCI, Wang et al., and SpaRSA. The points in image (a) correspond to a red pixel on the toy’s nose from Figure 5.1 and the points in image (b) correspond to a blue pixel on the toy’s pants.

It seems that our method produces smoother transitions between the colour channels, which results in inaccuracies when the ground truth has rapid changes in its spectrum, as is the case for the two test pixels that we chose for this test. Nevertheless, our method seems to have better colour representation when converted to RGB images, as seen in Figure 5.1.


6. CONCLUSION

In this study, we have modified one of the state-of-the-art CASSI reconstruction methods to improve its accuracy. The modification was done by optimizing the coded aperture structure, that is, by making the coded aperture one of the optimization parameters of the CNN. Our proposed method outperformed the original method in both spatial and spectral accuracy, although the original method achieved better results in the colour spectrum test.

Unfortunately, we were not able to demonstrate the original method’s full potential due to the computational limitations. We assume that the modifications would work the same way with the original method without the changed parameters, even though the improvement might not be on the same scale as in our results.

Even with these promising results, much remains to be achieved in CASSI reconstruction. This research can be continued by implementing a prototype CASSI system with the optimized coded aperture and verifying the simulation results in the real world. Our coded aperture optimization should also be tested with other reconstruction methods, e.g., DeepCASSI, as it performed the best in our simulations. In addition, implementing a novel neural network architecture to work with our coded aperture optimization might be another direction for future research.


REFERENCES

[1] Li, S., Qiu, J., Yang, X., Liu, H., Wan, D. and Zhu, Y. A novel approach to hyperspectral band selection based on spectral shape similarity analysis and fast branch and bound search. Engineering Applications of Artificial Intelligence 27 (2014), pp. 241–250.

[2] Eismann, M. Hyperspectral remote sensing. Society of Photo-Optical Instrumentation Engineers. 2012.

[3] Liu, Z., Wang, H. and Li, Q. Tongue tumor detection in medical hyperspectral images. Sensors 12.1 (2012), pp. 162–174.

[4] Stuart, M. B., McGonigle, A. J. and Willmott, J. R. Hyperspectral imaging in environmental monitoring: a review of recent developments and technological advances in compact field deployable systems. Sensors 19.14 (2019), p. 3071.

[5] Rodet, T., Orieux, F., Giovannelli, J.-F. and Abergel, A. Data inversion for hyperspectral objects in astronomy. 2009 First Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing. IEEE. 2009, pp. 1–4.

[6] ElMasry, G. and Sun, D.-W. Principles of hyperspectral imaging technology. Hyperspectral imaging for food quality analysis and control. Elsevier, 2010, pp. 3–43.

[7] Wagadarikar, A., John, R., Willett, R. and Brady, D. Single disperser design for coded aperture snapshot spectral imaging. Applied optics 47.10 (2008), B44–B51.

[8] Ibraheem, N. A., Hasan, M. M., Khan, R. Z. and Mishra, P. K. Understanding color models: a review. ARPN Journal of science and technology 2.3 (2012), pp. 265–275.

[9] Shaw, G. and Manolakis, D. Signal processing for hyperspectral image exploitation. IEEE Signal processing magazine 19.1 (2002), pp. 12–16.

[10] Landgrebe, D. Hyperspectral image data analysis. IEEE Signal processing magazine 19.1 (2002), pp. 17–28.

[11] Wagadarikar, A. A., Pitsianis, N. P., Sun, X. and Brady, D. J. Video rate spectral imaging using a coded aperture snapshot spectral imager. Optics express 17.8 (2009), pp. 6368–6388.

[12] Baraniuk, R. G., Cevher, V., Duarte, M. F. and Hegde, C. Model-based compressive sensing. IEEE Transactions on information theory 56.4 (2010), pp. 1982–2001.

[13] Luke, H. D. The origins of the sampling theorem. IEEE Communications Magazine 37.4 (1999), pp. 106–108.

[14] Natarajan, B. K. Sparse approximate solutions to linear systems. SIAM journal on computing 24.2 (1995), pp. 227–234.

(24)

signal processing magazine25.2 (2008), pp. 21–30.

[17] Donoho, D. L. and Huo, X. Uncertainty principles and ideal atomic decomposition.

IEEE transactions on information theory 47.7 (2001), pp. 2845–2862.

[18] Qaisar, S., Bilal, R. M., Iqbal, W., Naureen, M. and Lee, S. Compressive sensing:

From theory to applications, a survey. Journal of Communications and networks 15.5 (2013), pp. 443–456.

[19] Fornasier, M. and Rauhut, H. Compressive Sensing. Handbook of mathematical methods in imaging 1 (2015), pp. 187–229.

[20] Wang, L., Sun, C., Fu, Y., Kim, M. H. and Huang, H. Hyperspectral image recon- struction using a deep spatial-spectral prior.Proceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition. 2019, pp. 8032–8041.

[21] Arce, G. R., Brady, D. J., Carin, L., Arguello, H. and Kittle, D. S. Compressive coded aperture spectral imaging: An introduction.IEEE Signal Processing Magazine31.1 (2013), pp. 105–115.

[22] Choi, I., Kim, M., Gutierrez, D., Jeon, D. and Nam, G. High-quality hyperspectral reconstruction using a spectral prior. Tech. rep. 2017.

[23] Wright, S. J., Nowak, R. D. and Figueiredo, M. A. Sparse reconstruction by separa- ble approximation.IEEE Transactions on signal processing57.7 (2009), pp. 2479–

2493.

[24] Liu, Y., Yuan, X., Suo, J., Brady, D. J. and Dai, Q. Rank minimization for snapshot compressive imaging. IEEE transactions on pattern analysis and machine intelli- gence41.12 (2018), pp. 2990–3006.

[25] Gill, P. R., Wang, A. and Molnar, A. The in-crowd algorithm for fast basis pursuit denoising.IEEE Transactions on Signal Processing59.10 (2011), pp. 4595–4605.

[26] Xiong, Z., Shi, Z., Li, H., Wang, L., Liu, D. and Wu, F. Hscnn: Cnn-based hyper- spectral image recovery from spectrally undersampled projections.Proceedings of the IEEE International Conference on Computer Vision Workshops. 2017, pp. 518–

525.

[27] Bioucas-Dias, J. M. and Figueiredo, M. A. A new TwIST: Two-step iterative shrink- age/thresholding algorithms for image restoration.IEEE Transactions on Image pro- cessing 16.12 (2007), pp. 2992–3004.

[28] Wang, L., Zhang, T., Fu, Y. and Huang, H. Hyperreconnet: Joint coded aperture op- timization and image reconstruction for compressive hyperspectral imaging. IEEE Transactions on Image Processing28.5 (2018), pp. 2257–2270.

[29] Choi, I., Jeon, D. S., Nam, G., Gutierrez, D. and Kim, M. H. High-Quality Hyperspec- tral Reconstruction Using a Spectral Prior. ACM Transactions on Graphics (Proc.

(25)

SIGGRAPH Asia 2017)36.6 (2017), 218:1–13.DOI:10.1145/3130800.3130810.

URL:http://dx.doi.org/10.1145/3130800.3130810.

[30] Magnusson, M., Sigurdsson, J., Armansson, S. E., Ulfarsson, M. O., Deborah, H.

and Sveinsson, J. R. Creating RGB images from hyperspectral images using a color matching function. IGARSS 2020-2020 IEEE International Geoscience and Remote Sensing Symposium. IEEE. 2020, pp. 2045–2048.
