Dissertations in Forestry and Natural Sciences
DISSERTATIONS | JYRKI JAUHIAINEN | NON-SMOOTH OPTIMIZATION AND REGULARIZATION FOR EIT | No 461
JYRKI JAUHIAINEN
NON-SMOOTH OPTIMIZATION AND REGULARIZATION FOR ELECTRICAL IMPEDANCE TOMOGRAPHY
PUBLICATIONS OF
THE UNIVERSITY OF EASTERN FINLAND
PUBLICATIONS OF THE UNIVERSITY OF EASTERN FINLAND DISSERTATIONS IN FORESTRY AND NATURAL SCIENCES
N:o 461
Jyrki Jauhiainen
NON-SMOOTH OPTIMIZATION AND REGULARIZATION FOR ELECTRICAL
IMPEDANCE TOMOGRAPHY
ACADEMIC DISSERTATION
To be presented by the permission of the Faculty of Science and Forestry for public examination in the Auditorium SN2000 Snellmania Building at the University of Eastern Finland, Kuopio, on January 28th, 2022, at 12 o’clock.
University of Eastern Finland Department of Applied Physics
Kuopio 2022
PunaMusta Oy Joensuu, 2022
Editors: Pertti Pasanen, Jukka Tuomela, Matti Tedre, Raine Kortet,
and Nina Hakulinen
Distribution:
University of Eastern Finland Library / Sales of publications julkaisumyynti@uef.fi
http://www.uef.fi/kirjasto ISBN: 978-952-61-4449-8 (print)
ISSNL: 1798-5668 ISSN: 1798-5668 ISBN: 978-952-61-4450-4 (pdf)
ISSNL: 1798-5668 ISSN: 1798-5676
Author’s address: University of Eastern Finland Department of Applied Physics P.O.Box 1627
70211 KUOPIO FINLAND
email: jyrki.jauhiainen@uef.fi
Supervisors: Professor Aku Seppänen
University of Eastern Finland Department of Applied Physics P.O.Box 1627
70211 KUOPIO FINLAND
email: aku.seppanen@uef.fi
Associate Professor Tuomo Valkonen Escuela Politécnica Nacional
ModeMat
Ladrón de Guevara E11-253 170109 QUITO
ECUADOR
email: tuomo.valkonen@iki.fi
Associate Professor Mohammad Pour-Ghaz North Carolina State University
Department of Civil, Construction, and Environmental Engineering RALEIGH, NC 27695-7908 USA
email: mpourgh@ncsu.edu Dr. Antti Voss
University of Eastern Finland Department of Applied physics P.O.Box 1627
70211 KUOPIO FINLAND
email: antti.voss@uef.fi
Reviewers: Associate Professor Yiqiu Dong Technical University of Denmark Department of Applied Mathematics and Computer Science
2800 KONGENS LYNGBY DENMARK
email: yido@dtu.dk Professor Mikko Salo University of Jyväskylä
Department of Mathematics and Statistics P.O.Box 35
40014 JYVÄSKYLÄ FINLAND
email: mikko.j.salo@jyu.fi
Opponent: Assistant Professor Antti Hannukainen Aalto University
Department of Mathematics and Systems Analysis P.O.Box 11100
02150 ESPOO FINLAND
email: antti.hannukainen@aalto.fi
Jyrki Jauhiainen
Non-smooth optimization and regularization for electrical impedance tomography Kuopio: University of Eastern Finland, 2022
Publications of the University of Eastern Finland Dissertations in Forestry and Natural Sciences N:o 461
ABSTRACT
Electrical impedance tomography (EIT) is an imaging modality for reconstructing the electrical conductivity within a physical object through electrical measurements made on the surface of the object. The problem of solving the conductivity, the so- called inverse conductivity problem, is a non-linear inverse problem, and it is known to be ill-posed to an extent that regularization is required to acquire meaningful reconstructions. By using the available prior knowledge about the conductivity, this regularization is included into the problem in the form of a functional that penalizes undesired features. Typically, these functionals either promote spatially smooth or piecewise constant conductivities. Using the latter regularization functionals turns the inverse conductivity problem into a non-smooth problem making it more com- plicated and more computationally demanding to solve.
EIT has multiple potential applications in medical imaging, geological imaging, industrial process tomography, and structural health monitoring. One promising structural health monitoring application is crack detection through the so-called EIT- based sensing skin. In this technique, we apply a conductive layer on the surface of the target object. By monitoring the conductivity changes of this layer with EIT, we aim to detect cracks on the surface. The sensing skin sensor is designed in such a way that the conductive layer cracks whenever the surface of the target breaks.
The cracking of the conductive layer causes sharp inclusions and thus non-smooth regularization, such as total variation (TV), is preferred.
Past studies that have focused on crack detection via EIT-based sensing skins have used a smooth approximation of the TV regularization (Smoothed TV, sTV).
However, this approximation lowers the reconstruction quality compared to TV. In addition, using this type of regularization increases the computation times signifi- cantly and hence, for practical applications, it is important to develop methods that can solve the problem faster. Further, past studies considered that the sensing skin is planar. In practical applications, however, the surfaces of the monitored structures are often non-planar, thus it is also necessary to extend the sensing skin technique to non-planar geometries.
In this thesis, we study and develop non-smooth regularization and optimization methods to improve the reconstruction accuracy and speed of the EIT-based sensing skin and similar EIT techniques. Utilizing these techniques, we also study EIT-based sensing skins, extending these sensing skins to non-planar geometries.
Specifically, to solve the non-smooth EIT reconstruction problem, we study a novel optimization method called non-linear primal-dual proximal splitting. We demonstrate numerically that this method can be used in practical EIT applications, and that adjusting the step length parameters of this method with so-called block adaptation, can improve its performance. Based on the idea of applying primal-dual proximal splitting methods to EIT, we propose a new optimization method called Relaxed inexact proximal Gauss–Newton. We study the convergence properties of
this method and demonstrate numerically that this method can significantly speed up the reconstruction process in EIT.
To further improve the image quality, we study a non-smooth and non-convex regularization, called Mumford-Shah regularization, in EIT. We show that this func- tional can be approximated in EIT through the Ambrosio-Tortorelli approximation.
We also evaluate this approximation in EIT through numerical studies, demonstrat- ing that using this functional can improve reconstruction quality.
Finally, we also study EIT-based sensing skins in non-planar geometry. By ex- tending the reconstruction framework to non-planar geometries, we demonstrate numerically and experimentally that EIT-based sensing skins can be used to moni- tor non-planar surfaces.
Universal Decimal Classification:517.956.27, 517.972.87, 519.863, 537.311.6, 621.317.33
Library of Congress Subject Headings: Electrical impedance tomography; Inverse prob- lems (Differential equations); Mathematical optimization; Nonsmooth optimization; Func- tionals; Image reconstruction; Nondestructive testing; Structural health monitoring; Nu- merical analysis
Yleinen suomalainen ontologia:impedanssitomografia; matemaattinen optimointi; funk- tionaaliyhtälöt; funktionaalianalyysi; rikkomaton aineenkoetus; numeerinen analyysi
ACKNOWLEDGEMENTS
This thesis work was done between years 2017 and 2021 in the Department of Ap- plied Physics at the University of Eastern Finland.
First, I would like to thank my supervisors. I am grateful for my principal super- visor Professor Aku Seppänen for allowing to work on such a fruitful research topic, and moreover, for helping and guiding me whenever needed, while also allowing me to work freely without feeling pressured.
I also wish to express my gratitude to my second supervisor Associate Professor Tuomo Valkonen for his enormous contribution to the mathematical aspects of this thesis. I have learned a great deal from our conversations and without your work, the mathematical parts of this thesis would not have been possible.
I would also like to thank my third supervisor Associate Professor Mohammad Pour-Ghaz for his insight on the engineering parts of this thesis – your contribution has significantly improved the experimental aspects of this thesis. I am also grateful to my fourth supervisor Antti Voss who has guided me through the practical aspects of the thesis and research work in general.
Secondly, I would like to thank the reviewers of this thesis, Associate Professor Yiqui Dong and Professor Mikko Salo for spending their valuable time on reviewing this thesis and providing me with invaluable feedback. I am honored and looking forward to having Assistant Professor Antti Hannukainen as my dissertation oppo- nent.
I also want to thank everyone at the Computational physics and inverse problems group. Specifically, I would like to thank my colleagues and friends Matti Hanhela, Petri Kuusela, and Mikko Räsänen for the insightful conversations and thoughtful feedback.
I am very appreciative of all the support I have received from my friends and my family. I am especially thankful to my parents Eeva and Kari and my sister Satu for their support and encouragement. Finally, I am profoundly thankful to my fiancée Suvi for her patience, support, and companionship. I am lucky to have you in my life.
This thesis work was funded by European Union‘s Horizon 2020 research and innovation program under grant agreement No 764810 and by the Academy of Fin- land (Centre of Excellence of Inverse Modelling and Imaging, 2018-2025, project 336795).
Kuopio, January 10, 2022 Jyrki Jauhiainen
LIST OF PUBLICATIONS
This thesis consists of the present review of the author’s work in the field of non- smooth optimization, inverse problems, and structural health monitoring and the following selection of the author’s publications:
I S. Mazurenko, J. Jauhiainen, and T. Valkonen, “Primal-dual block-proximal splitting for a class of non-convex problems,”Electronic Transactions on Numer- ical Analysis52, 509–552 (2020).
II J. Jauhiainen, P. Kuusela, A. Seppänen, and T. Valkonen, “Relaxed Gauss–
Newton Methods with Applications to Electrical Impedance Tomography,”
SIAM Journal on Imaging Sciences13, 1415–1445 (2020).
III J. Jauhiainen, T. Valkonen, and A. Seppänen, “Mumford-Shah regularization in electrical impedance tomography with complete electrode model,”Submitted.
IV J. Jauhiainen, M. Pour-Ghaz, T. Valkonen, and A. Seppänen, “Nonplanar sens- ing skins for structural health monitoring based on electrical resistance to- mography,” Computer-Aided Civil and Infrastructure Engineering 36, 1488–1507 (2021).
Throughout the overview, these papers will be referred to by Roman numerals.
AUTHOR’S CONTRIBUTION
The publications selected in this dissertation are original research papers on elec- trical impedance tomography and optimization. These publications are joint works with the supervisors and co-authors. In the first paper, the author implemented the numerical framework and carried out the numerical EIT study. In the rest of the papers, the author was the principal writer and participated in all parts of these studies, from the theoretical study to numerical implementation and testing.
TABLE OF CONTENTS
1 Introduction 1
2 Electrical impedance tomography as an optimization problem 5 2.1 A primer on convex analysis... 8 2.2 primal-dual proximal splitting... 9 2.3 Practical aspects of the numerical and experimental studies... 10 3 Non-smooth optimization and regularization for electrical impedance
tomography 15
3.1 Publication I: Primal-dual block-proximal splitting for a class of non- convex problems... 15 3.1.1 Block proximal splitting... 15 3.1.2 Numerical EIT studies... 17 3.2 Publication II: Relaxed Gauss–Newton methods with applications to
electrical impedance tomography... 19 3.2.1 Convergence properties of the relaxed inexact proximal Gauss–
Newton method... 20 3.2.2 Numerical and experimental studies... 21 3.3 Publication III: Mumford-Shah regularization in
electrical impedance tomography with complete
electrode model... 24 3.3.1 Approximation of the Mumford–Shah functional in EIT... 25 3.3.2 Numerical studies... 26 3.4 Publication IV: Non-planar sensing skins for structural health moni-
toring based on electrical resistance tomography... 28 3.4.1 Complete electrode model on a surface... 29 3.4.2 Numerical and experimental studies... 30
4 Conclusions 33
BIBLIOGRAPHY 35
ABBREVIATIONS
CEM Complete electrode model DtN Dirichlet to Neumann
EIT Electrical impedance tomography FE Finite element
M-S Mumford-Shah
NL-PDPS Non-linear primal dual proximal splitting PCLS Proper, convex and lower semicontinuous PDPS Primal-dual proximal splitting
RE Relative error
RIPGN Relaxed inexact proximal Gauss-Newton SSIM Structural similarity index
sTV Smoothed total variation TV Total variation
NOMENCLATURE
∇· Divergence
γ Electrical conductivity
∇ Gradient, approximate differential or Fréchet derivative
Ω Spatial domain
∂Ω Boundary of the domainΩ Λγ Dirichlet to Neumann map
ν Unit normal
J(γ) Objective function
γˆ Approximate solution to minγJ(γ) · p Ap-norm
γm Minimum conductivity γM Maximum conductivity A(γ) Residual term
F(Bγ) Regularization function with linear operatorB Rs Union of real numbers and infinity,Rs =R∪∞ I Vector with Mmeasurements, i.e.,I ∈RM I(γ) Forward model
W Weight matrix of the measurements
uj Electric potential inside domainΩduring excitationj ζk Contact impedance
ek Electrodek
Ukj Electric potential at electrodekduring excitationj Ikj Electric current through electrodekduring excitationj f∗ Convex conjugate of f
δX(γ) Indicator function of the setX
∂f Subdifferential of f
∂Cf Clarke subdifferential of f proxf(x) Proximal mapping of f at x a,α Regularization parameters F∇ L2gradient regularization TV Total variation regularization
sTV A smooth approximation of the total variation regularization FS A smoothness promoting regularization
Cp p continuously differentiable functions levcJ c-sublevel set ofJ
cl Closure
domF The effective domain ofF B The space of Borel functions
HN N−dimensional Haussdorf measure
M A smooth manifold
g Metric (In manifold context)
gi,j The i-j-component of the matrix representing metricg
gi,j The i-j-component of the inverse of the matrix representing metricg
|g| Determinant of the matrix representingg
1 Introduction
Electrical impedance tomography is a non-invasive imaging modality for recon- structing the electrical conductivity distribution within a physical object. This con- ductivity distribution can be used to distinguish regions within the object. Depend- ing on the target application, these regions can indicate, e.g., moisture within a concrete element [1–3] or tumor within human tissue [4–6].
In EIT, a set of electrodes is attached to the surface of the object, and the elec- trodes are connected to a measurement device. The typical EIT device performs the measurements by injecting an electric current sequentially through a set of elec- trodes, measuring the caused electric potentials on the rest of the electrodes, and changing the injecting electrodes between each repetition. Alternatively, the device may perform the measurements by exciting a set of electrodes to predetermined potentials and measuring the electrical currents through the rest of the electrodes, repeating this by changing the excitation electrodes between each repetition [1].
There are multiple approaches to reconstructing the electrical conductivity from the EIT measurements. This thesis focuses on the optimization-based approach [7–
9]. In this approach, the reconstruction problem of EIT is formulated as a minimiza- tion problem and this problem is usually solved through an iterative scheme. Alter- native approaches include, for example, the direct methods, such as D-bar methods [10–14], that give an approximate solution to the inverse conductivity problem with- out the explicit iterative scheme.
Since the reconstruction problem of EIT is known to be an ill-posed inverse prob- lem [15, 16], solving it requires using specific prior information about the conduc- tivity. In optimization-based methods, thisa prioriinformation is included into the problem through the so-called regularization functional.
Typical regularization functionals either promote smooth inclusions or piecewise constant inclusions, i.e., thesparsityof the conductivity gradient. In this thesis, we will simply refer to the former bysmooth regularizationand the latter bynon-smooth regularization. Perhaps the most common choice for smooth regularization functional is the squared L2 norm. Indeed, the properties of this type of regularization are known very well [17], and it is widely used in EIT [18]. More so, the optimization problem is solvable through common iterative methods, such as Newton’s method or Gauss–Newton method [19–21].
In applications where it isa prioriknown that the target conductivity has sharp edged inclusions, non-smooth regularization functionals, such as total variation (TV, [22]) offer better alternatives to the smooth ones [23]. Obviously, the non- smooth regularization makes the objective functional non-smooth and thus methods of smooth optimization cannot be directly applied to the problem. To circumvent the difficulties related to the non-smoothness of the functional, often smooth ap- proximations of TV are employed so that the problem can be solved using, for ex- ample, the Newton’s method [23–27]. However, using the smoothed approximation can lower the image quality and the smoothed problem is often computationally demanding to solve [28].
EIT applications which benefit from non-smooth regularization can be found, for
Figure 1.1: Illustration of an EIT-based sensing skin measurement setup. Top:
Photo of a concrete beam painted with copper paint. Small notches on the bor- der of the beam are the electrodes and the dark lines are cracks on the concrete beam caused by the pressure applied to it from above. Bottom: Conductivity recon- structed from the EIT measurements using smoothed total variation regularization.
The cracks of the concrete beam are visible in the reconstruction as areas of low conductivity, i.e., the dark lines.
example, in structural health monitoring, where using the smooth approximation of TV has yielded very promising results in crack detection through the so-called EIT- based sensing skin technique [29–31]. This technique uses an electrically conductive layer applied on the surface of the target object, monitored with EIT, to identify structural deficiencies of the target, e.g., cracks [29] (see Figure 1.1). One of the limiting factors of this approach is, however, the long computation times [23]. In addition, although the reconstruction quality is already relatively high, using a non- smooth regularization without applying the smoothing could potentially improve the reconstruction quality further.
In this thesis, we study non-smooth regularization and optimization methods in EIT, to decrease the computation time and to improve the reconstruction quality. We readily consider the sensing skin as an example application because crack monitor- ing is a particularly suitable target for non-smooth regularization and optimization.
We also study this specific application to make it more viable in practice.
The first objective of this thesis is to study and improve the feasibility of novel non-smooth and non-convex optimization methods in EIT [32, 33]. The second ob- jective is to study regularization functionals that promote segmentation of the re- construction into smooth sub-regions. In the numerical realization of this objective, we employ the methods developed to realize the first one.
While the above aims of the thesis are related to computational methods appli- cable to EIT and potentially other imaging modalities, the final objective is focused on the EIT-based sensing skin application. So far sensing skins have been used only in planar geometries. Thus, the last aim of this thesis is to extend this monitoring tool to applications where the surfaces are non-planar. The objectives of this thesis can be summarized as follows:
• To apply and further improve non-convex and non-smooth optimization tech- niques in EIT (Publications I and II).
• Improve the reconstruction quality through a regularization functional that promotes segmentation of the reconstruction into smooth sub-regions (Publi- cation III).
• Study the feasibility of the non-planar sensing skin computationally and ex- perimentally (Publication IV).
To realize these objectives, in Publication I, we study the so-called block adapted primal-dual proximal splitting methods in EIT. In Publication II, we derive a novel Gauss–Newton type method for non-smooth problems and numerically demon- strate that this method is particularly suitable for solving non-smooth EIT recon- struction problems. In Publication III, we show that under specific conditions, the Ambrosio–Tortorelli approximation [34, 35] of the Mumford–Shah functional [36] is theoretically applicable to EIT and using numerical studies we show that this func- tional can improve image quality in practical applications (see also [37]). In Publica- tion IV, we adapt the forward model as well as the non-smooth regularization and optimization methods to non-planar geometries, and we study non-planar sensing skins numerically and experimentally. The work presented in this thesis contributes to the fields of inverse problems, non-smooth optimization, and structural health monitoring.
The rest of this thesis is organized as follows: In Chapter 2, we discuss the optimization in EIT generally. We describe how the related optimization problem is formulated, and then we proceed to discuss certain crucial concepts related to non-smooth optimization. In this chapter, we also present a common framework for the numerical and experimental studies discussed in Chapter 3. In Chapter 3, we review the main results of Publications I-IV, and finally in Chapter 4, we reflect upon these results and discuss their implications and future possibilities.
2 Electrical impedance tomography as an optimization problem
The problem of solving the conductivity from EIT measurements is referred to as theinverse conductivity problem or Calderon’s problem. Assuming that the target has no electrical sources or sinks, the inverse conductivity problem [38] can be defined through an elliptic partial differential equation
∇ ·γ∇u=0 x∈Ω
u= f x∈∂Ω, (2.1)
and a Dirichlet to Neumann (DtN) map Λγf = γ∂∂νu|∂Ω that maps the electrical potential f at the boundary ∂Ω of the domain Ω to the electric current γ∂u∂ν|∂Ω through boundary. Particularly, the inverse conductivity problem is to determine the conductivityγgiven the DtN map [16, 18, 39, 40].
To obtain the DtN map in practice, by using electrodes attached to the surface of the target, we perform electrical current injections and potential measurements, or alternatively, potential excitations and current measurements (see Figure 2.1).
In most cases, we can obtain the DtN map only partially since the boundary data is available only on those parts of the boundary where the electrodes are located, and the current injections and the excitation voltages are limited by the measuring equipment [16].
Optimization-based methods aim to reconstruct the conductivity as a solution to a minimization problem
γm≤γ≤γmin M
J(γ) i.e. γˆ := arg min
γm≤γ≤γM
J(γ), (2.2)
where we impose box constraints on the conductivityγvia 0<γm < γM ≤∞. In this thesis, we concentrate onobjective functional J of the composite form
J(γ):= 12A(γ)22+F(Bγ), (2.3) whereγ∈X for some function spaceXon the domainΩ⊂RN,F:Y→Rs which we assume to be proper, convex, and lower-semicontinuous (PCLS)1, andB:X→Y is linear. The term 12A(γ)22 is often called the (least squares) data fidelity term, andF(Bγ)is called the regularization term.
The residual term A : X → RM describes the discrepancy between the mea- surements and a model that relates the conductivityγto the measurements in such way that A(γ) =ˆ 0 would be the solution to the unregularized inverse conductivity problem. The common approach is to define A by the current injection/potential measurement setup. However, we assume that the electrode potentials are known,
1A convex functionFis said to beproper, if for everyx∈X,F(x)>−∞and if there exists a ˜xsuch thatF(x)˜ <∞.
Figure 2.1: Illustration of an EIT measurement setup. During each excitationj, the electrodesei on the boundary∂Ωof domainΩare excited to known potentialsUij. This creates a field potential fieldujinside the domain, which, in turn, causes the alternating electric currentsIijat the electrodes.
and the currents are measured. Thus, given a vector of measured currentsI ∈RM, we defineAby
A(γ):=W(I(γ)−I), (2.4)
where I, the so-calledforward operator, describes how the conductivity is related to the measurements, andW ∈RM×Mis a weight matrix for the measurements. Note that while we use a physics-based modelIthat links the conductivityγto the mea- surements I, optimization-based reconstruction methods are not limited to this;
certain methods, such as deep learning-based methods [41–43] do not necessitate a forward model that has well-understood physical foundation.
To model the measurements, we use the so-called complete electrode model (CEM) [44]. CEM is defined by
∇ ·(γ(x)∇uj(x)) =0 x∈Ω, (2.5a) uj(x) +ζkγ(x)∇uj(x),ν=Ukj x∈∂Ωek, (2.5b)
∂Ωekγ∇uj(x),νdx˜=−Ikj, (2.5c) γ(x)∇uj(x),ν=0 x∈∂Ω\
L1 k=1
∂Ωek, (2.5d)
whereuj is the potential inside the domain Ω⊂ RN,ζk is the contact impedance between the surface of the target and thek’th electrode,νis a unit vector pointing out of Ω, Ukj is the potential on the k’th electrode, ∂Ωek is the part of the bound- ary of Ω covered by the k’th electrode, and Ikj is the current through electrode k.
The superscript j∈ {1, 2, . . . ,L2}indicates that we may have multiple different po- tential excitations Uj = (U1j,U2j, . . . ,UjL
1), each corresponding to a current vector Ij = (I1j,I2j, . . . ,ILj
1) (see Figure 2.1). The current vector I in (2.4) is then concate- nated from Ikj, possibly excluding certain Ikj, as the corresponding measurements may not be measured by the EIT measurement device.
(a) (b)
Figure 2.2: Figure (a) shows graphs J(tγˆ+ (1−t)γ0) (blue curve) and tJ(γˆ) + (1−t)J(γ0)(red curve) for the discretized functionalJ(γ) = 12A(γ)22+aTV(γ). Since the blue curve is partially above the red curve, there are at least two points (conductivitiesγ0 and ˆγ) for which J(tγˆ+ (1−t)γ0) ≤ tJ(γˆ) + (1−t)J(γ0) does not hold for all t ∈ [0, 1], indicating that the (discretized) problem is not convex even if the regularizationF(Bγ) =aTV(γ)is. In this specific case, the conductivity γ0(x) = 1 S is spatially constant and ˆγ is the estimated minimum conductivity of this specific J. The minimum conductivity ˆγ is illustrated in Figure (b). The rectangles in Figure (b) depict the electrodes.
The derivation of CEM from the fundamental physical quantities can be found in multiple sources e.g., [21, 44] and we do not restate it here. However, note that compared to the typical CEM, there is a difference in the sign of Ikj in our model.
This simply means that we have defined theoutgoingcurrents to be of positive value.
Let us briefly discuss the non-convexity of the problem (2.2) caused by the non- linearity ofI. Recall that in a convex subsetYs⊂Y, f :Ys →Rs,Rs :=R∪ {−∞,∞}, is convex iff
f(tγ1+ (1−t)(γ2))≤t f(γ1) + (1−t)f(γ2)holds for all 0≤t≤1, andγ1,γ2∈Y.s The sum of two convex functionals is convex and the composition of a convex func- tional and an affine mappingK : Xs →Ys is also convex in a convex subset Xs ⊂ X.
However, since I in (2.4) is non-linear w.r.t. the conductivity γ[45], the functional (2.3) is not necessarily convex. Using the discretized version of (2.5), we can confirm this non-convexity by choosing two conductivities γ1 and γ2 and plotting curves J(tγ1+ (1−t)γ2)and tJ(γ1) + (1−t)J(γ2)into the same figure as shown in Fig- ure 2.2.
Finally, recall that the specific interest of this study is in non-smooth versions of the problem (2.2). In the following, we will discuss how to reformulate such problems into corresponding primal-dual problems. A class of methods specialized in non-smooth problems, so-calledprimal-dual proximal splittingmethods, could then be used to solve this problem if I was linear. These linear methods work as a basis for the non-linear methods which we apply to EIT in Chapter 3.
2.1 A PRIMER ON CONVEX ANALYSIS
The minimum of a smooth convex function in an open setUis characterized by the Fermat principle, i.e.,
γˆ =arg min
γ∈U
f(γ)⇔ f(γ) =ˆ 0,
given that the minimum ˆγexists. If the function f is non-smooth, the basic Fermat principle cannot be used to identify the minimum. Nevertheless, we may replace differential with the subdifferential
∂f(γ):={γ∗ ∈X∗ | γ∗, ˜γ−γ ≤ f(γ)˜ −f(γ) for all ˜γ∈X},
where X∗ is the dual space of the Banach space X and f : X → Rs. Analogously to the smooth case, the Fermat principle then reads (simply take γ∗ = 0 in the definition of the subdifferential),
γˆ =arg min
γ∈U
f(γ)⇔0∈∂f(γ).ˆ
By [46, Theorem 3.8], this critical point exists if X is a reflexive Banach space and f is PCLS and coercive. For simplicity, we restrict our discussion to Hilbert spaces, and we identifyX∗withX.
Using the effective domain of f defined by
domf :={γ∈X| f(γ)<∞},
we can show certain calculus rules for subdifferentials. Supposing that f andgare PCLS and there exists someγ0∈int(domf)∩domg, subdifferentials obey the sum rule [46, Theorem 4.14]
∂f(γ) +∂g(γ) =∂(f+g)(γ). (2.6) In addition, assuming thatK is linear and bounded and that there exists a γ0 ∈ X withKγ0∈int(domf), also the chain rule
K∗∂f(Kγ) =∂(f◦K)(γ) (2.7) holds forγ∈domf ◦K[46, Theorem 4.17]. For other rules, see [46, Section 4.3].
Let us now consider the minimization of a non-smooth composite function
minγ∈X f(Kγ) +g(γ), (2.8)
whereK: X→Yis linear, f :Y →R, ands g: X→Rs are PCLS. Although we can characterize a solution to (2.8) through the Fermat principle, finding this solution is not as trivial as in the smooth case. For example, the naive equivalent of the steepest descent method
γk+1∈γk−τk[K∗∂f(Kγk) +∂g(xk)]
fails since arbitrary subgradients, i.e., the elements of ∂(K∗f(Kγ) +g(γ)), are not necessarily descent directions. However, there are methods that can, at least theo- retically, solve (2.8) directly. In the following section we consider an alternate but
equivalent formulation of this problem. To present this formulation, we define con- vex conjugate (or Fenchel conjugate) of f by
f∗(y):=max
γ∈Xy,γ −f(γ).
If f is PCLS, then also f∗ is (weakly-*) PCLS [46, page 58]. Moreover, for PCLS functions f∗∗= f on Hilbert spaces by the Fenchel—Moreau—Rockafellar theorem [46, Theorem 5.1].
Finally, to see how (2.2) is related to (2.8), recall that 12 · 22+F(·) in (2.2) is PCLS. AllowingKin (2.8) to be non-linear, we may present the problem (2.2) in the same form by taking f(y1,y2) := 12y1− WI22+F(y2), K(γ) := (WI(γ), Bγ), andg(γ):=δ[γm,γM](γ)for the indicator function
δ[γm,γM](γ):=
0 γ∈ [γm,γM]
∞, otherwise so that2
γm≤γ≤γmin M
J(x) =min
γ∈X 1
2A(γ)22+F(Bγ) +δ[γm,γM](γ)
=min
γ∈X f(WI(γ),Bγ) +g(γ)
=min
γ∈X f(K(γ)) +g(γ),
(2.9)
since the indicator function is PCLS.
To characterize the minimum of non-smooth non-convex problems, we need to define a subdifferential notion for such objective functions. A common one is the Clarke subdifferential (see [47, Pages 25-27] and [46, Definition 13.2 and Theorem 13.4 (Fermat principle)]). It agrees with the convex subdifferential on convex func- tions, and with the gradient on smooth functions. A sum rule also holds under some regularity conditions (see [46, Page 186 and Theorem 13.20]); in particular, the Clarke subdifferential ofF+G, whereGis PCLS andF is smooth, is∇F+∂G.
In Chapter 3, we propose methods to solve the problem (2.10) in EIT, i.e., the case where K is non-linear. The next section is dedicated to the primal-dual proximal splitting method which can be used when K is linear. This method serves as a building block for the methods in Chapter 3.
2.2 PRIMAL-DUAL PROXIMAL SPLITTING
Algorithm 1 describes the original primal-dual proximal splitting method (PDPS) of Pock, Cremers, Bischof, and Chambolle [32, 48]. This method is designed to solve the primal-dual saddle-point problem
minγ max
y g(γ) +Kγ,y −f∗(y). (2.10) This problem is obtained from (2.8) by writing f(Kγ) = f∗∗(Kγ) =maxyKγ,y − f∗(y) =maxyγ,K∗y −f∗(y). Applying the Fermat principle and the subdifferen- tial sum (2.6) and chain (2.7) rules individually to the primal and dual variables x andyof the objective functional shows that a saddle-point(γˆ, ˆy)satisfies
Kγˆ ∈∂f∗(y)ˆ and −K∗yˆ∈∂g(γ).ˆ (2.11)
2Combining the two functionals into f(K(γ))is sometimes called the "lifting trick".
Algorithm 1Primal-dual proximal splitting.
Require: Convex, proper, lower semicontinuousg:X→R, f :Y→R, and a linear operatorK∈L(X;Y).
1: Choose step length parametersτ,σ>0 satisfyingσ <1/(τK2).
2: Choose initial iteratesγ0∈X,y0∈Y.
3: for alli≥0untila stopping criterion is satisfieddo
4: γi+1:=proxτg
γi−τK∗yi
5: γsi+1:=2γi+1−γi
6: yi+1:=proxσf∗
yi+σKγsi+1
7: end for
The saddle point exists if the primal problem (2.8) has a solution and there exists someγ0∈dom(f◦K)∩domgwithKγ0∈int(dom f)[46, Theorem 5.10 (Fenchel- Rockafellar) and Corollary 5.13].
The PDPS of Algorithm 1 is a fixed-point iteration to find a saddle point of (2.10).
To see this, let us first examine proximal mappings in Algorithm 1. These proximal mappings are defined by3
proxσf (x):=arg min
y
f(y) +2σ1y−x22. By [46, Lemma 6.18], the proximal mapping satisfies the relation
y∈∂f(γ) ⇔ γ=proxσf (γ+σy). (2.12) Suppose that(γ, ˜˜ y)is a fixed point of Algorithm 1. Then ˜γ=proxtg(γ˜−tK∗y),˜ s˜
γ=2 ˜γ−γ˜ =γ, and ˜˜ y=proxσf∗(y˜+σKγ) =s˜ proxσf∗(y˜+σKγ). By (2.12)˜ Kγ˜ ∈ f∗(y)˜ and −K∗y˜∈ g(γ)˜ ,
showing that (2.11) holds for(γ˜, ˜y), i.e., it is a saddle point to (2.10). The convergence properties of Algorithm 1 are discussed for example in [32, 46]. Specifically in finite dimensions, if the primal step length parameter τ > 0 and dual step length parameter σ > 0 satisfy the condition σ < 1/(τK2), then (γi,yi) → (γˆ, ˆy) [32, Theorem 1]. Similarly, for(γi,yi)(γˆ, ˆy)in Hilbert spaces, see [46, Corollary 9.13].
Since in (2.8) and (2.10) we can always takeK = Id and g(γ) = 0, we can use the PDPS method to solve minγ f(γ)with arbitrary PCLS f. However, the method requires evaluating the proximal mappings of f∗ and gat each step. If K is non- trivial and included into f, computing the proximal mapping of f∗will often require invertingK, possibly slowing the iteration significantly; the practicality of the PDPS method is indeed dependent on the structure of f,g, andK.
In the next section, we lay out the practical aspects of the numerical and experi- mental studies used in the Publications I-IV.
2.3 PRACTICAL ASPECTS OF THE NUMERICAL AND EXPERIMEN- TAL STUDIES
Throughout the thesis, we will refer to studies that use both simulated and exper- imental measurement data. These studies share a common scheme for computing
3We also write proxs f(x) = (Id+s∂f)−1(x). The functional(Id+s∂f)−1is called theresolventof∂f.
the simulated data, for measuring the experimental data, and for solving the con- ductivity from either the simulated or experimental data. Any variations of this methodology will be explicitly highlighted.
To compute I(γ)numerically, we approximate (2.5) through Galerkin finite ele- ment (FE) method. We use this finite element approximation to compute the sim- ulated data, and we also use it as a forward model in (2.2), by replacing I in (2.4) with its finite element approximation. We presentγin a piecewise linear basis with Nn nodes andNe elements.
The so-calledinverse crime, i.e., the assumption that the computational model is exact, can give an unrealistically positive view of how an inverse algorithm works in practice [17, 49, 50]. To avoid this, we always compute the simulated data in a mesh that is denser than the one used in the forward model. Further, we add Gaussian noise to each simulated measurement. Standard deviation of the noiseekj added to a measurement Ikj is proportional to the magnitude of the measurement, i.e., std(ejk)∝|Ikj|.
The measurements used in this thesis are collected using an EIT device manufac- tured by Rocsole Ltd. (www.rocsole.com). This device employs the aforementioned measurement setup, more typically encountered in electrical capacitance tomogra- phy, in which the device excites the electrodes to user-defined potential patterns and measures the currents caused by these potentials. For each pattern, the user can choose which electrodes are grounded and which are excited. The excitation amplitude is automatically set by the device. The user can specify the excitation fre- quency, which we set to 39 kHz. Further, the device uses internal 1 MHz sampling frequency for the currents, automatically converts these samples to the current am- plitudes through discrete Fourier transform, and outputs these amplitudes, along with the excitation potentials, into an ASCII encoded text file.
In the reconstruction of the conductivity, we will employ a variety of regulariza- tion functionals: A smooth gradient regularization [51]
F∇(γ) =a
Ω|∇γ|2dx, (2.13)
a total variation regularization [22]
TV(γ) =a
Ω|∇γ|dx, (2.14) and a smooth approximation of the total variation [52, 53],
sTV(γ) =a
Ω
|∇γ|2+dx, (2.15)
wherea>0 denotes the regularization parameter, |∇γ|2= ∇γ,∇γis the spatial norm, and > 0. We will also use the notation ∇K for the Fréchet derivative of an operatorK : X→Ybetween Banach spaces XandY [46, Section 2.1]. To avoid confusion in these cases, we will use∇xfor the gradient w.r.t. x∈Ω⊂RN.
In finite dimensions, which we are concerned with in the numerical studies,γ∈ RNn is interpreted as a vector formed by coefficients of the discretized conductivity and∇Kof operatorKis interpreted as the Jacobian ofKw.r.t. these coefficients4.
4Note that in Publications (II-IV), we instead wrote∇K∗and∇KTfor the Jacobian.
In a piecewise linear basis, we may write the above gradient-based regulariza- tions in the formF(Bγ), whereFis PCLS andBis a matrix. Here, we use TV as an example; since∇xγis constant within each element, we can write
Ei|∇xγ|dx=Ai|∇xγ|= Ai
(∂1γ)2+ (∂2γ)2+· · ·+ (∂Nγ)2,
where∂kγ is constant within the mesh elementEi ⊂ Ω. By denoting the indices of the nodes of the mesh elementi by j1, j2, and j3, forx ∈ Ei we can writeγ(x) = γj1φj1(x) +γj2φj2(x) +γj3φj3(x). This yields
∂kγ=γj1∂kφj1+γj2∂kφj2+γj3∂kφj3.
We can then organize the values Ai∂kγjl into a matrix R∇ ∈ R(N·Ne)×Nn, e.g., (R∇)(k−1)Ne+i,j = Ai∂kφjl for j = j1,j2,j3 and (R∇)(k−1)Ne+i,j = 0 otherwise, that corresponds to the matrix Bof the regularization term. To form the PCLS function F, we interpret| · |as a mapping fromRN·Ne →RNe so that thei’th component of
|R∇γ|reads
|(R∇γ)|i =
(Aiγj1∂1φj1+Aiγj2∂1φj2+Aiγj3∂1φj3)2+. . . + (Aiγj1∂1φj1+Aiγj2∂1φj2+Aiγj3∂1φj3)2 1/2. Since|(R∇γ)|i ≥0, takingF(y) =a|y|1yields
TV(γ) =a
Ω|∇γ|dx=a
Ne i
∑
=1(|R∇γ|)i =a|R∇γ|1=F(Bγ).
The composition| · |1is convex (and trivially also proper and continuous) since a sum of convex functions is convex. The formF(Bγ)of F∇ and sTV follows from a similar treatment.
In addition to the gradient regularization, we also use a smoothness promoting regularization defined on the coefficient vectorγ∈RNn by
FS(γ):=Wγ(γ−γ0)22, (2.16) whereγ0is the expected value ofγ. The matrixWγis defined by inverse factoriza- tion (WγTWγ)−1 = Γ of a kernelΓi,j = aexp(−xi−xj22/(2b2)), where xi andxj are the locations of the nodesiandjanda,b>0 [54].
The constraintsγmandγMalso containa prioriinformation about the conductiv- ity. We do not assume that these are known precisely; generally we set the minimum conductivity to a small value, e.g.,γm=10−8, and the maximum conductivity to an arbitrary large number, e.g.,γM =1012. The maximum constraint is not supposed to have a practical impact in these cases. For sensing skin crack detection studies, however, we compute the maximum constraint from a stage where the sensing skin is intact, since the cracks never increase the conductivity [29, 30].
Finally, whenever we are solving the problem (2.2) through an iterative scheme, we generally start these iterations at a homogeneous estimate; an estimate computed by omitting the regularization functional and assuming that γ is constant. If the conductivity progresses in multiple stages, then only the iteration of the first recon- struction starts from the homogeneous estimate and the subsequent reconstruction
iterations start at the reconstructions of the previous stages. To stop these iterations, we monitor how the value of the objective functional changes between the iterates, and we stop the iteration once this value no longer decreasessufficiently. The con- dition for a sufficient decrease is, of course, dependent on the algorithm and the problem at hand, and of minor interest regarding the topics of this thesis.
3 Non-smooth optimization and regularization for electrical impedance tomography
In this chapter, we review the results of Publications I-IV. Based on the results of Publication I, we focus on answering whether the primal-dual proximal splitting (PDPS) methods can be used to solve the non-smooth EIT problem in practice, and whether so-called block adaptation of the step length parametersτandσcan speed up the convergence. Based on Publication II, we study a novel variant of the Gauss–
Newton method suitable for non-smooth and non-convex problems. We apply this method to EIT to see if additional speed up can be obtained. Further, based on the results of Publication III, we discuss whether the Mumford–Shah functional can improve image quality in EIT applications. Finally, based on Publication IV, we discuss a specific application of EIT; the sensing skin. The aim of Publication IV is to formulate a model for the sensing skin in non-planar geometries, and to implement it numerically. Further, we study whether such models can make these non-planar sensing skins viable in practical applications.
3.1 PUBLICATION I: PRIMAL-DUAL BLOCK-PROXIMAL SPLITTING FOR A CLASS OF NON-CONVEX PROBLEMS
Most previous studies on total variation regularization in EIT have employed its smooth approximations [23, 25–27]. Our objective is to avoid using these smooth approximations by (numerically) solving the primal-dual problem of (2.2) with a non-linear version of the PDPS (NL-PDPS).
If the operatorAin (2.3) were affine, i.e.,Iwere linear, as discussed in Section 2.2, we could solve the problem (2.2) with Algorithm 1. This method has also been extended to non-linearA(orK), i.e., the NL-PDPS of [33, 55]; see also Algorithm 2.
This method, however, similarly to the PDPS, is a first order method and requires evaluating the forward operator and its Jacobian at each step. Since computing the EIT forward operator in practice involves the FE approximation of (2.5), computing each iterate can take up to a few minutes. This can render NL-PDPS practically unusable because reasonable convergence usually requires thousands of iterations [33, 55]. In the following, we consider so-called block adaptation to speed up this convergence. The Publication I focuses more broadly on the block adaptation of the NL-PDPS algorithms and only the numerical EIT study is a part of this thesis.
3.1.1 Block proximal splitting
Let us again consider the composite minimization problem (2.8) and the correspond- ing primal-dual problem (2.10). If the variables naturally have a "block structure" i.e., different parts of f,K, orguse different components ofγandy, then it may make sense to update these components, or "blocks", independently — possibly some of them even randomly. For example, the data fidelity term and the regularization term in the objective functionalJof (2.3) form two such blocks. Also, we can divide
Algorithm 2Non-linear Primal-dual proximal splitting.
Require: Convex, proper, lower semicontinuous g : X → R, f : Y → R, and operatorK∈C2(X;Y).
1: Choose step length parametersτi,σi>0 satisfyingσi<1/(τiR2)for some upper boundsR≥supk=1,...,i∇K(γk).
2: Choose initial iteratesγ0∈X,y0∈Y.
3: for alli≥0untila stopping criterion is satisfieddo
4: γi+1:=proxτig
γi−τi[∇K(γi)]∗yi
5: γsi+1:=2γi+1−γi
6: yi+1:=proxσ
if∗
yi+σiK(sγi+1)
7: end for
the data fidelity term by each excitation j to obtain L2+1 blocks or alternatively divide it by each measurement to obtainL1·L2+1 blocks. Dividing a problem into separate blocks and adapting the step length parameters to these blocks can speed up the convergence rates in the convex case [56].
Generally, we say that the problem (2.10) is block-separable if we can write it as minγ∈Xmax
y∈Y
∑
m j=1gj(γj) +
∑
n=1
K(γ),y −f∗(y)
, (3.1)
whereγ = (γ1, . . . ,γm), y= (y1, . . . ,yn), and for all j = 1, . . . ,mand= 1, . . . ,n, the functionalsgj:X→Rand f:Y→Rare PCLS. In the following, we will write Kfor the operator
K(γ) = (K1(γ), . . . ,Km(γ)).
Note that in Publication I, using mutually orthogonal families of linear projection operators P1, . . . ,Pm ∈ L(X;X) and Q1, . . . ,Qn ∈ L(Y;Y), we write γj = Pjγ and y=Qy. The variablesPjγandQyare the blocks of this problem.
The analytical part of Publication I considers three different methods for solving the problem (3.1). We derive step length parameter bounds and conditions under which these algorithms converge. We apply Algorithm 3 to EIT, which is a simplified version of the one depicted in Publication I. The original version in Publication I is stochastic; at each step, the updated blocks can be chosen randomly. The step length conditions in this case are involved. In the non-stochastic setting, however, supposing that we start the iteration sufficiently close to a (Clarke) critical point, we may derive the step condition (see Publication I for the other algorithms and the original conditions);
0<σi ≤ 1−κ
τ∑wx,,∇K(γk)2 for allk∈ N, (3.2) whereκ ∈ (0, 1)andwx,, = w−x,1,. The step length parametersτ andσi are em- ployed by the block adapted NL-PDPS in the primal and the dual steps (see Equa- tion (3.1)), respectively. The weights wx,, > 0 can be chosen arbitrarily and are used to balance out the steps, if, for example, certain∇K(γk)is very small/large relative to the rest of the blocks. Further, the parameterκ >0 is used to change a strict inequality to inequality in (3.2), and thus it is often sufficient to takeκ≈0.