
The Moore-Penrose Generalized Inverse

Consider equation (1) in Euclidean spaces: computing the generalised inverse when $A$ is a matrix of full rank is relatively easy. If, however, $\operatorname{Ran}(A)$ is not the full image space, then for a right-hand side $y$ of equation (1) lying outside $\operatorname{Ran}(A)$ the equation has no exact solution. In such a case it is natural to find $x$ such that $Ax$ has minimal distance to $y$. If, on the other hand, $\operatorname{Ker}(A) \neq \{0\}$, then equation (1) does not have a unique solution, and one might be interested in choosing, among the multiple solutions, the specific solution with minimal norm.
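Both difficulties can be made concrete in the finite-dimensional setting. The following sketch (the matrix and right-hand side are made up for illustration and are not taken from the text) uses NumPy's SVD-based `pinv` to pick the least-squares solution of minimal norm for a rank-deficient system:

```python
import numpy as np

# Illustrative rank-deficient matrix: row 2 = 2 * row 1, so rank(A) = 2
# and Ker(A) != {0}; the system Ax = y has no unique solution.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0, 1.0])

# Minimum-norm least-squares solution via the Moore-Penrose inverse.
x_dag = np.linalg.pinv(A) @ y

# A null-space direction of A: the right singular vector for sigma = 0.
_, s, Vt = np.linalg.svd(A)
n = Vt[-1]

# Adding a Ker(A) component gives another least-squares solution ...
x_other = x_dag + 0.5 * n

# ... with the same residual but strictly larger norm.
print(np.linalg.norm(A @ x_dag - y), np.linalg.norm(A @ x_other - y))
print(np.linalg.norm(x_dag), np.linalg.norm(x_other))
```

Since `x_dag` is orthogonal to the null space, the Pythagorean theorem guarantees it is the unique solution of minimal norm.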

In the Hilbert space setting, we consider the following definition:

Definition 2.6. Let $A : H_1 \to H_2$ be a bounded linear operator. An element $x \in H_1$ is called

(i) a least-squares solution of equation (1) if
\[ \|Ax - y\| = \inf\{\|Az - y\| \mid z \in H_1\}. \tag{6} \]

(ii) a best-approximate solution (or minimum-norm solution) of equation (1) if $x$ is a least-squares solution of equation (1) and
\[ \|x\| = \inf\{\|z\| \mid z \text{ is a least-squares solution of equation (1)}\}. \tag{7} \]

It is easily seen that if a least-squares solution exists, then the minimum-norm solution is unique, because it is the minimizer of a quadratic error functional; thus the best-approximate solution can be defined as the least-squares solution of minimal norm. The notion of a best-approximate solution is related to the Moore-Penrose generalized inverse of $A$.

Given a bounded linear operator $A \in \mathcal{L}(H_1, H_2)$, let $\tilde{A} : \operatorname{Ker}(A)^\perp \to \operatorname{Ran}(A)$ denote its restriction. The Moore-Penrose generalised inverse of $A$, denoted $A^\dagger$, is the unique linear extension of $\tilde{A}^{-1}$ to the domain
\[ \mathcal{D}(A^\dagger) := \operatorname{Ran}(A) \oplus \operatorname{Ran}(A)^\perp \tag{8} \]
such that
\[ \operatorname{Ker}(A^\dagger) = \operatorname{Ran}(A)^\perp. \tag{9} \]

Thus $\tilde{A}$ is injective (one-to-one) due to the restriction to $\operatorname{Ker}(A)^\perp$, and surjective (onto) due to the restriction of the image space to $\operatorname{Ran}(A)$. Hence $\tilde{A}$ is bijective and $\tilde{A}^{-1}$ exists. For any $y \in \mathcal{D}(A^\dagger)$ there are unique $y_1 \in \operatorname{Ran}(A)$ and $y_2 \in \operatorname{Ran}(A)^\perp$ with $y = y_1 + y_2$. Since $\operatorname{Ker}(\tilde{A}) = \{0\}$ and $\operatorname{Ran}(\tilde{A}) = \operatorname{Ran}(A)$, the operator $A^\dagger$ is well-defined, and from (9) and the linearity of $A^\dagger$ we have that

\[ A^\dagger y = A^\dagger y_1 + A^\dagger y_2 = A^\dagger y_1 = \tilde{A}^{-1} y_1. \tag{10} \]

Proposition 2.7. Let $P$ and $Q$ be the orthogonal projection operators onto $\operatorname{Ker}(A)$ and $\overline{\operatorname{Ran}(A)}$, respectively. Then $A^\dagger$ is uniquely characterized by the four Moore-Penrose equations
\begin{align}
A A^\dagger A &= A, \tag{11}\\
A^\dagger A A^\dagger &= A^\dagger, \tag{12}\\
A^\dagger A &= I - P, \tag{13}\\
A A^\dagger &= Q|_{\mathcal{D}(A^\dagger)}, \tag{14}
\end{align}
where $I$ is the identity operator.
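In finite dimensions the four Moore-Penrose equations can be checked numerically. A minimal sketch with a randomly generated rank-deficient matrix (an illustrative choice, not from the text), using the fact that $A^\dagger A$ and $A A^\dagger$ are then the orthogonal projectors onto $\operatorname{Ker}(A)^\perp$ and $\operatorname{Ran}(A)$:

```python
import numpy as np

rng = np.random.default_rng(0)
# 5x4 matrix of rank at most 3 (product of 5x3 and 3x4 factors).
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))
A_dag = np.linalg.pinv(A)

# (11) A A† A = A  and  (12) A† A A† = A†
print(np.allclose(A @ A_dag @ A, A))
print(np.allclose(A_dag @ A @ A_dag, A_dag))

# (13) A† A = I - P: an orthogonal projector (idempotent and symmetric)
print(np.allclose((A_dag @ A) @ (A_dag @ A), A_dag @ A))
print(np.allclose(A_dag @ A, (A_dag @ A).T))

# (14) A A† = Q: the orthogonal projector onto Ran(A)
print(np.allclose((A @ A_dag) @ (A @ A_dag), A @ A_dag))
print(np.allclose(A @ A_dag, (A @ A_dag).T))
```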

Proof. For each $y \in \mathcal{D}(A^\dagger)$, by the definition of the Moore-Penrose inverse $A^\dagger$ we have
\[ A^\dagger y = \tilde{A}^{-1} Q y = A^\dagger Q y, \tag{15} \]
so that $A^\dagger y \in \operatorname{Ran}(\tilde{A}^{-1}) = \operatorname{Ker}(A)^\perp$. For each $x \in \operatorname{Ker}(A)^\perp$ it follows that
\[ A^\dagger A x = \tilde{A}^{-1} \tilde{A} x = x. \]
The above assertions prove that $\operatorname{Ran}(A^\dagger) = \operatorname{Ker}(A)^\perp$. Now equation (15) implies that
\[ A A^\dagger y = A A^\dagger Q y = A \tilde{A}^{-1} Q y = \tilde{A} \tilde{A}^{-1} Q y = Q y, \]
since $\tilde{A}^{-1} Q y \in \operatorname{Ker}(A)^\perp$, and hence equation (14) holds. By the definition of $A^\dagger$, it holds for each $x \in H_1$ that
\[ A^\dagger A x = \tilde{A}^{-1} A (P x + (I - P) x) = \underbrace{\tilde{A}^{-1} A P x}_{=0} + \tilde{A}^{-1} \tilde{A} (I - P) x = (I - P) x. \tag{16} \]
Inserting equation (14) into equation (15) yields
\[ A^\dagger y = A^\dagger Q y = A^\dagger A A^\dagger y \]
for all $y \in \mathcal{D}(A^\dagger)$, which is equation (12). Equation (16) implies equation (13), and equation (13) in turn implies that
\[ A A^\dagger A = A (I - P) = A - A P = A, \]
since $A P = 0$; hence equation (11) holds as well.

Any operator satisfying equation (11) or equation (12) is referred to as an inner inverse or outer inverse of $A$, respectively.

The following theorem provides a connection between the least-squares solutions and how they are computed via the Moore-Penrose inverse.

Theorem 2.8. For all $y \in \mathcal{D}(A^\dagger)$, equation (1) has a unique best-approximate solution, given by
\[ x := A^\dagger y. \]
The set of all least-squares solutions is $A^\dagger y + \operatorname{Ker}(A)$.

Proof. For a fixed $y \in \mathcal{D}(A^\dagger)$, let us construct the set
\[ S = \{ z \in H_1 \mid A z = Q y \}. \]
Since $y \in \mathcal{D}(A^\dagger) = \operatorname{Ran}(A) \oplus \operatorname{Ran}(A)^\perp$, it follows that $Q y \in \operatorname{Ran}(A)$ and therefore $S \neq \emptyset$. Because $Q$ is an orthogonal projector, we have for all $z \in S$ and for all $x \in H_1$:
\[ \|A z - y\| = \|Q y - y\| \le \|A x - y\|. \]
So all elements of $S$ are least-squares solutions of $A x = y$. Conversely, let $z$ be a least-squares solution of $A x = y$. Then
\[ \|Q y - y\| \le \|A z - y\| = \inf\{\|u - y\| \mid u \in \operatorname{Ran}(A)\} = \|Q y - y\|. \]
Thus $A z$ is the closest element to $y$ in $\operatorname{Ran}(A)$, that is, $A z = Q y$, and
\[ S = \{ x \in H_1 \mid x \text{ is a least-squares solution of } A x = y \}. \]

Now, let $\bar{z}$ be the element of minimal norm in $S = A^{-1}(\{Q y\})$. Since then $S = \bar{z} + \operatorname{Ker}(A)$, it suffices to show that
\[ \bar{z} = A^\dagger y. \]
As the element of minimal norm in $S = \bar{z} + \operatorname{Ker}(A)$, $\bar{z}$ is orthogonal to $\operatorname{Ker}(A)$, that is, $\bar{z} \in \operatorname{Ker}(A)^\perp$. This implies that
\[ \bar{z} = (I - P)\bar{z} = A^\dagger A \bar{z} = A^\dagger Q y = A^\dagger A A^\dagger y = A^\dagger y, \tag{17} \]
that is, $\bar{z} = A^\dagger y$.
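Theorem 2.8 admits a finite-dimensional sanity check (with an arbitrary random underdetermined system, assumed only for illustration): NumPy's SVD-based `lstsq` returns the minimum-norm least-squares solution, which should coincide with $A^\dagger y$ and be orthogonal to $\operatorname{Ker}(A)$:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 5))    # underdetermined: Ker(A) != {0}
y = rng.standard_normal(4)

# Minimum-norm least-squares solution from lstsq ...
x_bar, *_ = np.linalg.lstsq(A, y, rcond=None)

# ... agrees with the Moore-Penrose formula x = A† y.
print(np.allclose(x_bar, np.linalg.pinv(A) @ y))

# x_bar is orthogonal to Ker(A): test against a null-space vector.
_, _, Vt = np.linalg.svd(A)
n = Vt[-1]                         # right singular vector with sigma = 0
print(abs(n @ x_bar))              # numerically zero
```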

In linear algebra it is a well-known fact that least-squares solutions can be characterized by the normal equations; the next theorem verifies that this assertion remains true in the continuous case.

Theorem 2.9. Let $A^*$ denote the adjoint operator of $A$. For given $y \in \mathcal{D}(A^\dagger)$, $x \in H_1$ is a least-squares solution of equation (1) if and only if $x$ satisfies the normal equations
\[ A^* A x = A^* y. \tag{18} \]

Proof. An element $x \in H_1$ is a least-squares solution of equation (1) if and only if $A x$ coincides with the projection of $y$ onto $\overline{\operatorname{Ran}(A)}$. This is equivalent to $A x - y \in \operatorname{Ran}(A)^\perp = \operatorname{Ker}(A^*)$. Thus we conclude that $A^*(A x - y) = 0$, which is equivalent to equation (18).

Furthermore, a least-squares solution has minimal norm if and only if $x \in \operatorname{Ker}(A)^\perp$.
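A short numerical sketch of Theorem 2.9 (the full-column-rank matrix is an illustrative assumption, chosen so that $A^*A$ is invertible and the normal equations can be solved directly):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))    # full column rank (with probability 1)
y = rng.standard_normal(6)

# Solve the normal equations A*A x = A*y directly.
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Same least-squares solution as the Moore-Penrose inverse gives.
x_pinv = np.linalg.pinv(A) @ y
print(np.allclose(x_normal, x_pinv))

# The residual Ax - y lies in Ran(A)^perp = Ker(A*): A*(Ax - y) = 0.
print(np.allclose(A.T @ (A @ x_normal - y), 0))
```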

A direct consequence of Theorem 2.9 is that $A^\dagger y$ is the solution of equation (18) with minimal norm. Consequently, we have
\[ A^\dagger y = (A^* A)^\dagger A^* y, \tag{19} \]
and this means that in order to approximate $A^\dagger y$, we may instead compute an approximation via equation (18).
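Identity (19) can likewise be verified numerically; the rank-deficient matrix below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4))
A[:, 3] = A[:, 0] + A[:, 1]        # force rank deficiency: rank(A) = 3
y = rng.standard_normal(5)

# A† y versus (A*A)† A* y, cf. equation (19).
lhs = np.linalg.pinv(A) @ y
rhs = np.linalg.pinv(A.T @ A) @ (A.T @ y)
print(np.allclose(lhs, rhs))
```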

To analyse the domain of the generalised inverse, one can see from the definition of the Moore-Penrose inverse in equation (8) that $\mathcal{D}(A^\dagger)$ is the natural domain of definition for $A^\dagger$, in the sense that if $y \notin \mathcal{D}(A^\dagger)$, then no least-squares solution of equation (1) exists. Thus, contrary to the finite-dimensional case, the concept of a minimum-norm solution as introduced does not always provide a solution to the problem, although it does impose uniqueness. Since $A$ is a bounded linear operator, the orthogonal complements are always closed, and we conclude from equation (8) that
\[ \overline{\mathcal{D}(A^\dagger)} = \overline{\operatorname{Ran}(A)} \oplus \operatorname{Ran}(A)^\perp = \overline{\operatorname{Ran}(A)} \oplus \operatorname{Ker}(A^*) = H_2. \]

The domain $\mathcal{D}(A)$ is dense in $H_1$, whereas $\mathcal{D}(A^\dagger)$ is dense in $H_2$. It follows that $\mathcal{D}(A^\dagger) = H_2$ if $\operatorname{Ran}(A)$ is closed, and vice versa: $\mathcal{D}(A^\dagger) = H_2$ implies that $\operatorname{Ran}(A)$ is closed. Furthermore, for $y \in \operatorname{Ran}(A)^\perp = \operatorname{Ker}(A^\dagger)$, the best-approximate solution is $x = 0$. It is therefore important to check whether a given $y \in \overline{\operatorname{Ran}(A)}$ also satisfies $y \in \operatorname{Ran}(A)$; if this holds for every such $y$, then $A^\dagger$ is continuous. If, however, some $y \in \overline{\operatorname{Ran}(A)} \setminus \operatorname{Ran}(A)$ exists, this is already enough to prove that $A^\dagger$ is discontinuous. This leads us to the following theorems and the introduction of compact operators, which discuss the discontinuities of the Moore-Penrose inverse.
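In finite dimensions $\operatorname{Ran}(A)$ is always closed, so $A^\dagger$ is always bounded; the analogue of this discontinuity nevertheless shows up as instability near rank deficiency. A minimal sketch (with a made-up $2 \times 2$ matrix) showing how the pseudoinverse of a perturbed rank-deficient matrix blows up as the perturbation shrinks:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])                 # rank-1 matrix
for eps in (1e-2, 1e-6, 1e-10):
    A_eps = A + np.diag([0.0, eps])        # rank-2 perturbation of A
    # pinv(A_eps) = diag(1, 1/eps), so its norm grows like 1/eps,
    # even though A_eps -> A: the map A -> pinv(A) is discontinuous
    # at rank-deficient matrices.
    print(eps, np.linalg.norm(np.linalg.pinv(A_eps)))
```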

The following theorem will be used to decide when the range $\operatorname{Ran}(A)$ of a continuous operator between two Hilbert spaces is closed.

Theorem 2.10 (Bounded inverse theorem). Let $H_1$ and $H_2$ be Hilbert spaces. If $A \in \mathcal{L}(H_1, H_2)$ is bijective (a one-to-one and onto mapping), then the inverse map $A^{-1} \in \mathcal{L}(H_2, H_1)$.

Proof. The proof of Theorem 2.10 can be found in [28, Theorem 8.72].

The Closed Graph Theorem provides conditions under which a closed linear operator, as defined in Definition 2.3, is bounded.

Theorem 2.11 (Closed Graph Theorem). Let $H_1$ and $H_2$ be Hilbert spaces and let $A : \mathcal{D}(A) \to H_2$ be a linear operator from $H_1$ to $H_2$. If $A$ is a closed linear operator and its domain $\mathcal{D}(A)$ is closed in $H_1$, then the operator $A$ is bounded.

Proof. Since $H_1$ and $H_2$ are complete, so is $H_1 \times H_2$. By assumption, $\operatorname{gr}(A)$ is a closed subspace of $H_1 \times H_2$ and $\mathcal{D}(A)$ is a closed subspace of $H_1$; thus $\operatorname{gr}(A)$ and $\mathcal{D}(A)$ are complete. We now define the projection mapping
\[ P : \operatorname{gr}(A) \longrightarrow \mathcal{D}(A) \]
by
\[ P(x, Ax) := x. \]
The mapping $P$ is linear and bounded, since
\[ \|P(x, Ax)\| = \|x\| \le \|x\| + \|Ax\| = \|(x, Ax)\| \]
for all $x \in \mathcal{D}(A)$. Moreover, $P$ is bijective, and its inverse
\[ P^{-1} : \mathcal{D}(A) \to \operatorname{gr}(A) \]
is defined by
\[ P^{-1} x := (x, Ax) \]
for all $x \in \mathcal{D}(A)$. Since $\operatorname{gr}(A)$ and $\mathcal{D}(A)$ are complete, by the Bounded Inverse Theorem 2.10 the mapping $P^{-1}$ is bounded, so there is a constant $b$ such that
\[ \|(x, Ax)\| = \|P^{-1} x\| \le b \|x\| \]
for all $x \in \mathcal{D}(A)$. But this implies that $A$ is bounded, since
\[ \|A x\| \le \|A x\| + \|x\| = \|(x, Ax)\| \le b \|x\| \]
for all $x \in \mathcal{D}(A)$.

Applying the Closed Graph Theorem to the Moore-Penrose generalized inverse gives the following result.

Theorem 2.12. Let $A \in \mathcal{L}(H_1, H_2)$. Then $A^\dagger \in \mathcal{L}(\mathcal{D}(A^\dagger), H_1)$ if and only if $\operatorname{Ran}(A)$ is closed.

Proof. Before proving the result, we derive the following identity:
\[ \{(y_1, \tilde{A}^{-1} y_1) \mid y_1 \in \operatorname{Ran}(A)\} = \{(Ax, x) \mid x \in H_1\} \cap (H_2 \times \operatorname{Ker}(A)^\perp). \tag{20} \]
Let $y_1 \in \operatorname{Ran}(A)$ and $x := \tilde{A}^{-1} y_1$; by the definition of $\tilde{A}$, $x \in \operatorname{Ker}(A)^\perp$, and due to equation (14) we have
\[ A x = A A^\dagger y_1 = y_1. \]
Hence it follows that
\[ (y_1, \tilde{A}^{-1} y_1) = (Ax, x) \in H_2 \times \operatorname{Ker}(A)^\perp. \]
Conversely, if $x \in \operatorname{Ker}(A)^\perp$ and $y_1 := Ax$ (hence $y_1 \in \operatorname{Ran}(A)$), then $\tilde{A}^{-1} y_1 = A^\dagger A x = x$, so that $(y_1, \tilde{A}^{-1} y_1) = (Ax, x)$. Thus equation (20) holds.

By the definition of $A^\dagger$, we have for the graph of $A^\dagger$:
\begin{align*}
\operatorname{gr}(A^\dagger) &= \{(y, A^\dagger y) \mid y \in \mathcal{D}(A^\dagger)\}\\
&= \{(y_1 + y_2, \tilde{A}^{-1} y_1) \mid y_1 \in \operatorname{Ran}(A),\ y_2 \in \operatorname{Ran}(A)^\perp\}\\
&= \{(y_1, \tilde{A}^{-1} y_1) \mid y_1 \in \operatorname{Ran}(A)\} + (\operatorname{Ran}(A)^\perp \times \{0\}),
\end{align*}
which together with equation (20) implies that
\[ \operatorname{gr}(A^\dagger) = \big[\{(Ax, x) \mid x \in H_1\} \cap (H_2 \times \operatorname{Ker}(A)^\perp)\big] + \big[\operatorname{Ran}(A)^\perp \times \{0\}\big]. \tag{21} \]
The spaces on the right-hand side of equation (21) are closed and orthogonal to each other in $H_2 \times H_1$, so that their sum $\operatorname{gr}(A^\dagger)$ is also closed.

To prove the second part of the proposition, assume now that $\operatorname{Ran}(A)$ is closed, so that $\mathcal{D}(A^\dagger) = H_2$. Because of the Closed Graph Theorem 2.11, $A^\dagger$ is bounded. Conversely, let $A^\dagger$ be bounded; then $A^\dagger$ has a unique continuous extension $\bar{A}^\dagger$ to $H_2$. From equation (14) and the continuity of $\bar{A}^\dagger$ we conclude that
\[ A \bar{A}^\dagger = Q. \]
Hence, for $y \in \overline{\operatorname{Ran}(A)}$ we have $y = Q y = A \bar{A}^\dagger y \in \operatorname{Ran}(A)$. Consequently, we obtain
\[ \overline{\operatorname{Ran}(A)} \subseteq \operatorname{Ran}(A), \]
and therefore $\operatorname{Ran}(A)$ is closed.