
3.4 Contour Estimation

3.4.2 B-spline

In [9], a method based on a Gaussian mixture model of B-splines was proposed to infer the missing parts of object contours within the framework of overlapping object segmentation. Given a collection of prescribed object shapes, the B-spline mixture model addresses contour estimation and object shape classification simultaneously.

Given the predefined B-spline order d and the number of control points p, the contour of every individual object is constructed as a closed B-spline curve. That is, any particular contour point c_{i,j} ∈ C_i is approximated by the uniform periodic B-spline curve model

f_i(t) = ∑_{h=0}^{p−1} φ_{h,d}(t) p_{i,h}    (48)

where φ_{h,d}(t) is the blending function and p_{i,h} is the h-th control point. To be specific, any element c_{i,j} of the contour evidence C_i in the 2-dimensional discrete image space is assumed to be a noisy observation of the uniform periodic B-spline curve model with additive Gaussian noise

c_{i,j} = f_i(t_{i,j}) + ε_{i,j}.    (49)
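As an illustration of Equation 48, the closed curve for the common cubic case (d = 3) can be evaluated directly from the per-span blending polynomials. This is a minimal sketch, not code from [9]; the function name is illustrative:

```python
import numpy as np

def cubic_periodic_bspline(t, ctrl):
    """Evaluate a closed uniform cubic B-spline curve f_i(t) (Equation 48, d = 3).

    ctrl: (p, 2) array of control points p_{i,h}; t may be a scalar or array,
    interpreted modulo p so the curve is periodic (closed).
    """
    ctrl = np.asarray(ctrl, dtype=float)
    p = len(ctrl)
    t = np.atleast_1d(np.asarray(t, dtype=float)) % p
    seg = np.floor(t).astype(int)          # index of the knot span
    u = t - seg                            # local parameter in [0, 1)
    # Uniform cubic blending functions phi_{h,3}, restricted to one span.
    b = np.stack([(1 - u) ** 3,
                  3 * u ** 3 - 6 * u ** 2 + 4,
                  -3 * u ** 3 + 3 * u ** 2 + 3 * u + 1,
                  u ** 3], axis=-1) / 6.0
    # Each span blends 4 consecutive control points, wrapped around.
    idx = (seg[:, None] + np.arange(4)[None, :]) % p
    return np.einsum('tj,tjk->tk', b, ctrl[idx])
```

Because the blending functions form a partition of unity, the curve is closed (f(0) = f(p)) and reproduces a constant control polygon exactly.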

The complete B-spline mixture model is built up through two particular sub-tasks: 1) estimation of the spline parameters t_{i,j} and 2) estimation of the control points.

The task of estimating the B-spline curve parameters, the so-called data parameterization problem, is to determine the parameter values at which the B-spline model is evaluated. Given the contour C_i and the sequence of ordered contour points c_{i,1}, c_{i,2}, ..., c_{i,n_i} in R^2, data parameterization specifies the parameter values t_{i,1}, t_{i,2}, ..., t_{i,n_i} in R.

Several methods have been proposed in the literature for data parameterization [68], [69].

The most commonly used parameterization methods can be referred to as the general exponent methods, defined by

t_{i,1} = 0,  t_{i,j+1} = t_{i,j} + ‖c_{i,j+1} − c_{i,j}‖^α.    (50)

These methods yield the uniform, centripetal, and chordal parameterizations with α equal to 0, 0.5, and 1, respectively. The choice of method depends on the task properties and the designer's preference. A comprehensive study of parameterizations for the class of closed curves can be found in [70].
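The general exponent rule can be sketched in a few lines; this is illustrative code, not the authors' implementation:

```python
import numpy as np

def general_exponent_params(points, alpha):
    """Equation 50: t_1 = 0, t_{j+1} = t_j + ||c_{j+1} - c_j||^alpha.
    alpha = 0 gives uniform, 0.5 centripetal, 1 chordal parameterization."""
    points = np.asarray(points, dtype=float)
    steps = np.linalg.norm(np.diff(points, axis=0), axis=1) ** alpha
    return np.concatenate([[0.0], np.cumsum(steps)])
```

For example, for the points (0,0), (3,0), (3,4), the uniform choice gives t = (0, 1, 2) and the chordal choice gives the cumulative chord lengths t = (0, 3, 7).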

In [9], data parameterization in the presence of discontinuous data is addressed through the chordal method applied to the convex hull inscribing the contour points. The detailed steps are presented in Algorithm 10.

Algorithm 10: Chordal parameterization with discontinuous data
Input: Contour evidence set C_i = {c_{i,1}, c_{i,2}, ..., c_{i,n}}
Output: The parameter value set T_i = {t_{i,1}, t_{i,2}, ..., t_{i,n}}.

1. Determine the convex hull Q = {q_1, q_2, ..., q_L} inscribing the contour evidence C_i.
2. Order the points on the convex hull Q and apply the chordal parameterization, Equation 50.
3. Construct the parameter value set T_i by assigning the t_s obtained in step 2 to t_{i,j} ∈ T_i, j = 1, 2, ..., n, where the point q_s on the convex hull Q is the nearest point to c_{i,j}.
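The steps above can be sketched in pure NumPy, with a monotone-chain convex hull standing in for whichever hull routine [9] uses (function names are illustrative):

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    pts = sorted(map(tuple, points))
    def build(seq):
        h = []
        for q in seq:
            while len(h) >= 2 and cross(h[-2], h[-1], q) <= 0:
                h.pop()
            h.append(q)
        return h
    lower = build(pts)
    upper = build(reversed(pts))
    return np.array(lower[:-1] + upper[:-1], dtype=float)

def chordal_hull_params(contour):
    """Algorithm 10 sketch: chordal parameters on the convex hull, then each
    contour point inherits the parameter of its nearest hull vertex."""
    contour = np.asarray(contour, dtype=float)
    hull = convex_hull(contour)                              # step 1
    steps = np.linalg.norm(np.diff(hull, axis=0), axis=1)    # step 2: Eq. 50, alpha = 1
    t_hull = np.concatenate([[0.0], np.cumsum(steps)])
    dists = np.linalg.norm(contour[:, None, :] - hull[None, :, :], axis=2)
    return t_hull[np.argmin(dists, axis=1)]                  # step 3
```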

The mixture model of B-splines is defined within a probabilistic framework that uses maximum likelihood to determine the missing parts of the contours while being regularized towards a particular shape model. That is, given the contour evidence set C = {c_1, c_2, ..., c_m}, the main goal is to address two coupled problems at the same time: 1) determine the shape model that object c_i comes from, and 2) determine the missing parts of the object contour.

The shape modeling problem in the B-spline mixture model lies behind the estimation of the number and positions of the control points, as the shape properties of a B-spline are generally described through the number and positions of its control points. Here, the number of control points p is specified by the user, and the positions of the control points are adaptively inferred from the objects present in the image.

Each contour object c_i ∈ C, comprising n_i contour evidences (points in the 2-dimensional discrete image space), is identified by the stacked column vector x_i ∈ R^{2n_i}. The corresponding shape model is represented by the control point vector p_i ∈ R^{2p}, which is similarly stacked from p control points p_{i,h} ∈ R^2. The shape vector p_i encapsulates not only the shape information but also the pose parameters: scaling s_i, rotation θ_i, and translation c_i. Without loss of generality, the pose is separated from the normalized shape model vector p̃_i as

p_i = s_i (I_p ⊗ R(θ_i)) p̃_i + 1_{p×1} ⊗ c_i    (52)

where R(θ_i) is the 2-dimensional rotation matrix, 1_{p×1} is the vector of ones, and ⊗ stands for the Kronecker product.

Given K possible different shapes, each producing a Gaussian distribution model with mean vector µ_k and covariance matrix Σ_k, the full shape hypothesis model, considering every possible shape, is described by a mixture of Gaussian distributions

P(p̃_i | g_i, µ, Σ) = ∏_{k=1}^{K} N(p̃_i; µ_k, Σ_k)^{g_{i,k}}    (53)

where µ and Σ are the sets containing all the shape parameters,

µ = {µ_1, µ_2, ..., µ_K},  Σ = {Σ_1, Σ_2, ..., Σ_K}    (54)

and g_i is the hidden group membership indicator,

g_{i,k} = 1, if object i belongs to shape group k; 0, otherwise.    (55)

It is assumed that the prior probability of the hidden group membership variable follows a multinomial model,

P(g_i | α) = ∏_{k=1}^{K} α_k^{g_{i,k}}    (56)

where α is the vector comprising the probabilities (α_1, α_2, ..., α_K)^T that any contour evidence vector comes from each shape model, such that ∑_{k=1}^{K} α_k = 1.
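For intuition, the posterior membership probabilities implied by the mixture of Equations 53 and 56 (the responsibilities later used in the E-step) can be computed as follows. This is a generic Gaussian-mixture sketch, not code from [9]:

```python
import numpy as np

def shape_responsibilities(shape_vec, mus, covs, alphas):
    """Posterior P(g_{i,k} = 1 | shape) proportional to alpha_k * N(shape; mu_k, Sigma_k).
    Computed in log space for numerical stability."""
    shape_vec = np.asarray(shape_vec, dtype=float)
    d = shape_vec.size
    logw = []
    for mu, cov, a in zip(mus, covs, alphas):
        diff = shape_vec - mu
        quad = diff @ np.linalg.solve(cov, diff)       # Mahalanobis term
        _, logdet = np.linalg.slogdet(cov)
        logw.append(np.log(a) - 0.5 * (quad + logdet + d * np.log(2 * np.pi)))
    logw = np.asarray(logw)
    w = np.exp(logw - logw.max())                      # log-sum-exp trick
    return w / w.sum()
```

A shape vector lying at one component's mean, far from the others, receives essentially all of the posterior mass for that component.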

In a more general form, assuming the pose parameters correspond to the shape groups, the shape hypothesis model is given by

P(p̃_i | g_i, s_i, θ_i, c_i, µ, Σ) = ∏_{k=1}^{K} N(p̃_i; µ_k, Σ_k)^{g_{i,k}}    (57)

Recalling Equation 44 and the shape vector p_i, the likelihood of a single contour evidence is modelled by a multivariate Gaussian distribution

P(x_i | p_i, σ²) = N(x_i; Φ_i p_i, σ² I_{2n_i})    (59)

where Φ_i ∈ R^{2n_i×2p} is the block matrix storing the hypothesized B-spline model outputs, whose (j, h)-th block is φ_{h,d}(t_{i,j}) I_2.
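The block matrix Φ_i can be assembled from the same blending functions; the sketch below assumes the cubic (d = 3) uniform periodic basis and is illustrative only:

```python
import numpy as np

def bspline_design_matrix(ts, p):
    """Assemble Phi in R^{2n x 2p} whose (j, h) block is phi_{h,3}(t_j) I_2,
    for a closed uniform cubic B-spline with p control points."""
    ts = np.asarray(ts, dtype=float) % p
    seg = np.floor(ts).astype(int)
    u = ts - seg
    # Uniform cubic blending weights of the 4 active basis functions per span.
    weights = np.stack([(1 - u) ** 3,
                        3 * u ** 3 - 6 * u ** 2 + 4,
                        -3 * u ** 3 + 3 * u ** 2 + 3 * u + 1,
                        u ** 3], axis=-1) / 6.0
    phi = np.zeros((len(ts), p))
    for j in range(len(ts)):
        for k in range(4):
            phi[j, (seg[j] + k) % p] += weights[j, k]
    return np.kron(phi, np.eye(2))   # expand each scalar entry to a 2x2 block
```

With this convention, Φ_i multiplied by the stacked control point vector reproduces the curve of Equation 48 at the parameter values t_{i,j}, and each row sums to 1 by the partition of unity.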

Applying general optimization methods to the complete maximum likelihood of Equation 60 is not straightforward; in [9], the estimation of the unknown parameters is resolved using the Expectation Maximization (EM) algorithm along with its Expectation Conditional Maximization (ECM) variant [71].

3.4.3 Active Contour

The active contour approach for a set of overlapping objects relies on an active contour with a shape prior, the Active Shape Model (ASM), which is extended to address the segmentation of multiple overlapping objects simultaneously [72, 73]. ASM in its level set form is an implementation of level-set based active contour image segmentation in which shape prior knowledge is incorporated as a shape-driven functional.

The active contour with shape prior in its original form is an extension of the two-phase Mumford-Shah model [74], in which the region and boundary-length functionals are combined with a functional governing the shape prior constraint as

F = F_region + F_boundary + F_shape    (61)

The contour representation is the level set form of a Lipschitz distance function. Given the image f : Ω → R and the contours C ∈ Ω dividing the image into the foreground and background regions Ω_f and Ω_b respectively, the zero-level set representation φ of the interface C is given by

φ(c) = 0, if c ∈ C;  D(c, C), if c ∈ Ω_f;  −D(c, C), if c ∈ Ω_b    (62)

where D(c, C) is the function returning the minimum distance from c ∈ Ω to the contours C.
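On a pixel grid, Equation 62 can be realized with a brute-force distance to the boundary pixels; this is a sketch (a real implementation would use a fast distance transform):

```python
import numpy as np

def signed_level_set(mask):
    """Signed distance representation of a binary mask: positive distance
    inside Omega_f, negative outside, zero on the boundary C.
    Brute-force nearest-boundary search (fine for small masks)."""
    mask = np.asarray(mask, dtype=bool)
    pad = np.pad(mask, 1, constant_values=False)
    interior = pad[:-2, 1:-1] & pad[2:, 1:-1] & pad[1:-1, :-2] & pad[1:-1, 2:]
    boundary = mask & ~interior          # foreground pixels touching background
    ys, xs = np.nonzero(boundary)
    yy, xx = np.indices(mask.shape)
    d = np.sqrt((yy[..., None] - ys) ** 2 + (xx[..., None] - xs) ** 2).min(axis=-1)
    return np.where(mask, d, -d)
```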

The region functional F_region and the boundary functional F_boundary are defined as follows:

F_region(φ, λ_1, λ_2) = λ_1 ∫_Ω |f(x, y) − c_in|² Hφ(x, y) dxdy + λ_2 ∫_Ω |f(x, y) − c_out|² (1 − Hφ(x, y)) dxdy    (63)

and

F_boundary(φ, λ_3) = λ_3 ∫_Ω g(|∇f(x, y)|) |∇Hφ(x, y)| dxdy,    (64)

where Hφ is the Heaviside distribution, g is the edge detecting function, and c_in and c_out are the mean intensities of the foreground and background regions. λ_1, λ_2, and λ_3 are positive parameters that determine the proportion of each energy functional.

The shape functional F_shape exploits the sum of squared differences between the current contour representation φ(x, y) and the adjusted shape prior ψ(A(x, y)) with similarity transformation A(x, y), as

F_shape(φ, ψ, λ_4) = λ_4 ∫_Ω (φ(x, y) − ψ(A(x, y)))² dxdy.    (65)

The shape prior ψ is essentially in level set form, and it may be a single reference shape or a statistical model obtained from a collection of training shapes. The evolving model φ is aligned and adjusted to the shape prior ψ via a 2-dimensional Euclidean similarity transformation

A(x, y) = (s, θ, T). (66)

The original form of ASM is restricted to single object segmentation and cannot deal with the more complex scenario of multiple occluded objects. To handle the segmentation of multiple overlapping objects, ASM is applied in a manner that allows each pixel f(x, y) to be associated with any of the objects or with the background region [73].

Formally, assuming the image I consists of n objects {O_1, O_2, ..., O_n}, the characteristic function χ is defined to associate one particular evolving model per object as

χ_i(x, y) = 1, if (x, y) ∈ O_i;  0, otherwise.

The energy functional for the simultaneous segmentation of multiple occluded objects extends Equation 61, the key feature being a fourth term which, by imposing a penalty on the area shared among the segmenting regions, prevents any pair of evolving curves φ_i and φ_j from becoming identical.

Keeping the parameters c_in and c_out fixed, minimizing this energy functional with respect to the dynamic variables φ_i, s_i, θ_i, and T_i using the calculus of variations leads to the associated Euler-Lagrange equations. Parameterizing the descent direction within an artificial time span t > 0, the flow of the evolving curve is given by

∂φ_i/∂t = δ(φ_i) [ (−λ_1 (f − c_in)² + λ_2 (f − c_out)²)(1 − Hφ_j) + λ_3 ∇·(∇φ_i / |∇φ_i|) − λ_4 Hφ_j ] − 2λ_5 (φ_i − ψ_i).    (70)

The flow of the similarity transformation parameters is given by

∂θ_i/∂t = 2λ_5 ∫_Ω (φ_i − ψ_i)(∇ψ_i · ∇_θ A_i) dxdy    (71)

∂T_i/∂t = 2λ_5 ∫_Ω (φ_i − ψ_i)(∇ψ_i · ∇_T A_i) dxdy    (72)

∂s_i/∂t = 2λ_5 ∫_Ω (φ_i − ψ_i)(−ψ_i / s_i + ∇ψ_i · ∇_s A_i) dxdy,    (73)

where ψ_i = ψ_0(A_i(x, y)) / s_i and i ≠ j.

4 Experiments

This section presents the experimental results obtained using both synthetic and real datasets. In order to analyze the performance of the segmentation framework in more detail, each part of the framework, including seed point extraction, contour evidence extraction, and contour estimation, is evaluated separately. Each module of the framework is explored and evaluated both qualitatively and quantitatively. Based on the obtained results, a new method is proposed that combines the best sub-method for each task.

4.1 Data

The experiments were carried out using synthetic and real datasets of overlapping objects.

Synthetic dataset: The synthetic dataset is a collection of ellipse-shaped objects that are uniformly randomly scaled, rotated, and translated. The elliptic objects are generated so as to represent various degrees of overlap. The dataset consists of 150 sample images divided into three classes of overlap rates. The maximum overlapping area allowed between two objects is 40% in the first dataset, 50% in the second, and 60% in the third. Each class contains 50 images with 40 objects per image. The ellipses are simulated with random orientation, with a minimum width and length of 30 pixels and a maximum width and length of 45 pixels.

The image size is 300 × 400 pixels. Figure 15 shows examples of the synthetic dataset.
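The generation procedure described above can be sketched as rejection sampling of ellipse masks; the acceptance test and all parameter names are this sketch's assumptions, not necessarily those used to build the actual dataset:

```python
import numpy as np

def make_synthetic_image(n_objects=40, shape=(300, 400), max_overlap=0.4, seed=0):
    """Draw randomly posed ellipses (axes 30-45 px), rejecting any candidate
    whose overlap with an accepted object exceeds max_overlap of the
    smaller object's area. Returns a list of boolean masks."""
    rng = np.random.default_rng(seed)
    yy, xx = np.indices(shape)
    masks = []
    while len(masks) < n_objects:
        a, b = rng.uniform(15, 22.5, size=2)       # semi-axes -> widths 30-45 px
        m = max(a, b)
        cy = rng.uniform(m, shape[0] - m)          # keep the ellipse inside
        cx = rng.uniform(m, shape[1] - m)
        th = rng.uniform(0, np.pi)                 # random orientation
        u = (xx - cx) * np.cos(th) + (yy - cy) * np.sin(th)
        v = -(xx - cx) * np.sin(th) + (yy - cy) * np.cos(th)
        cand = (u / a) ** 2 + (v / b) ** 2 <= 1.0
        ok = all((cand & prev).sum() <= max_overlap * min(cand.sum(), prev.sum())
                 for prev in masks)
        if ok:
            masks.append(cand)
    return masks
```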

Real dataset: The real dataset contains crystal particle images (see Figure 16) captured by transmission electron microscopy. In total, this dataset contains 11 images of 4008 × 2672 pixels. Around 200 particles in each image are marked manually by an expert as ground truth.

The original ground truth annotations consist of manually drawn object contours, from which the centroid and visible boundary of each marked particle are computed. Since not all objects present in the real dataset images were marked, a preprocessing step was applied to eliminate the unmarked objects from the images. It should be noted that the foreground-background segmentation is not very involved in the real images, where the object instances appear as dark regions that can be separated from the white background. Three examples of the preprocessed real dataset are shown in Figure 17.

Figure 15. Examples of the synthetic dataset.

Figure 16. Examples of the real dataset.

Figure 17. Examples of annotation and preprocessed images.