• Ei tuloksia

Structure of the Thesis

This thesis is organized in 6 chapters as follows. In Chapter 2, the introductory objectives reviewing the traditional segmentation methods, segmentation methods with shape prior, and state-of-art methods for overlapping object segmentation are presented. In Chapter 3, the general framework to be adopted for segmentation of overlapping objects and the existing methods for each framework step are described. In Chapter 4, each module of the framework is explored and evaluated and based on the obtained results, a method for segmentation of overlapping convex objects is proposed. The proposed method is applied to the synthetic and real datasets and the es-timated results are depicted in parallel with the two existing methods. In Chapter 5, a discussion on future research directions are presented. Finally, in Chapter 6, the conclusions are drawn.

2 Machine Vision and Image Segmentation

This chapter provides introductory information about different segmentation techniques. In Sec-tion 2.1, popular segmentaSec-tion methods are reviewed. SecSec-tion 2.2 provides basic informaSec-tion about the segmentation of objects with known shapes. Section 2.3 investigates several existing methods for overlapping object segmentation.

2.1 Segmentation Methods

Segmentation is one of the fundamental steps in machine vision and image analysis. The main objective of image segmentation is to divide an image into constituent parts and extract important details in an image. Image segmentation partitions an image into several similarity regions in which each of the pixels in the regions are similar with respect to homogeneous characteristics such as gray-level, color, intensity, or texture. There are several approaches available for image segmentation. In the following sections, popular segmentation methods are briefly reviewed.

2.1.1 Threshold-Based Segmentation

Threshold-based segmentation is considered to be the simplest technique for extracting a fore-ground region from the image backfore-ground. The main idea is based on the intensity difference where the areas of an image, whose pixel values fall under a certain range, corresponds to a par-ticular segment. The characteristic of regions of interest is set by a predefined threshold value.

The threshold-based segmentation methods depend solely on pixel values and do not consider the spatial properties of the image. Global and local thresholding are the two most commonly used approaches for threshold-based segmentation. In global thresholding all pixels in the image are segmented with same threshold value while in local thesholding the threshold value varies locally. The key of thresholding segmentation is to choose the threshold value. Otsu’s [10], and Kittler’s [11] methods are the two most commonly used methods to choosing the threshold value.

The Otsu’s method is aimed at finding the optimal value for the global threshold by minimizing the within-class variance of the resulting foreground and background classes. Kittler’s method presents the segmentation as a two class classification problem that can be distinguished based

on the gray-level histogram of the image, so as to find a threshold that minimizes the number of miss-classified pixels by using Bayesian classification rules. An example of image segmentation by the thresholding method is shown in Figure 2. A comprehensive survey can be found in [12].

Figure 2. An example of threshold based segmentation: a) Original image; b) Segmentation result. [13]

2.1.2 Edge-Based Segmentation

The edge-based detection techniques are a common approach in image segmentation in which the edges obtained from the image are exploited to recognize the regions boundaries. The main motivation is that every single object in an image is surrounded by a closed border that can be de-tected by means of image intensity values. Edge points are the discontinuities in image intensity which basically are estimated by image first and second order derivatives. The most commonly used ones are the Sobel [14], Canny [15], Prewitt [16], and Roberts [17] edge detectors.

Edge-based segmentation basically applies an edge filter to the image, by which each pixel is classified as either edge or non-edge. Complementary processing, e.g. thresholding, is required to combine edges into edge chains by building up better boundaries and dividing the image into different segments. Figure 3 shows an example of edge-based segmentation with thresholding.

Figure 3. An example of edge-based segmentation. [18]

2.1.3 Region-Based Segmentation

Region-based segmentation algorithms are based on the idea that the pixels in the neighborhood of a particular region share similar intensity values. In general, region-based segmentation is a pixel-wise operative algorithm in which each pixel is compared to its neighborhood pixels, and if a pre-defined homogeneity condition is satisfied, the pixel and its neighbors are classified as a particular object. Region-based techniques can be divided into two categories: region growing [19] and region splitting-merging [20] algorithms.

The region growing segmentation starts with a set of seed points as the initial region hypothe-sis and then proceeds to grow the regions by adding adjacent pixels to the regions based on a predefined criteria. This procedure continues untill there is no more adjacent pixels that satisfy the criteria. The seed point initialization and similarity criteria are the key elements for region growing algorithm affecting their performances.

The region splitting-merging algorithm is an iterative implementation of region-based segmenta-tion that follows a top-to-bottom approach. The splitting-merging algorithm compares adjacent regions and merges them if they are homogeneous (see Figure 4). Specifically, it starts with the whole image as a single region, and then splits it into subsequent four disjoint quadrants. It iter-atively continues this process for each subsequent region until all pixels contained in the regions do not satisfy a certain similarity constraint, e.g. grey-level homogeneity. The splitting-merging algorithm produces adjacent regions that contain very similar properties. The most serious

dis-advantage of this method is that the algorithm is sensitive to image translation [21].

2.1.4 Watershed Transform

Watershed transform [22] can be seen as a region-based segmentation applied to a gradient im-age. It is basically, a morphological algorithm in which image edges are processed in order to extract the regions of similar and homogeneous pattern or color. The Watershed algorithm defines the region by pixels with local intensity minima and the region boundaries, watershed lines, by pixels that have the highest gradient magnitude intensities. In watershed transforma-tion, the gradient image is considered as a topographical surface in which the edges resemble the region boundaries (watershed lines) and the areas with local intensity minima resemble the regions (catchment basins).

Watershed transform is simple, computationally efficient, reliable, and it can reach acceptable results even with low contrast images. However, there are limits to what extent watershed trans-form can be used for segmentation. As the Watershed transtrans-form explores the image gradient it is highly sensitive to local minima as at each local minima an object is allocated. Watershed shows low performance when it is applied to images with thin structures [23]. In [24], three approaches namely filter and preprocessing, marker watershed, and hierarchical watershed are proposed to solve the problem of over-segmentation. Figure 5 shows an example of watershed segmentation where region markers were adopted to eliminate the overestimation.

Figure 4. Examples of region based segmentation: a) Original image; b) Segmentation result. [25]

Figure 5. An example of watershed segmentation: a) Original image; b) Gradient image; c) Watershed segmentation without using region markers (oversegmentation); d) Watershed segmentation using region markers. [26]

2.1.5 Active Contour

Active contour or snake is one of the most widely used methods in image processing [27]. Ac-tive contour aims to divide an image into multiple sub-regions of continuous boundaries. Given an initial boundary shape in the form of a closed contour, active contour models seek for the region of interest by modifying the shape boundaries through shrinking and expansion opera-tions of subject in a particular set of constraints, a so-called contour evolution. The underlining constraints that controls the contour evolution process is modeled by minimization of a certain energy functions, such as the region-based segmentation methods. The energy functionals in active contour models resemble a sort of energy criteria that, e.g. through Euler-Lagrange

equa-tions, build up a parabolic Partial Differential Equation (PDE) whose solution is the changes to apply active contour model.

The original active contour model introduced by Kass et al [28] is a boundary-based method that are governed by two terms of energy components: internal and external energy functionals.

The internal energy functional controls the smoothness of the curves and reduces the tension and stiffness. The external energy functional forces the curve to move towards the boundary of the objects in the image. Also region-based active contour models have been proposed, in which the energy functionals are based on the region statistical features, e.g. mean intensity value, rather than boundary driven forces. The Mumford and Shah model [29] is the commonly used approach that divides image a into a set of disjointed homogeneous regions. Figure 6 shows an example of active contour segmentation.

Figure 6. Active contour segmentation: a) Original image with initial contour; b) Segmentation result.

[30]

2.1.6 Graph Cut

In the recent years, there has been an increasing interest in the graph cut segmentation methods.

Graph cut methods are energy minimization segmentation models where the image is interpreted as discrete graph space. The graph model of an image is formed by nodes, and the edges rep-resenting the pixels, and the connected pixel based on similarity measurements. Figure 7 shows an example of a graph cut segmentation on an image, where the background pixels are blue, the

foreground pixels red, and the unlabeled pixels green.

Several graph cut methods with different energy functions have been proposed. Graph cut seg-mentation methods can be categorized as speed up-based graph cut, interactive-based graph cut and shape prior-based graph cut methods [31]. In general, the energy function to be minimized is constructed from summing up the edge weights. The weight of an edge in the image graph model is basically a measure of the similarity between the pixels that depends on the design of the algorithm. Graph cut is similar to active contour, but it is non-iterative and can reach global optimal result for different classes of energy functions. An advance discussion is further available in [31].

Figure 7. Graph cut segmentation: a) Original image; b) Graph cut segmentation result. [32]