
Degree Programme in Technomathematics and Technical Physics

Yuliya Fetyukova

DEVELOPING RASTER BANDS COMPUTED FROM LIDAR TO IMPROVE FOREST STAND SEGMENTATION

Examiners: Docent Ph.D. Tuomo Kauranne and Professor Ph.D. Heikki Haario


Lappeenranta University of Technology
Faculty of Technology

Degree Programme in Technomathematics and Technical Physics

Yuliya Fetyukova

Developing raster bands computed from LiDAR to improve forest stand segmentation

Master's thesis 2013

135 pages, 107 figures, 1 table

Examiners: Docent Ph.D. Tuomo Kauranne and Professor Ph.D. Heikki Haario

Keywords: LiDAR, surface interpolation, inverse distance weighting, noise reduction, edge preserving filtering, median filtering, mean shift filtering, segmentation, region growing, homogeneity criteria, AIC, forest stand

This study examines the use of different features derived from remotely sensed data in the segmentation of forest stands. Surface interpolation methods were applied to LiDAR points in order to represent the data in the form of grayscale images. Median and mean shift filtering were applied to the data for noise reduction. The ability of different compositions of rasters obtained from LiDAR data and an aerial image to maximize stand homogeneity in the segmentation was evaluated. The quality of the forest stand delineations was assessed by the Akaike information criterion.

The research was performed in co-operation with Arbonaut Ltd., Joensuu, Finland.


This thesis is dedicated to my parents, Olga and Nikolay Fetyukov, for their love, endless support and encouragement.


I would like to express my deep gratitude to Dr. Tuomo Kauranne, my research supervisor, as well as Juhani Saastamoinen and Anna Eivazi, my research instructors, for their guidance and encouragement throughout this research work.

I would like to thank Arbonaut Ltd. for providing me with the study materials and software. I am particularly grateful to the staff of the company for their assistance in doing the forest data analysis.

I would like to thank the Department of Mathematics and Physics and Arbonaut Ltd. for their financial assistance.

My special thanks are extended to Professor Alexander Kolesnikov for his continuous professional guidance, valuable support, useful and constructive recommendations on the issues related to image analysis and computer vision.

I am particularly grateful for his willingness to give his time so generously.

I would like to acknowledge the support provided by Javier Rodriguez and his family during my thesis writing. I wish to thank my friends for their encouragement and motivation. I am especially grateful to Alexey Voronov, Damira Kuanyshpayeva, Tatyana Chuzhanova, and Olga and Alexey Kulikov. I would like to thank Professors Valentina and Vladimir Kulikov for their support throughout my study.

I would like to express my sincere thanks to my family. I show my special gratitude to Vera Fetyukova-Skulditskaya for her commitment to excellence in all aspects of my life.

Lappeenranta, February 20, 2013.

Yuliya Fetyukova


Contents

List of Symbols and Abbreviations 8

1 INTRODUCTION 9

1.1 Remote sensing technology overview . . . 9

1.2 Study objectives . . . 11

1.3 Thesis organization . . . 12

2 FOREST INVENTORY METHODS 14

2.1 Field surveys . . . 14

2.2 Remote sensing and GIS . . . 15

2.2.1 Satellite images . . . 16

2.2.2 Aerial photographs . . . 16

2.2.3 LiDAR data . . . 17

2.2.4 Remote sensing data source selection . . . 19

3 SURFACE INTERPOLATION FROM LIDAR DATA 20

3.1 Inverse distance weighting interpolation . . . 23

3.2 Kriging interpolation . . . 24

3.3 Spline interpolation . . . 24

3.4 Interpolation method selection . . . 25

4 IMAGE ANALYSIS TO FOREST INVENTORIES 28

4.1 Visual interpretation of aerial and satellite imageries . . . 28


4.2 Digital image processing . . . 29

4.2.1 Image noise reduction . . . 29

4.2.1.1 Impulse noise and Gaussian noise . . . 29

4.2.1.2 Introduction to smoothing filters . . . 33

4.2.1.3 Edge preserving filters . . . 40

4.2.2 Delineation of forest stands . . . 43

4.2.3 Image segmentation . . . 45

4.2.3.1 Region growing . . . 45

4.2.3.2 Region splitting and merging . . . 47

4.2.4 Segmentation quality metrics . . . 49

5 AUTOMATIC STAND DELINEATION 52

5.1 Data acquisition . . . 52

5.2 Data preparation . . . 53

5.3 Raster creation . . . 54

5.4 Filtering . . . 66

5.5 Segmentation . . . 71

5.5.1 Input data . . . 71

5.5.2 Algorithm . . . 77

5.5.3 Results . . . 81

5.5.4 Postprocessing . . . 95

5.5.5 Delineation quality analysis . . . 95

5.5.5.1 Visual assessment . . . 95


5.5.5.2 Homogeneity evaluation . . . 114

6 CONCLUSION 118

REFERENCES 120

List of Tables 129

List of Figures 130


List of Symbols and Abbreviations

AGL Above Ground Level

AIC Akaike's Information Criterion

ALS Airborne Laser Scanning

AMSL Above Mean Sea Level

ASCII American Standard Code for Information Interchange

CIR Colour InfraRed

GeoTIFF Geographic Tagged Image File Format

GIS Geographical Information System

GPS Global Positioning System

ha Hectares

IDW Inverse Distance Weighting

IMU Inertial Measurement Unit

LiDAR Light Detection And Ranging

NDVI Normalized Difference Vegetation Index

NIR Near InfraRed

PDF Probability Density Function

RGB Red, Green, Blue

SD Standard Deviation

TIN Triangulated Irregular Network


1 INTRODUCTION

1.1 Remote sensing technology overview

Remote sensing has developed rapidly in recent years, providing scientists with new ways of analyzing data and solving problems in areas such as natural resource management, urban planning, and geographical information systems (GIS) [5, 29, 47]. Remote sensing is the technology of obtaining information about objects from a distance. Remote sensing data sources take many forms, including aerial photography, satellite imagery, and airborne (or aerial) laser scanning (ALS). Remotely sensed data can be collected from aircraft, spacecraft, and ships. The data are used to produce maps of the earth's surface, seafloor topography, natural resources, and urban infrastructure, as well as to determine current atmospheric conditions [5].

Remote sensing systems derive information about the features being evaluated by analyzing the energy reflected or emitted from them. There are two main types of remote sensing systems: imaging (passive remote sensing) systems and nonimaging (active remote sensing) systems. Imaging systems, such as an aerial camera and photographic film, systematically record the energy received from an area of the earth's surface and produce a detailed picture of the scene.

Nonimaging systems, like radar, laser and sonar positioning systems, collect a series of measurements at discrete locations or repeatedly along a single line.

Such systems use microwaves, laser light, or sound waves to illuminate the target area [5].

The process of remote sensing involves an interaction between incident radiation and the targets of interest (Figure 1). In much of remote sensing, the process involves the following elements [5, 68, 76].


Figure 1: The remote sensing process

1. Energy source or illumination (a). The first requirement for remote sensing is an energy source which illuminates or provides electromagnetic energy to the target of interest. The source of the energy reflected by the object may be the sun or the remote sensing system itself. For example, radar systems illuminate features on the earth by emitting microwave pulses.

2. Radiation and the atmosphere (b). As the energy travels from its source to the target, it comes into contact and interacts with the atmosphere it passes through. This interaction may take place a second time as the energy travels from the target to the sensor. Atmospheric effects on this energy differ depending on the wavelength of the energy and the atmospheric conditions.

3. Interaction with the target (c). Once the energy reaches an object, some of it is reflected back, some is transmitted through the object, and some is absorbed. Objects re-emit absorbed energy as heat.

4. Recording of energy by the sensor (d). The energy reflected or emitted from the object then passes through the atmosphere to be detected by sensor systems. Sensor systems detect the energy from the object using film (photography), electronic detectors (scanners, digital cameras, etc.), or antennae (radar) and record it as a physical image or as digital data that can be used to generate an image.

5. Transmission, reception, and processing (e). The energy recorded by the sensor has to be transmitted, often in electronic form, to a receiving and processing station where the data are processed into an image.


6. Interpretation and analysis (f). The processed image is interpreted, visually and/or digitally, to extract information about the target which was illuminated.

7. Application (g). The results of interpretation and analysis are used to apply the information extracted from the imagery about the target in order to better understand it, reveal new information, or assist in solving a particular problem.

Remote sensing provides information that is essential for resource inventory and monitoring. The technology offers important advantages over other methods of data collection that have led to its use in a wide range of applications.

Some of the major applications in different areas include: crop condition monitoring, forest inventory and forest health monitoring, climate monitoring and seasonal climate prediction, monitoring of water and land surface features, rapid mapping and damage assessment after natural disasters (hurricanes, earthquakes, landslides, floods, etc.), mapping of corridors (roads, railway tracks, pipelines, etc.) and electrical transmission lines, landscape design, and location of fire, ground, and sea targets for military purposes [5, 49].

1.2 Study objectives

During the 2000s, ALS has come to play an important role in practical forest inventories, especially in Scandinavia. In Finland, the current forest inventory system is based on a combination of ALS data, aerial imagery, and field surveys.

Recently there has been considerable interest in replacing the old field inventory with remotely sensed data interpretation [47]. Remote sensing and GIS make it possible to improve the monitoring, mapping, and management of forest resources. The data that support forest management are stored primarily in the form of forest inventory databases within a GIS environment. A forest inventory can be defined as a survey of the location, composition, and distribution of forest resources. A forest inventory generalizes complex forest resource attributes into mapping units useful for forest management. The types of attributes attached to individual mapping units, or polygons (stands), might include stand species composition, density, height, age, leaf area index, and so on.

The remotely sensed data collected for the forest inventory are generated by interpretation of aerial photographs or airborne and satellite digital imagery [5].

A forest stand is the basic unit for management planning and data collection.

Stands are delineated on the basis of forest management needs. Using image segmentation techniques, remotely sensed data can be divided into spatially disjoint homogeneous regions. Computer vision research has focused on developing segmentation algorithms based on either colour or texture features, and some attempts have been made to combine these to build a unified segmentation concept [42].

Forest stand delineation using homogeneity criteria is based on knowledge of the area-specific tree species distribution. The most important commercial tree species groups in Finnish forests are Scots pine (Pinus sylvestris), Norway spruce (Picea abies), and deciduous trees. The two conifers constitute more than 80% of the growing stock. The latter group consists mainly of birches (Betula pubescens and Betula pendula) [47]. A typical Finnish forest area is considered in this research. The aim of this study is to improve the segmentation of forest stands using ALS and aerial image data. In particular, this work examines the use of different features derived from remotely sensed data which maximize stand homogeneity in the segmentation. The following specific objectives were established:

1. to create rasters from LiDAR points and an aerial image by means of interpolation and image arithmetic methods;

2. to reduce the noise level in the rasters;

3. to test different combinations of the rasters and parameters to find reliable inputs for image segmentation;

4. to define stand homogeneity and evaluate the segmentation performance in the delineation of homogeneous forest stands.

1.3 Thesis organization

This thesis is arranged as follows. Chapter 2 provides an introduction to forest inventory methods such as field surveys and remote sensing. Chapter 3 describes data interpolation methods for representing remotely sensed data in the form of grayscale images. Chapter 4 is devoted to image processing and analysis methods widely used in forest inventories: visual image interpretation, image noise reduction, segmentation, and its quality assessment. Chapter 5 describes the experimental part of the study. It explains the study materials and the methods used throughout the research to achieve forest stand delineation: data preparation, raster creation, raster filtering, segmentation, and its results. Chapter 6 summarizes the experimental results and presents the conclusions of the study.


2 FOREST INVENTORY METHODS

The variables of interest in forest inventory can include the amount of trees by means of stem volume or biomass, stand structural information, health data or plant physiological data, volume variables such as tree size (diameter), tree height, and total number of trees, as well as forest growth, tree species, and different area information (site classes, stand development stages). Forest inventory is usually based on principles of sampling theory. In Finland, forest management planning in private forests is usually based on information collected by forest stands. Aerial images have already been used for decades in practical forestry in Finland for stand delineation. Visual interpretation of aerial images for single tree characteristics such as species, height, crown width and area, and breast height diameter has been tested in many experimental studies [30].

ALS has recently become an important technique for tree data acquisition. It produces a 3D point cloud which approximates the surface of the earth. ALS can observe the non-visible part of forests, such as co-dominant and suppressed canopy layers. ALS data are regarded as having a greater potential for characterizing the canopy structure than other remote sensing materials [30, 47].

2.1 Field surveys

A field survey requires visiting every stand of the target area and measuring a sufficient number of subjectively located sample plots, typically several per stand. Usually such forest stand characteristics as basal area, which is the total cross-sectional area of the trees in a stand at breast height (1.3 meters or 4.5 feet above the ground), basal-area medium diameter, and mean height are visually assessed from these sample plots [20]. Plot locations are determined with a sampling strategy that depends on the aims of the inventory, the shape of the inventory area, and possibly the forest characteristics. It is recommended to include each forest type of the area in the samples. The number of plots required depends on the desired accuracy of the estimates and the variability of the forest characteristics in the inventory area [19].

In boreal forests, stand characteristics of field sample plots are mainly measured using relascope sampling. In such measurements, the trees are viewed from the centre point of the plot and included in it if the breast height diameter fills the horizontal angle of the relascope. The inclusion probability is proportional to the basal area of the tree [19].
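The tally-times-factor arithmetic behind angle-count (relascope) sampling can be sketched as follows. This is a generic, hedged illustration, not the field protocol of the inventory described here: the function name, the metric relation BAF = 2500 k² between the basal area factor and the relascope gauge ratio k, and the example trees are all assumptions of ours.

```python
import math

def relascope_basal_area(trees, baf=1.0):
    """Angle-count (Bitterlich / relascope) sampling sketch.

    trees: iterable of (diameter_m, distance_m) pairs around the plot
    centre.  A tree is tallied when its breast-height diameter fills the
    relascope's horizontal angle, i.e. when diameter / distance >= k,
    where the gauge ratio k relates to the basal area factor (BAF, in
    m^2/ha) through BAF = 2500 * k**2 in metric units.  The stand basal
    area estimate is then simply tally * BAF.
    """
    k = math.sqrt(baf / 2500.0)  # gauge ratio implied by the chosen BAF
    tally = sum(1 for d, r in trees if d / r >= k)
    return tally * baf           # basal area estimate, m^2/ha
```

With BAF = 1 the gauge ratio is 0.02, so a 30 cm tree is tallied out to 15 m; larger trees are counted from farther away, which is exactly the inclusion probability proportional to basal area mentioned above.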

Single-tree measurements of field sample plots are acquired when more accurate measurement information of stand characteristics is needed. Such observations may also be needed to assess the role of small undetectable trees [19, 21].

Field observations are used for validation and calibration of a model based on data derived from ALS or aerial imagery. Model accuracy can be controlled and improved by analyzing the differences between field-measured and model-estimated characteristics. Field sample plot measurements serve as a ground-truth data source [21].

Modern field measurements are carried out with the aid of field computers and satellite-positioning systems, with which measured forest data can be transferred directly to databases accompanied by accurate positioning data [16].

2.2 Remote sensing and GIS

Technologies for acquiring spatial forest resource data have developed rapidly in recent years. Modern remote sensing is now able to provide cost-efficient spatial digital data [16]. Various sources of remotely sensed data have been tested and used in forest inventory by now. Remotely sensed data obtained from different source types have been used as auxiliary data covering the entire inventory area. Plotwise estimates are derived by merging remote sensing data and field measurements using a suitable mathematical model [19].

Forest inventory characteristics obtained with remote sensing are linked to their spatial measurement locations by geographical coordinates using GIS. In forest inventory, GIS means merging inventory data and database technology.

This information can be stored digitally in vector or raster form. Vector form defines areas by the vectors bounding them; raster form, by rows and columns of pixels as small area units. Remotely sensed data are operated in a positioning system, and all measurements are handled together with the given map information [19]. Three main types of remotely sensed data used in forest inventory, satellite images, aerial photographs, and LiDAR data, are considered below.


2.2.1 Satellite images

Satellite images refer to images that are collected using remote sensing instruments in orbit around the earth. The space instruments are typically satellite-based, although low-earth-orbiting space stations or reusable vehicles, such as the space shuttle, are also used [5].

Forest inventory applications based on satellite image processing and analysis have been widely used to produce information on forest resources for large areas (municipal, regional or national level) [26, 46]. Satellite image acquisition is problematic since it is important to obtain cloud-free images. A suitable imaging season in Finland is from mid-May until the end of August with the optimal time from early June until the end of July [44].

Satellite image data may consist of several spectral bands, or channels, with each channel representing an image at a different wavelength, varying from ultraviolet light and visible light to infrared. Satellite image spatial resolution varies depending on the equipment and the channel. Spatial resolution may be defined as the minimum distance between two objects that a sensor can record distinctly [44]. Spatial resolution often refers to the fineness of detail visible in an image [5]. Mid-resolution satellite images, for example Landsat and Spot, have a spatial resolution between a few meters and approximately 30 meters. The spatial resolution of modern high-resolution satellite imagery, such as IKONOS and QuickBird, varies from less than a meter to a few meters, depending on the band mode [19]. In forestry, high-resolution satellite imagery could in principle be used for forest planning purposes, but the high cost of these images has hindered such development up to now [16].

2.2.2 Aerial photographs

The term airborne imagery or aerial photographs refers to data collected from aircraft, helicopters, and other vehicles operating within the earth's atmosphere [5]. Aerial photographs can cover large areas; multiple photographs are combined to cover the whole area. Aerial photographs can be used for small or large area inventories at relatively cheap cost [19].

Aerial photographs can consist of several spectral channels. Visible light channels measuring red, green, and blue (RGB) colour wavelengths, or near infrared (NIR) or colour infrared (CIR) channels, can be used in airborne imagery. CIR is a combination of RGB (or RG) and NIR channels. Pixel size and flight altitude define the resolution and usability of the data [19].

Aerial images provide map-like representations of the earth's surface such as mosaics and orthophotos. Aerial mosaics are produced by assembling adjacent aerial photographs to form a single image representing a regional view beyond that provided by any single photograph. Digital aerial photographs can be rectified to the desired coordinate system. In order to provide a planimetrically correct aerial image, the effects of image displacements due to camera tilt and terrain relief must be removed [5]. The result of such a correction, an orthophoto, is spatially almost as accurate as an ordinary map, and the image is highly scalable. Digital orthophotos are currently used mostly as background images in forestry mapping and GIS applications, for example for on-screen visual interpretation [16].

Photographing can only take place under an unclouded sky. Different conditions of lighting and shade, depending on the weather and the time of day and year, affect the colour range and shade directions in each photograph. Suitable correction methods are needed to standardize the photographs of a given area to fit the same conditions. After standardization, inventory data can be produced using human interpretation or automated mathematical methods [19].

2.2.3 LiDAR data

ALS is one of the most recent remote sensing data sources applied to forest inventory. ALS is often referred to as Light Detection And Ranging (LiDAR).

LiDAR measurements are mainly used for small area inventories, for example in forest management planning [19].

LiDAR is based on distance measurements and precise orientation of the measurements between a sensor and a reflecting surface [41]. The position and rotation of the sensor are continuously recorded using a differential global positioning system (GPS) and inertial measurement units (IMU) along the flight path. The airborne sensor sends laser pulses towards the ground and records the time lapse between the launch of the beams and the return of the signals to the sensor. By knowing the sensor position, the distance, and the incidence angle of each measurement, the coordinates of the reflecting object can be calculated. The scanning mechanism sweeps the laser beam across the flight line, providing coverage across the flight track. In forestry applications, the typical flight altitude is 200-2000 meters above ground level. The scanning angle has usually varied between 7 and 15 degrees, and the pulse density on the ground is mostly between 0.1 and 10 hits per square meter.
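As a toy illustration of this geometry, the coordinates of a reflecting point can be recovered from the sensor position, the measured range, and the beam orientation. The model below is deliberately simplified (a level platform with the beam tilted off nadir, perpendicular to the flight heading); a real system derives the full orientation from the GPS/IMU data, and all names here are our own assumptions.

```python
import math

def lidar_ground_point(sensor_xyz, range_m, heading_deg, scan_deg):
    """Recover the reflecting point from sensor position, range, and
    beam orientation (simplified level-platform geometry).

    heading_deg: flight direction, measured from the x-axis.
    scan_deg:    beam tilt off nadir, to the right of the flight track.
    """
    sx, sy, sz = sensor_xyz
    head = math.radians(heading_deg)
    scan = math.radians(scan_deg)
    horiz = range_m * math.sin(scan)   # horizontal across-track offset
    # the unit vector 90 degrees to the right of the heading is (sin, -cos)
    x = sx + horiz * math.sin(head)
    y = sy - horiz * math.cos(head)
    z = sz - range_m * math.cos(scan)  # the beam travels downwards
    return x, y, z
```

For a sensor at 1000 m flying along the x-axis, a nadir pulse (scan angle 0) with a 1000 m range returns the point directly below the sensor; tilting the beam moves the point off to the side of the track.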

In forest areas, some of the LiDAR beams are reflected from branches, treetops, and the canopy, and some are reflected from the ground, thereby producing accurate three-dimensional information about the forests (Figure 2) [17, 18, 30, 55].

Figure 2: Laser scanning over forest areas [25]

There are two main approaches to the derivation of forest information from ALS data: the laser canopy height distribution approach, usually applied to low resolution data (pulse density below one hit per square meter), and the individual tree based approach, usually applied to high resolution data (pulse density above one hit per square meter). The limitations of the use of high resolution data for forest inventory purposes are mainly the cost and the difficulty of automatically interpreting the detailed information with complex texture [30]. Data with dense LiDAR measurements can be utilized to obtain detailed estimates, for example estimates of individual trees, while data with lower resolution are generally sufficient for statistical estimates, for example the total volume of trees within a given area [19].


LiDAR has definite benefits compared to other remote sensing data with regard to the confidence and objectivity of the data. Satellite and aerial photographs are limited by weather conditions. The measurements of LiDAR can be performed under a cloudy sky (if the flight altitude is below the cloud altitude) and at night, since LiDAR is an active sensor that provides its own energy. The measurements are handled automatically by physical or statistical methods, so no human interpretation is involved at any point [19].

2.2.4 Remote sensing data source selection

Nowadays, modern remote sensing techniques have become very attractive alternatives to expensive field measurements in forest inventories. Different sources of remotely sensed data are utilized for different purposes. ALS plays an important role in small area forest inventories in Scandinavia. There is no spaceborne laser scanning instrument available at the moment, and therefore large-area remote sensing cannot rely on laser scanning. In order to survey large forest areas, relatively cheap and easily acquired data are needed. Satellite images and aerial photographs are used, giving tolerable estimates of forest inventory stand parameters. It has been shown that ALS combined with aerial photographs gives promising results at tolerable costs. Aerial images are utilized together with ALS data since the use of spectral data improves the separation of tree species. Recently, prices of ALS inventory have decreased as the method has seen wider use. Overall, if the inventory area is compact and unscattered, the unit price of the inventory becomes cheaper than for a scattered inventory area [19, 29].


3 SURFACE INTERPOLATION FROM LIDAR DATA

It is very common in GIS that sample measurements are taken only at a set of points [39]. Such a data set represents changes in the landscape or environment. Interpolation methods can be used to visualize and store the continuity and variability of observed data across a surface for further analysis. A surface can be interpolated to different types of surface representation, including contours or isolines, a triangulated irregular network (TIN), and a regular grid representation (Figure 3). Surface representation in its simplest form is done by storing x, y, and w(x, y) values that define the location of a sample and the change characteristic represented by the w(x, y) value [62].

Contours or isolines are used to define a common characteristic along a line.

Contours join locations of equal value to each other (Figure 3a).

A TIN is a vector data structure used to store and display surface models. A TIN partitions geographic space using a set of irregularly spaced data points, each of which has x, y, and w(x, y) values. These points are connected by edges that form contiguous, nonoverlapping triangles and create a continuous surface that represents the terrain (Figure 3b).

A regular grid is a spatial data structure that defines space as an array of cells of equal size arranged in rows and columns. Each cell contains an attribute value that represents a change in the w(x, y) value. The location of a cell in geographic space is obtained from its position relative to the grid's origin (Figure 3c) [62].

Regular grid representations have emerged as the most widely used data structure because of their simplicity (simple elevation matrices that record topological relations between data points implicitly) and ease of computer implementation. The size of the grid mesh often affects the storage requirements, computational efficiency, and the quality of the results [51]. In this work, a regular grid representation is used to visualize and store the surface model.
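In code, such a regular grid is simply a 2D array addressed relative to an origin. The sketch below rasterizes scattered (x, y, w) samples by averaging the samples that fall into each cell; the cell size, the averaging rule, and the NaN marker for empty cells are illustrative choices of ours, not the procedure used later in the thesis.

```python
import numpy as np

def points_to_grid(x, y, w, cell=1.0):
    """Bin scattered (x, y, w) samples onto a regular grid.

    Each cell stores the mean of the w values falling into it; cells
    with no samples are marked NaN.  A cell's position in geographic
    space follows from its row/column index and the grid origin.
    """
    x, y, w = (np.asarray(a, float) for a in (x, y, w))
    x0, y0 = x.min(), y.min()            # grid origin (lower-left sample)
    col = ((x - x0) / cell).astype(int)  # column index of each sample
    row = ((y - y0) / cell).astype(int)  # row index of each sample
    total = np.zeros((row.max() + 1, col.max() + 1))
    count = np.zeros_like(total)
    np.add.at(total, (row, col), w)      # per-cell sums
    np.add.at(count, (row, col), 1)      # per-cell sample counts
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```

The empty (NaN) cells are exactly the ones an interpolation method has to fill.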

Figure 3: Examples of surface representation [62]: (a) contours, (b) TINs, (c) grids

Interpolation is a procedure used to predict the value of cells at locations that lack sampled points. It is based on the principle of spatial autocorrelation, or spatial dependence, which measures the degree of relationship or dependence between near and distant points [62]. In other words, interpolation is based on the assumption that the values of points that are close to one another are more alike than the values of points that are farther apart [39]. In practice, the task of the interpolation procedure is to shift irregular grid points to regular grid points (Figure 4).

The characteristics of an interpolated surface can be controlled by limiting the input points used. Points can be limited by specifying a maximum number of points, which selects the nearest points until that maximum is reached. Alternatively, a radius can be specified in map units; a fixed radius selects only the points within that radius distance from the centre of the output cell, unless there are not enough points within that radius (Figure 5) [62].

Figure 4: The irregular (red dots) and regular (blue dots) grid points

Figure 5: Limiting the input points used by specifying a maximum number of points (left) or a radius (right)

A variety of methods for interpolating a surface from a set of points have been proposed [39], for example inverse distance weighting (IDW), kriging, and spline interpolation. A description of the IDW interpolation and the main ideas of kriging and spline interpolation are given below. The IDW method is used in this thesis. Additionally, methods based on statistical and colour properties of features derived from LiDAR data are proposed.


3.1 Inverse distance weighting interpolation

The IDW interpolation assumes that each measured point has a local influence that diminishes with distance. Thus, points in the near neighbourhood are given high weights, whereas points at a far distance are given small weights.

The general formula of IDW interpolation for 2D problems is the following:

\[
\hat{w}(x, y) = \sum_{i=1}^{N} \lambda_i w_i,
\qquad
\lambda_i = \frac{1/d_i^{\,p}}{\sum_{k=1}^{N} 1/d_k^{\,p}},
\tag{1}
\]

where ŵ(x, y) is the predicted or output cell value at location (x, y), N is the number of nearest known points surrounding (x, y), λi are the weights assigned to each known point value wi at location (xi, yi), di are the 2D Euclidean distances between each (xi, yi) and (x, y), and p is the exponent, which influences the weighting of wi on ŵ.

For 3D problems, the IDW interpolation function is similar to Equation (1), with 3D Euclidean distances measured for di [39].

The IDW interpolation gives greater weights to points closest to the prediction location, and the weights diminish as a function of distance, hence the name inverse distance weighted. Weights are proportional to the inverse of the distance (between the data point and the prediction location) raised to the power value p. As a result, as the distance increases, the weights decrease rapidly. The rate at which the weights decrease depends on the value of p. If p = 0, there is no decrease with distance, and because each weight λi is the same, the prediction will be the mean of all the data values in the search neighbourhood. As p increases, the weights for distant points decrease rapidly. If the p value is very high, only the immediately surrounding points will influence the prediction; a power of 30 would be considered extremely large. When p = 2, the method is known as inverse distance squared weighted interpolation. In practice, p = 2 is used as a default value, although there is no theoretical justification to prefer this value over others. A surface calculated using IDW depends on the selection of the power value p and the search neighbourhood strategy. IDW is an exact interpolator, where the maximum and minimum values in the interpolated surface can only occur at sample points [56].


3.2 Kriging interpolation

Kriging is a spatial interpolation technique developed by the French mathematician Georges Matheron (1970), who named the technique after D.J. Krige, a South African mining engineer who first worked on the topic (1951) [39, 63].

Kriging is a weighted average technique. When interpolating a surface at an unobserved point, the distance and direction of every point pair of a specified number of neighbouring points, or of all neighbouring points within a specified radius, is quantified to provide information on the spatial autocorrelation of the sample point set. Next, a best-fit model is automatically applied to the data and the unknown values are predicted [79, 62].

The general formula for kriging can be written as [39, 62, 63]

\hat{w}(x, y) = \frac{\sum_{i=1}^{N} \lambda_i w_i}{\sum_{i=1}^{N} \lambda_i}, \qquad (2)

where ŵ(x, y) is the predicted value at location (x, y), N is the number of nearest known points surrounding (x, y), and λ_i are the kriging weights assigned to each known point value w_i at location (x_i, y_i). The λ_i are chosen such that the kriging variance (also called kriging error), Var[ŵ(x, y) − w(x, y)], is minimized subject to the condition E[ŵ(x, y) − w(x, y)] = 0 (E is the mathematical expectation).

The interpolated values can exceed the value range of the samples, and the surface does not pass through the samples. There are several types of kriging (simple kriging, ordinary kriging, universal kriging, etc.) differing in the statistical properties assumed of the point set [39, 62].
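For illustration, the ordinary kriging weights can be obtained by solving a small linear system. This sketch assumes an exponential covariance model with unit sill and range; a real application would instead fit the model to the sample variogram. All names here are illustrative:

```python
import numpy as np

def ordinary_kriging(xy_known, w_known, xy_query, cov):
    """Ordinary kriging prediction at one location.

    The weights minimize the kriging variance subject to the unbiasedness
    constraint (weights sum to 1), which leads to the linear system
    [[C, 1], [1^T, 0]] [lambda; mu] = [c0; 1], where C is the sample
    covariance matrix and c0 the covariances to the query point.
    """
    n = len(w_known)
    d = np.linalg.norm(xy_known[:, None, :] - xy_known[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = cov(np.linalg.norm(xy_known - np.asarray(xy_query), axis=1))
    lam = np.linalg.solve(A, b)[:n]   # kriging weights, summing to 1
    return np.sum(lam * w_known)

# Illustrative (assumed) exponential covariance with unit sill and range.
expcov = lambda h: np.exp(-h)
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
vals = np.array([10.0, 20.0, 30.0, 40.0])
pred = ordinary_kriging(pts, vals, (0.5, 0.5), expcov)
```

Because the weights sum to one under the unbiasedness constraint, Equation (2) reduces here to Σ λ_i w_i; note also that kriging is exact at the sample points themselves.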

3.3 Spline interpolation

Spline estimates values using a mathematical function that minimizes overall surface curvature. This guarantees a smooth-looking surface passing exactly through the sample points [62, 79].


The spline interpolation method fits a flexible surface across all the known point values. The surface passes through each sample point and may exceed the value range of the sample point set [79]. Creating a spline surface involves taking the product of the same spline blending functions used for spline curves [60]. Spline curve interpolation represents the graph with a simple function on each interval between data points. The simplest spline is obtained by connecting the data with lines. If a quadratic function is applied on each interval, the graph is smoother. A quadratic spline makes the curve smooth at the data points themselves by matching up the derivatives. Using cubic functions or fourth degree functions results in smoother curves [71]. Cubic is the optimal degree for splines: it provides the best trade-off between performance and computational efficiency in spline families, and cubic spline interpolation is the most widely used in practice [71, 72].
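The exact-interpolation property of cubic splines can be illustrated with SciPy's CubicSpline. The choice of library and the sample data are assumptions of this sketch; the thesis does not prescribe an implementation:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Oscillating sample points to interpolate.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 0.0, 1.0, 0.0])

# Piecewise cubic interpolant with continuous first and second derivatives.
cs = CubicSpline(x, y)

# The spline is an exact interpolator: it passes through every sample point.
at_knots = cs(x)

# Unlike a piecewise linear interpolant, the first derivative is continuous
# at the interior knots (cs(x, 1) evaluates the first derivative).
left_slope = cs(1.0, 1)
```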

For a given M × N grid, the general formula for spline surface interpolation can be written as [60, 81]

\hat{g}(x, y) = \sum_{i=1}^{M} \sum_{j=1}^{N} P_{i,j} N_{i,p}(x) N_{j,q}(y), \qquad (3)

where N_{i,p}(x) and N_{j,q}(y) are blending functions, i.e. polynomials of degree p and q, respectively, and P_{i,j} are control points: scalar coefficients that can be obtained by processing the bidimensional set of samples in one direction first and then processing the remaining direction (the x- and y-directions). This separable approach is the most efficient way to process data with more than one dimension.

Different variations of splines for curve and surface interpolation and their applications can be found, for example, in [45, 61, 65, 73].

3.4 Interpolation method selection

The type of interpolation method which can be used depends on many factors. The choice is influenced by real-world knowledge of the surface to be modeled. A surface created with IDW will not exceed the known value range. A surface created with spline or kriging can exceed the known value range. While a spline surface passes exactly through each sample point, IDW and kriging surfaces will not pass through any of the sample points [62, 79].

Kriging is a complex interpolation technique. Compared to kriging, IDW and spline are simple to implement. However, IDW performs poorly on large datasets, whereas spline remains stable [54].

IDW is successfully used for a phenomenon whose distribution is strongly correlated with distance, for example, noise, which falls off very predictably with distance. IDW gives explicit control over the influence of distance. The IDW interpolation should be used when the set of points is dense enough to capture the extent of local surface variation needed for analysis.

Kriging is most appropriate when a spatially correlated distance or directional bias in the data is known. The method is often used for applications in soil science and geology. Kriging is widely used for representing terrains that change abruptly or frequently. The method can be applied when the sample points are poorly distributed or there are few of them. Directional influences, such as prevailing winds and random error, can be accounted for using kriging.

The spline interpolation cannot represent a surface correctly when sample points are close together and have extreme differences in value, because the method is based on slope calculations. Therefore, phenomena that cause surface values to change suddenly, such as a cliff face or a fault line, are not represented well by spline. Spline can predict ridges and valleys in the data and is the best method for representing the smoothly varying surfaces of phenomena such as temperature.

Applying different interpolation methods and comparing the results might determine the best interpolation method for a given project (Figure 6). Each method works differently, but most utilize the concept of spatial autocorrelation: near samples are more alike than samples far away [62, 79].


Figure 6: Interpolating surfaces [62]: (a) IDW; (b) kriging; (c) spline


4 IMAGE ANALYSIS FOR FOREST INVENTORIES

4.1 Visual interpretation of aerial and satellite imagery

Visual interpretation refers to the exploration and assessment of fine-resolution satellite and aircraft data by human interpreters. Image interpretation can provide information not easily obtained from other sources. The practice of visual image analysis requires the explicit recognition of eight elements of image interpretation that form the framework for understanding the meaning of an image: shape (the outline of a feature), size (the dimensions of a feature), tone (the average brightness of an area, the dominant colour of a region), texture (the variation in tone over a surface, the apparent roughness of a surface), shadow (the large, distinctive shadows that reveal the outline of a feature as projected onto a flat surface), site (a feature's position with respect to topography and drainage), association (distinctive spatial interrelationships between features), and pattern (distinctive arrangements of features) [5].

Visual interpretation can include such tasks as classification (assigning image objects to classes), enumeration (listing or counting discrete image items), mensuration (measuring lengths, widths, distances, and volumes of image objects), and delineation (demarcating image regions, separating regions characterized by distinctive tones and textures, and finding the edges between adjacent regions). The image interpretation tasks apply the elements of image interpretation in a way that helps the interpreter derive useful information from the imagery [5].

Visual image interpretation is regarded as a labour intensive, time consuming and therefore expensive process. Besides, a visual analysis by human interpreters might be subjective, for example, when estimating forest stand boundaries. Semi-automated and automated approaches to image processing can produce more objective results and reduce time and costs [27].


4.2 Digital image processing

Digital image analysis has become an important tool used in processing remotely sensed imagery. Nowadays specialized image analysis software offers a wide range of complex computational methods for improving image quality or generating required information from remotely sensed data [5]. Digital image processing techniques are used in a variety of problems. In this thesis, image enhancement and segmentation techniques are considered.

Image enhancement can be defined as a change in the image appearance in order both to make visual interpretation easier and to make the images more amenable to subsequent automatic analysis. It is typically the stage in image analysis that follows image capture. Methods can include adjustment of brightness or contrast, edge detection and edge sharpening, noise reduction, removal of small irrelevant features, and adjustment for background intensity variation [6]. In this thesis, noise reduction techniques were applied to aerial images and images obtained from LiDAR data before forest stand delineation.

4.2.1 Image noise reduction

4.2.1.1 Impulse noise and Gaussian noise. LiDAR pulses, as any other signals acquired through aircraft measurement sensors, are contaminated by noise generated by electronic devices on the aircraft (rotating electrical machines, pulsed electronic equipment, propeller systems, receiver oscillators, and so on) as well as by external events: wind, vibrations, variations of temperature, variations of humidity, and so on [75]. In this study, LiDAR data are presented in the form of grayscale images representing forest inventory information by their intensity values. The grayscale images are obtained by data interpolation techniques.

A grayscale image of size M × N is considered as a matrix defined as follows [14]:

\begin{bmatrix}
f(0, 0) & f(0, 1) & \cdots & f(0, N-1) \\
f(1, 0) & f(1, 1) & \cdots & f(1, N-1) \\
\vdots & \vdots & & \vdots \\
f(M-1, 0) & f(M-1, 1) & \cdots & f(M-1, N-1)
\end{bmatrix}, \qquad (4)

where f(x, y) is an intensity or gray level of the image at the pixel (x, y) for x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1.

The image f can be decomposed into a desired component, g, and a noise component (an unwanted image component), ε. The most common decomposition is additive [9]:

f = g + \varepsilon. \qquad (5)

Image noise determines the statistical behaviour of the gray level values in the noise component ε. These values may be characterized by a probability density function (PDF). The PDF defines the probability that a noise value ε is added to the gray level values of the desired component g. A plot of the PDF presents the amount of distortion of a pixel value against the frequency with which it occurs [14].

There are various types of additive noise. It is often assumed to be impulse noise or Gaussian noise [37].

The PDF of impulse noise is given by

p(\varepsilon) =
\begin{cases}
P_a & \text{for } \varepsilon = a, \\
P_b & \text{for } \varepsilon = b, \\
0 & \text{otherwise},
\end{cases}
\qquad (6)

where ε represents gray level, and P_a and P_b are the noise densities for a and b, respectively (an explanation is given below).

A plot of this function is shown in Figure 7, left.


Impulse noise alters some pixel intensities at random. P_a and P_b define the noise density, that is, the percentage of the image area containing noise values. These values give the probabilities of occurrence of white and black pixels. Noise impulses can be negative or positive. It is assumed that a and b are saturated values; they are equal to the minimum and maximum allowed values in the image. As a result, negative impulses appear as black points and positive impulses appear as white points in an image. Because of its appearance as white and black dots superimposed on an image, this noise is also called salt-and-pepper noise or spike noise (Figure 8b and Figure 9b) [14].

The PDF of a Gaussian random variable ε is given by

p(\varepsilon) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(\varepsilon-\mu)^2 / 2\sigma^2}, \qquad (7)

where ε represents gray level, µ is the mean value of ε, σ is the standard deviation of ε, and σ² is the variance of ε.

A plot of the PDF of Gaussian noise is shown in Figure 7, right.

When ε is described by Equation (7), approximately 70% of its values will be in the range [(µ − σ), (µ + σ)] and about 95% will be in the range [(µ − 2σ), (µ + 2σ)].

Figure 7: PDF of impulse noise (left) and Gaussian noise (right)

Gaussian noise changes every image pixel from its original value, usually by a small amount. Additive zero-mean Gaussian noise is often used in practice. A value drawn from a zero-mean Gaussian probability density function is added to the true (original) value of every pixel (Figure 8c and Figure 9c) [37].

The fragment of the image boston.tif from the MatLAB library corrupted with impulse noise (P_a = P_b = 0.02) and Gaussian noise (µ = 0 and σ² = 0.005) is given in Figure 8. All the pixels in the original image were scaled to the interval [0, 1] by multiplying each pixel by the quantity 1/Max, where Max is the maximum pixel value in the image [40].
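The two noise models can be simulated as follows. This is an illustrative sketch with assumed function names, applied to a synthetic flat gray image rather than boston.tif:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_impulse_noise(img, pa=0.02, pb=0.02, rng=rng):
    """Salt-and-pepper noise: with probability pa a pixel becomes black (0),
    with probability pb it becomes white (1); otherwise it is unchanged."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < pa] = 0.0                        # negative impulses (pepper)
    out[(u >= pa) & (u < pa + pb)] = 1.0     # positive impulses (salt)
    return out

def add_gaussian_noise(img, mu=0.0, var=0.005, rng=rng):
    """Additive zero-mean Gaussian noise, Equation (5): f = g + eps."""
    return img + rng.normal(mu, np.sqrt(var), img.shape)

# A flat gray test image already scaled to [0, 1].
img = np.full((64, 64), 0.5)
sp = add_impulse_noise(img)
gauss = add_gaussian_noise(img)
```

On the flat image, impulse noise leaves roughly 96% of the pixels untouched and flips the rest to 0 or 1, while Gaussian noise perturbs every pixel slightly around 0.5.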

Figure 8: Original image (a) and images corrupted with impulse noise, P_a = P_b = 0.02 (b), and Gaussian noise, µ = 0, σ² = 0.005 (c)


Figure 9 illustrates fragments of the corresponding images in Figure 8 for better visual comparison of the performance of the considered noise models.

Figure 9: Fragments of the corresponding images in Figure 8: (a) original; (b) impulse noise, P_a = P_b = 0.02; (c) Gaussian noise, µ = 0, σ² = 0.005

4.2.1.2 Introduction to smoothing filters. The objective of noise reduction is to detect and remove noise from an image. For this purpose, filtering is applied. Filtering can be defined as an image processing operation where a pixel is replaced with some function of the pixel intensities in the neighbourhood of that pixel. Filtering is used for noise reduction, contrast enhancement, and edge detection [6]. Filters used for noise reduction, called smoothing filters, are discussed in this thesis. Smoothing filters can be classified into two classes: linear and non-linear.

In linear filters, a pixel is replaced with a linear combination of intensities of neighbouring pixels [6]. Filter operations work with the values of the image pixels in the neighbourhood and the corresponding values of a subimage that has the same dimensions as the neighbourhood. The subimage is called a mask, kernel, or window. The values in a mask are referred to as coefficients, rather than pixels [14].

The mechanics of filtering consists of moving the mask from pixel to pixel in an image. The response of smoothing at each pixel (x, y) is calculated as a sum of products of the mask coefficients and the corresponding intensities of image pixels in the area spanned by the mask. For a mask of size m × n, it is assumed that m = 2a + 1 and n = 2b + 1, where a and b are nonnegative integers. The most natural form of a filter mask is rectangular with sides of odd length [14].

In general, linear filtering of an image f of size M × N with a mask w of size m × n is given by the expression [14]:

g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) f(x+s, y+t), \qquad (8)

where a = (m − 1)/2 and b = (n − 1)/2, for x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1.

To ensure that the average intensity in an image remains the same after filtering, the mask coefficients need to be normalized by dividing by their sum, so that \sum w(s, t) = 1 [38, 53]. The mask size affects which image details can be accepted as irrelevant details, that is, pixel regions that are small with respect to the size of the filter mask. The filtering task is to reduce such irrelevant details [14].

Smoothing linear filters are applied to reduce Gaussian noise. In order to reduce impulse noise, rank order filtering is used [37]. This filtering is discussed below.

A simple smoothing linear filter is a mean filter having a mask where all of the values have the same weight. Figure 10 shows the simplest mask of this filtering, that is, the 3×3 mean filter mask. Filtering with this mask yields the standard average of the pixel intensities under the mask [14].
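Equation (8) with a uniform mask can be sketched as follows. The border handling (replicate padding) is one of several common choices and an assumption of this illustration:

```python
import numpy as np

def mean_filter(img, m=3, n=3):
    """Smoothing with an m x n mean mask: Equation (8) with w(s,t) = 1/(m*n).
    Borders are handled by replicate padding, one common (assumed) choice."""
    a, b = (m - 1) // 2, (n - 1) // 2
    padded = np.pad(img, ((a, a), (b, b)), mode='edge')
    out = np.zeros_like(img, dtype=float)
    # Sum the shifted neighbourhood slices, then normalize by the mask size.
    for s in range(-a, a + 1):
        for t in range(-b, b + 1):
            out += padded[a + s:a + s + img.shape[0],
                          b + t:b + t + img.shape[1]]
    return out / (m * n)

# A single bright impulse on a flat background.
img = np.zeros((5, 5))
img[2, 2] = 9.0
smoothed = mean_filter(img)
```

The impulse is spread into a 3×3 block of reduced intensity rather than removed, which illustrates why the mean filter blurs impulse noise instead of eliminating it.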

Figure 11 shows results of filtering with the 3×3 and 7×7 mean filter masks applied to the images with impulse noise and Gaussian noise illustrated in Figure 8. For better visual comparison of the results, the corresponding image fragments are shown in Figure 12.


Figure 10: The 3×3 mean filter mask

The filter enhanced the image corrupted with the Gaussian noise better, especially in the case of the 7×7 filter mask. However, the filtering removed some small light objects (for example, the moving cars on the roads, which are light dots) and blurred the borders of light objects.

Smoothing linear filters can have various masks with different coefficients. Using such a filter mask means that pixels are multiplied by different coefficients, thus giving more weight to some pixels at the expense of others. This filtering is referred to as weighted averaging. Weighting the center point the highest and then reducing the value of the coefficients as a function of increasing distance from the origin is aimed at reducing blurring in the smoothing process. A widely used mask is the Gaussian kernel, which represents the shape of a two-dimensional Gaussian distribution and defines a Gaussian filter [14, 53]. More examples of different kernels for linear filters can be found, for example, in [14].

The Gaussian filter of an M × N image f with an m × n kernel is defined by the expression:

g(x, y) = \frac{\sum_{s=-a}^{a} \sum_{t=-b}^{b} e^{-\frac{s^2+t^2}{2\sigma^2}} f(x+s, y+t)}{\sum_{s=-a}^{a} \sum_{t=-b}^{b} e^{-\frac{s^2+t^2}{2\sigma^2}}}, \qquad (9)

where a = (m − 1)/2 and b = (n − 1)/2, for x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1.

The σ is a parameter defining the extension of the m × n neighbourhood. A new pixel value is set to a weighted average of the m × n neighbourhood. The original pixel value receives the largest weight (having the highest Gaussian value) and neighbouring pixels receive smaller weights as their distance to the original pixel increases [14, 35].

Figure 11: Results of mean filtering with the 3×3 (a, c) and 7×7 (b, d) masks: (a, b) impulse noise reduction; (c, d) Gaussian noise reduction

The denominator in Equation (9) is the normalization constant. The normalization ensures that the average gray level of the image remains the same after smoothing with this kernel. This is known as average gray level invariance [80].

In practice, the neighbourhood of the pixel p = (x, y) considers only the pixels q = (s, t) such that ‖p − q‖ ≤ 2σ. The rationale is that the contribution of pixels farther away than 2σ is negligible. Hence, the kernel size is limited to (4σ + 1) × (4σ + 1) [35]. The result depends on the steepness of the kernel slope.

More smoothing is obtained when a gentle kernel is used; a steep kernel yields slight smoothing.

Figure 12: Fragments of the corresponding images in Figure 11
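As a small illustration, such a limited and normalized Gaussian kernel can be generated as follows; the names are assumptions, and the kernel could then be applied with the same sliding-window sum as in Equation (8):

```python
import numpy as np

def gaussian_kernel(sigma):
    """Gaussian mask limited to (4*sigma + 1) x (4*sigma + 1), i.e. a
    2*sigma pixel radius, normalized so the coefficients sum to 1
    (average gray level invariance)."""
    r = int(2 * sigma)
    s, t = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1))
    k = np.exp(-(s**2 + t**2) / (2.0 * sigma**2))
    return k / k.sum()

k = gaussian_kernel(sigma=2)
```

For σ = 2 this gives a 9×9 mask whose largest coefficient sits at the center and whose coefficients decay symmetrically with distance.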

The examples of the Gaussian filter with σ = 2 and σ = 4 are presented in Figure 13 and the corresponding fragments are in Figure 14. The (2σ + 1) × (2σ + 1) kernels were used in the filtering. Edges are lost with high values of σ, since more averaging is performed.

By using Equation (8), the filtering process results in an image with reduced sharp transitions in pixel intensities. Noise typically consists of sharp transitions in pixel intensities, but the edges of image objects are also characterized by sharp transitions in pixel intensities, so smoothing linear filters have the undesirable side effect that they blur edges [14].

Smoothing linear filters are unable to reduce noise without blurring edges. With nonlinear filters, this task is possible. Nonlinear filters also operate on neighbourhoods. However, they do not explicitly use mask coefficients. The filtering operation is based conditionally on the values of the pixels in the neighbourhood under consideration. For example, in order-statistics filters, a response is based on ordering (ranking) the pixels contained in the image area encompassed by the filter mask, and then replacing the value of the center pixel with the value determined by the ranking result. The most common rank order filter is the median filter. It replaces the pixel with the median of the pixel intensities in a neighbourhood [6, 14, 37].

Figure 13: Results of the Gaussian filtering: (a) impulse noise reduction, σ = 2; (b) impulse noise reduction, σ = 4; (c) Gaussian noise reduction, σ = 2; (d) Gaussian noise reduction, σ = 4

The median filter outputs the median value of the sorted list of pixel values in the neighbourhood. When impulse noise is present in an image, noise pixels are almost always at the beginning or end of the sorted list, and seldom appear at the median location. For this reason, median filtering is the preferred method for impulse noise removal [38].
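A direct sketch of median filtering (assumed names; replicate padding at the borders is an illustrative choice):

```python
import numpy as np

def median_filter(img, m=3, n=3):
    """Rank order filtering: replace each pixel with the median of its
    m x n neighbourhood (replicate padding at the borders)."""
    a, b = (m - 1) // 2, (n - 1) // 2
    padded = np.pad(img, ((a, a), (b, b)), mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + m, j:j + n])
    return out

# The same lone impulse used for the mean filter example.
img = np.zeros((5, 5))
img[2, 2] = 9.0
cleaned = median_filter(img)
```

Here the impulse is removed entirely: in every 3×3 neighbourhood the outlier value sits at the end of the sorted list and never reaches the median position, in contrast to the mean filter, which only spreads it out.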

Figure 14: Fragments of the corresponding images in Figure 13

Figure 15 shows results of median filtering with the 3×3 and 7×7 filter masks applied to the images with impulse noise and Gaussian noise illustrated in Figure 8. For better visual comparison of the results, the corresponding image fragments are shown in Figure 16. More examples of nonlinear filters can be found, for example, in [13].

The 3×3 median filter removed the impulse noise successfully and at the same time left edges and small image details legible. The 7×7 median filter also removed the impulse noise effectively and preserved edges, but small light details (for example, the moving cars on the roads) were deleted. In this case, the mask size was large and those objects were treated as noise. In the case of Gaussian noise reduction, the filters were not able to accomplish the task as efficiently as when dealing with the impulse noise. However, the 7×7 median filter removed Gaussian noise better than the 7×7 mean filter. The median filtering preserved edges and details better; the image after filtering is less blurred than in the case of the mean filter.


Figure 15: Results of median filtering with the 3×3 (a, c) and 7×7 (b, d) masks: (a, b) impulse noise reduction; (c, d) Gaussian noise reduction

4.2.1.3 Edge preserving filters. In order to reduce noise effectively and not to lose image details, advanced smoothing methods are needed. In stand delineation algorithms, it is an important task to preserve image edge information after filtering as a preprocessing step for image segmentation [17, 42].

There has been a remarkable effort to find a nonlinear operator able to remove noise while preserving edges and corners. The most popular edge preserving filters are based on median filtering, morphological analysis, total variation, bilateral filtering, mean shift, and anisotropic diffusion [34]. These filters provide a reasonable trade-off between noise reduction and the retention of image details. The filters are designed to remove noise while preserving edges by means of a nonlinear combination of nearby image values. In most cases, edge preserving filtering is based on simultaneous processing of both the spatial and intensity information of image pixels. It ensures that each pixel will be influenced mainly by neighbouring pixels that have spatial closeness and similar intensities [7, 13, 34, 35, 43, 50, 52]. In this thesis, median and mean shift filtering are applied to grayscale images.

Figure 16: Fragments of the corresponding images in Figure 15

Mean shift filtering is widely used in computer vision and image processing. It has been shown that effective image analysis can be implemented based on the mean shift procedure. The mean shift algorithm is a useful tool for bottom-up computer vision tasks such as edge preserving filtering and segmentation. The method is appropriate as a preprocessing step for image segmentation [11, 12].

An image is smoothed and its important edges might be more easily detected after mean shift filtering. The filtering is controlled by two parameters: a spatial radius and a spectral radius (or an intensity range). These parameters determine the amount of filtering. The filter differs from the filters discussed above in changing the neighbourhood of a given pixel dynamically until a convergence criterion is met.


For each pixel of an image, the set of neighbouring pixels within a spatial radius and an intensity range is determined. For this set of neighbouring pixels, the new spatial center (spatial mean) and the new intensity mean value are calculated. These calculated mean values serve as the new center for the next iteration. The described procedure is iterated until the spatial and intensity means stop changing. At the end of the iteration, the final mean intensity is assigned to the starting position of that iteration [59].

Let f(x, y) and f̂(x, y) for x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1 be the original and filtered image intensities. The superscripts s and r will denote the spatial and spectral pixel information, respectively. The mean shift filtering procedure can be done using the following algorithm.

For each x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1:

1. Initialize k = 1 and y_k = f(x, y).

2. Compute y_{k+1} = \frac{1}{n_w(y_k)} \sum_{f(x, y) \in w(y_k)} f(x, y) and set k ← k + 1, repeating until convergence.

3. Assign f̂(x, y) = y_{conv}^{r}(x^{s}, y^{s}).

The last assignment specifies that the filtered data at the spatial location of the current pixel will have the range component (intensity) of the point of convergence y_{conv}. w is the set of neighbouring pixels within a spatial radius and an intensity range defined as input parameters. The neighbourhood for each pixel, window w, varies its size and shape depending on image content.

n_w denotes the number of pixels in w. Figure 17 shows results of the mean shift filtering with a spatial radius of 3 pixels and a spectral radius of 0.3 applied to the images with impulse noise and Gaussian noise illustrated in Figure 8.
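The per-pixel mean shift iteration described above can be sketched in Python with NumPy. The square window, the parameter names rs and rr, and the iteration cap are assumptions of this illustration, not the thesis implementation:

```python
import numpy as np

def mean_shift_pixel(img, x, y, rs=3, rr=0.3, max_iter=20):
    """One pixel of mean shift filtering: iterate the joint spatial and
    intensity mean over the window w until the means stop changing, then
    return the final intensity mean. rs is the spatial radius and rr the
    spectral radius (intensity range)."""
    ii, jj = np.indices(img.shape)
    cx, cy, cv = float(x), float(y), float(img[x, y])
    nv = cv
    for _ in range(max_iter):
        # Neighbours in w: spatially close AND similar in intensity.
        w = ((np.abs(ii - cx) <= rs) & (np.abs(jj - cy) <= rs)
             & (np.abs(img - cv) <= rr))
        nx, ny, nv = ii[w].mean(), jj[w].mean(), img[w].mean()
        if (nx, ny, nv) == (cx, cy, cv):      # convergence criterion met
            break
        cx, cy, cv = nx, ny, nv
    return nv

# Two flat regions separated by a step edge: the edge is preserved because
# pixels across it fall outside the spectral radius.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
left = mean_shift_pixel(img, 3, 2)
right = mean_shift_pixel(img, 3, 6)
```

Pixels on either side of the step converge to the mean of their own region only, so the edge between the two flat regions survives the smoothing.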

The filter with the given parameters enhanced the image corrupted with the Gaussian noise while preserving edges. However, in the case of the impulse noise, the filtering left white noisy pixels and only decreased the number of black noisy pixels. This happens due to the large intensity difference between noisy and original pixels.

Black noisy dots were removed from dark objects, since their intensities were in the acceptable intensity range. Noisy pixels could not be replaced if their neighbouring pixels differed in intensity by more than the given spectral radius.

Figure 17: Results of mean shift filtering, s = 3 and r = 0.3: (a, c) impulse noise reduction; (b, d) Gaussian noise reduction

Figure 18 shows results of the impulse noise reduction with the spectral radius increased to 0.9. In this case, most of the noisy pixels were replaced; only white dots over the black area were left.

In this thesis, to improve the preprocessing procedure, the mean shift filtering is applied to images that have first been processed with median filtering.

4.2.2 Delineation of forest stands

The forest stand is the basic unit for collecting data for forest management and planning. A stand can be defined as a homogeneous forest area, typically 1 - 3 ha in size, that differs from adjacent areas in site or stand characteristics. Typical stand criteria have been timber size, density and species, as well as site type [22, 42].

Figure 18: Results of mean shift filtering for impulse noise reduction, s = 3 and r = 0.9

There are different approaches to delineate forest stands. For decades in practical forestry in Finland, the traditional method used for forest stand delineation has been based on visual interpretation of aerial photographs in combination with field surveys. This approach is time consuming and therefore expensive. Several different solutions can be obtained, since the estimation of stand boundaries by visual interpretation is always subjective. The stands can be delineated with regard to forest management, and they do not necessarily represent ecologically homogeneous units [27, 30].

Forest stand delineation based on image segmentation produces a more objective delineation, reducing time and costs. Image segmentation is a process of dividing an image into spatially continuous, disjoint and homogeneous regions [27]. There are several segmentation algorithms to define the homogeneity of a region; they can be based on either colour or texture features, as well as their combination. For stand delineation purposes, two basic segmentation approaches have been used: edge-based methods and region-based methods [42].

Edge-based methods use edge (or boundary) information to control or refine a regional segmentation process. The basic assumption is that the change in pixel values between neighbouring pixels inside a region is not as significant as the change in pixel values on the regions' boundary [28, 38, 42].

Region-based methods gather similar pixels according to some homogeneity criteria. They are based on the assumption that pixels which belong to the same homogeneous region are more alike than pixels from different homogeneous regions. In this study, image segmentation of forest stands is based on the region-growing technique [28, 38, 42].


4.2.3 Image segmentation

Let R represent the entire region occupied by an image. Segmentation can be defined as a process that partitions R into n subregions, R_1, R_2, . . . , R_n, such that

(a) \bigcup_{i=1}^{n} R_i = R.

(b) R_i is a connected set, i = 1, 2, . . . , n.

(c) R_i ∩ R_j = ∅ for all i and j, i ≠ j.

(d) Q(R_i) = TRUE for i = 1, 2, . . . , n.

(e) Q(R_i ∪ R_j) = FALSE for any adjacent regions R_i and R_j.

Condition (a) indicates that the segmentation must be complete; that is, every pixel must be in a region. Condition (b) requires that points in a region must be connected in some predefined sense. Two pixels p and q are said to be connected in R_i if there exists a path between them consisting entirely of pixels in R_i. For any pixel p in R_i, the set of pixels that are connected to it in R_i is called a connected component of R_i. If it has only one connected component, then the set R_i is called a connected set. General connectivity can be based on either 4- or 8-connectivity. In 4-connectivity, pixels are connected horizontally and vertically (by their edges). In 8-connectivity, pixels are connected horizontally, vertically, and diagonally (by their edges or corners). Condition (c) indicates that the regions must be disjoint. Condition (d) deals with the properties that must be satisfied by the pixels in a segmented region; for example, Q(R_i) = TRUE if all pixels in R_i have the same intensities. Condition (e) indicates that regions R_i and R_j are different in the sense of the predicate Q [14]. Q(R_i) denotes a homogeneity criterion that controls the homogeneity of the regions. The criterion has to be true for each region and false for adjacent regions. This ensures that every region is distinct from every other region, that is, Q(R_i ∪ R_j) is FALSE when R_i is adjacent to R_j [33, 36]. In this section, region-based segmentation techniques are discussed.

4.2.3.1 Region growing. Region growing is a procedure that groups pixels or subregions into larger regions based on predefined criteria. The basic approach is to start with a set of seed (starting) points and from these grow regions by appending to each seed those neighbouring pixels that have properties similar to the seed. The growing is predefined by similarity (homogeneity) criteria such as specific intensity ranges [14, 38].

There can be one or more seed points in a image to be segmented. One way to select seed points is to do so interactively. Seed points can be selected inter- actively (user specied seed points). A user can select seed points by clicking a mouse on desired pixels. Another way is to scatter seed points around the image and grow regions. If the current seed points are insucient, additional seed points can be added by selecting points not already in segmented region.

The selection of a set of seed points can often be based on the nature of the problem and the image content. When a priori information is not available, the procedure is to compute at every pixel the same set of properties that ultimately will be used to assign pixels to regions during the growing process. If the result of these computations shows clusters of values, the pixels whose properties place them near the centroids of these clusters can be used as seeds [14, 69].
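This cluster-based seed selection can be sketched with a simple two-means iteration on pixel intensities, taking for each cluster the pixel nearest the centroid as the seed. The function name pick_seeds, the fixed number of two clusters, and the toy image are assumptions made for this illustration only.

```python
def pick_seeds(image, iterations=10):
    """Pick one seed pixel per intensity cluster (here: two clusters).

    Runs a Lloyd-style 1-D two-means on the intensities, then returns,
    for each cluster, the (row, col) of the pixel closest to its centroid.
    """
    pixels = [(v, (r, c)) for r, row in enumerate(image)
                          for c, v in enumerate(row)]
    values = [v for v, _ in pixels]
    # Initialise the two centroids at the extreme intensities.
    c0, c1 = min(values), max(values)
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        if g0: c0 = sum(g0) / len(g0)
        if g1: c1 = sum(g1) / len(g1)
    # Seed = pixel whose intensity lies nearest each centroid.
    seed0 = min(pixels, key=lambda p: abs(p[0] - c0))[1]
    seed1 = min(pixels, key=lambda p: abs(p[0] - c1))[1]
    return seed0, seed1

image = [[0.1, 0.1, 0.9],
         [0.1, 0.9, 0.9]]
print(pick_seeds(image))  # one seed in the dark cluster, one in the bright
```

With two clearly separated intensity clusters, the centroids converge to the cluster means and the returned seeds lie in different clusters, which is exactly the situation described above.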

The growing process is controlled by a stopping rule. Growing a region can stop when no more pixels satisfy the criteria for inclusion in that region. Additional criteria that increase the power of a region-growing algorithm utilize the concepts of size, likeness between a candidate pixel and the pixels grown so far (such as a comparison of the intensity of a candidate and the average intensity of the grown region), and the shape of the region being grown [14].

Let f(x, y) be an input image array, and let S(x, y) denote a seed array containing 1s at the locations of seed points and 0s elsewhere, where x = 0, 1, 2, . . . , M − 1 and y = 0, 1, 2, . . . , N − 1. Let Q denote a predicate to be applied at each location (x, y). A basic region-growing algorithm based on 8-connectivity may be stated as follows.

1. Find all connected components in S(x, y) and erode each connected component to one pixel; label all such pixels found as 1. All other pixels in S are labeled 0.

2. Form an image fQ such that, at a pair of coordinates (x, y), fQ(x, y) = 1 if the input image satisfies the given predicate, Q, at those coordinates; otherwise, fQ(x, y) = 0.

3. Let g be an image formed by appending to each seed point in S all the 1-valued points in fQ that are 8-connected to that seed point.

4. Label each connected component in g with a different region label (for example, 1, 2, 3, . . .). This is the segmented image obtained by region growing [14].
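The four steps above can be condensed into a short sketch that grows each seed by breadth-first search over 8-connected pixels satisfying a threshold predicate. This is an illustration written for this text under simplifying assumptions (seeds are given as single pixels, so the erosion of step 1 is skipped); the name region_grow and the example data are invented.

```python
from collections import deque

# 8-connectivity neighbourhood offsets.
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
      (0, 1), (1, -1), (1, 0), (1, 1)]

def region_grow(f, seeds, T):
    """Grow regions from seed points under the predicate
    |f(x, y) - f(seed)| <= T, using 8-connectivity.

    Returns a label image g (0 = unassigned), one label per seed.
    """
    rows, cols = len(f), len(f[0])
    g = [[0] * cols for _ in range(rows)]
    for label, (sr, sc) in enumerate(seeds, start=1):
        if g[sr][sc]:
            continue                          # seed already absorbed earlier
        g[sr][sc] = label
        queue = deque([(sr, sc)])
        while queue:                          # append 8-connected 1-valued points
            r, c = queue.popleft()
            for dr, dc in N8:
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols and g[nr][nc] == 0
                        and abs(f[nr][nc] - f[sr][sc]) <= T):
                    g[nr][nc] = label
                    queue.append((nr, nc))
    return g

f = [[0.1, 0.2, 0.9],
     [0.1, 0.2, 0.8]]
print(region_grow(f, [(0, 0), (0, 2)], T=0.15))  # [[1, 1, 2], [1, 1, 2]]
```

With T = 0.15 the dark pixels around the first seed and the bright pixels around the second form two separate regions, matching the predicate of Eq. (10).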

A measure of similarity can be based on intensity differences with the seed points, intensity differences with the grown region, or more complex schemes. When the predicate is based on a single threshold, T, intensity differences are used as the measure of similarity. In this case, the predicate applied at each location (x, y) is [14]

Q = { TRUE   if the absolute difference of the intensities between
             the seed and the pixel at (x, y) is ≤ T
      FALSE  otherwise.                                           (10)

Figure 19 shows an example of segmentation by growing a region from a single user-specified seed point, using the intensity mean of the grown region as the measure of similarity. The region grows around the seed point by comparing the intensities of all unallocated neighbouring pixels to the mean intensity of the currently grown region; T = 0.1.

Figure 19: The seed point (red) and grown region (white), T = 0.1
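The mean-based variant illustrated in Figure 19 can be sketched as follows. This is a toy illustration, not the implementation used for the figure; the function name grow_from_seed, the 2×3 image, and the convention of refreshing the region mean once per frontier round are assumptions of this sketch.

```python
# 8-connectivity neighbourhood offsets.
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
      (0, 1), (1, -1), (1, 0), (1, 1)]

def grow_from_seed(f, seed, T):
    """Grow one region from a single seed pixel.

    A neighbouring pixel joins the region if its intensity differs from the
    current region mean by at most T; the mean is refreshed once per
    frontier round.  Returns a boolean membership image.
    """
    rows, cols = len(f), len(f[0])
    in_region = [[False] * cols for _ in range(rows)]
    sr, sc = seed
    in_region[sr][sc] = True
    total, count = f[sr][sc], 1
    frontier = [(sr, sc)]
    while frontier:
        next_frontier = []
        mean = total / count              # mean of the region grown so far
        for r, c in frontier:
            for dr, dc in N8:
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not in_region[nr][nc]
                        and abs(f[nr][nc] - mean) <= T):
                    in_region[nr][nc] = True
                    total += f[nr][nc]
                    count += 1
                    next_frontier.append((nr, nc))
        frontier = next_frontier          # stop when no pixel qualifies
    return in_region

f = [[0.50, 0.55, 0.95],
     [0.52, 0.58, 0.90]]
print(grow_from_seed(f, (0, 0), T=0.1))
```

Because the criterion compares against the running mean rather than the fixed seed intensity, the region can follow gradual intensity drifts; here the bright right-hand column stays outside the grown region.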

4.2.3.2 Region splitting and merging. The region splitting and merging algorithm begins by considering the whole image as a single region, R. If this region is not homogeneous according to the selected metric (the predicate Q),
