The Assessment of the Uncertainty of Updated Stand-Level Inventory Data


The Finnish Society of Forest Science · The Finnish Forest Research Institute


Arto Haara and Pekka Leskinen

Haara, A. & Leskinen, P. 2009. The assessment of the uncertainty of updated stand-level inventory data. Silva Fennica 43(1): 87–112.

Predictions of growth and yield are essential in forest management planning. Growth predictions are usually obtained by applying complex simulation systems, whose accuracy is difficult to assess. Moreover, the computerised updating of old inventory data is increasing in the management of forest planning systems. A common characteristic of prediction models is that the uncertainties involved are usually not considered in the decision-making process. In this paper, two methods for assessing the uncertainty of updated forest inventory data were studied. The considered methods were (i) the models of observed errors and (ii) the k-nearest neighbour method. The derived assessments of uncertainty were compared with the empirical estimates of uncertainty. The practical utilisation of both methods was considered as well.

The uncertainty assessments of updated stand-level inventory data using both methods were found to be feasible. The main advantages of the two studied methods include that bias as well as accuracy can be assessed.

Keywords measurement error, non-parametric methods, observed error, simulation, stand-level inventory, uncertainty

Addresses Haara, University of Joensuu, Faculty of Forest Sciences, P.O. Box 111, FI-80101 Joensuu, Finland; Leskinen, Finnish Environment Institute, Research Programme for Production and Consumption, P.O. Box 111, FI-80101 Joensuu, Finland

E-mail arto.haara@joensuu.fi

Received 25 March 2008 Revised 24 November 2008 Accepted 26 January 2009 Available at http://www.metla.fi/silvafennica/full/sf43/sf431087.pdf


1 Introduction

Field data concerning the state of the forests, as well as predictions of forest growth and yield, are essential information sources in forest management planning. Inventory data are used in forest planning systems to predict stand growth under different management schedules and to optimise the management schedules for the stands depending on the landowner’s preferences or objectives (e.g. Leskinen 2001) and regulations, as well as the recommendations based on current forest management practices. Forest planning systems are usually quite complex, containing models for predicting the development of stands, e.g. models for regeneration, growth and mortality, and models for simulating the impact of different management schedules based on these predictions (e.g. Jonsson et al. 1993, Siitonen 1993, Lundström and Söderberg 1996, Eid and Hobbelstad 2000). Accurate predictions of stand growth and yield are essential, because inaccurate predictions lead to wrong conclusions and non-optimal treatment schedules (Kangas and Kangas 1997).

In the Nordic countries, field data are usually collected stand by stand using subjective forest inventory methods, e.g. ocular inventory methods. In Finland, for example, the inventory data on non-industrial private forests are mainly collected by using Bitterlich (1984) sample plots. The stand basal area is assessed as the average of the subjectively selected representative sample plots. Furthermore, measurers often adjust the average basal area of the sample plots based on their own subjective views. Moreover, tree heights and diameters at breast height are not measured on the sample plots; instead, the trees are tallied using a relascope, and the measurer assesses a subjectively chosen basal-area median-diameter tree for each tree species in each stand. Thus, it is difficult to estimate the sampling errors and the accuracy of stand-level inventory.

The accuracy of stand-level inventory varies widely due to the subjectivity of the method. For example, the measurement error of the basal area (BA) varies from about 10% to 22% (e.g. Mähönen 1984, Laasasenaho and Päivinen 1986, Ståhl 1992, Pigg 1994, Haara and Korhonen 2004). In addition, the mean diameter at breast height (DgM), the mean height (HgM), and the mean age (Age) in stand-level inventory are determined by referring to the same basal-area median-diameter tree. This implies, for example, that errors in these stand characteristics are positively correlated (e.g. Ståhl 1992, Pigg 1994, Haara 2003). The measurement error for mean age is about 20% and for mean diameter and mean height about 10–20% (e.g. Mähönen 1984, Laasasenaho and Päivinen 1986, Ståhl 1992, Pigg 1994, Haara and Korhonen 2004). In stand-level inventory, the standard error of the mean stand volume (V) can vary between 15% and 38% (e.g. Poso 1983, Laasasenaho and Päivinen 1986, Ståhl 1992, Haara and Korhonen 2004). Considerable variation in accuracy between measurers has also been noted in many studies (e.g. Laasasenaho and Päivinen 1986, Nersten and Næsset 1992, Ståhl 1992, Kangas et al. 2002, Haara and Korhonen 2004).

Long-term management planning typically uses the diameter distribution approach in growth and yield predictions (e.g. Clutter et al. 1983, Siitonen 1993). When doing so, the collected field data with tree and stand characteristics are used to estimate the theoretical diameter distribution of trees. The practice in Finland is to use basal area diameter distributions instead of the stem diameter distributions (e.g. Kilkki et al. 1989, Maltamo 1998). Each sample tree from the theoretical diameter distribution predicts a certain number of trees in the stand. The mean stand volume can be predicted by multiplying the volume of each sample tree by the prediction of its number and by summing up the estimated volumes.
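The summation described above can be sketched in Python as follows. This is a minimal illustration and not the computation used in the study: the `SampleTree` class, its field names and the numbers in the usage note are hypothetical, and the tree volumes are taken as given rather than computed with a volume model.

```python
from dataclasses import dataclass


@dataclass
class SampleTree:
    """One sample tree drawn from the theoretical diameter distribution (hypothetical structure)."""
    volume_m3: float      # stem volume of the sample tree, m^3
    stems_per_ha: float   # number of trees per hectare that this sample tree represents


def stand_volume(trees):
    """Mean stand volume (m^3 per ha): tree volume times represented stem number, summed over sample trees."""
    return sum(t.volume_m3 * t.stems_per_ha for t in trees)
```

For example, two sample trees of 0.1 m³ and 0.4 m³ that represent 200 and 100 stems per hectare give 0.1 × 200 + 0.4 × 100 = 60 m³ ha^–1.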

The uncertainty of growth and yield predictions should be taken into account in forest planning and decision-making (e.g. Hof et al. 1988, Hof and Pickens 1991, Pickens et al. 1991, Hof et al. 1992, Kangas 1999). There are several approaches to assessing this uncertainty. Perhaps the simplest way to assess the uncertainty of growth predictions is to use re-measured data and compare the growth predictions obtained to observed growth. However, empirical assessments are time and place constrained, i.e. the assessments carried out in a certain time-and-place combination cannot be directly applied elsewhere (Kangas 1999). Besides, the assessments of the reference growth data are usually based on accurate measurements instead of ocular assessments. The second approach to assessing the accuracy of growth predictions is to utilise the Monte Carlo simulation or Taylor series approximation methods (e.g. Gertner and Dzialowy 1984, Mowrer and Frayer 1986, Gertner 1987, Mowrer 1991, Gertner et al. 1995, Kangas 1997). When using these methods, the total prediction error is composed of several error sources (Kangas 1999). In both methods, accuracy can be assessed without independent re-measurement of data sets. However, from the technical point of view, it may be very difficult to take all error sources into account, especially when dealing with large model sets. A third possibility in studying the accuracy of simulation systems is to model the observed (past) errors as functions of explanatory variables at the aggregated level (e.g. Hansen and Hahn 1983, Soares et al. 1995, Kangas 1999). In this approach, the errors caused by different growth and yield models need not be specified. The main restriction in using this approach is the requirement of independent continuous inventory data, in which both planning data and correct data are available at the beginning and at the end of the planning period.

Comparative studies between different methods for the purpose of determining their accuracy, e.g. Kangas (1999), are rare. Most studies have focused on dealing with Monte Carlo simulation or Taylor series approximation methods.

In addition to the above techniques, Haara (2002) assessed the uncertainty of growth and yield predictions by using the k-nearest neighbour method (k-NN method). The uncertainty of the predicted stand characteristics of the target stand was derived from the uncertainty assessments of growth predictions of the nearest neighbour stands. The variables of the distance function, as well as the weights of these variables, were chosen using multi-objective optimisation (Haara 2002).

The differences between the predicted growth of the target stand and the predicted growths of the reference stands were minimised. As a result, the stand characteristics and predicted growth were as similar as possible between the target stand and the reference stands. In practice, the quality of the k-NN method predictions depends on the availability of extensive reference data. If good reference data are available, the examined method is a very promising way to predict the uncertainty of growth predictions (Haara 2002).

The quality of forest management planning depends greatly on the accuracy of the inventory data, which can be improved, but doing so results in increasing inventory costs. In general, the costs of field inventory are significant, comprising approximately half of the total costs of forest management planning (Uuttera et al. 2002). On the other hand, the costs of the field inventory can be reduced by utilising old inventory data by updating the stand characteristics contained in the old data computationally using forest simulation systems (e.g. Clutter et al. 1983, Siitonen 1993). However, the accuracy of the updated data must be adequate for planning purposes. Computational updating is already being used in forest planning without accurate knowledge of the uncertainty of the updated stand characteristics, although some studies have demonstrated that updated stand characteristics can be as accurate as stand characteristics obtained from new inventory data (e.g. Pussinen 1992, Anttila 2002, Hyvönen and Korhonen 2003).

The aim of this paper is to further study the uncertainty of updated stand-level inventory data by comparing two different methods for assessing the uncertainty. First, the uncertainty of growth and yield predictions is estimated by modelling the observed errors obtained by comparing the empirical stand characteristics with the updated stand characteristics. Second, the assessment of uncertainty is carried out by applying the k-nearest neighbour (k-NN) method and multi-objective optimisation. These two assessment methods have been presented as promising methods for predicting uncertainty, and in this study the methods are further studied and compared in connection with a large amount of test data. The updating is carried out using the MELA forest management planning system (Redsven et al. 2004). The assessments of uncertainty are compared with the empirical uncertainty. The sources of uncertainty considered in this study are the errors in the basic forest inventory data and the errors of the predictions of the forest development due to these inventory errors.


2 Material and Methods

2.1 Study Material

The study data consisted of three independent data sets. The first data set was obtained from fixed-radius permanent plots (INKA data) located in pure and mixed stands (Gustavsen et al. 1988). The INKA data were collected between 1976 and 1992 by Finnish Forest Research Institute staff from forest stands growing on mineral soils (Table 1). Three fixed circular sample plots were measured in each stand 1–3 times at intervals of 5 years. Plots were located 40 metres apart from each other. The plot size varied: at least 120 sample trees per stand were measured in Southern Finland and 100 sample trees per stand in Northern Finland. The area of the study plots varied between 0.008 and 0.13 hectares, the mean area being 0.011 hectares. The diameters at breast height of all trees on the plot were measured, but tree heights were measured only for trees located in a smaller plot at the centre of each sample plot. The area of the smaller plot was 1/3 of the area of the sample plot. The INKA data consisted of a total of 754 stands.

The second data set comprised three independent data sets collected compartment-by-compartment from Eastern Finland in 1998–2002 (Hyvönen 2002, Hyvönen and Korhonen 2003, Haara and Korhonen 2004). These three data sets were combined and consisted of a total of 1223 stands (Table 1). A stand-level inventory was first carried out by measurers across this large set of stands. The accuracy of the stand-level inventory was then checked with an extensive check inventory. A systematic network of fixed circular sample plots was measured within each stand of the check inventory data. This data set is referred to as the CONTROL data. The number of the plots and the radius of the plots depended on the area of each stand, on the development class of the particular stand, and on the number of tree species within the stand. The average radius of the plots was 7.5 metres, and the radius of the plots varied from 3.99 metres (young stands) to 10 metres (mature stands). The average number of sample plots was 6.2 plots per stand. The tree species and the diameter at breast height were determined for each tree on the plots. The heights of the basal-area median-diameter trees of all tree species were measured on at least three plots within each stand.

The third data set (NORTH data) consisted of independent and controlled compartment data including 1842 stands from Northern Finland and collected in 1990–1994 (Haara 2003). A stand-level inventory was carried out by measurers. The control inventory was then done by measuring a systematic net of relascope sample plots in each stand. The average number of sample plots was 7.7 plots per stand. Tree species and diameters at breast height of all trees on the sample plots were determined (Table 1). The heights of the basal-area median-diameter trees of all tree species within the plot were measured. The three data sets, and the four data sets generated from those in Chapters 2.2 and 2.3, are introduced in Table 2.

Table 1. The main stand characteristics of the INKA data, CONTROL data, and NORTH data.

                        Min     Max     Mean    SD
INKA (754 stands)
  DgM (cm)              1.9     35.2    15.8    6.3
  HgM (m)               1.9     28.7    12.5    5.6
  BA (m² ha^–1)         0.2     38.0    16.4    8.4
  V (m³ ha^–1)          0.7     387.8   114.5   84.0
  Age (years)           2       154     52.7    30.8
CONTROL (1223 stands)
  DgM (cm)              7.5     42.6    19.2    5.9
  HgM (m)               4.3     31.9    15.5    5.2
  BA (m² ha^–1)         1.5     40.8    20.3    6.4
  V (m³ ha^–1)          8.5     447.4   155.8   80.0
  Age (years)           6       179     56.9    26.4
NORTH (1842 stands)
  DgM (cm)              1       31.6    15.1    5.5
  HgM (m)               1.3     16.6    9.7     3.1
  BA (m² ha^–1)         1       36      10.8    6.9
  V (m³ ha^–1)          1       250     60.3    46.7

2.2 Generation of True Stand-Level Inventory Data

Tree-specific stand data were first generated from the INKA data. The trees from three plots in each stand were combined to represent the empirical diameter distribution of the stand. The heights of the trees were taken only from the sample trees.

Any missing heights $h$ were estimated for each tree using Näslund’s (1936) height model

$$h = \frac{d^2}{(b_0 + b_1 d)^2} + 1.3 + e \qquad (1)$$

where $d$ refers to the diameter. The parameters $b_0$, $b_1$ and Var($e$) were estimated separately for each stand. The missing tree heights were predicted using model (1) so that a random variable sampled from the normal distribution N(0, Var($e$)) was added to the predictions. The volumes of the trees from the diameter distributions were calculated using Laasasenaho’s (1982) specieswise volume models. The correct stand volumes were obtained by summing up these tree volumes.
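The height imputation with model (1) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and the parameter values used in the usage note are hypothetical, diameters are taken to be in centimetres and heights in metres, and the stand-wise estimation of the parameters is not shown.

```python
import numpy as np


def naslund_height(d, b0, b1):
    """Expected height from model (1) without the error term: d^2 / (b0 + b1*d)^2 + 1.3."""
    d = np.asarray(d, dtype=float)
    return d ** 2 / (b0 + b1 * d) ** 2 + 1.3


def impute_heights(d, b0, b1, var_e, rng):
    """Model prediction plus a random residual sampled from N(0, var_e)."""
    noise = rng.normal(0.0, np.sqrt(var_e), size=np.shape(d))
    return naslund_height(d, b0, b1) + noise
```

With the hypothetical values b0 = 1.0 and b1 = 0.2, a tree with d = 20 gets the expected height 400 / 25 + 1.3 = 17.3. Adding the sampled residual keeps the between-tree height variation in the imputed data closer to that of the measured sample trees than a plain model prediction would.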

True stand-level inventory data including diameter distribution model errors, referred to as INKA1, were generated from the true tree-specific stand data as outlined above (Fig. 1). The structure of the INKA1 data was the same as in current forest planning practice in Finland (Koivuniemi and Korhonen 2006, Metsäsuunnittelun maastotyöopas 2006). Each tree species was described by the basal area, the diameter at breast height of the basal-area median tree, the height of the basal-area median tree, and the age of the basal-area median tree. These species-specific characteristics were calculated from empirical diameter distributions. The theoretical basal area diameter distributions were predicted from the calculated stand characteristics using the three-parametric approach of the Weibull function (Mykkänen 1986, Kilkki et al. 1989). The heights of the sample trees obtained from the theoretical diameter distribution were predicted using Veltheim’s (1987) height models. Then the predicted heights were calibrated with ratio estimation using the height of the basal-area median tree of the stand. The volumes of the model trees obtained from the theoretical diameter distribution were estimated using Laasasenaho’s (1982) volume models. The volume estimates of the stand-level inventory data were obtained by summing up these volumes.

Furthermore, the basal area median tree of the stand was obtained from the theoretical diameter distribution. The diameter, height and age of the basal area median tree of the stand were used as stand’s DgM, HgM and Age.

Table 2. The study data sets.

Data       Origin   Format      Errors                                       Purpose
INKA                Treewise    None                                         Observed errors of growth predictions
CONTROL             Standwise   From measurers                               Measurement errors, error generation
NORTH               Standwise   From measurers                               Measurement errors, error generation
INKA1      INKA     Standwise   Diameter distribution model errors (DDME)    Test data for uncertainty assessment methods, original data for simulation of standwise data with measurement errors
CONTROL1   INKA1    Standwise   Simulated from CONTROL+DDME                  Modelling data for the models of observed errors, reference data for the k-NN method and multi-objective optimisation
CONTROL2   INKA1    Standwise   Simulated from CONTROL+DDME                  Test data for uncertainty assessment methods
NORTH2     INKA1    Standwise   Simulated from NORTH+DDME                    Test data for uncertainty assessment methods

Fig. 1. Flow chart showing the generation of the study data. [Flow chart: from the treewise INKA data of stand i, the mean stand characteristics are calculated and the true stand-level inventory data INKA1 are generated. With the CONTROL data as reference data 1, the five nearest neighbours c1,..,c5 of target stand i give five realisations of its stand characteristics with measurement errors; one realisation, excluded by random sampling, goes to CONTROL2 (1 stand) and the remaining ones to CONTROL1 (4 stands). With the NORTH data as reference data 2, the measurement errors of the nearest neighbour stand j are generated into the stand characteristics of target stand i, giving NORTH2.]

2.3 Generation of Erroneous Stand-Level Inventory Data

In order to assess the uncertainty of growth and yield predictions by examining practical stand-level inventory data, the study data should also include measurement errors. Haara (2003) found the one nearest neighbour method (1nn method) to be a useful tool for generating error structures for target stands reflecting those in stand-level inventory. In the method, the measurement errors of the neighbour stand are used directly as the measurement errors of the target stand. The differences between the stand characteristics (e.g. main tree species, stand basal area) of the target stand and neighbouring stand are as small as possible. In this study, k error realisations of the stand characteristics of the target stand are needed and the errors of the k nearest neighbours are used one by one, i.e. the k-NN method is applied. Because the use of the k-NN method requires extensive reference data, the NORTH and the CONTROL data sets were utilised to generate the measurement errors of the stand-level inventory into the INKA1 data (Fig. 1).

The procedure was as follows. First, the measurement errors of both the NORTH and CONTROL data sets were examined by deriving true tree and stand characteristics. The missing heights of the sample trees were estimated using Veltheim’s (1987) height models. The height predictions were calibrated using the heights of the stand’s sample trees. The trees in the sample plots were combined to provide a compounded empirical stand diameter distribution. This distribution was used to calculate the true stand characteristics. Errors in the true stand characteristics originated from the errors in the measurement of the sample trees, from the height and volume models, and from the sampling error in the control inventory. The measurement errors of the stand-level inventory of the CONTROL data and the NORTH data were calculated by comparing true stand characteristics and assessed stand characteristics (Table 3). When the sampling error of the checking inventory was taken into account, the RMSE of the stand volume of the CONTROL data decreased by 3.4 percentage points, the RMSE of the stand basal area by 3.6 percentage points, and the RMSEs of DgM and HgM by 2.2 and 0.7 percentage points, respectively.

After studying the measurement errors of the CONTROL data and the NORTH data, the next step was to use them as the reference data for the k-NN method for generating erroneous stand-level data in the 1st and 2nd measurements of the INKA data. Stand-level data with measurement errors were first generated from the measurement errors of the CONTROL data. The level and the structure of the measurement errors were assumed to correlate with the stand characteristics. The search for five nearest-neighbour stands was done by using commonly assessed stand characteristics as the distance function variables in the k-NN method. Similarly, distance functions were applied depending on the tree species.

The chosen standardised variables of the distance function were the basal area median diameter, the stand mean basal area, and the proportion of tree species in the stand in terms of the basal area per hectare. Five different error realisations for the target stand were obtained from the measurement errors of the target stand’s five neighbour stands. One of the five random error generations of each target stand was excluded from the data using simple random sampling. The remaining data with four measurement error realisations of each stand (CONTROL1) were used for modelling the observed errors and the excluded data with one measurement error realisation for each stand were used as the test data (CONTROL2). In this way, each stand had different measurement errors in the modelling and in the testing phase.

The second test data (NORTH2) were generated with the 1nn method and with NORTH data as the reference data.
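The error generation described above can be sketched in Python as follows. This is a simplified illustration and not the implementation used in the study: the function names are hypothetical, an unweighted Euclidean distance on standardised variables is assumed, the neighbours' measurement errors are assumed to be transferred additively in absolute units, and species-specific distance functions are omitted.

```python
import numpy as np


def nearest_neighbours(target, reference, k):
    """Indices of the k reference rows closest to the target (Euclidean distance)."""
    dist = np.sqrt(((reference - target) ** 2).sum(axis=1))
    return np.argsort(dist, kind="stable")[:k]


def generate_error_realisations(target_x, true_values, ref_x, ref_errors, k, rng):
    """Return (modelling realisations, held-out test realisation) for one target stand.

    target_x    : distance-function variables of the target stand, shape (p,)
    true_values : true stand characteristics of the target stand, shape (m,)
    ref_x       : distance-function variables of the reference stands, shape (n, p)
    ref_errors  : measurement errors of the reference stands, shape (n, m)
    """
    # Standardise with the reference data so that all variables share a common scale.
    mean, std = ref_x.mean(axis=0), ref_x.std(axis=0)
    idx = nearest_neighbours((target_x - mean) / std, (ref_x - mean) / std, k)
    # Each neighbour's errors are used as such for the target stand (assumed additive).
    realisations = true_values + ref_errors[idx]
    # Exclude one realisation by simple random sampling; it becomes the test realisation.
    held_out = rng.integers(k)
    keep = np.arange(k) != held_out
    return realisations[keep], realisations[held_out]
```

With k = 5 this yields four realisations per stand for the modelling data and one for the test data, in the manner of CONTROL1 and CONTROL2; with k = 1 and no hold-out step the same neighbour search corresponds to the 1nn method.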

The generated stand level data CONTROL1 were used as modelling data for the models of the observed errors as well as for the reference data for the k-NN method for the uncertainty assessments of the updated stand level data. CONTROL2 and NORTH2 data were used for testing both of these methods. The errors in the stand characteristics in the data including diameter distribution model errors (INKA1) as well as the errors in the modelling data (CONTROL1) are presented in Table 4, and the errors in the stand characteristics in both test data are presented in Table 5.

Table 3. The root mean square errors (RMSE) and biases of the assessed stand characteristics in the two control inventory data (CONTROL and NORTH). The relative RMSEs and biases are shown in parentheses.

                     CONTROL                       NORTH
                     RMSE           Bias           RMSE           Bias
DgM (cm)             2.5 (13.1)     0.6 (3.0)      2.5 (13.1)     0.6 (3.0)
HgM (m)              2.4 (15.5)     0.05 (0.3)     2.4 (15.5)     0.05 (0.3)
BA (m² ha^–1)        3.9 (19.3)     0.6 (3.1)      3.9 (19.3)     0.6 (3.1)
V (m³ ha^–1)         38.6 (24.8)    4.0 (2.6)      38.6 (24.8)    4.0 (2.6)

Table 4. The RMSEs and biases of the stand characteristics in the true INKA stand data including model errors (INKA1) and reference INKA stand data including measurement errors generated from the CONTROL data (CONTROL1).

                     1st measurement                 2nd measurement
                     RMSE           Bias             RMSE           Bias
INKA1
  DgM (cm)           0.8 (4.9)      –0.1 (–0.4)      0.8 (4.6)      –0.03 (–0.2)
  HgM (m)            1.3 (11.0)     0.05 (0.4)       1.1 (8.6)      –0.03 (–0.3)
  BA (m² ha^–1)      0.0 (0.0)      0.0 (0.0)        0.0 (0.0)      0.0 (0.0)
  V (m³ ha^–1)       6.2 (5.6)      –1.0 (–0.9)      8.6 (6.5)      –2.2 (–1.7)
CONTROL1
  DgM (cm)           2.0 (12.8)     0.4 (2.4)        2.1 (12.7)     0.5 (2.7)
  HgM (m)            2.3 (18.8)     0.04 (0.3)       2.3 (17.1)     –0.02 (–0.1)
  BA (m² ha^–1)      2.9 (18.0)     0.0 (0.0)        3.2 (16.9)     0.05 (0.2)
  V (m³ ha^–1)       28.0 (25.0)    –0.4 (–0.4)      32.8 (24.5)    –1.2 (–0.9)

Table 5. The RMSEs and biases of the two test data. In the first test data (CONTROL2) the measurement errors in the stand level inventory are generated from the CONTROL data and in the second test data (NORTH2) the measurement errors are generated from NORTH data.

                     1st measurement                 2nd measurement
                     RMSE           Bias             RMSE           Bias
CONTROL2
  DgM (cm)           2.1 (13.5)     0.3 (2.1)        2.1 (12.4)     0.3 (2.0)
  HgM (m)            2.3 (18.7)     0.1 (0.9)        2.3 (17.2)     0.01 (0.1)
  BA (m² ha^–1)      2.9 (18.2)     0.2 (1.0)        3.5 (18.5)     0.3 (1.4)
  V (m³ ha^–1)       26.8 (24.4)    1.4 (1.3)        35.6 (26.0)    1.2 (0.9)
NORTH2
  DgM (cm)
  HgM (m)            1.9 (12.3)     –0.03 (–0.2)     2.0 (12.0)     0.1 (0.4)
  BA (m² ha^–1)      2.4 (20.0)     0.1 (0.9)        18.4 (2.4)     0.1 (0.7)
  V (m³ ha^–1)       2.5 (15.6)     0.3 (1.7)        2.7 (14.0)     0.4 (1.9)

2.4 Growth Simulations

All stand-level inventory data (i.e. correct stand inventory data, modelling data and both test data) were updated by using the MELA forest simulator (Siitonen 1993, Hynynen et al. 2002, Redsven et al. 2004) for modelling growth, regeneration establishment, and tree mortality. The growth of the 1st measurements of the correct stand inventory data and modelling data was predicted for 5 and 10 years, and the growth of the 2nd measurements was predicted for 5 years. The growth of both test data (CONTROL2, NORTH2) was predicted for 5 and 10 years. The logging and silvicultural operations carried out between inventories were simulated. The timing of these operations had been recorded in the inventory. The cuttings were simulated following the thinning and regeneration models (Hyvän metsänhoidon… 2001). The uncertainty of the growth predictions of the MELA forest simulator included errors of the growth, regeneration and mortality models, and the errors due to the processing of the inventory data, in addition to the measurement errors of the inventory data.

2.5 Models for Observed Errors

The first method for assessing the uncertainty of the updated stand characteristics was to model the observed errors. The empirical errors in the stand characteristics were obtained by calculating the differences between the updated stand characteristics and empirical stand characteristics. The models for the observed errors were estimated for both true and erroneous stand-level data. In both cases, the models were estimated first for stands in which no logging or silvicultural treatments had been carried out during the simulation period, and secondly, for the entire data. The models for the treated stands were not estimated because of the small number of these stands. The models for the observed errors of the basal area median diameter, the basal area median height, the basal area, and the mean stand volume were estimated.

The models were of the form

$$\ln(\mathrm{Error}_i^2) = \alpha + \beta_1 SC_1 + \beta_2 SC_2 + \dots + e_i \qquad (2)$$

or

$$\mathrm{Error}_i^2 = \alpha + \beta_1 SC_1 + \beta_2 SC_2 + \dots + e_i \qquad (3)$$

where $\mathrm{Error}_i^2$ denotes the squared observed error of the updated stand characteristics, $SC_k$ denotes stand and site characteristics ($k$ = 1, 2, …), and $e_i$ is an error term.
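A model of the logarithmic form (Eq. 2) can be fitted and applied as in the following Python sketch. It is a minimal illustration under stated assumptions: ordinary least squares is assumed as the estimation method, the function names are hypothetical, all observed errors are assumed to be non-zero so that the logarithm is defined, and the back-transformation is the plain exponential without any bias correction.

```python
import numpy as np


def fit_log_error_model(sc, errors):
    """Least-squares fit of ln(Error_i^2) = alpha + beta_1*SC_1 + beta_2*SC_2 + ... + e_i.

    sc     : stand and site characteristics, shape (n, p)
    errors : observed errors of an updated stand characteristic, shape (n,), all non-zero
    """
    X = np.column_stack([np.ones(len(sc)), sc])
    y = np.log(np.asarray(errors, dtype=float) ** 2)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [alpha, beta_1, beta_2, ...]


def predict_squared_error(coef, sc):
    """Plain back-transformed prediction exp(alpha + sum_k beta_k*SC_k) of the squared error."""
    X = np.column_stack([np.ones(len(sc)), sc])
    return np.exp(X @ coef)
```

The predicted squared error of a stand serves as a prediction of its MSE, and its square root as a predicted RMSE. A bias model follows the same pattern with the unsquared error as the dependent variable of a linear fit.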

The stand and site characteristics, which could have been measured as such or derived from stand level inventory data (e.g. growth predictions), were used as independent variables in these models. These models yield stand-level predictions of the MSEs of the updated stand characteristics. Furthermore, the bias models for the observed errors were estimated. In these models, the observed errors of the updated stand characteristics, i.e. the dependent variables, were not squared.

2.6 K-nearest Neighbour Method and Multi-Objective Optimisation

The second method for assessing the uncertainty of the updated stand characteristics was the combination of the k-NN method (Härdle 1989, Altman 1992) and multi-objective optimisation.

The uncertainty of the predicted stand characteristics of the target stand, i.e. RMSEs and biases, was predicted from the uncertainty of the growth predictions of the 10 nearest neighbour stands.

The search for the reference stands was carried out by using standardised commonly measured stand characteristics, i.e. stand basal area, stand age and the class variable main tree species, as the variables of the distance function. The variables of the similarity distance function, as well as the weights of the variables, were determined using multi-objective optimisation. The non-linear programming algorithm (Hooke and Jeeves 1961) was used to find the combination of decision variables minimising the objective function. The computer program developed by Osyczka (1984) was modified and adapted to deal with the k-NN method. The differences between the predicted growth of the target stand and the predicted growth of the reference stand were minimised in the optimisation. Thus, the stand characteristics and predicted growth were as similar as possible between the target stand and the reference stands.

The (R)MSEs and biases of the updated stand characteristics of the neighbour stands were used as the target stand’s uncertainty assessments.
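The neighbour-based assessment can be sketched in Python as follows. This is a simplified illustration and not the optimised procedure of the study: a weighted Euclidean distance is assumed, the weights are taken as given instead of being determined by multi-objective optimisation, the class variable main tree species is assumed to be handled beforehand (e.g. by restricting the reference stands to the same species), and the function name is hypothetical.

```python
import numpy as np


def knn_uncertainty(target_x, ref_x, ref_errors, weights, k=10):
    """RMSE and bias for a target stand from the observed errors of its k nearest reference stands.

    target_x   : standardised distance-function variables of the target stand, shape (p,)
    ref_x      : the same variables for the reference stands, shape (n, p)
    ref_errors : observed errors of an updated stand characteristic in the reference stands, shape (n,)
    weights    : non-negative weights of the distance-function variables, shape (p,)
    """
    dist = np.sqrt((weights * (ref_x - target_x) ** 2).sum(axis=1))
    idx = np.argsort(dist, kind="stable")[:k]
    e = ref_errors[idx]
    # Mean-based RMSE of the neighbours' errors (divisor k) and their mean as the bias.
    return np.sqrt(np.mean(e ** 2)), np.mean(e)
```

When the target stand itself belongs to the reference data, it has to be removed from `ref_x` and `ref_errors` before the call, which corresponds to a leave-one-out use of the reference data.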

In the case of the INKA1 data, i.e. true stand-level inventory data, the reference data of the target stand consisted of all stands in the INKA1 data except the target stand. As regards the stand data with measurement errors, the reference data consisted of the CONTROL1 data, i.e. INKA data with measurement errors generated from the CONTROL data, again excluding the target stand.

2.7 Uncertainty Assessments of the Updated Stand Characteristics

The usability of both assessment methods was tested by using the true stand-level inventory data INKA1 and two generated test data, i.e. CONTROL2 and NORTH2. All of these data sets were updated and the empirical errors of the growth predictions were calculated from the differences between the updated stand characteristics and the true stand characteristics. The test criteria used in the comparison of the predictions were RMSE and bias of the observed errors. The RMSE and the bias of the empirical errors were calculated as

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - 1}} \qquad (4)$$

and

$$\mathrm{bias} = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i) \qquad (5)$$

where $Y_i$ denotes the true value of the stand characteristics, $\hat{Y}_i$ denotes the predicted value of the stand characteristics, and $n$ is the number of stands. The relative RMSE and the relative bias are obtained by dividing RMSE and bias by the average of the true stand characteristics.
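The two test criteria and their relative versions can be computed as in the following Python sketch. It is a minimal illustration under stated assumptions: the function names are hypothetical, the errors are taken as true minus predicted values, the RMSE uses the divisor n − 1 as read from Eq. 4, and the relative values are expressed in per cent.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error with divisor n - 1 (assumed from Eq. 4)."""
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.sqrt(np.sum(d ** 2) / (len(d) - 1))


def bias(y_true, y_pred):
    """Mean of the errors, true minus predicted (Eq. 5)."""
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(d)


def relative(value, y_true):
    """RMSE or bias as a percentage of the mean of the true values."""
    return 100.0 * value / np.mean(np.asarray(y_true, dtype=float))
```

For true volumes of 100, 200 and 300 m³ ha^–1 and predictions of 90, 210 and 300 m³ ha^–1, the errors are 10, –10 and 0, so the RMSE is sqrt(200 / 2) = 10 m³ ha^–1 (5% of the mean of 200) and the bias is 0.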

The uncertainty of the updated stand characteristics for the true stand-level inventory data and two generated test stand data were predicted with both assessment methods, i.e. the models for the observed errors and the k-NN method. The standwise RMSEs and biases were first predicted with both assessment methods. After that RMSEs and biases of the three test data sets were obtained by adding together the standwise estimates of the RMSEs and biases, and by dividing the sums by the number of the stands. The predictions of uncertainty were compared to empirical RMSEs and biases.

The considered stand characteristics were DgM, HgM, BA and V. The predictions and related uncertainty measures were produced for stands without logging and silvicultural treatments and together for the stands with and without treatments, using both methods, i.e. the models for the observed errors and the k-NN method.

The uncertainty assessments obtained by both methods, i.e. the models for the observed errors and the k-NN method, were also studied at the stand level. The 95% confidence intervals of the updated stand characteristics were estimated for each stand based on the uncertainty assessment of the stand characteristics depending on the method used. After estimating the confidence intervals, the proportions of the observed errors included in the estimated confidence intervals were studied by using both methods. The studied updating period was 10 years.
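The stand-level check can be illustrated with a short Python sketch. The way the intervals were constructed is not specified in this section, so the sketch rests on labelled assumptions: the errors are treated as normally distributed, the interval is centred on the predicted bias, and its standard deviation is derived from the predicted RMSE and bias as sqrt(RMSE² − bias²). The function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def error_interval(pred_bias, pred_rmse, level=0.95):
    """Interval for the error of one stand under a normality assumption."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Assumed decomposition: RMSE^2 = variance + bias^2.
    sd = math.sqrt(max(pred_rmse ** 2 - pred_bias ** 2, 0.0))
    return pred_bias - z * sd, pred_bias + z * sd


def coverage(observed_errors, pred_biases, pred_rmses, level=0.95):
    """Proportion of observed errors that fall inside their intervals."""
    hits = 0
    for error, b, r in zip(observed_errors, pred_biases, pred_rmses):
        lower, upper = error_interval(b, r, level)
        if lower <= error <= upper:
            hits += 1
    return hits / len(observed_errors)


# Hypothetical observed errors and standwise uncertainty predictions.
observed = [1.0, -5.0, 0.5]
biases = [0.0, 0.0, 0.0]
rmses = [1.0, 1.0, 1.0]
share_inside = coverage(observed, biases, rmses)
```

If the uncertainty assessments are well calibrated, the proportion returned by `coverage` should be close to the nominal level of 0.95 when computed over a large number of stands.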

3 Results

3.1 Models for Observed Errors

The models for the observed errors (Eqs. 2–3) were estimated in the first phase for the true stand-level inventory data (INKA1). The modelling was first carried out excluding the stands in which management operations had been performed during the updating time (Table 6) and then with the whole INKA1 data (Table 7). The bias models (Eq. 5) for the growth predictions of the stand characteristics were also estimated for both cases (Tables 8 and 9). The uncertainty of the growth predictions increased with the length of the updating period. In addition, the relative uncertainty of the growth predictions was clearly higher for young stands. The effect of treatments was considerable: growth and yield predictions were more difficult in treated stands. The uncertainty of the updated BA and V was assessed more reliably than the uncertainty of the updated median tree characteristics.

In the second phase, the models of the observed errors were estimated using the stand-level inventory data CONTROL1, which included generated measurement errors. Here the treatment effect was not as clear as it had been with the true stand-level inventory data (Tables 10 and 11). Moreover, the effect of the updating time on the uncertainty of the stand characteristics had diminished. Furthermore, the bias models for the growth predictions of the stand characteristics were estimated for the data with and without the stands in which management operations had been performed during the updating time (Tables 12 and 13).


Table 6. The models of the observed errors of the growth predictions of the stand characteristics with the correct stand inventory data (INKA1) without treated stands.

Model           R²    R²adj  Std. error  Predictor  Coefficient    Sig.  Std. error of coeff.
ln(V error²)    0.30  0.30   2.664       Constant   –0.305         *     0.211
                                         Ti(a)      0.381          ***   0.015
                                         BA         0.05507        ***   0.016
                                         V(i)       0.01215        ***   0.001
                                         BAPI       –0.738         ***   0.155
                                         Age        0.004529       ***   0.002
                                         dd         –0.0003519     ***   0.000
                                         V          0.0004854      ***   0.002
                                         SC4        0.262          ***   0.113

BA error²       0.16  0.16   5.265       Constant   1.733          ***   0.426
                                         Ti(a)      0.436          ***   0.029
                                         Age        –0.007329      *     0.004
                                         BABI       3.009          ***   0.554
                                         BA(i)      0.165          ***   0.028
                                         Species    0.289          **    0.105
                                         DgM        –0.148         ***   0.032
                                         SC1,2      0.916          **    0.356
                                         V          0.1081         ***   0.004
                                         BA         –0.06588       **    0.033

DgM error²      0.07  0.07   2.453       Constant   2.077          ***   0.381
                                         Ti(a)      0.092          ***   0.016
                                         BAPI       –1.917         ***   0.356
                                         BASP       –1.483         ***   0.383
                                         (DgM)²     0.0032         ***   0.000
                                         V          –0.01096       ***   0.002
                                         SC4        –0.355         **    0.132
                                         BA         0.04205        **    0.020

ln(HgM error²)  0.10  0.10   2.455       Constant   –1.312         **    0.610
                                         Ti(a)      0.139          ***   0.016
                                         Species    0.174          **    0.085
                                         BAPI       –0.316         *     0.172
                                         BAMAX      –1.845         ***   0.504
                                         DgM        0.03718        ***   0.010
                                         HgM(i)     0.09542        ***   0.032
                                         BABI       0.756          **    0.347

Ti(a), Growth prediction time, years; BA, Basal area, m² ha⁻¹; V(i), Estimated growth of mean stand volume, m³ ha⁻¹; BAPI, Proportion of Scots pine in stand basal area, (0–1); Age, Stand age, years; dd, Total annual temperature sum; V, Mean stand volume, m³ ha⁻¹; SCx, Fertility class according to Kuusela and Salminen (1969), dummy variable in which x = 1, 2, ..., 8. Definition of x values: 1 = Very rich, 2 = Rich, 3 = Moist, 4 = Dryish, 5 = Dry, 6 = Barren, 7 = Rocky sites, sandy sites and alluvial soils, 8 = Hilltops and fells; HgM, Height of basal area median tree, m; BABI, Proportion of birch in stand basal area, (0–1); BA(i), Estimated growth of stand basal area, m² ha⁻¹; Species, Number of tree species in stand; DgM, Basal area median diameter, cm; BASP, Proportion of Norway spruce in stand basal area, (0–1); BAMAX, Maximum proportion of basal area of tree species in stand basal area, (0–1); HgM(i), Estimated height growth of basal area median tree, m; *, coefficient is significant at 0.05 level; **, coefficient is significant at 0.01 level; ***, coefficient is significant at 0.001 level.
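To illustrate how a model of this kind could be applied to a single stand, the following Python sketch evaluates the ln(V error²) model of Table 6. The coefficients are copied from the table, whereas the additive linear form, the simple back-transformation without any correction for the logarithmic transformation, the dictionary keys and the example stand are assumptions made for illustration only.

```python
import math

# Coefficients of the ln(V error^2) model as listed in Table 6.
LN_V_ERROR2_COEFFS = {
    "Ti": 0.381,          # growth prediction time, years
    "BA": 0.05507,        # basal area
    "V_growth": 0.01215,  # V(i), estimated growth of mean stand volume
    "BAPI": -0.738,       # proportion of Scots pine in basal area
    "Age": 0.004529,      # stand age, years
    "dd": -0.0003519,     # total annual temperature sum
    "V": 0.0004854,       # mean stand volume
    "SC4": 0.262,         # dummy variable for fertility class 4
}
CONSTANT = -0.305


def predict_ln_v_error2(stand):
    """Linear predictor of ln(V error^2), assuming an additive model."""
    return CONSTANT + sum(
        coeff * stand[name] for name, coeff in LN_V_ERROR2_COEFFS.items()
    )


def predict_v_error(stand):
    """Naive back-transformation to the error scale (no log-bias correction)."""
    return math.sqrt(math.exp(predict_ln_v_error2(stand)))


# Hypothetical stand, not taken from the INKA data.
example_stand = {
    "Ti": 10, "BA": 20.0, "V_growth": 50.0, "BAPI": 0.8,
    "Age": 60, "dd": 1100.0, "V": 150.0, "SC4": 0,
}
ln_error2 = predict_ln_v_error2(example_stand)
error_estimate = predict_v_error(example_stand)
```

Because the coefficient of Ti(a) is positive, the predicted error grows with the length of the updating period when the other predictors are held fixed.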


Table 7. The models of the observed errors of the growth predictions of the stand characteristics with the correct stand inventory data (INKA1) with treated stands.

Model           R²    R²adj  Std. error  Predictor  Coefficient    Sig.  Std. error of coeff.
ln(V error²)    0.38  0.38   2.638       Constant   –0.878         **    0.392
                                         Ti(a)      0.382          ***   0.014
                                         BA         0.0335         *     0.018
                                         Treat      2.29           ***   0.159
                                         V(i)       0.009143       ***   0.001
                                         BAPI       –0.778         ***   0.134
                                         dd         –0.0003805           0.000
                                         V          0.004856       ***   0.001
                                         ln(V)      0.318          **    0.126
                                         SC1,2      –0.341         **    0.161

ln(BA error²)   0.69  0.69   3.255       Constant   20.536         ***   2.541
                                         Ti(a)      1.174          ***   0.019
                                         East       –0.000003532   ***   0.000
                                         dd         –0.003604      ***   0.000
                                         Treat      2.392          ***   0.192
                                         BABR       3.837          **    1.899
                                         BA         0.05744        **    0.024
                                         HgM        –0.112         ***   0.021
                                         SC1,2      0.629          **    0.252
                                         SC3        0.277          *     0.157
                                         SC4+       –0.620         **    0.252
                                         ln(BA)     –0.649         **    0.284

DgM error²      0.11  0.11   3.283       Constant   0.725          **    0.326
                                         Treat      0.790          ***   0.229
                                         BAPI       –1.121         ***   0.209
                                         Ti(a)      0.114          ***   0.022
                                         (DgM)²     0.004078       ***   0.000
                                         V          –0.009106      ***   0.001
                                         BA(i)      –0.146         ***   0.022
                                         DgM(i)     0.230          ***   0.037
                                         Species    0.239          ***   0.080

                                         Ti(a)      0.150          ***   0.015
                                         BAMAX      –1.963         ***   0.454
                                         BA         0.03275        ***   0.006
                                         HgM(i)     0.09878        ***   0.028
                                         BABI       0.678          **    0.284
                                         Species    0.177          **    0.075

Treat, dummy variable in which 0 = Stand not treated during growth prediction time, 1 = Stand treated during growth prediction time; East, Y coordinate, m; BABR, Proportion of broadleaves in stand basal area, (0–1); DgM(i), Estimated growth of basal area median diameter, cm.


Table 8. The bias models of the growth predictions of the stand characteristics with the correct stand inventory data (INKA1) without treated stands.

Model           R²    R²adj  Std. error  Predictor  Coefficient    Sig.  Std. error of coeff.
V               0.20  0.19   13.46       Constant   –2.225         *     2.713
                                         Age        –0.194         ***   0.017
                                         Ti(a)      0.934          ***   0.160
                                         SC3        4.062          **    0.835
                                         BA         2.522          ***   0.267
                                         V(i)       –0.431         ***   0.000
                                         V²         0.00246        ***   0.155
                                         DgM        1.529          ***   0.159
                                         V          –0.349         ***   0.041
                                         ln(BA)     –6.877         ***   1.870
                                         Species    –1.865         ***   0.449
                                         SC5        –4.681         ***   1.347

BA              0.17  0.16   1.578       Constant   0.921          ***   0.223
                                         Age        –0.2435        ***   0.002
                                         BA(i)      –0.125         ***   0.014
                                         SC5        –1.459         ***   0.169
                                         Ti(a)      0.154          ***   0.021
                                         SC4        –0.530         **    0.099
                                         (DgM)²     0.009718       **    0.001
                                         DgM        0.981          **    0.017
                                         V          –0.09889       ***   0.002
                                         dd         –0.003055      **    0.000

DgM             0.13  0.13   1.030       Constant   1.149          ***   0.135
                                         Age        –0.06807       ***   0.001
                                         V          0.09646        ***   0.001
                                         Species    –0.153         ***   0.034
                                         SC5        –0.433         ***   0.103
                                         BA         –0.4641        ***   0.010
                                         dd         –0.001376      **    0.000
                                         SC3        –0.135         **    0.063
                                         DgM        –0.4524        **    0.100

HgM             0.14  0.13   1.302       Constant   1.267          ***   0.378
                                         (HgM)²     –0.1025        ***   0.001
                                         DgM        0.184          ***   0.017
                                         Age        –0.06955       ***   0.002
                                         V          0.1461         ***   0.003
                                         ln(HgM)    –1.172         ***   0.257
                                         HgM(i)     –0.5798        ***   0.017
                                         Ti(a)      0.3899         **    0.015
                                         BABI       –0.5223        **    0.022

V(i), Estimated growth of stand mean volume, m³ ha⁻¹.


Table 9. The bias models of the observed errors of the growth predictions of the stand characteristics with the correct stand inventory data (INKA1) with treated stands.

Model           R²    R²adj  Std. error  Predictor  Coefficient    Sig.  Std. error of coeff.
V               0.36  0.36   19.8        Constant   6.241          **    3.727
                                         Treat      15.795         ***   1.315
                                         V(i)       –0.179         ***   0.013
                                         Age        –0.306         ***   0.023
                                         BASP       8.542          ***   1.799
                                         DgM        1.351          ***   0.190
                                         Ti(a)      1.502          ***   0.225
                                         SC5        –10.701        ***   2.119
                                         ln(BA)     –7.813         ***   2.542
                                         BA         2.070          ***   0.332
                                         V          –0.180         ***   0.025
                                         SC4        –4.721         ***   1.313
                                         Species    –1.775         ***   0.578
                                         dd         –0.02747       **    0.001

BA              0.39  0.39   2.287       Constant   3.132          ***   0.339
                                         BA(i)      –0.304         ***   0.015
                                         Age        –0.3598        ***   0.003
                                         BAPI       –0.694         ***   0.217
                                         Ti(a)      0.188          ***   0.026
                                         SC5        –2.413         ***   0.342
                                         Treat      0.890          ***   0.160
                                         SC4        –1.078         ***   0.273
                                         dd         –0.003734      ***   0.000
                                         DgM        0.7275         ***   0.019
                                         V          –0.04655       ***   0.001
                                         BABR       –0.921         ***   0.320
                                         SC3        –0.441         **    0.218

DgM             0.13  0.12   1.175       Constant   1.278          ***   0.155
                                         Age        –0.08555       ***   0.001
                                         DgM(i)     –0.808         ***   0.012
                                         V          0.07698        ***   0.001
                                         Species    0.004078       ***   0.033
                                         DgM        –0.5687        ***   0.010
                                         SC5        –0.480         ***   0.106
                                         BA         –0.3242        ***   0.010
                                         BABR       2.126          **    0.842
                                         Treat      –0.187         ***   0.071
                                         Ti(a)      0.289          **    0.012

HgM             0.13  0.12   1.316       Constant   1.329          ***   0.328
                                         (HgM)²     –0.07747       ***   0.001
                                         V          0.06993        ***   0.001
                                         DgM        0.147          ***   0.015
                                         ln(HgM)    –1.107         ***   0.192
                                         Age        –0.0587        ***   0.002
                                         HgM(i)     –0.8539        ***   0.017
                                         Ti(a)      0.6044         ***   0.013
                                         SC1        0.307          **    0.113
                                         Species    –0.108         ***   0.036
                                         SC5        –0.28          **    0.118


Table 10. The models of the observed errors of the growth predictions of the stand characteristics with the erroneous stand inventory data (CONTROL1) without treated stands.

Model           R²    R²adj  Std. error  Predictor  Coefficient    Sig.  Std. error of coeff.
ln(V error²)    0.16  0.15   2.413       Constant   –3.024         ***   0.434
                                         Ti(a)      0.133          ***   0.015
                                         Alt        0.002387       ***   0.001
                                         dd         0.003198       ***   0.000
                                         BASP       0.190          ***   0.094
                                         Vlevel     0.00995        ***   0.004
                                         SC5+       –0.312         ***   0.102
                                         Age        0.003948       ***   0.001
                                         ln(V)      0.762          ***   0.049
                                         1/V        1.539          ***   0.509
                                         SC1,2      0.463          ***   0.108
                                         V(i)       –0.005416      ***   0.002

BA error²       0.04  0.04   2.874       Constant   –2.721         ***   0.243
                                         Ti(a)      0.09566        ***   0.011
                                         BA         0.874          ***   0.084
                                         BA(i)      0.08338        ***   0.014
                                         BASP       –0.803         ***   0.090
                                         Species    0.114          ***   0.035
                                         dd         0.000671       ***   0.000
                                         DgM        –0.023         ***   0.006
                                         1/DgM      0.800          ***   0.228
                                         BABR       1.895          **    0.951

DgM error²      0.07  0.07   2.453       Constant   2.077          ***   0.381
                                         Ti(a)      0.092          ***   0.016
                                         BAPI       –1.917         ***   0.356
                                         BASP       –1.483         ***   0.383
                                         (DgM)²     0.0032         ***   0.000
                                         V          –0.01096       ***   0.002
                                         SC4        –0.355         **    0.132
                                         BA         0.04205        *     0.020

                                         Ti(a)      0.139          ***   0.016
                                         Species    0.174          **    0.085
                                         BAPI       –0.316         **    0.172
                                         BAMAX      –1.845         ***   0.504
                                         DgM        0.03718        ***   0.010
                                         HgM(i)     0.09542        ***   0.032
                                         BABI       0.756          **    0.347

Alt, Elevation above sea level, m; Vlevel, Expected RMSE of mean stand volume of measurer.