Parts of the dissertation research implemented hedonic price theory and insights from the AMM model through spatial hedonic regressions, which estimated the marginal effect of ecological risks and amenities on house prices. Other parts implemented urban complexity theory via the calibration of cellular automata models and via a novel combination of fractal geometry and co-integration analysis. The former focused on flood risk management and urbanization dynamics, while the latter focused on the spatiotemporal dynamics of house prices. Although the two families of tools originate in disciplines with notably different theoretical assumptions, the concept of spatial interaction binds them together in the study of spatial economic processes.

5.1. Spatial hedonic models

The most widely used statistical technique for implementing hedonic price theory is the estimation of hedonic regressions. The regressions are typically linear models that express the market price of a dwelling as a function of its attributes. In this context, the estimated regression coefficient of a hedonic attribute is the marginal price of that attribute: a unit change in the quantity or quality of that attribute modifies the price of a typical dwelling by the estimated beta coefficient, in the measured units of price. Relating to the AMM model, hedonic attributes have often been categorized as structural, locational, or neighborhood (Dubin 1988). This results in Equation (1), in which y is a vector of the selling prices of a sample of properties, S, L, and N are matrices of, respectively, structural, locational, and neighborhood attributes, γ, δ, and ϑ are coefficient vectors, and 𝛜 is a vector of random errors.

𝐲 = γ𝐒 + δ𝐋 + ϑ𝐍 + 𝛜 (1).
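As an illustration of how Equation (1) can be estimated, the following is a minimal sketch using ordinary least squares on synthetic data. The attribute names, the coefficient values, the noise level, and the added intercept are all assumptions made for the example; they are not taken from the dissertation's data or models.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical attributes (names and ranges are illustrative assumptions).
S = rng.uniform(40, 150, size=(n, 1))   # structural: floor area (m^2)
L = rng.uniform(0, 20, size=(n, 1))     # locational: distance to centre (km)
N = rng.uniform(0, 1, size=(n, 1))      # neighborhood: share of green space

# Assumed "true" marginal prices used only to generate the synthetic prices.
true_coefs = np.array([2000.0, -3000.0, 15000.0])   # gamma, delta, theta
X = np.hstack([S, L, N])
y = 50000.0 + X @ true_coefs + rng.normal(0, 5000, size=n)

# OLS with an intercept column; each slope estimates a marginal price.
design = np.column_stack([np.ones(n), X])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, gamma_hat, delta_hat, theta_hat = coefs
```

Under these assumptions, `gamma_hat` reads as the price change associated with one additional square metre, holding the other attributes fixed.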

Spatial econometrics enhances hedonic analysis by introducing spatial interaction between dwellings (in disaggregate observations) or regions (in aggregate data). This is achieved by encoding contiguity information in a spatial weights matrix, which lists the neighbors of each observation, typically in a binary manner (1: neighbor; 0: non-neighbor). Neighborhood is identified either through contiguity (common edges or vertices of polygons) or through distance rules (nearest neighbors or a distance radius). The first-order von Neumann and Moore neighborhoods are most often used to determine contiguity. For a given polygon, the von Neumann neighborhood identifies the adjacent polygons at the cardinal points, while the Moore neighborhood identifies the complete ring of adjacent polygons surrounding the polygon in question (Figure 3a).

Figure 3a: The 1st order von Neumann (left) and Moore (right) neighborhoods of the central cell.
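The two neighborhood definitions can be sketched for a regular lattice as follows. This is a simplified illustration on a hypothetical 3 by 3 grid of cells, not the polygon-based procedure used in the articles; the function names are invented for the example.

```python
def grid_neighbors(rows, cols, kind="von_neumann"):
    """Return {cell index: sorted list of first-order neighbor indices}."""
    if kind == "von_neumann":
        # Cells sharing an edge (the four cardinal directions).
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif kind == "moore":
        # Cells sharing an edge or a corner (the full surrounding ring).
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("kind must be 'von_neumann' or 'moore'")
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            found = []
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    found.append(rr * cols + cc)
            neighbors[r * cols + c] = sorted(found)
    return neighbors


def binary_weights(neighbors):
    """Binary spatial weights matrix as nested lists (1: neighbor, 0: non-neighbor)."""
    n = len(neighbors)
    W = [[0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            W[i][j] = 1
    return W


vn = grid_neighbors(3, 3, "von_neumann")
moore = grid_neighbors(3, 3, "moore")
W_vn = binary_weights(vn)
```

With cells indexed row by row, the central cell has index 4; `vn[4]` lists its four cardinal neighbors and `moore[4]` all eight surrounding cells.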

The concept of order reflects whether the extended neighborhood of a given polygon is included.

The first-order von Neumann and Moore neighborhoods do not consider extended neighbors, while higher orders add the neighbors of neighbors (second order), neighbors of neighbors of neighbors (third order), and so on (Figure 3b). In either case, since each neighborhood has its own neighborhood, growth and spatial interaction effects propagate gradually to all members of a regional system. Thus, even if the spatial weights matrix models only local contiguity, the local processes modelled by spatial econometric tools have global effects, since all elements in a regional system belong to each other’s n-order neighborhood.

Figure 3b: The 2nd order von Neumann (left) and 2nd order Moore (right) neighborhoods of the central cell.

The 1st order neighborhoods of Figure 3a are nested in the 2nd order neighborhoods here, which forms the basis for local interactions having global effects.
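The nesting of neighborhood orders can be illustrated with a short sketch on a hypothetical 5 by 5 lattice. It assumes that the neighborhood of order k contains every cell reachable in at most k contiguity steps; the helper names are invented for the example.

```python
def von_neumann_neighbors(rows, cols):
    """First-order von Neumann neighbors for every cell of a lattice."""
    nb = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nb[(r, c)] = {(rr, cc) for rr, cc in candidates
                          if 0 <= rr < rows and 0 <= cc < cols}
    return nb


def neighborhood_up_to_order(nb, cell, order):
    """Cells reachable from `cell` in at most `order` steps (cell itself excluded)."""
    reached = {cell}
    frontier = {cell}
    for _ in range(order):
        # Neighbors of the current frontier that have not been reached yet.
        frontier = {m for f in frontier for m in nb[f]} - reached
        reached |= frontier
    return reached - {cell}


nb = von_neumann_neighbors(5, 5)
centre = (2, 2)
sizes = [len(neighborhood_up_to_order(nb, centre, k)) for k in (1, 2, 3, 4)]
```

Here the neighborhood of the central cell grows with each order until, at order 4, it contains all 24 other cells of the lattice, which is the sense in which local contiguity produces global reach.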

Although the above neighborhood rules search for edges and vertices of geometrical shapes, they are also applicable to point observations. This is achieved by computing the Thiessen polygons of the points and using the edges and vertices of these polygons to derive contiguity. In this dissertation the Thiessen polygon method has been employed in the hedonic regressions of articles I-III whenever disaggregated point observations (individual dwellings) were involved. The main assumption represented by this choice is that each property has equal weight in the competition with surrounding properties in establishing its influence area.

The spatial weights matrix serves as a moving average window (Anselin 1988). The matrix passes over each observation and computes a spatially averaged value in its defined neighborhood, called the observation’s spatial lag. The benefit to hedonic analysis is twofold:
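The moving average interpretation can be sketched as follows. The example assumes a row-standardized weights matrix (each row of the binary matrix divided by its sum), which is a common convention but an assumption here; the four observations and their prices are hypothetical.

```python
def row_standardize(W):
    """Divide each row by its sum so that the weights of every observation sum to 1."""
    out = []
    for row in W:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def spatial_lag(W, values):
    """Weighted average of the neighbors' values for every observation."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]


# Four hypothetical dwellings along a street: 0 - 1 - 2 - 3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
prices = [100.0, 200.0, 300.0, 400.0]
lag = spatial_lag(row_standardize(W), prices)
```

Each element of `lag` is the mean price of that dwelling's neighbors; the dwelling's own price does not enter its lag because the diagonal of the weights matrix is zero.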

Firstly, the majority of hedonic attributes exhibit spatial autocorrelation. This means that locations near to each other exhibit similar values (Tobler 1970). If there are unobserved variables in a regression, then the error will also exhibit spatial autocorrelation and will violate the assumption of independently and identically distributed errors. Spatial regression addresses this issue by removing the spatial autocorrelation from the error. The most widely used model of this type is the spatial error model, shown in Equation (2), in which matrix X includes the structural, locational, and neighborhood attribute matrices of Equation (1), β is a vector of regression coefficients, λ is the spatial error coefficient, W is a spatial weights matrix, u is a vector of spatially autocorrelated error terms, and 𝛜 is a vector of uncorrelated random errors. The nonrandom, spatially autocorrelated error that is unaddressed in the non-spatial setup of Equation (1) is split in the spatial setup of Equation (2) into two components: the spatially autocorrelated component Wu and the truly random component 𝛜.

𝐲 = β𝐗 + λ𝐖𝐮 + 𝛜 (2).

Secondly, certain spatial hedonic regressions move beyond clearing the error and exploit spatial autocorrelation to infer spatial behavior in the housing market. The most common behavior of this kind is the interaction between a location or property and its neighboring locations or properties in terms of prices and hedonic determinants. The most widely used model of this kind is the spatial lag model, shown in Equation (3), in which X is the matrix of structural, locational, and neighborhood attributes of Equation (2), β is a vector of regression coefficients, ρ is the spatial lag coefficient, W is a spatial weights matrix, and 𝛜 is a vector of random errors. In Equation (3), the spatially autocorrelated term is Wy, which is the spatially lagged form of the dependent variable (transaction price).

𝐲 = ρ𝐖𝐲 + β𝐗 + 𝛜 (3).
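Because the dependent variable appears on both sides of Equation (3), prices are determined jointly across locations. A minimal sketch of this simultaneity is given below: it generates prices from the reduced form y = (I − ρW)⁻¹(Xβ + ϵ) and checks that they satisfy Equation (3). The ring-shaped layout, the row-standardized weights, and all parameter values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

# Assumed toy layout: six units on a ring, each with two neighbors,
# row-standardized so that every row of W sums to 1.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho = 0.4                       # hypothetical spatial lag coefficient
beta = np.array([2.0, -1.0])    # hypothetical attribute coefficients
X = rng.normal(size=(n, 2))     # synthetic hedonic attributes
eps = rng.normal(scale=0.1, size=n)

# Reduced form: solve (I - rho W) y = X beta + eps for y.
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)

# The solution satisfies the structural form y = rho W y + X beta + eps.
residual = y - (rho * (W @ y) + X @ beta + eps)
```

The inverse (I − ρW)⁻¹ is what carries a local change through neighbors of neighbors to the whole system, which is why the coefficients of this model cannot be read at face value.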

The main motivation for estimating the spatial econometric models of Equations (2) and (3) is addressing estimation problems that arise from non-random residuals due to spatially autocorrelated unobserved variables (Gerkman 2012). Such residuals are to be expected in hedonic analysis settings, as it is impossible to know from theory, or to measure, all the variables explaining the behavior of agents in the housing market or their perception of geographical space. A further motivation is the identification of endogenous spatial interaction behavior in property prices and exogenous spatial interaction behavior in the marginal effects of hedonic attributes. Parameter λ of Equation (2) and parameter ρ of Equation (3) help identify and interpret such effects.

While Kuminoff et al. (2010) concluded that spatial econometric models are among the most trustworthy for hedonic studies, their use is not free of criticism. Identification and causality issues, together with the uncritical application and interpretation of these models, are an area of active debate, and the argument reaches deep into the conceptual approach of economic analysis, the policy questions the models aim to address, and the mathematical approaches used to address them (see, e.g., Manski 1993; Gibbons and Overman 2012). Gibbons and Overman (2012) argue that spatial econometric models should be used in conjunction with strict identification techniques rather than as replacements for them. In line with their approach, article III of the dissertation implements a difference-in-differences identification strategy in a spatial econometric setting, thereby exploiting both techniques. It is also worth noting that, despite the methodological debates about spatial econometrics, its critical implementation in articles II and III provided estimates that are in line with the international and Finnish hedonic literature. The implicit prices of urban green estimated in article II are close to those of prior Finnish studies that used standard econometric techniques, whereas the flood risk information shock estimated in article III corresponds to independent flood damage cost functions.

The models of Equations (2) and (3) differ in their interpretation. In the spatial error model of Equation (2), the estimated coefficients are treated as in ordinary, non-spatial, least squares regressions. The spatially autocorrelated error term is left uninterpreted, as it includes the neighborhood effects of unidentifiable variables. In the spatial lag model, the dependent variable is on both sides of Equation (3), and so the estimated coefficients cannot be interpreted at face value; they contain both a pure and a spatial spillover component (Anselin 2003). LeSage (2008) and LeSage and Pace (2010) propose a multiplier method that renders the coefficients interpretable by dividing them into direct, indirect, and total spatial impacts. Assuming a unit change in a hedonic attribute, the direct impact is the effect on the price of a typical dwelling when that attribute changes in the dwelling itself. The indirect impact is the price effect when that change happens in the neighboring locations of that dwelling. The total impact is the price effect when that change happens across the study area concurrently. Spatial impacts are often used to assess the effects of policy changes in a spatially interacting system, as in article III of this thesis. Spatial impacts can also be used to assess the opposite direction, as is done in article II: if an investment is made or an externality is present, spatial impacts trace how much of the capitalization of the investment or of the impact of the externality is contained at the investment site, and how much of it spills over to neighboring locations.
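The decomposition into direct, indirect, and total impacts can be sketched numerically. The example below follows the commonly used summary measures based on the matrix (I − ρW)⁻¹β_k, where the indirect impact is computed as the total minus the direct impact. The ring-shaped layout, the row-standardized weights, and the values of ρ and β_k are assumptions for illustration, not estimates from articles II or III.

```python
import numpy as np

n = 6

# Assumed toy layout: six units on a ring, row-standardized weights.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho = 0.4       # hypothetical spatial lag coefficient
beta_k = 10.0   # hypothetical coefficient of one hedonic attribute

# Element (i, j) is the effect on the price at i of a unit change
# of the attribute at j, after all spatial feedback has played out.
S_k = np.linalg.inv(np.eye(n) - rho * W) * beta_k

direct = np.trace(S_k) / n        # average effect of a change at the dwelling itself
total = S_k.sum(axis=1).mean()    # average effect of a concurrent change everywhere
indirect = total - direct         # average spillover received from other locations
```

With row-standardized weights the total impact equals β_k / (1 − ρ), and the direct impact exceeds β_k because part of a dwelling's own change feeds back to it through its neighbors.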

Spatial econometrics shares a fundamental commonality with the urban complexity tools described in the next two sections (5.2 and 5.3). The use of a matrix that encodes neighborhood relationships between observations is, in fact, a first step in introducing an elementary spatial intelligence in the analyzed objects: geographical entities, as well as the physical or socioeconomic attributes that these entities might represent, are made aware of their location in a spatial system relative to surrounding entities. Consequently, the interaction of these entities is a main factor in the properties and behavior of the system.

5.2. Cellular automata

Cellular automata (CA) are computational tools that model the spatiotemporal evolution of complex systems. Their fundamental assumption is that the aggregate characteristics of a system are entirely the bottom-up result of local spatial interaction and spatial spillover effects (Batty 2007). For urban and regional planning, CA represent a class of computer models with the ability to both reproduce observed urban morphologies and optimize those morphologies according to planning objectives (Batty 1997). Cellular automata are founded upon the work of Turing (1952), von Neumann (1951), and von Neumann and Burks (1966) on self-reproducing phenomena, which helped understand the role of atomic units in the construction and functioning of biological and physical phenomena.

A cellular automaton is a rule-based system that changes states in discrete time. A generic CA consists of the following elements:

Cells: A set of contiguous cells arranged inside an n by k lattice,

States: The initial state of a cell and the possible states to which it may transition,

Neighborhood: A definition of neighborhood by contiguity rules,

Transition rules: A set of ‘if…, then…’ rules that determine a cell’s transition (or lack thereof) to a new state at time t+1, based on its state and that of its neighborhood at time t.

CA contain what Batty (1997: 267) calls the “generic development principle” of the evolution of spatial systems, which he describes as:

IF something happens in a cell’s neighborhood, THEN some-other-thing happens to this cell.
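The four elements and the generic development principle can be sketched as a single update step. This is a minimal illustration with binary states (1: developed, 0: undeveloped) on a hypothetical 5 by 5 lattice; the rule shown, development whenever at least one neighbor is developed, is chosen for simplicity and is not a rule taken from the dissertation's models.

```python
# Neighborhood: first-order von Neumann offsets (row, column).
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(grid, neighborhood, rule):
    """One synchronous update: states at t+1 depend only on states at t."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            developed = sum(
                grid[r + dr][c + dc]
                for dr, dc in neighborhood
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            new[r][c] = rule(grid[r][c], developed)
    return new


def grow_if_any_neighbor(state, developed_neighbors):
    """IF at least one neighbor is developed, THEN this cell becomes developed."""
    return 1 if state == 1 or developed_neighbors >= 1 else 0


# Cells and states: a 5 by 5 lattice with a single developed seed cell.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1

after_one = step(grid, VON_NEUMANN, grow_if_any_neighbor)
after_two = step(after_one, VON_NEUMANN, grow_if_any_neighbor)
```

Passing the rule and the neighborhood as arguments keeps the four elements separate, so that alternative specifications can be compared by swapping one element at a time.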

In the majority of cellular automata applications in the context of urban and regional planning, cells represent land, while the initial and possible states are understood as developed (built-up) and undeveloped (natural) land. Neighborhood is typically defined by the aforementioned von Neumann and Moore neighborhoods (Section 5.1 and Figure 3) or modifications of those, whereas several specialized transition rules are usually involved. Batty (1997, 2007) demonstrates that different combinations of transition rules and neighborhood definitions, as well as the controlled introduction of randomness in state transitions, produce spatial forms that are akin to known urban morphologies. The NetLogo language (Wilensky 1999, 2007) can be used to illustrate alternative CA specifications and the resulting morphologies. These morphologies (Figure 4) show that cellular automata are able to capture basic elements in the growth and morphological variation of real-world urban forms, including the aspect of development cycles.

Figure 4: Urban morphologies generated from alternative CA specifications (light grey tones indicate the most recent growth cycles and dark grey tones earlier ones). Left: development if only one cell in the von Neumann neighborhood is developed, 39 growth cycles. Center: development if only one cell in the Moore neighborhood is developed, 32 growth cycles. Right: development if only one cell in the Moore neighborhood is developed and the probability of development is 50%, 32 growth cycles.
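The specifications in the caption of Figure 4 can be approximated with the sketch below. It is a Python illustration written for this text, not the NetLogo code behind the figure: the single central seed cell, the lattice size, the function name `grow`, and the random seed are all assumptions. Each cell stores the growth cycle in which it was developed, mirroring the grey tones of the figure.

```python
import random

VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def grow(size, neighborhood, cycles, probability=1.0, seed=0):
    """Develop a cell when exactly one neighbor is developed.

    Returns a lattice holding the cycle in which each cell was developed
    (None for cells that remain undeveloped).
    """
    rng = random.Random(seed)
    cycle_of = [[None] * size for _ in range(size)]
    cycle_of[size // 2][size // 2] = 0          # assumed single seed cell
    for t in range(1, cycles + 1):
        newly = []
        for r in range(size):
            for c in range(size):
                if cycle_of[r][c] is not None:
                    continue
                developed = sum(
                    1 for dr, dc in neighborhood
                    if 0 <= r + dr < size and 0 <= c + dc < size
                    and cycle_of[r + dr][c + dc] is not None
                )
                if developed == 1 and rng.random() < probability:
                    newly.append((r, c))
        # Apply all transitions at once so the update is synchronous.
        for r, c in newly:
            cycle_of[r][c] = t
    return cycle_of


def count_developed(cycle_of):
    return sum(1 for row in cycle_of for v in row if v is not None)


vn_form = grow(9, VON_NEUMANN, cycles=2)
moore_form = grow(9, MOORE, cycles=1)
random_form = grow(9, MOORE, cycles=4, probability=0.5, seed=42)
```

Lowering `probability` below 1 breaks the symmetry of the deterministic forms, which is the controlled randomness referred to in the text.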

Recent advances in CA have enhanced realism and detail in modelling growth and land use transitions in real-world urban systems. The main developments have been: the inclusion of application-specific cell states; the introduction of a greater number of modelled land uses; the ability to model particular urban growth drivers and mechanisms; the inclusion of the transport network’s role; and the ability to calibrate models with empirical data (Kim and Batty 2011; Chaudhuri and Clarke 2013). Experimentation with non-binary states and fuzzy transition rules has also contributed to the flexibility and realism of modelling real-world cities. These advances, combined with the rapid increase of computational capacity during the past decades, have resulted in the increased use of CA models beyond theoretical explorations and their implementation in operational planning projects. The dissertation has used a highly developed and validated CA model for studying urbanization dynamics and adaptation strategy in Helsinki’s metropolitan region.

Figure 5 depicts two snapshots of this implementation, illustrating the aforementioned advances in CA modelling: the use of empirical data and real-world built environments; and interaction of urban growth and urban morphology with land use, the transport network, and topographical constraints.
