
Estimation of the Pukthuanthong & Roll integration measure

In this study, stock market integration is measured using the method developed by Kuntara Pukthuanthong & Richard Roll (2009). In this method, the returns of a single stock market are regressed on factors estimated by principal component analysis. The proportion of variance explained by the common factors (the coefficient of determination, $R^2$) is the measure of integration. Because the level of integration and the volatility of stock returns change over time, the regression models are estimated using overlapping moving-window (or “rolling”) regressions. Lagged terms of the factors can also be included in the model if considered necessary.

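As discussed in the previous chapter, the data of this study consist of logarithmic stock returns from 12 Eurozone stock indices. When the return of stock market index $i = 1, \dots, N$ at time $t$ is denoted by $r_{i,t}$, $f_{k,t}$ is the value of principal component (factor) $k = 1, \dots, K$ at time $t$, and the time window used is $T$ observations, the estimated models can be presented in matrix form as follows:

$$\mathbf{r}_{i,t} = \mathbf{F}_t \boldsymbol{\beta}_{i,t} + \boldsymbol{\varepsilon}_{i,t} \qquad (1)$$

where $\mathbf{r}_{i,t} = (r_{i,t-T+1}, \dots, r_{i,t})'$ is the vector of returns in the window ending at time $t$, $\boldsymbol{\beta}_{i,t}$ is the vector of regression coefficients, $\boldsymbol{\varepsilon}_{i,t}$ is the vector of residuals, and

$$\mathbf{F}_t = \begin{pmatrix}
f_{1,t-T+1} & f_{2,t-T+1} & \cdots & f_{K,t-T+1} \\
f_{1,t-T+2} & f_{2,t-T+2} & \cdots & f_{K,t-T+2} \\
\vdots & \vdots & \ddots & \vdots \\
f_{1,t} & f_{2,t} & \cdots & f_{K,t}
\end{pmatrix}$$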

The estimation is conducted using OLS and, as mentioned, the measure of integration is the $R^2$ of the regression model. The smaller the squared residuals are, the greater the degree of integration, because $R^2 = 1 - SSR/SST$, where $SSR$ is the residual sum of squares of the regression and $SST$ is the model's total sum of squares. Often, the adjusted $\bar{R}^2 = 1 - (1 - R^2)\frac{T-1}{T-K-1}$, where $T$ is the number of observations and $K$ the number of explanatory variables, is used, because it penalizes adding variables that do not actually improve the fit of the model. In this study, the adjusted $\bar{R}^2$ is used as the measure of integration.
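The rolling estimation of the integration measure can be illustrated with a minimal sketch. It is not the code used in this study: the function names `adjusted_r2` and `rolling_integration`, the inclusion of an intercept, and the default window of 200 observations are assumptions made for the example.

```python
import numpy as np

def adjusted_r2(y, factors):
    """Adjusted R^2 of an OLS regression of y on the factors.

    An intercept is added to the design matrix (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(factors, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    T, K = X.shape
    design = np.column_stack([np.ones(T), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ssr = float(resid @ resid)                   # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())     # total sum of squares
    r2 = 1.0 - ssr / sst
    return 1.0 - (1.0 - r2) * (T - 1) / (T - K - 1)

def rolling_integration(y, factors, window=200):
    """Adjusted R^2 for every overlapping window of `window` observations.

    Entry j of the result uses observations j, ..., j + window - 1.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(factors, dtype=float)
    return np.array([
        adjusted_r2(y[s:s + window], X[s:s + window])
        for s in range(len(y) - window + 1)
    ])
```

With a return series of 300 observations and a 200-observation window, `rolling_integration` returns 101 values, one per window.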

The factors used are estimated using principal components. In principal component analysis, the variation of a group of correlated variables $x_1, x_2, \dots, x_p$ can be presented using a new group of uncorrelated variables $y_1, y_2, \dots, y_p$.

Each principal component is estimated as a linear combination of the original variables by selecting the coefficients $a_{ij}$ that explain the largest possible proportion of the variation of the original variables:

$$\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p \\
y_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p \\
&\;\;\vdots \\
y_p &= a_{p1}x_1 + a_{p2}x_2 + \cdots + a_{pp}x_p
\end{aligned} \qquad (2)$$

The principal components are constructed so that each one is uncorrelated with the others. The first principal component is estimated to explain as much of the variance of the original variables as possible, and each subsequent component explains as much of the remaining variance as possible. Because the variances could be made arbitrarily large by setting infinitely large weights, the sum of the squared weights $a_{ij}$ of each principal component $i$ over the $p$ original variables is constrained to 1:

$$\sum_{j=1}^{p} a_{ij}^2 = 1, \qquad i = 1, \dots, p \qquad (3)$$
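Why this constrained maximization leads to eigenvectors can be seen from a short derivation for the first component. The sketch below assumes the normalization is on the squared weights and uses symbols introduced only for this illustration: $\mathbf{S}$ for the variance-covariance (or correlation) matrix of the original variables, $\mathbf{a}_1$ for the weight vector of the first component, and $\lambda_1$ for the Lagrange multiplier.

$$\max_{\mathbf{a}_1} \; \mathbf{a}_1'\mathbf{S}\mathbf{a}_1 \quad \text{subject to} \quad \mathbf{a}_1'\mathbf{a}_1 = 1$$

$$\mathcal{L} = \mathbf{a}_1'\mathbf{S}\mathbf{a}_1 - \lambda_1(\mathbf{a}_1'\mathbf{a}_1 - 1), \qquad \frac{\partial \mathcal{L}}{\partial \mathbf{a}_1} = 2\mathbf{S}\mathbf{a}_1 - 2\lambda_1\mathbf{a}_1 = \mathbf{0} \;\Rightarrow\; \mathbf{S}\mathbf{a}_1 = \lambda_1\mathbf{a}_1$$

The solution is therefore an eigenvector of $\mathbf{S}$, and the variance of the component, $\mathbf{a}_1'\mathbf{S}\mathbf{a}_1 = \lambda_1$, is largest when $\lambda_1$ is the largest eigenvalue.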

The maximum number of components estimated is equal to the number of original variables, in which case the components explain 100% of the variation of the original variables. The objective is to reduce the number of variables by explaining a large proportion of the variation with as few components as needed.

Equation (2) can be represented in matrix form:

$$\mathbf{y} = \mathbf{A}\mathbf{x} \qquad (4)$$

where $\mathbf{y}$ is the vector of principal components, $\mathbf{x}$ is the matrix of original variables and $\mathbf{A}$ is a matrix of $p$ rows, where row $i$ is a vector $\mathbf{a}_i' = (a_{i1}, \dots, a_{ip})$, $i = 1, \dots, p$, containing the principal component weights. These vectors are the eigenvectors of the matrix $\mathbf{S}$, the variance-covariance matrix (or correlation matrix) of the original variables, and the single cells (weights) are the principal component loadings.

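After the principal components have been estimated, they can be used in statistical analyses instead of the original variables. The principal component scores are computed by multiplying the original variables (standardized by their mean and standard deviation) by the principal component loadings (eigenvectors):

$$\mathbf{f} = \mathbf{A}\mathbf{z} \qquad (5)$$

where $\mathbf{f}$ is the vector of principal component scores, $\mathbf{z}$ is the vector of standardized original variables and $\mathbf{A}$ is the matrix whose rows are the eigenvectors.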
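The computation of loadings and scores described above can be sketched as follows. This is an illustration rather than the implementation used in this study: the function name `pca_scores` is hypothetical, and the sketch assumes that the loadings are taken from the correlation matrix and that the data are arranged with observations in rows and variables in columns.

```python
import numpy as np

def pca_scores(X):
    """Principal component scores, loadings and variance shares.

    X has one row per observation and one column per original variable.
    Loadings are eigenvectors of the correlation matrix (an assumption of
    this sketch; the covariance matrix could be used instead).
    """
    X = np.asarray(X, dtype=float)
    # Standardize each variable by its mean and standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals = eigvals[order]
    A = eigvecs[:, order].T                  # row i holds the weights of component i
    scores = Z @ A.T                         # column i holds the scores of component i
    explained = eigvals / eigvals.sum()      # share of variance per component
    return scores, A, explained
```

Each row of `A` has unit length, which corresponds to the normalization of the weights, and the columns of `scores` are mutually uncorrelated.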

In this study, the risk factors (principal components) are estimated using the log-returns from the same sample of countries that is used when estimating the integration measures (however, due to bias corrections, the number of countries used in the actual estimation of the risk factors is 11 instead of 12; for more information, see the end of this chapter). When $r_{i,t}$ denotes the logarithmic return of stock market index $i = 1, \dots, N$ at time $t$, and the length of the time window is $T$ observations, the estimated models for the risk factors can be represented in matrix notation as follows (the notation is otherwise as in Equation 4, but with time indices $t$):

$$\mathbf{Y}_t = \mathbf{A}_t \mathbf{X}_t \qquad (6)$$

where

$$\mathbf{X}_t = \begin{pmatrix}
r_{1,t-T+1} & r_{1,t-T+2} & \cdots & r_{1,t} \\
r_{2,t-T+1} & r_{2,t-T+2} & \cdots & r_{2,t} \\
\vdots & \vdots & \ddots & \vdots \\
r_{N,t-T+1} & r_{N,t-T+2} & \cdots & r_{N,t}
\end{pmatrix}$$

There is no simple rule of thumb for how many principal components should be included in the regression models. The proportion of variance that needs to be explained depends on the research question. In the further analyses, 8 principal components are used. On average, they explain almost 95% of the variation of the original variables. This choice is discussed more thoroughly in Chapter 4.2.

The length of the time window used can also affect the variance explained by the principal components. Figure 1 presents the average cumulative variance explained by 1…12 principal components.

FIGURE 1 The average cumulative ratio of variance explained using 1-12 risk factors and 100, 200 and 300 day time windows (vertical axis: cumulative ratio of variance explained, 0 to 1; horizontal axis: number of risk factors, 1 to 12)

In this study, the risk factors have been estimated using a 200-day estimation window. As a robustness check, 100-day and 300-day windows were also used, but the choice of window does not seem to have any notable effect on the variance explained on average (the sample average of the cumulative variance explained using 1…12 risk factors during the period of this study).

However, the length of the estimation window does have an effect on the integration measures estimated using moving-window regressions (see Chapter 4.1.1).
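The quantities discussed above, the cumulative share of variance explained and the number of components needed to reach a given share, can be computed along the following lines. This is a minimal sketch under stated assumptions: the function names `n_components_for` and `average_cumulative_share` are hypothetical, and the eigenvalues are taken from the correlation matrix of each window.

```python
import numpy as np

def n_components_for(X, threshold=0.95):
    """Smallest number of components whose cumulative variance share
    reaches `threshold`, together with the cumulative shares themselves."""
    X = np.asarray(X, dtype=float)
    # eigvalsh returns eigenvalues in ascending order; reverse them.
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(cum)), cum

def average_cumulative_share(X, window=200):
    """Average of the cumulative variance shares over all rolling windows."""
    X = np.asarray(X, dtype=float)
    shares = [
        n_components_for(X[s:s + window])[1]
        for s in range(X.shape[0] - window + 1)
    ]
    return np.mean(shares, axis=0)
```

Calling `average_cumulative_share` with windows of 100, 200 and 300 observations gives one curve per window length, which is the kind of comparison summarized in Figure 1.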

Besides choosing the optimal estimation window, some other considerations must also be taken into account when estimating the Pukthuanthong & Roll integration measure; otherwise the estimated integration could be seriously biased. The first potential source of bias is national holidays. Different dates for national holidays (in the Datastream indices used, the value of the last trading day is recorded for these holidays) or very small trading volume for some stock indices (so-called “thin trading” or stale prices) can lead to asynchronicity in stock returns across the countries under study.

In this study, the problem of national holidays is corrected by including only those returns for which the distance to both the previous and the next trading day is 1 day (Tuesday, Wednesday, Thursday), the distance to the previous trading day is 1 and to the next is 3 (Friday), or the distance to the previous trading day is 3 and to the next is 1 (Monday), and by excluding the returns where the index value is the same as on the previous day (holidays). The problem of “thin trading” is likely to be smaller for the data of this study, which consist of 12 relatively developed Eurozone stock markets, than for example in studies of very small and underdeveloped stock markets. However, as an attempt to correct the problem, one lag of every factor is included when estimating the integration of a stock market.
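The holiday filter described above can be sketched as a simple mask over a dated index series. The sketch is an interpretation, not the procedure actually coded for this study: the name `keep_mask` is hypothetical, and it assumes that distances are measured in calendar days between consecutive observations of the series.

```python
from datetime import date

# Allowed (days since previous observation, days until next observation):
# Tuesday-Thursday, Friday, Monday.
ALLOWED_GAPS = {(1, 1), (1, 3), (3, 1)}

def keep_mask(dates, levels):
    """True where the return at position t passes the holiday filter.

    dates  : sorted list of datetime.date objects
    levels : index values on those dates
    The first and last observations are dropped because one of the two
    distances cannot be computed for them.
    """
    n = len(dates)
    keep = [False] * n
    for t in range(1, n - 1):
        gap_prev = (dates[t] - dates[t - 1]).days
        gap_next = (dates[t + 1] - dates[t]).days
        stale = levels[t] == levels[t - 1]    # unchanged value, treated as a holiday
        keep[t] = (gap_prev, gap_next) in ALLOWED_GAPS and not stale
    return keep
```

For example, in a full week from Monday to the following Tuesday in which the Wednesday value repeats the Tuesday value, the Wednesday return is excluded as stale while the Tuesday, Thursday, Friday and second Monday returns are kept.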

Different closing times of stock exchanges (mainly due to time zone differences) are also a potential source of bias. The stock market that closes last can react to information that the markets already closed can absorb only the next morning, when they open again.

To remedy this potential bias, Pukthuanthong & Roll (2009) suggest including the lagged return of the stock market that closes last. However, different closing times are likely to be a smaller problem for the data of this study, which consist only of European stock indices, than they would be if, for example, North American countries were included. For this reason, no correction for the different closing times is made.

In addition to the potential biases caused by the data, the estimation technique itself, which combines principal component analysis and moving-window regressions, can render the integration measures seriously biased. If the risk factors and the integration measures are estimated using the same data, the integration measures can be biased upward. In this study, an attempt has been made to remedy this potentially serious bias by omitting the dependent variable of each integration regression from the data used to estimate the risk factors. For example, when estimating the integration measures for Germany, the returns of the remaining 11 stock indices were used in the estimation of the risk factors. As an additional precautionary measure, the weights from the previous day were used when computing the risk factors: the principal component scores were computed by multiplying the stock returns by the factor loadings estimated for the previous day.
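The two precautions described above, leaving out the dependent market and using the previous day's loadings, can be sketched as follows. This is an illustration under several assumptions that are not stated in the text: the function name `out_of_sample_factors` is hypothetical, the returns are standardized with the mean and standard deviation of the window ending on the previous day, the loadings come from the correlation matrix, and the arbitrary sign of each eigenvector is fixed by requiring a non-negative sum of weights.

```python
import numpy as np

def out_of_sample_factors(returns, target, window=200, n_factors=8):
    """Risk factors for one market, estimated without that market.

    returns : array with one row per day and one column per market
    target  : column index of the market whose integration is measured
    The scores for day t use loadings estimated on the `window` days
    ending on day t - 1. Rows before `window` are left as NaN.
    """
    R = np.asarray(returns, dtype=float)
    others = np.delete(R, target, axis=1)        # drop the dependent market
    T = others.shape[0]
    factors = np.full((T, n_factors), np.nan)
    for t in range(window, T):
        past = others[t - window:t]              # window ending the previous day
        mu = past.mean(axis=0)
        sd = past.std(axis=0, ddof=1)
        eigvals, eigvecs = np.linalg.eigh(np.corrcoef(past, rowvar=False))
        top = eigvecs[:, np.argsort(eigvals)[::-1][:n_factors]]
        # Eigenvector signs are arbitrary; fix them so scores are comparable over time.
        top = top * np.where(top.sum(axis=0) < 0, -1.0, 1.0)
        factors[t] = ((others[t] - mu) / sd) @ top
    return factors
```

Because the target column is removed before any estimation, changing the returns of the target market leaves the factors unchanged, which is the property the correction is meant to secure.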

In addition to the biases already discussed, volatility could also prove to be problematic when estimating the integration measures. Volatility strongly affects the $R^2$ values used as integration measures. The measure could change due to volatility even if the level of integration were in fact constant.

Using moving-window regressions remedies this bias to a degree, but it does not remove it entirely. In this study, no specific corrections for volatility are made, as volatility is used as a determinant of integration in the panel models.

To assess the degree of this volatility-induced bias, estimations are conducted in Chapter 4.1.1 in which volatility indices (VSTOXX and VIX) are included as the first factors when estimating the integration measures.

Overall, the effect is not large. However, volatility seems to have more effect on the integration measures of the least integrated Eurozone countries, such as Greece, than on those of the most integrated, such as France.

It can be concluded that estimating risk factors with principal component analysis is a little more laborious than using a regional stock index like EUROSTOXX. However, the major advantage of the former method is that it captures only the variation that is common to all countries under study. Using this methodology, it is also possible to analyze how many factors are required to explain the variation of returns in Eurozone stock markets, and how the number of factors required has changed during the EMU era. If a small number of factors can sufficiently explain this variation, the risk exposure of Eurozone stock markets is quite similar. However, if a larger number of factors is needed, the exposure is more heterogeneous.

In the next chapter, the panel models used in the panel regressions for the determinants of integration are described.