
7. Attitudes, Information, Preferences, and Willingness to Pay

7.2. Further Division of Respondents into Clusters Based on Attitudes

The factor analysis sheds some light on the general attitudinal dimensions that can be found in the data. However, the factors cannot be attached to single observations without further manipulation. This can be done by means of factor scores, which express how an observation is ranked with respect to a certain factor¹⁹. If the factor scores are normalized into the form of a standard normal distribution, the relative position of an observation on the factor in question is easy to see. Cluster analysis makes it possible to use the information that is inherent in the factor scores. The idea is to divide the observations in such a manner that observations with a similar factor score pattern are grouped together. More generally, cluster analysis aims to allocate a set of observations to a set of clearly identifiable groups in which observations within a group are similar to one another while observations in different groups are dissimilar (Chatfield and Collins 1980, p. 212).
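As a small illustration of how normalized scores show relative position, the sketch below rescales a set of scores to zero mean and unit standard deviation. It assumes that "normalized" means this kind of z-standardization; the function name `standardize` and the four raw values are invented for the example and do not come from the survey. Rescaling alone does not make the scores normally distributed.

```python
from statistics import fmean, pstdev

def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 (hypothetical helper)."""
    mean = fmean(scores)
    sd = pstdev(scores)  # population standard deviation
    return [(s - mean) / sd for s in scores]

# Invented raw factor scores for four respondents.
raw_scores = [2.0, 4.0, 6.0, 8.0]
z = standardize(raw_scores)

# A negative value means the respondent is below the sample average on the
# factor, a positive value means above average.
for raw, value in zip(raw_scores, z):
    print(f"raw={raw:.1f}  standardized={value:+.2f}")
```

With scores on this scale, a value of about +1 places a respondent roughly one standard deviation above the average on that factor.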

The clustering process can be conducted in various ways. It is possible to group variables instead of observations (in this respect, cluster analysis has goals quite similar to those of factor analysis). The groupings can be disjoint, hierarchical, overlapping, or fuzzy. Disjoint clusters place each object in one and only one cluster. Hierarchical clusters are organized so that one cluster may be entirely contained within another cluster, but no other kind of overlap between clusters is allowed. Overlapping clusters can be constrained to limit the number of objects that belong simultaneously to two clusters, or they can be unconstrained, allowing any degree of overlap in cluster membership. Fuzzy clusters are defined by a probability or grade of membership of each object in each cluster. They can belong to any of the cluster categories mentioned above, i.e. they can be disjoint, hierarchical, or overlapping (SAS 1987, p. 47). In this study, the cluster type chosen was the disjoint cluster because otherwise the analysis of the relationship between attitudes and willingness to pay would have become too complicated.

19 See Appendix F for a more detailed explanation of the nature of factor scores.

The cluster analysis was conducted by using the FASTCLUS procedure that is included in the SAS Statistical Software Package. The FASTCLUS procedure finds disjoint clusters of observations by means of a k-means method. This method was developed by MacQueen (1967), who suggested the term k-means for describing his algorithm that assigns each item to the cluster having the nearest centroid (mean). In its simplest version, the process is composed of three steps. First, the observations are partitioned into k initial clusters (k is defined by the researcher). Then, the algorithm proceeds through the list of observations, assigning each observation to the cluster whose centroid (mean) is the nearest (the distance is usually computed as the Euclidean distance, with either standardized or unstandardized observations). Next, the centroids are recalculated for the cluster receiving the observation and for the cluster losing it. This procedure is repeated until no more reassignments take place (Johnson and Wichern 1982, pp. 555-556). The FASTCLUS procedure does not start by partitioning all observations into k preliminary groups, but a corresponding approach is applied. A set of points called cluster seeds is selected as a first guess for the centroids (means) of the clusters. Each observation is then assigned to the nearest seed to form temporary clusters (SAS 1987, p. 494).
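The assignment and update loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FASTCLUS implementation: the function name `kmeans_from_seeds` is hypothetical, the centroids are recalculated after each full pass rather than after every single reassignment as in MacQueen's version, and the data are synthetic.

```python
import math
import random

def kmeans_from_seeds(points, seeds, max_iter=100):
    """Batch k-means that starts from given seed points (hypothetical helper)."""
    centroids = [list(s) for s in seeds]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assign each observation to the cluster with the nearest centroid
        # (Euclidean distance).
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no observation changed cluster, so the process stops
        labels = new_labels
        # Recalculate each centroid as the mean of its current members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Synthetic three-dimensional "factor scores": two well-separated groups.
random.seed(0)
low = [[random.gauss(-3.0, 0.3) for _ in range(3)] for _ in range(20)]
high = [[random.gauss(3.0, 0.3) for _ in range(3)] for _ in range(20)]
points = low + high

# One observation from each group serves as the initial cluster seed.
labels, centroids = kmeans_from_seeds(points, seeds=[points[0], points[20]])
print(labels)
```

Choosing k and the seeds is left to the caller here, which mirrors the point that the number of clusters is defined by the researcher.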

The FASTCLUS procedure does not automatically recommend a certain number of clusters, but it prints two statistical criteria, the pseudo-F (or the Calinski and Harabasz index) and the cubic clustering criterion (CCC), which can be used in the identification process of clusters. The pseudo-F value is computed as

$$\text{pseudo-}F = \frac{\operatorname{trace}(B)/(k-1)}{\operatorname{trace}(W)/(n-k)}$$

where n and k are the total number of observations and the number of clusters in the solution, respectively. The B and W terms are the between and pooled within cluster sum of squares and cross product matrices. The CCC is a product of two terms. The first term is the natural logarithm $\ln\left[\dfrac{1-E(R^2)}{1-R^2}\right]$, where $R^2$ is the proportion of variance accounted for by the clusters, and its expected value $E(R^2)$ is determined under the assumption that the data have been sampled from a uniform distribution based on a hyperbox. The second term is $\dfrac{\sqrt{np/2}}{\left[0.001+E(R^2)\right]^{1.2}}$, where p is an estimate of the dimensionality of the between cluster variation. The constant terms are chosen based on extensive simulation results (Milligan and Cooper 1985). There is no unambiguous way to interpret the pseudo-F and CCC criteria, but usually the recommendation is to look for consensus between the two statistics. The idea is to find simultaneous local peaks for both test statistics.
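The pseudo-F ratio can be illustrated with a short Python sketch. It relies on the fact that the trace of a sum of squares and cross product matrix equals the sum of squared deviations over all coordinates, so no matrix algebra is needed. The function name `pseudo_f` and the four toy observations are assumptions made for the example. The CCC is not computed here, because its expected R² term requires the hyperbox approximation that the text only summarizes.

```python
def pseudo_f(points, labels):
    """Calinski-Harabasz pseudo-F: [trace B/(k-1)] / [trace W/(n-k)]."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    dim = len(points[0])
    grand_mean = [sum(p[d] for p in points) / n for d in range(dim)]
    trace_b = 0.0  # between-cluster sum of squares
    trace_w = 0.0  # pooled within-cluster sum of squares
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        trace_b += len(members) * sum(
            (centroid[d] - grand_mean[d]) ** 2 for d in range(dim)
        )
        trace_w += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return (trace_b / (k - 1)) / (trace_w / (n - k))

# Toy data: four observations in two dimensions.
toy_points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]

good_split = pseudo_f(toy_points, [0, 0, 1, 1])  # groups follow the real gap
poor_split = pseudo_f(toy_points, [0, 1, 0, 1])  # groups cut across the gap
print(good_split, poor_split)
```

For the first labelling, trace B is 100 and trace W is 4, so the ratio is (100/1)/(4/2) = 50. The second labelling gives (4/1)/(100/2) = 0.08, which shows why larger values indicate better-separated clusters.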

When calculating the pseudo-F and the CCC, we also assume that the variables which the clustering process is based on do not correlate with each other (SAS 1993, p. 98). In this case, the noncorrelation is guaranteed because the factor scores used in the cluster analysis were created through an orthogonal factor analysis.

In Figure 7.1 the results that the test criteria produce are represented graphically. There are three local peaks, two minimums (6 and 8 clusters) and one maximum (7 clusters). In this case, the seven-cluster solution was chosen because it appeared to offer a meaningful interpretation for existing attitudes. The essential information concerning the chosen solution can be found in Table 7.2. The mean factor scores (MFS) represent the mean value of the factor scores that the factor in question has received in the cluster. Because factor scores are normalized and standardized, a positive mean value of factor scores indicates that the cluster having the positive mean value has a stronger-than-average tendency to support the views expressed in that specific factor. Similar but inverse logic can be applied to negative values. The interpretation of the cluster-related mean factor scores is the main source of inference when essential features of the clusters are analyzed further.

The clustering procedure was carried out with a reduced sample size. Five observations were removed because of insufficient factor scores. Sixteen observations were excluded because of high individual WTPs (WTP FIM 2500). This was done in order to obtain more reliable mean WTP estimates within clusters. The exclusion of observations did not influence the cluster structure in any significant way. The contents of the clusters were interpreted by using the means of normalized factor scores and the mean WTPs of each cluster. The aim was to find a plausible explanation that offers a consistent overall picture of the relationship between attitudes and WTP. In addition, a few socio-economic variables were tested in order to find possible differences between the clusters regarding gender, age, income, place of living, and education. The characterization of the clusters is presented below, but first some general remarks are made. The cluster-related mean values of socio-economic variables are presented in Table 7.3.

Figure 7.1. Relation Between the Number of Clusters and the Test Criteria.

Table 7.2. Solution with Seven Clusters.

Cluster  Frequency      %  Mean score,  Mean score,  Mean score,  Mean WTP,
                           Factor 1     Factor 2     Factor 3     FIM/year
1               53    8.2     -0.83        -1.09        -1.61        149
2               66   10.3     -1.38        -1.34         0.03        360
3              175   27.3      0.79         0.72         0.71        410
4              103   16.0      0.39         0.77        -0.57        223
5               11    1.7     -0.78         1.41        -3.57         64
6               78   12.2     -1.27         0.54        -0.47        276
7              156   24.3      0.36        -0.71         0.55        418
All            642  100.0                                            333
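As a quick consistency check on Table 7.2, the overall mean WTP can be recomputed from the cluster frequencies and cluster means. The sketch assumes that the last row of the table reports the frequency-weighted mean over all seven clusters. The variable names are chosen for the example only.

```python
# Frequencies and mean WTPs (FIM/year) for Clusters 1-7, copied from Table 7.2.
frequencies = [53, 66, 175, 103, 11, 78, 156]
mean_wtp = [149, 360, 410, 223, 64, 276, 418]

total_n = sum(frequencies)
overall_mean_wtp = sum(f * w for f, w in zip(frequencies, mean_wtp)) / total_n

print(total_n)                  # 642
print(round(overall_mean_wtp))  # 333
```

The weighted sum is 213816, and 213816 / 642 is approximately 333.05, which agrees with the figure in the last row of the table.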

Table 7.3. Cluster-Related Means of Certain Socio-Economic Variables.

Cluster  Gender,              Age,   Gross income,  Place of  Education*
         females-% / males-%  years  FIM/year       living*
1        40 / 60              43     160300         4.23      2.87
2        53 / 47              37     187600         4.68      2.82
3        55 / 45              43     164400         2.99      2.98
4        36 / 64              37     160500         2.81      3.42
5        27 / 73              28     160000         3.27      3.18
6        47 / 53              35     160800         3.03      3.13
7        63 / 37              38     154900         3.61      2.92
Average  51 / 49              39     163000         3.40      3.03

* Place of living and Education variables are presented in the form of an index. In the case of Education, the index can vary from 1 to 6, 1 representing elementary level education and 6 indicating a university degree. The index of Place of living can also vary from 1 to 6, 1 representing the center of a big city and 6 indicating a sparsely populated rural area. See Appendix B, questions 5-1 and 5-5 for a detailed description of the response alternatives.

The relationship between attitudes and WTP seems to be quite logical. For instance, the highest mean WTPs occur in the Clusters that have the highest MFSs in relation to Factor 3 (Clusters 3 and 7). This is what could be expected, because Factor 3 represented a clearly articulated appreciation of sustainable development. Correspondingly, Cluster 5 has the lowest MFS with respect to Factor 3 and the lowest WTP of all Clusters, which is consistent.

The obvious conclusion is that attitudes have some influence on the stated individual WTPs. When the mean WTP differences were compared across the clusters, some statistically significant²⁰ differences were found (Table 7.4). When socio-economic variables were tested against Clusters, it was quite surprising that there were no differences in means among Clusters regarding income and education. However, some statistically significant differences across Clusters were detected with respect to age, gender, and place of living (Table 7.4).

20 When reference is made to statistical significance, the 5% risk level is meant if not stated otherwise.

Table 7.4. The Statistically Significant (t=0.05) Differences in the Cluster-Related Means of Age, Gender, and Place of Living.

Compared clusters  Age  Gender  Place of living  WTP
1-2
1-3  *** 1  *** 3
1-4  *** 1
1-5  *** 1
1-6  *** 1  *** 1
1-7  *** (60%-37%)  *** 1  *** 7
2-3  *** 2
2-4  *** (47%-64%)  *** 2
2-5  *** 2
2-6  *** 2
2-7  *** 2
3-4  *** 3  *** (45%-64%)  *** 3
3-5  *** 3
3-6  *** 3
3-7  *** 3  *** 7
4-5
4-6
4-7  *** (64%-37%)  *** 7  *** 7
5-6
5-7  *** (73%-37%)
6-7  *** (53%-37%)  *** 7

(The statistical significance is indicated by the symbol "***". When age, place of living and WTP are concerned, the number following the symbol "***" tells which of the clusters has a higher mean. In the case of gender, the percentage numbers refer to the percentage of males in the clusters under comparison.)

The problem with pre-tax income was that many respondents did not give the information (103 out of 642, or 16%). The missing answers were replaced by the average pre-tax income, which obviously led to a uniform cluster-related income distribution. This considerably decreases the explanatory power of income and partly explains why no differences in income across clusters were detected. Another explanation is that attitudes do not depend on financial matters. Nevertheless, economic theory presupposes that there is a positive correlation between income and WTP. However, in this data the correlation coefficient between income and WTP is extremely low, only 0.017, even when the respondents who did not reveal their income are removed. This means that it is not appropriate to draw elaborate conclusions based on the income data.
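The flattening effect of replacing missing incomes with the overall average can be shown with a small numerical sketch. All figures below are invented for the illustration and are not survey data. The function name `impute_with_overall_mean` and the two groups "A" and "B" are likewise assumptions.

```python
from statistics import fmean

def impute_with_overall_mean(groups):
    """Replace each missing value (None) with the mean of all observed values."""
    observed = [v for values in groups.values() for v in values if v is not None]
    overall = fmean(observed)
    return {
        name: [overall if v is None else v for v in values]
        for name, values in groups.items()
    }

# Invented incomes for two groups; two of five respondents in each group
# did not report their income.
groups = {
    "A": [100, 110, 120, None, None],
    "B": [180, 190, 200, None, None],
}
imputed = impute_with_overall_mean(groups)

observed_a = [v for v in groups["A"] if v is not None]
observed_b = [v for v in groups["B"] if v is not None]
gap_observed = fmean(observed_b) - fmean(observed_a)
gap_imputed = fmean(imputed["B"]) - fmean(imputed["A"])

print(gap_observed, gap_imputed)  # 80.0 48.0
```

In this example the difference between the group means shrinks from 80 to 48, that is, in proportion to the share of respondents who actually reported their income. The same mechanism pulls cluster-related income means towards each other when the imputed value is the overall average.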

If the respondents' mean WTP is cross-tabulated against education, some observations can be made. If those who have the highest level of education (a university degree) are compared to those who have the lowest level of education, there is a statistically significant difference in means, FIM 550 (N=52) versus FIM 237 (N=162), showing that people who have a university degree are ready to pay a higher amount. Although a clear linear correlation between education and WTP cannot be found, the considerable difference between the least and most educated groups suggests that education might have some influence on WTP. However, because no statistically significant differences with respect to education can be detected between the attitude-based clusters, it may be safe to conclude that education is not a decisive factor when an individual takes a position on issues related to the environment and agriculture or their interaction.

The verbal interpretation of Clusters:

Cluster 1: People who prefer conventional agricultural practices. At first glance this cluster seems to be somewhat difficult to interpret because all the MFSs are negative, meaning that people in this group do not really think that current agricultural production practices are environmentally harmful or that farming is a burden to taxpayers. In addition, people in this group neglect sustainable development, which is also reflected in a relatively low mean WTP. In further tests, it appeared that 26 out of 53 members of this group were farmers, when random selection would have produced only 2 or 3 farmers. Thus, the possible explanation is that people belonging to this group are professionally related to agriculture and mainly for this reason have a positive attitude towards agriculture. This is why the sustainability factor scores so low: these people do not see any need for change. In their opinion, the current way of farming is both socially and environmentally the most desirable production alternative. This is understandable because they also defend their economic interests: they have invested in a certain production technology and want to receive a decent return on their investment. This group has male dominance, although the difference is statistically significant only when compared to Cluster 7, which has the largest relative share of women. Moreover, because of the dominance of farmers, people belonging to this group most likely live in the countryside. The difference in means²¹ is statistically significant compared to Clusters 3, 4, 6, and 7, the members of which are more likely to be city-dwellers.

Cluster 2: People who have a positive attitude towards agriculture in general. This group resembles Cluster 1 with respect to the first two factors. According to the MFS of Factor 1, members of Cluster 2 are even less critical towards environmental problems caused by conventional farming than members of Cluster 1. However, members of Cluster 2 are more concerned about sustainable development and state higher mean WTPs (although the difference in means is not statistically significant), probably as a consequence of their concern. Despite their appreciation of farming, the members of this group do not oppose changes in production practices as strongly as the members of Cluster 1. The members of this Cluster have a positive attitude towards agriculture in general and are even to some extent environmentally conscious, even though this consciousness is at the average level (the MFS of Factor 3 is very close to zero, 0.03). This group has female dominance, although the difference is statistically significant only when compared to Cluster 4, which has the second largest relative share of men. We can say that Cluster 2 is a female equivalent of Cluster 1. Consequently, again, people belonging to this group most likely live in the countryside. The difference in means is statistically significant compared to all other Clusters (except Cluster 1, of course). We can conclude that an increase in the relative number of women shifts the attitudinal emphasis in a less conservative direction when environmental issues are in question. It is also interesting that the average gross income is the highest in this group. However, income differences across Clusters are not statistically significant, as already indicated.

Cluster 3: People who demand a change in current farming practices and are