Chapter 2 Research History and Pottery Studies

2.7 Statistical Approach to Pottery

Archaeological inference depends on the supposition that the material culture of ancient people reflects something inherent in their culture and life. The patterns detected in the material can be interpreted in terms of culture: the similarities as well as the differences between artifacts systematically reflect something of the culture that produced and used them (see e.g. Rice 1987: 274–275). What are now archaeological materials were once part of a living culture, and inferences about this culture should be obtainable through them. However, the ancient population is reached only by inference. The remains are partial at any site because of the processes by which the ancient population formed the archaeological record (discard patterns etc.), as well as everything that afterwards affected the deposited material (such as scavenging) before the modern excavation. Links between the materials collected into data sets and past human activities have been called middle range theories (Johnson 1999: 48–61). The basic idea is that information independent of the archaeological observations may be used to falsify or validate the interpretation of archaeological materials (Binford 1980: 13–19; Yu et al. 2015: 3). It is important that the reasoning and analogies are explicitly stated so that they can be evaluated.

The human mind is inclined to see patterns even where there are none, and to be blind to unexpected phenomena when one is focusing on something else or when the phenomenon is not a dramatic one (Kahneman 2011). Both limitations can be compensated for by using statistics. The impression one has of an existing pattern can be tested with statistical tools that may expose a difference between two things (such as decoration on jars being less common in a later phase than in the previous one), or show whether a co-occurrence of some features is due to chance rather than reflecting a real difference. Statistical testing allows one to determine the statistical significance of any measurable difference or co-occurrence. At the same time, statistical tools enable one to detect small-scale variations that would easily go unnoticed if not expected. The use of statistical tools for classification is essentially heuristic, as are the traditional, more intuitive ways of typological work.
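As an illustration of the kind of test meant here, the following Python sketch applies a Pearson chi-square test to a 2x2 table of decorated and undecorated jars in two phases. The counts are invented for the example and do not come from any excavated assemblage, the function name chi_square_2x2 is hypothetical, and the chi-square test is only one of several tests that could be used for such a comparison.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Table layout (an assumption made for this sketch):
        row 1 = earlier phase: a decorated, b undecorated
        row 2 = later phase:   c decorated, d undecorated
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, invented for illustration only.
earlier = (30, 70)  # decorated, undecorated jars in the earlier phase
later = (15, 85)    # decorated, undecorated jars in the later phase
chi2, p = chi_square_2x2(*earlier, *later)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A small p-value would suggest that the drop in decoration is unlikely to be due to chance alone; whether it is archaeologically meaningful remains a matter of interpretation.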

Another benefit is the consistent treatment of the material. It is easy to check how observed features and typological assignations fit together, that is, whether the patterns are consistent. The result can be used in confirming the typological groups or, if the patterning is weak, in exposing a reason to reconsider them.

The most commonly used statistical tools in pottery studies in Israel-Palestine are descriptive statistics. The descriptive summary methods include frequency tables of pottery types according to areas and phases (e.g. Arie 2006, Panitz-Cohen 2009, section 5.2) and/or graphic presentation of the same frequency data in trend lines (Hunt 1987), bar-charts (e.g. Bunimowitz & Finkelstein 1993; Mazar & Panitz-Cohen 2001; Zarzecki-Peleg et al. 2005), or pie-charts (Gal & Alexandre 2000; Liebowitz 2003), along with the tables. The frequency information is then used to indicate differences between settlement layers, between areas within a site, or between sites. This summarizing information on the pottery assemblage is indeed useful when assemblages from different sites are compared and local developments in pottery tradition are explored. In most cases it is also necessary, because the vast amount of material would be incomprehensible unless it were organized through typology combined with frequency data. The descriptive methods are also the requisite first step for more advanced analyses.
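A frequency table of this kind can be sketched in a few lines of Python. The records, phase labels, and the function name frequency_table below are hypothetical and serve only to show the tabulation of counts and percentages per phase; they do not reproduce any published table.

```python
from collections import Counter


def frequency_table(records):
    """Cross-tabulate vessel types by phase.

    records: iterable of (phase, vessel_type) pairs, one per diagnostic shard.
    Returns {phase: {vessel_type: (count, percent_of_phase)}}.
    """
    records = list(records)
    counts = Counter(records)
    phases = sorted({phase for phase, _ in records})
    types = sorted({vessel_type for _, vessel_type in records})
    table = {}
    for phase in phases:
        total = sum(counts[(phase, t)] for t in types)
        table[phase] = {
            t: (counts[(phase, t)], round(100 * counts[(phase, t)] / total, 1))
            for t in types
        }
    return table


# Hypothetical records, invented for illustration only.
records = (
    [("Phase A", "bowl")] * 12
    + [("Phase A", "cooking pot")] * 6
    + [("Phase A", "jar")] * 2
    + [("Phase B", "bowl")] * 5
    + [("Phase B", "cooking pot")] * 10
    + [("Phase B", "jar")] * 5
)
table = frequency_table(records)
for phase, row in table.items():
    print(phase, row)
```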

Type and feature frequencies and distributions of measured features like length or thickness are needed for more complicated statistical analyses. However, such more detailed analyses are usually lacking in pottery reports, even though interactions between the different observed details might be highly interesting. This is a gap I wish to fill for the Tel Kinrot material, by testing the significance of some differences as well as looking for associations between features like tempering and vessel form.

The summary descriptive statistics do not necessarily aim at generalizations. As long as one is only describing the features within the assemblage at hand, there are no strict demands on the collection of data and its representativeness for a certain, defined population. The moment one aims at defining a relationship or difference within the “real” world that once existed, and from which the materials derive, one is faced with higher standards of data retrieval. This is because the analysis, in order to be trustworthy (i.e. reliable and valid), has to be based on material that can be considered representative of the defined original population (see below). It is good to remember that the frequency tables of vessel classes and types depend on the typology according to which the material was sorted. Variation within the defined types is rarely discussed in any detail in pottery reports in Israel-Palestine. In the pottery typology (section 5.2) I have included tables, as well as pie-charts, of the frequency of the types I have created and identified in the material. In addition, I have presented some information on the variability of rim forms and diameters within the vessel types (in histograms and box-plots). I also tested whether there were statistically significant differences in certain features, like rim diameter, between vessels of the same type from different stratigraphic layers (section 5.3). The results indicated that there was a real difference only in some vessel types (cooking ware), and only between some of the stratigraphic phases. Such results are rather easy to explain for a continuous settlement with a relatively short time span.
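To show what testing such a difference can look like in practice, the sketch below compares rim diameters of one vessel type in two strata with a two-sided permutation test on the difference in means. This is an assumed stand-in technique chosen because it needs few distributional assumptions; it is not necessarily the test applied in section 5.3. The diameters are invented, and the function name permutation_test is hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed_difference, p_value). The seed is fixed only to make
    the sketch reproducible.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical rim diameters in cm, invented for illustration only.
stratum_lower = [22, 24, 23, 25, 26, 24, 23, 25]
stratum_upper = [27, 29, 28, 26, 30, 28, 27, 29]
observed, p_value = permutation_test(stratum_lower, stratum_upper)
print(f"difference in means = {observed} cm, p = {p_value:.4f}")
```

The test asks how often a random reshuffling of the stratum labels produces a difference at least as large as the observed one; a low proportion indicates that the observed difference is unlikely to arise by chance.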

It is nowadays customary to use computer registration of pottery finds (e.g. Gal & Alexandre 2000). However, readers are rarely given details of the registration: what kinds of features have been recorded and how they have been measured, not to speak of the actual data, including the measurements of all recorded items. The last mentioned would in most cases make a vast and hardly readable catalogue (like the appendices in Schmidt 2013). However, providing the information as a table on the web site of an excavation would be feasible, and it could then be used by all interested. There are also some excavations in Israel that have opened up the registration process and the details that were selected to be measured: the projects at Tell Qasile (Mazar 1985), the Yoqneʿam Regional Project (Hunt 1987), and Timnah (Mazar & Panitz-Cohen 2001), as well as a set of 356 vessels from Tell es-Safi (Zweig 2012: 432–433). Two projects led by Pfälzner in Northern Syria present the registered variables in detail as well (Pfälzner 1995: 10–12; Schmidt 2013: 7–22). Many reports leave such information out, such as the reports on the excavations at Tel Yin’am (Liebowitz 2003), Tel Miqne-Ekron (Meehl et al. 2006), and Tell Abu al-Kharaz (Fischer 2013), as well as the series on the renewed excavations at Tel Beth Shean (Mazar 2006; ed. by Mazar & Mullins 2007; ed. by Panitz-Cohen & Mazar 2009), Megiddo (ed. by Finkelstein et al. 2006; ed. by Finkelstein et al. 2013), and Tel Dor (ed. by Stern 1995).

Surface treatments (slip, burnish, and decoration) were recorded separately in the published systems of Tell Qasile (Mazar 1985: 22–24) and the Yoqneʿam Regional Project (Hunt 1987: 220–223). Such features must also have been recorded in the renewed Megiddo projects, at least for the Iron Age material discussed by Arie (2006, 2013a, 2013b): even though the recorded details were not given, the discussion and tables concerning the decoration techniques indicate that the surface treatments were recorded in detail. Details concerning the clay (color, inclusions) were recorded at Tell Qasile (Mazar 1985: 21), at Tell Šēh Ḥamad (Pfälzner 1995: 10–12), and at Tall Mozan (Schmidt 2013: 10–11). While the descriptions of pottery types usually include some discussion of vessel sizes, it is surprising how often size-related features seem to be absent from recording sheets. Size measurements have been included in the published recording systems for material from Tell Šēh Ḥamad (Pfälzner 1995: 10–12) and Tall Mozan (Schmidt 2013: 20–21), and in the study of the Tell Zarʿa pottery (Dijkstra et al. 2009: 58). In order to give a reader unfamiliar with statistical work a glimpse of the practical work, a screenshot of the Tel Kinrot recording sheet appears in Fig. 2.4, and the photo in Fig. 2.5 shows shards that I consider typical of the kind of material I have analyzed. All measured features and the methods of measuring them are explained in detail in section 5.3.

While size-related features are not commonly reported among the recorded details, they are inherently taken into account in the typologies, as vessel sizes are built into the typological classifications (as in section 5.2). The volumes of many well-preserved vessels have been published in several reports from the 21st century (e.g. Panitz-Cohen 2009, Martin 2013 and Fischer 2013). Volumes can be calculated when the vessel profile can be reconstructed. In this respect, the fragmentary nature of most archaeological assemblages restricts the number of vessels for which the volume can be calculated. Since the rim diameter or circumference is a far more readily available feature, it would be desirable to establish its relation to other size-related features. Such a trial has been made for the Iron Age cooking pots at Tell Zarʿa, where a stable relationship between rim circumference and original surface was found (Dijkstra et al. 2009: 58–61). However, in order to be reliable, such a study requires that the vessel form is approximately constant, or that the changes are known and follow a pattern that can be identified and given some approximation.
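One common way of calculating a volume from a reconstructed profile is to treat the interior as a solid of revolution and sum thin conical frustums between measured points. The Python sketch below assumes this approach; it is not taken from any of the cited reports, and the function name vessel_volume and the profile values are hypothetical.

```python
import math


def vessel_volume(profile):
    """Approximate vessel capacity from an interior profile.

    profile: list of (height_cm, inner_radius_cm) points ordered from base to rim.
    Each pair of neighbouring points is treated as a conical frustum.
    Returns the volume in litres (1 litre = 1000 cubic cm).
    """
    volume_cm3 = 0.0
    for (h0, r0), (h1, r1) in zip(profile, profile[1:]):
        if h1 <= h0:
            raise ValueError("heights must increase from base to rim")
        # Frustum volume: pi * h / 3 * (r0^2 + r0*r1 + r1^2)
        volume_cm3 += math.pi * (h1 - h0) * (r0 * r0 + r0 * r1 + r1 * r1) / 3
    return volume_cm3 / 1000


# Hypothetical interior profile of a small pot, invented for illustration only.
profile = [(0, 4.0), (3, 8.0), (8, 10.0), (12, 9.0), (14, 7.0)]
print(f"estimated capacity: {vessel_volume(profile):.2f} litres")
```

The more points the profile contains, the closer the frustum sum follows a curved wall; a fragment that preserves only the rim gives no basis for such a calculation.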

Some concepts have a somewhat more restricted meaning in statistical use than in common language. Such concepts include population, representativeness, bias, reliability, and validity. In statistics, population means the totality of the items of interest that can be studied, like all Finnish schoolchildren, or all European sea bass in the Mediterranean Sea. The trustworthiness of the collected data is related to its representativeness. It is important to be clear about what the target population is, for which the sample should be representative. Ideally, the sample is drawn from the population so that each item in the population has an equal chance of being included in the sample, regardless of its properties (except for the property of belonging to the target population). However, the target population is commonly slightly different from the sampled population, from which the actual sample is drawn (e.g. not all schoolchildren were included in the lists that were used for sampling), and the sampling often does not fulfill the requirement of randomness because of missing observations (e.g. not all the schoolchildren who were sampled randomly from the lists were actually reached). What is missing tends not to be random, and therefore the samples that we have as our data include some bias.

As for the Tel Kinrot ceramics, the target population could be all the pottery vessels produced and used at the site during its settlement (in the Iron Age), while the sampled population would be all the pottery that was preserved through the centuries and is in principle available for study through excavation. The sample would be the pottery actually excavated (including ceramics deriving from periods before our target, the Iron Age). This sample is inevitably a biased one, because some ceramic items are more prone to disappear on account of their ware and post-depositional conditions (for example, brittle shards in a trampled location). Some items are more easily missed or overlooked during the excavation (like small and/or worn shards from mixed deposits), while vessels in certain kinds of contexts (like sealed floor assemblages) are better preserved, identified, and studied. Such factors cannot be controlled by the researcher. The retrieval strategy then yields a sample of a (biased) sample. At Tel Kinrot, the pottery from the areas excavated with an intensive retrieval strategy can be considered a representative sample of all excavated pottery, because keeping and studying all the rim shards should bear a relationship to all excavated ceramics. Even though the different vessel types, when broken, will produce different numbers of rims, the discrepancies between the vessel types can be estimated. However, it would require a thorough study of possible processes related to discard and of the various later formation processes at the site to arrive at some estimate of their relation to the pottery originally produced and used. This latter enterprise lies beyond the scope of this study.

There are two central concepts relating to measuring different features of observed items: reliability and validity. Reliability means that the measurements are correct: accurate and constant. If (when) there are errors in the measurement, they are not systematically biased and therefore sum to zero and vary in a random way. This is a prerequisite for the calculated values of statistics like means or standard deviations to be still correct, even though there are some errors in the data (which is usually inevitable). Validity means that the measured feature is really indicative of the phenomenon of interest. Validity always relates to knowledge of the phenomenon studied and needs to be assessed by criteria arising from the subject matter.
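The difference between random and systematic measurement error can be made concrete with a small simulation. In the Python sketch below all values are simulated: the thickness range, the error size of 0.05 cm, and the variable names are assumptions made for the example, not properties of the Tel Kinrot data.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed only to make the sketch reproducible

# Simulated "true" wall thicknesses in cm (hypothetical range).
true_values = [rng.uniform(0.4, 1.2) for _ in range(2000)]

# Unsystematic error: zero-mean noise added to each measurement.
with_random_error = [t + rng.gauss(0, 0.05) for t in true_values]

# Systematic error: the same noise plus a constant over-reading of 0.05 cm.
with_systematic_error = [t + 0.05 + rng.gauss(0, 0.05) for t in true_values]

bias_random = mean(with_random_error) - mean(true_values)
bias_systematic = mean(with_systematic_error) - mean(true_values)

print(f"shift of the mean with random error:     {bias_random:+.4f} cm")
print(f"shift of the mean with systematic error: {bias_systematic:+.4f} cm")
```

With random error the mean stays close to the true mean because the individual errors largely cancel out, whereas a systematic error shifts the mean by roughly its own size, however many items are measured.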

Figure 2.6 presents a model of factors, attributes, and their measurements. The factors (on the left of the diagram) are the main influences on the different features (attributes) of the pottery. The arrows indicate the assumption of which features have the strongest influence. I have considered four factors here: the date, the vessel type, the external contacts, and the properties of the clay and firing technique. The external contacts depend on the date, but were considered separately as they do not seem to affect all the vessel types: e.g. bowls, kraters, and small containers seem more prone to bear signs of contact, while the vessels for cooking and storing seem to be more resistant to cultural influences (Yasur-Landau 2010: 9–33; 227–266). The properties of the clay and firing technique were supposed to be rather constant during the short period of the Early Iron Age. The key to separating the effect of the date is keeping the other factors constant. A second step is to define the effect of the vessel type. It is clear that the function is closely related to all the features of the vessels measured. The type of the vessel is initially concluded from a sum of these features, and it should not be a problem to reconstruct the degree of this effect. The remaining differences would account for the difference in time. The “noise” effect of random variance, and of some other factors unknown to us, would still remain. The external measure for the dating is provided by local stratigraphies. The results of the factor analyses (see section 5.3) are congruent with the idea that at least the color variables co-vary (representing the properties of clay and firing), as do the different thickness measurements related to rim form (related to vessel type and date).

There are two kinds of error. The first is an error in sampling, which can be estimated using confidence intervals. The second is an error of measurement, which follows from the process of changing observed features into attributes of variables, i.e. into numbers denoting the attributes. There is an error effect (ε) in all variables, but its amount and the risk of its being systematic vary. For the measurements of size (ratio scale), one can suppose that the error is unsystematic, normally distributed, and sums to zero. In the variables of nominal scale (classes), the errors of measurement are actually false classifications. As all classifications of archaeological materials are modern constructions, and there is no inherent and “true” classification, I mean by false classification a deviation from the definitions that I considered final after completing the sorting process. There is a risk of systematic error in classificatory variables, relating to their frequencies. As some vessel types are more common than others, it may happen that in border-line cases one easily classifies an object as the more common type instead of a rare one. This leads to an over-representation of the common types, and relates to the nature and definition of types. The differences between the variability and conformity of different types of vessels, and the clarity of their boundaries, might reflect the nature of the material. The errors in group memberships are difficult to evaluate because of the heuristic nature of all grouping methods. If some rules are explicated, departures from these rules can be considered classification errors. There might be systematic tendencies towards the center in the ordinal-scaled variables, such as hardness and the amount of tempering materials. In these cases, no mechanical measurements have been used.
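As a brief illustration of how sampling error can be expressed with a confidence interval, the Python sketch below computes a Wilson score interval for the proportion of one vessel type among the rims of a sample. The Wilson interval is an assumed choice among several possible interval methods, the counts are invented, and the function name wilson_interval is hypothetical. The interval is meaningful only to the extent that the sample can be treated as random with respect to the population of interest.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion.

    successes: number of items of the type of interest
    n: total number of items in the sample
    z: standard normal quantile (1.96 corresponds to about 95 % confidence)
    Returns (lower, upper) as proportions between 0 and 1.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, invented for illustration only:
# 40 cooking pot rims among 400 rims in the sample.
low, high = wilson_interval(40, 400)
print(f"observed share 10.0 %, interval {100 * low:.1f} % to {100 * high:.1f} %")
```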

The assemblage of ceramics that was retrieved through an intensive strategy comprises ceramic material (mainly shards) from two areas of excavation (U & W). The material is collected in one data set, but can be split according to the areas when, for example, features from different local strata are to be compared. The shards were also ascribed to a vessel type, with the background suppositions of the full vessel forms defined with the help of material excavated earlier at the same site (1994–2001) and typological literature (e.g. Amiran 1969, and several excavation reports). The assumption about the full form of the vessel to which a preserved rim fragment belonged does not affect the measurement of most variables, such as thicknesses or color.

The analyses of this pottery material are presented in section 5.3. The material from non-intensively retrieved areas in the KRP (areas N, R and S) and Fritz’s campaigns (1994–2001) at Tel Kinrot cannot be considered as a representative sample, and therefore I did not include this material in the statistical study. It is included in the typology of the pottery material, and in addition to its typological value it provides information about the layers the material has been collected from, and the date and function of these contexts – the approach prevailing in the excavation reports.

Typological pottery studies in Israel-Palestine are focused on the date and function of the vessels. However, there are other features in the ceramic material that can have archaeological significance. For example, some contexts include more worn fragments, and this can be useful information for interpreting the formation of the context. Worn fragments can be a sign of erosion that has affected the material. If we also have worn pebbles in the same context and the shards are of several periods, interpretation as erosional accumulation seems likely, but if the material is packed below a floor and does not include pebbles, a constructional fill seems a better interpretation. Such information is not related to the typological work, but can be helpful in other respects, and is not available if only well-preserved vessels are studied.

Statistical modelling

An analytical model is a simplified representation of a complex reality. Creating a model enables the setting up of hypotheses and their testing against the archaeological data (Orton 2004). A central question with all modeling is how large are the discrepancies that will be
