Aspect in Finnish from the Finnish-Slavic contrastive perspective

As stated above, Finnish does not have any obvious grammatical markers corresponding to the Slavic aspect opposition. However, the questionnaire-based study of Dahl (1985) showed that, from the typological perspective, Finnish differential object marking (henceforth DOM, see Section 4.2.2) is close to the same phenomenon as Slavic-style aspect. It therefore comes as no surprise that studies focusing on the Finnish object approach its case marking in terms of aspectuality (Askonen 2001; Heinämäki 1984, 1994; Larjavaara 2007).² I elaborate on this category, which is central to Finnish aspectology, in Section 4.3.

Finnish-Slavic contrastive studies also focus mainly on the relation between aspect and DOM (Tommola 1986; Zmrzlíková 2009). Both works discuss particular features of Finnish grammar in the light of contexts where particular values of aspect in Russian or Czech are used. Both works refer to literary data, but only Tommola (1986) includes quantitative summaries.

The main focus is naturally placed on DOM. Tommola (1986: vii) states that the Total object contains the semantic feature of what he calls resultativity: “specificness of the object concept and specificness of the end state resulted from the action”, which is close to Russian verbal aspect. However, Tommola characterises Russian verbal aspect as governed by two features: boundedness (Rus. predel’nost’, the existence of a bound limiting the situation) and totality (Rus. celostnost’, the non-divisibility of the situation’s structure). Resultative situations are total. In Russian, however, a non-total but bounded situation may be expressed with both aspects, and thus also with PFV, whereas in Finnish such situations must be marked with the Partitive object.

Both authors discuss the role of measure adverbials in the object cases, and verbal affixes which modify the temporal structure of the situation. Tommola (1986) also analyses aspectuality in the context of tenses and the lative-essive distinction (see Section 4.2.2).³

² Additionally, aspect is often approached in Finnish linguistics from the cognitive perspective (Huumo 2006; Nurminen 2011, 2014, 2015, 2017; Sivonen 2007). I leave the cognitive direction without comment, because the starting point for this inductive research is a formal grammatical category, and not the cognitive definition of the term aspect.

The expression of aspect appears in Finnish as a clausal phenomenon, and its realisation is syntactically motivated. Kangasmaa-Minn (1984) explains this by the fact that verbs in Uralic are not as rich in information as, for example, verbs in Slavic languages. Therefore, aspect is encoded in the nominal dependents of the predicate, as they carry most of the information in the sentence. Like Tommola (1986), she points to the importance of the lative-essive distinction encoded in the Finnish case system.

Biskupska (2018) compares the verb derivation systems of Polish and Finnish. In Polish, deverbal lexemes are often derived by means of spatial prefixes, which directly influence the aspectual value assigned to the lexeme (see Section 3.5.5). Additionally, the reflexive marker się oscillates between the status of clitic and affix (see Section 3.3), but its relation to PVA is unclear. In Finnish, the two most common groups of derivational affixes concern the change in the number of arguments (see Section 4.2.1). Biskupska concludes that Polish and Finnish derivatives differ with respect to their semantic scope, in particular as to the notions of change of state and aspect, which are more salient in Polish thanks to prefixation.

1.4 Data and methodology

1.4.1 Bottom-up approach to contrastive studies on aspect

As indicated above, PVA does not have straightforward counterparts in Finnish. Additionally, previous research suggests that identifying its correlates requires considering a possibly broad context, minimally the unit of the clause. In my view, this requires turning away from the traditional deductive reasoning used in previous studies on aspect and following a more agnostic, inductive approach.

Consequently, the present work is empirical and organised bottom-up. No hypothesis is assumed a priori and tested against the data; instead, the conclusions are drawn directly from the data and considered in the light of existing theories. To achieve this goal, I use exploratory statistical methods. I base my findings on empirical data stored in the form of a parallel corpus, that is, original texts aligned with their translations.

³ I return to these studies in Chapter 8 in order to contrast their quantitative results with my own.

1.4.2 The distributional hypothesis

The main methodological assumption of this study arises from the distributional hypothesis related to the work of Harris (1954), namely, that linguistic elements with similar distribution in texts belong to the same semantic-functional category (cf. Sahlgren 2008: 33). In other words, when two linguistic elements, for example e1 nice and e2 beautiful, occur regularly with another linguistic element e3 girl, one may assume that e1 and e2 belong to the same linguistic class (in this case, for example, to the class of adjectives).

The DISTRIBUTION OF A LINGUISTIC ELEMENT “is a sum of all environments in which a linguistic element appears, and an ENVIRONMENT of a linguistic element is an array of its co-occurrents, i.e. the other elements (...) with which an element occurs to yield an utterance” (Harris 1954: 146).⁴
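The definitions above can be illustrated with a minimal sketch. It is not the procedure used in this study. It assumes that an utterance is a whitespace-separated string, that an element's distribution can be simplified to the set of all its co-occurrents regardless of position, and that similarity of distribution is measured with the Jaccard coefficient. The function names `distribution` and `jaccard` and the toy utterances are hypothetical.

```python
from collections import defaultdict


def distribution(utterances):
    """Map each element to the set of elements it co-occurs with.

    Simplification (an assumption of this sketch): the environments of an
    element are merged into one set, and the position of the co-occurrents
    within the utterance is ignored.
    """
    dist = defaultdict(set)
    for utterance in utterances:
        tokens = set(utterance.split())
        for token in tokens:
            dist[token] |= tokens - {token}
    return dict(dist)


def jaccard(a, b):
    """Jaccard similarity of two sets: 1.0 for identical, 0.0 for disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical toy utterances, invented for illustration only.
utterances = [
    "nice girl",
    "beautiful girl",
    "nice day",
    "beautiful day",
    "girl sleeps",
]

dist = distribution(utterances)
print(jaccard(dist["nice"], dist["beautiful"]))  # 1.0
print(jaccard(dist["nice"], dist["girl"]))       # 0.0
```

In this toy data, nice and beautiful share all their co-occurrents (girl, day), so under the distributional hypothesis they would be assigned to the same class, whereas nice and girl share none.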

1.4.3 Data

The studied sample covers indicative, affirmative clauses in Polish and Finnish which contain simple predicate forms, that is, predicates consisting of one finite form. Thus, the study excludes infinitival complements and participle clauses.

The clauses originate from parallel Finnish-Polish texts (originals and their translations) obtained from various written sources. The corpus (see Chapter 5) is bidirectional, so both Polish and Finnish originals are included in equal proportions. The final data set, which consists of 900 parallel clauses, can be stratified into three subsamples according to text type: literary-narrative, informative and to-be-spoken. The literary-narrative texts are obtained from fiction, and the informative sample covers news and essays. The to-be-spoken type includes play scripts, film subtitles, and dialogues extracted from literary texts. The text type stratification is motivated by the significantly different tense-aspect discourse structures in the chosen samples, as shown in Section 5.6.

1.4.4 Methods

The annotation of the corpus requires taking into account the temporal systems of both languages in question. The lack of a suitable framework allowing for comparisons between different temporal systems is one reason why little cross-linguistic work has been done (Dahl 2000: 3). The functional (Bondarko 1991: 64-94) and cognitive-functional (Bartnicka et al. 2004; Dickey 2000; Lehmann 2009) models can be applied to cross-Slavic comparisons, but I see their weakness in assuming a limited (and therefore not necessarily exhaustive) set of functions where the difference between PFV and IPFV is relevant. Since aspect is undeniably related to temporality, and time is usually subject to measurement, one handy approach to examining temporality is scalarity (see Section 2.6). Therefore, the temporal systems of Polish and Finnish (see Chapters 3 and 4), within which aspect can be characterised, are described in the present work according to one scalar-temporal model (see Chapter 2). Afterwards, the corpus is additionally annotated for clause-internal morphological, semantic and syntactic features, and for clause-external features such as text type or temporal quantification.

⁴ Harris (1954) also includes the particular position of elements in the definition of environment. In the current approach, I omit this constraint.

The data set is analysed with a set of advanced quantitative methods. The linguistic features are first explored for their frequency and distribution. This leads to two conclusions: (1) some features are rather infrequent and/or sparsely distributed, and (2) the system of interdependencies is so complicated that it cannot be summarised with simple significance-testing methods.

Therefore, the most frequently occurring features are further summarised in statistical models. First, the relations between semantic, grammatical and lexical features are explored with the notions of similarity and distance, upon which a hierarchical cluster tree is built to show the structure of the data. The validity of the most informative clusters is then tested with random forests (Breiman 2001), with which I try to find out whether the value of PVA can be predicted directly from the most frequent Finnish temporal features.
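The first two steps, filtering out infrequent features and building a cluster tree from pairwise distances, can be sketched as follows. This is only an illustration under stated assumptions, not the model fitted in this work: the distance measure (Jaccard distance between the sets of clauses in which two features occur), the average linkage, the frequency threshold, the function names and the six annotated toy clauses are all hypothetical and do not reflect the actual corpus or its results. The random forest step is not covered by the sketch.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotated clauses: each clause is a set of feature labels.
clauses = [
    {"PFV", "total_object", "past"},
    {"PFV", "total_object", "past"},
    {"PFV", "total_object", "present"},
    {"IPFV", "partitive_object", "present"},
    {"IPFV", "partitive_object", "present"},
    {"IPFV", "partitive_object", "past", "rare_feature"},
]


def frequent_features(clauses, min_count):
    """Return the features that occur in at least min_count clauses."""
    counts = Counter(f for clause in clauses for f in clause)
    return sorted(f for f, n in counts.items() if n >= min_count)


def jaccard_distance(a, b):
    """1 minus the Jaccard similarity of two non-empty sets of clause ids."""
    return 1.0 - len(a & b) / len(a | b)


def cluster(clauses, min_count=2):
    """Agglomerative clustering of features with average linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) triples,
    which is the information a cluster tree (dendrogram) is drawn from.
    """
    feats = frequent_features(clauses, min_count)
    # For each feature, the ids of the clauses in which it occurs.
    occ = {f: {i for i, c in enumerate(clauses) if f in c} for f in feats}

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard_distance(occ[x], occ[y]) for x, y in pairs) / len(pairs)

    clusters = [(f,) for f in feats]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters; ties go to the first pair found.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


merges = cluster(clauses)
for left, right, dist in merges:
    print(left, right, round(dist, 2))
```

Features that always occur in the same clauses have a distance of 0 and are merged first, and a feature below the frequency threshold (here `rare_feature`) never enters the tree.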

The random forests model, the cluster analysis and the descriptive statistics of the data are used to draw the final conclusions about PVA and its correlates in Finnish.

In document Polish verbal aspect and its Finnish statistical correlates in the light of a parallel corpus (pages 28-31)