
Molecular Clock Study

Molecular and Morphological Data

DNA was extracted from the Arrhipis and Melasis species plus the outgroups. Because the DNA retrieved from the dried and pinned museum specimens was of poor quality, five genetic markers were chosen on the basis of the most successful DNA amplification. Four loci from the mitochondrial genome and one nuclear gene were used (Paper III & IV).

Mitochondrial: 1) partial sequence of the small-subunit ribosomal 12S gene, 2) partial sequence of the large-subunit ribosomal 16S gene, 3) partial sequence of the Cytochrome c Oxidase subunit I (COI) gene and 4) partial sequence of the Cytochrome b (Cytb) gene. Nuclear: 1) small-subunit ribosomal 18S gene. For testing the temperature effect on the metabolic clock (Paper IV) it was useful to sequence mitochondrial as well as nuclear genes, since it has been argued that the metabolic rate effect acts much more strongly on the former.

This is because mitochondria are the production sites of reactive oxygen species (ROS), which are toxic by-products of metabolism. DNA damage from metabolites might therefore be greater, and mutation rates higher, in mitochondrial than in nuclear genes, especially in warmer climates (Martin and Palumbi, 1993; Fontanillas et al., 2007).

For the Syrphidae dataset, DNA sequences for 1) the mitochondrial Cytochrome c Oxidase subunit I (COI) and 2) the nuclear large-subunit ribosomal 28S gene were obtained from GenBank (http://www.ncbi.nlm.nih.gov/) (Paper IV).

Eucnemid morphological data (Alaruikka and Muona, unpublished) were revised and additional genera were added to the matrix in order to analyse the position of Arrhipis.

Twenty-six morphological characters, some of them displaying multiple states (a total of 49 binary characters), were analysed (Paper III).

Phylogenetic Inference

A phylogenetic analysis combining morphological and molecular data was performed for the Arrhipis dataset using parsimony as the optimality criterion (POY version 3.011.a (Wheeler, 1996; Giribet et al., 2002)). Additionally, a Bayesian inference analysis of the Arrhipis molecular data was carried out (MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2003)) (Paper III). The underlying principle of the use of parsimony as an optimality criterion for reconstructing phylogeny is to assume as little as possible about any mechanism of evolution. Each evolutionary event is unique and consequently no a priori chosen model for species formation can be applied. Therefore, instead of relying on a statistical framework to find the “best” topology, parsimony favours the tree that requires the fewest evolutionary changes (Steel and Penny, 2000). Thus, the explanatory power of a phylogeny depends on the degree to which it can minimize homoplasies (characters whose origin cannot be explained by common ancestry) (Farris, 1983). In contrast, the Bayesian approach is a statistical inference using Markov chain Monte Carlo (MCMC) methods to obtain the “best” tree. It allows the choice of a specific model of evolution as well as the incorporation of any available prior information. The posterior probability, which is proportional to the product of the likelihood of the data given the model and the prior probability, is then calculated. The phylogenetic tree with the highest posterior probability is favoured as the most likely one (Felsenstein, 2004).
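The proportionality between posterior, likelihood and prior can be illustrated with a toy calculation. The following is only a sketch under strong assumptions: it enumerates three hypothetical candidate topologies with invented log-likelihoods and a flat prior, whereas MrBayes approximates the posterior by MCMC sampling over a tree space that is far too large to enumerate. The function name and all numbers are illustrative and do not come from the study.

```python
import math

def posterior_probabilities(log_likelihoods, priors):
    """Normalise likelihood x prior over a finite set of candidate trees."""
    # Work in log space and subtract the maximum to avoid numerical underflow.
    log_unnorm = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    peak = max(log_unnorm)
    weights = [math.exp(v - peak) for v in log_unnorm]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-likelihoods for three candidate topologies (assumed values).
log_liks = [-1000.0, -1002.0, -1005.0]
flat_prior = [1 / 3, 1 / 3, 1 / 3]  # assumption: no topology favoured a priori

post = posterior_probabilities(log_liks, flat_prior)
best = max(range(len(post)), key=post.__getitem__)
print("posterior probabilities:", [round(p, 4) for p in post])
print("favoured topology index:", best)
```

With a flat prior the ranking of the trees is determined by the likelihood alone; an informative prior would shift the posterior towards the topologies it favours.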

Using such philosophically and methodologically differing approaches as parsimony and the Bayesian framework to analyse the data increased the confidence in the results.

The species of the genus Melasis were added to the obtained topology (Paper III) as the sister group of the genus Arrhipis, following the relationship described by Muona (1993) (Paper IV). To verify this placement, a BEAST (Bayesian Evolutionary Analysis Sampling Trees) (Drummond and Rambaut, 2007) analysis was carried out on the Arrhipis and Melasis molecular data. BEAST allowed the Arrhipis clade to be constrained to the relationships obtained in Paper III. The topology for the tribe Syrphini was taken from the recently published parsimony tree of predatory flower flies (Diptera, Syrphidae, Syrphinae) (Mengual et al., 2008). The Eucnemidae and Syrphidae trees were then used to test the prediction that taxa in colder climates have lower mutation rates, due to slower biochemical processes, than taxa found in warmer climates (Paper IV).

Dating, Molecular Clock Models and Temperature Estimates

The split of Pangea into the two supercontinents Gondwana and Laurasia and the subsequent break-up of first Gondwana and then Laurasia is a firmly accepted concept today (Li and Powell, 1993; Sanmartín et al., 2001) (Fig. 1). In biogeography the age of taxa has been correlated with the age of geographic events, since it has been concluded that earth and life evolved together, during phases of uplift, continental drift and many other processes. Thus, biogeography so far seems to be the most promising method in dating evolution, since it clearly distinguishes age of being and age of fossilization (Heads, 2005).

The phylogenies of the Arrhipis and Melasis genera show a vicariant Gondwanan and Laurasian break-up pattern, respectively. Speciation events in the genus Arrhipis reflect the opening of the South Atlantic Ocean, which caused Africa to split from South America around 120-100 million years ago (mya) (Paper III, Murphy et al., 2001; Sanmartín and Ronquist, 2004). The Eucnemidae topology was dated using the Melasis North America and Europe split (Paper IV, Study I). These two landmasses are believed to have separated in the Early Tertiary, 65-55 mya (Janis, 1993; Sanmartín et al., 2001). Even though a land bridge connecting North America and Scandinavia existed until around 40 mya (Sanmartín et al., 2001), the dispersal of Melasis via this route is highly unlikely, since movement across it was restricted to strongly cold-adapted organisms. Thus, conditions for the dispersal of the Melasis species via this landmass were too harsh (Muona, pers. comm.).

Assuming that mutation rates of organisms found in temperate regions are slower than those of tropical organisms (Gillooly et al., 2005), we predicted that dating our topology with the ~60 mya Melasis split would yield older speciation events within the genus Arrhipis than would be expected from the tectonic ~120-100 mya Gondwanan break-up date (Paper IV, Study I).
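The logic of this prediction can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a strict clock within each lineage, a pairwise distance that accumulates along both lineages since the split, and invented values for the distance and for the tropical/temperate rate ratio; none of these numbers are taken from the study.

```python
def strict_clock_rate(pairwise_distance, split_age_my):
    """Rate per lineage (substitutions/site/My) from a calibrated split."""
    return pairwise_distance / (2.0 * split_age_my)

def inferred_age(pairwise_distance, rate):
    """Age (My) implied by a pairwise distance under a given per-lineage rate."""
    return pairwise_distance / (2.0 * rate)

# Hypothetical temperate calibration: distance 0.12 across a 60 My old split.
temperate_rate = strict_clock_rate(pairwise_distance=0.12, split_age_my=60.0)

# Assumption: the tropical lineage evolves 1.5 times faster and truly split 110 Mya.
true_age = 110.0
tropical_distance = 2.0 * (1.5 * temperate_rate) * true_age

# Dating the tropical split with the slower temperate rate overestimates its age.
estimated_age = inferred_age(tropical_distance, temperate_rate)
print(f"true age {true_age} My, estimated age {estimated_age:.1f} My")
```

In this simplified setting the overestimate scales directly with the assumed rate ratio, which is why a temperate calibration is expected to push tropical divergence dates beyond the tectonic date.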

Fig. 1. Break-up of Pangea

To see which gene fits which molecular clock model best, likelihood ratio tests were carried out using BASEML in the PAML v3.15 package (Yang, 1997). Strict clock (one constant rate assumption), local clock (the global clock rate is divided into several local rates), autocorrelated clock (autocorrelation puts a limit on the speed at which a rate is allowed to change from the ancestor to the descendant) and no-clock (no rate assumption) models were tested. The clock model with the highest likelihood score plus the no-clock model were used to obtain substitution rates and branch lengths for each gene separately. The tree for each gene was fixed according to the Arrhipis/Melasis topology (Fig. 3) and the trees were dated using the appropriate clock models in BASEML and MULTIDIVTIME (Yang, 1997; Thorne and Kishino, 2002) (Paper IV, Study I).
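A likelihood ratio test of nested clock models can be sketched as follows. This is an illustration rather than BASEML output: the log-likelihoods and the number of taxa are invented, and the degrees of freedom follow the usual convention for comparing a strict clock with the no-clock model on n taxa (n - 2); other model pairs need their own parameter counts. The chi-square tail probability is computed with a standard recurrence so that the example needs only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 2 and df = 1, then step up by two degrees at a time.
    if df % 2 == 0:
        k, q = 2, math.exp(-half)
    else:
        k, q = 1, math.erfc(math.sqrt(half))
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return the test statistic 2 * (lnL_alt - lnL_null) and its p-value."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

# Hypothetical values: strict clock (null) versus no clock (alternative).
n_taxa = 12                      # assumed number of sequences
lnl_clock, lnl_no_clock = -5230.4, -5221.9
stat, p_value = likelihood_ratio_test(lnl_clock, lnl_no_clock, df=n_taxa - 2)
print(f"2*delta lnL = {stat:.2f}, p = {p_value:.3f}")
```

A p-value above the chosen significance level means the clock model is not rejected in favour of the more parameter-rich no-clock model for that gene.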

For the Syrphidae dataset no divergence time calibration points were available. The phylogenetic tree (obtained from Mengual et al., 2008) could therefore not be dated, and only branch lengths under a no-clock model were obtained for each gene individually in BASEML (Yang, 1997) (Paper IV, Study II+III).

Longitude and latitude coordinates for the collection sites of each species were obtained by mapping the collection location in Google Earth (earth.google.com). Average yearly temperatures and number of frost-free days for each data point were then acquired using the WorldClim database in ArcMap (http://www.esri.com/software/arcgis/).

These temperature estimates were then transformed into the Boltzmann factor, which underlies the temperature dependence of metabolic rate:

Boltzmann factor = e^{-E/kT}

where E is an average activation energy for the biochemical reactions of metabolism (~0.65 eV), k is Boltzmann's constant (8.62 × 10^-5 eV K^-1) and T is absolute temperature in kelvin (Gillooly et al., 2001; Gillooly et al., 2005).
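The transformation can be written as a short function. This is a minimal sketch that uses the constants quoted above and assumes that the temperature estimates are supplied in degrees Celsius (the source does not state the unit); the two example temperatures are invented.

```python
import math

E_ACTIVATION_EV = 0.65          # average activation energy, eV (from the text)
BOLTZMANN_EV_PER_K = 8.62e-5    # Boltzmann's constant, eV per kelvin

def boltzmann_factor(temp_celsius, activation_energy=E_ACTIVATION_EV):
    """Return e^(-E/kT) for a temperature given in degrees Celsius (assumed unit)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(-activation_energy / (BOLTZMANN_EV_PER_K * temp_kelvin))

# Hypothetical mean annual temperatures for a temperate and a tropical site.
cool = boltzmann_factor(5.0)
warm = boltzmann_factor(25.0)
ratio = warm / cool
print(f"Boltzmann factor at 5 C: {cool:.3e}, at 25 C: {warm:.3e}, ratio: {ratio:.2f}")
```

Because the factor is exponential in -1/T, a difference of 20 degrees between sites translates into a several-fold difference in the predicted metabolic rate.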

The calculated Boltzmann factors as well as temperature estimates in degrees were used as surrogates for body temperature (Paper IV: Boltzmann factor used in Study I + III, temperature estimates together with average number of frost free days per annum applied in Study II, according to Estabrook et al., 2007) since it was assumed that “extant ectotherms are approximately in thermal equilibrium with their environment, and that they occur in a similar thermal environment as their ancestors” (Muona, 1993; Gillooly et al., 2005).

Methods applied to test the Metabolic-Rate Dependent Molecular Clock

Three methods were chosen to test the temperature effect on a body-size-corrected metabolic clock (Paper IV). In previous studies these methods had revealed a temperature and body size effect (Gillooly et al., 2005; Estabrook et al., 2007) or at least a body size effect (Fontanillas et al., 2007) on the metabolic clock:

The first study (Paper IV) was carried out according to Gillooly et al. (2005) and the Eucnemidae dataset was used. Mutation rates of temperate and tropical species were studied across the tree. The topology was dated for each gene using the molecular clock model of best fit. The no-clock model was also applied. The aim was to infer whether:

a) the temperate clade mutates slower than the tropical one and whether results are consistent, regardless of the choice of clock models or mode of expressing genetic change used

b) correcting for temperature will reconcile molecular and biogeographical divergence dates and lead to a strict molecular clock.

Two-tailed non-parametric Spearman's rank correlation tests were performed. The Boltzmann factor for ancestral nodes was reconstructed using “ace” in the R package “ape” (http://www.R-project.org). This was done to see whether correlations between the Boltzmann factor and substitution rate, and between the Boltzmann factor and branch length, are present across the tree (Paper IV, Study I).
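Spearman's rank correlation is the Pearson correlation of the ranked values. The sketch below computes only the coefficient, with tied values given their average rank; it does not compute the two-tailed significance, and the Boltzmann factors and substitution rates are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-branch Boltzmann factors and substitution rates.
boltzmann = [1.7e-12, 2.9e-12, 6.1e-12, 9.8e-12, 1.1e-11]
rates = [0.0011, 0.0010, 0.0016, 0.0019, 0.0018]
rho = spearman_rho(boltzmann, rates)
print(f"Spearman rho = {rho:.2f}")
```

A positive coefficient would be in line with faster substitution at higher temperature, although whether it is significant has to be assessed separately.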

The second analysis was carried out according to Estabrook et al. (2007). The Eucnemidae and Syrphidae datasets were used. Branch length (accumulated genetic change since the most recent common ancestor) and temperature conditions were compared within species pairs across the tree as well as within sister pairs. The aim was to calculate the number of “monotone pairs”, in which the more genetically differentiated species of a pair also exhibits the faster metabolic rate. This was done using a programme called ECERFODM.
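Counting monotone pairs amounts to checking, pair by pair, whether the species with the longer branch is also the warmer one. The sketch below is not the ECERFODM code; it is an independent illustration that uses temperature as a stand-in for metabolic rate, skips pairs tied in either variable (an assumption), and runs on invented values.

```python
def count_monotone_pairs(pairs):
    """Count pairs in which the longer branch belongs to the warmer species.

    pairs: iterable of ((branch_length_a, temp_a), (branch_length_b, temp_b)).
    Returns (monotone, informative), where informative excludes tied pairs.
    """
    monotone = 0
    informative = 0
    for (len_a, temp_a), (len_b, temp_b) in pairs:
        if len_a == len_b or temp_a == temp_b:
            continue  # assumption: ties neither support nor contradict
        informative += 1
        if (len_a > len_b) == (temp_a > temp_b):
            monotone += 1
    return monotone, informative

# Hypothetical sister pairs: (branch length, mean annual temperature).
sister_pairs = [
    ((0.031, 24.5), (0.022, 6.0)),   # warmer species has the longer branch
    ((0.015, 4.0), (0.019, 21.0)),   # warmer species has the longer branch
    ((0.027, 8.5), (0.020, 23.0)),   # cooler species has the longer branch
    ((0.018, 12.0), (0.018, 19.0)),  # tie in branch length, skipped
]
monotone, informative = count_monotone_pairs(sister_pairs)
print(f"{monotone} of {informative} informative pairs are monotone")
```

If temperature had no effect, roughly half of the informative pairs would be expected to be monotone by chance.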

ECERFODM calculates the thermal regime for each ancestral node by taking the average of the temperatures obtained from its immediate descendants (Paper IV, Study II).

For the comparisons between sister species, the relative biological trait variable (the ratio of the Boltzmann factor of the species with the higher temperature over that of the species with the lower temperature) was calculated. Then two-tailed non-parametric Spearman's rank correlation tests were performed. Independent pairwise comparisons were carried out using Mesquite version 2.5 (Maddison and Maddison, 2008) (Paper IV, Study III).

Barcoding Study

Molecular, Morphological and Life-History Data

DNA was extracted from Finnish and Russian Hylochares specimens. Two mitochondrial and two nuclear genes were selected. Mitochondrial: 1) partial sequence of the small-subunit ribosomal 12S gene and 2) partial sequence of the Cytochrome c Oxidase subunit I gene (COI) (barcoding region). Nuclear: 1) small-subunit ribosomal 18S gene and 2) partial sequence of the large-subunit ribosomal 28S gene. Unfortunately the second internal transcribed spacer (ITS2), which is useful in species delimitation because of its high mutation rate, could not be amplified, since the primers (Navajas et al., 1998) do not seem to be specific enough for Eucnemidae (Teräväinen, pers. comm.).

Obtained DNA sequences for each gene were aligned in ClustalW2 (Larkin et al., 2007) and checked for genetic differences between the Finnish and Russian populations (Paper II).
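Once sequences are aligned, a simple way to quantify differences between populations is the uncorrected p-distance, the proportion of compared sites that differ. The sketch below assumes the alignment has already been produced (for example by ClustalW2) and that gapped or ambiguous positions are excluded; the sequences and names are invented and the source does not state which distance measure, if any, was used.

```python
def p_distance(seq_a, seq_b, skip_chars="-N?"):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or ambiguity code are skipped
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Hypothetical aligned COI fragments from a Finnish and a Russian specimen.
finnish = "ATGCATTAGG"
russian = "ATGTATTAGA"
distance = p_distance(finnish, russian)
print(f"p-distance = {distance:.2f}")
```

A distance of zero across all four genes would indicate no detectable genetic difference between the populations at these markers.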

Field observations of the Finnish beetles at their sites of occurrence were made over a time-span of two years (2006-2008). Behaviour, life-history traits as well as habitat characteristics were recorded (Paper I) and compared with the information available from the Russian populations (Kangas and Kangas, 1944; Siitonen and Martikainen, 1994; Siitonen et al., 1996). Larval features and larval galleries as well as morphological characters of adult male and female Finnish and Russian specimens were studied and compared (Paper II).