2.5 Teaching emotion expression through voice qualities

2.5.2 Joy, Tenderness, Sadness and Anger as voice qualities

2.5.2.5 Anger

Typical musical expressions of anger include a fast tempo, large tempo variability, minor mode, dissonance, atonality, high sound level, small loudness variability, high pitch, small pitch variability, ascending pitch, major 7th and augmented 4th intervals, strengthened singer's formant, staccato articulation, moderate articulation variability, complex rhythm, sudden rhythmic changes, sharp timbre, spectral noise, fast tone attacks and decays, small timing variability, accents on tonally unstable notes, sharp contrasts between "short" and "long" notes, accelerando, medium-fast vibrato rate, large vibrato extent, and micro-structural irregularity (Juslin & Laukka, 2004).

The voice quality features found in expressions of sung anger in the field of voice science include high loudness, low loudness variation, rise and fall slopes (Scherer, Sundberg, et al., 2017), high equivalent sound level, low Hammarberg index, low level difference between partials 1 and 2 (H1/H2) (Sundberg et al., 2021), high mean sound level, more short-term variability of sound level (Sundberg et al., 1995), high SPL, high vocal energy, high dynamics (rate, F0 contour, loudness variation), low formant bandwidth, moderate formant frequency, high low-energy frequency variation (Jansens et al., 1997; Scherer, Sundberg, et al., 2017; Scherer et al., 2015), low proportion of energy below 0.5 kHz, low proportion of energy below 1 kHz, high alpha ratio, high spectral flatness, high spectral slope, high spectral centroid, narrow bandwidth, weak low-frequency energy (Sundberg et al., 2021), a flat, highly balanced spectrum indicating strong energy in the higher partials, low perturbation variation, high perturbation level, less jitter, and more shimmer (Eyben et al., 2015; Hakanpää et al., 2021a; Scherer, Sundberg, et al., 2017).

From a teacher's perspective, anger is produced with a high Psub and high medial compression, even a pressed voice production. Articulation is sharp, which calls for active and fast movement of the articulators. The flat spectrum and narrow bandwidths indicate increased adduction and a short open quotient. Shortening the vocal tract overall by lifting the larynx could help, as could raising F1 by opening the jaw and increasing the interlabial space (Hakanpää et al., 2021b).

3 STUDY QUESTIONS

The main aim of this study was to investigate whether it is possible to teach emotion expression in singing by using vocological information about voice quality and implementing this information to form a training pattern (the parameter modulation technique).

We investigated this by proposing several smaller study questions:

- Are listeners able to recognize emotions in a singing voice from short vowel samples?

- Is there a difference in the recognition of emotions when they are sung using a Classical singing technique compared to when they are sung using a CCM style of singing?

- Does pitch affect the recognition of emotion in the Classical-/CCM-style singing voice samples?

- Are the valence and activation of the emotions perceptible in the sung samples?

- Are there acoustic voice quality differences between emotional expressions in short sung excerpts of the vowel [a:] from Classical and CCM singers?

- Does the specific training (parameter modulation technique) improve the recognition of emotions from the singing voice?

- Do the acoustic differences between emotional expressions increase after the particular training (parameter modulation technique)?

4 METHODS

This study is an experimental comparative study using the hypothetico-deductive method as its basic scientific approach (we are evaluating general explanations of observed regularities by generating and testing hypotheses).

Experimental research operates using dependent and independent variables. The independent variable is manipulated, and the effect that this change has on a dependent variable is examined. This enables a researcher to identify a cause-and-effect relationship between variables.

The independent variable is the predictor variable being manipulated by the experimenter in order to observe the effect on a dependent variable (Coolican, 2009).

In this study, the independent variable is the voice sample performed with five different emotion expressions. To be specific, vocal emotion expression is the actual independent variable we are manipulating (by asking the performer to sing in different expressions), and the voice sample is the way we have operationalized it in this study (the way we capture the expression). In other words, we are looking to hear/see the differences in emotion expression from the voice samples. As there are four different emotions that we have asked the singers to portray in their voice in addition to the neutral state, we have a categorical independent variable consisting of five levels (joy, tenderness, neutral, sadness, anger).

The dependent variable is the event expected to change when the independent variable is manipulated (Coolican, 2009). In this study, it is either the answers given in the listening tests or the measurement results of the acoustic analyses. In the case of the listening tests, the dependent variable is always categorical, nominal, and dichotomous. That means that there are two levels to this variable: correct and incorrect answers. The other possible dependent variable in this study, the acoustic parameter value, is a continuous variable. This means that the acoustic parameters are measured along a continuum, and they have a numerical value. To be very specific, the continuous variables in this study are so-called ratio variables, for which a measurement of zero (0) means that none of the variable is present. The name ratio variable reflects the fact that one can meaningfully use the ratio of two measurements. So, for example, sound pressure doubles with every 6 dB increase in SPL, and the ratio of the frequencies of two notes an octave apart is 2:1.
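The two ratio examples can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard definition of a level difference as 20 times the base-10 logarithm of a sound-pressure ratio and equal-tempered semitones; the function names are hypothetical and are not part of the study's analysis tools.

```python
import math


def level_change_db(pressure_ratio: float) -> float:
    """Level difference in dB corresponding to a sound-pressure ratio."""
    return 20.0 * math.log10(pressure_ratio)


def pressure_ratio_from_db(db_change: float) -> float:
    """Sound-pressure ratio corresponding to a level difference in dB."""
    return 10.0 ** (db_change / 20.0)


def frequency_ratio(semitones: float) -> float:
    """Frequency ratio of two notes a given number of equal-tempered semitones apart."""
    return 2.0 ** (semitones / 12.0)


# Doubling the sound pressure raises the level by about 6 dB.
print(round(level_change_db(2.0), 2))         # 6.02
# A 6 dB increase corresponds to almost exactly twice the sound pressure.
print(round(pressure_ratio_from_db(6.0), 3))  # 1.995
# Twelve semitones (one octave) give a frequency ratio of 2:1.
print(frequency_ratio(12))                    # 2.0
```

The first two lines of output show why "6 dB" is a rounded figure: an exact doubling of sound pressure is about 6.02 dB.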

The difficulty concerning the variables in this study is that we are trying to obtain measures that reflect a relatively unobservable construct, emotion. The extent to which our measurements actually coincide with the construct of emotion is referred to as construct validity. The way we try to address this problem and increase the validity of this study is by triangulation. We use acoustic parameters to look for a correlation between them and the emotion enacted by the singer. We can also use acoustic parameters to look for a correlation between them and listener evaluations of enacted emotions. We then of course look at the correlation between enacted emotion and listener evaluations. This model of triangulation originally comes from the Brunswikian lens model, but has been further modified to fit vocological study by K. Scherer (Bänziger et al., 2015; Brunswik, 1956).
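The three legs of the triangulation can be sketched on toy data. The example below is only an illustration under loud assumptions: the records, the single acoustic value per sample, and the helper names are all hypothetical placeholders, and simple group means and a proportion of correct answers stand in for whatever statistical association measures the study actually uses.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (enacted emotion, acoustic value, emotion chosen by a listener).
# The numbers are invented placeholders for illustration, not measurements from the study.
samples = [
    ("anger", 4.0, "anger"),
    ("anger", 3.0, "joy"),
    ("joy", 2.0, "joy"),
    ("joy", 1.0, "joy"),
    ("sadness", -3.0, "sadness"),
    ("sadness", -2.0, "tenderness"),
    ("tenderness", -4.0, "tenderness"),
    ("tenderness", -3.0, "tenderness"),
]


def mean_acoustic_by(records, key_index):
    """Mean acoustic value grouped by the label stored at key_index (0 = enacted, 2 = perceived)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[1])
    return {label: mean(values) for label, values in groups.items()}


def proportion_correct(records):
    """Share of samples in which the listener chose the enacted emotion."""
    hits = [enacted == perceived for enacted, _, perceived in records]
    return sum(hits) / len(hits)


# Leg 1: enacted emotion versus acoustic parameter.
acoustic_by_enacted = mean_acoustic_by(samples, key_index=0)
# Leg 2: acoustic parameter versus listener evaluation.
acoustic_by_perceived = mean_acoustic_by(samples, key_index=2)
# Leg 3: enacted emotion versus listener evaluation (dichotomous correct/incorrect).
recognition = proportion_correct(samples)

print(acoustic_by_enacted)
print(acoustic_by_perceived)
print(recognition)
```

Keeping the three legs as separate computations mirrors the idea of triangulation: each pairing can be inspected on its own, and agreement between them supports the claim that the measurements reflect the same underlying construct.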

Another key feature of this study is that it is a comparative study. In this thesis, we use two types of comparison studies: the cross-sectional and the longitudinal.

A cross-sectional study compares samples drawn from separate, distinguishable sub-groups within a population, in this case singers using the Classical singing technique and singers using CCM singing techniques from a population of singers. A longitudinal study carries out repeated measurements on the same group of people over a period of time. In this study, we use the repeated measures design in two ways: 1) to investigate the effects of the teaching intervention, and 2) to investigate the effects of different emotion expressions in the singing voice. In Study III, we use a control group for comparison with the “Test” group that is undergoing the teaching intervention.

The final defining feature of this study is that it uses statistical analyses to validate hypotheses.