The general functioning of the visual system and attention

1.4 The process of estimating image-quality

1.4.1 The general functioning of the visual system and attention

middle of the visual field is accurate (0.3-2° of visual angle), and the further a target is from this central area, the less accurately it is perceived (Land, 2006). Eye movements are used to sample the surrounding world, and even though perception seems continuous and whole, visual perception is constructed mainly from stops and jumps to the next location, known as fixations and saccades. There are other types of eye movements (see e.g. Land, 2006), but fixations and saccades are the most relevant in the context of viewing still images.
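To give a sense of scale for these angles, the following minimal sketch converts a visual angle into a physical extent on a display, using the standard relation size = 2 · d · tan(angle / 2). The viewing distance of 60 cm and the function name are assumptions made purely for illustration; they are not taken from the cited studies.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Return the extent (in cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Assumed viewing distance for illustration; not a value from the text.
VIEWING_DISTANCE_CM = 60.0

for angle in (0.3, 2.0):
    size = size_from_visual_angle(angle, VIEWING_DISTANCE_CM)
    print(f"{angle:.1f} deg at {VIEWING_DISTANCE_CM:.0f} cm covers about {size:.2f} cm")
```

At this assumed distance, the accurately perceived central region corresponds to roughly 0.3 to 2 cm on the screen.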

Viewing strategies are commonly measured in terms of fixation duration and saccade amplitude. Both are smaller in visual search than in scene perception, for example: fixations last 180-275 ms and saccades span about 3 degrees in visual search, compared with 260-330 ms and 4-5 degrees in scene perception (Rayner, 2009). The processing per fixation is therefore simple in a search task: deciding whether the target is there or not. However, it is important not to jump over the target, and to screen the whole image. What matters in scene perception is to fixate many aspects of important areas rather than all the areas. The duration of fixation has been associated with the difficulty of scene processing (Henderson, Nuthmann, & Luke, 2013): the longer the fixation, the deeper the processing tends to be (Holmqvist et al., 2011). However, the length of a fixation could also be related to how interesting the content is, as well as to impaired clarity. In other words, fixations may be long if there is a lot of information to be retrieved from one place or if the information is difficult to obtain. Moreover, gaze duration on one place (including several fixations) could be a better measure than the duration of single fixations in the assessment of viewing strategies in different tasks (Castelhano, Mack, & Henderson, 2009). The amplitude of saccades is related to task demands, workload, the stimulus and current cognitive processes: the more demanding and heavy the task is, the shorter the saccade amplitude (Holmqvist et al., 2011).
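The measures discussed above can be sketched in code. The example assumes that fixations have already been detected, that their positions are expressed in degrees of visual angle and their durations in milliseconds, and that each fixation carries a region label. The `Fixation` type, the function names and the data values are all hypothetical and serve only as an illustration; this is not the procedure used in the cited studies. Gaze duration is taken here to be the summed duration of consecutive fixations on the same region.

```python
import math
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Fixation:
    x_deg: float        # horizontal position, degrees of visual angle
    y_deg: float        # vertical position, degrees of visual angle
    duration_ms: float  # fixation duration in milliseconds
    region: str         # hypothetical label of the fixated region


def mean_fixation_duration(fixations):
    """Average duration of single fixations (ms)."""
    return sum(f.duration_ms for f in fixations) / len(fixations)


def saccade_amplitudes(fixations):
    """Distance between successive fixations (degrees).

    Euclidean distance in angular coordinates is an approximation
    that is reasonable for small amplitudes.
    """
    return [
        math.hypot(b.x_deg - a.x_deg, b.y_deg - a.y_deg)
        for a, b in zip(fixations, fixations[1:])
    ]


def gaze_durations(fixations):
    """Summed duration of each run of consecutive fixations on one region."""
    return [
        (region, sum(f.duration_ms for f in run))
        for region, run in groupby(fixations, key=lambda f: f.region)
    ]


# Made-up data for illustration only.
fixations = [
    Fixation(10, 10, 250, "face"),
    Fixation(11, 10, 300, "face"),
    Fixation(14, 14, 200, "background"),
]

print(mean_fixation_duration(fixations))
print(saccade_amplitudes(fixations))
print(gaze_durations(fixations))
```

In this made-up sequence the two fixations on the same region are each fairly short, but together they form a single, longer gaze, which illustrates why gaze duration and single-fixation duration can lead to different conclusions.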

Although a viewer’s attention is not always at the point of fixation, it is typically directed at the fixated location or at the next location to be fixated (Henderson, 2007). The fixated location is therefore considered a good enough approximation of where attention is allocated. Attention determines which information coming through the senses gains access to conscious processing and working memory (Baddeley, 2003). Working memory maintains and stores information in the short term, underlies human thought processes and is limited in capacity (Baddeley, 2003).

Attention comprises bottom-up and top-down processes. Bottom-up attention refers to salience filters in the central nervous system that are selective for properties of stimuli that are likely to be important (Knudsen, 2007). These properties are easily distinguished, and include movement and differing colours and orientations. Objects with such properties pop out of the scene without any mental effort (see Treisman & Gelade, 1980). As Le Callet and Niebur (2013) suggest, I refer to areas that are relevant in a strictly bottom-up sense as “salient”.

Top-down mechanisms stem from the aims behind actions and regulate the signal strengths of the different information channels that compete for access to working memory (Knudsen, 2007). Such mechanisms direct eye movements towards targets and improve the signal-to-noise ratio in all domains of information processing: sensory, motor, internal state and memory (Knudsen, 2007). They also direct the gaze to areas that are relevant to a certain action or task, and make the detection of important features more sensitive than that of non-task-relevant features. The areas of attentive focus are relevant because of their meaning to the task, and the process relies on both bottom-up and top-down information. Le Callet and Niebur (2013) call these “important areas”, but in this thesis I refer to them as semantic regions of interest (ROIs) so as to emphasise the interpretation of bottom-up features, which is what essentially distinguishes important areas from salient ones. The meaning of information coming through the senses is thus constantly being processed. However, knowing about attention and eye movements does not in itself suffice to explain the process of quality estimation. It is also necessary to understand the cognitive processes that enable us to act in our environment and to interpret the things we perceive.

Figure 3. The flow of visual-quality estimation, modified from Land (2009).

Distinct components of the gaze-action system have been identified: schema control, the gaze system, the visual system and the motor system (Land, 2009; Figure 3). Through these components individuals gather information from the outside world and use it to act in that world. The gaze system serves to locate information, thereby answering the question “where”, whereas the visual system responds to the “what” question and supplies information on which to base action (Land, 2009). There are also different neural routes in the perceptual system.

Land (2009) defines the schema system as determining where to look, what to look for and what to do. Its role is twofold: setting the goal of the current behaviour and determining the sequence of actions needed to achieve it.

Understanding a task requires an understanding of its schema: how the task should be done, what the important features are and how decisions should be made. I will now describe in more detail what happens in the interaction between a participant and images in a quality-estimation task.

1.4.2 Material-related influences on estimations of image-quality