Eye Gaze as a Vehicle for Aesthetic Interaction: Affective Visualisation for Immersive User Experience

The paper explores the aesthetic potential of affective visualisation technology that responds to the affective changes of the user by reading their eye gaze. Following a critical investigation of current practices, the research is exemplified by a practical application of affective visualisation that explores the interconnectivity of visualisations of swarm intelligence and the dynamic affective changes of the user's immersive experience.


The Philosophy of Eye: The Eye-Mind Problem

Throughout the study of eye movement, researchers have attempted to provide comprehensive models of how eye gaze can serve as a display of the mind; however, most research outcomes have failed to provide inclusive explanations. Early research conveyed two significant but very different claims that have remained relevant, particularly in their accounts of the relationship between attention and eye movement. Both challenge the importance of the eye in processing information and give significance to the underlying voluntary processes of the mind as the control mechanism of attention or imagination.

Hermann von Helmholtz was one of the first scholars who attempted to explain how the mind is instrumental to the patterns of eye movements. He states that the eye's optical characteristics and information gathering are rather poor; vision is therefore only possible with some form of unconscious inference that makes sense of the information based on prior experiences of the world. [1] Studying the relationship between eye movements and visual attention, von Helmholtz discovered the phenomenon of ‘covert attention’: visual attention is not always where eye ‘fixation’ (holding eye gaze on a single location) is directed, and one can attend to a stimulus without shifting visual focus. Observing letters on a screen that was too large to view at once, he noticed that without moving his eyes he could ‘covertly’ attend to any location on the screen. Although von Helmholtz's observation on visual attention raises significant questions about the relationship between eye movements and cognitive processing, eye movements most commonly reflect a will to attend to objects in detail (‘overt attention’).

William James, reflecting on von Helmholtz's account, provides an explanation of a very different aspect of visual experience in his discussion of the ‘embodied eye’. James suggested that attention “is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought…It implies withdrawal from some things in order to deal effectively with others.” [2] He claims that attention and imagination are directly related; when attention does not regulate one's sense organs, it is imagining the things or actions that one is attending to, or looking for. James states that paying attention to what one is doing often consists of a similar kind of anticipatory imaginative engagement. As with von Helmholtz's covert attention, James' ‘imaginatory’ attention gives prominence to the complexity of the eye-mind relationship, acknowledging that although the quality of motion and fixations might be measurable, their cause is not fully determinable.

The Physiology of the Dynamic Eye

In order to better understand the physiological capacities of the eye in reference to aesthetic experience, this section elaborates on the eye’s basic functionalities and characteristics. It is well known that the size of the visual field is limited and can be divided into ‘foveal vision’ (encompassing 2 degrees at the centre of the visual field), responsible for sharp, detailed sight; ‘parafoveal vision’ (2-10 degrees off centre), responsible for low-resolution, compressed information; and the ‘peripheral field’ (>10 degrees off centre). Because the acuity of the retina is limited, the visual field cannot be processed from a single fixation (lasting between 180 and 275 ms); rapid eye movements (saccades, lasting between 10 ms and 100 ms) are necessary to bring the retinal image of an object of interest onto the fovea. Attention assigns a target before a saccadic eye movement happens. During saccades vision is dormant, and new information is acquired only during fixation.
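The division of the visual field described above can be sketched as a small classification by eccentricity. This is only an illustration: the 2 and 10 degree boundaries come from the text, whereas the flat-screen geometry, the function names and the choice to treat 2 degrees as the off-centre limit of foveal vision are assumptions made for the example.

```python
import math

# Boundaries in degrees of visual angle off centre, taken from the text above.
# Treating 2 degrees as the outer limit of foveal vision is an assumption.
FOVEAL_LIMIT_DEG = 2.0
PARAFOVEAL_LIMIT_DEG = 10.0


def eccentricity_deg(gaze_xy, target_xy, viewing_distance):
    """Visual angle between the gaze point and a target on a flat screen.

    Positions and viewing_distance share one unit (e.g. centimetres).
    Assumes the gaze point lies straight ahead of the eye (hypothetical setup).
    """
    offset = math.dist(gaze_xy, target_xy)
    return math.degrees(math.atan2(offset, viewing_distance))


def visual_region(ecc_deg):
    """Name the region of the visual field for a given eccentricity."""
    if ecc_deg < FOVEAL_LIMIT_DEG:
        return "foveal"
    if ecc_deg <= PARAFOVEAL_LIMIT_DEG:
        return "parafoveal"
    return "peripheral"
```

For instance, an object 5 cm away from the gaze point on a screen viewed from 60 cm lies at roughly 4.8 degrees and would therefore fall into parafoveal vision under these assumptions.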

For eye movement analysis there are three types of movement that might be modelled to understand the overt localisation of visual attention (when attention and eye location are matched). Fixations, ‘smooth pursuits’ and saccades are all under ‘voluntary’ control; as such, they are the result of intentional decision making (whilst ‘involuntary’ movements, such as micro-saccades, are unconscious). Whilst saccades are rapid jumps of the eye that shift gaze to the object of interest, fixations and smooth pursuits occur during the intermission between saccades. In smooth pursuit the eye tracks a moving object and compensates for the velocity of the moving object on the retina. Some of these voluntary eye movements can be practised and improved with control; however, a saccade cannot be interrupted. Involuntary eyelid movements, such as blinking, are frequently used to measure the affective states of the user.
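One common way to separate these three movement types in recorded gaze data is a velocity threshold, sketched below. This is not a method described in this paper; the two thresholds (in degrees per second), the one-dimensional gaze angle and the function name are assumptions chosen purely for illustration, and velocity alone cannot reliably distinguish slow pursuit from drift during fixation.

```python
def classify_samples(angles_deg, sample_rate_hz,
                     saccade_threshold=30.0, pursuit_threshold=5.0):
    """Label each interval between consecutive gaze samples.

    angles_deg: gaze direction along one axis, in degrees, one value per sample.
    Both thresholds are in degrees per second and are illustrative assumptions.
    """
    labels = []
    for previous, current in zip(angles_deg, angles_deg[1:]):
        # Angular velocity over one sampling interval.
        velocity = abs(current - previous) * sample_rate_hz
        if velocity >= saccade_threshold:
            labels.append("saccade")
        elif velocity >= pursuit_threshold:
            labels.append("smooth pursuit")
        else:
            labels.append("fixation")
    return labels
```

A real-time system could apply the same test to each incoming sample rather than to a stored recording, which is closer to the dynamic use of eye gaze argued for later in the paper.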

Aesthetic Experience and Eye Movement Research: A Critical Revisit

Since eye movement research and its methods of recording often lend themselves to reducing aesthetic experience to characteristics of behaviour, psychological accounts have been popular. An early example is Alfred L. Yarbus, who used a ‘scanpath’ (a graph of saccades and fixations) and the visual recording of eye movements to study the viewing of complex scenes and to identify mainly task-dependent patterns of fixations. Yarbus' well-known study of ‘An Unexpected Visitor’ proposes that the interpretation of a composition can be based merely on where the viewer looked on the image. [3] Like Donald W. Graham, who described composition as guiding the viewer's eye along a pleasing path through the visual elements of the scene, he implies that composition carries the artist's decisions into the viewer's experience of looking. Yarbus explained the viewing process more as a task to be solved, and showed that the viewer reinvestigates elements of the painting that promise to explain the image. He acknowledged that: “these elements show that they give information allowing the meaning of the picture to be obtained. Eye-movements reflect the human thought processes.” [3] Yarbus’ approach neglected the multidimensionality of the aesthetic quality of the image, resulting in a limited account of aesthetic experience that focused only on visual attention in the viewing experience.

An often-used concept in eye movement research is Daniel Berlyne’s description of the ‘diversive-specific’ behavioural patterns of the viewer. [4] Berlyne's approach has been largely disregarded in contemporary psychology, yet current eye movement research still applies his method. [5] According to these studies, ‘diversive exploratory behaviour’ occurs when the viewer seeks out stimulation that has appealing collative properties (such as complexity, novelty, surprise and uncertainty, which can trigger hedonic effects of arousal [4]) regardless of the content or source; ‘specific exploratory behaviour’ occurs when the viewer's curiosity is aroused by uncertainty about, or a lack of perception of, particular information in the image. [5] The two types of behaviour exhibit different patterns of eye fixations: the former shows diffuse clusters of fixations, the latter a high density of fixations. The initial use of Berlyne’s work by Francois Molnar suggested that whilst knowledge-based exploration is slow and purposeful, and therefore specific, pleasure-based exploration is diversive. He furthermore proposed that good and bad composition can be shown by the number of transitions on the scanpath of the image before the exploration comes to equilibrium; he concluded that aesthetic engagement happens early in viewing and that good composition needs fewer transitions.
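The transition count that Molnar relies on can be illustrated with a few lines of code. The sketch assumes that each fixation has already been assigned to a labelled region of the image; how such regions are defined is not specified here, so the labels and the function name are hypothetical.

```python
def count_transitions(fixation_regions):
    """Count how often consecutive fixations fall in different regions.

    fixation_regions: region labels in viewing order, e.g. ["A", "A", "B"].
    The labels are hypothetical placeholders for areas of the image.
    """
    return sum(
        1
        for previous, current in zip(fixation_regions, fixation_regions[1:])
        if previous != current
    )
```

On this reading, a scanpath such as A, A, B, A, C, C contains three transitions. The example also makes the limitation discussed in this section visible: the count is a single static value that discards the timing and dynamics of the viewing experience.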

Following these foundational studies, there have been many valuable contributions investigating aesthetic experience based upon discrete characteristics of the viewing experience: for example, evaluating aesthetic appreciation between art-trained and untrained viewers, understanding artists' perception in drawing, or developing multidimensional methods that include verbal recordings, task-directed recordings and recordings of hand movements. [5] However, the major methodological problem of such research has remained the same: the limited capacity of scanpath applications, which are forced to evaluate the dynamic experience of the viewer through the analysis of static values. As a consequence, the application of reductionist conceptions of human experience has generated a fragmented account of aesthetics that has broadly influenced the understanding of eye movements in aesthetic experience and limited its future applications.

Towards an Affective Eye: Eye Movement as Aesthetic Interaction

Mark Johnson declared that: “aesthetics must not be narrowly constructed as the study of art and so-called aesthetic experience. Instead, aesthetics becomes the study of everything that goes into human capacity to make and experience meaning.” [6] Following Johnson, this research also attempts to explain aesthetic experience (and aesthetic interaction) as the creation of meaning by an embodied mind. Applying Johnson's approach, human capacity in experience is explained here as affection, when “meaning grows from our visceral connections to life and the bodily conditions of life… the bodily sources of meaning.” [6] In a similar manner, Brian Massumi, referring to Deleuze and Guattari, describes affection as: “an ability to affect and be affected. It is a prepersonal intensity corresponding to the passage from one experiential state of the body to another and implying an augmentation or diminution in that body’s capacity to act.” [7] As such, the aim of an aesthetic experience is to understand the meaning of one's everyday ‘affections’ in order to facilitate effective engagement of the user with novel meaning creation. John Dewey explains this as follows: “to understand the meaning of artistic products, we have to forget them for a time, to turn aside from them and have recourse to the ordinary forces and conditions of experience.” [8] He goes on to state: “that experience becomes conscious, a matter of perception, only when meanings enter it that are derived from prior experiences.” [8]

Following up on these arguments, in order to understand eye movements it is crucial to comprehend them as an enactive capacity that produces embodied meaning through its actions. The dynamicity of the eye, in this sense, is crucial for understanding aesthetic experience; any attempt to represent this dynamic action can lead to a reduction in its characteristics. Therefore, in the framework of this research it is suggested that eye gaze in aesthetic interaction should be applied as a real-time property rather than a static value. As such, this investigation sets aside scanpath-based methods of eye movement analysis and introduces ‘aesthetic interaction’ for aesthetic meaning creation.

Aesthetic interaction moves away from common conceptions of human-computer interfaces that focus on the ‘invisibility’ of the interface as the most imperative facet of the relationship between human and computer. Rather, aesthetic interaction requires a view in which the system is a framework that facilitates the user's expression and interpretation, promoting serendipity, provocation, surprise or, as Umberto Eco referred to it, wonderment. In this sense aesthetic interaction acknowledges the ability of the user to appropriate technology; instead of immediate invisibility, it offers an intellectual reflection process in which the user's interpretations are instrumental to the system. As Graves Petersen et al. explain, aesthetic interaction “promotes improvising to be the key modality in how the user explores the worlds around her and learn new aspects.” [9] Similar to the approach presented in this paper, they take Dewey's pragmatic approach further and explain bodily experience as a significant aspect of the interaction, adding that: “we have to move beyond ideals of meeting human sensor motor skills and somatic sensing, to include among others human intellectual capacity to grasp and make sense of complex, contradictory and even ambiguous systems and situations.” [9]

In summary, it is argued that real-time dynamic processes allow a meaningful exploitation of eye movement for particular aesthetic production. As a result, an open system is established in which meaning is enacted through an indirect response mechanism, where the user's curiosity and imagination drive the interaction towards immersive states. The aim of such interaction is not to gain full control or full invisibility of the technology but to engage the viewer in a self-explanatory process of interaction through the movement of their eyes. An aesthetic interactive system is therefore designed to respond to ‘augmentations and diminutions’ of the body [7] (in this case the eye) and to produce responses to the anticipated affective state; meaning emerges through the continuous cognitive loop between the eye and the system.

Eye Gaze Driven Affective Visualisation and Swarm Behaviour

Having explored the aesthetic meaning of eye movement, this section introduces the concept of affective visualisation: a visual display that facilitates aesthetic experience through eye gaze. Data visualisation is generally described as “computer-supported, interactive, visual representations of abstract data used to amplify cognition.” [10] An affective visualisation is an interactive system in which real-time data from users is collected and fed back to them after an evaluation of their affective state. The feedback mechanism is crucial to the interaction, as the real-time flow of data is visualised to provoke an aesthetic experience for the user. In this sense the visualisation is not specifically aimed at representing data but at reflecting the dynamic qualities of the data flow.

Affective modelling of the user in affective visualisation is an aesthetic exploitation of the feedback mechanism between technological effect and affective human response. The coupling of the dynamic flow of visual elements (effect) with eye movements as the passage between experiential states of the body (affect) is explained here as a ‘cognitive feedback loop.’ Such couplings entail an augmentation in the body’s capacity to act, which promotes the user's involvement toward unexplored states of immersion. In application, a cognitive feedback loop is an open system in which, instead of assigning discrete values to affective states, the system allocates meaning to the changes in the affective qualities of the user. This approach is similar to the so-called ‘affective loop’ concept introduced in HCI research, as both concepts emphasise an affective input and output modality in order to facilitate unique and individual experiences for the user. However, the cognitive feedback loop specifically builds upon the real-time dynamicity of interaction, where aesthetic qualities do not represent affective states but trigger affection in real time; this is a significant distinction, as meaning here is linked to dynamic events rather than passive qualities.

An example application of the model described above is that developed for the ‘Mind Cupola’ biofeedback installation work. The main aesthetic concept of the display is to generate an open system that organises information along the user's eye movements in a way that it reflects their behaviour. The underlying mechanism of the visualisation is to guide the user towards a state of equilibrium, where the user's interaction is balanced between control and aesthetic satisfaction. There is no particular goal in the system other than to explore whether this process of interaction might activate imaginary capacities in the user experience by the engagement of meaning creation.

The affective modelling of the user is based on three characteristics: aesthetic engagement with the screen (level of engagement), task-driven interaction (as level of attention, engagement and performance) and the measurement of involuntary responses in the form of eyelid movements (blink rate, blink closure duration). The system incorporates all voluntary eye movements, such as saccades, smooth pursuits and fixations; involuntary eyelid measurements also capture the user's affective responses over time.
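As an illustration of the two eyelid measurements named above, the following sketch derives blink rate and mean closure duration from a stream of eyelid states. It is a minimal example under stated assumptions, not the implementation used in the installation: the boolean input signal, the sampling rate parameter and the function name are all hypothetical.

```python
def blink_metrics(eye_closed, sample_rate_hz):
    """Return (blinks per minute, mean closure duration in seconds).

    eye_closed: sequence of booleans, True while the eyelid is closed.
    The input format is an assumption made for this sketch.
    """
    closures = []  # length of each closed run, in samples
    run = 0
    for closed in eye_closed:
        if closed:
            run += 1
        elif run:
            closures.append(run)
            run = 0
    if run:
        closures.append(run)  # the recording ended while the eye was closed

    total_seconds = len(eye_closed) / sample_rate_hz
    if total_seconds == 0:
        return 0.0, 0.0
    blink_rate = len(closures) / total_seconds * 60.0
    if closures:
        mean_closure = sum(closures) / len(closures) / sample_rate_hz
    else:
        mean_closure = 0.0
    return blink_rate, mean_closure
```

In a real-time setting the same calculation would typically run over a sliding window, so that changes in blink behaviour over time, rather than a single summary value, feed the evaluation of the user's affective state.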

The visual display consists of a particle system (a collection of independent objects) that responds to the user's eye movement by exhibiting one of three different intelligent behaviours according to the user’s responses. This intelligent behaviour can be described as a form of ‘swarm intelligence,’ designed to visualise the natural phenomena of fish shoals, bird flocks and swarms of insects. The particular patterns (from ellipsoids to vortex arrangements), with emergent qualities such as speed, density and colour, are dynamically assigned to the changes in the user's evaluated affective states. For example, fish shoal patterns are assigned to a low engagement level, bird flocks to optimal performance and insect swarm behaviour to an erratic engagement level. The particle system not only produces swarm behaviours but also forms simple messages, as affective texts or recognisable shapes with affective meaning. These serve as a feedback mechanism for the participant, informing them of their performance over time and enhancing their experience. As a result, aesthetic experiences are encouraged through the dynamic and affective quality of the patterns and their responsiveness to eye movement.
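The assignment of swarm patterns to evaluated states described above amounts to a simple lookup, sketched here. Only the three pairings are taken from the text; the enumeration, its member names and the helper function are hypothetical, and how the system arrives at a given state is deliberately left out.

```python
from enum import Enum


class AffectiveState(Enum):
    """Evaluated states named in the text; the identifiers are hypothetical."""
    LOW_ENGAGEMENT = "low engagement"
    OPTIMAL_PERFORMANCE = "optimal performance"
    ERRATIC_ENGAGEMENT = "erratic engagement"


# Pairings as given in the text above.
SWARM_PATTERN = {
    AffectiveState.LOW_ENGAGEMENT: "fish shoal",
    AffectiveState.OPTIMAL_PERFORMANCE: "bird flock",
    AffectiveState.ERRATIC_ENGAGEMENT: "insect swarm",
}


def pattern_for(state):
    """Return the swarm pattern assigned to an evaluated affective state."""
    return SWARM_PATTERN[state]
```

Because the visualisation responds to changes in state rather than to fixed values, such a lookup would be re-evaluated continuously, with speed, density and colour varied within each pattern.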

A significant aspect of the aesthetic experience is the relationship between these emergent patterns and eye saccades. The swarm might follow or avoid the focus of the eye, which can be described as ‘predator’ or ‘guider’ behaviours. This is further used as a feedback mechanism for the user; in such a situation the system starts to distribute the swarm in such a way that it is shifted to the parafoveal vision and the peripheral field. This returns to von Helmholtz's concept of covert attention, in that the user is asked to distribute or move their attention outside their foveal vision. The task-driven aspects of the interaction arise when the user is asked to guide a swarm. Smooth pursuit eye movements are applied here to follow a particular path, and fixations when users are asked to hold their focus in order to keep the swarm in a particular spot of the display and avoid, for example, objects with predator behaviour. In equilibrium the patterns become more harmonic, with no predators; the aim is a pleasurable interaction based on the aesthetic engagement of movements.
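The follow and avoid responses to the focus of the eye can be sketched for a single particle as below. This is a deliberately minimal illustration and an assumption about one possible mechanism, not the installation's algorithm: the swarm rules that keep particles together are omitted, and the function name, the mode strings and the fixed step size are hypothetical.

```python
import math


def steer(particle_xy, gaze_xy, mode, step=1.0):
    """Move one particle toward ("follow") or away from ("avoid") the gaze point.

    Returns the new (x, y) position. Names and step size are assumptions.
    """
    if mode not in ("follow", "avoid"):
        raise ValueError("mode must be 'follow' or 'avoid'")
    dx = gaze_xy[0] - particle_xy[0]
    dy = gaze_xy[1] - particle_xy[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        # Direction is undefined; leave the particle where it is.
        return (particle_xy[0], particle_xy[1])
    if mode == "follow":
        # Do not overshoot the gaze point.
        move = min(step, distance)
    else:
        move = -step
    return (particle_xy[0] + move * dx / distance,
            particle_xy[1] + move * dy / distance)
```

Applying the "avoid" mode repeatedly would push particles out of the centre of gaze and toward the parafoveal and peripheral field, which is the kind of redistribution described in the paragraph above.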

The aim of this paper has been to discuss a non-reductive model of aesthetic experience that produces an affective engagement of the user based on information from the eye gaze. The semantics of this affective visualisation were based on the voluntary movement of the eye, whilst both voluntary and involuntary responses were evaluated in order to generate a cognitive feedback loop. It has been proposed that the collective behaviour of swarms might well simulate affective states or consciousness. As Johnson explains, aesthetic qualities aim to trigger new meaning through the body; through the relative quality of visual information and affective intelligence, users re-evaluate their everyday experiences of viewing and attach new meanings to their actions. It is hoped that the critical evaluation offered by this research will stimulate further ways of applying theories of aesthetic experience and affective visualisation that aim to generate an open system for unique immersive experience.

An extended version of this paper with full references can be accessed at cognitiveloop.org.

References and Notes: 
  1. H. V. Helmholtz, Treatise on Physiological Optics (Rochester: Continuum, 1911).
  2. W. James, The Principles of Psychology (New York: Henry Holt, 1890), 403-404.
  3. A. L. Yarbus, Eye Movements and Vision (New York: Plenum, 1967), 190.
  4. D. E. Berlyne, Aesthetics and Psychobiology (New York: Appleton-Century-Crofts, 1971).
  5. C. F. Nodine, P. J. Locher, and E. A. Krupinski, “The Role of Formal Art Training on Perception and Aesthetic Judgment of Art Compositions,” in Leonardo 26, no. 3 (1993): 219-227.
  6. M. Johnson, The Meaning of the Body: Aesthetics of Human Understanding (University of Chicago, 2007), ix-x.
  7. B. Massumi, “Notes on the Translation and Acknowledgements,” in A Thousand Plateaus, G. Deleuze and F. Guattari (Minneapolis, MN: University of Minnesota Press, 1987), xvii.
  8. J. Dewey, Art as Experience (New York: Minton-Balch, 1934), 2, 283.
  9. M. G. Petersen, O. S. Iversen, P. G. Krogh, and M. Ludvigsen, “Aesthetic Interaction – A Pragmatist’s Aesthetics of Interactive Systems,” in Proceedings of the 5th Conference on Designing Interactive Systems (DIS 04), New York, (2004): 271.
  10. S. Card, J. Mackinlay, B. Shneiderman, Readings in Information Visualization: Using Vision to Think (San Francisco, CA: Morgan Kaufmann Publishers, 1999), 7.