Crossing the jungle: an analytical and experimental approach to activation profiles for audio-graphic navigation in clusters of leaves

This paper presents the work of graphic and sound designers. It is an experimental approach in which we analyze the audio-graphic characteristics of foliage through video and simple 2D and 3D simulation models. Within the project we work with the concept of an activation profile: a simple way to represent active, shaped event triggers.


Topophonies are virtual navigable sound spaces composed of sounding or audio-graphic objects. Graphic and sounding shapes or objects are audio-graphic when their visual and audio modalities are synchronized. In virtual reality and video games, we know how to make scenes composed of point-shaped graphic and sound elements (i.e. a spot representing an object). However, there is no navigation tool for scenes consisting of very large numbers of interactive visual and sound elements, or of dispersed elements such as a crowd, a flow of traffic, foliage or rain. The Topophonie research project proposes lines of research and innovative developments for sound and visual navigation in spaces composed of multiple, disseminated sound and visual elements (audio-graphic clusters). Working as a multidisciplinary scientific group (digital audio, visualization, sound design) with companies specialized in interactive multimedia, the Topophonie project develops models, interfaces and renderings of audio-graphic clusters. The project team is composed of researchers specialized in granular sound rendering and advanced interactive graphic rendering, as well as digital designers and companies specialized in the relevant fields of application. The first task of the project was to analyze and formalize several representation models; foliage is one of them.

This paper is part of the Topophonie research project, the aim of which is to navigate within audio-graphic clusters. Clusters are wide ranges of objects of the same class. By audio-graphic, we mean synchronized audio and graphic object behavior: both modalities are driven by a single action. Examples of such objects include rain, flocks, grains, etc.

This paper focuses on foliage as clusters of leaves. We selected two main audio and visual behaviors in order to find an effective and inexpensive way to simulate them: the wind, and a first-person character crossing the foliage. This paper presents the work of graphic and sound designers. It is an experimental approach in which we analyze the audio-graphic characteristics of foliage through video and through 2D and 3D simulation models built with popular software. Within the project, we work with the concept of an activation profile: a simple way to represent active, shaped event triggers.

We can illustrate the concept of “activation profile in clusters of leaves” with the body of Tarzan crossing the jungle hanging from a vine.
We needed to be sure that this concept was perceptible. Therefore, we compared the user experience of two different symbolic activation profiles: a point symbolizing the hand of a player, and a line symbolizing the path of the wind.
We conclude with a 3D interactive scene that simulates audio-graphic navigation in clusters of foliage with different activation profiles.


Fig. 1. Video shots of wind, camera and human body crossing various species of foliage

These video captures show some audio-graphic characteristics: the vegetal sound due to collisions between leaves, the plastic sound of the camera, the cloth and flesh sounds of the human body, and the graphic behavior of the different species of plants during and after the crossing movement.

It is difficult to find conditions where the wind blows strongly on the foliage and the microphone is not affected by the storm. For the passage of the camera through the foliage, the microphone itself and the plastic case of the camera interfere when colliding with leaves and branches, creating mechanical and metallic noises that appear out of context. A completely soundproof cage would have been rather complex to build.

Nevertheless, using a body part (hand, arm, torso or foot) is natural and produces a more convincing sound when crossing foliage. For other activation profiles, such as small insects or big monsters, we thought it would be easier to simulate them in the studio afterwards. Certain sound sequences seem fake because the sound produced by the body of the cameraman is added. Carrying the camera through foliage also activates leaves outside the camera's field. Therefore, the audio and the video images may not always be coherent. As a consequence, coherence should be evaluated according to the precision of the physical interaction and the audio-graphic rendering within the camera's field of view.

The literature on audiovisual synchronization and cross-modal perception shows that the audio and the visual are complementary, and that synchronization can vary considerably and still be meaningful (cf. ventriloquism, the McGurk effect, and the work of Jean Vroomen and Beatrice de Gelder).

However, to be effective, sound should, one way or another, stick to the visual events that appear in the field of the camera: it works better when sounds are related to visual events happening within the frame. An unvisualized event, outside the camera's field, often interferes with the understanding of active events. As the colliding object (the camera or the first-person character) is neither visible nor definite, we can be quite tolerant about its material or mental representation.


In order to build our own foliage sound library covering a variety of aspects and species, we recorded the manipulation of several branches and leaves in the studio. We also recorded some foliage sounds outdoors, but it was too difficult to avoid traffic noise and to resolve the wind and activation problems. We manipulated the branches more or less violently to produce various sound movements and imitate typical crossing effects: the passage of the wind through foliage, or the passage of a hand or an object over a leaf or a group of leaves. Video shots are important to correlate the manipulation of the foliage with the sound produced by the leaves. Actions were carried out with a hand, another branch or other leaves: caressing, creasing, tearing away, shaking, crushing and hitting one or several leaves.

The remarkable sound differences we have noticed are related to the following criteria:

  • The inflorescence: the number of leaves, their size and shape, the proximity of the leaves, their spatial distribution, the global architecture, etc.
  • The material: the texture of the leaves and their state of drought, the plasticity of the branch, deformation, elasticity, overlap, bruising, etc.
  • The energy of the manipulation, the kinds of gestures made with the hand or other leaves: speed, movements, etc.

Manipulation of various foliage species:
Sound samples classified by species:

In doing this work, we noticed that when listening to the sounds without the image of the movement, they all sound more or less the same and carry little meaning; in particular, it is difficult to imagine the actual movement. This may sound obvious, but vision adds very important information for understanding what we hear: spatial origin, causal action, the physical reason for a specific sonic particularity, the activation mode and the action.

Therefore, it seems that synchronization matters more for realism than sound timbre does.
In L'Audio-vision [1], Michel Chion analyzes the perception of an extract of Ingmar Bergman's Persona (1966) in three passes: sound and image, image alone, and then sound alone.
Applying this method to videos of simple sound actions such as manipulating foliage proved to be an extremely interesting experience, and it became even more interesting when manipulating the simulation in real time. We assumed that anyone should be able to tell how convincing and coherent an audio-graphic simulation of foliage navigation is. We therefore asked a small panel of students to manipulate the interactive simulation with sound only, image only, and both sound and image.


The aim of the experiments was to determine the importance of audio-graphic synchronization in interactive manipulation and the relevance of varying the activation profile. All the tests use the same foliage scene: the image of a tree whose leaves can be touched by rolling the mouse cursor over them, each emitting a light leaf noise from our sound library. The tests were conducted with headphones.
Two kinds of activation profiles were tested:
(Fig. 2) A linear profile, a vertical line representing the passage of the wind through the tree and triggering a larger quantity of leaves; and a punctual profile, a dot symbolizing the hand of the player.
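The difference between the two profiles can be pictured as a simple 2D hit test. The sketch below is purely illustrative (not the actual test implementation): it assumes leaves are points scattered over the tree crown, with hypothetical activation radii for the dot and the line.

```python
import math
import random

random.seed(1)

# Hypothetical 2D model: 500 leaves scattered over a 100 x 100 tree crown.
leaves = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(500)]

def punctual_activation(cursor, radius=5.0):
    """Dot profile (the player's hand): activate the leaves near the cursor."""
    cx, cy = cursor
    return [p for p in leaves if math.hypot(p[0] - cx, p[1] - cy) <= radius]

def linear_activation(x_line, half_width=5.0):
    """Vertical-line profile (the wind path): activate every leaf whose
    horizontal distance to the line is small enough, over the whole crown."""
    return [p for p in leaves if abs(p[0] - x_line) <= half_width]

point_hits = punctual_activation((50, 50))
line_hits = linear_activation(50)
# The line sweeps the whole height of the crown, so it triggers far more
# leaves than the dot, as observed in the experiment.
print(len(point_hits), len(line_hits))
```

The same geometric test, applied per frame to the cursor position, is all that is needed to drive both sound triggering and the visual agitation of the touched leaves.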

Exp. 1. First we asked users to navigate with sound only, on a black screen, and then questioned them about this blind experiment.
(file A):
(file B):
Interpretation: All testers heard a sound difference between profiles A and B. Non-specialized listeners often have difficulty adopting a precise vocabulary without a visual reference. Only one person detected a larger quantity of sound elements in file A.

Exp. 2. In the second experiment, we asked users to navigate with the image and no sound, in order to separate the perceptual analyses.
(Graphic 1):
(Graphic 2):
Interpretation: The majority of users successfully linked sound and image. We consider that finding the right answer demonstrates the audio-graphic coherence.

Exp. 3. We then presented the audio-graphic version, with the image and the associated sound. We asked users what the audio-graphic version adds to the experience, and about the relevance of using different activation profiles, i.e. the punctual activation profile:
vs. linear activation profile:
Interpretation: According to the answers to this questionnaire, it seems that the audio-graphic version of the interactive profiles gives more information about the navigation than either the sound-only or the image-only versions. It is also clear that the difference between the two profiles is perceived and understood much more easily in the audio-graphic version than in the single-modality ones, and therefore makes sense.


Using profiles enables the inter-penetrability of clusters or complex objects to be simulated: for example, the collision of a hand with foliage, or a collision between two clusters of foliage. In the case of manipulated foliage, the number of collisions and sound parameters would be too complex to model directly (inflorescence, material parameters, multiple triggers, etc.). Within the Topophonie project, we have developed methods to simplify the use of profiles.
In the following sections we develop the generic notion of a profile. For example, triggers can be on/off or progressive, and they can act as a source, as an activator, or as both. This paper focuses only on activation profiles. Clusters of triggers are one way to simulate complex profiles; they can have different shapes and sizes.

The sounds triggered by collisions of foliage are multiple. For example, a breeze of wind in a birch does not produce the same sound as a gust of wind in a palm tree.
We could use progressive profiles with varying unit sizes in order to increase and decrease the activation within a profile. The activation variables can be volume, density, strength, quantity, speed, agitation, friction, scratching, or sounds and visual effects produced by other sources, etc.
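As an illustration of such a progressive profile, the following sketch assumes a hypothetical Gaussian-like falloff that maps a leaf's distance from the profile centre to an activation intensity, which then scales two of the variables above (volume and agitation). The functions and parameter names are ours, not the project's implementation.

```python
import math

def activation_intensity(distance, profile_size):
    """Hypothetical progressive profile: activation falls off smoothly
    with distance from the profile centre (Gaussian-like falloff)."""
    return math.exp(-((distance / profile_size) ** 2))

def leaf_parameters(distance, profile_size, max_volume=1.0, max_agitation=10.0):
    """Map a single activation intensity onto several audio-graphic
    variables at once (here: sound volume and visual agitation)."""
    a = activation_intensity(distance, profile_size)
    return {"volume": max_volume * a, "agitation": max_agitation * a}

# A larger profile (a gust rather than a breeze) activates distant
# leaves more strongly than a small one does.
print(leaf_parameters(4.0, 2.0))  # small profile: weak activation
print(leaf_parameters(4.0, 8.0))  # large profile: strong activation
```

Any monotonically decreasing falloff would serve the same purpose; the point is that one scalar intensity per unit keeps the audio and graphic variables synchronized by construction.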

We are now working on more elaborate activation profiles and their audio-graphic behavior and renderings.
(Fig. 3):

Crossing foliage is an uncommon experience. The player is immersed in a 3D environment representing an impassable jungle. In some cases the user may try to circumvent the obstacles represented by the trees rather than passing through them, but in order to move through the 3D space, the player needs to cross the foliage. When the first-person character is visible, we can assign a specific form to its activation profile, illustrating the crossing of foliage as a breeze of wind, a hand or a stick. Displaying the activation profile seems necessary to justify the collisions. The visual effects are not yet the ones we expect: each collision should clearly trigger a coherent visual action. This model demonstrates the need for consistency between the different activation profiles and the collision sounds they match.

Sound results classified by activation profile shape:

  1. The plane: collisions are less frequent and seem unrealistic.
  2. The cluster of points: we obtain distinguishable iterations, and the multiple points of contact create more interaction in the gameplay.
  3. The cluster of spheres of various sizes: the progressive approach is more sensitive, as the sounds depend on the size of the spheres.
  4. The surface of a capsule: we get a collision at the entrance and exit of the surface, which does not seem coherent.

We can improve the collision by distinguishing two modes:

  • Trigger mode (triggering the first collision)
  • Continuous mode (continuous control during the contact with the shape).
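The two modes can be sketched as a small state machine. This is a hypothetical illustration (the class and method names are ours, not the project's): a one-shot event is fired on first contact with the profile shape, and a continuous control value is emitted for as long as the contact lasts.

```python
class CollisionHandler:
    """Sketch of the two collision modes for one activation profile unit:
    a one-shot trigger on entering the shape, and a continuous control
    signal (e.g. friction level) while the contact lasts."""

    def __init__(self):
        self.inside = False  # were we in contact on the previous frame?

    def update(self, overlapping, penetration_depth=0.0):
        events = []
        if overlapping and not self.inside:
            # Trigger mode: fire exactly once on first contact.
            events.append(("trigger", 1.0))
        if overlapping:
            # Continuous mode: emit a control value every frame of contact.
            events.append(("continuous", penetration_depth))
        self.inside = overlapping
        return events

h = CollisionHandler()
print(h.update(True, 0.2))   # first frame of contact: trigger + continuous
print(h.update(True, 0.5))   # still in contact: continuous only
print(h.update(False))       # contact ended: no events
```

In practice the trigger events would start short collision samples, while the continuous values would drive sustained friction or rustling textures and the visual agitation of the leaves.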

We have also noticed that the visual shape of the profile influences the way the user interprets the audio-graphic behavior.


The relevance of using different activation profiles was demonstrated by the user experiments. In the 3D interactive scene, we have tested various kinds of activation profiles to control different sound behaviors. The shape of the activation profile and the sound behavior both determine the meaning and understanding of the interaction. The number of units in clusters and the synchronization of these numerous collisions must be precise enough to express these rich interactions.

Our experiments and analyses show that activation profiles should be made visible in order to visualize the interaction: players have a better feeling of navigation when they can anticipate collisions.

This experimental approach brings a new point of view to synchronized visual and sound modeling. It allows us to imagine new forms of audio-graphic expression, navigating across landscapes and soundscapes: the user plays both sound and graphic actors with his own movements, crossing activation profiles like Tarzan crossing the jungle.

Topophonie Project
ENSCI les Ateliers,
Agence Nationale de la Recherche,
Pôle de compétitivité CAP DIGITAL
Marie-Julie Bourgeois & Roland Cahen, 2011

References and Notes: 
  1. Michel Chion, L'Audio-vision: Son et Image au Cinéma (Paris: Nathan-Université, série "Cinéma et Image", 1991).