‘WOODEN WORLDS’ - Aesthetical and Technical aspects of a Multimedia Performance using Real-time Interaction

Wooden Worlds is an audiovisual, interactive performance by Claudia Robles Angel and Javier A. Garavaglia. The piece, of variable length, is a complex multimedia performance in which viola, video, photography, soundscapes, live-electronics and live processing of pre-recorded sounds interact with each other in real time, at the intersection of art, science and technology. The paper describes the technical and aesthetical aspects of the work.



This paper describes the intention, aesthetical principles and technical aspects of ‘Wooden Worlds’, an interactive audiovisual performance by Claudia Robles Angel and Javier A. Garavaglia. The piece is a multimedia performance developed from several different sound and visual layers, all of which interact with each other in real time. Together they create an atmospheric constellation. The attention of the audience is challenged by the piece’s audiovisual elements, which in most cases are not recognisable at first sight. The viola, played live on stage, acts as an element of synergy between the diverse audiovisual elements, with both composed and improvised musical passages. The performance requires two computers on stage, both running the software package MAX/MSP/Jitter for real-time interaction.

The piece works with a world of sound and image that is directly or indirectly connected to wood, particularly in the way wood appears in nature, namely in the form of trees or of tree cortices. Nevertheless, the idea is not only to show wood in its naked reality, but also to use images and objects of particular forms and characteristics, which, whilst made of wood, cannot immediately be recognised. One of the techniques utilised here is that of the ‘close-up,’ which consists of shooting surfaces at extreme proximity, resulting in pictures that show those surfaces in a meticulously detailed manner. The resulting images, normally called ‘close-up photography’ or ‘macro photography,’ are usually shot by zooming or, most likely, by using a macro lens. In ‘Wooden Worlds’, this technique is utilised to show extremely close details of wooden surfaces, which, in many cases, are not recognisable as such at first sight. The intention is to produce a ‘haptic’ image, in which the observed object can be de-contextualised, allowing a free and open interpretation by members of the audience, who can have the feeling of ‘touching’ the images with their eyes. The word ‘haptic’ has its root in the Greek word HAPTΌS (ἅπτω), which means to touch or to fasten. This type of visual conception is a fundamental aesthetical position of Claudia Robles Angel, one of the authors of this paper, who seeks to transport the tactile sensation to the photographic image by approaching the object as closely as possible, thus inviting the viewer to use the eyes to feel and not only to see. This usage of the word ‘haptic’ in visual arts (image and moving image) has its roots in Deleuze: [1]

“Where there is close vision, space is not visual, or rather the eye itself has a ‘haptic’, non–optical function: no line separates earth from sky, which are of the same substance; there is neither horizon nor background nor perspective nor limit nor outline or form nor center; there is no intermediary distance, or all distance is intermediary.”

Following this interpretation by Deleuze, the term ‘haptic’ is not used in ‘Wooden Worlds’ in the same sense as in ‘haptic interaction’, which is the type of interaction produced by touching devices, as defined by Hermann and Hunt. [3] The usage of the term in this performance is purely aesthetic, with no reference either to the interaction itself or to the interfaces used, as no interaction in this piece occurs by the act of touching devices or interfaces. By making it possible to ‘perceive the imperceptible’ through ‘haptic images’, the performance immerses the audience in a virtual space of images and surround sound, in which the material ‘wood’ is constantly alluded to.

The sound components in ‘Wooden Worlds’ have two sources: (a) pre-recorded concrete sounds and (b) a live viola on the stage, mixing fully composed passages with improvisation. The pre-recorded sounds were obtained during night-time recordings of the rainforest in South America. The richness of this nocturnal natural ‘soundscape’ was paramount to the general sound conception of the piece, as it helps the listener to become part of the immersive virtual environment of the performance. The concept of ‘soundscape’ is used here as defined by Truax: [4]

“A soundscape is an environment of sound (or sonic environment) with emphasis on the way it is perceived and understood by the individual, or by a society. It thus depends on the relationship between the individual and any such environment. The term may refer to actual environments, or to abstract constructions such as musical compositions and tape montages, particularly when considered as an artificial environment.”

This feeling/sensation of a natural sound-dome inspired the acoustical space in ‘Wooden Worlds,’ which, while still following the generic description of the term by Truax [4] (quoted above), creates an immersive environment during the performance by transforming those sounds in real time instead of making ‘a musical composition’ or a ‘tape montage’ as in Truax’s definition. [4] The soundscape idea in ‘Wooden Worlds’ is technically aided by the octophonic sound, which surrounds the audience in the complete darkness of the concert venue, creating a unique atmosphere between natural and virtual environments.

The form of ‘Wooden Worlds’ was conceived as an arch, with a climax at its golden mean. The usage of the golden mean creates the impression of a quasi-biological cycle, which begins and ends in complete darkness with the same type of audio material (insect sounds), but which evolves towards a climax, in which all forces interact, fading gradually out to complete darkness in the last third of the piece.
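Since the piece is of variable length, the position of the climax can be estimated for any given duration. The following is a minimal sketch, assuming that the golden mean is read as the point that divides the total duration so that the longer part comes first (about 61.8% of the way through); the text does not state this explicitly, and the function name `climax_time` is hypothetical.

```python
import math

# Golden ratio, approximately 1.618.
PHI = (1 + math.sqrt(5)) / 2


def climax_time(total_seconds):
    """Return the golden-mean point (in seconds) of a piece of the given length.

    Assumption: the longer section precedes the climax, so the climax
    falls at total / PHI, i.e. about 61.8% of the total duration.
    """
    return total_seconds / PHI
```

Under this reading, a hypothetical 20-minute performance (1200 seconds) would reach its climax after roughly 742 seconds, leaving a little more than the last third of the piece for the gradual fade to darkness.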


The interaction of the piece was programmed in three dimensions: (a) live-electronics, (b) video interaction with the viola, and (c) interaction via a MIDI controller. The two laptops, each with its own fixed IP address, must be connected to each other via Ethernet using the MAX object ‘udpsend.’ The first computer acts as the master, generating SMPTE time code, which is then transmitted to the second computer for synchronisation purposes. The master is in charge of several DSP functions for the viola’s live-electronics. The second computer contains the pre-recorded rainforest sounds and a library with all of the images; it interacts via a MIDI controller with (a) the pre-recorded sounds (through DSP processes programmed in MAX) and (b) the real-time manipulation of the images (via ‘Jitter’). Both computers display the SMPTE time from the master computer, to allow for an accurate performance throughout the entire piece.
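The time code shared by the two computers can be illustrated outside MAX. The sketch below is not the authors' patch: it only shows how a running frame count converts to and from an HH:MM:SS:FF string. The frame rate of 25 fps, the non-drop-frame counting and the function names are assumptions, as the paper specifies none of them.

```python
FPS = 25  # assumed frame rate; the paper does not state one


def frames_to_smpte(total_frames, fps=FPS):
    """Format a non-drop-frame frame count as HH:MM:SS:FF."""
    seconds, frames = divmod(total_frames, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def smpte_to_frames(timecode, fps=FPS):
    """Parse HH:MM:SS:FF back into a non-drop-frame frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames
```

In such a scheme the master would only need to send the current frame count (or the formatted string) at regular intervals, and the second computer could derive every cue position from it.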

Full automation of live-electronics processes, as explained in Garavaglia, [2] had to be used to some extent here, mainly because of the extreme complexity of the performance. Automation is also partially used in the video interaction, mostly in those passages in which both the pre-recorded sounds and the visual part need manipulation via the MIDI controller. It proved extremely difficult to manage the totality of these processes without some degree of automation. However, given the partially improvised character of some sections, it was decided not to automate the entire performance; instead, only those passages requiring a considerable amount of manipulation of the interactivity were programmed to run automatically.

Pitch and amplitude from the viola were the only parameters selected for the interaction with video. To read them, the first computer uses two algorithms simultaneously: the first performs pitch recognition of the sounds of the viola, while the other measures their amplitude. The raw values of these parameters proved rather inconvenient for the mapping processes needed on the second computer to control the video sections of the piece. Hence, pitch values were multiplied by a factor of 10 and amplitude values by a factor of 10000 before being sent to the second computer and mapped. These large figures allowed for a smooth interpolation of the diverse video parameters, without any noticeably abrupt changes in the video effects. In this way, the viola’s amplitude values were scaled from 500 (a threshold that proved rather efficient in filtering out unnecessary soft amplitude data) to a maximum of 10000. For rotation effects, these figures were mapped to 0.1 – 50.0 in the second computer for the Theta parameter (the rotation angle, measured in radians) of the ‘jit.rota’ object in ‘Jitter’. The frequencies were mapped from 1300 (130 Hz) and 50000 (5000 Hz) to 1.0 and 0.0 respectively for the zoom parameter (horizontal and vertical). Another type of viola/video interaction is the mapping and scaling of the amplitude values between 0.45 and 0.1 in order to change the colour temperature in ‘Jitter’.
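The mapping described above can be sketched in a few lines. This is an illustration rather than the actual MAX patch: it assumes that the mapping is linear, that the measured amplitude is normalised between 0 and 1 before the multiplication by 10000, and that values outside the input range are clamped to its ends. The function names `scale` and `viola_to_video` are hypothetical.

```python
def scale(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from [in_lo, in_hi] to [out_lo, out_hi].

    Assumption: inputs outside the range are clamped to its ends.
    """
    t = (value - in_lo) / (in_hi - in_lo)
    t = min(1.0, max(0.0, t))
    return out_lo + t * (out_hi - out_lo)


def viola_to_video(freq_hz, amplitude):
    """Return (theta, zoom) for a viola frequency in Hz and an amplitude in 0..1."""
    pitch_value = freq_hz * 10      # e.g. 130 Hz becomes 1300
    amp_value = amplitude * 10000   # e.g. 0.05 becomes 500
    theta = scale(amp_value, 500, 10000, 0.1, 50.0)   # rotation angle in radians
    zoom = scale(pitch_value, 1300, 50000, 1.0, 0.0)  # horizontal and vertical zoom
    return theta, zoom
```

With these figures, a loud note at the bottom of the range (130 Hz at full amplitude) gives the maximum rotation of 50.0 with a zoom of 1.0, while a very soft note at 5000 Hz gives the minimum rotation of 0.1 with a zoom of 0.0. Note that the zoom range is inverted: higher pitches produce smaller zoom values.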

The live-electronics processes are divided into two sets: those in the master computer for the viola and those in the second computer for the pre-recorded sounds. The viola’s live-electronics include the following DSP functions: ring modulation of two sources via two comb filters, delays, reverberation, convolution, granular synthesis, live-recorder/player and a ‘spatialisator’. Sounds coming from the output of the ring modulator, the delay unit, the convolutor and the sample-player are sent to an automatically programmed surround sound (4.1), either circular or localised, using the ‘spatialisator’. Another set of live-electronics was programmed for the second computer, which mainly processes insect sounds from the rainforest. This second set includes: spectral extraction, comb filters, pitch shifting, chorus (for pitch or voice transformations) and granular synthesis. This second computer also includes spatialisation in 4.1 with circular movement (increasing or decreasing the speed between changes of loudspeaker during the performance at the will of the performer, with the intention of creating at certain moments an intense feeling of rotation) as well as localised distribution of sound.
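The circular movement between loudspeakers can be illustrated with a simple panning function. The sketch below is not the ‘spatialisator’ itself: it assumes four loudspeakers placed at equal intervals around the audience, an equal-power crossfade between the two nearest loudspeakers, and it ignores the subwoofer channel of the 4.1 setup. The function names are hypothetical.

```python
import math


def quad_gains(angle_turns):
    """Return four loudspeaker gains for a source at the given angle.

    angle_turns is measured in full turns (0.0 to 1.0 is one circle);
    speakers are assumed to sit at 0, 0.25, 0.5 and 0.75 turns.
    """
    pos = (angle_turns % 1.0) * 4          # position in speaker units
    lower = int(pos) % 4                   # speaker just behind the source
    upper = (lower + 1) % 4                # speaker just ahead of the source
    frac = pos - int(pos)                  # how far between the two speakers
    gains = [0.0, 0.0, 0.0, 0.0]
    gains[lower] = math.cos(frac * math.pi / 2)
    gains[upper] = math.sin(frac * math.pi / 2)
    return gains


def rotating_gains(time_seconds, turns_per_second):
    """Gains for a source circling at a constant, performer-chosen speed."""
    return quad_gains(time_seconds * turns_per_second)
```

Because the cosine and sine gains always satisfy cos² + sin² = 1, the total power stays constant while the sound travels around the circle; raising `turns_per_second` would correspond to the faster changes of loudspeaker described above.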


As explained earlier, the principal technique utilised for the visual conception is that of the close-up, with several of the pictures shot using extreme zooming. The main type of image is that of the cortex of several different trees; these images were collected over the last ten years from many different types of vegetation, mostly in Europe and in the rainforest in South America. Other pictures show complete and isolated trees in nature, shot mostly in diverse regions of Europe.

The main intention of using close-up images of tree-cortices is to resemble surface areas, which seem to be borderless, as, due to the closeness of the shots, there is a complete absence of perspective. Hence, these surfaces become eternal territories in which they lose their attributes, transforming themselves from object to landscape. Through these images, ‘Wooden Worlds’ invites audiences to immerse themselves in a visual territory created by wooden textures mixed with different types of trees with diverse structures. Figure 1 shows two of the many close-up images used in ‘Wooden Worlds’.

As all images used in the piece are involved in some kind of interaction, they were all stored in a library in the respective MAX patch; most of them are selected randomly during the performance, with the exception of a few, which were pre-selected. At some moments, these surfaces emerge without any visual effect applied; at others, however, effects such as colour and heat changes (programmed in ‘Jitter’) are introduced.

As an example, in the first six minutes, the performance begins in complete darkness and only with nocturnal sounds of insects, suggesting a typical new moon night in the rainforest; very gradually, the ambient light is increased by little grains populating the screen using the ‘jit.noise’ object, which creates a matrix full of random values in ‘Jitter’. After a while these grains are transformed into the ‘haptic image’ of the tree cortex shown in Fig. 1 (left), creating an abstract landscape made of wood.

Besides these images, there is one video post-produced with the software ‘After Effects’ from two images: a wood cortex and a tree. It begins with an extremely detailed close-up of the cortex, slowly zooming out and thus revealing the entire structure, which resembles a woman stretching out her arms as if crucified. The video was produced to interact with the viola by modifying the playing speed via the pitch data from the viola, in order to create tension through the extension of its duration. Controlling the speed with the viola frequencies thereby produces the effect of ‘zooming out’ of the image according to the music.

Other images were cross-faded with the resulting rotated image using Jitter’s ‘jit.xfade’ object to create a feedback that is continuously changed by the Theta parameter and by zooming (horizontally and vertically). This feedback effect creates a chaotic visual multiplicity that is reinforced by the rotation parameter, which can be slow or fast, according to the music played by the viola.


The sound and musical aspects of ‘Wooden Worlds’ include two main sources: pre-recorded concrete sounds and a live viola. For the latter, some sections were fully composed whilst others were left for improvisation. The pre-recorded sounds, based mainly on sounds produced by insects, were obtained in several night-time recordings in tropical parts of South America: in the town of Girardot (Colombia) and in the Amazon rainforest (Colombian side). As usual in tropical areas, there are plenty of insect sounds during the night, most of which are produced on, inside or close to wood. Temperatures in Girardot average around thirty degrees Celsius. The constant heat, together with a rather high degree of air humidity (in some parts of the Amazonas, air humidity is around 94%), provides ideal climatic conditions for a concert of sounds from insects and other forms of life.

The viola (which is mostly made of wood) has a pivotal function across the entire piece as it interacts with other audiovisual materials of the performance: on the one hand, it interacts with the pre-recorded rainforest sounds and with sounds recorded live during the performance through typical real-time DSP functions programmed in MAX/MSP; on the other hand, it also interacts with the video part, as explained earlier. Sound interaction is included with the intention of either imitating the pre-recorded environmental sounds (mainly based on sounds made by insects) or combining with them. Therefore, the music composed for this type of interaction consists of short instrumental actions (many of which are of undetermined pitch), based mainly on sounds such as the bow scratching the strings, harmonics or sounds obtained by knocking the instrument, to mention just a few. Most of these are of an improvisatory nature, reacting live to the insect sounds.

As the second type of interaction required much more careful planning, the music for it was fully and carefully composed; here the viola must take control of different video parameters, some of which require a fixed duration and therefore accuracy in playing. The parameters read from the viola for video interaction are pitch and amplitude. A clear example is the ‘Elegy’, which starts in the score of the piece at 00:12:15:00 (SMPTE time). The image here is the post-produced video introducing the image resembling the form of a crucified woman. The ‘Elegy’ was composed with this image in mind, with the video starting with a ‘haptic image’ (a close-up of a tree cortex), which zooms out slowly until the full image of the ‘woman-tree’ is displayed. This slow revelation was explained in section 2 above.

In the last section of the performance, the amplitude and pitch of the viola control other video parameters, namely a feedback of the original image combined with a zoom (via the viola’s pitches), which is added to a rotation effect (controlled by the viola’s amplitudes), creating a repetition and multiplication illusion of the image within the screen with a tense and chaotic visual result. During this section music, sound and image are improvised by both performers within a fixed and planned length.

With regard to the audio output of the piece, each computer is connected via Firewire to an audio interface, with a quadrophonic output each. This division allows for a clear and separated space for each of the sound sources (the pre-recorded natural sounds with their DSP processing and those from the viola, with its own DSP live-electronics). The octophonic settings are described in figure 2.

The piece was world premiered during the Kölner Musiknacht 2010 in Cologne, Germany (Kunst Station Sankt Peter, 25.9.10).

References and Notes: 

  1. Gilles Deleuze and Félix Guattari, A Thousand Plateaus: Capitalism & Schizophrenia, trans. Brian Massumi (New York: Continuum International Publishing Group, 2004), 545. 
  2. Javier Alejandro Garavaglia, “Raising Awareness About Complete Automation of Live-Electronics: a Historical Perspective,” in Auditory Display, 6th International Symposium CMMR/ICAD 2009, Copenhagen, Denmark, May 2009. Revised papers - Lecture Notes in Computer Science–LNCS 5054 (Berlin–Heidelberg: Springer Verlag, 2010): 438–465. 
  3. Thomas Hermann and Andy Hunt, “An introduction to interactive sonification,” in IEEE Multimedia, 12 (2) (IEEE Computer Society, 2005): 20–24, http://dx.doi.org/10.1109/MMUL.2005.26 (accessed June 17, 2011).
  4. Barry Truax, “Handbook for Acoustic Ecology,” Second Edition (Cambridge Street Publishing, 1999, originally published by the World Soundscape Project, Simon Fraser University and ARC Publications, 1978), http://www.sfu.ca/sonic-studio/handbook/ (accessed June 18, 2011).