Ethernet Orchestra: Interdisciplinary Cross-Cultural Interaction in Networked Improvisatory Performance

The study of interdisciplinary cross-cultural interaction in networked audiovisual performance serves as the starting point for Ethernet Orchestra’s 2010 telematic music improvisation and live cinema performance, Distant Presences. This paper outlines the technical facilitation of the performance, the evaluation methodologies, and the creative strategies employed by the dispersed musicians and visual artists to collaborate remotely.



Technical and creative strategies involved in collocated audiovisual performance have to be reconsidered when performers are separated by large geographic, cultural and “acoustical distances.” [1] While a burgeoning knowledge of eclectic network performance has illuminated many of the inherent technical issues involved with dispersed interaction, there remain pertinent issues of intercultural perception and cognition in interdisciplinary, networked collaboration. This paper seeks to address these concerns by examining the strategies that the remote musicians and visual artists utilized in realizing this improvisatory, telematic audiovisual performance. Considering differences in perception across the dispersed collective, it adopts a semiotic perspective focusing on the role of metaphor in understanding signs in improvisatory musical and visual interaction. Viewed through the framework of distributed cognition, the interface and the use of video are examined as features of a “conceptual field” [2] for evaluating collaboration across artistic disciplines and cultures.

The collaboration combined intercultural musical improvisation and online image mixing spanning four continents and time zones. Networked musicians were located at the University of Technology, Sydney (AU), Kunstmühle gallery (DE) and Londrina (BR), with visual artists in Sydney, London and Munich. The performance was broadcast by FBi Radio, Sydney, and streamed on the Internet as “Radio You Can Watch”, allowing listeners to view the live visual mixing that accompanied the radio broadcast. This form of cross-cultural, interdisciplinary collaboration affords unique opportunities to investigate the technological and creative strategies involved in producing a live, networked audiovisual performance for radio, which, as far as is known, is unique for the medium.

Ethernet Orchestra is an international network music ensemble composed of skilled musicians from a diverse range of cultures and improvisatory traditions. The instrumental makeup of the group includes Turkish oud and bendir, Mongolian horse fiddle and throat singing, as well as guitar, trumpet (played by the author) and Max/MSP electronic processing. The collaborating visual artists are also known for their work in a range of audiovisual practices; however, in this performance they would be best described as live cinema artists rather than VJs. The term VJ initially described video jockeys presenting videos on MTV (Music Television), but as Makela (2008) argues, it has “metamorphosed to include video performance artists who create live visuals for all kinds of music.” [3] However, the term also carries associations with club culture, in which the VJ’s contribution is to mix visual projections to accompany a DJ’s music set. While these distinctions are somewhat diffuse, live cinema or real-time audiovisual performance is seen as rooted in narrative art practices, with a history dating back to lantern shows and shadow plays. Network artist, theorist and participant in this performance, Helen Varley Jamieson, notes how problematic these terminologies can be, but adds that a more succinct term, CJ (Cyber Jockey), has emerged recently to describe Internet-based audiovisual performance.

Networked Audiovisual Connectivity

Connecting musicians and visual artists across four time zones with low latency and high audiovisual fidelity required a combination of network interfaces capable of synchronizing dispersed collaboration without interruption. While documented networked audiovisual performances have employed bespoke interfaces on high-speed research networks such as Internet2, KAREN and CERNET2, the multifarious configuration of regional networks, participant addresses and the machines being used in this performance required accessibility rather than speciality. This was particularly necessary for the performers participating from remote locations with domestic connection speeds. Player communication between the audiovisual interfaces was also paramount for synchronization and for meeting the demands of a live radio broadcasting schedule. This was achieved by a combination of IRC (Internet Relay Chat) and interface text windows coordinated at the central hub studio at the University of Technology, Sydney. Although a radio broadcast was the principal performance medium, the station studios were not technically equipped to facilitate the performance. It was therefore decided that the University of Technology, Sydney would best provide the technical hardware, sound requirements and network to run two bandwidth-hungry interfaces needing fast upload and download speeds.

While a number of network audio interfaces could have been employed, it was decided to use the ‘plug-n-play’ platform eJamming, a proprietary multi-user interface with a peer-to-peer architecture that transmits 44.1 kHz, 16-bit (CD quality) WAV files. The interface sends packets (the signal as digital information, including its destination) via UDP (User Datagram Protocol), making it a fast, high-fidelity platform for synchronous connectivity. Its low latency is achieved through compression algorithms that shrink the file size for high packet flow, and its peer-to-peer configuration allows it to route the signal directly to players rather than via external servers. This reduces large network latencies of up to 150 ms to approximately 11 ms (imperceptible) in the interface, effectively creating collocated acoustics for the networked players. Player soundcards are connected via USB, and audio in-and-out parameters control monitoring levels. A text window allows the musicians to communicate, facilitating synchronous dialogue between performers.
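To give a sense of why compression and direct peer-to-peer routing matter on domestic connections, the data rate implied by the stated 44.1 kHz, 16-bit format can be worked out with simple arithmetic. The sketch below is illustrative only and does not describe eJamming's internals: the channel count, the 10 ms packet duration, the number of peers and the compression ratio are all hypothetical values chosen for the example, and the full-mesh assumption (each player sending one stream to every other player) is an interpretation of "peer-to-peer" rather than a documented detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmFormat:
    """Uncompressed PCM audio format."""
    sample_rate_hz: int = 44_100  # stated in the text: 44.1 kHz
    bit_depth: int = 16           # stated in the text: 16 bit
    channels: int = 1             # assumption: channel count is not stated

    def bytes_per_second(self) -> int:
        # samples/s * bytes per sample * channels
        return self.sample_rate_hz * (self.bit_depth // 8) * self.channels

    def kilobits_per_second(self) -> float:
        return self.bytes_per_second() * 8 / 1000

    def payload_bytes(self, packet_ms: float) -> int:
        # Audio bytes carried by one packet covering packet_ms of sound.
        frames = round(self.sample_rate_hz * packet_ms / 1000)
        return frames * (self.bit_depth // 8) * self.channels


def peer_uplink_kbps(fmt: PcmFormat, peers: int,
                     compression_ratio: float = 1.0) -> float:
    """Upload rate for one player in a full mesh (hypothetical model).

    Each player sends one stream to every other player; compression_ratio
    is an illustrative figure, not a measured property of any codec.
    """
    return fmt.kilobits_per_second() * (peers - 1) / compression_ratio


if __name__ == "__main__":
    mono = PcmFormat()
    print(f"raw mono rate: {mono.kilobits_per_second():.1f} kbit/s")
    print(f"payload per 10 ms packet: {mono.payload_bytes(10)} bytes")
    print(f"uplink, 3 peers, uncompressed: {peer_uplink_kbps(mono, 3):.1f} kbit/s")
    print(f"uplink, 3 peers, 4:1 compression: {peer_uplink_kbps(mono, 3, 4.0):.1f} kbit/s")
```

Uncompressed, a single mono stream at this format needs 705.6 kbit/s and a stereo stream 1411.2 kbit/s, before any packet overhead. Under the full-mesh assumption the upload requirement grows with each additional player, which suggests why reducing the stream size would be important for performers on domestic connection speeds.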

The “live cinema” mixing was performed in the multi-user audiovisual interface VisitorsStudio, a Flash-based environment enabling artists to "upload sound files and still/moving images (jpg, png, mp3, flv, swf) to a shared database, mixing and responding to each other's compositions in real-time." [4] Performers upload images, short animations or movie clips, which are then selected and looped in a mix window. Images and clips can be manipulated synchronously by changing parameters such as size, perspective, contrast and filters. Artists improvise with collages of static and moving images, responding to each other and to the audio stream in real time. Figure 1 shows a screenshot of the networked live cinema mixing in the VisitorsStudio interface.

As the hub studio was located some miles away from the radio station, the networked performance was streamed from eJamming to the station studio via the Internet broadcasting platform Nicecast. Nicecast is a client-server platform that broadcasts audio as a compressed Internet stream to other machines via a URL (Uniform Resource Locator), or web address. This allowed the station to re-broadcast the stream on its terrestrial frequency, 94.5 FM, alongside the station’s own Internet stream. Both local and international listeners were then able to watch the live cinema mixing via its web address. For artistic and evaluation purposes, the visual mix was recorded and archived through the interface’s sequenced file playback system, and sound and video of the musicians’ performances were also recorded. The complete audiovisual performance can be viewed online. Despite the array of dedicated network interfaces, the humble telephone served as the telecommunication device between the station studio and the university in a pre-performance interview with the author.

Collaborative Networked Performance

As an area of research, collaborative, networked performance has become a point of considerable interest in recent years. From the LAN (local area network) computer music experiments of the League of Automatic Composers in the 1970s to the contemporary WAN (wide area network) performances of the Stanford SoundWire group, Pauline Oliveros, Avatar Body Collision and Furtherfield, musicians, artists and researchers have been developing technical and theoretical frameworks to enhance and evaluate interdisciplinary, dispersed networked performance. While these examples represent only a small contingent of the practitioners worthy of inclusion in any networked performance review, they offer a cross section of contemporary interdisciplinary approaches to it. These include diverse practices such as network acoustics and technology research, composition, improvisation, theatre and social activism.

Gesture and Dispersed Perception

Within the ocean wide area network linking Sydney, Londrina, Braunschweig, London and Munich, three of the musicians were collocated in the hub studio at the University of Technology, Sydney. All participants recorded their individual performances on video, providing useful data for contrasting the gestures of the collocated group with those of the remote musicians, who were unable to see each other. It should be noted that participant observation and analysis of gesture in this paper are applied only to the networked musicians. Evaluation of the video recordings revealed that where musicians could see each other, they rarely used the opportunity to coordinate their ensemble collaboration. This is in line with previous research, in which the video relay of networked performers primarily assumes “the purpose of providing an experience for the audience,” [1] and dispersed musicians usually do not visually monitor their collaborators during a performance. The role of gesture in improvisation and musical performance is well documented, and its absence has often been seen as a principal problem for musicians collaborating in the non-visual networked environment. However, in the context of the networked improvisatory performance, the use of a live video stream does provide participants with what Hutchins (2005) describes as a “material anchor” in their “interaction of mental…and material structure.” [2] Viewed through the lens of distributed cognition, material anchors are also present in the appearance of collaborators’ names or cursors in the interface.

Multimodal Improvisation

The primary focus of the Distant Presences performance was not only intercultural and interdisciplinary improvisation but also the exploration of this as a multimodal radiophonic performance. Theoretical musings on the interplay of sound and image in audiovisual arts and experimental film are historical in their weight and depth; however, recent discourses have centered on “concepts of interactivity” [5] and montage, veering away from the model of a linear celluloid medium. Live cinema mixing with music and sound art has also caught the attention of technologists keen to create novel interactive interfaces, and of artists wanting to re-imagine or “redefine what cinema can be.” [6] While the collaborative domains of live cinema and music are born out of collocated gallery spaces and club culture, it is the spontaneous entwining of the two that informs its methodologies. Makela (2008) argues that “what differentiates live cinema from normal cinema is the ability to improvise the narrative or concepts, to alter their course as the performance progresses.” [3] In networked audiovisual performance it is this same “interplay of the audiovisual, performed and improvised in the momentary negotiations of the participants” [6] that shapes the experience.

Significant in Distant Presences is how these negotiations are shaped and mediated across disciplines and between cultures. Analysis of post-performance interviews and audiovisual instances reveals similarities in approach between musicians and visual artists collaborating in cross-cultural interdisciplinary improvisation. While there are obvious differences in the creation of content, i.e. pre-performance selection of images versus spontaneous creation of music and sound, what emerges is the use of metaphor as a signifier for perception across the audiovisual spectrum. As Chion (1994) suggests, “listening with the ear is inseparable from that of listening with the mind, just as looking is with seeing.” [7] From a semiotic perspective, the research identifies what Lakoff and Johnson (1980) refer to as “ontological” and “orientational” metaphors as potential experiential signifiers in dispersed interaction. They argue that ontological metaphors give us a way to refer to experience as “ways of viewing events, activities, emotions, ideas, etc., as entities and substances”, which allow us to “refer to it, quantify it, identify a particular aspect of it […] and act in respect to it.” [8] Analogous to their given examples, rhythm, timbre and melody can be considered entities in demonstrating a musician’s experience of tele-musical improvisation. The following examples illustrate ways in which dispersed collaborators can think about, and refer to, such experience.

That rhythm is pushing the music to a climax at that point.

The timbre of that sound was so cold that it made the melody really stand out.

Orientational metaphors relate to our “spatial orientation: up-down, in-out, front-back, on-off, deep-shallow, central-peripheral”; however, they do not “structure one concept in terms of another but instead organize a whole system of concepts with respect to one another.” [8] This gives a spatial orientation to concepts, for example “happy is up; sad is down.” [8] Applied to networked, multimodal improvisation, such associations are significant in communicating emotion and meaning across cultures and disciplines. Examples of this emerge from the analysis of post-performance interviews in which musicians and visual artists reflect on their awareness of metaphor during the performance. This often occurs during complex, cognitively demanding situations in what Schön (1995) describes as reflective practice in action. Musicians and visual artists appear to process their perception and creative expression while producing spontaneous audiovisual dialogues, revealing “a capacity for reflection on their intuitive knowing in the midst of action” and, further, to “use this capacity to cope with the unique, uncertain, and conflicted situations of practice.” [9]

This is exemplified in the opening 10 minutes, as electronic musician Martin Slawig, performing from Braunschweig, Germany, illustrates “intuitive knowing” and an awareness of spatial orientation in relation to musical concepts: “I played a cymbal in time but lost timing inspired by the section before […] I liked the growth of this moment because of the multi-layer patterns and rhythmic speed up, everything was there, guitar loop, bass pulse and at 12’34” brake down everything deleted and flatline”. Slawig is describing a section of group dynamics in which the music and visual mixing build to a climax and then suddenly drop. Sensing this same collective crescendo, Turkish bendir player Yavuz Uydu remembers, “I start playing an increasing metered 2/4 beat, I think this is what triggered everyone to build up”. Although most of the live cinema artists reported giving cognitive priority to the visual mixing over the music, Helen Varley Jamieson also related an awareness of spatial metaphor to concepts in her strategies during the performance: “I consciously chose images that spoke to the theme in some way […] to include the idea of temporal distance as well as spatial”. Contributing to this, fellow visual collaborator Graziano Milano notes, “it’s timbre, tonality, pace, atmosphere, melody that triggers in me visualisations of what kind of general mixes/visuals may work”. The author also argues that timbre plays a pivotal role in the perception of meaning in intercultural and interdisciplinary improvisation. Sacks (2007) suggests that “timbre constancy is a multileveled and extremely complex process in the auditory brain that may have some analogies with [visual] colour” [10], and this is corroborated by examples in which qualities of sound appear to guide participants’ perceptions of audiovisual events. What starts to emerge in these statements is a relationship between metaphorical signification in sound and the creative responses of collaborators to it.


Distant Presences is an illustration of a successful cross-cultural interdisciplinary collaboration, achieved by combining audiovisual interfaces for an innovative performance outcome. It demonstrates the creative and cognitive challenges facing musicians and visual artists who foray into the liminal collaborative terrain of live, improvisatory, networked performance. Approaches to audiovisual improvisation are discussed and chosen examples illuminate the role of metaphor in signifying meaning in intercultural, collaborative perception and creative interaction. Evaluation of gesture within the non-visual network performance environment suggests that collaborators replace visual signifiers with extended listening practices, in which melody, rhythm and timbre become entities of meaning, fashioning metaphors of experience and expression in the minds of the participants. Viewed through a perspective of distributed cognition, visual representations of dispersed collaborators primarily act as “material anchors” [2] for the “conceptual integration” of “a shared physical and socio-cultural human experience.” [2] These representations are also present in the networked environment as interface text boxes or moving cursors.

As a model for cross-cultural collaboration, networked improvisation is unprecedented in its ability to create dialogical exchange between musicians and artists from diverse traditions and cultures. While developing technologies can facilitate these unique experiences, collaborators are faced with new creative and cognitive challenges in negotiating synchronous intercultural improvisation. This paper is intended as a contribution towards an understanding of dispersed cognition and the nature of representation in non-visual networked improvisatory performance, recognizing the need for new methodologies to mediate these emerging topographies. 

References and Notes: 
  1. J. P. Cáceres, R. Hamilton, D. Iyer, C. Chafe and G. Wang, “To the Edge with China: Explorations in Network Performance” (paper presented at 4th International Conference on Digital Arts - ARTECH 2008, November 7-8, Porto, 2008).
  2. E. Hutchins, “Material Anchors for Conceptual Blends,” Journal of Pragmatics 37, no. 10 (2005): 1555-1577.
  3. M. Makela, “The Practice of Live Cinema” (paper presented at ARTECH 2008, November 7-8, Porto, 2008).
  4. Furtherfield's official Web Site (accessed June 25, 2011).
  5. P. Greenaway, "Toward a Re-Invention of Cinema" (lecture, European Graduate School - Cinema Militans Lecture, Saas Fee, September 28, 2003).
  6. A. Bucksbarg, “VJing and Live AV Practices,” VJ, 2008 (accessed June 25, 2011).
  7. M. Chion, Audio Vision: Sound on Screen (New York: Columbia University Press, 1994), 33.
  8. G. Lakoff and M. Johnson, Metaphors We Live By (Chicago, IL: University of Chicago Press, 1980), 14-15.
  9. D. Schön, The Reflective Practitioner: How Professionals Think in Action (Aldershot: Ashgate Publishing, 1995), 8.
  10. O. W. Sacks, Musicophilia: Tales of Music and the Brain (New York: Alfred A. Knopf, 2007), 115.