A wealth of embodied knowledge has been articulated and documented in music performance practice literature. However, this documentation is usually aimed at instrument performers, and not packaged in such a way that it can be extensible to designers of computational Interactive Performance Systems (IPS). We extract meaningful dimensions from literature relating to instrument-specific kinematics, and describe the sonic and felt dimensions of specific technique. To aid a bottom-up design process, we provide a graphical representation of this articulated embodied knowledge in the form of a novel dimension space. In addition, we discuss possible solutions for how we can access and extend this embodied knowledge computationally. The resulting dimension space seeks to provide a clear understanding of what exactly we might extend through design. This research lays a foundation for future design work of IPS such as co-creative systems, interactive music systems, musical agents, and augmented instruments. In particular, this paper documents the ideation phase of our emerging IPS design for a professional violist. As such, this paper serves as an example of how we can extract and organise embodied knowledge from music performance, towards inscribing computational designs with embodied knowledge of music performance.
Recent philosophical viewpoints in AI and music frame creativity as a distributed phenomenon, and approach computational music systems through the lenses of extended systems paradigms. In addition, there is a long-standing discussion regarding the epistemology of computational music systems. In parallel, the acoustic instrument performance paradigm frequently serves as an elemental example of embodied and creative interaction. In light of these philosophical viewpoints, we proceed from the standpoint that the acoustic instrument-body system is already an Intelligent Performance System (IPS) that contains embodied knowledge of music. This research asks specifically how we can computationally extend this existing IPS.
Music-related Human Computer Interaction (HCI) design is frequently informed by forms of knowledge other than theoretical knowledge [1][2]. This points towards an interesting line of inquiry regarding the epistemological question: where is knowledge held? And by extension, moves us to consider the seat of intelligence in musical configurations comprising human musicians and Artificial Intelligence (AI) musical systems. Touching on these questions are philosophical viewpoints such as extended cognition [3] and intelligence [4][5], bodily incorporation [6] and configuration [7], and the metaphor of dynamical systems [8][9][10][11].
However, there is a perceived gap between epistemological (to do with knowledge) and phenomenological (concerned with the lived body[12]) approaches to computational music system design [13]. Philosophical viewpoints concerning embodied knowledge offer promise in closing this gap. In addition, AIMC-related research often concerns two practices: performance and design. We often focus on how these two practices differ. For example, designers who play their own designed instruments claim to experience disruptions in concentration while performing, because they oscillate between design and performance modalities [3][14]. There is space to explore integration of these two practices.
We are motivated to explore how this current landscape might inspire a novel definition of IPS, with a specific view towards our future design work of a semi-autonomous interactive music system for a professional viola player. In doing so, we explore philosophical viewpoints and design approaches that can help us to consider the human and computational in IPS as a single entity. Pointing towards instrument-specific performance practice literature, we engage with how embodied knowledge is explainable, and how it can be extensible to design practice. To this end, we propose a dimension space for our emerging IPS.
To the best of our knowledge, IPS is a novel term, introduced as the theme for the 2023 AIMC conference. According to the call for papers, the term concerns "how AI is applied in real-time artistic performance" [15]. As such, we find it necessary to engage further with what an Intelligent Performance System might be, especially in the context of augmented instruments and interactive performance systems.
In carving out a description of IPS, we turn to the distinction between sound and music. Sound is an observable phenomenon, a measurable, physical excitation of particles in a medium. Music is much more elusive. The question "what is music?" has been the subject of debate for at least 2.5 millennia [16][17]. We can think of it as a "wicked" question, for which there is no one correct solution, but which is nonetheless worth attempting to answer because each answer deepens current understanding of the question [18].
In the Western ’Art’ Music (WAM) academy, there is a shift from regarding music as text (scores, static objects) to music as performance (an activity). These two opposing answers to "what is music?" permeate AI music research. For instance, Collins describes the code of musical agents as "an abstraction of a score" [19]. In contrast, Chew posits that computer programs might accomplish “the essential work of performance and interpretation”, when their task encompasses solving for musical structures [20]. Here, 'performance' need not refer to performativity (to do with staging an activity for an observer), but to performance as a task. A sportsperson can perform a high-jump; a violist can perform a down-bow stroke; a computer program can perform the task of generating musical structures. All are performers of a task. As Chew [20] states, a music generation system can be thought of as a performer in this task-oriented sense.
While both Collins and Chew conceptualise AI music systems as performers, each does so from a different perspective. Collins’s musical agents are performers because they exhibit autonomy as "collaborators" [19] (separate entities) to human performers. Chew’s music generation systems are performers because of the tasks they perform. If musical agents are performers in Chew’s sense, their code is perhaps more akin to the sensorimotor patterns that human music performers (like sportspeople) train into their muscle memory in the practice room. Here, embodied knowledge is a more appropriate metaphor than code as "an abstraction of a score" [19].
Expanding the question "what is music?", Noë [21] asks: what would disembodied music even be? Noë seems to arrive at the idea that music is inherently embodied because we experience it phenomenologically; we “are perceptually sensitive to something more than merely the physical stimulus” - rather to the meaningful arcs - of sound [21]. From this line of thinking, we take the position that humans are always in the loop when it comes to music. In contrast to Collins’s focus on the separateness of computational music systems and human performers, we consider these as one performing entity in our definition of IPS.
Magnusson [3] explores the epistemology of computational music systems through the theory of extended mind [22], which posits that cognitive systems can comprise combinations of the biological and the non-biological environment [23]. Designers inscribe music systems with theoretical knowledge of music and cultural assumptions while creating these tools, and such systems therefore contain knowledge and intelligence. Through the philosophical viewpoint of epistemic tools, computational music systems are simultaneously a part of the cognitive system of the humans who design and use them, and separate entities onto which we offload and "inscribe" knowledge [3]. In this light, computational music systems become epistemic tools, "systems of knowledge and thinking in [their] own terms" that can express knowledge and contain music theory and ideology [3].
The epistemic tools conceptualisation is not entirely compatible with our aim to extend the acoustic instrument-body IPS. For instance, the extended mind metaphor requires us to regard computational music systems "primarily as extensions of the mind rather than the body" [3]. Magnusson later acknowledges that the epistemic tools viewpoint is only one side of the coin by stating that "our modern devices of expression should be viewed as both tools and machines; as both instruments for manual dexterity and mechanisms for automation; both as extensions of the body and cognitive scaffoldings of the mind" [13]. The epistemic tools metaphor alone makes it difficult to imagine how we might computationally extend an acoustic instrument-body IPS, where embodied knowledge is intrinsically part of the system.
A promising solution is to focus on designing movement instead of objects. Gillies writes about movement-focused interaction, defined as "interaction design around specific body movements rather than objects" [24]. Gillies argues that movement interaction involves embodied knowledge, while traditional HCI interfaces involve representational knowledge. In contrast to Magnusson’s epistemic tools, movement-focused interaction necessitates that the design is not simply "embodied in a sketch or an object, but has to include the user’s movement" [24]. Here, the design comprises the user’s movement and the computational system.
In this paper, we focus on IPS where the human user of a computational music system is a trained violist. This focus unlocks certain viewpoints on embodied and procedural knowledge in the IPS. Mashino and Seye [25] take the position that the body of a trained musician/dancer has its own “specific structure, competencies, and consciousness”, and describe how such a body is “moulded” and “transformed” through practice. Just as a computational music system is designed, a musician’s body is trained. As such, we take the position that embodied knowledge of discipline-specific movement has already been "designed" or "inscribed" into the IPS when the IPS user is a trained performer. In this way, the trained body and acoustic instrument combined is already an IPS.
While epistemic tools is a useful conceptual framework for the Digital Music Instrument (DMI) paradigm, it is incomplete in the context of IPS. Especially where the user is a trained musician, we propose it is more useful to conceptualise the design as including the musician’s body and movement as well as the computational system.
There are many promising philosophical viewpoints for how we might consider the body and the AI system as a single IPS entity. We touch on some of these viewpoints in this section.
As an alternative to AI, Ito [4] proposes Extended Intelligence (EI). Rather than understanding the individual intelligence of separate agents, Ito describes intelligence as "networked" and "distributed", where intelligent tools form part of the "EI that every actor in the network is a part of" [4]. Through this view, intelligence is shared amongst many humans and machines. Gioti [5] considers Ito’s EI in AI and music. According to Gioti, the "distributed" and "networked" attributes of EI are useful where the goal is to augment, not replace, human creativity and intelligence. In terms of how EI might translate to music, Gioti writes that:
In distributed human–computer co-creativity, high-level aesthetic decisions are made by humans, while non-human agency is understood as an extension of human intentionality, enabling new types of human-technology interaction and the redefinition of conceptual spaces and artistic practices. [5]
In practice, this raises the question: if computational intelligence can extend human intelligence, what are we extending?
In contrast, Donnarumma [7] argues against framing AI technologies as bodily extensions or epistemic tools, proposing instead that it is more fitting to think in terms of incorporation and human-machine configuration. To illustrate the idea of incorporation, Donnarumma refers to Haraway’s notion of technological bodies [26][27] where "a body incorporates a particular instrument, rather than pairing with it as if they were two unbiased separate entities" [7]. Rather than technologies extending out from human bodies, they are incorporated into hybrid bodies.
Incorporation occurs through physical, embodied practice and is an essential step towards creating a “human–machine configuration: a hybrid body, an arrangement of human and technological parts where the human body learns how to affect the instrument and be affected by it”[7]. In such a configuration, agency and expression extend in both directions through the technological and human components of the hybrid body.
While one could design a computational system with the intention to incorporate the system into one technological body, incorporation is not something that can take place during the ideation phase of the design practice, before there is a computational system to physically interact with. Although the acoustic instrument-body IPS can be thought of as incorporated, we cannot yet ascribe incorporation or configuration to our emerging IPS design in the ideation phase because it does not yet physically exist - the human user has not yet practiced with it.
Whereas the intelligent actors in Gioti’s EI reside in a network, Donnarumma’s form a body. Both are fitting lenses to view the computational system and the user as part of the design. Although Donnarumma presents extension and incorporation as at odds, there is an additional metaphor that we could apply to both possible lenses.
Dynamical Systems offers promise in conceptualising the human user, acoustic instrument, and computational element as forming an IPS together. Kaufer and Chemero [23] succinctly describe a dynamical system as "a set of quantitative variables changing continually, concurrently, and interdependently over time in accordance with dynamical laws described by some set of equations" [23]. This is the premise of Dynamical Systems Theory (DST) in the study of complex systems in mathematics. It is also applied in Psychology to describe the embodied interrelation between action, perception and cognition [23].
Research at the intersection of Music and Cognitive Science benefits from this lens. For instance, Walton et al. [28] use quantitative analysis methods from DST to study how different musical contexts afford distinct, emergent patterns of coordination between freely-improvising musicians (musical agents). Dynamical systems can be considered as modular – that is, as individual and coupled systems. For instance, we can consider a configuration of multiple musical agents as a dynamical system [28][29]. Yet we can also consider the neural (brain) and extra-neural (body) of a single musician as a dynamical system through the lens of embodied cognition [9]. In this sense, DST allows us to conceptualise the IPS both as Donnarumma’s configured body [7] and Gioti’s extended creative system [5].
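To make the idea of individual and coupled dynamical systems concrete, the following is a minimal sketch of two coupled phase oscillators (a Kuramoto-style model, chosen here purely for illustration). It is not taken from any of the cited studies; the function name, the parameter values, and the reading of the two oscillators as "performer" and "computational component" are all assumptions. The sketch shows only that, once coupled, each variable's rate of change depends on the state of the other, so the pair can settle into a stable relationship that neither has alone.

```python
import math


def simulate_coupled_phases(freq_a, freq_b, coupling, dt=0.01, steps=5000):
    """Euler-integrate two Kuramoto-style phase oscillators.

    Hypothetical illustration: oscillator a and oscillator b could stand for
    any two components of a system (e.g. performer and computational part).
    Returns the final phase difference (b - a), wrapped to [-pi, pi].
    """
    phase_a, phase_b = 0.0, 1.0  # arbitrary initial phases in radians
    for _ in range(steps):
        # Each rate of change depends on the other oscillator's current state.
        rate_a = freq_a + coupling * math.sin(phase_b - phase_a)
        rate_b = freq_b + coupling * math.sin(phase_a - phase_b)
        phase_a += rate_a * dt
        phase_b += rate_b * dt
    diff = phase_b - phase_a
    return math.atan2(math.sin(diff), math.cos(diff))


# With coupling, the phase difference settles to a constant (phase locking);
# without coupling, the two oscillators simply drift apart.
locked = simulate_coupled_phases(1.0, 1.2, coupling=0.5)
drifting = simulate_coupled_phases(1.0, 1.2, coupling=0.0)
print(locked, drifting)
```

Setting `coupling` to zero recovers two individual systems, which is one way to picture the modular view described above.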
DST has been applied to AI music system design in multiple ways [30]. For example, Blackwell et al. [10] use DST mathematics in the computational architecture of their autonomous music improvisers. This is different from using dynamical systems as a metaphor in HCI design, an approach that dates back to the 1980’s [31] and is seeing a resurgence today [32]. Also in recent years, Thelle [8] uses DST as a design metaphor for musical co-creation between humans and AI.
All of the above-mentioned frameworks offer promise for conceptualising the human user and computational system as part of the IPS. Likewise, each can be applied to humans in the absence of an AI component, enabling us to view computational and human intelligence through the same conceptual frameworks. DST is a promising metaphor to adopt for a rich and inclusive view of IPS.
Considering our notion of IPS where humans are part of the system, we do not consider embodied knowledge (of musicians) to be entirely ephemeral and elusive. Rather, we engage with embodied knowledge as somewhat explainable and as such, extensible in HCI design practice.
Polanyi [33] proposes the idea of tacit knowledge: a kind of knowledge and understanding that people possess, that exists at a deeper, non-symbolic level. Tacit knowledge is often equated with embodied knowledge [34]. Although we cannot be aware of all the knowledge our bodies hold, we can draw our attention to embodied knowledge through Csordas’s somatic modes of attention [35] and articulate what we perceive. Even Polanyi specifies that tacit knowledge can integrate "clearly identifiable elements". We can derive and transmit explicit knowledge by describing these elements. As such, we can think of explicit knowledge as inherently "rooted in tacit knowledge" [33]. This is the case for performance practice literature that describes performance techniques.
In performance practice, embodied knowledge is articulated and transferred between practitioners. It is held in multiple bodies and musicians are aware of the knowledge their bodies hold. Feminist epistemology offers a useful conceptual apparatus here:
An individual woman using an embodied way of knowing attempts to understand knowledges as constructed [36], and further, as something that she embodies; that she experiences and lives. She attempts to integrate the knowledges that she feels intuitively are important with what she has learned from others, and with a conscious awareness of how she embodies these knowledges. [37]
Individually and collectively, musicians have a heightened awareness of their bodily movements [25]. They are aware of and able to explain exactly what their bodies can do (kinematic), the felt experience of performing certain movements and how movement relates to physical changes in sound production on the instrument (mechanical). Much of the knowledge in the living bodies of trained musicians is not a black box; this knowledge has been explored and documented extensively in performance practice and pedagogy publications, such as Tuttle Coordination [38]. We propose that existing documentation of the trained body is a promising foundation for design that aims to engage with embodied knowledge of the trained body.
In Western ’Art’ Music (WAM), Performance Practice was established in the field of Musicology in the 19th century and has survived until today [39][38][16]. There are even earlier texts on instrument performance, such as Leopold Mozart’s 18th-century treatise [40]. In recent years, there has been a shift towards an embodied approach to music performance [41], and the area has benefited from developments in performance psychology in sport [42].
In addition to the growing number of academic publications on music performance, music performers disseminate their research in places where it is most likely to reach performers and students, who invest most of their time practicing and are not necessarily trained academic researchers. Because instrument-specific research is specialised and not always aimed at an academic audience, performance literature may be difficult for many music computing researchers to access and interpret if they are not acoustic instrument performers themselves. The first author of this paper, however, is a violist.
At this point in our exploration, we turn to Gioti’s extended creativity, wherein computational creativity extends "the space of creative possibilities" in music composition [5]. After all, embodied knowledge is already inscribed in the system because the performer is part of the system. With this research, we seek a practical solution for how exactly we can go about extending this "space" in IPS design. This warrants the following practical question:
What do we aim to extend through the design of an IPS?
In asking this question, we seek to define the starting point of a bottom-up design process. Our answer is an articulation of the current "space of possibilities" of the IPS. Taking our position that the acoustic instrument-body system is already an IPS, the acoustic instrument-body system is the starting point for our design process.
As proposed by Polanyi, the body contains more knowledge than we are able to articulate [33]. It is impossible to describe and document all the embodied knowledge that the body holds. So we do not attempt to do so. In addition, we do not attempt to document and explore every possible viola technique. Such a task is better suited to a performance practice or pedagogy publication. Such an attempt would lead to a broad, shallow understanding of the acoustic instrument-body IPS in the context of AIMC.
Instead, we focus our efforts on one aspect of viola technique that we find particularly intriguing. This particular technique provides a window into the theoretical, mechanical, felt, sonic, and expressive dimensions of the current space of possibilities that we seek to extend computationally in our design practice. To make this embodied knowledge extensible, we extract dimensions from the instrument-specific literature on performance practice, then propose a dimension space to depict how embodied knowledge is felt and expressed throughout the IPS.
Graham et al. propose a methodology for describing "the entities making up richly interactive systems" [43]. These entities are both physical and virtual, and pertain to the performance of specific tasks by users. The authors depict these elements visually as a dimension space, where each axis represents a different element (dimension). Dimension Space Analysis is an offshoot of Design Space Analysis [44] where designers identify and articulate different problems, possibilities and theories relating to their design, to aid the design process and communicate their decisions to the community. This is "the design rationale behind a system from the set of all possible design decisions" [45]. The distinct characteristic of Dimension Space Analysis is that these possibilities, problems and theories are depicted through visual representation. This allows the viewer to easily compare different designs at a glance.
Applying the dimension space proposed by Graham et al. to music, Birnbaum et al. [45] propose a dimension space for musical devices (Figure 1), so that devices can be compared to one another. They identify possible dimensions from existing frameworks for classifying musical devices in the literature. The axes of the dimension space proposed by Birnbaum et al. represent object properties that would meaningfully display design differences among devices [45]. This dimension space is intended to be applicable to a broad range of ’devices’ including interactive systems and installations, DMI’s and augmented instruments. Although Birnbaum et al. intend their dimension space as a generalisable analytical tool for other authors to use, subsequent designers have most often adopted the idea of a Dimension Space for Musical Devices as inspiration to construct their own dimension spaces [13][46][47][48][49][50]. In this way, authors have rather carved out dimensional spaces that show the novelty of their own approaches to music system design.
In response to the dimension space proposed by Birnbaum et al. [45], Magnusson proposes an Epistemic Dimension Space for musical devices (Figure 2) and argues that research on musical devices has suffered from a focus based on phenomenology, at the expense of a more “epistemological” approach [51]. As such, Magnusson’s dimension space depicts cognitive, conceptual and music theoretical features of digital music systems.
The Epistemic Dimension Space implies that epistemological and phenomenological approaches are at odds, yet they might not be so incompatible when we consider tacit knowledge [33], embodied knowledge [34], and feminist epistemology [37] as no less valid than theoretical, explicit knowledge. After all, Polanyi argues that tacit knowledge is the basis for all theoretical knowledge [33].
We focus our efforts on one aspect of viola technique that we find particularly intriguing. To the best of our knowledge, this technique has not yet been explored with computing. First, we describe the technique. Then we explain our intrigue.
The concepts of repull and release are at the core of Tuttle Coordination [38]. Repull (Video 1) typically takes place during a sustained down bow on the right side of the body. It denotes an action in the upper back and side muscles to rotate the right scapula (shoulder blade), causing the elbow to drop slightly leading to pronation of the wrist and hand. This action subtly slows the speed of the bow, which darkens timbre for the duration of the action. Repull is usually intended to increase tension in the sound. Once at the tip of the bow, a release action between the head and the instrument “opens” the sound again [38]. Expert violists who are familiar with Tuttle Coordination can perform repull and release at any point in the bow, which enables seemingly limitless combinations of tension and release in timbre to explore.
Repull is notable for many reasons. Firstly, it is a niche concept, even within the viola community. Usually, viola students only learn repull when they reach an ’advanced student’ level of playing. Furthermore, repull is largely associated with the viola performance tradition of Turtle Island (North America). Yet although repull is strongly associated with a particular school of viola playing, it is not style-specific. For example, the first author is not from Turtle Island and uses repull in their classical WAM and free-improvisation practices.
Secondly, one can perform repull on any note, on any string, and even whilst playing many fast notes - or not at all. It is independent of what we might think of as the ’symbolic’ elements of music. Thirdly, repull often correlates to an expression of tension and release (as well as bodily tension and release) on a structural level of music. For example, awareness of the moment of deepest repull in a piece of music can bring out tension and release on the (larger) scale of the entire piece of music (as opposed to tension and release within a note or phrase). While there are many other kinematic elements of performing, we can perform any passage of music with or without repull. Not all of these other techniques are independent of notes and rhythm. For these reasons, repull has potential to provide rich time series information about how performers express musical structure through tension and release.
Because repull occurs in the back muscles and does not relate to the immediately obvious notes and rhythms in music or essential elements of sound-production, it may be difficult for listeners (and watchers) of viola performance to notice repull. Like Gioti’s intention to extend rather than replace human creative practice, we are interested in repull as a musical feature that is obvious to performers but not to listeners, to augment listeners’ experience of music.
We have described how performers use repull to affect timbre, express tension and release (both in their bodies and in the music), dynamically control the mechanical connection between the bow and the string, and express musical structure. Although this is just one movement (and its counter movement of release), repull could provide rich dimensions for Music Information Retrieval (MIR) and machine listening tasks of the computational system we will design. We conceptualise all of these dimensions (felt, mechanical, expressive, theoretical and sonic) as essential to acoustic instrument performance of music (as opposed to sound). To organise this information in a way that is extensible to design practice, we construct a dimension space where each axis graphically represents each dimension (Figures 3 to 5). Our dimension space also considers movement as an essential part of the design, so we represent axes (mechanical, sonic, expressive, felt) as dynamically changing axes, throughout the action of repull.
These five axes represent five dimensions of the acoustic instrument-body IPS, during the repull and release action of viola playing. Some of these dimensions (sonic, theoretical) relate to what Magnusson might classify as epistemological features. Other axes (expressive, felt) might be seen as phenomenological, in that they relate to the performer’s lived experience of repull. The mechanical dimension could be viewed as an object property relating to control. These dimensions exclude the computational component of the emerging IPS.
Thus far, we have only proposed five dimensions, whereas the norm is to propose a dimension space comprising seven axes. One of the useful aspects of dimension spaces is that designers can swap out axes as their design progresses and new design features, theories and objectives take precedence [43]. This flexibility allows for an imaginative and dynamic design process. We purposefully leave the remaining two axes open so that we can graphically represent imagined IPS that do not yet exist. Considering that our design includes the human user’s embodied knowledge of movement as well as the emerging computational system, we can ascribe dimensions for musical devices (from the literature) to the remaining two axes (Figures 6 & 7). In this way, we can identify areas where AI can expand the space of possibilities of repull during this ideation design phase.
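The structure described above, five repull dimensions plus two deliberately open axes that change over the course of the action, can be sketched as a small data structure. This is a hedged illustration rather than our implementation: the class and method names are hypothetical, the two names assigned to the open axes are placeholders rather than design decisions, and the numeric values are invented purely to show the shape of the data.

```python
from dataclasses import dataclass, field

# The five dimensions extracted from the repull literature.
REPULL_AXES = ["felt", "mechanical", "expressive", "theoretical", "sonic"]


@dataclass
class DimensionSpace:
    """A seven-axis dimension space; unassigned (open) axes are None."""

    axes: list = field(default_factory=lambda: REPULL_AXES + [None, None])

    def assign_open_axis(self, name):
        """Fill the first open axis with a dimension for a musical device."""
        index = self.axes.index(None)  # raises ValueError if no axis is open
        self.axes[index] = name

    def snapshot(self, values):
        """Return one moment in time: each assigned axis with a value in [0, 1]."""
        assigned = [axis for axis in self.axes if axis is not None]
        missing = [axis for axis in assigned if axis not in values]
        if missing:
            raise KeyError(f"no value for axes: {missing}")
        return {axis: min(1.0, max(0.0, values[axis])) for axis in assigned}


space = DimensionSpace()
# Placeholder axis names, assumed for illustration only.
space.assign_open_axis("required_expertise")
space.assign_open_axis("feedback_modalities")

# Invented values for a single moment during repull.
moment = space.snapshot({axis: 0.5 for axis in space.axes})
print(moment)
```

A sequence of such snapshots, one per moment of the repull and release action, would correspond to the dynamically changing axes, and swapping an axis amounts to replacing one name in the list.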
We have suggested that the dimensions of repull offer promise for MIR and machine listening. This warrants a discussion of how and why we might access the dimensions of repull computationally. To be clear, we do not claim that it is possible to directly sense knowledge. Rather, we aim to detect (and eventually generate in real-time) expressions of embodied knowledge throughout the entire human-AI IPS.
Although we are inspired by existing dimension spaces for musical devices (Figures 1 & 2), our dimension space differs from existing approaches. Most notably, the dimensions that we propose relating to repull are interrelated, rather than orthogonal (Figure 3). In this way, our dimension space shows the potential of felt and musical features that we can extend directly from the repull action.
In terms of our future research, we will explore whether we can infer musical features relating to tension and release from patterns of repull and release in viola performance. We are interested in how the Felt (Muscular) and Sonic (Timbral) axes of our dimension space (Figure 3) are interrelated in terms of tension and release. To extend this relationship computationally, we will train an AI component of our system on time-aligned audio and electromyographic data from a viola performer. With this approach, our IPS design-in-progress seeks to extend embodied knowledge into design from performance practice.
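As a rough sketch of what time alignment could involve, the example below resamples a muscle-activity envelope onto the time stamps of audio analysis frames and then measures how the two series co-vary. It is not the planned system: the function names are hypothetical, the data are synthetic, and linear interpolation with Pearson correlation are stand-ins assumed for illustration; a real pipeline would first need to derive an envelope from raw electromyographic signals and a timbral descriptor from audio.

```python
import math


def resample_linear(times, values, target_times):
    """Linearly interpolate (times, values) at each target time.

    Assumes `times` is strictly increasing and `target_times` is
    non-decreasing. Targets outside the source range are clamped.
    """
    out = []
    j = 0
    for t in target_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        # Advance to the source interval [times[j], times[j + 1]] containing t.
        while times[j + 1] < t:
            j += 1
        weight = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] * (1 - weight) + values[j + 1] * weight)
    return out


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (neither constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic, made-up data: a muscle-activity envelope and a per-frame
# "brightness" descriptor sampled at different times.
emg_times = [0.0, 0.5, 1.0, 1.5, 2.0]
emg_envelope = [0.1, 0.4, 0.9, 0.5, 0.2]
frame_times = [0.25, 0.75, 1.25, 1.75]
brightness = [0.8, 0.5, 0.3, 0.6]

aligned = resample_linear(emg_times, emg_envelope, frame_times)
print(pearson(aligned, brightness))
```

The synthetic values were chosen so that higher muscular activity coincides with lower brightness, echoing the description of repull darkening timbre; they carry no empirical weight.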
Inspired by philosophical viewpoints, we have described how our bodies are not a ’black box’ of tacit knowledge, but that we can consider much (not all) of our embodied knowledge as explainable. Dissemination of knowledge in the performance practice community is testament to this idea. While knowledge from performance practice is typically packaged for a specialist audience in a way that is extensible specifically to music performance, we demonstrated how we can identify information in the performance practice literature and repackage it in a way that is extensible to HCI design practice in AIMC. This led to the contribution of a novel dimension space for IPS.
Our approach to designing with embodied knowledge may be extensible most obviously to the design of interactive music systems and augmented instrument configurations that build upon or include an element of acoustic instrument performance. Furthermore, this may be of use to intelligent instrument design grounded in movement-interaction, and possibly even to those who seek to inscribe DMI’s with embodied knowledge of music.
As such, this paper serves as an exemplar for how we can bridge embodied knowledge from performance practice and computational system design, from the very start of a bottom-up design process. Considering the idea of AI as an extension to human knowledge and intelligence, we have provided one possible answer to the wicked question: what are we extending?
We acknowledge the support provided for the first author’s PhD studentship by the Arts and Humanities Research Council of the United Kingdom and the Consortium for the Humanities and the Arts South-East England.