
Human-AI Musicking: A Framework for Designing AI for Music Co-creativity


ABSTRACT

In this paper, we present a framework for understanding human-AI musicking. This framework prompts a series of questions for reflecting on various aspects of the creative interrelationships between musicians and AI, and can thus be used as a tool for designing creative AI systems for music. AI is increasingly being utilised in sonic arts and music performance, as well as in digital musical instrument design. Existing works generally focus on the theoretical and technical considerations needed to design such systems. Our framework adds to this corpus by employing a bottom-up approach; as such, it is built from an embodied and phenomenological perspective. With our framework, we put forward a tool that can be used to design, develop, and deploy creative AI in ways that are meaningful to musicians, from the perspective of musicking (doing music). Following a detailed introduction to the framework, we introduce the four case studies that were used to refine and validate it, namely, a breathing guitar, a biosensing director AI, a folk-melody generator, and a real-time co-creative robotic score. Each of these is at a different stage of development, ranging from ideation, through prototyping, into refinement, and finally, evaluation. Additionally, each design case presents a distinct mode of interaction based on a continuum of human-AI interaction, which ranges from creation tool to co-creative agent. We then present reflection points based on our evaluation of using, challenging, and testing the framework with active projects. Our findings warrant future widespread application of this framework in the wild.

Author Keywords

Creative AI, Human-AI Interaction, Interaction Design Framework, Embodied Musicking

1. INTRODUCTION

In recent years, AI has increasingly permeated the visual, sonic, and performative arts (among other art forms). Consequently, AI-based interactions have also been explored extensively in musicking [13], that is, the act of making music such as performance, composition, improvisation, listening, sensing, and designing (among others). The research discourse supporting this practice generally focuses on the theoretical and technical considerations needed to design such systems, for example by presenting advances in machine listening and the composition of new music [12]; outlining design features for embodied musical prediction interfaces that simplify musical interaction and prediction to just one dimension of continuous input and output [10]; proposing design considerations for achieving “meta-creative musicianship” with autonomous musical agents [4]; exploring systems that achieve real-time music information retrieval for live music performances [6]; and offering software toolboxes for sound bank mining as a creative resource for “techno-fluent” musicians [16]. In turn, McCormack et al. [11] offer a framework for maximizing human–AI creative interaction by considering the “theoretical and practical considerations needed for their design so as to support improvisation, performance, and co-creation through real-time, sustained, moment-to-moment interaction”.

While the use of AI in musicking is a rich and emerging area of research, most of this work proposes design-focused approaches to human–AI music interaction. Yet Small’s concept of musicking suggests that the ‘act of musicking establishes in the place where it is happening a set of relationships, and it is in those relationships that the meaning of the act lies’ [13]; this relational phenomenon is rarely dealt with directly. As such, there is a lacuna in the discourse that focuses on the embodied and phenomenological meaning-making relationships that emerge from musicking with AI. By that we mean, firstly, what the conscious experience of musicking with AI is, as experienced from the subjective or first-person point of view; and secondly, how this understanding transforms the design of the creative-AI that we build for musicking. In response, this short paper investigates the following research question: to what extent does a phenomenological approach to understanding musicking with creative-AI enhance the development of these systems at various stages of the creative process?

2. THE HUMAN-AI MUSICKING FRAMEWORK

To address our research question, we determined that one potential way is through intermediate design knowledge—i.e., what sits between general theories and specific design instances [9], e.g., heuristics, design patterns, annotated portfolios, toolkits, methods, and conceptual frameworks (the focus of this paper). These structured collections of related concepts can be used to explain a complex phenomenon, for example by providing sensitizing concepts to guide the thematic analysis of data, or to help generate new design possibilities, and can take various forms including taxonomies and dimensions. In this paper, we introduce a framework for designing AI for music co-creativity and understanding human-AI musicking relationships.

2.1 Foundations of the Framework

The concepts underlying our framework are derived from our own practice-based experience with AI-based musicking and from the wider literature, and were applied to four ongoing design case studies to validate and deepen the concepts and demonstrate potential applicability. We thus draw from previous works that consider the role of AI in human creativity and that gather diverse perspectives from technologists, artists, musicians, researchers, and academics on its creative applications [20], and from works that compile research, methodologies, and reflections from practitioners in music and human-computer interaction (HCI), also known as music interaction [8], exploring universal and situated approaches within particular cultural and aesthetic contexts in music.

We also draw from previous works that approach AI-based musicking from a human-centered perspective to design, develop and deploy real-time intelligent systems that cooperate with humans in a “deep and meaningful way”, as described in the “Embodied Musicking Robots” project [18], and works that propose embodied and post-phenomenological approaches to outline theoretical and conceptual structures such as embodiment, embodied cognition, flow, musicking, meaning-making, and embodied interaction to argue for their employment in musicking [19].

Furthermore, we expand on existing frameworks, such as that presented by Vear in The Digital Score [17], which is grounded in the double perspective of Taking-In: Taken-Into, used to identify key signatures of creative relationships in musicking and to understand how digital scores create meaning. In basic terms, the Taking-In signatures are the connections of media and behaviour that “reach out, suggest, offer and shift through tendrils of affordance”, and the Taken-Into signatures are those that “establish a world of creative possibilities for exploration through the flow”.

We also gather inspiration from relational frameworks such as the “Contesting Control” framework, which considers how humans contest control with increasingly autonomous systems, and with their own bodily responses, as these systems become more deeply connected to their bodies [3]. Drawing on examples of a breath-controlled robotic ride, a brain-controlled film, and a musical duet with a robotic piano, it expresses how artistic experiences can take people on journeys through varying degrees of surrendering control, being aware of control, and looseness of control. A further inspiration is the ‘Aesthetic Failure’ framework, which emerged from a detailed analytical study of performance with the aforementioned robotic piano [7]. This latter study revealed how failure in performance is a complex and layered phenomenon, ranging from improvisations arising from musical failures to catastrophic ones in which performances break down. It also reveals how humans may adopt different artistic strategies to failure when performing with autonomous systems, including taming, gaming, riding, and becoming the system. Our framework adds to this corpus by proposing an embodied and relational framework that can additionally be utilised as a design tool.

2.2 The Framework as a Design Tool

The aim of this framework is to provide a method and a tool that draws the musician progressively closer to the embodied relationships that could form meaning from inside the perspective of musicking. Some of its features are more relevant at different points in the creation process (i.e., the musician will be focusing on a different set of priorities through design, prototyping, user-testing, refinement, and evaluation), but all are nonetheless worth considering. The framework presents a series of questions that are broadly structured to understand a) how the creative AI is manifest in the human-AI relationship, b) how it operates, and c) how it interrelates with the human musician.

2.2.1 How it is Manifest in the Human-AI Relationship

The first part of the framework (Figure 1) is split into three questions:

2.2.1.1 Creative Goal of the creative-AI, from which we can address how it is manifest (or intended to be manifest) within the embodied flow of musicking. In short, this first question determines the specific objective of the creative AI as it engages with a musicking scenario. This might be a relatively simple task, such as generating a melodic line in advance of a live performance, or a more involved one, such as in-the-loop control of real-time parameters by sensing the human’s gestural activity.

2.2.1.2 Liveness identifies the type of being-in-the-world sensation that the creative-AI proffers as it engages with the musicking activity. It is a sensation that something is, or is not, co-operating in the real-time flow. Occurrences of liveness reach out to the musician and affect their musicking to such an extent that it feels as if the creative-AI is there with them. For example, the creative-AI might be unaware of the “real-timeness” of the music and only called into action at specific points, in which case it has static and potential liveness: it is waiting to do something in this world. Alternatively, it could be analysing the melodic shape of a jazz improvisation and providing a harmonic accompaniment, in which case it has active and in-the-loop liveness: it is actively sensing the human, analysing them, and adding sound into the music.
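
To ground this distinction, the minimal sketch below (our own illustration, not drawn from any of the case-study systems) contrasts the two kinds of liveness; the sensing and playback callbacks are hypothetical placeholders for whatever audio analysis and synthesis the surrounding system provides.

```python
import time
import random

class StaticLivenessAI:
    """Static/potential liveness: unaware of real time, this agent only acts
    when explicitly called, e.g. to pre-generate a melody before a performance."""
    def generate_melody(self, length=8):
        scale = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, MIDI note numbers
        return [random.choice(scale) for _ in range(length)]

class InTheLoopLivenessAI:
    """Active/in-the-loop liveness: the agent continuously senses the human
    and contributes sound into the ongoing flow of the music."""
    def accompany(self, sense_pitch, play_chord, duration=10.0):
        end = time.time() + duration
        while time.time() < end:
            pitch = sense_pitch()                      # percept from the live musician
            play_chord([pitch, pitch + 4, pitch + 7])  # simple triad response
            time.sleep(0.25)                           # moment-to-moment update
```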

2.2.1.3 Presence extends upon notions of liveness and helps to identify the way the creative-AI is manifest inside musicking. This we can categorise into three states: tangible (something is there), reasoning (a mind in action), and dimensional (I am somewhere). Emmerson argues that presence implies something more than simply that a sound is there, or that human agency is there; rather, we should consider our “experience of it” as the primary connection in defining the relationship to these presences [5].

 

Figure 1. Components of AI Manifestation in Musicking.

2.2.2 How It Operates in Musicking

The second part of the framework is split into four questions that identify how the creative-AI operates inside musicking (Figure 2).

2.2.2.1 The first of these is Belief, which in this sense is used to describe an acceptance by the creative-AI that something is true, or that it has trust or confidence in something from its perspective, especially in how it interprets percepts (objects of perception) in the environment of musicking. These are the limitations of a worldview that are embedded into the AI’s programming and its understanding of its world (its umwelt). In this sense, beliefs are not facts; they are subjective and individual, and deal with conceptual parameters that may be biased or prejudiced [2].
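
As a rough illustration of belief as an encoded umwelt (a sketch of our own, with assumed parameter names rather than any system described in this paper), the snippet below hard-codes a worldview of meter, pitch set, and tempo through which all incoming percepts are interpreted.

```python
# Belief as an encoded worldview: the agent assumes a fixed meter, a diatonic
# pitch set, and a tempo range. These are not facts about the music; they are
# the subjective limits of this agent's umwelt.
BELIEFS = {
    "meter": (4, 4),                        # assumes all music is in 4/4
    "pitch_set": {0, 2, 4, 5, 7, 9, 11},    # pitch classes of the major scale
    "tempo_range_bpm": (60, 180),           # anything outside is treated as noise
}

def interpret_percept(midi_pitch, tempo_bpm):
    """Interpret an incoming percept through the agent's beliefs."""
    in_scale = midi_pitch % 12 in BELIEFS["pitch_set"]
    lo, hi = BELIEFS["tempo_range_bpm"]
    plausible_tempo = lo <= tempo_bpm <= hi
    return {"in_scale": in_scale, "plausible_tempo": plausible_tempo}
```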

2.2.2.2 Language identifies the ways in which the creative-AI reaches out from its world into the musicking world and communicates with the human musician(s) or another AI agent. This could be directly through sound or notation, but equally, it could be through embodied gestures of a robotic arm, or as visualising prediction data from a factory of neural nets, or as data streams.

2.2.2.3 In Perception we consider the creative-AI as an intelligent agent that can have its operational behaviour determined or informed by percepts (objects of perception) as it interacts with the dynamic situation of musicking. This question helps identify what it is reading from the human-musicking world and why.
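For instance, a hedged sketch of percept extraction might use an off-the-shelf audio analysis library such as librosa (assuming it is available) to read onset density and a rough pitch contour from a live or recorded excerpt; which percepts the creative-AI should read, and why, remains the design question this feature poses.

```python
import numpy as np
import librosa

def extract_percepts(audio_path):
    """Extract simple percepts from an excerpt: onset density and a rough
    fundamental-frequency estimate that might inform the agent's behaviour."""
    y, sr = librosa.load(audio_path, mono=True)
    onset_env = librosa.onset.onset_strength(y=y, sr=sr)
    onsets = librosa.onset.onset_detect(onset_envelope=onset_env, sr=sr)
    f0 = librosa.yin(y, fmin=librosa.note_to_hz("C2"),
                     fmax=librosa.note_to_hz("C6"), sr=sr)
    return {
        "onset_density": len(onsets) / (len(y) / sr),  # onsets per second
        "median_f0_hz": float(np.nanmedian(f0)),
    }
```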

2.2.2.4 When considering Learning we should be mindful that not all creative-AI needs to learn on-the-job, or even beforehand. Machine learning is only a part of AI, although it is being used increasingly in music. Creative-AI could be built upon symbolic algorithms or fuzzy logic; the defining factor of its creativeness and “AI-ness” is the context within which its behaviour is perceived to be creative and/or intelligent [20]. If the creative AI is learning, or has learned, then what is it learning, and how does that relate to the interrelationships between human and AI?
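
To illustrate that machine learning is not a prerequisite, the sketch below (our own example) generates melodic material from a hand-written transition table with no learning at all; whether it reads as creative depends entirely on the musicking context in which it is deployed.

```python
import random

def markov_melody(seed=60, length=16):
    """A generator with no machine learning: a hand-written first-order
    transition table over scale steps, applied to MIDI note numbers."""
    steps = {-2: 0.2, -1: 0.3, 0: 0.1, 1: 0.3, 2: 0.1}  # step size: probability
    melody, pitch = [seed], seed
    for _ in range(length - 1):
        step = random.choices(list(steps), weights=list(steps.values()))[0]
        pitch = min(84, max(48, pitch + step))  # clamp to a playable range
        melody.append(pitch)
    return melody
```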

Figure 2. Components of AI Operation in Musicking.

2.2.3 How it Interrelates with the Human Musician

The final part of this framework brings the human back into focus (Figure 3).

2.2.3.1 Interactions identifies how all the previous characteristics of the creative AI play out in the temporality of musicking. It considers the type of interaction that the creative AI presents and/or elicits in musicking. This might be playful, in that it offers alternative solutions that the human uses as points of departure or inspiration, or pragmatic, in that it controls the sequencing of form and materials. This feature also considers the mode of creative interaction and whether it is concurrent, collaborative, co-creative, or a combination of these over time [18].

2.2.3.2 Given the heavy focus on the behaviours and characteristics of the creative AI in musicking so far, these last two features highlight the Role of Human-in-the-loop and its Impact upon human creativity. As mentioned above, this framework focuses on bottom-up experience: the creative AI has so far been built up from a goal, a sense of liveness and presence, belief, and perception. We can use this deeper phenomenological understanding to re-assess and re-align the original conceptions about the human musician’s involvement and to appraise whether the creative-AI is fit for purpose, or whether its design offers new ways in which the human can interact with it.

2.2.3.3 The final Impact feature of the framework asks us to reflect upon any shifts in the musicking of the human(s). The assumption here is that the human musicians are open to considering the creative AI as having the potential to influence the way they make music within the context of this musicking activity, or even to influence how they approach music-making in the project or more generally.

 

Figure 3. Components of AI Interrelations in Musicking.
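
In practice, we found it helpful to treat the framework as a structured set of prompts that a design team can answer and revisit at each stage of development. The sketch below is one possible encoding of those prompts (our own convenience representation, not part of the framework’s definition), together with a small helper for spotting questions a project has not yet addressed.

```python
# The framework's three parts expressed as prompts a design team fills in.
HUMAN_AI_MUSICKING_FRAMEWORK = {
    "manifestation": {
        "creative_goal": "What is the AI's specific objective in this musicking scenario?",
        "liveness": "Is its liveness static/potential or active/in-the-loop?",
        "presence": "Is it manifest as tangible, reasoning, or dimensional presence?",
    },
    "operation": {
        "belief": "What worldview (umwelt) is embedded in its programming?",
        "language": "How does it reach out: sound, notation, gesture, data?",
        "perception": "Which percepts does it read from the musicking world, and why?",
        "learning": "Is it learning (on-the-job or beforehand), and of what?",
    },
    "interrelation": {
        "interactions": "Is the interaction playful or pragmatic; concurrent, collaborative, co-creative?",
        "role_of_human": "What is the human-in-the-loop's role, given the above?",
        "impact": "How does the AI shift the human's musicking?",
    },
}

def unanswered(responses):
    """Return the framework questions a project has not yet addressed."""
    return [
        (part, feature)
        for part, features in HUMAN_AI_MUSICKING_FRAMEWORK.items()
        for feature in features
        if feature not in responses.get(part, {})
    ]
```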

3. APPLYING THE FRAMEWORK

We applied this framework to four case studies for which a human-centred approach to designing systems that support AI-based musicking is a common thread. These were a breathing guitar, a biosensing director AI, a folk-melody generator, and a real-time co-creative robotic score. Each of these case studies was at a different stage of development—ranging from ideation, through prototyping, into refinement, and finally, evaluation—and each sits at a different point on a continuum of human-AI interaction that ranges from creation tool to co-creation agent. The framework was applied as a way of evaluating and shaping the future development of each project. An evaluation of this process is beyond the remit of this short paper, but we intend to report on progress separately. However, it is worth outlining these four projects briefly to illustrate the range of applications to which this framework could be applied. The four case studies are:

3.1 The Breathing Guitar

The Breathing Guitar by Juan Martinez Avila is a project at the ideation stage that used the framework to outline basic phenomenological design considerations. The idea emerged as a design concept during an embodied ideation exercise [1]: a guitar that shapeshifts via inflatable elements across different parts of the instrument, which respond to the breathing patterns of the guitarist by pushing back against their body. The vision is that the instrument would behave like an intelligent companion, at times comforting the guitarist, and at times putting them out of their comfort zone during performance.
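
Although the project is only at the ideation stage, a speculative control loop conveys this intent; the breath-sensor and actuator interfaces below are hypothetical placeholders rather than any implemented design.

```python
import time

def breathing_guitar_loop(read_breath_level, set_inflation, comfort=0.5, duration=60.0):
    """Speculative sketch of the Breathing Guitar concept: inflatables push back
    against the guitarist's body in response to their breathing. `read_breath_level`
    and `set_inflation` are hypothetical interfaces to a breath sensor (0..1) and
    pneumatic actuators (0..1)."""
    end = time.time() + duration
    while time.time() < end:
        breath = read_breath_level()
        if breath < comfort:
            set_inflation(0.3 + breath)   # gentle, comforting counter-pressure
        else:
            set_inflation(1.0 - breath)   # push back, nudging out of the comfort zone
        time.sleep(0.05)                  # ~20 Hz update
```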

3.2 Rhizomessages

Rhizomessages by Solomiya Moroz is currently at the R&D stage and used the framework to guide the prototyping process and to identify areas of focus in software development and human-computer interaction. It is a chamber opera work-in-progress that creates an immersive sonic ecosystem between plant life, musicians, and the audience present during the performance. The AI embedded in this digital score is inspired by the communication of plants through action and variation potentials and is manifested as two networks: Themes and Variations. Together, these form the total ecosystem of the digital score, which is composed and communicated with the performing musicians and the plants’ responses in real time.
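
A speculative sketch of this routing is shown below; the threshold, signal units, and material pools are hypothetical placeholders standing in for the project’s actual plant-sensing setup and compositional networks.

```python
import random

def route_plant_signal(variation_potential_mv, themes, variations, threshold=20.0):
    """Speculative sketch of the two networks: modest plant signals select
    material from a Themes pool, while larger variation potentials trigger
    the Variations network. Threshold and pools are illustrative only."""
    if abs(variation_potential_mv) > threshold:
        return ("variation", random.choice(variations))
    return ("theme", random.choice(themes))

# Hypothetical usage with placeholder material labels:
print(route_plant_signal(35.2, themes=["theme_A", "theme_B"],
                         variations=["var_1", "var_2", "var_3"]))
```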

3.3 AI Folk Session Player

The AI Folk Session Player by Steve Benford used the framework to refine design decisions and to roadmap the next stages of development from a human-centred perspective. It builds on the existing FolkRNN project [14,15], using it as a springboard for ideas and to shape future development. While FolkRNN’s compositions have been performed by humans as part of concerts, sessions, competitions, and on recordings, the literature does not provide an account of the AI itself performing them. The long-term aspiration of this project is therefore to create an AI session player that can perform FolkRNN’s tunes live at traditional sessions, folk clubs, and gigs, alongside human musicians. Using AI in this way is seen as an interesting means of extending the instrument’s functionality by enabling it to conjure up its own tunes.
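
As a hedged sketch of what “performing live” could minimally mean for such a session player, the snippet below schedules a short tune over MIDI using the mido library; the hard-coded tune is a hypothetical stand-in for a FolkRNN-generated melody already converted from ABC notation, and a MIDI backend and output port are assumed to be available.

```python
import time
import mido

# Hypothetical tune, standing in for a FolkRNN output converted to
# (MIDI note, duration in beats) pairs.
TUNE = [(62, 0.5), (64, 0.5), (66, 1.0), (69, 0.5), (67, 0.5), (66, 1.0)]

def perform(tune, bpm=120, port_name=None):
    """Send the tune out over MIDI so the AI 'performs' it in a session.
    Requires a MIDI backend (e.g. python-rtmidi) and an available output port."""
    beat = 60.0 / bpm
    with mido.open_output(port_name) as port:
        for note, beats in tune:
            port.send(mido.Message("note_on", note=note, velocity=80))
            time.sleep(beats * beat)
            port.send(mido.Message("note_off", note=note))

if __name__ == "__main__":
    perform(TUNE)
```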

3.4 Score-drawing Robotic Arm

This robot arm digital score by Craig Vear extends the research into embodied musicking robots by Vear [17,18,20]. At the point of engaging with the framework, the first stage of the project had just been completed, and the next iteration was being evaluated ahead of deployment. This next-stage arm uses an AI Factory to listen to a live musician and then create a response in real time by moving and drawing something on the sheet of paper and in the space above the paper. This in turn is interpreted by the live musician, who responds accordingly by making an appropriate sound, thereby completing a feedback loop. An additional feedback loop was developed that plugged a disabled musician into this system using a brainwave reader and a skin-arousal sensor to stream her condition into the AI Factory. This further informed the decision-making process of the AI and acted as a sort of cybernetic extension to the musician’s ability to thrive in a music-making relationship with able-bodied musicians. An example of how this stage of development was shaped by the framework can be seen in this example [1].
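
A simplified sketch of this double feedback loop is given below; the listening, biosensing, and drawing interfaces are hypothetical placeholders, and the decision rule is illustrative rather than a description of the actual AI Factory.

```python
import time
import random

def ai_factory_decision(audio_features, bio_signal):
    """Stand-in for the AI Factory: combine listening features with the streamed
    biosignal to choose a drawing gesture. Purely illustrative logic."""
    energy = audio_features.get("rms", 0.0) + bio_signal.get("arousal", 0.0)
    return {"gesture": "sweep" if energy > 0.5 else "dot",
            "height_mm": random.uniform(0, 40)}   # on the paper or in the space above it

def score_drawing_loop(listen, read_biosensors, draw, steps=8):
    """Speculative feedback loop: listen to the live musician, fold in the
    co-musician's biosignals, draw a response, and leave space for the musician's
    sounded reply before the next cycle. All three callables are hypothetical
    interfaces to audio analysis, sensors, and the robot arm."""
    for _ in range(steps):
        decision = ai_factory_decision(listen(), read_biosensors())
        draw(decision)
        time.sleep(2.0)   # space for the musician to interpret and respond
```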

4. CONCLUDING REMARKS

When addressing the central research question posed at the start of this paper, we found that the framework helped shape thoughts on how the individuals would like AI to behave inside their works. This included helping to make the transition between some initial ad-hoc experimentation and thinking in a more structured way about how the project could and should develop next. In this respect, the framework added useful structure to creative thinking and future priorities at key moments in the projects’ timelines, and, crucially, from inside musicking. Although this was validated through a small study using only four case studies from a single research community, there are very encouraging signs that warrant wider release of the framework.

Depending on a project’s stage of development, different features/questions in each part of the framework became more or less prominent, but all were nonetheless worth considering. For example, trying to differentiate between liveness and presence, beliefs and perception, and beliefs and learning at the initial development stage of a project led to some confusion, as the project’s identity had not yet been concretely developed. However, considering these aspects did start to “add flesh to the ideas”. Conversely, thinking about the AI design in part 2 was beneficial for all participants regardless of experience, and highlighted how developing creative-AI is a process.

Being offered an opportunity to evaluate different choices from a phenomenological perspective highlighted certain aspects of the AI’s presence and the embodied nature of the AI in performance. This included developing ideas about how it collaborates, which aspects the musician wished the AI to co-create, and its persona. This led to all the projects developing new ideas about the role of AI, how expert it needs to be, where its fallibilities and flaws could manifest, and how its ‘spirit’ could be considered to have a sense of self-awareness, which added a sense of co-creation to each project.

5. ACKNOWLEDGMENTS

The Digital Score project (DigiScore) is funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. ERC-2020-COG – 101002086).

6. REFERENCES

[1]             Juan Martinez Avila, Vasiliki Tsaknaki, Pavel Karpashevich, Charles Windlin, Niklas Valenti, Kristina Höök, Andrew McPherson, and Steve Benford. 2020. Soma Design for NIME. In Proceedings of the 2020 International Conference on New Interfaces for Musical Expression (NIME’20).

[2]             Avron Barr, Edward A Feigenbaum, and Paul R Cohen. 1981. The handbook of artificial intelligence. William Kaufmann.

[3]             Steve Benford, Richard Ramchurn, Joe Marshall, Max L. Wilson, Matthew Pike, Sarah Martindale, Adrian Hazzard, Chris Greenhalgh, Maria Kallionpää, Paul Tennent, and Brendan Walker. 2021. Contesting control: journeys through surrender, self-awareness and looseness of control in embodied interaction. Human–Computer Interaction 36, 5–6 (October 2021), 361–389. DOI:https://doi.org/10.1080/07370024.2020.1754214

[4]             Andrew R Brown, Matthew Horrigan, Arne Eigenfeldt, Toby Gifford, Daniel Field, and Jon McCormack. 2018. Interacting with Musebots. 19–24.

[5]             Simon Emmerson. 2017. Living electronic music. Routledge.

[6]             Rebecca Fiebrink, Dan Trueman, and Perry R Cook. A Meta-Instrument for Interactive, On-the-fly Machine Learning.

[7]             Adrian Hazzard, Chris Greenhalgh, Maria Kallionpaa, Steve Benford, Anne Veinberg, Zubin Kanga, and Andrew McPherson. 2019. Failing with Style: Designing for Aesthetic Failure in Interactive Performance. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19), Association for Computing Machinery, New York, NY, USA, 1–14. DOI:https://doi.org/10.1145/3290605.3300260

[8]             Simon Holland, Tom Mudd, Katie Wilkie-McKenna, Andrew McPherson, and Marcelo M. Wanderley. 2019. New Directions in Music and Human-Computer Interaction. Springer.

[9]             Kristina Höök and Jonas Löwgren. 2012. Strong concepts: Intermediate-level knowledge in interaction design research. ACM Transactions on Computer-Human Interaction (TOCHI) 19, 3 (2012), 23. DOI:https://doi.org/10/f225d4

[10]           Charles Patrick Martin, Kyrre Glette, Tønnes Frostad Nygaard, and Jim Torresen. 2020. Understanding Musical Predictions With an Embodied Interface for Musical Machine Learning. Frontiers in Artificial Intelligence 3, (2020). Retrieved January 30, 2023 from https://www.frontiersin.org/articles/10.3389/frai.2020.00006

[11]           Jon McCormack, Patrick Hutchings, Toby Gifford, Matthew Yee-King, Maria Teresa Llano, and Mark D’inverno. 2020. Design Considerations for Real-Time Collaboration with Creative Artificial Intelligence. Organised Sound 25, 1 (2020), 41–52. DOI:https://doi.org/10.1017/S1355771819000451

[12]           Eduardo Reck Miranda. 2021. Handbook of Artificial Intelligence for Music: Foundations, Advanced Approaches, and Developments for Creativity. Springer Nature.

[13]           Christopher Small. 1998. Musicking: The Meanings of Performing and Listening. Wesleyan University Press.

[14]           Bob Sturm. 2018. What do these 5,599,881 parameters mean?: An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer.

[15]           Bob L. Sturm, Oded Ben-Tal, Úna Monaghan, Nick Collins, Dorien Herremans, Elaine Chew, Gaëtan Hadjeres, Emmanuel Deruty, and François Pachet. 2019. Machine learning research that matters for music creation: A case study. Journal of New Music Research 48, 1 (January 2019), 36–55. DOI:https://doi.org/10.1080/09298215.2018.1515233

[16]           Pierre Alexandre Tremblay, Gerard Roma, and Owen Green. 2021. Enabling Programmatic Data Mining as Musicking: The Fluid Corpus Manipulation Toolkit. Computer Music Journal 45, 2 (June 2021), 9–23. DOI:https://doi.org/10.1162/comj_a_00600

[17]           Craig Vear. 2019. The Digital Score: Musicianship, Creativity and Innovation. Routledge.

[18]           Craig Vear. 2021. Creative AI and Musicking Robots. Frontiers in Robotics and AI 8, (2021). Retrieved January 31, 2023 from https://www.frontiersin.org/articles/10.3389/frobt.2021.631752

[19]           Craig Vear. 2022. Embodied AI and Musicking Robotics. In The Language of Creative AI: Practices, Aesthetics and Structures. Springer, 113–135.

[20]           Craig Vear and Fabrizio Poltronieri. 2022. The Language of Creative AI: Practices, Aesthetics and Structures. Springer Nature.




[1] https://www.facebook.com/DigiScoreERC/videos/6355463204514665/
