Embodied Perspectives on Musical AI (EmAI) Workshop

Published on Aug 29, 2023


How will machines perceive humans as diverse embodied entities, and how will humans communicate trustfully with devices that exploit multiple modalities? How can embodiment theories contribute to creating intelligent musical agents? At large, how will we make music with AI in the future? Interaction and artistic contexts, perceptual-motor constraints, affective states, environmental features, sensing devices, and computational systems are just a few of the factors that come into play in attempting to answer such questions. An interdisciplinary research model encompassing natural sciences, humanities, cognition, and performing arts is often necessary to understand conventional forms of musical collaboration and create novel music technologies. With these questions in mind, we organized the EmAI workshop in November 2022. Following the extensive interest and positive feedback of the participants and audience, we now seek to convene a group of scholars, artists, and engineers from diverse disciplines for the second edition of the workshop as part of the AI Music Creativity conference.

Workshop description

Embodiment, or, more concretely, musical embodiment, denotes how the body shapes our musical experiences. For example, you may exert more or less effort depending on the uncertainty of a musical situation or the technical difficulty of a passage. Such varying levels of effort during a live performance can lead to particular affective states resulting in bodily arousal. You can also use your body functionally, such as swaying your whole body to help keep the groove or nodding your head to signal your bandmate to return to the tune’s main melody. From an enactive perspective, our perception is shaped by our actions. As such, cognition emerges not just through information processing but mainly from the dynamic interaction between the agent and the environment. All in all, we experience music with our body, using more modalities than hearing, regardless of whether we perform or listen to it.

Most musical artificial intelligence (AI) and multi-agent systems (MAS) focus on the music information found in the auditory domain. Modeling instrumental acoustics, synthesizing raw audio, and generating symbolic music data are highly complex tasks that AI can already accomplish to some extent. However, it remains unclear how these technologies can collaborate with musicking humans. While there is a growing trend toward multimodality, which can be observed in state-of-the-art prompt-based audio/visual generators and large language models, human-AI interactivity is still relatively limited. An expanded effort to employ embodied and tangible interaction channels can partly overcome that limitation. However, achieving a bare minimum of trustworthiness relies on strong non-verbal communication and mutual understanding, which can be fairly trivial notions for humans. While the EmAI workshop is motivated to tackle a broad range of topics and questions concerning embodiment, we propose Embodied Trustworthiness as the workshop theme, which is critical for collaborative creativity and human-AI partnership in general.

Why Trust? Brief theoretical remarks

Trust is a complex phenomenon that has been defined within sociology [1][2], psychology [3][4], economics [5], philosophy [6], and, more recently, computer science [7][8][9]. These definitions range from behavioral and moral aspects to affective and personal attributes. Essentially, trust is critical for any positive relationship. Regardless of the nature of the agents, a trustor expects something positive from the trustee, knowing that the trustee might fail. Research in computer science and human-robot interaction often distinguishes two perspectives: “performance trust,” referring to the confidence that the trustee can deliver the task, and “moral trust,” referring to social interactions in which the trustee is expected to act morally appropriately and not exploit the trustor’s vulnerability [10]. In social cognition, these perspectives can be seen as equivalent to "competence" and "warmth," respectively [11]. The former has been significantly addressed in the context of human-automation interactions [12], whereas the latter has only recently started to attract growing cross-disciplinary interest, since intelligent systems are increasingly used in social contexts [13][14][15] (e.g., arts, health, education, entertainment, and productivity), making their acceptance as trusted social agents critical.


Previous edition

The first EmAI workshop was held in hybrid form over two days in November 2022 at the University of Oslo. To accommodate more presentations and more extended discussions, we structured the two days as follows:

  • Five thematic sessions

    • Design and interaction

    • Software and synthesis

    • Modeling and analysis

    • Creativity and expressivity

    • Mapping and control

  • Two keynote speeches 

  • A keynote performance

  • Installations and demos

The recorded live streams can be accessed at:

Embodied Perspectives on Musical AI (EmAI) - Workshop - Day 1
Embodied Perspectives on Musical AI (EmAI) - Workshop - Day 2

Proposed edition for AIMC

Title: Embodied Perspectives on Musical AI (EmAI) Workshop

Theme: Embodied Trustworthiness

Duration: One full day

Participation: By submissions to be curated by the workshop organizers

Submissions: Anything of relevance to the topic of Embodied Perspectives on Musical AI.

Sessions: TBA

Organizers: Çağrı Erdem, Riccardo Simionato, Sayed Mojtaba Karbasi, Alexander Refsum Jensenius, Carsten Griwodz

Technical rider

The technical requirements are open to discussion with the conference organizers. The following list roughly contains the main equipment used in the first edition:

  • A mid-sized auditorium or conference hall

  • Large screen

  • Sound system

  • Live streaming equipment:

    • 1-2 cameras

    • Wireless microphones

    • A laptop or desktop PC 

  • Seating for the in-person audience

  • A table for the presenter

  • Various cables and power outlets (HDMI, minijack, etc.)
