« Latent Terrain » : Dissecting the Latent Space of Neural Audio Autoencoders


Information

Type
Conference series, symposium, congress
Venue
Ircam, Salle Igor-Stravinsky (Paris)
Date
28 March 2025

We present Latent Terrain, an algorithmic approach to dissecting the latent space of a neural audio autoencoder into a two-dimensional plane. Latent Terrain questions the conventional paradigms of dimensionality reduction in creative interactive systems, in which the projection from high- to low-dimensional spaces is done by modelling similar objects with nearby points. Instead, the terrain generated by our approach is a mountainous, steep surface, and this steepness affords greater spectral complexity when navigating an audio autoencoder's latent space.
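The abstract does not spell out how a terrain is parameterised, so the following is only a hypothetical sketch of the idea: a fixed coordinate-based map from a 2-D plane position to a latent vector, where sinusoidal features make the surface deliberately "steep", so that a small step on the plane can move far in latent space. All names, weights, and dimensions below are invented for illustration.

```python
import numpy as np

def make_terrain(latent_dim=8, n_features=16, seed=0):
    """Hypothetical terrain: a fixed random coordinate-based map
    from a 2-D plane position (x, y) to a latent vector.
    High random frequencies make the surface 'mountainous':
    nearby plane points can map to distant latent points."""
    rng = np.random.default_rng(seed)
    freqs = rng.normal(scale=4.0, size=(n_features, 2))    # random 2-D frequencies
    phases = rng.uniform(0, 2 * np.pi, size=n_features)
    proj = rng.normal(size=(n_features, latent_dim))       # features -> latent space

    def terrain(x, y):
        feats = np.sin(freqs @ np.array([x, y]) + phases)  # (n_features,)
        return feats @ proj                                # (latent_dim,)
    return terrain

terrain = make_terrain()
z_a = terrain(0.50, 0.50)
z_b = terrain(0.51, 0.50)          # a tiny step on the plane...
print(z_a.shape)                   # (8,)
print(np.linalg.norm(z_b - z_a))   # ...can still move noticeably in latent space
```

This contrasts with a similarity-preserving projection (e.g. UMAP or t-SNE), where nearby plane points are chosen to decode to similar sounds by construction.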

Building on this, we present Latent Terrain Synthesis, a sound-synthesis method in which a waveform is generated by tracing a path across a terrain surface. Latent Terrain Synthesis aims to help musicians create tailorable, flexible materials for musical expression, leveraging the sonic capabilities of neural audio autoencoders such as RAVE.
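Schematically, the synthesis loop samples a path on the plane, looks up one latent vector per step on the terrain, and decodes the resulting latent trajectory into audio. The sketch below uses stand-in `toy_terrain` and `toy_decode` functions (both invented here); a real system would call a pre-trained autoencoder's decoder, e.g. a RAVE model, in place of `toy_decode`.

```python
import numpy as np

def synthesize_along_path(terrain, path, decode):
    """Latent terrain synthesis, schematically: each (x, y) point on
    the path is looked up on the terrain, and the resulting latent
    trajectory is decoded frame by frame into a waveform."""
    latents = np.stack([terrain(x, y) for x, y in path])  # (n_steps, latent_dim)
    return decode(latents)

# Toy stand-ins for illustration only.
def toy_terrain(x, y):
    return np.array([np.sin(3 * x + y), np.cos(2 * y - x)])

def toy_decode(latents, frame_len=64):
    # One short sine 'frame' per latent vector; amplitude and pitch
    # are driven by the two latent coordinates.
    t = np.linspace(0, 1, frame_len, endpoint=False)
    frames = [z[0] * np.sin(2 * np.pi * (4 + 2 * z[1]) * t) for z in latents]
    return np.concatenate(frames)

# An arc across the plane becomes a 50-step latent trajectory.
path = [(np.cos(a), np.sin(a)) for a in np.linspace(0, np.pi, 50)]
audio = synthesize_along_path(toy_terrain, path, toy_decode)
print(audio.shape)  # (3200,) = 50 steps x 64 samples per frame
```

The musical interface is then the path itself: drawing a different trajectory over the same terrain yields a different waveform from the same model.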

We provide nn_terrain, a set of Max/MSP externals that work together with nn~ to generate latent terrains for pre-trained RAVE models and let users navigate the terrain in real time.

In this talk, I will first present the technical details behind Latent Terrain, its workflow, its integration with RAVE, and a demo interface using a tablet and stylus. I will also report on a recent user-study workshop, conducted with co-authors Anna Xambó Sedó and Nick Bryan-Kinns at the Centre for Digital Music, Queen Mary University of London, in which 18 musicians from various backgrounds explored the musical affordances of latent terrains and derived sonic materials for musical expression.

Acknowledgment: This work is supported by the UKRI Centre for Doctoral Training in Artificial Intelligence and Music, supported by UK Research and Innovation [grant number EP/S022694/1].



