Latent Terrain Synthesis
Building new musical instruments that compose and interact with AI audio generators.
Last modified 2025-08-18
Jasper Shuoyang Zheng
Welcome
Latent terrain is a coordinates-to-latents mapping model for neural audio autoencoders (such as RAVE and Music2Latent). A terrain is a surface map of the autoencoder's latent space: it takes coordinates in a control space as input and produces continuous latent vectors in real time.
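To make the idea of a coordinates-to-latents mapping concrete, here is a minimal sketch in plain Python. It is not the nn.terrain~ implementation: the names `make_terrain` and `terrain`, the latent size of 8, and the use of random Fourier features with a random linear readout are all assumptions chosen for illustration. It only shows the general shape of the idea: a smooth function that takes a point in a low-dimensional control space and returns a latent vector.

```python
import math
import random


def make_terrain(control_dim=2, latent_dim=8, n_features=16, scale=3.0, seed=0):
    """Build a toy terrain: a fixed, smooth map from control coordinates to latents.

    Hypothetical stand-in for a trained terrain model; all sizes are assumptions.
    """
    rng = random.Random(seed)
    # Random frequencies that project control coordinates to phases.
    freqs = [[rng.gauss(0.0, scale) for _ in range(control_dim)]
             for _ in range(n_features)]
    # Random linear readout from the sin/cos features to the latent dimensions.
    std = 1.0 / math.sqrt(2 * n_features)
    weights = [[rng.gauss(0.0, std) for _ in range(2 * n_features)]
               for _ in range(latent_dim)]

    def terrain(coords):
        if len(coords) != control_dim:
            raise ValueError(f"expected {control_dim} coordinates, got {len(coords)}")
        feats = []
        for f in freqs:
            phase = sum(fi * ci for fi, ci in zip(f, coords))
            feats.append(math.sin(phase))
            feats.append(math.cos(phase))
        # One latent value per readout row.
        return [sum(w * x for w, x in zip(row, feats)) for row in weights]

    return terrain


terrain = make_terrain()

# A short walk across an XY pad: x moves from 0 to 1 while y stays at 0.5.
path = [(t / 10.0, 0.5) for t in range(11)]
latents = [terrain(p) for p in path]  # 11 latent vectors, each of length 8
```

In a real setup, each latent vector would be passed to the autoencoder's decoder to produce audio. Because the mapping is continuous, nearby points on the pad give nearby latent vectors.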
Latent terrain aims to open up the creative possibilities of latent space navigation, allowing one to adapt an autoencoder to easier-to-navigate interfaces (such as gestural controllers, stylus and tablets, XY-pads, and more), and build new musical instruments that compose and interact with AI audio generators.
An example latent space walk with Music2Latent:
Example applications
- Steering a neural audio autoencoder (tutorial coming soon).
- Building a 1D/2D latent granular synthesiser (tutorial coming soon).
- Latent looping device.
Supported autoencoders
Latent terrain can work with any audio autoencoder as long as it offers latent variables. However, only a limited number of them have been implemented for MaxMSP, and we have only tested the following models:
- RAVE
Realtime Audio Variational autoEncoder for fast and high-quality neural audio synthesis, by Antoine Caillon and Philippe Esling.
- Music2Latent-Scripted
Music2Latent is a Consistency Autoencoder to encode and decode audio samples, by Marco Pasini, Stefan Lattner, and George Fazekas. We're using a scripted fork of the original repository.
We plan to test the following model in the future:
Get started
Get in touch
Hi, this is Shuoyang (Jasper). nn.terrain~ is part of my ongoing PhD work on Discovering Musical Affordances in Neural Audio Synthesis, supervised by Anna Xambó Sedó and Nick Bryan-Kinns. Part of this work has been, and will continue to be, about putting AI audio generators into the hands of composers and musicians.
Therefore, I would love to have you involved in it. If you have any feedback, a feature request, or a demo, a device, or anything else made with nn.terrain, I would love to hear about it. If you would like to collaborate on anything, please leave a message in this feedback form.
Acknowledgements
Shuoyang Zheng, the author of this work, is supported by the UKRI Centre for Doctoral Training in Artificial Intelligence and Music [EP/S022694/1].