Executive summary
The KAMoulox project (ANR-15-CE38-0003) runs from January 2016 to September 2019. It is funded by ANR, the French national research agency, as a Junior Researcher project with principal investigator Antoine Liutkus. Its name stands for Kernel additive models for online unmixing of large historical archives. KAMoulox has two main objectives:
- Developing cutting-edge audio restoration techniques through signal processing research.
- Making them available for research on real-world sound archives, in an open-source environment.
Assets
The archives of the CNRS - Musée de l’Homme
Description
- Since the late 19th century, French ethnologists have been sent around the world with the mission of recording the world's audio heritage.
- To this day, this extraordinary database comprises 42k documents, of which 32k are digitized.
- The diversity of this archive is tantalizing, with items from 199 countries and 1200 ethnic groups.
Value
- The archives of the CNRS - Musée de l’Homme are working material for a whole community of scientists in digital humanities.
- As of December 2017, 2600 scientists use the archive each month.
- With content of extraordinary cultural value, the archives offer rich repurposing possibilities: musical creation, licensing, etc.
The mixed research unit CREM (Centre de Recherche en Ethno-Musicologie) is responsible for these archives.
Challenges
Although they are an invaluable source of information and a treasure of our intangible cultural heritage, the archives also come with specific challenges, which are the focus of the project.
- Most recordings are old, degraded and noisy, which often prevents their repurposing in cultural creation.
- Users are not sound engineers and cannot easily carry out restoration themselves.
- Due to the sheer volume of these archives and the time required, it is impossible to automatically restore all entries. Restoration must therefore be performed on demand.
Objectives for the project
- Audio restoration for the non-expert
- Embed online audio restoration tools in the archives
Audio signal separation
Description
- Antoine Liutkus and collaborators are specialists in audio source separation, which consists of recovering the different sounds in a recording. Denoising and demixing are examples.
- This topic relies heavily on probabilistic models for time series, which comprise two core elements:
  - The distribution considered and the corresponding filtering procedure. Our recent research focuses on α-stable models, which appear especially interesting for denoising.
  - Models for the spectrograms, which describe how a sound varies over time and frequency. This is the topic of our recent research on kernel and DNN models.
- A. Liutkus et al. “Gaussian processes for underdetermined source separation.” IEEE TSP, 59.7 (2011): 3155-3167.
- A. Liutkus et al. “Kernel additive models for source separation.” IEEE TSP, 62.16 (2014): 4298-4310.
- A. Liutkus et al. “Generalized Wiener filtering with fractional power spectrograms.” IEEE ICASSP, 2015.
- A. Nugraha et al. “Multichannel audio source separation with deep neural networks.” IEEE TASLP, 2016.
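To make the filtering step more concrete, here is a minimal sketch of soft-mask separation in the spirit of generalized Wiener filtering with fractional power spectrograms. It is an illustration under stated assumptions, not the project's implementation: the function name `alpha_wiener_separate`, its parameters and the toy data are hypothetical, and the per-source magnitude estimates are assumed to be given by some spectrogram model (kernel, DNN, or other).

```python
def alpha_wiener_separate(mixture, magnitudes, alpha=1.0, eps=1e-12):
    """Split mixture STFT coefficients into per-source estimates.

    mixture:    list of complex STFT coefficients of the mixture
    magnitudes: one list per source of non-negative magnitude estimates,
                each with the same length as `mixture` (assumed given)
    alpha:      exponent of the fractional power spectrogram;
                alpha=2 corresponds to the classical Wiener filter
    eps:        small constant that avoids division by zero
    """
    estimates = [[] for _ in magnitudes]
    for i, x in enumerate(mixture):
        # Fractional power of every source at this time-frequency bin.
        powers = [m[i] ** alpha for m in magnitudes]
        total = sum(powers) + eps
        for j, p in enumerate(powers):
            # Each source receives a share of the mixture proportional
            # to its fractional power; the shares sum to (almost) one.
            estimates[j].append(p / total * x)
    return estimates


# Hypothetical toy data: two time-frequency bins, two sources.
mixture = [1 + 1j, 2 + 0j]
magnitudes = [[3.0, 1.0], [1.0, 1.0]]
separated = alpha_wiener_separate(mixture, magnitudes, alpha=2.0)
```

Because the masks sum to one in every bin, the source estimates add up to the mixture, which is a convenient property for restoration tasks such as denoising. Lower values of `alpha` flatten the masks, so each source keeps a larger share of the bins where it is not dominant.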
Challenges
Our current research on probabilistic models for audio processing faces several important challenges that need to be addressed before the methods can be used in real applications.
- Tractability is an issue: applications require fast computations, and models need to be pre-trained. Scaling up training to the level of the archives remains a challenge, especially considering that the archives only contain real data, so that the true underlying noise or sources are unknown.
- Alpha-stable models are effective and explain many heuristics used in the field. However, learning their parameters and generalizing them to multichannel cases is a real challenge, due to the lack of an analytical expression for their likelihood function.
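The likelihood issue can be illustrated with a short sketch. A symmetric α-stable law has no closed-form density in general, but its characteristic function is known, exp(-|σω|^α), so the density can be evaluated numerically by Fourier inversion. The code below is only an illustration: the function name `sas_pdf`, the truncation rule and the trapezoidal integration are assumptions made for this example, not the estimation method used in the project.

```python
import math


def sas_pdf(x, alpha, scale=1.0, tol=1e-12, steps=20000):
    """Density of a symmetric alpha-stable law at x, by numerical inversion
    of its characteristic function exp(-|scale * w| ** alpha)."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    # Truncate the integral where the envelope of the integrand
    # drops below `tol`.
    w_max = (-math.log(tol)) ** (1.0 / alpha) / scale
    h = w_max / steps

    def integrand(w):
        return math.exp(-((scale * w) ** alpha)) * math.cos(w * x)

    # Trapezoidal rule on [0, w_max]; the density is this integral over pi.
    total = 0.5 * (integrand(0.0) + integrand(w_max))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


# The two classical closed-form cases serve as sanity checks.
cauchy_at_0 = sas_pdf(0.0, alpha=1.0)    # Cauchy: 1 / pi
gauss_at_0 = sas_pdf(0.0, alpha=2.0)     # Gaussian with variance 2
```

Every density evaluation costs a numerical integral, which hints at why likelihood-based parameter learning is expensive for these models, and the one-dimensional inversion used here does not carry over directly to the multichannel case.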
Objectives for the project
- Robustness: filtering in noise, robust inference
- Explaining heuristics with the α-stable paradigm
- Flexible and exemplar-based trainable audio models
Work packages
KAMoulox is divided into four work packages (WPs) of equal importance, which closely reflect its scientific and technical objectives. From a cross-cutting application perspective, all WPs revolve around incrementally improving a web-based source separation software architecture.