bliss music analyzer library

An open-source library to make audio playlists by evaluating distance between songs.

Note: this page is about bliss-rs. For the old bliss in C, see here.

Index

What is bliss?
Download
Library usage
Technical details
And blissv1?

What is bliss?

bliss is a library designed to make smart playlists by evaluating the distance between songs. It is mainly useful when integrated into existing audio players, or for research purposes.
You can see it in action for MPD through blissify for instance.

The main algorithm works by first extracting common audio descriptors (tempo, timbre, chroma…) from each song into a set of numeric features per song. Once this is done, the distance between two songs can be computed with the existing distance() method (which is really just a Euclidean distance).
Playlists can then be made by putting together songs that are close to each other (see the "Library usage" section for more info).
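To make the idea concrete, here is a minimal sketch in plain Rust that does not use bliss itself. It treats each song as a bare vector of numbers, computes the Euclidean distance between two such vectors, and orders candidate songs by their distance to a starting song. The names `euclidean` and `order_by_distance` and the three-value vectors are hypothetical, chosen only for illustration; the library's own types and functions are shown in the usage section.

```rust
/// Euclidean distance between two feature vectors of equal length.
/// (Hypothetical helper for illustration, not bliss' own function.)
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "feature vectors must have the same length");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Return the indices of `candidates`, sorted from closest to farthest
/// from `seed`. (Hypothetical helper for illustration.)
fn order_by_distance(seed: &[f32], candidates: &[Vec<f32>]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..candidates.len()).collect();
    order.sort_by(|&i, &j| {
        euclidean(seed, &candidates[i]).total_cmp(&euclidean(seed, &candidates[j]))
    });
    order
}

fn main() {
    // Made-up feature vectors; real analyses have more values per song.
    let seed = vec![0.0, 0.0, 0.0];
    let candidates = vec![
        vec![3.0, 4.0, 0.0],
        vec![0.1, 0.0, 0.0],
        vec![1.0, 1.0, 1.0],
    ];
    let order = order_by_distance(&seed, &candidates);
    println!("Candidates from closest to farthest: {:?}", order);
}
```

Sorting every candidate by its distance to a single starting song is only one way of "putting together close songs"; other orderings (for instance, always picking the song closest to the previous one) are equally possible.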

bliss is written in Rust (see the crate) and uses ffmpeg and aubio. Python bindings are also available. The source code is available here.
It is still in development, so don't hesitate to submit PRs, bug reports, etc.

Download

The simplest way is just to add bliss-rs = "0.2.4" to your Cargo.toml.

If you use MPD and want to make smart playlists right away, install blissify instead: cargo install blissify.

You can also use the provided playlist example to build playlists from a folder containing songs, and a starting song. Quick example if you want to use it:
cargo run --features=serde --release --example=playlist /path/to/folder /path/to/first/song

64-bit packages for blissify are available for Arch Linux and Debian/Ubuntu:

ArchLinux

Ubuntu

Library usage

Decoder::song_from_path() does all the heavy lifting; see below:

Compute distance between two songs:

 
     use bliss_audio::decoder::Decoder as DecoderTrait;
     use bliss_audio::decoder::ffmpeg::FFmpeg as Decoder;
     use bliss_audio::playlist::euclidean_distance;
     use bliss_audio::BlissResult;
     
     fn main() -> BlissResult<()> {
         let song1 = Decoder::song_from_path("/path/to/song1")?;
         let song2 = Decoder::song_from_path("/path/to/song2")?;
     
         println!(
             "Distance between song1 and song2 is {}",
             euclidean_distance(&song1.analysis.as_arr1(), &song2.analysis.as_arr1())
         );
         Ok(())
     }

Analyze several songs and make a playlist from the first song:


    use bliss_audio::decoder::Decoder as DecoderTrait;
    use bliss_audio::decoder::ffmpeg::FFmpeg as Decoder;
    use bliss_audio::{
        playlist::{closest_to_songs, euclidean_distance},
        BlissResult, Song,
    };
    
    
    fn main() -> BlissResult<()> {
        let paths = vec!["/path/to/song1", "/path/to/song2", "/path/to/song3"];
    let mut songs: Vec<Song> = Decoder::analyze_paths(&paths).filter_map(|(_, s)| s.ok()).collect();
    
        // Assuming there is a first song
        let first_song = songs.first().unwrap().to_owned();
    
        closest_to_songs(&[first_song], &mut songs, &euclidean_distance);
    
        println!("Playlist is:");
        for song in songs {
            println!("{}", song.path.display());
        }
        Ok(())
    }

For more information, see the documentation and README.md.

Technical details

The analysis process works this way:
Each song analyzed with Decoder::song_from_path has an analysis field, which can in turn be transformed into a vector using analysis.to_vec().
Each value represents an aspect of the song, and an Analysis can be indexed with AnalysisIndex to get a specific field (e.g. song.analysis[AnalysisIndex::Tempo] gets the tempo value).
Here's what the different parts represent:

Tempo has one associated descriptor, which uses the spectral flux as an onset detection method.
Timbre has seven different descriptors: the zero-crossing rate, and the mean / median of the spectral centroid, spectral roll-off, and spectral flatness.
Loudness has two descriptors, the mean / median loudness, which measures how loud the sound is, i.e. how much the speaker membrane should move when producing the sound.
This descriptor is usually not used in research papers, since it depends heavily on the way songs are recorded / encoded, but it should be an integral part of a playlist-making algorithm. A very soothing track will still wake you up if its volume is turned up to the maximum, even if it closely resembles other soothing tracks.
Chroma features have ten different descriptors, which are interval features based on this paper.

As you might have noticed, the chroma features make up half of the features. While the Euclidean distance (in which each numeric feature counts as much as any other) provides very satisfactory results, experimenting with metric learning, or simply adjusting the distance coefficients, could improve your experience, so don't hesitate to do so!
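As a rough illustration of what "adjusting the distance coefficients" could look like, the sketch below computes a weighted Euclidean distance over 20-value feature vectors (1 tempo + 7 timbre + 2 loudness + 10 chroma) in plain Rust, without using bliss. The names `weighted_distance` and `chroma_weights` are hypothetical, and so is the assumption that the ten chroma values occupy the last ten slots of the vector; check the library documentation for the actual layout before reusing the idea.

```rust
/// Number of features per song: 1 tempo + 7 timbre + 2 loudness + 10 chroma.
const N_FEATURES: usize = 20;
/// ASSUMPTION for this sketch: chroma values are the last ten entries.
const CHROMA_START: usize = 10;

/// Euclidean distance in which each squared difference is scaled by a weight.
/// With all weights equal to 1.0 this is the plain Euclidean distance.
fn weighted_distance(
    a: &[f32; N_FEATURES],
    b: &[f32; N_FEATURES],
    weights: &[f32; N_FEATURES],
) -> f32 {
    a.iter()
        .zip(b)
        .zip(weights)
        .map(|((x, y), w)| *w * (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Build a weight vector that leaves the non-chroma features at 1.0 and
/// gives every chroma feature the same `chroma_weight`.
fn chroma_weights(chroma_weight: f32) -> [f32; N_FEATURES] {
    let mut weights = [1.0; N_FEATURES];
    for w in &mut weights[CHROMA_START..] {
        *w = chroma_weight;
    }
    weights
}

fn main() {
    // Two made-up analyses that differ by 1.0 on every feature.
    let a = [0.0_f32; N_FEATURES];
    let b = [1.0_f32; N_FEATURES];

    let plain = weighted_distance(&a, &b, &chroma_weights(1.0));
    let less_chroma = weighted_distance(&a, &b, &chroma_weights(0.5));

    println!("Plain Euclidean distance: {plain}");
    println!("Distance with chroma weighted 0.5: {less_chroma}");
}
```

Halving the chroma weights is an arbitrary choice made for the example; suitable coefficients depend on your music library and taste, and metric learning is one way of fitting them from data.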

For more on these features, and some discussion on metric learning, see this thesis, which was written specifically to lay the foundations of bliss' innards.

And blissv1?

Some people will have noticed that the previous location of bliss' repository was here. This repo contains the old bliss code, which was written in C. However, it has been rewritten in Rust, in order to implement from the ground up a more scientific approach to music information retrieval.

The old C library still receives bug fixes, and its webpage is still accessible, but using the Rust version is recommended, as it is faster and more complete.
Note that the features generated by C-bliss and bliss-rs are also incompatible.