Audio fingerprinting and recognition algorithm implemented in Python. For an explanation of the algorithm, see:
How it works

Dejavu can memorize audio by listening to it once and fingerprinting it. Then, when you play a song and record it through the microphone, Dejavu attempts to match the recorded audio against the fingerprints held in the database and returns the song being played.

Note that for voice recognition, Dejavu is not the right tool! Dejavu excels at recognition of exact signals with reasonable amounts of noise.

Installation and Dependencies:



First, install the above dependencies.

Second, you’ll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:
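As a minimal sketch of this step from Python, the helper below issues a `CREATE DATABASE IF NOT EXISTS` statement over any DB-API 2.0 connection to your MySQL server. The function name `create_database` and the database name `dejavu` are assumptions made for illustration; use whatever name you later put in your Dejavu settings.

```python
import re


def create_database(connection, name="dejavu"):
    """Create the MySQL database `name` if it does not exist yet.

    `connection` is assumed to be a DB-API 2.0 connection to a MySQL
    server, opened with the MySQL driver you have installed.
    """
    # Database names cannot be passed as query parameters, so validate
    # the identifier before interpolating it into the statement.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unsafe database name: {name!r}")
    statement = f"CREATE DATABASE IF NOT EXISTS `{name}`"
    cursor = connection.cursor()
    try:
        cursor.execute(statement)
    finally:
        cursor.close()
    connection.commit()
    return statement
```

The same statement can be typed directly into the `mysql` command-line client if you prefer not to script it.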

Now you’re ready to start fingerprinting your audio collection!



Let’s say we want to fingerprint all of July 2013’s VA US Top 40 hits.

Start by creating a Dejavu object with your configuration settings (Dejavu takes an ordinary Python dictionary for the settings).
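A settings dictionary might look like the sketch below. The key names (`database`, `host`, `user`, `passwd`, `db`) and every value are assumptions; key names have differed between Dejavu versions, so check them against the version you installed. The helper `build_dejavu` is hypothetical and only wraps the constructor call.

```python
# Settings dictionary; every value here is a placeholder assumption.
config = {
    "database": {
        "host": "127.0.0.1",
        "user": "root",
        "passwd": "your-mysql-password",  # password of the MySQL user
        "db": "dejavu",                   # database created earlier
    }
}


def build_dejavu(settings):
    """Create a Dejavu object from a settings dictionary.

    The import is done lazily so this sketch can be loaded even on a
    machine where Dejavu is not installed.
    """
    from dejavu import Dejavu
    return Dejavu(settings)


# Usage, once Dejavu and its database are set up:
# djv = build_dejavu(config)
```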

Next, give the fingerprint_directory method three arguments:

  • input directory to look for audio files
  • audio extensions to look for in the input directory
  • number of processes (optional)
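The three arguments above can be sketched as a call like the one below. The wrapper `fingerprint_collection`, the directory path, and the process count are hypothetical; whether extensions are written with a leading dot (`".mp3"`) is also an assumption to verify against your Dejavu version.

```python
def fingerprint_collection(djv, directory, extensions=(".mp3",), processes=None):
    """Call djv.fingerprint_directory with the arguments described above.

    `djv` is a Dejavu object. `processes` is optional, so it is only
    passed along when the caller supplies a value.
    """
    args = [directory, list(extensions)]
    if processes is not None:
        args.append(processes)
    return djv.fingerprint_directory(*args)


# Usage (hypothetical path, three worker processes):
# fingerprint_collection(djv, "va_us_top_40/mp3", [".mp3"], 3)
```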

For a large number of files, this will take a while. However, Dejavu is robust enough that you can kill and restart it without losing progress: Dejavu remembers which songs it has already converted and fingerprinted and which it has not, so it won’t repeat work.
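The idea of skipping songs that are already done can be illustrated with a small sketch. It is not Dejavu's actual bookkeeping: here a file is identified by the SHA-1 hash of its contents, and `done_hashes` stands in for whatever record of finished songs the database keeps. Both function names are hypothetical.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pending_files(paths, done_hashes):
    """Keep only the files whose content hash has not been recorded yet."""
    return [path for path in paths if file_sha1(path) not in done_hashes]
```

After a restart, only the files returned by `pending_files` would need to be fingerprinted again.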

You’ll have a lot of fingerprints once it completes a large folder of mp3s:

Also, any subsequent calls to fingerprint_file or fingerprint_directory will fingerprint those songs and add them to the database as well. This is meant to simulate a system in which new songs are fingerprinted and added to the database seamlessly as they are released, without stopping the system.


Fingerprint files of extension mp3 in the ./mp3 folder

Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5 second clips sampled randomly
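To make "sampled randomly" concrete, the sketch below picks one random time window per clip length inside a track of known duration. It only computes start and end times; cutting the audio itself is left out. The function name and the choice of one clip per length are assumptions for illustration, not the behavior of the project's test script.

```python
import random


def random_clip_windows(duration_s, clip_lengths=(1, 2, 3, 4, 5), rng=None):
    """Pick one random (start, end) window in seconds per clip length.

    Clip lengths longer than the track are skipped.
    """
    if rng is None:
        rng = random.Random()
    windows = []
    for length in clip_lengths:
        if length > duration_s:
            continue
        # The latest valid start leaves room for the whole clip.
        start = rng.uniform(0, duration_s - length)
        windows.append((start, start + length))
    return windows


# Usage: windows for a three-minute track, reproducible via a seeded RNG.
# random_clip_windows(180.0, rng=random.Random(0))
```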


More information