Dejavu is an audio fingerprinting and recognition algorithm implemented in Python.
How it works
Dejavu can memorize audio by listening to it once and fingerprinting it. Then, when a song is played and recorded through the microphone, Dejavu attempts to match the recording against the fingerprints held in the database and returns the song being played.
Note that for voice recognition, Dejavu is not the right tool! Dejavu excels at recognition of exact signals with reasonable amounts of noise.
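To make the matching step concrete, here is a toy sketch of looking up a recording's fingerprints in a database and picking the song with the most hits. It is only an illustration of the general idea: this section does not describe Dejavu's actual fingerprint format or scoring, so the hash strings, the `database` layout, and the `best_match` function are all assumptions invented for the example.

```python
from collections import Counter


def best_match(database, sample_hashes):
    """Return (song_name, hit_count) for the song sharing the most hashes.

    `database` maps a fingerprint hash to the set of songs containing it.
    Returns None when no hash in the sample is known.
    (Hypothetical helper, not part of Dejavu.)
    """
    votes = Counter()
    for fingerprint in sample_hashes:
        for song in database.get(fingerprint, ()):
            votes[song] += 1
    if not votes:
        return None
    return votes.most_common(1)[0]


# Hypothetical toy database: hash -> songs that produced it
database = {
    "a1": {"song_one"},
    "b2": {"song_one", "song_two"},
    "c3": {"song_one"},
    "d4": {"song_two"},
}

# A noisy recording: two hashes survive, one ("zz") is spurious
result = best_match(database, ["a1", "c3", "zz"])
print(result)
```

The spurious hash simply finds nothing in the database, which is one way to picture why a reasonable amount of noise does not prevent a match on an exact signal.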
Installation and Dependencies:
Read INSTALLATION.md
Setup
First, install the above dependencies.
Second, you’ll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:
    $ mysql -u root -p
    Enter password: **********
    mysql> CREATE DATABASE IF NOT EXISTS dejavu;
Now you’re ready to start fingerprinting your audio collection!
Quickstart
    $ git clone https://github.com/worldveil/dejavu.git ./dejavu
    $ cd dejavu
    $ python example.py
Fingerprinting
Let’s say we want to fingerprint all of July 2013’s VA US Top 40 hits.
Start by creating a Dejavu object with your configuration settings (Dejavu takes an ordinary Python dictionary for the settings).
    >>> from dejavu import Dejavu
    >>> config = {
    ...     "database": {
    ...         "host": "127.0.0.1",
    ...         "user": "root",
    ...         "passwd": <password above>,
    ...         "db": <name of the database you created above>,
    ...     }
    ... }
    >>> djv = Dejavu(config)
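Because the settings are an ordinary dictionary, you can build them however you like. The sketch below reads the database credentials from environment variables so the password does not have to be typed into a script. The `build_config` helper and the `DEJAVU_DB_*` variable names are hypothetical and not part of Dejavu; only the dictionary keys come from the example above.

```python
import os


def build_config(env=None):
    """Build a settings dictionary shaped like the one shown above.

    Hypothetical helper: the DEJAVU_DB_* variable names are invented
    for this sketch and are not read by Dejavu itself.
    """
    env = os.environ if env is None else env
    return {
        "database": {
            "host": env.get("DEJAVU_DB_HOST", "127.0.0.1"),
            "user": env.get("DEJAVU_DB_USER", "root"),
            # No default for the password: raises KeyError if it is unset
            "passwd": env["DEJAVU_DB_PASSWD"],
            "db": env.get("DEJAVU_DB_NAME", "dejavu"),
        }
    }


# Passing a plain dict instead of os.environ keeps the demo self-contained
config = build_config({"DEJAVU_DB_PASSWD": "example-password"})
print(sorted(config["database"]))
```

The resulting `config` can then be passed to `Dejavu(config)` exactly as in the session above.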
Next, give the fingerprint_directory method three arguments:
- input directory to look for audio files
- audio extensions to look for in the input directory
- number of processes (optional)
    >>> djv.fingerprint_directory("va_us_top_40/mp3", [".mp3"], 3)
For a large number of files, this will take a while. However, Dejavu is robust enough that you can kill and restart it without losing progress: Dejavu remembers which songs it has already converted and fingerprinted and which it hasn't, so it won't repeat itself.
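The resume behaviour can be pictured as a filter over the input directory: list the files with a matching extension, then drop the ones already recorded as done. The sketch below (Python 3) illustrates that idea only. How Dejavu actually tracks finished songs is not described here, so the `pending_files` function and the `already_done` set of file names are assumptions.

```python
import os
import tempfile


def pending_files(directory, extensions, already_done):
    """List files with a matching extension that are not in `already_done`.

    Hypothetical helper: `already_done` stands in for whatever record
    of fingerprinted songs the real system keeps.
    """
    pending = []
    for name in sorted(os.listdir(directory)):
        _, ext = os.path.splitext(name)
        if ext.lower() in extensions and name not in already_done:
            pending.append(os.path.join(directory, name))
    return pending


# Demo in a throwaway directory with two songs and one unrelated file
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.mp3", "b.mp3", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    # Pretend a.mp3 was fingerprinted before the process was killed
    remaining = [
        os.path.basename(path)
        for path in pending_files(folder, [".mp3"], {"a.mp3"})
    ]

print(remaining)
```

After a restart, only `b.mp3` is left to process; the text file is ignored because its extension is not in the list.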
You’ll have a lot of fingerprints once it completes a large folder of mp3s:
    >>> print djv.db.get_num_fingerprints()
    5442376
Also, any subsequent calls to fingerprint_file or fingerprint_directory will fingerprint and add those songs to the database as well. This is meant to simulate a system where new songs are fingerprinted and added to the database seamlessly as they are released, without stopping the system.
Example
Fingerprint files of extension mp3 in the ./mp3 folder
    # Fingerprint files of extension mp3 in the ./mp3 folder
    python dejavu.py -f ./mp3/ mp3
Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5 second clips sampled randomly from within each song
    # Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5
    # second clips sampled randomly from within each song 8 seconds
    # away from start or end, sampling with random seed = 42, and finally
    # store results in ./results and log to dejavu-test.log
    python run_tests.py \
        --secs 5 \
        --temp ./temp_audio \
        --log-file ./results/dejavu-test.log \
        --padding 8 \
        --seed 42 \
        --results ./results \
        ./mp3
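The sampling described in the comment above (clips of 1 to 5 seconds, kept 8 seconds away from the start and end, with a fixed seed) can be sketched as follows. This is not the code in run_tests.py, whose internals are not shown here; the `pick_clip_start` function and the 180-second song length are assumptions made for illustration.

```python
import random


def pick_clip_start(song_seconds, clip_seconds, padding, rng):
    """Pick a random start time so the whole clip stays `padding`
    seconds away from both the start and the end of the song.
    (Hypothetical helper, not taken from run_tests.py.)
    """
    earliest = padding
    latest = song_seconds - padding - clip_seconds
    if latest < earliest:
        raise ValueError("song too short for this clip length and padding")
    return rng.uniform(earliest, latest)


# A seeded generator makes the sampled offsets reproducible between runs
rng = random.Random(42)
starts = {
    secs: pick_clip_start(180, secs, 8, rng)  # assumed 180-second song
    for secs in range(1, 6)                   # 1, 2, 3, 4, and 5 second clips
}

for secs, start in starts.items():
    print("%d s clip starts at %.2f s" % (secs, start))
```

Fixing the seed means a test run can be repeated on the same clips, which makes results comparable between runs.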
Results
More information