GitHub - yuhang2/soundfingerprinting: The project aims to study the audio signal in terms of its perceptual characteristics, resulting in an algorithm that can detect (map) unknown audio snippets against a large database of known songs.

Sound Fingerprinting

SoundFingerprinting is a C# framework for developers and researchers working in audio processing, data mining, and digital signal processing. It implements an efficient signal-processing algorithm that can serve as the basis for an audio fingerprinting and recognition system.

Documentation

The following code sample shows how to generate sub-fingerprints from an audio file. The sub-fingerprints are stored in the backend and used later by the algorithm to recognize unknown audio snippets. The interfaces for fingerprinting and querying have been implemented as fluent interfaces with the Builder and Command patterns in mind.

private readonly IModelService modelService = new ModelService();
private readonly IFingerprintCommandBuilder fingerprintCommandBuilder = new FingerprintCommandBuilder();

public void StoreAudioFileFingerprintsInDatabaseForLaterRetrieval(string pathToAudioFile)
{
    // arguments appear to be: ISRC, artist, title, album, release year, track length in seconds
    TrackData track = new TrackData("GBBKS1200164", "Adele", "Skyfall", "Skyfall", 2012, 290);
	
    // store track metadata in the database
    var trackReference = modelService.InsertTrack(track);

    // create sub-fingerprints and their hash representations
    var hashDatas = fingerprintCommandBuilder
                                .BuildFingerprintCommand()
                                .From(pathToAudioFile)
                                .WithDefaultFingerprintConfig()
                                .Hash()
                                .Result;
								
    // store sub-fingerprints and their hash representations in the database
    modelService.InsertHashDataForTrack(hashDatas, trackReference);
}

The default underlying database is MSSQL, whose connection management is handled by the ModelService class. NoSQL data storage will be implemented in upcoming releases. The MSSQL database initialization script can be found here. Do not forget to change the FingerprintConnectionString connection string in your app.config file.
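
A connection string entry in app.config typically looks like the fragment below. This is a minimal sketch: only the name FingerprintConnectionString comes from this README, while the server, database name (FingerprintsDb here), and authentication settings are placeholder assumptions to be replaced with your own values.

```xml
<configuration>
  <connectionStrings>
    <!-- The name is taken from the README; every value in connectionString is a placeholder -->
    <add name="FingerprintConnectionString"
         connectionString="Data Source=(local);Initial Catalog=FingerprintsDb;Integrated Security=True"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
```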

Once you have inserted the fingerprints into the database, you can query the storage to recognize the song whose samples you have. The origin of the query samples may vary: file, URL, microphone, radio tuner, etc. Where the samples come from is up to your application.

private readonly IQueryCommandBuilder queryCommandBuilder = new QueryCommandBuilder();

public TrackData GetBestMatchForSong(string queryAudioFile)
{
    int secondsToAnalyze = 10; // number of seconds to analyze from query file
    int startAtSecond = 0; // start at the beginning
	
    // query the underlying database for similar audio sub-fingerprints
    var queryResult = queryCommandBuilder.BuildQueryCommand()
                                         .From(queryAudioFile, secondsToAnalyze, startAtSecond)
                                         .WithDefaultConfigs()
                                         .Query()
                                         .Result;
    if (queryResult.IsSuccessful)
    {
        return queryResult.BestMatch; // successful match has been found
    }
	
    return null; // no match has been found
}

The code is still under active development, so the signatures of the classes used above may change. See the Wiki page for operational details.

Extension capabilities

The framework was built with loose coupling in mind, so all components involved in fingerprinting can be easily substituted. If you would like to switch from the Bass.Net library to NAudio because of licensing concerns, you can do so by binding the IAudioService and IExtendedAudioService interfaces to the NAudioService implementation.

DependencyResolver.Current.Bind<IAudioService, NAudioService>();
DependencyResolver.Current.Bind<IExtendedAudioService, NAudioService>();

Algorithm configuration

The fingerprinting and querying algorithms can be parametrized with the corresponding configuration objects, passed as parameters on command creation.

var hashDatas = fingerprintCommandBuilder
                    .BuildFingerprintCommand()
                    .From(samples)
                    .WithFingerprintConfig(config =>
                    {
                        config.TopWavelets = 250; // increase number of top wavelets
                        config.Stride = new RandomStride(512, 256); // stride between sub-fingerprints
                    })
                    .Hash()
                    .Result;

Each and every configuration parameter can influence the recognition rate, required storage, computational cost, etc. Stick with the defaults, unless you would like to experiment.

Third party libraries involved

The following third-party libraries are used by the SoundFingerprinting project.

Bass.Net - used as a default framework for audio processing tasks.
NAudio - can be used as a substitution for Bass.Net.
FFTW - used as a default framework for FFT algorithm.
Exocortex - can be used as a substitution for FFTW.
Encog - used by Neural Hasher (which is still under development and will be released as a separate component). The SoundFingerprinting library does not include it in its release.
Ninject - used to take advantage of the dependency inversion principle.

FAQ

Can I use this algorithm for speech recognition? No. The granularity of one fingerprint is roughly 1.86 seconds, so any sound recording shorter than that will be disregarded.
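
The 1.86-second figure can be reproduced with a short calculation under assumed parameters: audio downsampled to 5512 Hz, a 2048-sample analysis window, a 64-sample hop between spectral frames, and 128 frames per fingerprint. None of these values is stated in this README, so treat them as assumptions and check the configuration defaults in the source before relying on them. Counting 128 hops plus one window gives:

```latex
\[
N_{\text{samples}} = 128 \times 64 + 2048 = 10240,
\qquad
T = \frac{N_{\text{samples}}}{5512\ \text{Hz}} = \frac{10240}{5512} \approx 1.86\ \text{s}
\]
```

Under these assumptions, a clip shorter than about 10240 samples at 5512 Hz cannot yield even one complete fingerprint, which is why very short utterances are not a good fit.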

Binaries

git clone git@github.com:AddictedCS/soundfingerprinting.git

To build the latest version of the SoundFingerprinting assembly, run the following command from the repository root:

.\build.cmd

Get it on NuGet

Install-Package SoundFingerprinting

Demo

My description of the algorithm, along with the demo project, can be found on CodeProject. The demo project is an Audio File Duplicates Detector. Its latest source code can be found here. It is a WPF MVVM project that uses the algorithm to detect which files are perceptually very similar.

Contribute

If you want to contribute, you are welcome to open issues or join the discussion on the issues page. Feel free to contact me with any remarks, ideas, bug reports, etc.

Licence

The framework is provided under the GPLv3 licence agreement.

The framework implements the algorithm from the Content Fingerprinting Using Wavelets paper.
