GitHub - mrsagile/Sagen: Free, open-source TTS engine for .NET

Sagen (German for "to say") is my attempt at making a text-to-speech engine aimed at .NET developers who don't have thousands of dollars at their disposal to license a commercial speech synthesis solution. In many ways, it is an experiment and continual learning experience for me, as I am not an expert in speech science, phonetics, or vocal acoustics; I simply want to see how far I can go with original research, freely available resources, and lots of patience.

Why?

There are tons of TTS engines out there already, but I feel like they're all missing something.

Aside from often being prohibitively expensive, commercial TTS systems are frequently restrictive in the customizations they offer for voices, voice parameters, and context-sensitive vocal qualities (e.g. intonation, stress, and timbre). Such qualities are necessary to convey meaning in speech.

Concatenative synthesizers, as well as other similar "realistic" TTS technologies, tend to be CPU-heavy, have a large memory footprint, and require each voice to be installed separately. Because they are based on databases of recorded speech samples, they are not very customizable at all.

There are also many free speech synthesizers, but they often have sparse, confusing, or convoluted documentation, or are tied to one specific programming language (e.g. Java). While all TTS libraries have advantages and disadvantages, I feel like the .NET crowd would welcome a TTS solution specifically made for them.

My goal with Sagen is not necessarily to produce "something better", but instead to offer a user-friendly TTS engine with a respectable amount of configurability, flexibility, and performance. The best part? It's free.

What's planned

Here is a short list of the major features planned:

Text-to-speech based on an articulatory model adapted from Voc
Plentiful parameters for tuning how voices sound (age, sex, vocal force, hoarseness, etc.)
Support for direct playback, WAV exporting, and sending audio data via System.IO.Stream
Multiple options for sample format and rate (export only)
Support for X-SAMPA-based pronunciation lexicons
Multiple language support (English and German are currently prioritized)
Heteronym resolution
Singing?!

Sagen is still very much a work in progress, and I welcome your input and contributions.

Licensing

This project is made available under the MIT License and is completely free for anyone to use, for any purpose, without the burdens of licensing costs or royalties.

This project contains code adapted from Paul Batchelor's Voc library, which is in turn based on Neil Thapen's bizarre but fascinating project, Pink Trombone.

See LICENSE.md for all licenses and copyright notices.

Folders and files

Sagen.Languages.German
Sagen.Languages.USEnglish
Sagen.Playback.OpenAL
Sagen.Playback.XAudio2
Sagen
SagenConsole
.gitattributes
.gitignore
LICENSE.md
README.md
Rebracer.xml
Sagen.sln
