Introduction
SpeechToolkit is an all-in-one, end-to-end toolkit for machine learning in speech. It aims to simplify the use of text-to-speech, automatic speech recognition, and voice conversion models.
Why SpeechToolkit?
Almost every speech model exposes its own Python API. To integrate one into your project, you need to write code specific to that model, and switching to a different model later requires significant changes.
SpeechToolkit aims to solve this by providing a centralized, unified, easy-to-use Python API for speech models. Instead of rewriting your program to support a new model, you can change a couple of lines of code.
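To make the idea concrete, here is a minimal sketch of what a unified interface looks like in general. It is not the SpeechToolkit API: the names `TTSModel`, `ModelAAdapter`, `ModelBAdapter`, `load_tts`, and `speak` are hypothetical, and the two "models" return placeholder bytes instead of audio. The point is only that the calling code depends on one shared method, so swapping models means changing a single string.

```python
from typing import Callable, Dict, Protocol


class TTSModel(Protocol):
    """Hypothetical shared interface that every wrapped model exposes."""

    def synthesize(self, text: str) -> bytes: ...


class ModelAAdapter:
    """Stand-in for a wrapper around one backend (illustrative only)."""

    def synthesize(self, text: str) -> bytes:
        # A real adapter would call the backend's own API here.
        return f"model-a:{text}".encode("utf-8")


class ModelBAdapter:
    """Stand-in for a wrapper around a second, differently shaped backend."""

    def synthesize(self, text: str) -> bytes:
        return f"model-b:{text}".encode("utf-8")


# Hypothetical registry mapping a model name to a factory for its adapter.
_REGISTRY: Dict[str, Callable[[], TTSModel]] = {
    "model-a": ModelAAdapter,
    "model-b": ModelBAdapter,
}


def load_tts(name: str) -> TTSModel:
    """Return an adapter for the named model, or raise ValueError."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown model: {name!r}") from None


def speak(model_name: str, text: str) -> bytes:
    """Application code: identical for every model behind the interface."""
    return load_tts(model_name).synthesize(text)
```

With this shape, moving from `speak("model-a", "hello")` to `speak("model-b", "hello")` is the only change the application needs; the backend-specific details stay inside the adapters.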
In addition, SpeechToolkit bundles these models into a single package that can be installed from PyPI. This makes code management easier and can also help mitigate potential licensing issues.
Packages
SpeechToolkit supports many third-party open-access models, as well as some models developed by ML for Speech. Most of these models are available through the SpeechToolkit package, and many are also packaged individually in case you don't want to use the SpeechToolkit library.
Get Started
Visit the Getting Started page to begin!