DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Updated Jan 1, 2022 - C++
kaldi-asr/kaldi is the official location of the Kaldi project.
Speech recognition module for Python, supporting several engines and APIs, online and offline.
A Deep-Learning-Based Chinese Speech Recognition System 基于深度学习的中文语音识别系统
NeMo: a toolkit for conversational AI
A PyTorch-based Speech Toolkit
Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.
Lingvo
Gathers machine learning and TensorFlow deep learning models for NLP problems, 1.13 < TensorFlow < 2.0
Kalliope is a framework that will help you to create your own personal assistant.
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
Hi,
As you know, the number of filters in SoLoud is limited.
We should implement more, such as different reverbs, FIR and IIR filters (these could be used to implement HRTF support), Chorus, One Pole, One Zero, Pole Zero, Two Pole, Two Zero, etc.
A library called STK, under the zlib license, already implements these; maybe we can implement some of them based on it.
The open-source virtual assistant for Ubuntu-based Linux distributions
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
A voice control, voice commands, speech recognition and speech synthesis JavaScript library. Create your own Siri, Google Now or Cortana with Google Chrome within your website.
Creating CSV files manually is a lot of work. This could be automated by a script if the name of the WAV file is the same as the transcript.
The same could be done for creating a language model input text file. A script could pull the transcript from the WAV file name.
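The automation suggested above could be sketched roughly as follows. This is a minimal illustration, not an existing script from any project: the function names, the assumption that words in a file name are separated by underscores or hyphens, and the three-column CSV layout (`wav_filename`, `wav_filesize`, `transcript`) are all assumptions that should be adjusted to whatever the training pipeline actually expects.

```python
import csv
from pathlib import Path


def transcript_from_name(wav_path: Path) -> str:
    """Derive a transcript from a file name such as 'turn_on_the_light.wav'."""
    # Assumption: words in the file name are separated by underscores or hyphens.
    return wav_path.stem.replace("_", " ").replace("-", " ").lower().strip()


def build_manifest(wav_dir: str, csv_path: str, lm_text_path: str) -> int:
    """Write a CSV manifest and a language-model text file; return the row count."""
    wav_files = sorted(Path(wav_dir).glob("*.wav"))
    with open(csv_path, "w", newline="", encoding="utf-8") as csv_file, \
            open(lm_text_path, "w", encoding="utf-8") as lm_file:
        writer = csv.writer(csv_file)
        # Hypothetical column layout; rename to match your tooling.
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav in wav_files:
            transcript = transcript_from_name(wav)
            writer.writerow([str(wav), wav.stat().st_size, transcript])
            # One transcript per line, as input text for a language model.
            lm_file.write(transcript + "\n")
    return len(wav_files)
```

Calling `build_manifest("clips", "train.csv", "lm_input.txt")` would scan a hypothetical `clips` directory and write both files in one pass. File names cannot carry punctuation or casing reliably, so this approach only suits short, simple transcripts.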
Descriptive Deep Learning
An asynchronous Python library to automate solving ReCAPTCHA v2 using audio
The official repository of the Eesen project
Stephanie is an open-source platform built specifically for voice-controlled applications as well as to automate daily tasks, imitating much of a virtual assistant's work.
@voicybot Telegram bot main repository
Adapt Intent Parser
Open STT
Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.