kaldi-asr/kaldi is the official location of the Kaldi project.
Updated Sep 17, 2020 - Shell
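Pre-trained and Reproduced Deep Learning Models (the official PaddlePaddle model library, containing a variety of deep learning models validated in cutting-edge academic research and industrial scenarios)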
Code examples for new APIs of iOS 10.
Lingvo
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
WaveNet vocoder
DELTA is a deep learning based natural language and speech processing platform.
Python library and CLI tool to interface with Google Translate's text-to-speech API
Open-Source Large Vocabulary Continuous Speech Recognition Engine
A PaddlePaddle implementation of DeepSpeech2 architecture for ASR.
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
A Python wrapper for Kaldi
Speech Enhancement Generative Adversarial Network in TensorFlow
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
The neural network model can detect five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
Node.js client for Google Cloud Speech: Speech to text conversion powered by machine learning.
A neural network for end-to-end speech denoising
We have to create contributing files for two repositories:
Also, please update cboard's contributing file, in order to:
An implementation of SpecAugment, introduced by Google Brain, in TensorFlow and PyTorch
Voice Converter Using CycleGAN and Non-Parallel Data
hi,
as you know, in SoLoud the number of filters is limited.
we should implement more, such as different reverbs, FIR and IIR filters (these could be used to implement HRTF support), Chorus, One Pole, One Zero, Pole Zero, Two Pole, Two Zero, etc.
a library called STK, under the zlib license, already implements these; maybe we can base some of our implementations on it.
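
For context, the two simplest filters named in the request (One Pole and One Zero) can be sketched in a few lines. This is only an illustrative Python sketch of the textbook difference equations, not SoLoud or STK code; the function names `one_pole` and `one_zero` and the gain normalization are assumptions made for the example.

```python
from typing import Iterable, List


def one_pole(samples: Iterable[float], pole: float) -> List[float]:
    """One-pole IIR filter: y[n] = (1 - |pole|) * x[n] + pole * y[n-1].

    A positive pole gives a low-pass response, a negative pole a high-pass one.
    The (1 - |pole|) factor keeps the peak gain at 1 (an assumed normalization).
    """
    if not -1.0 < pole < 1.0:
        raise ValueError("pole must lie strictly inside the unit circle")
    gain = 1.0 - abs(pole)
    out: List[float] = []
    prev_y = 0.0  # filter state: previous output sample
    for x in samples:
        prev_y = gain * x + pole * prev_y
        out.append(prev_y)
    return out


def one_zero(samples: Iterable[float], zero: float) -> List[float]:
    """One-zero FIR filter: y[n] = (x[n] - zero * x[n-1]) / (1 + |zero|).

    A negative zero gives a low-pass response, a positive zero a high-pass one.
    Dividing by (1 + |zero|) keeps the peak gain at 1 (an assumed normalization).
    """
    gain = 1.0 / (1.0 + abs(zero))
    out: List[float] = []
    prev_x = 0.0  # filter state: previous input sample
    for x in samples:
        out.append(gain * (x - zero * prev_x))
        prev_x = x
    return out
```

For example, `one_pole([1.0, 0.0, 0.0], 0.5)` returns `[0.5, 0.25, 0.125]`, the decaying impulse response of the low-pass case. A real SoLoud filter would also need per-channel state and parameter handling, which this sketch leaves out.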