speech recognition free download

Showing 31 open source projects for "speech recognition"

Category: Scientific/Engineering

1

pyVideoTrans

Translate the video from one language to another and embed dubbing

pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. ...

Downloads: 22 This Week

Last Update: 5 days ago
2

JSpeech

Java library designed to integrate Speech-to-Text

jSpeech is a Java library designed to integrate Speech-to-Text (STT) capabilities, command control, and diarization (speaker identification) into applications in a simple, modular, and decoupled way.

1 Review

Downloads: 4 This Week

Last Update: 2026-03-12
3

Live Transcribe Speech Engine

Live Transcribe is an Android application

Live Transcribe Speech Engine provides on-device speech recognition components that power real-time transcription for accessibility and everyday voice-first experiences. Its design prioritizes latency and robustness in noisy, far-field environments, enabling continuous transcription with low delay on mobile hardware. The engine manages audio front-end processing—such as noise suppression and voice activity detection—before feeding audio into compact, accurate acoustic and language models. ...

Downloads: 0 This Week

Last Update: 2025-10-10
4

ASR for Medical Reporting

Automatic speech recognition system for medical reporting in Spanish.

This is a functional prototype of an automatic speech recognition system for medical reporting in Spanish, built with the CMU Sphinx4 ASR toolkit. It uses a pre-trained acoustic model and a context-dependent language model for nuclear medicine diagnostics.

Downloads: 0 This Week

Last Update: 2020-07-15
5

ILA - teachable voice assistant

ILA is a fully customizable and teachable voice assistant for Java

...It is designed to integrate with your home environment so that you can, for example, build your own free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognizer CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). The goal is to make ILA completely free of Google by improving all aspects of the open source systems. Since version 3.3, users can also write their own add-ons to extend ILA. ILA's successor is the SEPIA Framework: https://sepia-framework.github.io/ Hope you enjoy ILA - Florian

4 Reviews

Downloads: 1 This Week

Last Update: 2018-07-23
6

Welsh Natural Language Toolkit

...The WNLT project delivers four core NLP modules: a) Word Segmentation, for separating text into words; b) Sentence Boundary Disambiguation, for finding sentence boundaries; c) Part of Speech Tagger, for determining the part of speech of each word; d) Morphological Analyser, for identifying the root form (lemma) of words. The modules are written in Java and 'wrapped' for execution under the General Architecture for Text Engineering (GATE) framework. The project also includes CYMRIE, a version of the GATE ANNIE Named Entity Recognition (NER) application adapted for Welsh, which covers a range of entities such as Persons, Organisations, Locations, and date and time expressions. ...

Downloads: 0 This Week

Last Update: 2017-05-26
7

Distant Speech Recognition

Beamforming and Speech Recognition Toolkit

...These toolkits are meant for facilitating research and development of automatic distant speech recognition.

Downloads: 0 This Week

Last Update: 2019-08-21
8

Welsh Natural Language Toolkit

WNLT is a suite of open source natural language modules for the Welsh

...The WNLT project delivers four core NLP modules: a) Word Segmentation, for separating text into words; b) Sentence Boundary Disambiguation, for finding sentence boundaries; c) Part of Speech Tagger, for determining the part of speech of each word; d) Morphological Analyser, for identifying the root form (lemma) of words. The modules are written in Java and 'wrapped' for execution under the General Architecture for Text Engineering (GATE) framework. The project also includes CYMRIE, a version of the GATE ANNIE Named Entity Recognition (NER) application adapted for Welsh, which covers a range of entities such as Persons, Organisations, Locations, and date and time expressions.

Downloads: 0 This Week

Last Update: 2016-11-29
9

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 0 This Week

Last Update: 2015-10-06
10

ArabicDiacritizer

An automatic restoration of Arabic diacritic marks

...It is based mainly on deep neural network architectures. The algorithm generates diacritized text with determined end case. The algorithm is described in detail in: Ilyes Rebai and Yassine BenAyed, 'Text-to-speech synthesis system with Arabic diacritic recognition system', Computer Speech & Language, 2015. We would appreciate it very much if you cite our related work. Installation: - Extract the archive "ArabicDiacritizer Setup.rar". - Install the application using "Setup.exe". - Put an Arabic text in the Text Box...

Downloads: 1 This Week

Last Update: 2014-12-16
11

HMM Speech Recognition in Matlab

A speech recognition system using Matlab/Simulink/Stateflow.

This project provides a hidden Markov model speech recognition system built with Matlab/Simulink/Stateflow.

4 Reviews

Downloads: 0 This Week

Last Update: 2016-07-25
12

openSMILE

SMILE = Speech & Music Interpretation by Large Space Extraction. openSMILE is a fast, real-time (audio) feature extraction utility for automatic speech, music and paralinguistic recognition research, developed originally at TUM in the scope of the EU project SEMAINE and now maintained and supported by audEERING.

Downloads: 0 This Week

Last Update: 2014-11-27
13

AK toolkit

The AK toolkit is another kit for building and using Hidden Markov Models (HMMs). Originally developed for handwritten text recognition (HTR) using Bernoulli HMMs, it also implements diagonal Gaussians and can be used for any other purpose.

Downloads: 0 This Week

Last Update: 2013-04-22
14

KinectCAD

Gesture-based movement with CATIA

This project provides gesture-based movement of part objects in the CAD system CATIA. It is possible to rotate, move, or zoom in and out. In addition, rudimentary speech recognition lets you change the rotation axis or perform other helpful actions. KinectCAD is written in Visual C# 2010. The package includes the source code and binary files. To start KinectCAD, a correctly installed Microsoft Kinect is required. It also helps to have Kinect SDK V1 installed, but you can download the Runtime instead at: http://download.microsoft.com/download/E/E/2/EE2D29A1-2D5C-463C-B7F1-40E4170F5E2C/KinectRuntime-v1.0-Setup.exe Important! ...

Downloads: 0 This Week

Last Update: 2013-05-30
15

Interactive4J

The project aims to provide simple, easy-to-use APIs that let Java developers add interactive abilities to their Java applications, such as speech recognition, handwriting recognition, webcam use, sound recording and playback, decision trees, text to speech, and many others.

Downloads: 0 This Week

Last Update: 2014-07-15
16

Simple Interactive Java Browser

Simple Interactive Java Browser is a basic browser that shows how voice commands can be used to navigate websites and how hyperlinks can serve as voice commands.

Downloads: 0 This Week

Last Update: 2013-11-02
17

Scalable Language API

Scalable Language API (SLAPI): the most comprehensive architecture for conversational natural-language applications, including speech recognition/synthesis, semantics, and machine translation. Used on Android and other mobile app platforms.

Downloads: 1 This Week

Last Update: 2018-01-22
18

Talking Calculator

A basic calculator designed for the visually impaired that recites the operations being performed as it does them. Future development to include voice recognition of calculator functions and more advanced engineering calculations. Pre-Alpha phase.

Downloads: 3 This Week

Last Update: 2013-04-09
19

Grapheme to Phoneme Forge

Use our tools to hand edit phonetic word dictionaries for speech recognition engines. The new G2P4J format supporting SAMPA and Kirshenbaum IPA is portable to Sphinx, Julius and others. Demo medical, legal and technical dictionaries are featured.

Downloads: 0 This Week

Last Update: 2013-04-03
20

PlatoAI

A system for researching knowledge representation, language parsing, and derivation with a voice recognition front end. The front end, known as Grace, allows additional functionality to be developed to interact with the web, iTunes, etc.

Downloads: 0 This Week

Last Update: 2013-04-08
21

ASR-Builder

ASR-Builder provides an easy-to-use interface to the HTK toolkit that allows users to build ASR systems. It performs the housekeeping tasks involved in using HTK and also provides default training, testing, and recognition scripts.

Downloads: 0 This Week

Last Update: 2013-04-26
22

IITM multi modal interface

Augments other natural interfaces, namely handwriting and speech recognition, for entering Indian-language characters into the computer. Speech synthesis is also provided to read out local-language text.

Downloads: 0 This Week

Last Update: 2014-05-10
23

VoxForge

VoxForge collects user-submitted speech audio files for the creation of Acoustic Models for Free and Open Source Speech Recognition Engines such as HTK, Julius, ISIP and Sphinx.

Downloads: 0 This Week

Last Update: 2013-04-24
24

TSSBank

TSSBank is written in C# (.NET 2.0). Its main target group is people with disabilities. The component produces voice and text output (with value/words), plus an experimental Voice Recognition (VR) system that identifies with more than 80% accuracy without training.

Downloads: 0 This Week

Last Update: 2013-03-20
25

corpuscifre

Labeled and segmented corpus of Italian digits, suitable for speech recognition and phonetic recognition experiments.

Downloads: 0 This Week

Last Update: 2016-08-20