live speech-2-text recognition is a real-time speech-to-text captioning system built on a Wav2Vec model fine-tuned on the LJSpeech dataset. The project aims to provide accurate live transcriptions, making it well suited for accessibility features, live events, and other applications where real-time captioning is essential.
To set up the project, follow these steps:
- Clone the repository and install the dependencies: `pip install -r requirements.txt`
- Download the Model and Processor
If you prefer to train the model yourself, run `cd src` followed by `python3 train.py`. This script trains the Wav2Vec model on the LJSpeech dataset and saves the fine-tuned model to the specified directory.
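The fine-tuning details live in `train.py`; as a rough illustration of one common piece of such a pipeline, here is a sketch of building a character-level vocabulary for the CTC head that Wav2Vec 2.0 fine-tuning uses. The function name and details below are illustrative, not the actual `train.py` code:

```python
def build_ctc_vocab(transcripts):
    """Map each character in the corpus to an integer ID.

    Common CTC conventions: word boundaries use '|' instead of ' ',
    and index 0 is reserved for the CTC blank token.
    """
    chars = set()
    for text in transcripts:
        chars.update(text.lower().replace(" ", "|"))
    vocab = {"<blank>": 0}
    for i, ch in enumerate(sorted(chars), start=1):
        vocab[ch] = i
    return vocab

# Toy LJSpeech-style transcripts:
vocab = build_ctc_vocab(["Printing, in the only sense", "with which we are concerned"])
print(len(vocab))  # blank token plus one ID per distinct character
```

The blank token is what lets CTC align variable-length audio frames to the much shorter character sequence during training.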
To process transcriptions and evaluate them:
- Process Transcriptions: run `python3 process.py` to process the transcriptions using both the default and fine-tuned models.
- Evaluate Transcriptions: run `python3 check.py` to calculate the BLEU, WER, and CER scores for the transcriptions placed in the `src/` directory.
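When comparing the default and fine-tuned models' outputs, transcripts are typically normalized first so that scores are not skewed by case or punctuation differences. A minimal sketch of such a step (illustrative, not the actual `process.py` code):

```python
import re

def normalize(text):
    """Lowercase, strip punctuation/digits, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z' ]", " ", text)  # keep letters, apostrophes, spaces
    return " ".join(text.split())          # collapse runs of whitespace

print(normalize("Speak now!  It's LIVE."))  # "speak now it's live"
```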
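WER and CER are both edit-distance metrics: the number of insertions, deletions, and substitutions needed to turn the hypothesis into the reference, divided by the reference length, counted over words for WER and characters for CER. A minimal self-contained sketch of how they can be computed (not the actual `check.py` implementation):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

def wer(ref, hyp):
    """Word error rate: word-level edits over reference word count."""
    words = ref.split()
    return edit_distance(words, hyp.split()) / len(words)

def cer(ref, hyp):
    """Character error rate: character-level edits over reference length."""
    return edit_distance(ref, hyp) / len(ref)

print(round(wer("speak now please", "speak now"), 2))  # one deletion over three words
```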
For live speech recognition:
Run `python3 speech.py`. Ensure your microphone is set up and calibrated when prompted, then speak after the "Speak now!" prompt for live captioning.
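Live captioning generally works by buffering microphone samples into fixed-size windows and transcribing each window as it fills. The sketch below shows only that buffering step in pure Python, independent of any audio library; it is an illustration of the idea, not `speech.py`'s actual capture code:

```python
def chunk_samples(samples, chunk_size=16000):
    """Split a stream of audio samples into fixed-size chunks.

    At 16 kHz (the sample rate Wav2Vec 2.0 expects), chunk_size=16000
    yields one-second windows. The trailing partial chunk is kept so
    the final words of an utterance are not dropped.
    """
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]

# Toy stream: 10 samples split into windows of 4.
chunks = list(chunk_samples(list(range(10)), chunk_size=4))
print([len(c) for c in chunks])  # [4, 4, 2]
```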
Acknowledgements:
- Facebook AI Research for the Wav2Vec 2.0 model.
- The LJSpeech dataset contributors.