Dropped sentences in ASR transcript w/ Parakeet v3/v2 #128

I transcribed [this video](https://www.youtube.com/watch?v=GT_sXIUJPUo) using Parakeet v2 and v3.
```sh
# Clone repo
git clone https://github.com/FluidInference/FluidAudio.git
cd FluidAudio

# Download and convert to 16 kHz, 16-bit, mono WAV
uvx "yt-dlp[default]" -f bestaudio GT_sXIUJPUo -o "temp_GT_sXIUJPUo.%(ext)s"
ffmpeg -i "temp_GT_sXIUJPUo."* -acodec pcm_s16le -ac 1 -ar 16000 cowen.wav
rm "temp_GT_sXIUJPUo."*

# Build
swift build -c release

# Transcribe
swift run fluidaudio transcribe cowen.wav --model-version v2
swift run fluidaudio transcribe cowen.wav --model-version v3
```
I also transcribed the same file using [parakeet-mlx](https://github.com/senstella/parakeet-mlx) to compare.

All results can be found here: [cowen_results.zip](https://github.com/user-attachments/files/22580157/cowen_results.zip)

I inspected the first ~1.5 pages of each transcript and noticed:
- FluidAudio Parakeet v3 drops a lot of sentences (8 in the inspected portion). See `fluid_audio_parakeet_v3/cowen_fluidaudio_parakeet_v3_errors.pdf` in the zip file.
- FluidAudio Parakeet v2 also drops sentences, but far fewer (1 vs. 8 with v3). See `fluid_audio_parakeet_v2/cowen_fluidaudio_parakeet_v2_errors.pdf` in the zip file.
- `parakeet-mlx` drops no sentences at all. See `parakeet_mlx_parakeet_v3/cowen_parakeet_mlx_parakeet_v3` in the zip file.
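
I did the comparison above by eye. For anyone who wants to reproduce it more mechanically, here is a rough sketch that lists sentences present in a reference transcript but absent from a candidate one. The function names `split_sentences` and `missing_sentences` are made up for illustration, and the sketch assumes both transcripts are plain-text files. It only does exact matching after lowercasing and stripping punctuation, so any wording difference between two implementations will also show up as a "missing" sentence; treat the output as a list of places to look, not as a count of dropped sentences.

```sh
# Split a transcript into one sentence per line (naive: break after . ? !),
# then lowercase and strip punctuation so formatting differences are ignored.
split_sentences() {
  tr '\n' ' ' < "$1" \
    | sed 's/\([.?!]\) \{1,\}/\1\
/g' \
    | tr '[:upper:]' '[:lower:]' \
    | tr -d '[:punct:]' \
    | sed 's/^ *//; s/ *$//' \
    | grep -v '^$'
}

# Print normalized sentences that appear in the reference transcript ($1)
# but have no exact match in the candidate transcript ($2).
missing_sentences() {
  ref_tmp=$(mktemp)
  cand_tmp=$(mktemp)
  split_sentences "$1" > "$ref_tmp"
  split_sentences "$2" > "$cand_tmp"
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
  grep -Fvx -f "$cand_tmp" "$ref_tmp"
  rm -f "$ref_tmp" "$cand_tmp"
}
```

With the `parakeet-mlx` output as the reference, usage would look like `missing_sentences mlx_v3.txt fluidaudio_v3.txt`, where both file names are placeholders for plain-text exports of the transcripts.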

Is this a known issue, or is it just this one audio file causing a malfunction somehow?

This is unfortunate, because the FluidAudio parakeet-tdt implementations are otherwise the fastest on Apple Silicon, faster than `parakeet-mlx`. Hopefully it is fixable; I will try to understand the model architecture and code and see whether I can find the cause as well.
