Speech Transcription with Quantum_STT_V2.0

This demo showcases Quantum_STT_V2.0, a 600-million-parameter model designed for high-quality English speech recognition.

Key Features:

Automatic punctuation and capitalization
Accurate word-level timestamps (click on a segment in the table below to play it!)
Efficient transcription of long audio (updated to support up to 3 hours)
Robust performance on spoken numbers and song lyrics
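
To illustrate how timestamps can drive segment playback, here is a minimal sketch of cutting one segment out of a decoded audio buffer. It is not the demo's actual code: the `Segment` class, the `slice_segment` helper, and the 16 kHz sample rate in the usage lines are all assumptions made for illustration, and the model's real output format may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Segment:
    """Hypothetical container for one transcribed segment."""
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    text: str


def slice_segment(samples: Sequence[float], sample_rate: int, seg: Segment) -> Sequence[float]:
    """Return the samples that fall inside the segment's time range."""
    # Convert seconds to sample indices and clamp them to the buffer.
    start_idx = max(0, int(round(seg.start * sample_rate)))
    end_idx = min(len(samples), int(round(seg.end * sample_rate)))
    if end_idx <= start_idx:
        return samples[0:0]  # empty clip for degenerate or out-of-range segments
    return samples[start_idx:end_idx]


# Usage: two seconds of placeholder audio at an assumed 16 kHz sample rate.
audio = [0.0] * (16000 * 2)
clip = slice_segment(audio, 16000, Segment(start=0.5, end=1.0, text="hello"))
```

Clamping the indices keeps the slice valid when a timestamp lands slightly past the end of the buffer, which can happen with rounding.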

This model is available for commercial and non-commercial use.

Transcription Results (Click row to play segment)

Transcription Segments

Selected Segment