Speech Transcription with Quantum_STT_V2.0

This demo showcases Quantum_STT_V2.0, a 600-million-parameter model designed for high-quality English speech recognition.

Key Features:

  • Automatic punctuation and capitalization
  • Accurate word-level timestamps (click on a segment in the table below to play it!)
  • Efficiently transcribes long audio segments (now supports up to 3 hours)
  • Robust performance on spoken numbers and song lyrics
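
To illustrate how word-level timestamps can become the clickable segments shown in the results table, here is a minimal sketch in Python. It is not the demo's actual code or the model's API: the `Word` and `Segment` types, the `group_into_segments` helper, and the sample timings are all hypothetical, and splitting on sentence-ending punctuation is just one plausible grouping rule that relies on the model's automatic punctuation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    # Hypothetical shape of one word-level timestamp (times in seconds).
    text: str
    start: float
    end: float


@dataclass
class Segment:
    # One table row: the segment text plus the span of audio to play.
    text: str
    start: float
    end: float


def _to_segment(words: List[Word]) -> Segment:
    # A segment spans from its first word's start to its last word's end.
    return Segment(
        text=" ".join(w.text for w in words),
        start=words[0].start,
        end=words[-1].end,
    )


def group_into_segments(words: List[Word]) -> List[Segment]:
    """Group words into segments, closing one at each sentence-ending mark."""
    segments: List[Segment] = []
    current: List[Word] = []
    for word in words:
        current.append(word)
        if word.text.endswith((".", "?", "!")):
            segments.append(_to_segment(current))
            current = []
    if current:  # keep trailing words that lack final punctuation
        segments.append(_to_segment(current))
    return segments


# Made-up sample data, for illustration only.
words = [
    Word("Hello", 0.00, 0.40),
    Word("world.", 0.45, 0.90),
    Word("It", 1.20, 1.30),
    Word("costs", 1.35, 1.70),
    Word("$42.", 1.75, 2.30),
]
segments = group_into_segments(words)
for s in segments:
    print(f"{s.start:6.2f} {s.end:6.2f}  {s.text}")
```

Each resulting segment carries a start and end time, which is the information a player would need to seek to and play just that row's audio.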

This model is available for commercial and non-commercial use.

Example Audio Files (Click to Load)

Transcription Results (Click row to play segment)

Transcription Segments
