Koveh API

Speech-to-Text API

Audio transcription using Whisper and T-One models

The Speech-to-Text API provides three endpoints for transcribing audio files to text: two backed by OpenAI's Whisper (base and medium models) and one backed by T-One, a model specialized for Russian.

Base URL

https://api.koveh.com/whisper/
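
As a minimal sketch of how full endpoint URLs could be built, the helper below joins an endpoint name onto the base URL. It assumes the endpoint paths listed under Endpoints are relative to the base URL rather than to the host root; the page does not state this explicitly. The name endpoint_url is hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.koveh.com/whisper/"


def endpoint_url(name: str) -> str:
    """Join an endpoint name (e.g. "whisper-lite") onto the base URL.

    Assumption: endpoints live under the base URL's "/whisper/" prefix.
    """
    # Strip a leading slash so urljoin keeps the "/whisper/" prefix
    # instead of resolving the path against the host root.
    return urljoin(BASE_URL, name.lstrip("/"))
```

Stripping the leading slash matters because urljoin treats "/whisper-lite" as an absolute path and would drop the "/whisper/" prefix.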

Authentication

These endpoints do not currently require an API key. Access is instead controlled at the network level: by Nginx where it is configured, or by restricting the service to local environments.


Endpoints

1. Whisper Lite (Fast)

POST /whisper-lite

Transcribes audio using the Whisper base model. Fast, but less accurate on complex audio.

Form Data:

  • file (audio file): The audio clip to transcribe.
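
The upload described above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the service's official client: it assumes the endpoint path is appended to the base URL, that the file is sent as multipart/form-data under the field name file, and that the filename is plain ASCII. The helper names build_multipart and transcribe and the 120-second timeout are hypothetical choices.

```python
import json
import uuid
import urllib.request
from pathlib import Path

BASE_URL = "https://api.koveh.com/whisper/"


def build_multipart(field: str, filename: str, data: bytes,
                    content_type: str = "application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex  # random delimiter between form parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, endpoint: str = "whisper-lite",
               timeout: float = 120.0) -> dict:
    """POST the audio file at `path` to an endpoint and return the parsed JSON."""
    audio = Path(path)
    body, header = build_multipart("file", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        BASE_URL + endpoint,  # assumption: endpoints sit under the base URL
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as transcribe("meeting.wav") would send a local file (the filename here is only an example) to the Whisper Lite endpoint; passing endpoint="whisper" or endpoint="t-one" would target the other two endpoints, which take the same form field.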

2. Whisper Medium (Accurate)

POST /whisper

Transcribes audio using the Whisper medium model. More accurate than Whisper Lite, but slower.

Form Data:

  • file (audio file): The audio clip to transcribe.

3. T-One (Russian Specialized)

POST /t-one

Optimized transcription for Russian language audio using the T-One model.

Form Data:

  • file (audio file): The audio clip to transcribe.
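
Since the three endpoints differ only in the model behind them, a client might choose one from the audio's language and a speed preference. The function below is one possible policy, not a rule defined by the API; pick_endpoint and its parameters are hypothetical.

```python
from typing import Optional


def pick_endpoint(language: Optional[str] = None,
                  prefer_speed: bool = False) -> str:
    """Return an endpoint name based on language and speed preference.

    Illustrative policy only: Russian audio goes to T-One, everything
    else to Whisper Lite (fast) or Whisper Medium (accurate).
    """
    if language is not None and language.lower() in ("ru", "russian"):
        return "t-one"
    return "whisper-lite" if prefer_speed else "whisper"
```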

Response Format

All transcription endpoints return a JSON object with the same four fields:

{
  "text": "The transcribed text here.",
  "language": "ru",
  "model": "medium",
  "duration": 15.5
}
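
A client could validate this response and load it into a typed object. The sketch below assumes the four fields shown above are always present; the Transcription class and parse_transcription function are hypothetical names, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcription:
    text: str
    language: str
    model: str
    duration: float


def parse_transcription(payload: str) -> Transcription:
    """Parse a JSON response body, failing loudly if a field is missing."""
    data = json.loads(payload)
    missing = [k for k in ("text", "language", "model", "duration")
               if k not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return Transcription(
        text=str(data["text"]),
        language=str(data["language"]),
        model=str(data["model"]),
        duration=float(data["duration"]),
    )


# The sample response from this page, used as test input.
sample = ('{"text": "The transcribed text here.", "language": "ru", '
          '"model": "medium", "duration": 15.5}')
result = parse_transcription(sample)
print(result.model, result.duration)  # prints: medium 15.5
```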
