Speech to Text

Fast and Accurate Transcriptions with Minimal Errors

Speech-to-Text (STT) technology converts spoken language into written text. This technology is critical for enabling voice-driven interfaces, dictation systems, transcription services, and real-time communication.

Our STT solutions use current algorithms and AI models to capture speech accurately in a range of languages and dialects. Whether the source is a live-streamed conversation, a customer service interaction, or a voice command, our STT infrastructure delivers fast and accurate transcription.

Highlights

Insanely Fast Speech to Text

Our STT converts speech to text up to 10 times faster than OpenAI.

Reduced Hallucination

We use state-of-the-art technology to reduce hallucinations in more than 99% of cases.

Languages

We support 90 different languages.

Drivers for PBXs

We offer free UniMRCP drivers. This means you can use native Asterisk and FreeSWITCH drivers, and you can support Cisco and Avaya implementations.

STT Diarization

The system can separate the speakers in the transcription even if the audio is mono.

APIs

We have REST APIs with examples in cURL, Python, and Node.js.
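
As a rough illustration of what a Python call to a REST transcription API can look like, the sketch below builds a POST request that carries raw audio bytes. The URL, the `language` query parameter, the bearer-token header, and the JSON response are all assumptions made for this example, not the documented interface; the real endpoint and parameters come from the API reference.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values: replace with the real endpoint and key from the API docs.
API_URL = "https://api.example.com/v1/transcriptions"
API_KEY = "YOUR_API_KEY"


def build_transcription_request(audio: bytes, language: str = "en",
                                content_type: str = "audio/wav") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying raw audio bytes."""
    # Assumed query parameter; URL-encode it so unusual values stay valid.
    query = urllib.parse.urlencode({"language": language})
    headers = {
        "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        "Content-Type": content_type,
    }
    return urllib.request.Request(
        f"{API_URL}?{query}", data=audio, headers=headers, method="POST"
    )


def transcribe(audio: bytes, language: str = "en") -> dict:
    """Send the request and decode the response, assumed to be JSON."""
    request = build_transcription_request(audio, language)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect or log the request before any audio leaves the machine.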

Redact

The system can redact the transcribed text to hide private information. It has a dedicated endpoint for redaction.

Text Formats

JSON, VTT, SRT, DIARIZATION, JSON DIARIZATION
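
To show how the subtitle formats relate to a JSON transcript, here is a minimal sketch that renders timed segments as SRT. The segment shape (a list of dicts with `start` and `end` in seconds plus `text`) is an assumption for illustration; the actual JSON schema may differ.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: sequence number, time range, text, then a line break.
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


# Hypothetical transcript segments, for demonstration only.
example = [
    {"start": 0.0, "end": 1.25, "text": "Hello."},
    {"start": 1.25, "end": 3.0, "text": "How can I help you?"},
]
print(segments_to_srt(example))
```

VTT output would follow the same pattern, with a `WEBVTT` header line and a dot instead of a comma before the milliseconds.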

Audio Formats

wav, mp3, opus, flac, pcm, ogg, m4a, webm, weba, oga, mid, aiff, au and wma
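
A client can check a file against this list before uploading. The sketch below does so by file extension only, which is a client-side convenience assumed for this example; it does not inspect the audio data, and the service may identify formats differently.

```python
from pathlib import Path

# Extensions taken from the list of supported audio formats above.
SUPPORTED_AUDIO_FORMATS = {
    "wav", "mp3", "opus", "flac", "pcm", "ogg", "m4a",
    "webm", "weba", "oga", "mid", "aiff", "au", "wma",
}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is in the supported list."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_AUDIO_FORMATS
```

For example, `is_supported_audio("call.WAV")` returns `True` and `is_supported_audio("notes.txt")` returns `False`.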

Deployment models

Proxy Model

  • Fast and accurate proxy to OpenAI and others
  • Cost effective
  • Use best of breed models

Serverless

  • Best price per minute
  • Fast and precise model choice
  • Total privacy
  • Low volume commitment

Instance

  • Best option for large volumes
  • Total privacy
  • Fixed and predictable pricing per month

High Speed

  • High-performance inference servers
  • Minimum volume required

Location

  • For large projects, we can deliver the system locally