Speech to Text

Fast and Accurate Transcriptions with Minimal Errors

Speech-to-Text (STT) technology converts spoken language into written text. This technology is critical for enabling voice-driven interfaces, dictation systems, transcription services, and real-time communication.

Our STT solutions leverage cutting-edge algorithms and AI models to accurately capture speech in various languages and dialects. Whether it’s live streaming conversations, customer service interactions, or voice commands, our STT infrastructure ensures fast and precise transcription.

Highlights

Insanely Fast Speech to Text

Our STT is extremely fast and can convert speech to text up to 10 times faster than OpenAI.

Reduced Hallucination

We use state-of-the-art technology that reduces hallucinations in more than 99% of cases.

Languages

We support 90 different languages.

Drivers for PBXs

We offer free UniMRCP drivers. This means you can use the native Asterisk and FreeSWITCH drivers and support Cisco and Avaya implementations.

STT Diarization

The system can separate the speakers in the transcription even if the audio is mono.

APIs

We have REST APIs with examples in cURL, Python, and Node.js.
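As a rough illustration of what a call to a REST transcription API can look like from Python, the sketch below builds (but does not send) an upload request using only the standard library. The URL, the `language` and `format` query parameters, and the bearer-token header are all placeholders chosen for this example, not the documented API; check the API reference for the real endpoint and parameters.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint: the real base URL and path are not given in this document.
API_URL = "https://api.example.com/v1/transcriptions"


def build_transcription_request(
    audio: bytes,
    api_key: str,
    language: str = "en",
    response_format: str = "json",
) -> urllib.request.Request:
    """Build a POST request that carries raw audio bytes in the body.

    The query parameter names and the auth scheme are assumptions
    made for illustration only.
    """
    query = urllib.parse.urlencode({"language": language, "format": response_format})
    return urllib.request.Request(
        url=f"{API_URL}?{query}",
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "audio/wav",
        },
    )
```

To actually send the request, pass the returned object to `urllib.request.urlopen` and decode the response body according to the format you asked for.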

Redact

The system can redact the text to hide private information. It has a specific endpoint for redaction.

Text Formats

JSON, VTT, SRT, DIARIZATION, JSON DIARIZATION
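To show how the subtitle formats relate to a JSON-style result, here is a minimal sketch that renders a list of timed segments as SRT and as WebVTT. The segment shape (`start` and `end` in seconds plus `text`) is an assumption for this example and may not match the actual JSON output. The two formats differ mainly in the millisecond separator (comma in SRT, dot in VTT), the numbered cues in SRT, and the `WEBVTT` header.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments: list[dict]) -> str:
    """WebVTT: 'WEBVTT' header, dot before the milliseconds."""
    cues = []
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        cues.append(f"{start} --> {end}\n{seg['text']}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


# Hypothetical segments, standing in for a transcription result.
segments = [
    {"start": 0.0, "end": 1.5, "text": "Hello"},
    {"start": 61.25, "end": 3661.0, "text": "World"},
]
print(to_srt(segments))
print(to_vtt(segments))
```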

Audio Formats

wav, mp3, opus, flac, pcm, ogg, m4a, webm, weba, oga, mid, aiff, au, and wma

Deployment models

Proxy Model

  • Fast and accurate proxy to OpenAI and others
  • Cost effective
  • Use best-of-breed models

Serverless

  • Best price per minute
  • Fast and precise model choice
  • Total privacy
  • Low volume commitment

Instance

  • Best option for large volumes
  • Total privacy
  • Fixed and predictable pricing per month

High Speed

  • High-performance inference servers
  • Minimum volume required

Local

  • For large projects we can deliver the system locally