Text to Speech

Seamless Voice Generation That Sounds Like a Conversation, Not a Machine

Text-to-Speech (TTS) technology converts written text into natural-sounding speech. It is widely used in virtual assistants, audiobooks, accessibility tools, and automated customer service systems, where a human-like voice interface improves user engagement. We offer advanced TTS solutions tailored to your specific needs. Choose from our robust models to power your applications with natural, high-quality voices.

Naturalness and Speed

The audio generated by the TTS was rated higher in quality than the original recordings: in the research, the comparative MOS (CMOS) was positive for this model. The real-time factor on an RTX 3090 is 0.07, one of the fastest on the market, which makes the model well suited to voice bots. The model preserves the prosody and style of a conversation, such as sadness and happiness. It can run on a dedicated instance or serverless to minimize network delay.
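To make the real-time factor concrete, here is a minimal sketch assuming the usual definition: synthesis time divided by the duration of the audio produced, so values below 1 mean faster than real time. The function names are illustrative and are not part of any product API.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def expected_synthesis_seconds(rtf: float, audio_seconds: float) -> float:
    """Estimate how long it takes to synthesize audio of a given duration."""
    return rtf * audio_seconds


# With the real-time factor of 0.07 quoted above, a 10-second reply
# would take roughly 0.7 seconds to synthesize on comparable hardware.
estimate = expected_synthesis_seconds(0.07, 10.0)
print(f"Estimated synthesis time: {estimate:.2f} s")
```

For a voice bot, this estimate is the time the caller waits before the full reply is available, excluding network delay.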

TTS Audio in Portuguese

Fernando - Original and TTS

Carla - Original and TTS

Naturalness

In the benchmarks, the system generated audio of higher quality than the original recordings (CMOS +0.28).
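As a rough illustration of how a comparative MOS is read, the sketch below assumes the common setup in which each listener compares a synthesized sample against the original recording and gives a score from -3 (much worse) to +3 (much better); the CMOS is the mean of those scores. The scores in the example are made up for illustration and are not the benchmark data.

```python
from statistics import mean


def cmos(scores: list[float]) -> float:
    """Mean of comparative scores, each in the range -3 to +3."""
    for score in scores:
        if not -3 <= score <= 3:
            raise ValueError(f"score out of range: {score}")
    return mean(scores)


# Illustrative listener scores only (not the benchmark data).
# Positive values mean the synthesized sample was preferred.
example_scores = [1, 0, -1, 2]
print(f"CMOS: {cmos(example_scores):+.2f}")
```

A positive result means listeners, on average, preferred the synthesized audio over the reference.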

Fast

High speed for use with voice agents. Real-time factor of 0.17.

Privacy

Run on your own instance for high privacy.

Drivers

We offer free UniMRCP drivers. This means you can use the native Asterisk and FreeSWITCH drivers and support Cisco and Avaya implementations.

Custom voices

We can produce custom voices from a small set of custom audio recordings, which is ideal for low-resource languages.

APIs

We provide REST APIs with examples in curl, Python, and Node.
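The shape of a REST call from Python might look like the sketch below. Everything specific here is a placeholder assumption: the base URL, the `/v1/synthesize` path, the JSON field names, and the bearer-token header are hypothetical, and using "Carla" as a voice identifier is also an assumption. Refer to the official API examples for the real endpoint and parameters. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the real service address.
BASE_URL = "https://tts.example.com"


def build_tts_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for speech synthesis.

    The path, JSON fields, and auth scheme are assumptions for illustration.
    """
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/synthesize",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_tts_request("Olá, tudo bem?", voice="Carla", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`, with the response body written to an audio file in whatever format the service returns.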

Deployment models

Proxy Model

  • Fast and accurate proxy to Azure and OpenAI
  • Large number of voices available
  • Cost effective
  • Use the best-of-breed TTS providers available

Serverless

  • Best price per minute
  • Natural Neural Voices
  • Total privacy
  • Custom voices available
  • Low volume commitment

Instance

  • Best option for large volumes
  • Natural Neural Voices
  • Total privacy
  • Custom voices available
  • Fixed and predictable pricing per month

Local

  • High-performance inference servers
  • Minimum volume required
  • Natural Neural Voices
  • High speed
  • For large projects we can deliver the system locally