Decagon
Voice customer support
Users hung up on lagging voice agents. Speculative decoding on NVIDIA B200s cut latency from seconds to <400ms.
- P95 voice latency cut from seconds to <400ms
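The Decagon result hinges on speculative decoding, where a small draft model proposes several tokens ahead and the large model only verifies them, so one expensive pass can emit several tokens. Below is a minimal greedy sketch of the idea; the draft and target functions, lookahead size, and canned reply are illustrative stand-ins, not Decagon's or NVIDIA's actual stack.

```python
from typing import Callable, List

def speculative_decode(
    prompt: List[str],
    draft_next: Callable[[List[str]], str],   # cheap draft model: next-token guess
    target_next: Callable[[List[str]], str],  # large target model: authoritative token
    max_new_tokens: int = 16,
    lookahead: int = 4,                       # tokens drafted per verification step
) -> List[str]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        # 1) Draft `lookahead` tokens with the cheap model.
        drafted, ctx = [], list(out)
        for _ in range(lookahead):
            tok = draft_next(ctx)
            drafted.append(tok)
            ctx.append(tok)

        # 2) Verify with the target model. In production this is one batched
        #    forward pass over all drafted positions; here we loop for clarity.
        verified = []
        for tok in drafted:
            expected = target_next(out + verified)
            verified.append(expected)         # always keep the target's token
            if tok != expected:
                break                         # stop at the first mismatch
        out.extend(verified)
        # Output matches plain greedy decoding with the target model, but each
        # verification step can emit up to `lookahead` tokens at once.
    return out[: len(prompt) + max_new_tokens]


if __name__ == "__main__":
    REPLY = "thanks for calling how can i help you today".split()

    def target_next(ctx: List[str]) -> str:
        # Stand-in "large" model: deterministically continues a canned reply.
        return REPLY[(len(ctx) - 1) % len(REPLY)]

    def draft_next(ctx: List[str]) -> str:
        # Stand-in "small" model: agrees most of the time, sometimes guesses wrong.
        return target_next(ctx) if len(ctx) % 5 else "uh"

    print(" ".join(speculative_decode(["agent:"], draft_next, target_next)))
```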
Translating the Oxford English Dictionary took 39 seconds. A 72-GPU supercomputer cut that to 3.7 seconds, enabling real-time voice translation.
A global AI translation provider serves millions of daily users and over 200,000 businesses requiring accurate, context-aware communication.
Scaling its services forced a trade-off between model size and speed: translating the Oxford English Dictionary took nearly 40 seconds. Legacy...
DeepL is a technology company that specializes in AI-powered language translation and natural language processing services.
NVIDIA is a technology company that specializes in semiconductors, graphics processing units, and artificial intelligence for applications in data centers, gaming, and more.
Collaborated on infrastructure planning for DeepL's liquid-cooled DGX SuperPOD deployment.
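The 39-second-to-3.7-second figure works out to roughly a 10.5x wall-clock speedup, the kind of gain you get from splitting a long text into chunks and translating them concurrently instead of sequentially. Below is a minimal sketch of that decomposition, assuming a thread pool and a dummy per-chunk cost in place of real GPU-backed translation; the chunk size, worker count, and timings are illustrative, not DeepL's published architecture.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def translate_chunk(chunk: str) -> str:
    # Stand-in for a GPU-backed translation call; real systems would split on
    # sentence boundaries rather than raw character offsets.
    time.sleep(0.01)              # pretend each chunk costs 10 ms
    return chunk.upper()          # dummy "translation"

def translate_document(text: str, chunk_size: int = 80, workers: int = 8) -> str:
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return "".join(pool.map(translate_chunk, chunks))  # map preserves order

if __name__ == "__main__":
    doc = "example sentence. " * 200
    t0 = time.perf_counter()
    translate_document(doc, workers=1)
    t1 = time.perf_counter()
    translate_document(doc, workers=8)
    t2 = time.perf_counter()
    print(f"1 worker: {t1 - t0:.2f}s   8 workers: {t2 - t1:.2f}s   "
          f"speedup: {(t1 - t0) / (t2 - t1):.1f}x")
```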
Related implementations across industries and use cases
Training human-sounding voices took 20 days on three servers. With H100s, training cycles dropped to 5 days on one server, scaling across 18 languages.
Buying hardware for 10x overnight spikes was risky. On-demand GPUs now absorb volatility, enabling same-day scaling.
Engineers manually correlated alerts across systems. AI agents now diagnose issues and suggest fixes, cutting recovery time by 35%.
Minor edits required days of crew coordination. Now, staff use avatars to modify dialogue and translate it into other languages instantly.
Lab supply orders were handwritten in notebooks. Digital ordering now takes seconds, freeing 30,000 hours annually for research.
Experts spent 15 minutes pulling data from scattered systems. Natural language prompts now generate detailed reports instantly.