Key results
The company
Vionlabs
vionlabs.com
AI-powered video metadata and discovery platform for streaming and media services.
Result highlights
- Automated curation for 100,000 titles
- Automated synopses generated in 4 languages
The story
Vionlabs is a content intelligence provider serving the global media industry. It uses AI to extract deep metadata, such as moods and emotions, from audio and video assets.
Proprietary audio-visual models failed to detect plot twists revealed only in dialogue, such as a character admitting they are a ghost. Developing a custom text-based model to address this blind spot would have required six to nine months of training.
The team deployed Llama 3.1 on Vertex AI to fuse text transcripts with audio and visual data into multimodal embeddings. This system automatically generates narrative synopses in four languages and clusters titles into curated lists with AI-generated names. Vertex AI handles the model hosting and API integration, while BigQuery manages the data processing for the content libraries.
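The fuse-then-cluster idea described above can be illustrated with a minimal sketch. The story does not say how the embeddings are actually combined or clustered, so everything here is an assumption for illustration: the `fuse` and `cluster` helpers, the toy two-dimensional vectors, the 0.8 similarity threshold, and the choice of simple concatenation with greedy cosine grouping. In a real pipeline the per-modality vectors would come from the hosted models rather than being typed in by hand.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def fuse(text_vec, audio_vec, visual_vec):
    """Hypothetical late fusion: normalize each modality, concatenate, renormalize."""
    fused = l2_normalize(text_vec) + l2_normalize(audio_vec) + l2_normalize(visual_vec)
    return l2_normalize(fused)

def cosine(a, b):
    """Cosine similarity; inputs are assumed to be unit length already."""
    return sum(x * y for x, y in zip(a, b))

def cluster(titles, threshold=0.8):
    """Greedy grouping: a title joins the first cluster whose seed is similar enough."""
    clusters = []  # each entry is (seed_vector, [title names])
    for name, vec in titles.items():
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((vec, [name]))
    return [members for _, members in clusters]

# Toy vectors standing in for transcript, audio, and visual embeddings.
titles = {
    "Ghost Story A": fuse([0.9, 0.1], [0.2, 0.8], [0.5, 0.5]),
    "Ghost Story B": fuse([0.8, 0.2], [0.3, 0.7], [0.5, 0.5]),
    "Car Chase": fuse([0.1, 0.9], [0.9, 0.1], [0.1, 0.9]),
}
groups = cluster(titles)
print(groups)
```

With these toy inputs the two ghost stories land in one group and the car chase in another. Naming each group, as the story describes, would be a separate step in which a language model is prompted with the titles in the cluster.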
Scope & timeline
- Implementation time cut from an estimated six to nine months to weeks
Quotes
“Audio-visual models alone couldn’t detect crucial plot details—for instance, that a character is a ghost, if that fact is only revealed in dialogue within the transcript.”