AI case study

Dropbox: Model evaluation

Scattered spreadsheets couldn't catch AI hallucinations. Now, automated LLM judges evaluate every prompt change to block regressions.

The story

Context

A leading cloud storage platform developing a universal search tool that retrieves and organizes work across all of a user's connected applications.

Challenge

Behind the search interface runs a complex chain of retrieval and inference steps where a single prompt tweak can ripple unpredictably to cause...

Scope & timeline

  • Under 10 minutes for automated PR evaluations
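A minimal sketch of what an automated evaluation gate on a pull request could look like, assuming a fixed set of test cases, a judge that scores each answer between 0 and 1, and a merge rule that blocks any drop beyond a small tolerance. None of these names come from the case study: `EvalCase`, `keyword_judge`, `mean_score`, `gate`, and the 0.02 tolerance are hypothetical, and `keyword_judge` is a simple word-overlap stand-in for a real LLM judge so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    question: str
    context: str  # retrieved text the answer should be grounded in


# Hypothetical judge signature: (case, answer) -> score in [0, 1].
Judge = Callable[[EvalCase, str], float]


def keyword_judge(case: EvalCase, answer: str) -> float:
    """Stand-in for an LLM judge: fraction of answer words found in the context."""
    words = answer.lower().split()
    if not words:
        return 0.0
    context_words = set(case.context.lower().split())
    return sum(w in context_words for w in words) / len(words)


def mean_score(cases: List[EvalCase],
               generate: Callable[[EvalCase], str],
               judge: Judge) -> float:
    """Average judge score of one prompt variant over all cases."""
    return sum(judge(c, generate(c)) for c in cases) / len(cases)


def gate(baseline: float, candidate: float, tolerance: float = 0.02) -> bool:
    """Return True if the candidate may merge, False if it regresses."""
    return candidate >= baseline - tolerance


if __name__ == "__main__":
    cases = [EvalCase("Where is the Q3 plan",
                      "the q3 plan is in the planning folder")]
    current = mean_score(cases, lambda c: "the q3 plan is in the planning folder",
                         keyword_judge)
    changed = mean_score(cases, lambda c: "it was deleted yesterday",
                         keyword_judge)
    print("merge allowed:", gate(current, changed))
```

In a real pipeline the judge would call a model and the case set would be far larger; the point of the sketch is only the shape of the check: score the baseline, score the change, and compare.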

The company

Cloud storage, file sharing, and collaboration platform for teams and individuals.

Industry: Software & Platforms
Location: San Francisco, CA, USA
Employees: 1K-5K
Founded: 2007

The vendor

AI observability and evaluation platform that helps developers build, test, and monitor LLM-powered applications.

Industry: Software & Platforms
Location: San Francisco, CA
Employees: 11-50
Founded: 2020

Use case

Dropbox's Model evaluation is part of this use case:

AI Infrastructure
70 case studies (+118% YoY)
Proven impact (scale from Low through Moderate to Very Strong)
3.4 Moderate
2.9 Low within Software & Platforms
3.3 Moderate within Product Engineering
