AI case study

Arcee AI: Research data extraction

Standard tools failed on tables and equations. Intelligent parsing extracted 4M pages of scientific PDFs for model training.

Arcee AI

Software & Platforms

Published Nov 25, 2024

Key results

Volume Processed

~4M pages

The story

Context

An enterprise AI platform needed to build a comprehensive training dataset from every NLP research paper published since 2017, totaling approximately 4 million pages of PDF content.

Challenge

Standard open-source tools struggled to accurately extract complex elements like tables, charts, and equations from scientific documents. These...

The company

Arcee AI

arcee.ai

Development platform for specialized small language models and open-source AI tools.

Industry: Software & Platforms

Location: Miami, FL, USA

Employees: 11-50

Founded: 2023

The AI provider

LlamaIndex

www.llamaindex.ai

Data framework and agentic OCR platform for building LLM-powered applications.

Industry: Software & Platforms

Location: San Francisco, CA, USA

Employees: 11-50

Founded: 2022

Similar Case Studies

Related implementations across industries and use cases

Maven Bio

Pharmaceuticals & Biotech|SMB

Scientific document processing

Standard parsers couldn't read scientific charts. AI now extracts visuals into text, making hidden data searchable.

10-20x Faster Workflows

10x-20x faster analytical workflows for users

via llamaindex.ai

Published Nov 3, 2025

Delphi

Software & Platforms|SMB

Document processing

Malformed PDFs forced engineers to manually patch pipelines. AI agents now parse complex files into clean markdown, ending manual fixes.

0 Manual Patching

Zero manual patching for ingestion pipelines

via llamaindex.ai

Published Aug 5, 2025

Botminds

Software & Platforms|SMB

Document processing

Building models took months. Now, experts annotate docs to guide the AI, delivering functional solutions 90% faster.

15-20% Accuracy Improvement

15-20% accuracy improvement in v1 models
90% reduction in time to functional solution

via microsoft.com

Published Jun 28, 2024
