Key results
The company
Mirakl
mirakl.com
SaaS platform for enterprise marketplaces, dropshipping, and retail media.
Result highlights
- Catalog onboarding cut from 28 days to <24 hours
- ~50% reduction in categorization errors
The story
Mirakl, a leading provider of eCommerce software solutions, supports over 100,000 merchants worldwide and handles rapidly growing volumes of operational and product data.
Supplier catalog onboarding was a manual process that took an average of 28 days per vendor to complete. Strict data governance regulations prevented the team from simply sending customer data to third-party AI vendors for faster processing.
The engineering team built a GenAI application called Catalog Transformer that uses a multi-stage pipeline orchestrated with Lakeflow Jobs to categorize and enrich data. The system runs a mix of models including GPT, Llama, and Mistral to rewrite vendor content while Unity Catalog maintains strict governance. Spark parallelizes the workload to process thousands of items concurrently across extensive catalogs.
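To make the shape of such a pipeline concrete, here is a minimal sketch of a staged categorize-then-enrich flow applied to many items in parallel. It is not Mirakl's code: the `Item` type, the `categorize` and `enrich` functions, and the keyword rule are hypothetical stand-ins for the model calls, and a local thread pool stands in for Spark so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Item:
    """Hypothetical catalog record; field names are assumptions."""
    sku: str
    title: str
    category: str = ""
    description: str = ""


def categorize(item: Item) -> Item:
    # Stand-in for a model call: a trivial keyword rule, for illustration only.
    category = "footwear" if "shoe" in item.title.lower() else "uncategorized"
    return replace(item, category=category)


def enrich(item: Item) -> Item:
    # Stand-in for a model rewriting vendor content.
    description = f"{item.title.strip().title()} ({item.category})"
    return replace(item, description=description)


# Stages run in order for each item, mirroring a multi-stage pipeline.
STAGES = [categorize, enrich]


def run_pipeline(item: Item) -> Item:
    for stage in STAGES:
        item = stage(item)
    return item


def process_catalog(items: Iterable[Item], max_workers: int = 8) -> List[Item]:
    # Items are independent, so they can be processed concurrently.
    # pool.map returns results in the same order as the input.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_pipeline, items))


if __name__ == "__main__":
    catalog = [Item("A1", "running shoe"), Item("B2", "desk lamp")]
    for processed in process_catalog(catalog):
        print(processed)
```

Because each item passes through the stages independently, the same per-item function could in principle be distributed across a cluster rather than a thread pool; how the real system partitions its work is not described in the story.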
Quotes
“With Databricks, we can do more with less, and we can ship valuable AI-powered features to our customers faster.”