SUMMARY
To enhance AI-based facial recognition, Oprimes collected more than 120,000 diverse face
images from 25+ countries and 50+ device types. This effort reduced bias, boosted model
accuracy by 15%, and improved performance under low-light and varied conditions. It also
shortened AI training time by 2–3 weeks, accelerating time-to-market.
THE CHALLENGE
- Lack of Diversity in training data caused biased facial recognition across ethnicities, lighting, and expressions.
- Poor Performance on low-end devices and in low-light or motion conditions.
- Synthetic Data Limitations: Synthetic data needed real-world samples to avoid inaccuracies.
SOLUTION
- Global Coverage: Focused on achieving demographic and environmental diversity through wide geographic representation.
- Real-world Grounding: Ensured synthetic data generated via GenAI was grounded in real-world samples to maintain accuracy.
- Human-in-the-Loop Validation: Manual checks helped improve dataset quality by 15%.
KEY OUTCOMES
- Data Collection: Over 120,000 face images from 25+ countries, capturing a wide range of ethnicities, lighting, expressions, and devices.
- Diverse Datasets: Included 20,000+ low-light images, 15,000+ images with accessories (glasses, masks), and 10,000+ facial expression variations.
- Device Coverage: Captured data using 50+ device types under diverse conditions (low-light, varied angles, with accessories).
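To put the subset figures above in proportion, the following is a minimal sketch that computes each subset's approximate share of the full collection. It assumes the "+" figures can be treated as round lower bounds and that the subsets may overlap, so the shares are rough and are not expected to sum to 100%. The names TOTAL_IMAGES, SUBSETS, and subset_shares are hypothetical and used only for illustration; they are not part of any Oprimes tooling.

```python
# Round lower-bound counts quoted in the case study (the "+" figures are
# treated as minimums; this is an assumption made for illustration).
TOTAL_IMAGES = 120_000
SUBSETS = {
    "low_light": 20_000,
    "accessories": 15_000,
    "expression_variations": 10_000,
}


def subset_shares(total, subsets):
    """Return each subset's share of the total as a percentage (1 decimal place)."""
    return {name: round(100 * count / total, 1) for name, count in subsets.items()}


shares = subset_shares(TOTAL_IMAGES, SUBSETS)
for name, pct in shares.items():
    # Subsets may overlap, so these shares are approximate and need not sum to 100%.
    print(f"{name}: roughly {pct}% of the collected images")
```

A breakdown like this is mainly useful as a quick sanity check on how much of the dataset targets the hard conditions named in THE CHALLENGE.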