How your ML projects might fall victim to elusive 'perfect data'
Chasing the elusive dream of perfect data can stall your machine learning project and cost you the opportunity to create good-enough models.
In artificial intelligence projects, there is a natural tension between technical rigor and business velocity. Machine learning teams often strive for statistically perfect models built on pristine datasets, a process that can take months or even years. But what if a "good enough" model, deployed quickly, could deliver more business value than a perfect one that arrives too late?
This is not a theoretical question. Many organizations fall into the trap of pursuing data perfection at the expense of tangible results. The following case study from a retail demand forecasting project illustrates how a pragmatic, iterative approach can outperform the traditional, slower path to development.
The challenge: a $2 million overstock problem
Our team was tasked with building an inventory demand forecasting system for a retail chain with 50 stores. The company was struggling with approximately $2 million in annual costs related to overstocking. The goal was to build a system that could accurately forecast demand for 10,000 unique products, or SKUs, to help the finance team make smarter purchasing decisions.
To meet this goal, the data science team set a precise technical target. They aimed for a Mean Absolute Percentage Error (MAPE) of 5%. This metric means that, on average, the model's sales forecasts would be within 5% of the actual sales numbers. This seemed like a reasonable benchmark for a system intended to guide significant financial commitments.
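For readers unfamiliar with the metric, MAPE is simple to compute: average the absolute forecast errors as a percentage of actual sales. The sketch below uses made-up weekly sales figures for a single SKU, purely for illustration; the numbers are not from the project.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error: average of |actual - forecast| / actual, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

# Hypothetical weekly unit sales for one SKU.
actual = [120, 95, 210, 80]
forecast = [114, 99, 220, 78]
print(f"MAPE: {mape(actual, forecast):.1f}%")  # ~4.1%, i.e. within the 5% target
```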
The traditional path: an eight-month quest for precision
The project began with an extensive data preparation phase. The historical sales data was fraught with issues common in real-world enterprise environments. It contained missing values, inconsistent product categorizations, untagged seasonal adjustments, and inventory counts that failed to reconcile across different systems.
The team dedicated eight months to solving these problems. They cleaned and standardized data formats and built complex feature engineering pipelines. The process involved interviewing store managers to understand regional variations and manually reconciling inventory discrepancies. The result of this intensive effort was a sophisticated model that accounted for regional preferences, seasonal trends, and the impact of promotions. In a controlled testing environment, it achieved a MAPE of 6%, very close to the original target.
A pragmatic alternative: defining the 'region of indifference'
While this comprehensive project was underway, our product team began to ask a different question: what is the minimum performance required to deliver any business value at all? We analyzed the company's existing manual ordering process and found it was highly unreliable, especially for the most volatile and costly SKUs.
This analysis led us to define a "region of indifference." We determined that any model capable of simply outperforming the manual guesswork on the top 20% most costly overstock items would save the company hundreds of thousands of dollars. A model with a 25% MAPE, while far from perfect, would be a huge win. In fact, we calculated that even a model with a 30% or 40% MAPE would likely be good enough to start delivering value.
Armed with this perspective, we launched a parallel effort. The data engineering team performed a minimal cleaning of the data, focusing only on removing obvious outliers, filling missing values with simple averages, and standardizing basic formats. Within two weeks, we trained a simple baseline model. It had a MAPE of 22%—not precise, but better than the status quo. It immediately identified clear, actionable patterns, such as consistently overstocked categories and mismatches in regional product distribution.
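To make the contrast concrete, the sketch below shows roughly what a "minimal cleaning plus naive baseline" pass can look like. It is illustrative only: the column names (store_id, sku, week, units_sold), the outlier threshold, and the trailing-average window are assumptions, not the project's actual schema or model.

```python
import pandas as pd

# Assumed schema: one row per store, SKU, and week of unit sales.
sales = pd.read_csv("weekly_sales.csv", parse_dates=["week"])

# Fill missing sales with the simple per-store/SKU average.
sales["units_sold"] = sales.groupby(["store_id", "sku"])["units_sold"] \
                           .transform(lambda s: s.fillna(s.mean()))

# Remove obvious outliers: anything above the 99th percentile.
sales = sales[sales["units_sold"] <= sales["units_sold"].quantile(0.99)]

# Naive baseline forecast: trailing 8-week average for each store/SKU.
sales = sales.sort_values("week")
sales["forecast"] = (
    sales.groupby(["store_id", "sku"])["units_sold"]
         .transform(lambda s: s.shift(1).rolling(8, min_periods=4).mean())
)
```

A baseline this simple will never hit a 5% MAPE, but it is enough to surface the consistently overstocked categories and regional mismatches described above.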
From lab to reality: deploying the 'good enough' model
We deployed this baseline system to five pilot stores. The results were immediate and significant. Within the first quarter, the pilot stores saw a 25% reduction in overstock. When extrapolated across all 50 stores, this represented an annual saving of approximately $500,000. This value was realized while the "perfect" model was still months away from completion.
Over the next six months, we continuously improved this deployed model based on real-world feedback. A key improvement came from a deeper collaboration with the finance team to understand the asymmetric cost of error. We learned that overstocking an item was three times more costly to the business than understocking it and losing a sale.
Standard metrics like MAPE treat these errors equally. We adjusted our model to reflect business reality. We modified its loss function—the mathematical component that guides its learning—to more heavily penalize over-predictions. This change, driven by business insight, had a far greater impact on reducing overstock costs than a marginal improvement in forecasting accuracy would have.
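One way to encode that asymmetry, shown as a sketch below, is a weighted squared-error loss that penalizes over-predictions (which drive overstock) three times as heavily as under-predictions. The 3x ratio mirrors the finance team's cost estimate; the quadratic form and the gradient/Hessian helper are illustrative assumptions, not the exact loss the team used.

```python
import numpy as np

OVERSTOCK_PENALTY = 3.0  # over-predicting is assumed 3x as costly as under-predicting

def asymmetric_loss(y_true, y_pred):
    """Squared error, weighted more heavily when the model over-predicts demand."""
    error = y_pred - y_true                       # > 0 means over-prediction
    weight = np.where(error > 0, OVERSTOCK_PENALTY, 1.0)
    return np.mean(weight * error ** 2)

def asymmetric_grad_hess(y_true, y_pred):
    """Gradient and Hessian of the weighted loss, in the (grad, hess) form
    many gradient-boosting libraries accept as a custom objective."""
    error = y_pred - y_true
    weight = np.where(error > 0, OVERSTOCK_PENALTY, 1.0)
    return 2 * weight * error, 2 * weight
```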
Principles for value-driven AI development
This experience offers a clear framework for any team building AI products. The first principle is to define business value before pursuing technical perfection. The initial focus should be on understanding the business baseline and the minimum threshold for meaningful improvement, rather than chasing an arbitrary statistical target.
Once that threshold is known, teams can ship a product to learn. The fastest way to acquire high-quality, relevant data is to deploy a functional model into the real world. This process uncovers edge cases, generates user feedback, and provides performance data that no offline test set can ever replicate.
Furthermore, successful AI products align their optimization process with business objectives. They look beyond standard metrics to understand the financial consequences of different model errors. By working with stakeholders to quantify these costs, teams can embed this business logic directly into the model’s training via its loss function, ensuring it optimizes for what truly matters.
Ultimately, this approach treats data quality as a starting input, not an insurmountable blocker. Instead of asking if the data is perfect, the more productive question is: what is the simplest model we can build to create value with the data we have today? Answering this question is the key to unlocking business value faster.