For a long time, companies measured AI progress by the model they used. GPT-4 or Gemini, proprietary or open-source, fine-tuning or RAG. That debate still exists, but it is no longer the most important one. The competitive differentiator has shifted layers: it has moved from the models to the data.
By Nicole Grossmann, Growth Engineer at Firecrawl, mathematician from Columbia University with a specialization in Artificial Intelligence from Georgia Tech
This isn’t an emerging trend. It is a reality that already separates those who are scaling from those still stuck in the pilot phase.
Gartner predicts that through 2026, organizations will abandon 60% of AI projects that are not supported by AI-ready data. McKinsey points in the same direction: nearly two-thirds of companies have failed to scale their AI initiatives. The problem is rarely the model. In most cases, it is the inability to structure information reliably before feeding it into any system.
What makes this even more critical is what happens at the technical layer. Inadequate data serialization can waste between 40% and 70% of available tokens on formatting overhead, which drives up API costs, shrinks the effective context window, and degrades model performance. This issue often flies under the radar in pilot projects but becomes painfully acute in production. At scale, the difference between well-structured and poorly structured data determines whether an AI operation is economically viable.
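To make the serialization point concrete, here is a minimal sketch in Python that serializes the same records two ways and compares their size. The records, field names, and helper functions are invented for illustration, and character count is used only as a rough stand-in for tokens, since real counts depend on the tokenizer of the model in use. The resulting percentage is not a benchmark and will vary with the data.

```python
import csv
import io
import json

# Hypothetical records, invented purely for illustration.
records = [
    {"customer_id": 1001, "order_total": 59.90, "channel": "web"},
    {"customer_id": 1002, "order_total": 120.00, "channel": "store"},
    {"customer_id": 1003, "order_total": 15.25, "channel": "web"},
]


def as_verbose_json(rows):
    """Pretty-printed JSON: every row repeats every key, plus quotes and indentation."""
    return json.dumps(rows, indent=2)


def as_compact_table(rows):
    """Header line followed by one comma-separated line per row: keys appear once."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def overhead_share(verbose, compact):
    """Fraction of the verbose payload that disappears in the compact form.

    Assumption: character length is a crude proxy for token count.
    """
    return 1 - len(compact) / len(verbose)


verbose = as_verbose_json(records)
compact = as_compact_table(records)
share = overhead_share(verbose, compact)

print(f"verbose JSON: {len(verbose)} characters")
print(f"compact table: {len(compact)} characters")
print(f"share of the verbose payload spent on formatting: {share:.0%}")
```

Both payloads carry the same values. The gap comes from repeated keys, quotes, and whitespace, and that gap is what gets multiplied across every request in production.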
The democratization of models has accelerated this scenario. Today, companies of any size can access similar tools via open APIs or commercial platforms. In 2026, the question is no longer “Do we have AI?” but rather “How do we make AI give us an edge over our competitors?” The answer almost always lies in the proprietary data accumulated throughout the operation: customer behavior, transactional history, consumption patterns, and market intelligence. This is the data flywheel: the more the business operates, the more of this data it accumulates.
The numbers confirm this shift in priority. While 74% of organizations have implemented some AI solution, only 33% have managed to integrate it broadly across the enterprise. The bottleneck isn’t technological. It is structural. 61% of companies list data quality as their main operational challenge, and 70% of the largest public companies are pivoting from a pure focus on innovation to demanding a return on investment.
Two companies using the same model can achieve completely different results. The difference lies in the depth and quality of the information feeding those systems. Without consistent structure, AI devolves into superficial automation: it looks modern, but it doesn’t deliver real gains.
Artificial intelligence is putting information quality back at the center of corporate strategy. Companies that understand this early on will innovate faster, reduce operational costs, and build moats that competitors cannot replicate simply by getting access to the next model.
Nicole Grossmann is a Growth Engineer at Firecrawl, a startup with a global footprint in open-source data structuring solutions for artificial intelligence. She is also a Director of the Columbia Alumni Association in Brazil and a representative for Crimson Education in the country. She holds a degree in Applied Mathematics and Economics from Columbia University, with a specialization in AI from Georgia Tech.






