AI Readiness Assessment: Score Your Data First

AI readiness assessment for executives: score your data on completeness and lineage before your next AI investment.

Gartner's research, cited by Informatica in 2025, found 39% of organizations name data gaps as their top barrier to AI adoption. Not model selection. Data. And yet most AI budget conversations start with a demo of the model.

The scorecard executives are not running

RAND's 2024 study interviewed 65 experienced AI and machine learning builders to find out why projects fail. The findings point to data problems and poor project framing as root causes, with people problems compounding both. Model quality does not appear as the binding constraint. Pedro Domingos, a computer science professor at the University of Washington, puts the logic plainly: models learn from patterns in the data they receive, not from the intentions of the people who built them. A well-chosen model trained on broken data learns the breakage.

The four dimensions worth scoring before you approve a budget are completeness, consistency, lineage, and governance. Score each from one to four. A one means the dimension is absent or unknown. A four means a named person documents and owns it, with testing confirmed. Total the scores, which can range from four to sixteen. Anything below ten warrants a stop before funding.
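The arithmetic is simple enough to hand to a data team as a few lines of Python. What follows is a minimal sketch, not a standard tool: the names DIMENSIONS, STOP_BELOW, total_score, and recommendation are hypothetical, and the "continue" label is an assumption, since the scale above only says what to do below ten.

```python
# Hypothetical names throughout; only the 1-4 scale and the
# "below ten means stop" rule come from the article.
DIMENSIONS = ("completeness", "consistency", "lineage", "governance")
STOP_BELOW = 10


def total_score(scores):
    """Check that every dimension has a score from 1 to 4, then sum them."""
    for name in DIMENSIONS:
        if scores.get(name) not in (1, 2, 3, 4):
            raise ValueError(f"{name} needs a whole-number score from 1 to 4")
    return sum(scores[name] for name in DIMENSIONS)


def recommendation(scores):
    """Return 'stop' below the threshold; 'continue' is an assumed label."""
    return "stop" if total_score(scores) < STOP_BELOW else "continue"


# Illustrative scores, not real data.
example = {"completeness": 3, "consistency": 2, "lineage": 1, "governance": 3}
print(total_score(example), recommendation(example))
```

With these illustrative scores the total is 9, one point short of the threshold, so the sketch returns a stop.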

What each dimension measures

Completeness asks whether the data covers the population the model needs to serve. Missing records are not a minor inconvenience. A 2021 systematic literature review on data quality and machine learning found that missing and noisy data degrade model performance at deployment, not during testing. If your training data covers 60% of your customer base, the model will perform well on that segment and poorly on the rest. You will not know which customers fall into which group until after launch.
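One way to surface that gap before launch is to compare the list of customers the model must serve against the list present in the training data. The sketch below assumes both lists exist and share a customer ID, which may not hold in every organization; the function name coverage and the sample IDs are hypothetical.

```python
def coverage(population_ids, training_ids):
    """Return the share of the population found in the training data,
    plus the IDs that are missing from it."""
    population = set(population_ids)
    if not population:
        raise ValueError("population is empty")
    covered = population & set(training_ids)
    return len(covered) / len(population), sorted(population - covered)


# Illustrative IDs: ten customers to serve, six present in training data.
population_ids = range(1, 11)
training_ids = range(1, 7)

ratio, uncovered = coverage(population_ids, training_ids)
print(f"coverage: {ratio:.0%}, uncovered customers: {uncovered}")
```

The list of uncovered IDs matters more than the percentage: it names the customers the model has never seen.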

Consistency asks whether the same fact appears the same way across your systems. A customer's address stored differently in your CRM and your billing system is not a formatting problem. It is a signal your data has no single source of truth, and the model will learn whichever version appears most often.
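A rough consistency check can be sketched the same way. The example assumes each system can export a mapping from customer ID to address; the names normalize and find_mismatches are hypothetical, and the normalization (lowercasing and collapsing whitespace) is an assumption about what counts as a formatting difference rather than a real disagreement.

```python
def normalize(value):
    """Lowercase and collapse whitespace so pure formatting differences are ignored."""
    return " ".join(value.lower().split())


def find_mismatches(crm, billing):
    """Return IDs present in both systems whose normalized values disagree."""
    shared = crm.keys() & billing.keys()
    return sorted(k for k in shared if normalize(crm[k]) != normalize(billing[k]))


# Illustrative records, not real data.
crm = {"c1": "12 Main St", "c2": "4 Oak Ave", "c3": "9 Elm Rd"}
billing = {"c1": "12  main st", "c2": "40 Oak Ave", "c4": "1 Pine Ln"}

print(find_mismatches(crm, billing))
```

Here c1 differs only in formatting and is ignored, while c2 carries two different addresses and is flagged. Customers that appear in only one system (c3, c4) are skipped by this check; they belong under completeness.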

Lineage asks whether you know where each data field came from and what transformed it. Sculley and colleagues showed in their 2015 paper on technical debt in machine learning systems that data pipeline problems accumulate after training and create failure points that even skilled teams cannot debug without a clear record of how data moved. Without lineage, you cannot audit a model's output or fix it when it drifts.

Governance asks who owns the data and who approves changes to it. Gartner, cited by Informatica in 2025, predicts at least 30% of GenAI projects will be abandoned after proof of concept by end of 2025, with data quality and controls among the named causes. Governance is the control layer. Without it, completeness scores decay the moment the project moves from pilot to production.

Where this scorecard does not protect you

RAND 2024 is worth reading carefully here. The same study that names data as a root cause also names people and project framing as independent failure causes. Even a project scoring 16 out of 16 on data dimensions fails when the team cannot interpret the model's output. It also fails when the team defines the problem incorrectly from the start. The absence of a senior leader who owns the decision to act compounds both.

The MIT Sloan and Fortune 2025 report on GenAI pilots adds a distinct failure mode it calls a learning gap: firms lack the internal knowledge to adapt AI tools to their actual work. This is not a data quality problem. It is an organizational readiness problem, and a data scorecard does not touch it.

Score your data first because clean data is a necessary condition for success, not a sufficient one. A project with clean data and a skilled team can still fail. Dirty data guarantees failure every time, regardless of how skilled the team is. The scorecard tells you whether you have cleared the floor, not whether you have cleared the ceiling.

Running the assessment before the next board review

Ask your data team to score each of the four dimensions against the one-to-four scale above. If they cannot score lineage at all because no documentation exists, that is a score of one by definition. Aggregate the scores. Bring the total to your next AI investment conversation alongside the model proposal. Ajay Agrawal at the University of Toronto and Robert Seamans at NYU Stern have both written on how AI value depends on matching tools to the data and tasks an organization already has. Run this scorecard as a match test before you sign the contract.