Start with the data you already have
There is a familiar opening to AI programs: before anything can happen, we must build the data foundation. A platform, a lake, a year of plumbing — and only then, the value. It is a comfortable story, because it postpones the hard question of whether the idea works at all.
Most useful first projects do not need that foundation. The data you already have (the records in the system you run today, the documents your people already write) is usually enough to show whether an idea is worth pursuing.
Proof first, platform later
A working prototype on imperfect, existing data tells you something a pristine platform cannot: whether the thing is worth building. Prove the value on what is at hand, and the investment case for better data infrastructure writes itself, grounded in a result rather than a hope.
Reverse the order and you risk spending a year and a large budget on a foundation for a building no one has confirmed they want. The platform becomes the project, and the value it was meant to enable quietly recedes.
Let the use case pull the data
When you start from a concrete use case, you learn which data actually matters, and it is almost always far less than a general-purpose platform would have you gather. The need pulls the investment, instead of the investment hunting for a need.
Clean data is worth having. It is just rarely the right place to start, because the cleaning never ends and the value keeps waiting. Begin with what you have, prove something real, and let the result justify the next layer.