AI cannot magically interpret data out of thin air, but it can handle messy inputs surprisingly well. It can work with incomplete context, inconsistent formats, and even partially incorrect data in ways that older approaches simply could not.
My previous mental model was that AI could only match what a human could infer from the same dataset. In other words, if you could fully document a human expert’s tacit knowledge, AI might reach that level. I am gradually revising that assumption.
General-purpose AI brings broad contextual knowledge that often goes beyond a single human’s expertise. For example, terminology differences between industries can confuse people who are deeply specialized in one domain. AI, on the other hand, does not really care about those boundaries. It can often infer meaning across contexts without getting stuck on unfamiliar phrasing.
Then there is patience. If guided properly, AI does not get tired or lose focus. Where a human might start skimming by page five, AI will stay just as attentive on page five hundred. That alone can reduce missed details.
It also performs well in areas where humans rely on educated guesswork. Interpreting misspelled names, identifying misplaced information, or reconstructing intent from imperfect data are all tasks where AI can be surprisingly effective.
So, my updated view is this: with proper instructions, AI can often perform at least as well as a human when working with messy data, and sometimes even better.
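What "proper instructions" might look like can be sketched as a prompt template. Everything below is hypothetical, including the `build_cleanup_prompt` helper, the record fields, and the canonical name list; the point is only that the instructions spell out the guesswork a human would otherwise do silently.

```python
import json

def build_cleanup_prompt(record: dict, known_names: list[str]) -> str:
    """Build instructions for an LLM to repair one messy record.

    The field names and the canonical name list are illustrative. The
    prompt names the tolerated imperfections (typos, misplaced values,
    missing entries) and the expected output format explicitly.
    """
    return (
        "You will receive one customer record as JSON. It may contain "
        "misspelled names, values placed in the wrong field, or missing "
        "entries.\n"
        f"Known canonical names: {', '.join(known_names)}.\n"
        "Return the corrected record as JSON with the same keys, and add "
        'a "confidence" key ("high", "medium", or "low") so uncertain '
        "fixes can be routed to a human.\n\n"
        f"Record: {json.dumps(record)}"
    )

# Example: a typo in the name and a phone number misplaced into "city".
messy = {"name": "Jonh Smiht", "city": "+1 555 0100", "phone": ""}
prompt = build_cleanup_prompt(messy, ["John Smith", "Jane Doe"])
```

The template itself is trivial; the leverage is in naming the failure modes up front and demanding a confidence signal, so the model's educated guesswork stays auditable.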
That raises an interesting question. If AI can now work with the same imperfect data that humans already manage, do we really need to prioritize cleaning that data before trying to automate?
Improving data quality is still valuable, especially when it is easy to do. But it may no longer need to be the automatic first step. It might be more efficient to first test whether the data actually limits the outcome.
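One way to test whether the data actually limits the outcome is a small gate that runs the same pipeline on a raw sample and a hand-cleaned sample before committing to a cleanup project. The helper and toy scorer below are a sketch of that idea under assumed names, not a prescribed method.

```python
from typing import Callable, Sequence

def cleaning_pays_off(
    score_fn: Callable[[Sequence], float],
    raw_sample: Sequence,
    cleaned_sample: Sequence,
    margin: float = 0.02,
) -> bool:
    """Return True if cleaning improves the pipeline's score by more
    than `margin`, i.e. the data, not the pipeline, is the bottleneck.
    """
    return score_fn(cleaned_sample) - score_fn(raw_sample) > margin

# Toy scorer: fraction of records the pipeline can parse at all.
def parse_rate(records: Sequence) -> float:
    return sum(1 for r in records if r is not None) / len(records)

raw = ["a", None, "b", None]      # 50% parse rate
cleaned = ["a", "A?", "b", "B?"]  # 100% parse rate
print(cleaning_pays_off(parse_rate, raw, cleaned))  # → True: cleaning helps here
```

If the gate comes back False on a representative sample, cleanup effort can be deferred; if True, the measured gap also gives a rough sense of how much that effort is worth.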