Reality is the Ultimate Context for AI
When I’m on stage in front of an audience, I hear a lot of the same questions. And they’re very often the right ones to be asking.
Here’s a biggie: “Mark, when you talk about causality and AI, you keep talking about Reality. What exactly do you mean by ‘reality’?”
The best definition I’ve come up with — one most people recognize immediately — is this:
Reality is what you bump into when you are wrong.
That may sound philosophical, but it has very practical implications for how organizations think about data science.
In contemporary data science, context is often defined much more narrowly. Context typically means the information contained within a particular dataset: variables, features, metadata, and the statistical relationships among them.
In other words, context is usually treated as what is visible inside the data.
That approach works well for certain tasks. If your goal is prediction — forecasting demand next week, classifying images, detecting anomalies — machine learning systems can perform extraordinarily well using only the patterns contained in the data itself.
But prediction is not the same thing as explanation, and it is certainly not the same thing as intervention.
The moment a business asks a different kind of question — What would happen if we changed something? — the definition of context has to expand dramatically.
Because the real drivers of outcomes often exist outside the dataset:
• competitor behavior
• economic conditions
• organizational decisions
• time lags between actions and results
• structural relationships between parts of the system
These elements form the causal structure of reality. And that structure is usually only partially visible in historical data.
This is where the limitations of purely correlative approaches begin to show.
Correlation tells us which variables tend to move together within a dataset. But it cannot reliably tell us what will happen if we change the system itself. For that, we need models that represent how the system actually works — models that incorporate domain knowledge, assumptions about mechanisms, and explicit causal relationships.
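The gap between the two can be made concrete with a toy simulation. The sketch below is purely illustrative: it assumes a hypothetical linear system in which a hidden driver (think of unobserved market demand) raises both ad spend and sales. Regressing on historical data conflates the two pathways; fixing ad spend ourselves (a do-intervention) recovers the true effect of the change. All variable names and coefficients are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical mechanism: a hidden driver raises both ad spend and sales.
demand = rng.normal(size=n)                    # unobserved driver
ad_spend = 0.8 * demand + rng.normal(size=n)   # company reacts to demand
sales = 0.5 * ad_spend + 2.0 * demand + rng.normal(size=n)

# Correlational estimate: regress sales on ad spend in historical data.
observed_slope = np.cov(ad_spend, sales)[0, 1] / np.var(ad_spend)

# Interventional estimate: set ad spend ourselves, severing its link to
# demand, then re-run the same mechanism for sales.
ad_fixed = rng.normal(size=n)                  # our choice, not demand's
sales_do = 0.5 * ad_fixed + 2.0 * demand + rng.normal(size=n)
causal_slope = np.cov(ad_fixed, sales_do)[0, 1] / np.var(ad_fixed)

print(round(observed_slope, 2))  # ≈ 1.48: inflated by the hidden driver
print(round(causal_slope, 2))    # ≈ 0.50: the actual effect of changing spend
```

The historical regression answers “what did sales look like when ad spend was high?” — a prediction question. Only the intervention answers “what happens to sales if we raise ad spend?”, and getting that right required writing down the mechanism, not just the data.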
This is why causal inference has been gaining so much attention recently across economics and epidemiology, and increasingly in business.
It recognizes something simple but profound:
Data is a record of what happened. Reality is the system that produced it.
And the system is always larger than the data we happen to have collected.
When companies rely exclusively on correlations extracted from historical data, they are effectively letting the dataset define the boundaries of context. But the real world doesn’t respect those boundaries.
Eventually, decisions based on incomplete context collide with reality.
And that’s when we “bump into it.”
For business leaders, this is becoming one of the most important shifts in analytics and AI: moving from systems that recognize patterns in data to systems that reason about how the world actually works.
Because in the end, the most important context for any model isn’t the dataset.
It’s reality itself.