Why does AI suck at Excel?
Lately, I've been playing around with using Copilot and Claude to build models in Excel.
As I've been playing*, I've noticed something:
Using AI can really speed up some building.
Some of my favorite prompts so far:
"Can you make the model formatting prettier?"
"Can you review this whole model and tell me where there may be mistakes?"
"I want to use this model in a demo. Can you sanitize it of any identifying data?"
"I'd like this revenue forecast to be based on a marketing funnel. Can you build me a marketing sales funnel tab and use it to calculate new customers each month?"
The power is incredible. I could save so much time!
But there's a huge downside:
I can't trust the output.
Some examples of mistakes I've discovered (which don't include the ones I'm still unaware of):
It sometimes uses insane formulas that are heavy, really hard to understand, and overbuilt. E.g., a million nested SUMPRODUCT( calls when a single COUNTA( would have done it.
It uses formulas that happen to calculate the right output, but might not if I change the inputs.
It uses inconsistent formulas (not copy/paste-compatible), making auditing harder and making the model more likely to break with edits.
It puts the wrong data in the wrong spot (randomly putting Average Revenue per User in the Net Revenue cells).
It quietly builds new functionality that breaks prior functionality.
It randomly hardcodes outputs.
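Two of these mistakes (inconsistent formulas and hardcoded outputs) are mechanical enough that a script could flag them. Here's a rough sketch in Python of what that check might look like. The ROW data is made up for illustration, and the reference parsing is deliberately naive (it ignores sheet names, ranges, and function names like LOG10), so treat it as an idea, not a finished audit tool.

```python
import re
from collections import Counter

# Hypothetical extract of one model row: cell address -> cell contents as text.
# These values are invented for illustration, not taken from a real workbook.
ROW = {
    "B5": "=B3*B4",
    "C5": "=C3*C4",
    "D5": "=D3*$B$4",  # breaks the pattern used by its neighbors
    "E5": "48250",     # a typed-in number where a formula is expected
    "F5": "=F3*F4",
}

# Naive A1-style reference matcher: optional $, column letters, optional $, row.
CELL_REF = re.compile(r"(\$?)([A-Z]{1,3})(\$?)(\d+)")


def col_to_num(col):
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def relative_form(addr, formula):
    """Rewrite references as offsets from the host cell (R1C1-like).

    Formulas that are copy/paste-equivalent end up as identical strings.
    """
    m = re.fullmatch(r"([A-Z]{1,3})(\d+)", addr)
    host_col, host_row = col_to_num(m.group(1)), int(m.group(2))

    def repl(ref):
        col_abs, col, row_abs, row = ref.groups()
        c, r = col_to_num(col), int(row)
        row_part = f"R{r}" if row_abs else f"R[{r - host_row}]"
        col_part = f"C{c}" if col_abs else f"C[{c - host_col}]"
        return row_part + col_part

    return CELL_REF.sub(repl, formula)


def audit_row(row):
    """Flag cells with no formula, and formulas that differ from the majority."""
    formulas = {a: relative_form(a, f) for a, f in row.items() if f.startswith("=")}
    hardcoded = sorted(a for a, f in row.items() if not f.startswith("="))
    if not formulas:
        return {"hardcoded": hardcoded, "inconsistent": []}
    majority, _ = Counter(formulas.values()).most_common(1)[0]
    inconsistent = sorted(a for a, rel in formulas.items() if rel != majority)
    return {"hardcoded": hardcoded, "inconsistent": inconsistent}


report = audit_row(ROW)
print(report)  # {'hardcoded': ['E5'], 'inconsistent': ['D5']}
```

The trick is comparing formulas by their relative offsets instead of their literal text, which is the same thing you do by eye when you check that a row was filled right with one formula.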
All in all, this means that the tool is powerful but also untrustworthy. A really frustrating combination. And it means we all have to continue to use it with caution.
This could honestly be the end of this post. Because knowing what things AI struggles with is helpful for beginning to use it carefully.
But the nerd in me isn't satisfied. I want to know: WHY?
AI is famously great at coding. We've seen the articles and heard the podcasts on how vibe-coding is taking over the world.
If it were as bad at coding as it is at Excel, we'd be seeing a lot more carnage.
So what the heck is going on?
I asked one of my favorite engineers and got a great answer that has completely hijacked my brain this week.
He shared this slightly intimidating but super interesting article on the difference between Shannon entropy and Kolmogorov complexity.
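Essentially, AI models are great at prediction (the statistical, Shannon side), but struggle when the job is to find the short underlying rule that generates the data (the Kolmogorov side).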
They're extremely good at pattern-matching. They're not good at understanding the mechanism underneath or inventing good logic from scratch.
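To make the distinction a little more concrete, here's a small Python sketch (my own illustration, not something from the article). It builds a long string of digits from a tiny rule, then measures it statistically. The generator, its constants, and the use of zlib as a stand-in are all assumptions for the demo. True Kolmogorov complexity can't be computed, so compressed size is only a crude upper bound on it.

```python
import math
import zlib
from collections import Counter


def lcg_digits(n, seed=42):
    """A tiny rule (a linear congruential generator) that emits n digits.

    The constants are a common textbook choice, used here only for illustration.
    """
    state = seed
    out = []
    for _ in range(n):
        state = (1103515245 * state + 12345) % (2 ** 31)
        out.append(str((state >> 16) % 10))
    return "".join(out)


def entropy_bits_per_symbol(text):
    """Empirical Shannon entropy of the symbol frequencies in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


data = lcg_digits(10_000)
h = entropy_bits_per_symbol(data)
compressed_size = len(zlib.compress(data.encode("ascii"), 9))

print(f"entropy per digit: {h:.3f} bits (max possible is {math.log2(10):.3f})")
print(f"zlib-compressed size: {compressed_size} bytes")
print("yet the rule that produced it all is the few lines in lcg_digits")
```

Statistically, the digits look close to random: the entropy sits near the maximum for ten symbols, and a general-purpose compressor still needs thousands of bytes. But the shortest description of the data is the little function that made it. Spotting that the surface statistics look a certain way is the prediction skill. Recovering the rule underneath is the harder, different problem.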
Extrapolating a bit, you can see why coding is so easy for AI, but why Excel is not:
In code:
There's a mountain of open-source code to learn from. And lots of functions look like other functions.
Coding is modular by design. Small chunks, built and tested piece by piece.
There are tests! Write something wrong and a compiler error or a failing test will usually stop you.
Below the code is a layer of encoded, predictable behavior. Every layer of abstraction sits on standardized, predictable layers beneath it.
But Excel is a sh*t show:
There are barely any norms or built-in abstractions to create predictable patterns. You build all your raw formulas from scratch every time.
There's no modularity. It's one giant grid of interconnected cells.
There are no standard tests. A wrong formula just returns a number. The number looks fine.
Dependencies aren't transparent. You can't see what feeds what without tracing.
It's positional and visual. The meaning lives in where a cell is, not just what it contains.
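For contrast, here's roughly what the "there are tests" side looks like in code. This is a minimal Python sketch of a marketing funnel calculation with a few checks attached. The function name, the rates, and the numbers are all hypothetical, picked only to show the shape of the idea.

```python
import math


def new_customers(visitors, signup_rate, paid_rate):
    """Hypothetical funnel: visitors -> signups -> paying customers."""
    for rate in (signup_rate, paid_rate):
        if not 0 <= rate <= 1:
            raise ValueError("rates must be between 0 and 1")
    return visitors * signup_rate * paid_rate


def check_funnel():
    """Checks a spreadsheet cell never gets: each one fails loudly if broken."""
    base = new_customers(10_000, 0.05, 0.20)

    # Known case worked by hand: 10,000 visitors, 5% sign up, 20% of those pay.
    assert math.isclose(base, 100.0)

    # Doubling an input must double the output (catches a hardcoded result).
    assert math.isclose(new_customers(20_000, 0.05, 0.20), 2 * base)

    # Sanity bound: paying customers can never exceed visitors.
    assert new_customers(10_000, 1.0, 1.0) <= 10_000

    # Nonsense inputs are rejected instead of quietly returning a number.
    try:
        new_customers(10_000, 1.5, 0.20)
    except ValueError:
        pass
    else:
        raise AssertionError("expected a ValueError for a rate above 1")
    return True


print("funnel checks passed:", check_funnel())
```

In a spreadsheet, the equivalent of every one of these checks is a person squinting at a cell. In code, they run every time something changes, which is a big part of why a wrong answer gets caught instead of just looking fine.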
So what's the solution?
For me, I think it's this:
I'm excited for the possibility that this will get better, but prepared for the reality that it might not.
If we want AI's help in finance, I think the best path is having it help us build financial software with defined requirements.
And in the meantime, AI in Excel is a powerful but limited tool that should be wielded and audited carefully.
*When I say playing, I really mean playing, by the way. For the most part, I'm not using it on client work (not without getting explicit permission). I'm using it only for playing around with templates and examples to share in class.
The main reason: The data privacy, retention, and security parameters aren't tuned well for working with confidential or sensitive data yet.
(But that's the topic of a different post to come...)