When AI Proves Theorems: Is Being Right Enough?


Or why discernment remains the most human skill — and the most precious one to pass on to our children.

An idea that is no longer science fiction

Imagine an artificial intelligence that no longer just answers our questions, but actively extends the frontier of human knowledge. In mathematics, this scenario is becoming a reality.

The secret? Tools called proof assistants, one of the best known of which is Lean. A proof assistant is a piece of software that checks a mathematical argument line by line, with absolute rigor. No step can be skipped, no approximation is tolerated. If a proof passes verification, it is logically correct: every step follows from the system's axioms and the rules of logic.
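
To give a sense of what "line by line" means, here is a minimal sketch in Lean 4 (core language only, no Mathlib). The theorem name `and_swap_example` is made up for this illustration. Each line of the proof is a step that Lean checks; remove either of the last two lines and Lean refuses the proof.

```lean
-- Claim: if p and q both hold, then q and p both hold.
theorem and_swap_example (p q : Prop) (h : p ∧ q) : q ∧ p := by
  constructor      -- split the goal "q ∧ p" into two goals: q, then p
  · exact h.2      -- first goal: q is the second half of h
  · exact h.1      -- second goal: p is the first half of h
```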

Around Lean, a vast collaborative library called Mathlib has grown, gathering well over a hundred thousand formalized theorems. Think of it as the Wikipedia of verified mathematics: every brick has been machine-checked, and each one can serve as a foundation for building the next.
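
The "brick on brick" idea can be sketched in a few lines of Lean 4. This is only an illustration: it reuses `Nat.add_comm`, a lemma from Lean's core library rather than from Mathlib, so that it runs without installing anything extra, and the name `swap_sum_example` is hypothetical.

```lean
-- Already verified brick: Nat.add_comm says n + m = m + n.
-- New brick: a slightly bigger statement, proved by reusing it twice.
theorem swap_sum_example (a b c : Nat) : a + b + c = c + (b + a) := by
  rw [Nat.add_comm a b, Nat.add_comm (b + a) c]
```

Once Lean accepts `swap_sum_example`, it can be cited in later proofs in exactly the same way.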

Here is where AI enters the picture: an AI model can now propose a theorem, write out its formal proof, and submit the whole thing to Lean. If verification passes, we obtain a new mathematical truth, certified by the machine. Not a plausible hallucination, not a "probably correct" answer — a proof, in the strictest sense of the word.

Recent progress shows that this capability is no longer theoretical. In 2024, Google DeepMind's AlphaProof, which writes its proofs in Lean, together with AlphaGeometry 2 solved four of the six problems of the International Mathematical Olympiad, a silver-medal-level score. We are entering an era where machines can genuinely contribute to the production of mathematical knowledge.

And yet, an uncomfortable question

This is where things get really interesting.

Is producing something true enough?

Because here is a secret every mathematician knows: there are infinitely many theorems that are true... and perfectly uninteresting. You can prove that 7,845,213 + 1 is even. It's true. It's verifiable. And it teaches no one anything.
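
For illustration, here is roughly how such a statement looks in Lean 4 (a sketch: the name `boring_but_true` is invented, and "even" is written as "remainder 0 when divided by 2" to avoid needing Mathlib). Lean accepts it, which is exactly the point: the check says nothing about whether the theorem was worth stating.

```lean
-- 7,845,213 + 1 leaves remainder 0 when divided by 2, i.e. it is even.
-- `decide` asks Lean to settle the claim by direct computation.
theorem boring_but_true : (7845213 + 1) % 2 = 0 := by decide
```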

An AI capable of generating thousands of valid theorems per day would essentially be producing noise. Truths, yes — but truths that illuminate nothing, connect nothing, open no doors.

Great mathematicians are not distinguished primarily by their ability to prove things. They are distinguished by their intuition for what deserves to be proven. Henri Poincaré spoke of a sense of mathematical elegance: that almost aesthetic ability to recognize the beauty of an idea before it has even been proven, to feel that one path leads somewhere while another goes nowhere.

In mathematics, as everywhere else, validity is not enough. You also need meaning, interest, beauty, and discernment.

The real challenge for AI may not be the one we think

We debate a great deal about whether AI can generate new ideas. But perhaps the deeper debate lies elsewhere: can an AI learn to recognize which ideas are worth exploring?

Taste, judgment, a sense of relevance — these qualities are built through lived experience of the world, through history, culture, failures and moments of wonder. Can they be encoded? Can they be learned from data? The question remains open, and it is a fascinating one.

What is certain is this: in a world where producing valid content becomes trivial — code that compiles, well-formed text, verified theorems — value shifts. It shifts toward those who know how to choose, prioritize, and discern.

What this means for our children

This is where a seemingly abstract reflection connects directly to my daily practice at Codeacademy123.

When an 8-year-old uses an AI tool to generate code, the question is no longer "does the code work?" — the tool handles that better and better. The real questions become: is this project interesting? Is this solution elegant? Is this idea worth spending the afternoon on?

Concretely, in our workshops, this looks like:

We ask children to articulate their vision before producing anything: what do you want to create, and why does it interest you?
We compare several solutions to the same problem and discuss which one is the clearest, simplest, most beautiful — not just which one works.
We learn to critique what the AI produces: what the machine suggests is a starting point, never a verdict.

Technique can be taught. Tools evolve — Scratch today, something else tomorrow. But judgment, taste, discernment: these are cultivated slowly, through practice, discussion, and real choices. And they cannot be delegated to a machine.

By way of a (provisional) conclusion

Perhaps one day an AI will develop something resembling Poincaré's mathematical "taste." Perhaps not. But in the meantime, one thing is certain: raising children who can discern what has value — in ideas, in code, in what they create — has never been more important.

This has been the wager of Codeacademy123 since 2015: forming creators, not consumers. And in the age of AI, creating begins with knowing how to choose.

What about you — do you believe a machine could one day develop a sense for beautiful ideas? The discussion is open — in the comments, or at one of our upcoming workshops.
