AI-assisted solutions to Erdos problems: How amateurs did it

21 Jan 2026

AI-assisted solutions to Erdos problems enable amateurs to find and verify proofs, sparking research.

AI-assisted solutions to Erdos problems are moving from novelty to results. Amateurs used chatbots to draft proofs and proof-checkers to verify them, rediscovering several results and producing at least one new solution. This shift hints at faster, broader math research, even if today’s models still tackle the simpler end of Erdos’s legacy.

Mathematicians did not expect hobbyists to push classic number theory and combinatorics forward with consumer AI. Yet over recent months, chatbots have helped people spot relevant papers, outline proofs, and even pass computer checks. The wins sit at the easier end of the field, but they signal a real change in how math can be done.

Much of the action centers on problems posed by Paul Erdos. They are short to state but hard to crack. That makes them ideal tests for whether AI can connect ideas across the literature and suggest viable proof paths. Researchers report that, since October, chatbots have become far better at finding and using genuine sources rather than inventing them.

AI-assisted solutions to Erdos problems: what just happened

From bite-size challenges to serious progress

Erdos left more than 1000 open questions across many branches of math. A UK mathematician, Thomas Bloom, curates a public list and tracks progress. Because many problems fit in a single paragraph, they are easy to paste into AI tools. That simplicity has allowed rapid experimentation and quick feedback.

An amateur–student pairing, a chatbot, and a proof checker

Cambridge undergraduate Kevin Barreto and amateur mathematician Liam Price looked for under-studied entries on the Erdos list. They asked a premium chatbot to draft an argument for problem 728, a number theory conjecture. The model returned a plausible proof outline. The pair then sent the human-readable proof to a second tool, Aristotle by Harmonic, which translated it into Lean so a machine could check it. Their pipeline shows a repeatable pattern:

Find a concise, precise statement (an Erdos-style conjecture helps).
Ask a chatbot to propose approaches, cite sources, and outline a proof.
Cross-check references and tighten steps by hand.
Translate the proof into Lean with a tool like Aristotle.
Run a formal verification to confirm every step is correct.

What counts as new?

By mid-January, AI-backed efforts had produced full solutions to six Erdos problems. Five matched results that already existed in the literature. One, number 205, appears to be a new result from Barreto and Price. In addition, small improvements and partial results were logged for seven more problems. Even when the answer was “known,” the route there often used papers that did not mention Erdos by name, which many humans had not linked to the problems.

Are these solutions really new?

This question sits at the heart of the current debate. Critics note that rediscovery is not the same as invention. Supporters argue that finding the right paper, recasting the problem, and stitching ideas together is valuable scholarship, especially at speed.

Thomas Bloom says earlier chatbots often hallucinated citations. Around October, he noticed a shift: models began surfacing real, useful papers and combining them in nontrivial ways.
Kevin Buzzard calls the progress “green shoots.” Most successes are on accessible problems, so professionals are not alarmed. But the direction is positive.
Kevin Barreto warns against hype. Prize problems remain out of reach for now. Once the low-hanging fruit is gone, stronger models will be needed.

How the workflow could change math

Faster cross-pollination

Most mathematicians specialize. That limits the set of tools they can apply. With a chatbot, a researcher can request methods from adjacent fields in seconds, then ask for concrete lemmas and references. This speeds up exploration and helps people jump across areas they do not know well.

Formal verification as a guardrail

Turning prose into Lean and checking it by machine reduces the burden on human referees. It also forces proofs to be explicit. For AI-generated or AI-edited arguments, this extra layer makes the difference between a clever sketch and a certified result.
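As a toy illustration of what "explicit" means in practice, here is a minimal Lean 4 snippet. It is not taken from any of the Erdos proofs discussed here, and the theorem name `toy_add_comm` is made up for this example. It only shows that the checker accepts a statement once every object is typed and every claim is justified by a named lemma or by computation.

```lean
-- Toy example, not from the Erdos proofs: addition on Nat is commutative.
-- `Nat.add_comm` is a lemma from Lean 4's core library.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, accepted because both sides compute to 4.
example : 2 + 2 = 4 := rfl
```

A prose step such as "clearly the sum does not depend on the order" has to become a line like the one above before the machine will accept it, which is where hidden gaps tend to surface.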

Toward large-scale, empirical math

Terence Tao suggests a future where researchers run many attempts across hundreds of problems, compare methods, and gather statistics on what works. This is rare today because expert time is scarce. If AI can do the grunt work—drafting, searching, and testing—then humans can focus on judging which paths are most promising.
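A rough sketch of what "gathering statistics on what works" could look like in code, assuming each attempt is logged with the method tried and whether the formal check passed. The `Attempt` record, its field names, the method labels, and the sample data are all hypothetical and do not describe real results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One logged attempt; every field name here is hypothetical."""
    problem_id: int
    method: str      # label for the approach tried, e.g. "sieve"
    verified: bool   # True if the formal check passed


def success_rate_by_method(attempts):
    """Return {method: fraction of attempts that passed verification}."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for attempt in attempts:
        totals[attempt.method] += 1
        if attempt.verified:
            passed[attempt.method] += 1
    return {method: passed[method] / totals[method] for method in totals}


# Made-up records purely for illustration, not real outcomes.
sample = [
    Attempt(1, "sieve", True),
    Attempt(2, "sieve", False),
    Attempt(3, "probabilistic", True),
    Attempt(4, "probabilistic", True),
]
rates = success_rate_by_method(sample)
print(rates)
```

With enough logged attempts, a table like `rates` is the kind of evidence that could guide which approaches deserve scarce expert attention.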

How to try this responsibly

Build a simple, reliable pipeline

Pick clear problem statements with precise definitions and citations.
Prompt the chatbot to propose multiple strategies and to justify each step.
Verify every source; ask the model to quote exact theorems and page numbers.
Use a code assistant or a tool like Aristotle to translate into Lean.
Iterate until the formal proof checks; document each fix.
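The last step of this list, iterating until the proof checks while documenting each fix, can be sketched as a small loop. This is a minimal sketch under stated assumptions: `check` and `revise` are hypothetical stand-ins for a proof checker and a chatbot, and the stubs below only imitate them. They rely on the fact that `sorry` is Lean's placeholder for an unfinished proof step.

```python
from typing import Callable, List, Tuple


def iterate_until_checked(
    draft: str,
    check: Callable[[str], Tuple[bool, str]],
    revise: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Tuple[bool, str, List[str]]:
    """Repeat check -> revise until the checker accepts or rounds run out.

    `check` returns (ok, error_message); `revise` returns a new draft.
    Both are hypothetical stand-ins, not real tool APIs.
    """
    log: List[str] = []
    for round_no in range(1, max_rounds + 1):
        ok, error = check(draft)
        if ok:
            log.append(f"round {round_no}: accepted")
            return True, draft, log
        # Record the failure before revising, so every fix is documented.
        log.append(f"round {round_no}: {error}")
        draft = revise(draft, error)
    return False, draft, log


# Stubs so the sketch runs without any external tools.
def stub_check(text: str) -> Tuple[bool, str]:
    if "sorry" in text:
        return False, "unfinished step"
    return True, ""


def stub_revise(text: str, error: str) -> str:
    # Pretend the assistant fills in one unfinished step per round.
    return text.replace("sorry", "exact h", 1)


ok, final, log = iterate_until_checked("sorry; sorry", stub_check, stub_revise)
print(ok, final, log)
```

The `max_rounds` cap is a design choice: it stops the loop from running forever on a draft the assistant cannot repair, and the returned log shows a human where it got stuck.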

Watch for common pitfalls

Hallucinated references: demand exact bibliographic details and cross-check them.
Hidden gaps: require the model to expand steps that say “it follows that.”
Overfitting to easy cases: test boundary values and adversarial examples.
Misaligned definitions: ensure the model uses the same notation and conventions as the problem.
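To make the "overfitting to easy cases" pitfall concrete, here is a small sketch that searches for the first counterexample to a claim. It uses a classic cautionary example unrelated to the Erdos list: the polynomial n² + n + 41 yields primes for n = 0 through 39 but fails at n = 40, where the value is 41². The helper names are chosen for illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def first_counterexample(claim, candidates):
    """Return the first candidate for which `claim` is false, or None."""
    for n in candidates:
        if not claim(n):
            return n
    return None


def euler_claim(n: int) -> bool:
    """The tempting but false claim: n*n + n + 41 is prime for every n >= 0."""
    return is_prime(n * n + n + 41)


small = first_counterexample(euler_claim, range(0, 10))   # easy cases only
wider = first_counterexample(euler_claim, range(0, 100))  # wider sweep
print(small, wider)
```

Checking only the first ten values finds nothing wrong, while the wider sweep exposes the failure. A numerical sweep can only refute a claim, never prove it, so it complements formal verification rather than replacing it.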

What this moment means

These results do not dethrone human insight. They do show that tools can now help with real theorems, not just toy algebra. As models improve at search, reasoning, and formalization, we should expect more rediscoveries, and more new links across fields. The winners will be the teams that combine clear prompts, careful reading, and strong verification.

In short, AI-assisted solutions to Erdos problems are early but meaningful. They help amateurs contribute, free experts to survey broader ground, and encourage formal, testable workflows. The pace will depend on better models and better habits, but the direction is set, and it points to a more open, empirical style of discovery.

(Source: https://www.newscientist.com/article/2511954-amateur-mathematicians-solve-long-standing-maths-problems-with-ai/)


FAQ

Q: What are the recent developments in AI-assisted solutions to Erdos problems?
A: Recently, amateurs and students have used chatbots to draft proofs and formal proof-checkers to verify them, moving AI-assisted solutions to Erdos problems from novelty toward concrete results. As of mid-January, AI-backed efforts produced full solutions to six Erdos problems, five of which matched existing literature while one (number 205) appears to be new.

Q: Who were the main people involved in these amateur AI-driven proofs?
A: Amateurs and a Cambridge undergraduate, notably Kevin Barreto and Liam Price, led experiments using chatbots, while Thomas Bloom tracked progress and publicised the developments, and professionals such as Kevin Buzzard and Terence Tao have commented on and helped validate some outcomes. Barreto and Price applied a premium chatbot and formal verification tools in their successful pipeline.

Q: Which AI tools and verification systems were used to produce and check these proofs?
A: The reported workflow used a premium chatbot (ChatGPT-5.2 Pro) to draft arguments and Aristotle, a tool from Harmonic, to translate prose into Lean so a computer could check the proof. Researchers combined these tools with manual cross-checking of references and steps before formal verification.

Q: What is the typical pipeline amateurs used to turn a chatbot idea into a checked proof?
A: Practitioners described a repeatable pipeline: pick a concise Erdos-style statement, ask a chatbot for approaches and citations, tighten steps by hand, translate the proof into Lean with a tool like Aristotle, and run formal verification. Iteration and documentation of fixes until the formal proof checks were emphasised as essential parts of the process.

Q: How many Erdos problems have been solved with AI and how many are genuinely new?
A: By mid-January, six Erdos problems had full solutions produced with AI assistance, but five of those matched previously published results and only one, problem 205, appears to be a genuinely new solution by Barreto and Price. AI tools also contributed small improvements and partial solutions to seven other problems.

Q: Are AI-generated solutions considered original research or rediscoveries?
A: There is an ongoing debate: critics note many AI outputs simply rediscover existing work, while supporters argue that surfacing relevant but unlinked papers and recasting problems is valuable scholarship. Thomas Bloom observed that models have started finding real papers and non-trivial combinations that humans had not previously connected to Erdos problems.

Q: What limitations do current AI models have when tackling harder Erdos problems?
A: Current models tend to handle the simpler, low-hanging-fruit problems and fall short on the most demanding Erdos questions, so experts view the progress as promising but not yet transformative. Practitioners warn that more capable models will be needed once the easier problems are exhausted.

Q: How can mathematicians use these AI methods responsibly and avoid common pitfalls?
A: Responsible use includes picking precise problem statements, demanding exact bibliographic details and justifications from models, manually verifying sources and expanded steps, and translating results into Lean for formal checking. Users should also test boundary cases, watch for hallucinated references, ensure consistent definitions, and document each iteration until the proof is formally verified.