Section 01 · Time Machine
The 36 Months
That Changed Everything
Click on the timeline or press the play button and watch ten years of capability gain animate before you in seconds.
Dec 2022 → Jan 2026
Dec 2022
Jun 2023
Dec 2023
Jun 2024
Dec 2024
Jan 2026
Press play or drag the scrubber to travel through AI's most compressed period of progress.
Adoption Speed
2 years
To reach hundreds of millions of users — vs 46 years for electricity to reach ¼ of US households.
Benchmark Saturation
90%+
Frontier models scoring above 90% on MMLU, MMLU-Pro, and GSM8K — tests designed for graduate-level humans.
SWE-Bench Leap
15×
More software engineering bugs solved in 2024 vs the year prior — from single digits to majority-solved.
Cost Drop
1,000×
Inference cost per million tokens fell from ~$20 to ~$0.02 in roughly three years of continuous improvement.
Section 02 · The Wall
When Machines
Crossed the Line
Watch AI benchmark curves sprint past the human expert ceiling — not over decades, but in months.
Math Olympiad · AIME Competition
2022 — GPT-3 era
~2 / 15
problems solved
↓
2025 — Frontier models
13–15 / 15
problems solved (90%+)
Education · AI Adoption in K–12 Classrooms
60%
teachers use AI
29%
students use for math
21%
math teachers use AI for planning
50%
districts now train teachers on AI
"It took speech recognition more than a decade to approach human-level performance. It took modern language models roughly 3–5 years to go from clumsy autocomplete to surpassing humans on broad exam suites. This isn't incremental improvement. This is benchmarks breaking under the speed of progress."
Section 03 · The Cost Collapse
Everything Gets
10× Cheaper, Again
The cost of inference isn't following Moore's Law. It's lapping it.
Year: 2022
$20.00
per million tokens (GPT-4 class inference)
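The claim that inference cost is "lapping" Moore's Law reduces to a few lines of arithmetic — a minimal sanity check, assuming the ~$20 (2022) and ~$0.02 (2025) per-million-token figures cited in the sources section:

```python
# Sanity-check the cost-collapse arithmetic using the figures cited
# in the sources section: ~$20/M tokens (2022) -> ~$0.02/M tokens (2025).
start_cost, end_cost, years = 20.00, 0.02, 3

total_reduction = start_cost / end_cost          # ~1,000x overall
annual_factor = total_reduction ** (1 / years)   # ~10x per year

# Moore's Law baseline: 2x every two years -> ~1.41x per year.
moore_annual = 2 ** (1 / 2)

print(f"total: {total_reduction:,.0f}x | "
      f"annual: {annual_factor:.1f}x/yr | "
      f"Moore's Law: {moore_annual:.2f}x/yr")
```

A ~10× annual factor against a ~1.41× annual Moore's Law pace is the sense in which the curve "laps" the chip curve roughly seven times over per year.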
"Steam changed our muscles. Electricity changed the environments we occupied. The microchip changed our tools. AI is different: it is changing the thinking that runs everything else. For the first time, cognition itself is getting exponentially cheaper."
Section 04 · The Shrinking Engine
Same Intelligence,
Fraction of the Size
Instead of asking "How big can we build?", what if the question is "How small can we go?"
THEN · 2022
540B
parameters
Data center required
PaLM-class model
540,000,000,000
→
Same capability.
Fraction of the size.
NOW · 2025
3.8B
parameters
Runs on your phone
Equivalent performance
3,800,000,000
Drag the slider below. In 2022, matching expert-level performance on language benchmarks required a massive 540-billion-parameter model running in a data center. By 2025, a model 140× smaller achieves the same results — and fits on your phone.
Parameter Reduction
140×
From 540B to 3.8B parameters for equivalent MMLU-class performance in just a few years.
Where It Runs
Phones
Capable AI assistants now run locally on consumer laptops and mobile devices — no data center needed.
Training Efficiency
50×
Faster improvement than Moore's Law in training costs, according to ARK Invest and Epoch AI analyses.
Historical Comparison
Decades → Years
Industrial engines improved gradually over generations. AI achieved orders-of-magnitude gains in 2–3 years.
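The headline figures above follow from two lines of arithmetic on the numbers in this section (540B in 2022, 3.8B in 2025). The memory estimates are an illustrative assumption — weights only, no runtime overhead — not a figure from the cited sources:

```python
# Parameter reduction for equivalent MMLU-class performance,
# using the figures in this section: 540B (2022) vs 3.8B (2025).
then_params = 540e9
now_params = 3.8e9
print(f"reduction: {then_params / now_params:.0f}x")  # ~142x, rounded to "140x"

# Rough weights-only memory footprint of the 3.8B model by precision
# (an illustrative assumption; real runtimes add activation/KV overhead).
for bits, label in [(16, "fp16"), (4, "4-bit quantized")]:
    gigabytes = now_params * bits / 8 / 1e9
    print(f"{label}: ~{gigabytes:.1f} GB")
```

At 4-bit precision the weights fit in roughly 2 GB — which is why "fits on your phone" is plausible for a 3.8B-parameter model.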
Section 05 · From Gadgets to Infrastructure
The Fastest
Adoption Curve Ever
Electricity, smartphones, generative AI — three S-curves, each steeper than the last.
Electricity (1880–1930)
Smartphones (2007–2020)
Generative AI (2022–2026)
20–45%
Productivity gains in software development and content creation in early real-world deployments
2 hrs
Saved per day on repetitive tasks, reported by workers using AI tools
100M+
Users reached by ChatGPT in the fastest product adoption in recorded history
2 years
To reach scale that took electricity 46 years and smartphones more than a decade to achieve
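All three adoption curves compared above are S-curves, conventionally modeled with a logistic function. A minimal sketch — the parameter values are illustrative, not fitted to the cited adoption data:

```python
import math

def adoption(t, midpoint, steepness, ceiling=1.0):
    """Logistic S-curve: fraction of eventual users reached at time t (years)."""
    return ceiling / (1 + math.exp(-steepness * (t - midpoint)))

# Steepness controls how fast the curve climbs; the midpoint is the year
# of fastest growth. These values are illustrative only.
slow = adoption(10, midpoint=25, steepness=0.15)  # electricity-like
fast = adoption(1, midpoint=1, steepness=3.0)     # generative-AI-like
print(f"slow curve at year 10: {slow:.0%} of ceiling")
print(f"fast curve at year 1:  {fast:.0%} of ceiling")
```

"Each steeper than the last" means, in these terms, a larger steepness and an earlier midpoint: the fast curve is already at half its ceiling while the slow one has barely left the floor.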
"Watch capability, cost, and adoption curves all bend — faster than any prior general-purpose technology documented in the economic or historical record."
Data Sources · All Sections
Data Highlights
& Sources
The key numbers behind every claim in this experience — including a dedicated section on AI in mathematics and education — with direct links to the primary source for each.
Capability & Benchmarks
~90%
Frontier models on MMLU-Pro — tests designed for graduate-level humans
LLM Stats · llm-stats.com/benchmarks/mmlu
↗
50%+
SWE-Bench Verified solved by o3 — majority of real-world software bugs automated
Interconnects · interconnects.ai
↗
15×
More SWE-Bench bugs solved in 2024 vs the prior year
Interconnects · interconnects.ai
↗
97%+
GSM8K math reasoning score for frontier models in 2024–25
Stanford HAI 2025 AI Index · hai.stanford.edu
↗
24%
GUI agent task completion on complex interface sequences — still early stage
ArXiv 2025 · arxiv.org/html/2602.09007v1
↗
3–5 yrs
Time for LLMs to go from clumsy autocomplete to surpassing humans on broad exam suites
BRAC AI · bracai.eu
↗
Cost & Economics
$20→$0.02
Cost per million tokens (GPT-4 class), 2022 to 2025 — a 1,000× reduction
Epoch AI · epoch.ai/data-insights
↗
~10×/yr
Annual inference cost reduction rate — vs Moore's Law at 2× every two years
Epoch AI · epoch.ai/data-insights
↗
50–100×
Cost improvement per year in peak AI workloads — far exceeding Moore's Law
The Stack · thestack.technology
↗
50×
AI training cost improvement rate vs Moore's Law — per ARK Invest analysis
ARK Invest · ark-invest.com
↗
Model Efficiency & Scale
140×
Parameter reduction for equivalent MMLU-class performance — 540B in 2022 to 3.8B in 2025
Stanford HAI 2025 AI Index · hai.stanford.edu
↗
540B→3.8B
Parameters needed for expert-level language benchmark performance, 2022 vs 2025
LifeArchitect · lifearchitect.ai/mapping
↗
On-device
Capable AI assistants now run locally on consumer laptops and phones — no data center needed
AI Multiple · aimultiple.com/llm
↗
Mathematics & Education
90%+
AI score on AIME 2025 — the elite high school math competition that qualifies students for the US Math Olympiad. Frontier models now solve 13–15 of 15 problems.
IntuitionLabs · AIME 2025 Benchmark Analysis
↗
IMO Gold
AI achieved gold-medal standard at the 2025 International Mathematical Olympiad — problems specifically designed to require creative insight, not pattern matching.
IntuitionLabs · AI Reasoning at IMO 2025
↗
<2%
AI success rate on FrontierMath — original, never-published problems created by 60+ mathematicians requiring hours to solve. This is the one wall AI hasn't crossed. Note: FrontierMath tests PhD research-level math — everything taught in K–12, including AP Calculus and competition prep, is well below this ceiling. For a high school math teacher, the more relevant number is the 90%+ AIME score above.
Epoch AI / ArXiv · FrontierMath Benchmark (2024)
↗
60%
K-12 teachers used AI tools during the 2024–25 school year (Gallup/Walton). 32% use AI at least weekly. Preparing lessons is the #1 daily use at 20%.
YSU / Gallup–Walton Family Foundation Survey 2025
↗
21%
Math teachers using AI for instructional planning — roughly half the rate of ELA and science teachers. Math teachers face unique challenges integrating AI into their practice.
Education Week / RAND Report · April 2025
↗
2×
Districts providing AI training to teachers doubled in one year — from ~25% in fall 2023 to ~50% in fall 2024. Low-poverty districts still lead higher-poverty ones.
RAND Corporation · American School District Panel 2025
↗
29%
U.S. teens use ChatGPT specifically to solve math problems — the second most common student use after research (54%). Teen AI use for schoolwork doubled from 2023 to 2024.
Pew Research Center via YSU · 2024
↗
Less bored
Students report being less bored in math class when teachers use ChatGPT to personalize lessons around student interests — with higher quality feedback and more relevant examples.
Wiley / School Science & Mathematics · April 2025
↗
≈ 1-on-1
AI tutoring systems shown to match individualized human tutoring outcomes in K-12 math — identifying gaps, personalizing paths, and providing real-time feedback at scale.
Journal of Computer Assisted Learning · 2024
↗
25%
Teachers who say AI tools do more harm than good in K-12 education. Only 6% say more benefit than harm. Math teachers cite concerns about errors, academic integrity, and loss of conceptual understanding.
Pew Research Center · May 2024
↗
Adoption & Society
2 years
For generative AI to reach hundreds of millions of users — vs 46 years for electricity to reach ¼ of US households
Our World in Data · ourworldindata.org
↗
100M+
Users reached by ChatGPT — fastest product adoption ever recorded
Our World in Data · ourworldindata.org
↗
20–45%
Productivity gains in software development and content creation in early real-world deployments
AI Multiple · aimultiple.com/llm
↗
2 hrs/day
Time saved on repetitive tasks per day, reported by workers using AI tools
St. Louis Fed · stlouisfed.org
↗
10 yrs
Estimated capability gain compressed into ~36 months of AI development, 2022–2025
TIME Magazine · time.com
↗
