
The Practitioner's Wager: Salman Khan's Brave New Words

Between the laboratory and the classroom, someone has to go first.


The Problem With Proof

There is a particular kind of intellectual cowardice that disguises itself as rigor. It says: we cannot act until the studies are complete. It says: correlation is not causation. It says: the randomized controlled trial is still running, so let’s wait.

In education, where children move through grades whether or not the research has caught up, this caution isn’t neutrality. It’s a choice. The children who fall behind while the academy deliberates pay the price.

Salman Khan knows this. Brave New Words, his account of Khan Academy’s pivot into AI-powered tutoring through a tool called Khanmigo, is part manifesto, part product chronicle, part meditation on what learning might become. But its most important quality is this: it is written by someone who has decided not to wait. Khan is not a researcher presenting findings. He is a practitioner making a wager. The distinction matters, and it is the most refreshing thing about the book.

The field of educational technology is drowning in studies. Rigorous ones, too—the efficacy literature on Khan Academy alone spans large-scale randomized controlled trials in Uttar Pradesh that found roughly half-standard-deviation gains in mathematics over 31 weeks; longitudinal panel studies in Newark showing that students who master 60 or more skills annually can triple the average state-level learning gain; and correlational data from Official SAT Practice suggesting that 20 hours on the platform yields an average 115-point score increase. This is real scaffolding for credibility. But Khan barely touches it in this book. He is interested in something else: what happens when you stop waiting for permission and just build.


A New Year’s Day and a Genie

The book opens on Khan’s kitchen counter, January 1, 2023, his eleven-year-old daughter Diya writing a story with GPT-4 about a social media influencer stranded on a desert island. The fictional character talks back. Diya types, the character responds, and Khan watches his daughter become one of the first people on earth to co-author a story with an AI in real time. It is a scene carefully chosen—not a benchmark result, not a board presentation, but a child’s delight. A parent’s goosebumps.

This is Khan’s rhetorical method throughout the book, and it is worth examining. He knows the academic literature. He cites Bloom’s two-sigma problem—the 1984 finding by educational psychologist Benjamin Bloom that one-on-one tutoring moves a student from the 50th to roughly the 98th percentile, a gain so large that no classroom model has ever replicated it at scale. He knows that this finding has haunted education policy for forty years, that every attempt to solve it through technology has fallen short, that GPT-3.5 couldn’t even follow simple instructions not to give away answers. But instead of leading with Bloom, he leads with Diya.

The research is the foundation. The story is the door.

This is not manipulation. It is honest about what kind of knowledge Khan is offering. He is not saying the studies prove Khanmigo works. He is saying: I watched this happen, I built something from it, here is what it looks like when children use it. Come to your own conclusions. But don’t pretend you don’t see what I’m seeing.


The Speed Problem

Here is the thing about formal studies in education: they take time that children do not have.

The AI landscape Khan encountered in 2022 was moving faster than any institutional research apparatus could track. OpenAI was four months from releasing ChatGPT when Sam Altman and Greg Brockman invited Khan to test GPT-4. Between that preview and the public release, the technology changed dramatically. By the time a rigorous two-year randomized controlled trial could be designed, pre-registered, executed, analyzed, and published, the model being tested would be three generations obsolete.

This is not a criticism of academic research. The Uttar Pradesh RCT—which achieved its remarkable half-standard-deviation gains specifically because of dedicated “lab-in-charges” whose sole job was to protect weekly practice time, fix internet problems, and simplify student rostering—demonstrated, precisely because of that rigor, what so many ed-tech studies miss: the organizational structure around a tool matters as much as the tool itself. That finding will outlast any specific platform. Formal research does things that practitioner iteration cannot.

But Khan is making a different argument. Not that studies don’t matter—he has funded dozens of them on Khan Academy’s existing platform—but that in a technological moment this compressed, the practitioner’s method of rapid iteration based on observed student response is not recklessness. It is the methodological stance that the situation demands.

In the summer of 2022, Khan received early GPT-4 access and spent ten hours at his computer prompting. He asked it to act as a math tutor. It did. He asked it to refuse to give answers. It refused. He asked it to explain why incorrect answers were wrong. It explained. Two months later, he gathered thirty engineers, educators, and content creators for what he called a “Hack AI Thon.” The question wasn’t should we use this? The question was what happens when we try everything we can think of? What if the AI wrote lesson plans? Ran debates? Played historical characters? Served as a guidance counselor?

This is how you discover what something is capable of. It is not the only valid method. But when the technology is moving this fast, it may be the only method that keeps pace with it.


The Bloom Promise, Operationalized

The pedagogical core of Brave New Words is not flashy. It is, in fact, quite old.

What great tutors have always done: meet students where they are, ask questions instead of delivering answers, adjust pace in real time, maintain enough context about a student’s prior knowledge to make each interaction feel continuous. Aristotle did this with Alexander the Great. Khan did it with his cousin Nadia over instant messenger in 2004. The dream of mass education has always been to replicate this at scale without the cost of one human tutor per student.

What Khanmigo attempts—and what the book’s most compelling sections demonstrate through transcripts—is a Socratic method that does not give way. When a student types “I’m having trouble with polynomials,” the AI does not explain polynomials. It asks: “Can you identify the term with the highest power of x?” When the student gets it wrong, it says: “Close, but not quite—let’s try again together.” When the student asks why they should care about algebra, the AI asks what they care about, learns they love soccer, and uses polynomial expressions to model goal-scoring. Days later, when the same student works on federalism in history, Khanmigo remembers the soccer. It is not a trick. It is memory put to pedagogical use.

Every design decision flows from the same philosophy. The AI doesn’t give answers—it asks questions. It logs all interactions transparently for parents and teachers—because accountability without surveillance is how you build trust rather than dependency. It proactively checks in on students who haven’t logged in—because Bloom’s two-sigma result came from tutors who cared, and caring means noticing absence. It handles the procedural work—grading, drafting progress reports, answering “what’s the next step?” at 11pm—that currently consumes more than half of a teacher’s working hours, freeing teachers for what classrooms actually require: human connection, motivation, the kind of mentorship that doesn’t scale.
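The behavioral contract described above—respond with a guiding question, never the answer—can be sketched in a few lines. This is a hypothetical illustration only: Khanmigo's real prompts and architecture are not public, and every name here (`SOCRATIC_PROMPT`, `socratic_reply`, `GUIDING_QUESTIONS`) is invented for the sketch, with a tiny rule table standing in for the model call.

```python
# Hypothetical sketch of the Socratic constraint, not Khanmigo's implementation.
# The contract: input is the student's message; output is always a question
# that moves the student one step forward, never the final answer.

SOCRATIC_PROMPT = (
    "You are a tutor. Never state the final answer. "
    "Respond with one guiding question that advances the student a single step."
)

# Rule-based stand-in for a model call, enough to show the shape of the contract.
GUIDING_QUESTIONS = {
    "polynomials": "Can you identify the term with the highest power of x?",
    "algebra": "What is something you care about that we could model with an equation?",
}

def socratic_reply(student_message: str) -> str:
    """Return a guiding question for the student's message, never an answer."""
    text = student_message.lower()
    for topic, question in GUIDING_QUESTIONS.items():
        if topic in text:
            return question
    return "What have you tried so far?"
```

In a real system the rule table would be a model call governed by something like `SOCRATIC_PROMPT`; the point of the sketch is only that the constraint lives in the architecture, not in the student's self-discipline.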

This is the Bloom promise, operationalized. Not proven at scale yet. But visible, right there in the transcript, for anyone willing to look.


What the Research Already Tells Us

The formal efficacy literature gives Khan’s optimism important context—both supportive and complicating.

The platform works, and works measurably, for students who actually use it. The SAT partnership data is among the cleanest evidence in ed-tech: half a million students, consistent gains across gender, race, ethnicity, and parental education, effect sizes that would be considered substantial in any clinical trial. The Newark longitudinal data is striking in a different way: students mastering 60 or more skills annually outpaced state learning averages by a factor of three to four. These are not marginal improvements. They are the kind of numbers that justify the ambition.

But most students don’t reach the threshold that produces these numbers. The “5% problem”—the finding that only about 5% of students in many implementations hit the recommended 30 minutes per week—is real, and it has an equity dimension that the book treats too lightly. Low-income students reach dosage thresholds at measurably lower rates than their more advantaged peers. The students who most need the two-sigma boost are precisely the students least likely to receive it without intentional organizational support.

The Brazil data makes the negative case most clearly. In schools where students shared computers and rotated during 50-minute sessions, platform usage didn’t just fail to help—it was associated with scores falling significantly below comparison groups. The technology became a substitute for instruction rather than a complement. This is the difference between a tool deployed thoughtfully and a tool deployed because it exists. Software without hardware—without protected time, without teacher buy-in, without someone whose job is to make it work—is not neutral. It can actively harm.

Khan knows all of this. The organizational change problem appears in the book. But the book’s energy is firmly on the tool side. Building the thing, demonstrating the thing, imagining what the thing might become. The harder question—how do you get the organizational infrastructure to every school that needs it, and who pays for that—receives less attention than it deserves.


The Cost Equation

The book’s most intellectually honest moment is also its least comfortable.

Khan acknowledges that Khanmigo is expensive to run—between $5 and $15 per user per month in current computation costs. At millions of users, this is tens of millions of dollars annually in infrastructure alone, on top of the platform’s existing $70 million operating budget. Philanthropy cannot sustain universal free access at this cost. School districts can subsidize it for their students, but this means the students in under-resourced districts—the ones for whom the two-sigma promise is most urgent—are the students least likely to have it funded.

Khan’s optimism about computation costs dropping by a factor of 100 over five to ten years may be right. The trajectory of AI infrastructure costs has historically bent downward faster than anyone predicted. But that timeline does not match the urgency of a third-grader who is already two grade levels behind today, in a district whose budget cannot absorb even a modest per-student monthly charge.

This tension lives unresolved in the book. It is the right place for it to live—unresolved, visible, honest. The vision is real. The gap between the vision and universal access is also real. Naming both is what intellectual honesty looks like.


Educated Bravery, Honestly Reckoned

Khan uses the phrase “educated bravery” throughout the book. By this he means something specific: not blind bravery, but the willingness to act in the presence of uncertainty, informed by what you have learned but not paralyzed by what you have not yet proven. It is the practitioner’s posture.

The school districts that banned ChatGPT in early 2023 were not wrong to be cautious. The concerns about academic integrity, bias, hallucination, and student data privacy are legitimate. Khan does not dismiss them. But he makes the observation—and it is correct—that the question is not whether students will encounter generative AI. They will. The question is whether the first time they encounter it, it will be designed to help them learn, or designed to help them cheat.

Khanmigo, with its Socratic constraints and teacher-facing transparency, is a bet that the answer to AI in education is not prohibition but architecture. Whether that bet pays off at scale, at equity, at the pace the technology demands—that is what the next decade of research will determine.

What Brave New Words offers is something the research cannot yet give: a practitioner’s testimony. Here is what I saw. Here is what I built. Here is a girl named Diya writing a story that talks back to her. Here is what it looked like when she learned.

That is not proof. But in a field where children cannot wait for proof, it is not nothing.

He builds first. He studies second. For an institution with 150 million users and the ability to iterate fast, that may be exactly the right approach.

The school district that can’t move this fast should wait for the formal studies.

Khan Academy will have the data ready when they do.

Now the harder question: who gets to be Diya, and when?


Brave New Words: How AI Will Revolutionize Education (and Why That’s a Good Thing) by Salman Khan. Viking, 2024.

Nik Bear Brown, Poet and Songwriter