The Key to the Problem is to Find the Key Problem

A few days ago, I stumbled upon something called AlterEgo—a device created by an MIT grad student from India. It’s essentially a wearable headset that hooks behind your ear and along your jaw. Once you put it on, you can silently say a sentence to yourself, and it outputs the corresponding text.

AlterEgo Official Demo

Sounds like science fiction, right? But MIT Media Lab still showcases it on their website, so let’s give them the benefit of the doubt. According to Wikipedia, AlterEgo was developed by MIT graduate student Arnav Kapur and launched in 2018. It’s designed to help people with speech disabilities, achieving a median accuracy of 92%. The clever part? It reads electrical signals from your facial muscles, not directly from your brain, which helps protect privacy.

What strikes me as odd is that this thing has been around for 7 years and I’m only hearing about it now. Goes to show how far we still have to go from lab experiment to real product. But regardless, the technology matters because it tackles an age-old question: How can we communicate without speaking?

This is a fundamental problem. Thousands of years ago in myths and legends, we called it “mind reading.” In modern society, it could not only help people with speech disabilities—if accurate and fast enough, it might even replace keyboards. Then we’d realize that keyboards were just a detour in human history—forcing people to adapt to the QWERTY layout was inherently unnatural.

Last night I listened to an episode of Zhang Xiaojun’s podcast where she interviewed Cao Yue, founder of SAND AI. One part really stuck with me. He explained why OpenAI and DeepMind produce such groundbreaking work: they think fundamentally differently from most researchers.

Most researchers are paper-driven. Success means publishing at top conferences, fighting over first authorship. Since peer reviewers prize novelty, researchers obsess over inventing new methods and clever tricks to squeeze out marginally better results and slightly higher benchmark scores.

OpenAI and DeepMind operate differently. Many of their papers don’t introduce novel methods at all. Instead, they start by identifying a truly fundamental, important problem. Then they solve it using straightforward approaches and maximum computing power.

Look at what they’ve built over the years:

  • CLIP (2021): Taught AI to understand what’s in images. Previous AI was like rote memorization—see 10,000 cat photos, then recognize cats. CLIP is different. It grasps the relationship between images and text. Show it an animal it’s never seen, and it can still tell whether it’s more like a “koala” or a “panda.”

  • DALL-E (2021): Lets people generate images from text descriptions. Type “an astronaut riding a horse on the moon,” and it draws it—a scene that never existed in reality. AI gained “imagination” for the first time, creating things that don’t exist in its training data.

  • ChatGPT: Speaks for itself.

  • AlphaGo (2016) / AlphaZero (2017): AlphaGo initially learned from human games; its successor AlphaZero studied none at all. It just played itself millions of times and surpassed thousands of years of accumulated human Go wisdom. AI proved for the first time it could discover knowledge humans don’t have, without learning from humans.

  • AlphaFold (2018): Predicts protein 3D structures; its second version (2020) reached near-experimental accuracy. Biologists had worked on this problem for 50 years. AlphaFold computes a structure in hours, dramatically accelerating drug discovery.

  • AlphaEvolve (2025): AI that evolves knowledge the way biological evolution does. Traditional research: propose → experiment → fail → repeat, with one cycle taking months. AlphaEvolve generates 1,000 proposals at once and completes thousands of iterations in days. It improved a math algorithm unbeaten for 50 years and boosted Google’s data center efficiency by 0.7% (the equivalent of tens of thousands of servers). Next targets: materials science, drug discovery, sustainable energy—any field requiring trial and error.
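AlphaEvolve’s generate-many-then-select loop can be sketched in miniature. The toy below is my own illustration, not DeepMind’s code: it evolves a vector toward a hidden target by proposing a batch of random mutations each cycle and keeping only improvements.

```python
import random

def evolve(score, seed, population=100, generations=50, step=0.1):
    # Toy propose -> test -> select loop: generate a whole batch of
    # mutated candidates per cycle, keep only the ones that improve
    # the score. A cartoon of the idea, nothing like the real system.
    best, best_score = seed, score(seed)
    for _ in range(generations):
        candidates = [[x + random.gauss(0, step) for x in best]
                      for _ in range(population)]
        for c in candidates:
            s = score(c)
            if s > best_score:
                best, best_score = c, s
    return best, best_score

# Example: rediscover a hidden 3-D target from a cold start.
target = [3.0, -1.0, 2.0]
fitness = lambda v: -sum((a - b) ** 2 for a, b in zip(v, target))
solution, s = evolve(fitness, seed=[0.0, 0.0, 0.0])
```

The point is the batch: each generation tests a hundred proposals at once, so progress is limited by compute rather than by how fast a human can run one experiment.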

You might argue these companies can pursue grand ambitions because they’re incredibly wealthy. But money doesn’t explain everything. What sets these companies apart: they choose to solve genuinely important, fundamental problems. Quantifying image-text relationships. Making AI superhuman at Go. Turning text descriptions into real images.

These problems mattered 100 years ago. They mattered 1,000 years ago. Important problems have staying power—they’re often timeless. So if people centuries ago faced the same problem, chances are it’s a big one. Paul Graham has made similar observations.

But here’s the thing: can you solve an important problem just because you found it?

In early 2018, I was at an AI lab where I could explore any direction I wanted. I chose a topic: replicating human voice timbre. I was fascinated: What is the essence of a person’s voice? Why can I identify someone from a single sentence? If we could capture the core of each person’s voice—say, map it to a fixed vector—could we theoretically clone anyone’s voice? Make Obama sing Adele? Human vocal impersonators can do it. Why not AI?

I dove into papers, studied vocal mechanics, and printed spectrograms. The technology of the time could already simulate natural sounds like streams, flames, and wind. But human voices? Still in very early stages. In the end, I made only trivial progress before abandoning that brief academic foray.

Looking back, I’d found an important problem. I just couldn’t solve it.

Most important problems can’t be solved by one person, one team, or even one generation. The protein folding problem that AlphaFold solved—biologists had worked on it for 50 years. The language understanding problem behind ChatGPT—the entire NLP field spent decades on it.

But that experience made me want to tackle problems with longer time horizons. For example, I built an offline photo-search app: describe, in one sentence, the specific photo you remember, and it finds it among the tens of thousands in your library. Technically simple: just deploy CLIP on a phone. But it addresses an important problem: ancient emperors needed to locate specific passages among thousands of memorials to the throne; modern people need to find that girl in sunglasses on the beach somewhere in their photo library. The essence is the same: how do you use natural language to find something you know exists but don’t know where it is?
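The search itself is little more than a nearest-neighbor lookup in a shared embedding space. The sketch below uses made-up stand-in vectors; in the real app they would come from CLIP’s text and image encoders.

```python
import numpy as np

def cosine_search(query_vec, photo_vecs, top_k=1):
    # Rank photos by cosine similarity to the query embedding.
    # In a CLIP-based app, query_vec = text_encoder("girl in sunglasses
    # on the beach") and photo_vecs = image_encoder over the library.
    q = query_vec / np.linalg.norm(query_vec)
    p = photo_vecs / np.linalg.norm(photo_vecs, axis=1, keepdims=True)
    sims = p @ q                       # one dot product per photo
    return np.argsort(-sims)[:top_k]   # indices of the best matches

# Toy library: three "photos" embedded in a shared 4-D space
# (hand-made vectors purely for illustration).
photos = np.array([
    [0.9, 0.1, 0.0, 0.1],   # beach scene
    [0.0, 0.8, 0.6, 0.0],   # city street
    [0.1, 0.0, 0.2, 0.9],   # mountain hike
])
query = np.array([1.0, 0.0, 0.0, 0.2])  # stand-in text embedding
best = cosine_search(query, photos)     # index of the closest photo
```

Because the photo embeddings can be computed once and stored, the per-query cost is just one dot product per photo, which is why this runs comfortably offline on a phone.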

Important problems ignite your curiosity and competitive drive. They make you more focused and driven than usual. They even push you to dive into papers, seek collaborators, or naturally attract like-minded people to join. During the time I worked on those two problems, I was always focused and energized, eager to start working the moment I woke up.

The flip side? Mediocre problems make you numb. And if the problem you’re solving has negative social value—like optimizing addiction algorithms or perfecting price-discrimination systems—the numbness curdles into outright frustration.

To end with a once-popular piece of tautological wisdom: The key to the problem is to find the key problem.