Good Ideas Are Rare; Taste Is the Search

The first pillar—how to tell good work from bad, and how to make more of the good

This is the first of three posts on what I think research is made of. The overview argued for three pillars—taste, execution, communication. This one is about taste, the slipperiest of the three, the one people most want to believe is innate magic. It isn’t magic. It’s a search.

Here is the fact the whole post stands on: good ideas are rare. Not scarce-ish but rare: the far tail of an enormous space of things you could think, almost all of which are wrong, or boring, or already done. Research is the work of finding the rare good ones in that space. And taste is the search procedure: the thing that tells you where in the space to look, and the thing that tells you, once you’ve grabbed something, whether it’s actually any good. Those two moves have names, generating and judging, and your taste is just how well you make them.

So taste is the difference between searching that huge space efficiently and wandering it at random. The encouraging part, which people resist, is that the search is learnable. “Taste is just subjective” is a comfortable thing to say because it ends the argument—but it isn’t true, and the proof is in your own history. Look at work you loved five years ago and wince at now—your taste didn’t merely change, it got better, and you know it got better, which means there was a better to move toward. A skill with a direction like that can be trained. You just walk the direction, deliberately, for years, and your search gets sharper.

I’ll take the two moves in turn, starting with judging, because you can’t generate toward a target you can’t yet recognize.

Judging: the three questions

When a reviewer reads your paper, or a program committee decides your fate, or I read a draft from a student, the evaluation collapses—almost always—into three questions:

  • What’s new? (novelty)
  • Who cares? (importance)
  • Why now? (timeliness)

The first two aren’t my invention; they’re the questions every program committee and grant panel already asks, in one phrasing or another: what is new here, and if it works, who benefits. (The funding agencies dress them up as “intellectual merit” and “broader impacts,” but it’s the same two nerves under different words.) The third, “why now?”, I add myself, because I’ve come to think timeliness is a real and separate nerve.

The fastest way to feel how load-bearing each question is: knock one out and watch the work collapse.

No “what’s new.” Suppose I build a free office suite that is 100% compatible with Microsoft Office. Who cares? Millions of people. Why now? It saves them money. Both great answers. But what’s new? Nothing—it’s a clone. So it’s a fine product and it is not research. Novelty isn’t optional; it’s the thing that makes it research at all.

No “who cares.” Suppose I invent a genuinely clever new data structure—never been done, provably elegant—that speeds up an operation no real system performs. Novel: yes. Timely: sure. But nobody cares, so it’s a puzzle, not a contribution. This is the most common failure mode among technically strong students, and the hardest to feel from the inside, because cleverness is so satisfying that it masquerades as importance.

No “why now.” Suppose I propose something both novel and important—say, a beautiful scheme that needs hardware nobody will have for twenty years, or that re-solves a problem the field already moved past. The honest reaction is “interesting, but not yet” or “interesting, but too late.” Timing is a real axis. The same idea is a triumph in 2015 and a footnote in 2005.

Hold onto the structure: a contribution has to survive all three questions, and most weak work dies on exactly one. When you read a paper, find the one it’s weakest on. When you write one, defend all three before you defend anything else.

The two-of-three rule

There’s a related heuristic I use for the shape of a strong systems paper. Good work tends to have three possible sources of merit:

  1. a hard, real problem,
  2. a novel idea, and
  3. a substantial implementation and/or thorough evaluation.

You rarely get all three at a publishable level, and you almost never need all three. You need two. The combinations are each a recognizable species of paper:

  • Problem + idea (light on implementation): a clever, lightweight idea on an important problem—the kind of paper that’s mostly insight, a few pages, and changes how people think.
  • Problem + implementation (idea is straightforward): the heavyweight paper. The idea is “obvious in hindsight,” but making it actually work—at scale, for real—is a year of hard engineering and careful measurement. Building a real OS kernel in a memory-safe language is this kind of paper.
  • Idea + implementation (problem is niche): a sophisticated, well-built attack on a smaller problem. Narrower, but a pleasure, because both the thinking and the making are excellent.

This isn’t a law—it’s folklore, and the honest ancestor of it is Levin and Redell’s 1983 note on how to write a good systems paper, where they insist a paper must contain “at least one new idea” and then ask the author what was actually built and what was actually learned: “If you didn’t learn anything, it is a reasonable bet that your readers won’t either.” The two-of-three rule is just a compression of that. Use it as a checklist on your own work: if you can’t honestly name two of the three, you don’t have a paper yet—you have a start.

One more sharpening, on what “contribution” means, because students get this backwards. Lines of code are not contribution. Consider two results: I write 10,000 lines of C++ to make a system 10× faster on one workload, or I write 100 lines to make it 10% faster for every application that will ever run on it. The second is often the bigger contribution. Effort is an input; impact is the output; taste is knowing they’re not the same number.
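To see why the small change can win, here is a minimal arithmetic sketch. It assumes, purely for illustration, that the 10× speedup helps a single application, that the 10% speedup helps every one of `n_apps` applications, and that all applications have the same baseline runtime. The function names are mine and none of the numbers come from a real system.

```python
def time_saved_fraction(speedup: float) -> float:
    """Fraction of baseline runtime eliminated by a given speedup factor."""
    return 1.0 - 1.0 / speedup


def total_time_saved(speedup: float, n_apps: int, baseline: float = 1.0) -> float:
    """Total runtime saved across n_apps applications of equal baseline runtime."""
    return n_apps * baseline * time_saved_fraction(speedup)


def break_even_apps(big_speedup: float, small_speedup: float) -> float:
    """Number of applications at which the small, broad speedup
    saves as much total time as the big speedup on one application."""
    return time_saved_fraction(big_speedup) / time_saved_fraction(small_speedup)


narrow = total_time_saved(10.0, n_apps=1)    # 10x faster, one application
broad = total_time_saved(1.1, n_apps=1000)   # 10% faster, a thousand applications
```

Under these assumptions a 10× speedup removes 90% of one application's runtime, while a 10% speedup removes about 9% of each application's runtime, so the break-even sits at roughly ten applications. Past that point the 100-line change saves more total time than the 10,000-line one.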

Generating: more shots, thrown well

Now the harder move. Judging is comparatively easy—you can learn it by reading a hundred papers with someone who has taste, which is most of what a reading group is for. Generating is the part that feels like magic, and the search frame is what dissolves the magic. Remember the setup: a vast space, the good ideas vanishingly rare in its tail. Two consequences for how you should actually work fall straight out of it.

The first is just honesty about the odds. You will generate far more bad ideas than good ones, no matter how good you get—“good” is the tail of a distribution; that’s what the word means. Producing duds isn’t a sign you’re failing. It’s the structure of the problem, and it never goes away.

The second is the strategy that follows: if hits are rare, take more shots. The folk version is the old line about having a lot of ideas and throwing the bad ones away. There’s a research-flavored version too—the claim that a creator’s number of great works tends to track their total output, as if the hit rate were roughly constant. It’s a contested claim, and shadowed by survivorship bias, since we mostly count the people whose volume did pay off and never see the prolific producers of pure noise. So don’t read it as a law. Read it as permission: generating a lot of bad ideas and discarding them fast is normal practice, not failure, and the researchers with the best ideas are often the ones having the most.
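The "roughly constant hit rate" reading can be made concrete with a little probability. The sketch below assumes each idea is an independent draw that turns out good with a fixed probability `p`. Both the independence and any particular value of `p` are illustrative assumptions, not measurements of anyone's career.

```python
def prob_at_least_one_hit(p: float, shots: int) -> float:
    """Chance that at least one of `shots` independent ideas is good."""
    return 1.0 - (1.0 - p) ** shots


def expected_hits(p: float, shots: int) -> float:
    """Expected number of good ideas; linear in output when p is constant."""
    return p * shots


def expected_shots_until_hit(p: float) -> float:
    """Mean number of ideas until the first good one (geometric distribution)."""
    return 1.0 / p
```

With a hypothetical `p = 0.02`, ten ideas give you roughly an 18% chance of at least one good one, and a hundred ideas give you roughly 87%. The hit rate did not change; only the number of shots did.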

But raw quantity is a dumb search—spraying the space at random, which no good researcher actually does. A good search is directed: you spend your shots where they’re likely to land. And directed search, however you run it, turns on one tradeoff you already know by name.

Exploration vs. exploitation

It’s the explore/exploit tradeoff: do you mine the promising vein you’ve already found (exploit), or wander off to look for a better one (explore)? Exploit too hard and you polish a local hill forever, publishing increments while the real mountain sits one valley over. Explore too hard and you start everything and finish nothing. A research career is a long sequence of this one decision, and taste is largely knowing, this month, which mode you should be in.
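One standard way to formalize this tradeoff is the multi-armed bandit, and the simplest policy for it is epsilon-greedy: with probability `epsilon` try a random direction, otherwise work on the direction with the best average payoff so far. The sketch below is an analogy, not a model of a career. The "directions", their hidden payoff probabilities, and the parameter values are all made up for illustration.

```python
import random


def epsilon_greedy(payoff_probs, epsilon, steps, seed=0):
    """Run epsilon-greedy on Bernoulli arms.

    Returns (total reward, number of pulls per arm).
    """
    rng = random.Random(seed)
    n_arms = len(payoff_probs)
    pulls = [0] * n_arms
    mean_reward = [0.0] * n_arms
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick any direction
        else:
            # exploit: pick the best-looking direction so far (ties go to the first)
            arm = max(range(n_arms), key=lambda a: mean_reward[a])
        reward = 1 if rng.random() < payoff_probs[arm] else 0
        pulls[arm] += 1
        # incremental update of the running average for this arm
        mean_reward[arm] += (reward - mean_reward[arm]) / pulls[arm]
        total += reward
    return total, pulls


# Direction 0 never pays off; direction 1 usually does.
stuck_total, stuck_pulls = epsilon_greedy([0.0, 0.9], epsilon=0.0, steps=1000)
mixed_total, mixed_pulls = epsilon_greedy([0.0, 0.9], epsilon=0.1, steps=1000)
```

With `epsilon=0.0` the policy never leaves the first direction it tried, which is the local hill polished forever. With `epsilon=1.0` it never commits to anything. The interesting behavior lives in between, and where exactly to sit is the judgment call the paragraph above describes.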

The two modes even seem to want different physical conditions. Exploration wants you away: alone, off the devices, mind unclamped from the immediate. The famous breakthrough stories all rhyme—the insight that arrives on a walk, in the shower, stepping onto a bus, never at the desk where you’d been grinding. Treat these as suggestive, not as data—they’re the stories winners tell, and we never hear from the equally idle people who got nothing. But there’s a modest real effect underneath, the one called incubation: step away from a hard problem and some part of you keeps working it. The instruction is what matters. If you are stuck, frequently the answer is not more hours at the desk—it’s a walk. Exploitation wants the opposite: the desk, the screen, the long uninterrupted afternoon of grinding a found idea into something real. Know which one you need and arrange your day for it.

The generate-judge loop, and your advisor

Put the two halves together and you get the actual engine of research: a loop. Generate an idea, judge it, kill it or keep it, generate again. Fast and merciless. The whole point of building taste-as-judgment is to make the judge in this loop sharp and quick, so you can run the loop many times—because, see above, you need many swings.
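As a minimal sketch, the loop looks like this. The `generate` and `judge` callables and the `threshold` are hypothetical stand-ins: in real research the generator is you, and the judge is your own taste, your writing, or your advisor.

```python
import random
from typing import Callable, List, TypeVar

Idea = TypeVar("Idea")


def generate_judge_loop(
    generate: Callable[[], Idea],
    judge: Callable[[Idea], float],
    threshold: float,
    rounds: int,
) -> List[Idea]:
    """Generate `rounds` ideas and keep those the judge scores at or above threshold."""
    kept = []
    for _ in range(rounds):
        idea = generate()
        if judge(idea) >= threshold:
            kept.append(idea)  # keep it
        # otherwise: kill it and generate again
    return kept


# Toy usage: ideas are random numbers in [0, 1) and the judge is the identity.
rng = random.Random(0)
survivors = generate_judge_loop(rng.random, lambda x: x, threshold=0.95, rounds=200)
```

The design point is that the cost of one pass through the loop is dominated by `judge`. A judge that is fast and accurate lets you afford a large `rounds`, which is the whole argument for sharpening it.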

Two things make the loop run faster. The first is writing: you cannot reliably judge an idea that’s still only a feeling in your head, and the act of writing it down is what forces the judgment to get honest. That deserves its own treatment, and it gets one in the communication post; for now just know that the judging half of taste runs on a pen. The second is your advisor, who is, in this loop, a faster and more experienced verifier. The reason you meet with me every week is not for me to hand you ideas. It’s so that I can be a high-quality judge you can query cheaply, so the loop runs against real taste before you’ve sunk a year into a bad branch. That is most of what advising is, and it’s why the apprenticeship model survives: judgment transfers by being used out loud, over and over, until one day it’s yours and you no longer need to knock on my door.

One last thing, the most encouraging thing I know about taste. Here is why beginners quit: you start with taste ahead of your ability, so everything you make disappoints you, and the gap is so painful that most people conclude they have no talent and stop. They’re wrong. That gap is the normal starting condition, and the only way to close it is to keep producing—a volume of work slowly drags your ability up to meet your taste. Your taste running ahead of your output isn’t evidence you’re bad at this. It’s the precondition for getting good. The disappointment is the pillar working. Keep taking swings.

Step back and notice what all of it—judging, generating, the loop, the years of reading—is really building: an accurate model of the field inside your own head, detailed enough that you can feel, from the inside, what’s worth doing and what isn’t. That’s the first of the three systems you learn to model. Next, a system that pushes back harder, because it either runs or it doesn’t: the machine.

Next: Execution—turning a judged idea into a real thing that works.