Exploration and exploitation

June 2013

I love it when I get to use computer-science metaphors to illuminate real-life processes. One of my favorite such metaphors was introduced to me by my algorithms professor last year. I was meeting with him to get his thoughts on what classes I should take. His suggestion was to frame it as an exploration-exploitation trade-off, a concept that comes from a technique for solving difficult optimization problems. It turns out to be a powerful and helpful way to view things.

Suppose, for example, that you’re trying to design an airplane wing and you want it to be as aerodynamic as possible. How would you do this? You could try to calculate the optimal wing shape directly, but there are lots of variables that affect the aerodynamics, and they interact in complicated ways, so that’s probably too hard. It turns out the best solution is the obvious one: try a bunch of different designs and pick the one that works best.

The hard part is generating the designs to test in the first place. You could just generate random ones, but you’d be very unlikely to find good ones–most randomly generated wings will be useless. So what you probably want to do is take an existing design and make small modifications that you know will make it better. After enough iterations, each of which improves the design incrementally, you’ll hopefully have something quite good. As you may have guessed, this is the exploitation part.

But if you only exploit, you might get stuck in a local optimum: a design that’s mediocre, yet any incremental change makes it worse. You could only find a better wing by scrapping everything and radically rethinking your design. So it’s a good idea to sometimes start over from an essentially random design, or make a large change whose effects you have no idea about, to explore a different part of the “optimization landscape.”
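To make the local-optimum problem concrete, here is a minimal Python sketch. Everything in it is a toy assumption rather than anything from real wing design: a "design" is just an integer from 0 to 100, and the hypothetical score function has a mediocre peak at 20 and a better one at 80. Pure exploitation (hill_climb) ends up on whichever peak is nearest its starting point; adding random restarts gives it a chance to find the taller one.

```python
import random

def score(x):
    """Hypothetical stand-in for 'how good is design x': peaks at 20 (low) and 80 (high)."""
    return max(10 - abs(x - 20), 30 - abs(x - 80))

def hill_climb(x, lo=0, hi=100):
    """Exploit: keep moving to the better neighbouring design until neither improves."""
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbours, key=score)
        if score(best) <= score(x):
            return x  # no incremental change helps: a (possibly local) optimum
        x = best

def climb_with_restarts(restarts, rng, lo=0, hi=100):
    """Explore: start over from several random designs, exploit each, keep the best."""
    starts = [rng.randint(lo, hi) for _ in range(restarts)]
    return max((hill_climb(s, lo, hi) for s in starts), key=score)

if __name__ == "__main__":
    print(hill_climb(10))   # starts near the low peak and gets stuck there
    print(hill_climb(60))   # starts near the high peak and finds it
    print(climb_with_restarts(10, random.Random(0)))
```

Starting at 10, the climber stops at 20 even though 80 scores three times as high, because every single step away from 20 makes things worse.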

The key observation is that, since both exploration and exploitation take effort (in this case computer time), there’s a trade-off between them. If you only explore, you’ll end up with a bunch of misshapen, randomly-generated kludges. But if you only take one design and exploit it, you risk getting stuck in a mediocre local optimum. So you need to strike a balance between the two.

The other important insight is that if you have a finite amount of computer time, you should explore more at the beginning of your process and exploit more at the end. There’s no point in exploring at the end, because you’ll have no time to refine the designs that you generate, even if you find good ones. On the other hand, you don’t want to start exploiting in depth until you’ve explored enough to have a decent idea of the overall space of airplane wings.
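One way to sketch this "explore early, exploit late" idea in Python is to let the chance of a random jump shrink as the budget runs out. The linear schedule below is just one simple assumption (real optimizers use many different schedules), and the score function and integer "designs" are the same kind of hypothetical toy as before.

```python
import random

def score(x):
    """Hypothetical quality of design x: a low peak at 20, a taller one at 80."""
    return max(10 - abs(x - 20), 30 - abs(x - 80))

def explore_probability(step, budget):
    """Assumed linear schedule: always explore on the first step, never on the last."""
    return 1.0 - step / (budget - 1) if budget > 1 else 0.0

def search(budget, rng, lo=0, hi=100):
    """Spend `budget` evaluations, shifting from random jumps to small tweaks over time."""
    current = rng.randint(lo, hi)
    for step in range(budget):
        if rng.random() < explore_probability(step, budget):
            candidate = rng.randint(lo, hi)  # explore: try a random design
        else:
            tweak = rng.choice((-1, 1))      # exploit: nudge the current design
            candidate = min(hi, max(lo, current + tweak))
        if score(candidate) >= score(current):
            current = candidate              # keep whichever design is at least as good
    return current

if __name__ == "__main__":
    best = search(200, random.Random(1))
    print(best, score(best))
```

The early steps are almost all random jumps, which is how the search gets a rough picture of the landscape; the final steps are almost all small tweaks, which is how it polishes whatever it has found.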

This extends very well by analogy to making life choices. Explore too much, and you become a dilettante; exploit too much and you’ll be stuck doing something boring or useless. So again, it’s best to strike a balance: explore a lot when you’re young, and transition to exploiting as you age.

In my experience, we’re bad at the first part, for a couple of reasons. The first one is availability–people who stop exploring and start exploiting become visibly successful, and it’s tempting to follow their lead. I’ve found this especially true in college, where no matter what your field, there’s always some brilliant talent whizzing ahead of you because they live and breathe math (or violin or tennis or whatever your activity of choice is). And in the rest of the world, it’s much more interesting to tell the story of, say, Terence Tao, who’s been exploiting math since the age of two, than Doug Melton, who majored in philosophy of science before becoming a star biologist. But prodigies are lucky to have found fruitful territory so early–or they’re on their way to a local optimum. For the rest of us, premature exploitation is a seductive but ultimately bad idea.

The other big reason we avoid exploration is delayed payoffs. Most of the time, exploration doesn’t work. Your airplane with funky wings crashes. Your clay pots come out lumpy. Your essay is received with scorn (or worse, apathy). And our fear of failure makes the problem even worse. It’s only occasionally that exploration actually pays off, so we never associate exploration itself with good things happening–only a few of its products.

The best way to combat this, I think, is to consciously frame exploring positively. If you notice yourself feeling sad or anxious that something new isn’t going well, try to replace it with gratefulness for the chance to explore. Remind yourself that nothing truly bad is happening; no tigers are eating you.[1] Remember that it’s positive-expectation in the long run, even if this time it didn’t work out. Think back to a time you explored and it worked out. Visualize your future self doing all kinds of crazy, fun things you found by taking risks. Maybe award yourself points, and keep a running total. Use whatever kind of trick you use to get your brain to ignore availability and delayed payoffs, so that you can learn to love exploration as much as you should.

  1. Hopefully. If you’re exploring zookeeping you may have to find a different mental cue. 

William MacAskill

I like it. Two other favourite computer science metaphors applied to life:

  1. Failing fast. If you’ve got a plan, then first try whatever bits might make the entire plan fail.

  2. Doing the whole of a project in as minimal and quick a way as possible, then seeing where all the ‘processing time’ is being used up and focusing on that, rather than trying to optimise each section one at a time.

Not being a computer scientist, I don’t know the actual names for these ideas, but Toby said they come from computer science!



Will, those are also great ones! I’m not sure if the first one has a standard name (although “failing fast” is fairly widely used), but the second is called “avoiding premature optimization” after the Donald Knuth quote “premature optimization is the root of all evil”–which definitely extends far beyond compsci.



Nice post, Ben!

In life, I always struggle with when to exploit, and how long. If you get bored really quickly, is this a signal that it’s time to switch gears, or is it a lack of grit?



Thanks, Girish! If you get bored quickly, perhaps you should automate your task. I hear getting bored quickly is the sign of a good programmer :P

But in seriousness, I don’t really know. I think it depends on how you work and what your current situation is. For me, when I start doing something new, there’s often a period where it’s annoying and boring, but then eventually I get into a groove and things start going more smoothly. I know that folks like 80,000 Hours and various popular bloggers (Cal Newport, maybe?) also suggest that you shouldn’t worry too much about “finding your passion,” but rather, you’ll develop a passion for anything you get sufficiently good at and invested in–but most of what I know about this is secondhand.



Does that mean that I can’t be passionate about something without believing that I am good at it?



@David Maybe – that and the Dunning-Kruger effect and impostor syndrome might explain why my passions for academic subjects don’t seem to last long enough, and maybe also why my most “exciting” projects turn out to be the least feasible.



This is a nice way to break apart the problem of making life choices. Though it’s worth mentioning a point where humans and computers differ. For humans, some of the skills required for ‘exploiting’ are generally useful in many situations. So I don’t think that early exploitation is premature. Rather both exploration and exploitation are essential skills.