It’s May and you’ve just opened a brand new ice cream shop in balmy Montreal, Canada. You’re looking for your first customers and are faced with the enviable prospect of deciding what flavors to sell. Though your supplier has dozens of flavors to choose from, you have a fixed budget and a finite freezer. How do you find the flavors that bring in the most profit?
Choosing which flavors to buy is a simplified version of the multi-armed bandit problem, a problem in probability theory that centers on the tradeoff between exploration and exploitation. Your approach might be as follows: you start by buying small quantities of many flavors. This is the exploration phase; you’re trying to find which flavors do the most for your bottom line. As you collect more data, you double down on the best-performing flavors and discontinue the others. This is the exploitation phase; you give up searching for new information in favor of maximizing your return.
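For the curious, the two-phase approach above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function name `explore_then_commit`, the `mean_profits` list, and the Gaussian noise on each day's profit are all made up for illustration, and committing to a single best flavor is a simplification of doubling down on a few. It is one of the simplest bandit strategies, not the only one.

```python
import random


def explore_then_commit(mean_profits, explore_rounds, total_rounds, seed=0):
    """Try every flavor `explore_rounds` times, then sell only the best one.

    `mean_profits` holds the hypothetical true average daily profit of each
    flavor. The shop owner never sees these values, only noisy daily results.
    """
    if explore_rounds < 1:
        raise ValueError("explore_rounds must be at least 1")

    rng = random.Random(seed)
    n = len(mean_profits)
    totals = [0.0] * n  # profit observed so far, per flavor
    counts = [0] * n    # days each flavor has been sold

    def sell(flavor):
        # Assumed noise model: true mean plus Gaussian noise (std dev 1.0).
        profit = rng.gauss(mean_profits[flavor], 1.0)
        totals[flavor] += profit
        counts[flavor] += 1
        return profit

    earned = 0.0

    # Exploration phase: small, equal quantities of every flavor.
    for _ in range(explore_rounds):
        for flavor in range(n):
            earned += sell(flavor)

    # Exploitation phase: commit to the flavor with the best observed average.
    best = max(range(n), key=lambda f: totals[f] / counts[f])
    for _ in range(total_rounds - explore_rounds * n):
        earned += sell(best)

    return best, counts, earned


best, counts, earned = explore_then_commit(
    [1.0, 5.0, 2.0], explore_rounds=20, total_rounds=200
)
print(best, counts, round(earned, 2))
```

The knob to play with is `explore_rounds`: set it too low and a lucky mediocre flavor can win the commitment, set it too high and you spend most of the summer selling flavors you already know are duds.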
The balance between exploration and exploitation is a mental model that permeates all aspects of professional life. In academia, grad schools select and self-select for people with proven records in their chosen field. In industry, the situation is not dissimilar; Alice, who has four internships in web design and solves LeetCode Hards in JavaScript, is more competitive for a front-end position than Bob, who has machine learning research, consulting, and product management internships on his resume. Once hired, Alice will hit the ground running and immediately begin providing value for the company. This is good news for the employer, but is it good news for Alice?
Let’s rewind a bit. For Alice to have had four web design internships, she must have chosen to specialize early, entering her exploitation phase. She must have been so enamored with or proficient at web design that she decided to pursue it further. Or perhaps she had wanted to try hardware or distributed systems, but front-end roles were the only ones offered to her. And the more of them she did, the more she cemented her “brand” as a front-end developer. This is great for Alice if she truly enjoys web design; but if she doesn’t, she’s been shoehorned into a specialty. Employers prefer specialists, but their goals are not the same as ours. As students who plan to enter the workforce, we have a responsibility to figure out how best to strike a balance between what is enjoyable to us and what is valuable to employers.
Optionality, or the intentional dedication to keeping as many paths open as possible, has been my approach to university. Of course, there’s the possibility of trying to do too many things at once—something I’m definitely guilty of—but after narrowing down a few potential paths, optionality becomes much easier. A higher GPA opens more doors than a lower one, so I spent the time to get better grades. Computer science grad schools care about research experience, so I did a couple of research courses. Software companies care about internships and personal projects, so I threw in some of those too. Basically, I first explored to find a small set of ladders that looked appealing before climbing up a few rungs on each of them.
The result is that I’ve become Bob: I’m less competitive for software positions because I don’t have as many internships and hackathon projects as the average kid from Waterloo. I’m less competitive for grad school because I spent time on software internships instead of research. And, because I spent time on both of those things, I don’t have perfect grades. But if I could go back in time, I wouldn’t do it any other way—I value having had those diverse experiences over an extra boost in competitiveness.
Both approaches, Alice’s laser-focused exploitation and Bob’s exploration and subsequently tempered exploitation, have their pros and cons. Neither is necessarily better; the bottom line is simply that the approach you take to school (and life) is worth considering with intention.