Gabe Wu: Top Ten Seniors in Innovation
This interview has been transcribed and edited for clarity.
Gabe Wu is a senior at Harvard College studying Computer Science and Mathematics. He is from Rockville, Maryland.
HTR: What does being an innovator mean to you, and how does this manifest itself in your daily life?
Gabe: I think being an innovator has to do with the idea of being very deliberate about what questions you’re asking. A lot of people who do research, for example, end up pursuing a research path that people before them have already established, or that seems like the default thing to do in their current environment, or whatever the people around them are doing. Whereas an innovator is someone who is willing to take the time, before they actually try to solve a problem, to figure out whether it’s the right problem to solve. So in a very abstract sense, that’s how I think of it.
HTR: So basically, instead of a set of predetermined questions, being an innovator is like, “I’m gonna actually question even what things to think about.”
Gabe: Exactly.
HTR: I know you’re in math, computer science, and technology. What sparked your interest in these sorts of fields?
Gabe: So for me, it was really the challenge and the competition of it. When I was in middle school and high school, I was really drawn to the idea of “Oh, I can kind of prove myself to the world by doing these math competitions and programming competitions.” So I ended up just doing them a lot, and I really enjoyed the thrill of competing and learning and getting better. When I got to college, I realized it’s less of a thing to do in college, but it turns out it’s super applicable to studying actual math and computer science for research purposes. And there are entire careers that are made out of this content. So that led me pretty naturally to the academic interests that I currently have.
HTR: Do you want to talk a little bit about what those academic interests are?
Gabe: I mainly see myself as a theoretical computer science person. I think of this as the intersection between math and computer science: everything in computer science that doesn’t actually require a computer, and that you can do on a whiteboard. So that’s one area of academic interest, and the other half is really machine learning and AI safety. I’m interested in that field because I think it’s really important, especially this decade. My research interest lies at the intersection of theoretical computer science and AI safety and alignment.
HTR: If you have any that come to mind, do you mind sharing some of the key stories, challenges, or successes from these academic interests, or from other projects you’ve worked on?
Gabe: I think one story — it’s not a very concrete story, but it’s a part of my thought process that occurred a few years ago. I came into Harvard really wanting to do theoretical computer science. I was like, “Yes, this is my thing. This is what I know how to do. I know I will be successful.”
But then in college, I learned about AI safety, and I learned about the fact that AI capabilities are advancing very quickly, and that it could be extremely important to do research now to make sure they’re developed safely. So I was kind of torn. On one hand, I was in this field that I know interests me academically a lot, and that I have experience in. And on the other hand, there’s this field that seems super important, but I don’t really know anything about it yet. So I ended up spending a lot of my classes studying theoretical computer science, but most of my free time reading research papers and doing internships and stuff in the AI safety sphere.
And eventually, I think I realized there are definitely trade-offs to be made when choosing a field of study. But oftentimes there are intersections that you can leverage, where you can kind of do both of them at once. And even when there’s not, it’s still possible to do both and enjoy and appreciate aspects of both. And even if you end up doing one of them, you can still read about and talk about and think about the other one.
HTR: Did you start getting interested in AI safety and machine learning related things because you were doing that sort of thing in your theoretical computer science interests? Or was it sort of like “I’m interested in these things broadly, and these are two areas that I’m interested in”?
Gabe: It was definitely separate. I didn’t even really know how machine learning worked, and I wasn’t interested in AI, until I was introduced to the field of AI safety by some friends here. And once I realized what a big deal it was – how important AI is going to be for the future of humanity and how quickly it was coming – I decided to invest a lot of time into learning more about it.
HTR: For either of your two pillars, were there any classes or papers that you think are of note in your academic journey?
Gabe: I think in terms of classes, CS 121 was the class that convinced me, “Hey, yeah, I definitely want to do theoretical computer science.” And my favorite class at Harvard has been CS 2210, which is complexity theory, which I just took this semester. So those are classes I very much enjoyed.
On the papers side, or the AI safety side, there’s a paper called “Risks from Learned Optimization in Advanced Machine Learning Systems”. It formally introduces the idea that an AI system can be trained to pursue one goal, but during training it develops its own sort of inner goal that may or may not be the same as the goal it’s being trained for. And this presents a big challenge in machine learning: if you’re trying to make a powerful AI system, you kind of have no guarantees about what the AI system will actually end up pursuing.
HTR: Are there any people or mentors or sort of key figures in your life related to these interests that you think that were very impactful for you?
Gabe: When I was first getting into AI safety, the founders of HAIST – it was called HAIST back then; now it’s called AISST – which is the group that I now run, were pretty impactful in guiding me, showing me what the AI safety space looks like, and introducing me to opportunities in that space. One of them was Xander Davies, who actually was one of the top 10 innovators from two years ago. Others were Max Nadeau and Sam Marks.
HTR: And how about professors?
Gabe: Professors? I think I would list Madhu Sudan, who I’ve taken a class with every single year at Harvard, and Boaz Barak, who is now working at OpenAI.
HTR: How do you go about finding community on campus? I mean, you mentioned the AI safety team is a big extracurricular interest, so maybe something related to that, or other groups. How do you approach that?
Gabe: I think there are many different purposes of community, and ways that it can make your life a lot better than just living alone and being a hermit. And for me, the AISST community is a very big one, and it’s very important to me, because it’s a group of people who allow me to think and talk about the big ideas in AI safety and career paths and how you can improve the world – big, deep ideas. And then I have my community back in Currier with all my friends I live with and learn with and eat many of my meals with. I end up goofing off with these people a lot more: there’s less pressure to think about the really big ideas, and I can just watch Instagram Reels.
HTR: What advice would you give to your younger self about navigating Harvard?
Gabe: I think I would say don’t do things just because everyone else is doing them. I’ve seen so many people enter this default mode where in high school, they would just do all the clubs on campus that were prestigious, or seemed like it. They were, like, showy, because that’s kind of what you do in high school. But then they get to college and they just do the exact same thing. But I would just encourage people to take the time to think through why you want to do things. And if the reason why you want to do something is just because, “oh, it’s like, the thing all the cool kids are doing,” then maybe that’s not such a good reason.
HTR: So looking into the future, what sort of projects do you think you’re excited to work on, whether that’s when you graduate, long term, that kind of thing?
Gabe: I plan on working in technical AI safety research, which I would describe as doing machine learning research to figure out how to make sure that current and future AI systems do the thing that their designers want them to do, which is a surprisingly hard problem. I’m not completely sure where I will end up, but the biggest options for me right now are the Alignment Research Center, which is a nonprofit research group working on a theoretical approach to AI alignment, or one of the labs, like Anthropic or OpenAI.
HTR: Do you think that this is sort of a combination of your academic interests?
Gabe: The Alignment Research Center especially is kind of the best theory game in town in AI alignment. It’s an approach to AI alignment that draws heavily on ideas from theoretical computer science, so it’s a good fit for my interests. At a very high level, what they’re trying to do is the following. In normal machine learning, you have a loss function, and then you train a machine learning system to do well against that loss function. But the problem is that this ends up creating AI systems that perform very well on the task but are uninterpretable to humans, so humans can’t understand how the system is doing the thing that it’s doing. The goal of ARC is to design a loss function, not for the AI, but for explanations themselves.
So then you can try to find an explanation of why the model is doing what it’s doing by training your explanation against this new loss function that we want to design, and then hopefully by the end, we’ll get a really good explanation of why the model is doing what it’s doing. And it’s going to be a mechanistic explanation that points to the neurons of the neural network and explains what they are doing.
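To make the shape of that idea slightly more concrete, here is a minimal toy sketch in PyTorch. It is not ARC’s actual method: the “explanation” below is just a sparse linear surrogate of a small network, and `explanation_loss` is a hypothetical faithfulness-plus-simplicity objective invented for this illustration. The point is only the structure Gabe describes – optimizing an explanation against a loss defined over explanations, rather than optimizing the model itself.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A small "black box" model standing in for one that has already been trained.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

# The "explanation": here, just a sparse linear surrogate of the model.
# (A stand-in for illustration, not the kind of mechanistic explanation ARC is after.)
explanation = nn.Linear(10, 1)

def explanation_loss(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical loss over explanations: match the model's outputs
    (faithfulness) while staying simple/sparse (simplicity)."""
    faithfulness = ((explanation(x) - model(x).detach()) ** 2).mean()
    simplicity = explanation.weight.abs().mean()
    return faithfulness + 0.1 * simplicity

# Train the explanation, not the model, against the explanation loss.
opt = torch.optim.Adam(explanation.parameters(), lr=1e-2)
for _ in range(2000):
    probe_inputs = torch.randn(256, 10)  # random inputs used to probe the model
    loss = explanation_loss(probe_inputs)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned explanation weights:", explanation.weight.data)
```

The real proposal aims at mechanistic explanations of the network’s internals, as Gabe notes above; a surrogate model like this toy one does not capture that, but it shows what “a loss function for explanations” means in practice.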
HTR: Is there anything else that you’re thinking of that’s important that you would want to include?
Gabe: I think one thing that has really improved my time at Harvard has been [the] willingness to engage with big-if-true ideas. So I think a lot of times people agree to disagree about things because it’s too much friction to have a debate about something, and then you’re like, “Okay, we’ll just believe our own separate things, and just go our separate ways”.
I think that’s fine to do for almost all ideas, but for some ideas – especially those revolving around the way the future will play out – those ideas are too important to just ignore. So when an idea is big if true – for example, the idea that all human labor could be obsolete within 10 years because AI systems just automate all of it – this is an idea where I’m not sure if it’s true, but if it is true, it’s pretty important. If people knew it were true, they would be living their lives very differently. So whenever you identify one of these ideas, it’s worth the time to have that discussion – that debate – with the person who’s supporting it or denying it, to kind of get to the bottom of whether or not it’s true.
HTR: You want to solve these questions in your work in the future?
Gabe: Exactly, yeah. I think if I had not engaged with these big-if-true ideas, I would be doing something a lot different with my life. So I’m glad that I did end up engaging with them.
HTR: Have you always sort of been interested in rigorous debate on problems that feel very important, even if it’s not one that people normally are willing to really sort out? Or did you sort of have to develop that sense in college?
Gabe: I think I’ve gotten better at it, but I’ve always been that way. I’ve always naturally tried to examine the axioms that people use implicitly in their everyday decision making. So something as simple as eating meat – when I was in middle school, I looked into that more, and I was like, “Hey, wait, this thing that everyone does, and that seems so normal, is kind of inconsistent on a moral level if you dig far enough.” And I think that similar instinct comes kind of from math – when you do math, you always want to make sure that your core assumptions are right; otherwise, the proof that you make is going to be completely wrong. So yeah, that desire for consistency in your underlying worldview is something that has helped me a lot when thinking about AI safety.
HTR: How do you go about checking that underlying worldview you’re talking about [and] what do you think influences the way that you approach that?
Gabe: When you’re really looking at the core beliefs, there are two types. Some of them are empirical core beliefs: they can be tested with science, and they’re either right or wrong. And some of them are value-based beliefs that are much more philosophical, and harder to prove that you’re right about. So for the empirical beliefs, science was made for testing them. Those are the easy ones to deal with, but I think a lot of times people overlook them. You can get those out of the way. And then there are the core philosophical or moral beliefs. It’s hard to really figure out whether you’re right about those, but a good first step is just at least writing them down and being explicit about the things that you believe in [and] the way that those beliefs lead to the downstream decisions that you end up making. Once you make those explicit, oftentimes you can find contradictions [or] inconsistencies that you might wish you did not have, and you can kind of untangle them there.
HTR: Do you think that’s where problems normally arise, or that the disagreement normally arises from “We have the same underlying principles, but we disagree on this” [mentality]?
Gabe: I think in practice, there are definitely people who have different underlying philosophical assumptions. But it turns out that AI safety is an issue where, for a very wide range of underlying philosophical assumptions, you can kind of get the same conclusions. And those conclusions are “Let’s try to build AI that we can safely steer and control, rather than building AI that kind of gets out of hand.” And so in practice, even though we may have disagreements on the philosophical side, they boil down to very empirical, testable things that we can all get on board with and discuss productively.
HTR: So you think that, in your estimate, a lot of the conversation then revolves around what the best strategy is, or the sort of nuts and bolts of the situation?
Gabe: To give some concrete examples of the nuts and bolts, we have discussions about things like, “Is interpretability a more promising research direction for AI safety than scalable oversight?” – interpretability and scalable oversight being two different research directions in AI safety. Or something like, “How many years do we have left until we have AI systems that can automate 90% of a research engineer’s day-to-day job?”
So these are empirical questions, and you can debate them and come up with different methodologies to figure out what the answers should be. But they ultimately are empirical questions.
Since conducting this interview, Gabe has decided to join the Alignment team at OpenAI.