September 30, 2020
I recently began a collaboration with Craig Smith, a longtime writer for The New York Times and host of the Eye on A.I. podcast, to chat with some of my friends and colleagues about various aspects of the machine learning pipeline, including data preparation, model development, hardware management, and deployment. Today we’re excited to release our first podcast, with my friend and mentor Dave Patterson.
Dave is one of the world’s foremost experts in semiconductor architecture, is helping to lead Google’s TPU project, and is a recipient of the 2017 Turing Award. Our conversation touched on a wide range of topics, including the end of Moore’s law, the recent Cambrian explosion of specialized AI chips, and the increased importance of RISC-V given the impending sale of ARM to NVIDIA. You can check out the podcast below, along with the full interview transcript and a few highlights from our conversation. You can also listen on Spotify, Apple Podcasts, or Overcast.
The Turing Award, which is the Nobel Prize of computer science, was given for our contributions to reduced instruction set computers (RISC), and for the [computer architecture] textbook, which explained how to build computers using a more quantitative approach.
The New York Times actually officially declared when Moore’s Law ended. They had a date, and the date was the announcement of Google’s TPU. And their argument was, well, if Google has to build their own chips rather than just buy them from Intel, then it’s over. If even a software company like Google is building chips, then it’s over.
In 2014, [Google] did a calculation: if an app using deep neural networks running on standard microprocessors were used by a hundred million of their customers for three minutes a day, Google in 2014 would have to double the number of data centers and all of the microprocessors to handle that load. Even Google couldn’t afford it, and it would have taken years to build twice as many data centers and fill them up with microprocessors. So they said, “Oh my God,” and had kind of an emergency Manhattan Project: “We have to build something that’s going to run deep neural networks much more efficiently than standard microprocessors.” So that was the goal. They started the project in 2014: “build a microprocessor just for deep neural networks that’s at least a factor of 10 more effective than standard microprocessors.” They called it the Tensor Processing Unit.
I refer to it as the Cambrian age for computer architecture, where we’ve got dozens of companies with very different bets about the best way to design hardware for machine learning. So it’s a really exciting time if you’re a computer designer; you get all these opportunities to do things. And we don’t really know which is the best way… And we’re going to have this war to settle which one’s the best architecture in the marketplace over the next 5 to 10 years.
Having a portable [training] platform for the Cambrian age, that sounds like you’re in a good spot.
Just exactly a decade ago at Berkeley, for the research that was going on in one of these research labs that Ameet was talking about, the faculty and some of the students decided we needed to build new microprocessors, particularly for parallel computing, because of the slowing down of Moore’s Law. And at that time, in 2010, we thought we were going to need to build accelerators. We thought accelerators were going to be the future, and we needed a vocabulary to build. So even though this lab was sponsored by Intel, and Intel had, at the time, the most popular vocabulary, they owned it and they didn’t want us to use it. In fact, if any academic used that instruction set, they would get a letter telling them to stop using it because Intel owned it. The other popular instruction set architecture at the time was ARM. Same story. They would license it to you, and if you wanted to pay millions of dollars, you could use it. But if you didn’t have a license, you couldn’t use it. So we decided in 2010 to build our own instruction set [RISC-V] … and if there were other academics in the same shoes, we would make it available to them. And so we made it open and free for everybody to use.
In our paper, “Instruction Sets Should Be Free: The Case for RISC-V,” we pointed out that one of the issues with proprietary instruction sets is that the instruction set is tied to the fate of that company. Well, the big deal that’s in the news right now is that Arm, which is the dominant instruction set for the edge, was bought [by NVIDIA]. So a bunch of people who felt like, “Well, RISC-V has many potential technical advantages, but I don’t know if I want to take that business risk to try this new architecture,” suddenly think, “Oh my God! Maybe what we want to do is have an architecture that’s supported by an open foundation, not something that a company owns, because we’re not sure what’s going to happen if it’s owned by a rival.” This was a theoretical benefit of RISC-V, but it’s in the newspapers right now.
What made sense in my career was to do multiple projects, to give myself more times at bat, and so a better chance of hitting a home run. And also, it struck me that in universities, graduate students are there for about five years. So we decided to do five-year projects. We would come up with a vision, work on it, and at the end of five years, we would throw a party. And that would be the end of the project, and we’d move on to the next one… If you’re trying to do difficult things, you should get feedback from other people, honest feedback, to see how well you’re doing. So part of that five-year lab model was to have two offsite retreats a year where we would invite people from industry to give us feedback on our projects… Particularly at universities where everybody has tenure, they’re not used to hearing honest feedback; they don’t have to hear it and they don’t have to react to it. But when people you respect from industry, when two or three of them all say the same thing, that gives you pause.
We’ll be publishing a new podcast here periodically with some of the brightest minds in AI. Watch this space for more updates, and feel free to drop us a line here.