January 29, 2019
2019 is off to a good start for us at Determined AI. Our focus as a company is accelerating deep learning model development for our users, and seeing the deep learning community gather at the RE•WORK Deep Learning Summit further validated that mission. Over 800 deep learning practitioners came together in San Francisco, joining 100 speakers from Google, Facebook, Uber, Netflix, University of Texas, UCSD, Stanford University, and other institutions. As the event made clear, demand for deep learning in practical applications is growing rapidly, and consequently there is a pressing need to develop and evolve the compute infrastructure that supports these applications.
Our CTO and co-founder Neil Conway gave a talk on “Taming the Deep Learning Workflow”. Neil argued that “we’re in the golden age of deep learning, but deep learning infrastructure is still stuck in the dark ages!” Although open source tools like TensorFlow and PyTorch are very useful, they focus on solving the “Problems of One”: one researcher, training one model, using one GPU. For an enterprise looking to use deep learning, many challenges are beyond the scope of a tool like TensorFlow, such as workload containerization, GPU scheduling, hyperparameter tuning, and metrics management.
A select group of companies has the scale and expertise to build internal deep learning platforms that solve these problems, but for the vast majority of organizations this is not a viable option. Practitioners at those organizations are forced to work without adequate infrastructure, which drastically inhibits productivity: they spend most of their time on low-value tasks, such as managing data and launching training jobs by hand. This drudge work can and should be automated.
Deep learning practitioners need an integrated, end-to-end platform for their work, i.e. a system that allows researchers to focus on solving the machine learning and business problems they care about and automates away much of the drudgery they spend time on today. That is precisely what we are building at Determined AI.
The Determined AI platform for deep learning offers a comprehensive set of deep learning optimization technologies, including massively parallel hyperparameter search, deep learning job mapping and scheduling, and automated metadata tracking. In his talk, Neil went into further detail about each of these components and how they integrate with and build upon each other. For more, you can find Neil’s slide deck online.
If you are interested in learning more about the massively parallel hyperparameter search algorithm (Hyperband [ICLR 2017]) used in the Determined AI platform, you can refer to the talk given by our Chief Scientist and co-founder Ameet Talwalkar at the O’Reilly AI conference in 2018. Ameet’s blog post on Scalable Deep Learning has all the details.
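For readers who want a feel for the idea before reading Ameet’s post, here is a minimal Python sketch of successive halving, the subroutine that Hyperband builds on. It is an illustration only, not the implementation used in the Determined AI platform: the `successive_halving` and `toy_loss` functions, the `eta=3` setting, and the learning-rate search space are all assumptions made for the example, and full Hyperband additionally runs several such rounds with different trade-offs between the number of configurations and the budget given to each.

```python
import math
import random


def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configs each round; survivors get eta times more budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget (lower is better).
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    # Hypothetical stand-in for "train for `budget` epochs, return validation loss".
    # The loss is lowest when the learning rate is near 1e-2 and shrinks with budget.
    return (math.log10(config["lr"]) + 2) ** 2 + 1.0 / budget


# Sample 27 learning rates log-uniformly between 1e-4 and 1 (seeded for repeatability).
rng = random.Random(0)
candidates = [{"lr": 10 ** rng.uniform(-4, 0)} for _ in range(27)]

best = successive_halving(candidates, toy_loss)
print(best)
```

With 27 candidates and `eta=3`, the sketch evaluates 27 configurations at budget 1, the best 9 at budget 3, and the best 3 at budget 9, so most of the compute goes to the most promising candidates. The evaluations within a round are independent of one another, which is what makes this style of search straightforward to run in parallel.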
All in all, the RE•WORK Deep Learning Summit 2019 treated us well. We made many new friends and raised awareness of Determined AI. We will continue to reach out to the deep learning community and aim to contribute to the success of deep learning practitioners around the world.