April 29, 2020
Lack of software infrastructure is a fundamental bottleneck in achieving AI’s immense potential – a fact not lost on tech giants like Google, Facebook, and Microsoft. These elite firms have invested massive resources and expertise to build proprietary, AI-native internal infrastructure, and are already reaping the benefits in the form of transformative AI-powered applications and productive Deep Learning Engineers. For everyone else who doesn’t have access to this infrastructure, building practical applications powered by AI remains prohibitively expensive, time-consuming, and difficult.
We started Determined AI three years ago to bring AI-native software infrastructure to the broader market. As we worked closely with cutting-edge deep learning teams across a variety of industries, a clear narrative emerged: as organizations move from research to production, training deep learning models at scale remains extremely difficult without better infrastructure. This feedback led us to build the Determined Training Platform, which now powers teams of Deep Learning Engineers and large GPU clusters in industries like pharmaceutical drug discovery, AdTech, Industrial IoT, and autonomous vehicles.
Our innovative infrastructure platform is now ready for widespread adoption, and we’re excited to share it with the DL community! Today we are announcing that we have open-sourced our deep learning training platform under the Apache 2.0 license.
We designed the Determined Training Platform to empower Deep Learning Engineers to focus on the task at hand — training high-quality models. To achieve this goal, our platform tightly integrates the features that a DL engineer needs to train models at scale.
The Determined Training Platform already powers hundreds of GPUs at innovative companies. Here is how two of our current customers have made Determined a core part of their DL efforts:
“With AI at the forefront of Recursion’s vision for biopharmaceuticals, we use Determined to manage 100s of on-premise GPUs, as well as dynamically scale to using GPUs on Google Cloud Platform. Using Determined’s native support for distributed training, we were able to reduce the training time for a key computer vision model from 3 days to 3 hours, without changing our model code.”
—Ben Mabey, Interim CTO, Recursion
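Distributed training in Determined is driven by the experiment configuration rather than by model code. As a sketch (the experiment name, hyperparameter values, and metric name below are hypothetical placeholders), scaling a trial from one GPU to sixteen is a one-line change to `slots_per_trial`:

```yaml
# Hypothetical Determined experiment config sketch.
# Scaling out distributed training is a config change,
# not a model-code change.
name: cv-model-example        # placeholder experiment name
hyperparameters:
  learning_rate: 0.001        # placeholder value
  global_batch_size: 512      # placeholder value
searcher:
  name: single                # train a single configuration
  metric: validation_loss     # placeholder metric name
resources:
  slots_per_trial: 16         # number of GPUs; 1 = single-GPU training
```

Determined then handles scheduling the trial onto available GPUs and coordinating the distributed workers behind the scenes.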
“By adopting Determined AI’s software platform, our team of deep learning engineers has been able to rapidly deliver new, advanced, Industrial IoT products to our customers. We’re delivering new AI features 10 times faster than before.”
—Ben Chehebar, Chief Product Officer, Compology
We look forward to building the future of AI-native infrastructure together. We invite you to check us out on GitHub, install the product, and read the documentation. If you have feedback or run into issues, please join us in the Determined Community Slack or post on the mailing list.