The Determined deep learning training platform now runs natively on Kubernetes, providing a simpler way to manage on-prem and cloud GPU resources.
Learn how to do production-grade MLOps with scalable, automated machine learning training and deployment using Determined, Kubeflow Pipelines, and Seldon Core.
Training deep learning models for NLP tasks typically takes many hours or days on a single GPU. In this post, we leverage Determined’s distributed training capability to cut the training time of a BERT model for SQuAD from hours to minutes, without sacrificing model accuracy.
How to build an end-to-end deep learning pipeline, including data preprocessing with Spark, versioned data storage with Delta Lake, distributed training with Determined, and batch inference with Spark.
A better approach to loading data for deep learning models.
We peek behind the curtain of TensorFlow Datasets to reveal some surprising problems.
We compare cloud and on-prem deep learning infrastructure options across five key criteria.
A geometry-aware approach to optimization for neural architecture search.
See an enterprise deep learning platform in action that comprises Pachyderm for data management, Determined for model development and training, and Seldon Core for deployment.
How you structure your machine learning codebase has a big impact on how easily it scales, including how hard it is to add support for distributed training, hyperparameter tuning, and experiment tracking.