In this post, we discuss the missing key to fully leveraging your hardware investment: specialized software that understands the unique properties of deep learning workloads.
To maximize the value of your deep learning hardware, you’ll need to invest in software infrastructure. Setting up a cluster manager is an essential first step in this process, but it’s not the end of the story.
Reproducible machine learning is hard, particularly when training deep learning models. We review common sources of DL non-determinism and how to address them.
Machine learning today resembles the dawn of aviation.
I recently joined the O’Reilly Data Show podcast to talk about various challenges associated with developing deep learning at scale.