In the first of a series of posts, we share some thoughts on papers and blog posts that we’re reading right now that have generated some fiery internal discussion at Determined AI. If you’re interested in learning more about how we’re incorporating these ideas into our product and services, get in touch with us!
- Machine Learning is hard. A recent paper from Luke Oakden-Rayner and Jared Dunnmon indicates that some results applying deep learning to radiology are in question because of a subtle data leakage problem. Ben Recht summarizes in a tweet.
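One common flavor of leakage in medical imaging, whether or not it is the exact issue in the paper above, is splitting a dataset at the image level when each patient contributes multiple images, so the model can memorize patient-specific signal that appears in both splits. A toy sketch (patient and image names are hypothetical):

```python
import random

# Hypothetical toy dataset: 5 patients, 4 X-ray images each.
images = [(f"patient_{p}", f"img_{p}_{i}") for p in range(5) for i in range(4)]

# Naive image-level split: images from the same patient can land in
# both train and test, leaking patient-specific signal.
random.seed(0)
shuffled = images[:]
random.shuffle(shuffled)
train, test = shuffled[:15], shuffled[15:]
overlap = {p for p, _ in train} & {p for p, _ in test}
print("patients in both splits:", sorted(overlap))  # non-empty -> leakage

# Patient-level split: every patient's images go to exactly one split.
patients = sorted({p for p, _ in images})
test_patients = set(patients[:1])
train = [x for x in images if x[0] not in test_patients]
test = [x for x in images if x[0] in test_patients]
assert not ({p for p, _ in train} & {p for p, _ in test})
```

The fix is simply to group by patient before splitting; libraries like scikit-learn expose this directly as grouped cross-validation splitters.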
- RAdam presents an alternative adaptive optimization algorithm that automatically warms up the learning rate. In some early internal tests, we’ve seen nice performance on large distributed vision problems!
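The "automatic warmup" comes from a variance-rectification term that RAdam multiplies into the adaptive step; it starts near zero and rises toward 1 as the second-moment estimate becomes reliable. A minimal sketch of that term, following the formulas in the RAdam paper (the function name is ours):

```python
import math

def radam_rectifier(t, beta2=0.999):
    """RAdam's variance-rectification term r_t at step t.

    Returns None when rho_t <= 4, where the adaptive step is
    undefined and RAdam falls back to SGD with momentum.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2**t / (1.0 - beta2**t)
    if rho_t <= 4.0:
        return None
    return math.sqrt(
        ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )

# The rectifier starts tiny and climbs toward 1, acting as an
# automatic, tuning-free warmup on the adaptive learning rate.
for t in [1, 5, 100, 10000]:
    print(t, radam_rectifier(t))
```

With the default beta2 = 0.999, the first few steps use plain momentum SGD, and the adaptive step is then heavily damped early on, which is exactly the effect hand-tuned warmup schedules aim for.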
- Wei Hu and Simon Du summarize some of their recent work on Neural Tangent Kernels (NTKs). NTKs are an interesting alternative to finite neural networks on standard benchmark tasks; their results are promising but still trail the performance of deep networks.
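For context, the neural tangent kernel of a network $f(x;\theta)$ is the inner product of the model's parameter gradients at two inputs; in the infinite-width limit this kernel stays fixed during training, so the network behaves like kernel regression with:

```latex
\Theta(x, x') = \left\langle \frac{\partial f(x;\theta)}{\partial \theta},\;
                             \frac{\partial f(x';\theta)}{\partial \theta} \right\rangle
```

That fixed-kernel view is what makes NTK predictors tractable to analyze, and also a plausible reason they lag finite deep networks, which get to adapt their features.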
- The GLUE NLP benchmark has been effectively saturated by recent advances in language models (BERT, ELMo, etc.). In response, SuperGLUE offers a new, harder suite of benchmarks.
- Adrian Colyer summarizes recent work from Cynthia Rudin over at The Morning Paper. While we definitely agree that simpler, explainable models are preferable, the best-performing models in NLP, computer vision, and speech (and a growing number of other areas) are deep networks, and therefore black boxes. More generally, there are plenty of reasons to dislike deep learning (non-convexity, ad hoc architectures, massive compute requirements), but people put up with it because, in these domains, no other model class currently competes.