In the first of a series of posts, we share some thoughts on papers and blog posts that we’re reading right now that have generated some fiery internal discussion at Determined AI. If you’re interested in learning more about how we’re incorporating these ideas into our product and services, get in touch with us!
- Machine Learning is hard. A recent paper from Luke Oakden-Rayner and Jared Dunnmon indicates that some results applying deep learning to radiology are in question because of a subtle data leakage problem. Ben Recht summarizes in a tweet.
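One common flavor of leakage in medical imaging, whether or not it is the exact issue in the paper above, is splitting a dataset at the image level when each patient contributes multiple images, so the model can memorize patient-specific signal that appears in both splits. A toy sketch (patient and image names are hypothetical):

```python
import random

# Hypothetical toy dataset: 5 patients, 4 X-ray images each.
images = [(f"patient_{p}", f"img_{p}_{i}") for p in range(5) for i in range(4)]

# Naive image-level split: images from the same patient can land in
# both train and test, leaking patient-specific signal.
random.seed(0)
shuffled = images[:]
random.shuffle(shuffled)
train, test = shuffled[:15], shuffled[15:]
overlap = {p for p, _ in train} & {p for p, _ in test}
print("patients in both splits:", sorted(overlap))  # non-empty -> leakage

# Patient-level split: every patient's images go to exactly one split.
patients = sorted({p for p, _ in images})
test_patients = set(patients[:1])
train = [x for x in images if x[0] not in test_patients]
test = [x for x in images if x[0] in test_patients]
assert not ({p for p, _ in train} & {p for p, _ in test})
```

The fix is simply to group by patient before splitting; libraries like scikit-learn expose this directly as grouped cross-validation splitters.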
- RAdam presents an alternative adaptive optimization algorithm that automatically warms up the learning rate. In some early internal tests, we’ve seen nice performance on large distributed vision problems!
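The "automatic warmup" comes from a variance-rectification term that RAdam multiplies into the adaptive step; it starts near zero and rises toward 1 as the second-moment estimate becomes reliable. A minimal sketch of that term, following the formulas in the RAdam paper (the function name is ours):

```python
import math

def radam_rectifier(t, beta2=0.999):
    """RAdam's variance-rectification term r_t at step t.

    Returns None when rho_t <= 4, where the adaptive step is
    undefined and RAdam falls back to SGD with momentum.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2**t / (1.0 - beta2**t)
    if rho_t <= 4.0:
        return None
    return math.sqrt(
        ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )

# The rectifier starts tiny and climbs toward 1, acting as an
# automatic, tuning-free warmup on the adaptive learning rate.
for t in [1, 5, 100, 10000]:
    print(t, radam_rectifier(t))
```

With the default beta2 = 0.999, the first few steps use plain momentum SGD, and the adaptive step is then heavily damped early on, which is exactly the effect hand-tuned warmup schedules aim for.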
- Wei Hu and Simon Du summarize some of their recent work on Neural Tangent Kernels (NTKs). NTKs are an interesting alternative to finite neural networks on standard benchmark tasks; their results are promising but still trail the performance of deep networks.
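For context, the neural tangent kernel of a network $f(x;\theta)$ is the inner product of the model's parameter gradients at two inputs; in the infinite-width limit this kernel stays fixed during training, so the network behaves like kernel regression with:

```latex
\Theta(x, x') = \left\langle \frac{\partial f(x;\theta)}{\partial \theta},\;
                             \frac{\partial f(x';\theta)}{\partial \theta} \right\rangle
```

That fixed-kernel view is what makes NTK predictors tractable to analyze, and also a plausible reason they lag finite deep networks, which get to adapt their features.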
- The GLUE NLP benchmark has been effectively saturated by recent advances in language models (BERT, ELMo, etc.). In response, SuperGLUE offers a new, harder suite of benchmarks.
- Adrian Colyer summarizes recent work from Cynthia Rudin over at The Morning Paper. While we definitely agree that simpler, explainable models are preferable, the best-performing models in NLP, computer vision, and speech (and a growing number of other areas) are deep networks, and therefore black boxes. More generally, there are plenty of reasons to dislike deep learning (non-convexity, ad hoc architectures, massive compute requirements), but people put up with it because, in these domains, no other model class currently competes.