A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation - Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher