Empirical Wonders Reading Group

Caretaker: Pritish Kamath (for questions/comments/suggestions, email pritish [at] mit [dot] edu).

When: Wednesdays (Visit days) @ 1.15pm

Room: 116 (unless otherwise stated)

Suggest papers to be discussed on this Google doc.

For more details and the most recent information, join the #empirical-wonders Slack channel.

Date Presenter Papers Notes
June 19 Pritish Kamath 1. Bad Global Minima Exist and SGD Can Reach Them Google Slides
2. Are All Layers Created Equal?
June 26 Logan Engstrom 1. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks PDF slides
Dimitris Tsipras 2a. Do deep neural networks learn shallow learnable examples first? PDF slides
2b. SGD on Neural Networks Learns Functions of Increasing Complexity
July 3 Matus Telgarsky Theme: Practical differences between Gradient Flow, Gradient Descent and mini-batch SGD PDF slides
1. Efficient Backprop (LeCun, Bottou, Orr, Müller)
2. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
3. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
4. Don't Decay the Learning Rate, Increase the Batch Size
5. Measuring the Effects of Data Parallelism on Neural Network Training
6. Train longer, generalize better: closing the generalization gap in large batch training of neural networks
7. Three Factors Influencing Minima in SGD
8. An Alternative View: When Does SGD Escape Local Minima?
July 10 Daniel Soudry Theme: Neural networks with reduced numerical precision PDF slides
1. Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
2. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
3. The High-Dimensional Geometry of Binary Neural Networks
4. Training Quantized Nets: A Deeper Understanding
5. Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss
6. Scalable Methods for 8-bit Training of Neural Networks
7. Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets
8. A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off
July 24 Yu Bai Theme: Exploration in Deep Reinforcement Learning PDF slides
1. An Empirical and Conceptual Categorization of Value-based Exploration Methods
2. Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
3. Exploration by Random Network Distillation
July 31 Boris Hanin Theme: Heavy-tailed random matrix theory and generalization in deep networks
2.15pm-3.30pm 1. Traditional and Heavy-Tailed Self Regularization in Neural Network Models
2. Heavy-Tailed Universality Predicts Trends in Test Accuracies for Very Large Pre-Trained Deep Neural Networks
3. Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
empirical_wonders.txt · Last modified: 2019/07/29 04:15 by pritishkamath