Reading List about Bayesian Deep Learning

This is a reading collection on Bayesian deep learning (BDL) and deep Bayesian learning (DBL). (Last updated: 2020/10.)

Fundamental Books

  • PRML Pattern Recognition and Machine Learning, Bishop 2006
  • Machine Learning: A Probabilistic Perspective, Murphy 2012
  • Bayesian Learning for Neural Networks, Neal 1996
  • Deep learning, Goodfellow 2016
  • PGM Probabilistic Graphical Models: Principles and Techniques, Koller and Friedman 2009

Core

Core research areas:

  • Approximate inference for Bayesian deep learning
  • Representation learning
  • Deep generative models
  • MCMC methods


1 Expectation Maximization (EM) and Variational Inference (VI):

  • PRML Chapter 9, 10.1-10.6
  • Variational Inference: A Review for Statisticians, Blei et al. 2016
  • An Introduction to Variational Methods for Graphical Models, Jordan et al. 1999

Amortized Variational Inference and the Reparameterization Trick (a minimal sketch follows this list):

  • Auto-Encoding Variational Bayes, Kingma and Welling 2013
  • Stochastic Backpropagation and Approximate Inference in Deep Generative Models, Rezende et al. 2014
  • The Generalized Reparameterization Gradient, Ruiz et al. 2016
  • Inference Suboptimality in Variational Autoencoders, Cremer et al. 2018
  • Forward Amortized Inference for Likelihood-Free Variational Marginalization, Ambrogioni et al. 2018
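
To make the reparameterization trick concrete, here is a minimal NumPy sketch, assuming a diagonal-Gaussian approximate posterior; the values of `mu` and `log_var` are hard-coded stand-ins for the outputs of a learned encoder network.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, diag(exp(log_var))) as a deterministic transform of noise.

    Writing z = mu + sigma * eps with eps ~ N(0, I) moves the randomness into a
    parameter-free noise source, so gradients w.r.t. (mu, log_var) pass through
    the transform -- the reparameterization trick of Kingma and Welling 2013.
    """
    eps = rng.standard_normal(mu.shape)        # eps ~ N(0, I)
    return mu + np.exp(0.5 * log_var) * eps    # sigma = exp(log_var / 2)

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])        # stand-in for a learned encoder mean
log_var = np.array([0.0, -2.0])   # stand-in for a learned encoder log-variance
z = reparameterize(mu, log_var, rng)
```

This same transform is what makes the single-sample ELBO estimators of Kingma and Welling 2013 and Rezende et al. 2014 low-variance enough to train with plain SGD.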

1.1 Hierarchical Variational Methods:

  • An Auxiliary Variational Method, Agakov and Barber 2004
  • Hierarchical Variational Models, Ranganath et al. 2015
  • Auxiliary Deep Generative Models, Maaløe et al. 2016
  • Markov Chain Monte Carlo and Variational Inference: Bridging the Gap, Salimans et al. 2014
  • Variational Inference with Normalizing Flows, Rezende and Mohamed 2015
  • The Variational Gaussian Process, Tran et al. 2015

1.2 Expectation Propagation (EP):

  • PRML Chapter 10.7
  • PGM Chapter 11.4
  • Proofs of Alpha Divergence Properties (lecture note), Cevher 2008
  • Divergence Measures and Message Passing, Minka 2005
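
For reference, the alpha-divergence family studied in the last two entries can be written as below, in the parameterization used by Minka 2005; the limits α → 0 and α → 1 recover KL(q||p) and KL(p||q), the divergences minimized by VI and EP respectively.

```latex
% Alpha-divergence (Minka 2005 parameterization). Varying alpha
% interpolates between the two KL divergences.
\[
  D_\alpha(p \,\|\, q)
  = \frac{1}{\alpha(1-\alpha)}
    \int \Big( \alpha\, p(x) + (1-\alpha)\, q(x)
             - p(x)^{\alpha}\, q(x)^{1-\alpha} \Big)\, dx
\]
```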

1.3 Implicit Inference:

  • Adversarially Learned Inference, Dumoulin et al. 2016
  • Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks, Mescheder et al. 2017
  • Variational Inference using Implicit Distributions, Huszar 2017

2 Deep Generative Models (DGMs)

Deep State-Space Models (for time series)

  • A Recurrent Latent Variable Model for Sequential Data, Chung et al. 2015
  • Deep Kalman Filters, Krishnan et al. 2015
  • Filtering Variational Objectives, Maddison et al. 2017
  • Variational Sequential Monte Carlo, Naesseth et al. 2017
  • Auto-Encoding Sequential Monte Carlo, Le et al. 2017
  • Variational Bi-LSTMs, Shabanian et al. 2017
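
These models share a common probabilistic backbone: a latent Markov chain with neural-network transition and emission densities, as in Deep Kalman Filters. For reference:

```latex
% Generative factorization of a deep state-space model: neural networks
% parameterize the transition p(z_t | z_{t-1}) and emission p(x_t | z_t).
\[
  p(x_{1:T}, z_{1:T})
  = \prod_{t=1}^{T} p_\theta(z_t \mid z_{t-1})\, p_\theta(x_t \mid z_t),
  \qquad p_\theta(z_1 \mid z_0) := p_\theta(z_1)
\]
```

The filtering/SMC papers above replace the single-sample ELBO with particle-based objectives that respect this temporal structure.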

2.1 Variational Autoencoders

  • Importance Weighted Autoencoders, Burda et al. 2015
  • Reinterpreting Importance-Weighted Autoencoders, Cremer et al. 2017
  • Sequentialized Sampling Importance Resampling and Scalable IWAE, Huang and Courville 2018
  • Tighter Variational Bounds are Not Necessarily Better, Rainforth et al. 2018
  • On Nesting Monte Carlo Estimators, Rainforth et al. 2018
  • Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference, Nowozin 2018
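
Several of the papers above analyze the importance-weighted bound of Burda et al. 2015, reproduced here for reference; K = 1 recovers the standard ELBO, and the bound is non-decreasing in K.

```latex
% K-sample importance-weighted lower bound on log p(x) (Burda et al. 2015).
\[
  \mathcal{L}_K(x)
  = \mathbb{E}_{z_1,\dots,z_K \sim q(z \mid x)}
    \left[ \log \frac{1}{K} \sum_{k=1}^{K}
      \frac{p(x, z_k)}{q(z_k \mid x)} \right]
  \le \log p(x)
\]
```

Rainforth et al. 2018 show that tightening this bound by increasing K can degrade the signal-to-noise ratio of the inference-network gradients, hence "tighter is not necessarily better".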

2.2 Normalizing Flows

  • Variational Inference with Normalizing Flows, Rezende and Mohamed 2015
  • Improving Variational Inference with Inverse Autoregressive Flow, Kingma et al. 2016
  • Improving Variational Auto-Encoders using Householder Flow, Tomczak and Welling 2016
  • Improving Variational Auto-Encoders using Convex Combination Linear Inverse Autoregressive Flow, Tomczak and Welling 2017
  • Sylvester Normalizing Flows for Variational Inference, van den Berg et al. 2018
  • Neural Autoregressive Flows, Huang et al. 2018
  • Density Estimation using Real NVP, Dinh et al. 2016
  • Glow: Generative Flow with Invertible 1x1 Convolutions, Kingma and Dhariwal 2018
  • Neural Ordinary Differential Equations, Chen et al. 2018
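
All of these flows instantiate the same change-of-variables recipe: push a simple base sample through an invertible map and correct the density by the log-determinant of the Jacobian. Here is a minimal sketch of one planar-flow step from Rezende and Mohamed 2015, with `u`, `w`, `b` as stand-ins for learned parameters (the constraint u·w ≥ −1 that guarantees invertibility is not enforced here):

```python
import numpy as np

def planar_flow_step(z, u, w, b):
    """One planar flow step f(z) = z + u * tanh(w.z + b), Rezende & Mohamed 2015.

    Returns the transformed sample and log|det J|, so the flowed density is
    log q(f(z)) = log q(z) - log|det J|.
    """
    a = np.tanh(w @ z + b)                    # scalar pre-activation
    f_z = z + u * a                           # push the sample through the map
    psi = (1.0 - a ** 2) * w                  # tanh'(w.z + b) * w
    log_det = np.log(np.abs(1.0 + u @ psi))   # closed-form Jacobian determinant
    return f_z, log_det

rng = np.random.default_rng(0)
z0 = rng.standard_normal(2)                   # base sample from N(0, I)
z1, log_det = planar_flow_step(z0,
                               u=np.array([0.3, -0.2]),
                               w=np.array([1.0, 0.5]),
                               b=0.1)
```

Stacking many such steps (or the autoregressive and convolutional variants listed above) yields a flexible posterior whose density remains exactly computable.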

2.3 Transfer Learning and Semi-Supervised Learning

  • Semi-Supervised Learning with Deep Generative Models, Kingma et al. 2014
  • Towards a Neural Statistician, Edwards and Storkey 2016
  • One-Shot Generalization in Deep Generative Models, Rezende et al. 2016
  • Uncertainty in Multitask Transfer Learning, Lacoste et al. 2018
  • Conditional Neural Processes, Garnelo et al. 2018
  • Neural Processes, Garnelo et al. 2018

2.4 Representation Learning

  • Ladder Variational Autoencoders, Sønderby et al. 2016
  • PixelVAE: A Latent Variable Model for Natural Images, Gulrajani et al. 2016
  • Variational Lossy Autoencoder, Chen et al. 2016
  • Generating Sentences from a Continuous Space, Bowman et al. 2015
  • Generating Sentences by Editing Prototypes, Guu et al. 2017
  • The Variational Fair Autoencoder, Louizos et al. 2015
  • VAE with a VampPrior, Tomczak and Welling 2017
  • Hierarchical VampPrior Variational Fair Auto-Encoder, Botros and Tomczak 2018
  • Neural Relational Inference for Interacting Systems, Kipf et al. 2018
  • Hyperspherical Variational Auto-Encoders, Davidson et al. 2018
  • Neural Scene Representation and Rendering, Eslami et al. 2018

2.5 Bayesian Compression

  • Bayesian Compression for Deep Learning, Louizos et al. 2017
  • Improved Bayesian Compression, Federici et al. 2017
  • Variational Dropout Sparsifies Deep Neural Networks, Molchanov et al. 2017
  • Learning Sparse Neural Networks through L0 Regularization, Louizos et al. 2018
  • Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors, Ghosh et al. 2018

Bayesian Neural Networks

MCMC Approaches

  • Bayesian Learning via Stochastic Gradient Langevin Dynamics, Welling and Teh 2011
  • Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring, Ahn et al. 2012
  • Stochastic Gradient Hamiltonian Monte Carlo, Chen et al. 2014
  • Bayesian Sampling Using Stochastic Gradient Thermostats, Ding et al. 2014
  • Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks, Li et al. 2015
  • Entropy-SGD: Biasing Gradient Descent Into Wide Valleys, Chaudhari et al. 2017
  • Adversarial Distillation of Bayesian Neural Network Posteriors, Wang et al. 2018

Deep Neural Networks = Gaussian Processes

  • Priors for Infinite Networks, Neal 1994
  • Bayesian Learning for Neural Networks, Neal 1995
  • Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, Gal and Ghahramani 2015
  • Avoiding Pathologies in Very Deep Networks, Duvenaud et al. 2014
  • Deep Neural Networks as Gaussian Processes, Lee et al. 2018

SGD / Approximate Inference / PAC-Bayes

  • PAC-Bayesian Theory Meets Bayesian Inference, Germain et al. 2016
  • Stochastic Gradient Descent as Approximate Bayesian Inference, Mandt et al. 2017
  • Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks, Chaudhari and Soatto 2017
  • Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints, Mou et al. 2017
  • Entropy-SGD Optimizes the Prior of a PAC-Bayes Bound: Generalization properties of Entropy-SGD and data-dependent priors, Dziugaite and Roy 2017
  • A Bayesian Perspective on Generalization and Stochastic Gradient Descent, Smith and Le 2018
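
Much of this section revolves around one update rule: add properly scaled Gaussian noise to a stochastic gradient step and SGD becomes a posterior sampler. A minimal sketch of one SGLD update from Welling and Teh 2011, where `grad_log_post` is a hypothetical stochastic estimate of the gradient of the log posterior (log prior plus the minibatch log-likelihood rescaled by N/n):

```python
import numpy as np

def sgld_step(theta, grad_log_post, step_size, rng):
    """One Stochastic Gradient Langevin Dynamics update (Welling & Teh 2011).

    theta' = theta + (eps / 2) * grad log p(theta | data) + N(0, eps I).
    With a stochastic minibatch gradient and a decaying step size, the
    iterates transition from SGD-like optimization to posterior sampling.
    """
    noise = rng.standard_normal(theta.shape) * np.sqrt(step_size)
    return theta + 0.5 * step_size * grad_log_post(theta) + noise

# Toy usage: the score of a standard normal posterior is grad log p = -theta.
rng = np.random.default_rng(0)
theta = np.zeros(1)
for _ in range(1000):
    theta = sgld_step(theta, lambda t: -t, step_size=0.01, rng=rng)
```

The later SG-MCMC papers above (Fisher scoring, SGHMC, thermostats, preconditioning) refine this update to better handle minibatch gradient noise and badly scaled posteriors.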

Inference Approaches