Deterministically perform backprop on MNIST

3 minute read


I took some time to figure out how to make backprop produce reproducible results on the MNIST dataset. It turns out that properly setting random seeds is not enough: one also needs to control the random shuffle of the training data in mini-batch gradient descent. Below I first show some code snippets, and then provide the code for the whole classification pipeline.
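A minimal sketch of the idea (assuming NumPy; the function names here are illustrative, not from the original post): seed every RNG in use, and derive the per-epoch shuffle of training indices from a seeded generator rather than from global, uncontrolled state. If a framework such as PyTorch or TensorFlow is involved, its own seed (e.g. `torch.manual_seed`) must be set as well.

```python
import random
import numpy as np

def set_seeds(seed=0):
    # Seed the Python and NumPy global RNGs.
    # Framework seeds (e.g. torch.manual_seed) would also go here.
    random.seed(seed)
    np.random.seed(seed)

def shuffled_indices(n, epoch, seed=0):
    # Seeding alone is not enough: the mini-batch shuffle must itself
    # be deterministic. Derive it from a dedicated, seeded generator
    # so the order of training examples is identical across runs.
    rng = np.random.RandomState(seed + epoch)
    idx = np.arange(n)
    rng.shuffle(idx)
    return idx

set_seeds(0)
a = shuffled_indices(10, epoch=0)
set_seeds(0)
b = shuffled_indices(10, epoch=0)
```

With both pieces in place, two runs visit the same mini-batches in the same order, so the gradient updates, and hence the final weights, match exactly.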