Sharpness-Aware Minimization for Efficiently Improving Generalization (PyTorch)

dataset : Fashion MNIST
model : simple CNN
SGD : epoch 100, 200
SAM : epoch 100

reference : https://github.com/davda54/sam
paper : https://arxiv.org/pdf/2010.01412v3.pdf
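
For readers new to SAM, the two-pass update described in the paper can be sketched in plain Python on a toy problem. This is an illustrative sketch only, not the code used in this repo (which relies on PyTorch and the referenced davda54/sam optimizer). The quadratic loss, its curvatures, and the names `loss_and_grad`, `sam_step`, and `train` are all assumptions made for the example, as are the `lr` and `rho` values.

```python
import math


def loss_and_grad(w):
    """Toy quadratic loss 0.5 * (a0*w0^2 + a1*w1^2) and its gradient.

    The curvatures below are hypothetical and stand in for a real model's loss.
    """
    a = (10.0, 1.0)  # steep along w0, flat along w1
    loss = 0.5 * sum(ai * wi * wi for ai, wi in zip(a, w))
    grad = [ai * wi for ai, wi in zip(a, w)]
    return loss, grad


def sam_step(w, lr=0.05, rho=0.05):
    """One SAM update: ascend to a nearby worst-case point, then descend."""
    # First pass: gradient at the current weights.
    _, g = loss_and_grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero

    # Ascent step: move a distance rho along the normalized gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]

    # Second pass: gradient at the perturbed weights.
    _, g_adv = loss_and_grad(w_adv)

    # Descent step: update the ORIGINAL weights with the perturbed gradient.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def train(w0, steps=200, lr=0.05, rho=0.05):
    """Run several SAM steps from the starting point w0."""
    w = list(w0)
    for _ in range(steps):
        w = sam_step(w, lr=lr, rho=rho)
    return w


if __name__ == "__main__":
    start = [1.0, 1.0]
    end = train(start)
    print("start loss:", loss_and_grad(start)[0])
    print("final loss:", loss_and_grad(end)[0])
```

Each SAM step needs two gradient evaluations, so one SAM epoch costs roughly as much as two SGD epochs. That is presumably why SAM at 100 epochs is compared against SGD at both 100 and 200 epochs.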