See also this post by @paulinpaloalto, which references these papers:
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
and
Visualizing the Loss Landscape of Neural Nets