The formula is:
n_{out} = \left\lfloor \dfrac{n_{in} + 2p - f}{s} \right\rfloor + 1
So if you want the result n_{out} = n_{in}, you can solve that for p. This is eighth grade algebra, except that you may have to fudge a little to take the "floor" operation into account: dropping the floor and rounding up at the end gives p = \lceil ((n_{in} - 1)s + f - n_{in}) / 2 \rceil.
Once s > 1, what you will find is that the padding numbers get big pretty fast, so the question is whether this really makes sense to do in practice. For example, suppose that n_{in} = 64, f = 5 and s = 2. In order for n_{out} to end up as 64, you need p = 34. It's kind of crazy to do that much padding. It is interesting to note that TensorFlow's "same" padding only guarantees n_{out} = \lceil n_{in} / s \rceil, which equals n_{in} only when s = 1, so you don't really end up with the same size output when s > 1. Maybe there's a serious reason for that beyond just laziness on their part.
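Here's a quick sketch of the arithmetic above (the function names `conv_out` and `same_pad` are mine, not from any framework):

```python
import math

def conv_out(n_in, f, s, p):
    """Output size of a convolution: floor((n_in + 2p - f) / s) + 1."""
    return (n_in + 2 * p - f) // s + 1

def same_pad(n_in, f, s):
    """Smallest p that makes conv_out(n_in, f, s, p) reach n_in.

    Solve n_in <= (n_in + 2p - f) / s + 1 for p and round up:
    p = ceil(((n_in - 1) * s + f - n_in) / 2).
    """
    return math.ceil(((n_in - 1) * s + f - n_in) / 2)

print(same_pad(64, 5, 2))               # 34 -- the "crazy" padding from the example
print(conv_out(64, 5, 2, p=34))         # 64 -- output size is preserved
print(same_pad(64, 5, 1))               # 2  -- the familiar (f - 1) / 2 when s = 1
```

With s = 1 the formula collapses to the usual p = (f − 1)/2, which is the value TensorFlow-style "same" padding effectively uses regardless of stride.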