Isn’t it simply because, with L2 regularization, COST includes the squared Frobenius norms of all the weight matrices? The value of COST is then higher throughout weight space by a scaled sum of all the w^{2}, which can be large-ish. In effect, one is adding a high-dimensional paraboloid centered on the “all weights 0” point.
The cost of the NN that is used with dropout has no such penalty term, so its COST is correspondingly lower.
Note that this doesn’t really matter: COST is just a value that tells us how well we are currently doing relative to other solutions under the same formula, which is why that formula can be chosen rather freely. It also means COST values from two different formulas are not directly comparable.
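To make the offset concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: the function names, the toy numbers, the use of mean squared error as the data term, and the `lam / (2 * m)` scaling of the penalty (a common convention, but not necessarily the one used in your setup). It only shows that the L2 cost is the plain cost plus a non-negative term that depends on the weights alone.

```python
# Hypothetical toy example: names, numbers and scaling are illustrative assumptions.

def data_loss(predictions, targets):
    """Mean squared error, standing in for whatever data term the network uses."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def l2_penalty(weight_matrices, lam, m):
    """lam / (2 * m) times the sum of squared Frobenius norms of all weight matrices."""
    squared_norms = sum(w ** 2 for W in weight_matrices for row in W for w in row)
    return lam / (2 * m) * squared_norms

predictions = [0.9, 0.2, 0.7]
targets = [1.0, 0.0, 1.0]
# Two small weight matrices, given as nested lists.
weights = [[[0.5, -1.0], [2.0, 0.1]], [[1.5, -0.3]]]

# Cost as reported without an L2 term (e.g. the dropout run).
cost_without_l2 = data_loss(predictions, targets)

# Cost as reported with L2 regularization: same data term plus the penalty.
penalty = l2_penalty(weights, lam=0.7, m=len(targets))
cost_with_l2 = cost_without_l2 + penalty

print("cost without L2:", cost_without_l2)
print("L2 penalty:     ", penalty)
print("cost with L2:   ", cost_with_l2)
```

With identical predictions, the two costs differ by exactly the penalty, and the penalty vanishes only when every weight is 0.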