Regularization Programming Assignment / Cost function

The cost function in the Regularization programming assignment (in reg_utils.py) uses the following formula:
logprobs = np.multiply(-np.log(a3),Y) + np.multiply(-np.log(1 - a3), 1 - Y)
cost = 1./m * np.nansum(logprobs)

The cost function we’ve used in previous assignments calculated the cost this way:
cost = (1./m) * (-np.dot(Y,np.log(AL).T) - np.dot(1-Y, np.log(1-AL).T))

Why is there a difference?

In fact, the two formulas give me different cost values.

For example, in part 4 of the Football assignment (non-regularized model), the cost after 20,000 iterations is 0.13851642423234922 with the first formula.
With the second formula, the cost is 29.226965514041307.

The strange thing is that both formulas eventually produce the same test set and train set accuracy. So the actual outcome seems to be the same, except for the cost value.

But I’m really wondering why the cost formula is different.

Could anybody help with this?

PS: I’m aware that the cost function is not the central topic of assignment 2 (“football”) and that it is calculated not in the notebook but in another file, but I’m still very curious… :slight_smile: Many thanks.

Hey @Nicolas_Hirel,
I don’t think the formulae are any different. In fact, I tried out both of them, and they gave exactly the same results. I added the function definition below to the notebook, just above the definition of the model function. You can return either cost1 or cost2, and in both cases you will get exactly the same result, including the cost values printed during training.

import numpy as np  # already imported in the notebook; repeated here so the snippet stands alone

def compute_cost(a3, Y):
    """
    Implement the cost function
    
    Arguments:
    a3 -- post-activation, output of forward propagation
    Y -- "true" labels vector, same shape as a3
    
    Returns:
    cost - value of the cost function
    """
    m = Y.shape[1]
    
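    ### Used in `reg_utils.py`: element-wise products, then a sum over all examples
    logprobs = np.multiply(-np.log(a3),Y) + np.multiply(-np.log(1 - a3), 1 - Y)
    cost1 = 1./m * np.nansum(logprobs)
    
    ### Used in previous assignments: the same sum written as dot products
    ### (np.ravel flattens the (1, 1) result of np.dot)
    cost2 = np.ravel((1./m) * (-np.dot(Y,np.log(a3).T) - np.dot(1-Y, np.log(1-a3).T)))
    
    # Swap in cost1 here to compare; both give the same value
    return cost2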
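
If you want to check this outside the notebook, here is a minimal standalone sketch. The names cost_elementwise and cost_dot, the random inputs and the value m = 200 are just assumptions for illustration, not anything from the assignment. It also shows the one case where the two formulas can disagree: np.nansum skips NaN entries (which appear as 0 * inf when a3 is exactly 0 or 1), while the np.dot version lets the NaN propagate.

```python
import numpy as np

def cost_elementwise(a3, Y):
    # Formula from reg_utils.py: element-wise products, then a NaN-ignoring sum
    m = Y.shape[1]
    logprobs = np.multiply(-np.log(a3), Y) + np.multiply(-np.log(1 - a3), 1 - Y)
    return 1. / m * np.nansum(logprobs)

def cost_dot(a3, Y):
    # Formula from the earlier assignments: the same sum written as dot products
    m = Y.shape[1]
    cost = (1. / m) * (-np.dot(Y, np.log(a3).T) - np.dot(1 - Y, np.log(1 - a3).T))
    return float(np.squeeze(cost))

# Hypothetical data: m examples, activations strictly inside (0, 1)
rng = np.random.default_rng(0)
m = 200
a3 = rng.uniform(0.01, 0.99, size=(1, m))
Y = (rng.uniform(size=(1, m)) > 0.5).astype(float)

c1 = cost_elementwise(a3, Y)
c2 = cost_dot(a3, Y)
print(c1, c2, np.isclose(c1, c2))

# Edge case: an activation of exactly 1.0 produces 0 * inf = NaN in one term
with np.errstate(divide="ignore", invalid="ignore"):
    a_sat = np.array([[1.0, 0.5]])
    Y_sat = np.array([[1.0, 0.0]])
    c1_sat = cost_elementwise(a_sat, Y_sat)  # NaN term is skipped by nansum
    c2_sat = cost_dot(a_sat, Y_sat)          # NaN propagates through np.dot
print(c1_sat, c2_sat)
```

For inputs of shape (1, m), summing the element-wise products and taking the dot product with the transpose are the same operation, which is why the two costs match whenever no activation saturates.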

Let me know if this helps.

Cheers,
Elemento

Elemento

You’re absolutely right and I was wrong. Although the formulas are written differently, they do indeed generate the same result (cost), as mentioned in your post. The reason for the difference was in fact a problem in my code (my version of the formula), which I was able to fix thanks to yours.

Again, thank you so much for your precious help! (second time in 1 week!!).
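
A side note for anyone hitting the same symptom: the thread does not say what the bug actually was, but the two cost values quoted in the question differ by a factor of almost exactly 211. That would be consistent with a dropped 1/m factor (or m read from the wrong axis of Y) if the training set had 211 examples. This is only an inference from the numbers, not something confirmed above. The check is simple arithmetic:

```python
# Values quoted in the original question
cost_first = 0.13851642423234922    # formula from reg_utils.py
cost_second = 29.226965514041307    # the poster's version of the dot-product formula

# If the only difference were a missing 1/m, this ratio would be close to m
ratio = cost_second / cost_first
print(ratio)
```

A ratio that lands on a whole number like this usually points at a scaling problem rather than at the loss formula itself, and it also explains why the accuracy was unaffected: rescaling the reported cost does not change the predictions.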

Hey @Nicolas_Hirel,
I am glad I could help :nerd_face:

Cheers,
Elemento