Cost function convexity question

In one of the first lectures on logistic regression, Prof. Andrew Ng says that the cross-entropy cost function is convex (unlike the squared-error cost applied to logistic regression).

Does that still hold true when we get to multi-layer feed-forward networks (with ReLU activations in the hidden layers) in Week 4? I.e., does the cross-entropy cost stay convex with respect to all of the W^{[l]} and b^{[l]}?

If the NN has a hidden layer, then the composition of non-linear activation functions (e.g. ReLU, sigmoid, etc.) across layers makes the cost non-convex in the parameters. One way to see this: permuting the hidden units (swapping rows of W^{[1]} together with the matching columns of W^{[2]}) gives a different parameter setting with exactly the same cost, so any minimum comes in many copies; a convex function would also have to be minimized everywhere on the line segment between two such copies, which is not the case here.

If you’re just doing plain linear or logistic regression, without a hidden layer, then both of those cost functions (squared error for linear regression, cross-entropy for logistic regression) are convex in the parameters.
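You can also check this numerically. Below is a minimal NumPy sketch (not from the course; all names and the toy data are made up for illustration) that tests the midpoint-convexity inequality J((θ₁+θ₂)/2) ≤ (J(θ₁)+J(θ₂))/2 at random parameter pairs, for logistic regression and for a one-hidden-layer ReLU network with a sigmoid output. If a cost is convex, the inequality should never be violated.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))          # 2 features, 200 examples
y = (X[0] + X[1] > 0).astype(float)    # simple toy labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(a, y):
    eps = 1e-12                        # avoid log(0)
    return -np.mean(y * np.log(a + eps) + (1 - y) * np.log(1 - a + eps))

def logreg_cost(theta):
    # Logistic regression: w (2,) and scalar b packed into theta (3,)
    w, b = theta[:2], theta[2]
    return cross_entropy(sigmoid(w @ X + b), y)

def relu_net_cost(theta):
    # One hidden layer with 3 ReLU units, sigmoid output; theta has 13 entries
    W1 = theta[:6].reshape(3, 2); b1 = theta[6:9].reshape(3, 1)
    W2 = theta[9:12].reshape(1, 3); b2 = theta[12]
    A1 = np.maximum(0, W1 @ X + b1)
    return cross_entropy(sigmoid(W2 @ A1 + b2).ravel(), y)

def count_convexity_violations(cost, dim, trials=2000):
    # Count how often J(midpoint) exceeds the average of the endpoint costs
    violations = 0
    for _ in range(trials):
        t1, t2 = rng.normal(size=dim), rng.normal(size=dim)
        if cost(0.5 * (t1 + t2)) > 0.5 * (cost(t1) + cost(t2)) + 1e-9:
            violations += 1
    return violations

print("logistic regression violations:", count_convexity_violations(logreg_cost, 3))
print("1-hidden-layer ReLU net violations:", count_convexity_violations(relu_net_cost, 13))
```

On a run like this you should see zero violations for logistic regression (consistent with its cost being convex) and a nonzero count for the hidden-layer network, illustrating the non-convexity described above.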
