Yes, as Tom says, the usual cost functions for linear regression (MSE) and logistic regression (BCE) are convex in those cases. But note that once you use those same cost functions with multi-layer neural networks (multi-layer perceptrons), they are no longer convex. The point is that the cost function is the complete function that takes all the parameters of the network (the weights and bias values at every layer) as input and maps them to the cost, based on the training data. We are also working in very high dimensions, since a typical neural network has thousands or even millions of parameters. The resulting cost surfaces are very complex and essentially impossible to visualize with human brains that evolved to perceive only 3 spatial dimensions.
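Here is a minimal sketch (just NumPy with made-up numbers, not anything from the course) of one way to see the non-convexity. Because you can swap two hidden units without changing the function the network computes, two different parameter vectors give the exact same cost; if the cost were convex in the parameters, the midpoint between them could not have a higher cost, but it does:

```python
import numpy as np

# Toy 1-2-1 tanh network with MSE cost, written as a function of ALL parameters.
def mlp_cost(params, X, y):
    W1, b1, W2, b2 = params
    hidden = np.tanh(X @ W1 + b1)     # hidden layer (2 units)
    preds = hidden @ W2 + b2          # linear output layer
    return np.mean((preds - y) ** 2)

# Arbitrary parameter setting theta_a (values made up for illustration)
W1 = np.array([[1.5, -0.7]]); b1 = np.array([0.2, -0.3])
W2 = np.array([[0.8], [-1.1]]); b2 = np.array([0.1])
theta_a = (W1, b1, W2, b2)

# Generate the training targets from theta_a itself, so its cost is exactly 0.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = np.tanh(X @ W1 + b1) @ W2 + b2

# theta_b: the same network with the two hidden units swapped.
# It computes the identical function (cost 0), but it is a different
# point in parameter space.
theta_b = (W1[:, ::-1], b1[::-1], W2[::-1, :], b2)

# Midpoint of the straight line between theta_a and theta_b.
theta_mid = tuple((a + b) / 2.0 for a, b in zip(theta_a, theta_b))

print(mlp_cost(theta_a, X, y))    # 0.0
print(mlp_cost(theta_b, X, y))    # 0.0
print(mlp_cost(theta_mid, X, y))  # > 0, which a convex cost could never do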
There is a lot of math going on here; these questions have been studied extensively and, as Tom also pointed out, the experts have figured out how to make gradient descent work well in a lot of cases. If you take DLS, for example, you’ll learn about more sophisticated optimization techniques such as Adam and RMSprop that help you get efficient convergence. Here’s a thread which discusses the general point in more detail and also links to some other information about this, including a paper by Yann LeCun’s group showing that for sufficiently complex networks, there are good solutions that gradient descent has a good probability of finding.
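Just to give a flavor of what those techniques look like, here is a minimal sketch of the Adam update rule with its standard default hyperparameters, applied to a simple quadratic. The function and variable names here are mine for illustration, not from any particular assignment:

```python
import numpy as np

def adam_minimize(grad_fn, w, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    m = np.zeros_like(w)              # 1st-moment (mean) estimate of the gradients
    v = np.zeros_like(w)              # 2nd-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # exponentially weighted mean
        v = beta2 * v + (1 - beta2) * g ** 2     # exponentially weighted square
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter step size
    return w

# Example: minimize f(w) = (w0 - 3)^2 + 10 * (w1 + 2)^2,
# whose gradient is [2 * (w0 - 3), 20 * (w1 + 2)].
grad = lambda w: np.array([2 * (w[0] - 3), 20 * (w[1] + 2)])
print(adam_minimize(grad, np.zeros(2)))   # approaches [3, -2]
```

The point of the per-parameter scaling by the second-moment estimate is that steep directions get smaller steps and shallow directions get larger ones, which is part of why these methods converge more efficiently than plain gradient descent on complicated cost surfaces.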