When creating a post, please add:
Week # must be added in the tags option of the post. 
Link to the classroom item you are referring to:
Video created by DeepLearning.AI, Stanford University for the course "Supervised Machine Learning: Regression and Classification". This week, you'll learn the other type of supervised learning, classification. You'll learn how to predict ...
Description (include relevant info but please do not post solution code or your entire notebook) 
 
Why is the cost function (the log term) not plugged in while calculating gradient descent for (w, b)?
             
            
               
               
Hello @shouryaangrish,
Can you be a little more clear in your question?
             
            
               
               
TMosh, May 3, 2024, 7:45am (#4)
              Gradient descent uses the gradients of the cost function. It doesn’t directly use the cost function itself - only the gradients.
The gradients are found from the equations for the partial derivative of the cost function.
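To make that concrete, here is a minimal sketch of one gradient-descent step for logistic regression, assuming NumPy and made-up variable names (illustrative only, not the course's notebook code). Only the partial derivatives dj_dw and dj_db appear in the update; the cost value itself would only be computed separately, for example to check that it keeps decreasing.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_gradients(X, y, w, b):
    # Partial derivatives of the log-loss cost with respect to w and b
    m = X.shape[0]
    f = sigmoid(X @ w + b)       # model predictions f_wb(x)
    err = f - y                  # (f_wb(x) - y), the common factor in both gradients
    dj_dw = (X.T @ err) / m      # dJ/dw, one entry per feature
    dj_db = np.sum(err) / m      # dJ/db
    return dj_dw, dj_db

def gradient_descent_step(X, y, w, b, alpha):
    # One update: uses only the gradients, never the cost value itself
    dj_dw, dj_db = compute_gradients(X, y, w, b)
    return w - alpha * dj_dw, b - alpha * dj_db
```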
             
            
               
               
@shouryaangrish
I am sorry, I didn't fully understand your question.
Still, logistic regression does use gradient descent as its main optimization method to find the best model parameters. Log loss is the cost function used for logistic regression; because it is related to maximum likelihood estimation, it works especially well for classification problems.
Also, gradient descent is what drives the cost function as low as possible. By looking at the gradient of the log-loss function, we can find the direction and size of the changes that need to be made to the model parameters to reach the smallest error.
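For reference, the log-loss cost and the gradients obtained by differentiating it are (written out here roughly in the notation the course uses):

$$
J(\vec{w},b) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)}\log f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right) + \left(1-y^{(i)}\right)\log\!\left(1-f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right)\right) \right]
$$

$$
\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right)-y^{(i)}\right)x_j^{(i)},
\qquad
\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right)-y^{(i)}\right)
$$

where $f_{\vec{w},b}(\vec{x}) = \frac{1}{1+e^{-(\vec{w}\cdot\vec{x}+b)}}$ is the sigmoid model, and gradient descent repeatedly applies $w_j := w_j - \alpha\,\partial J/\partial w_j$ and $b := b - \alpha\,\partial J/\partial b$.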
             
            
               
               
Sorry, I missed your replies.
Yeah, I think I got it; sorry for the basic question.
So when the cost function is calculated for logistic regression as
J(w,b) = (1/m) * sum( Loss(f_wb(x), y) ),
it is not put in as the loss term for gradient descent for logistic regression, which is calculated as
w = w - alpha * d/dw( J(w,b) ).
This J(w,b) is not the same as the J(w,b) above.
Rather, it just gets the squared-error cost function, 1/(2m) * sum( (f_wb(x) - y)^2 ),
instead of the loss for logistic regression,
-(1/m) * sum( y*log(f_wb(x)) + (1 - y)*log(1 - f_wb(x)) ).
            
So, exactly my point: the cost function for logistic regression is
-(1/m) * sum( y*log(f_wb(x)) + (1 - y)*log(1 - f_wb(x)) ),
whereas when we do the gradient we take it as (f_wb(x) - y)^2, basically the squared-error cost function.
The normal cost function and the cost function used for gradient descent are different.
              
TMosh, May 8, 2024, 7:54am (#9)
              You are correct in that the linear regression and logistic regression cost functions are different.
When you compute the partial derivatives (i.e. the gradients), they are also different.
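A sketch of why the two updates can nonetheless look alike (standard derivations, not quoted from the course materials): differentiating each cost function gives

$$
\text{linear regression, squared-error cost, } f_{\vec{w},b}(\vec{x})=\vec{w}\cdot\vec{x}+b:\qquad
\frac{\partial J}{\partial w_j}=\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right)-y^{(i)}\right)x_j^{(i)}
$$

$$
\text{logistic regression, log-loss cost, } f_{\vec{w},b}(\vec{x})=\frac{1}{1+e^{-(\vec{w}\cdot\vec{x}+b)}}:\qquad
\frac{\partial J}{\partial w_j}=\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}\!\left(\vec{x}^{(i)}\right)-y^{(i)}\right)x_j^{(i)}
$$

The two expressions have the same form, but $f$ is different, so the gradients (and the cost functions they come from) are different. The $(f_{\vec{w},b}(\vec{x}) - y)\,x_j$ term is what the derivative of the log loss simplifies to; it is not the squared-error cost being substituted in.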
             
            
               
               
Yup, and the cost function for logistic regression is different from the cost function for gradient descent when we compute the partial derivatives.