Exercise 4 - utils2 function get_batches is that right?

Stefanus_Yudi_Irwan · September 16, 2023, 8:01am

As far as I know, splitting the whole training set into batches is a standard technique for training a model, and the total amount of data stays constant. For example, if I have 1000 samples and set batch_size = 10, I will have 100 batches, because 1000 / 10 = 100. Once the model has learned from all 100 batches, we call that 1 epoch.
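To make that arithmetic concrete, here is a minimal sketch. The `samples` array and the `iterate_batches` helper are made up for illustration and are not part of utils2.

```python
import numpy as np

def iterate_batches(samples, batch_size):
    """Yield consecutive, non-overlapping slices of `samples`."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Hypothetical dataset: 1000 distinct samples.
samples = np.arange(1000)

# One pass over all batches is one epoch.
batches = list(iterate_batches(samples, batch_size=10))
num_batches = len(batches)  # 1000 / 10 batches
```

Every sample appears exactly once per epoch, which is the behaviour I would expect from a batching function.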

I don't understand what the get_batches function is doing here.

def get_batches(data, word2Ind, V, C, batch_size):
    batch_x = []
    batch_y = []
    for x, y in get_vectors(data, word2Ind, V, C):
        while len(batch_x) < batch_size:
            batch_x.append(x)
            batch_y.append(y)
        else:
            yield np.array(batch_x).T, np.array(batch_y).T
            batch_x = []
            batch_y = []

In the line while len(batch_x) < batch_size, the loop appends the exact same x and y to batch_x and batch_y until len(batch_x) and len(batch_y) equal the batch size. If the purpose of this function is to duplicate the vector representation of center_word and context_word batch_size times, then yes, it is correct. But my question is: is this the right function for training the model?
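The duplication can be reproduced with a small sketch. It keeps the loop from get_batches unchanged but feeds it from `toy_vectors`, a hypothetical stand-in for get_vectors that yields a few distinct (x, y) pairs, so the names and data here are assumptions rather than the course code.

```python
import numpy as np

def toy_vectors(n):
    # Hypothetical stand-in for get_vectors: yields n distinct (x, y) pairs.
    for i in range(n):
        yield np.array([i, i]), np.array([i])

def get_batches_as_posted(pairs, batch_size):
    # Same loop structure as the posted get_batches, with the data source swapped.
    batch_x = []
    batch_y = []
    for x, y in pairs:
        while len(batch_x) < batch_size:
            batch_x.append(x)  # the same x is appended on every iteration
            batch_y.append(y)
        else:
            yield np.array(batch_x).T, np.array(batch_y).T
            batch_x = []
            batch_y = []

all_batches = list(get_batches_as_posted(toy_vectors(6), batch_size=3))
first_x, first_y = all_batches[0]
```

With 6 pairs and batch_size = 3, this produces 6 batches instead of 2, and every column inside a batch is the same example.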

Thanks

Stefanus_Yudi_Irwan · September 17, 2023, 4:29am

Oh, I'm sorry, this problem was already raised one year ago here: What's the purpose of batch_size in the "get_batches" function in "utils2.py", and there is still no answer or modification of the function as of today, 2023-09-17.

arvyzukai · September 19, 2023, 6:27am

Hi @Stefanus_Yudi_Irwan

That is a good question (and I had forgotten about the thread you linked to). I think you are right that the get_batches function is flawed; I remember not having time to dig deeper back then. Since the Assignment is about the gradient, the flaw might not have any real influence on the outcome (for illustration purposes it might be good enough, even though it is confusing). I will report it.

Thanks!
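
For readers who land on this thread, one possible way to collect distinct examples into each batch is sketched below. This is not the course's official fix. It assumes the function receives an iterable of (x, y) pairs directly, rather than the (data, word2Ind, V, C) arguments of the original, and `get_batches_fixed` is a hypothetical name.

```python
import numpy as np

def get_batches_fixed(pairs, batch_size):
    """Group consecutive (x, y) pairs into batches of `batch_size` columns."""
    batch_x = []
    batch_y = []
    for x, y in pairs:
        # Append each example once, then move on to the next pair.
        batch_x.append(x)
        batch_y.append(y)
        if len(batch_x) == batch_size:
            yield np.array(batch_x).T, np.array(batch_y).T
            batch_x = []
            batch_y = []
    # Emit any leftover examples as a smaller final batch.
    if batch_x:
        yield np.array(batch_x).T, np.array(batch_y).T

# Toy data (made up): 7 distinct pairs, so the last batch is partial.
pairs = [(np.array([i, i]), np.array([i])) for i in range(7)]
batches = list(get_batches_fixed(pairs, batch_size=3))
```

If the underlying generator never terminates, the trailing partial-batch branch is simply never reached. Dropping it instead would be a reasonable choice when every batch must have the same shape.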
