Shuffle_buffer_size

Hi,

I'm hoping someone can provide insight into the SHUFFLE_BUFFER_SIZE parameter. I understand that shuffling matters because it ensures that training batches are drawn in random order, and that a large shuffle buffer supports this. However, I have also learned that accuracy correlates directly with shuffle_buffer_size. Could you provide general guidelines for choosing an appropriate SHUFFLE_BUFFER_SIZE? For example, if series_train has 10,000 elements, what should SHUFFLE_BUFFER_SIZE be?

Thanks,
Ed

Please see this link.

Thanks. The site states: ‘For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.’

My question is: what are the consequences if the shuffling is not ‘perfect’? Put another way, how important is it that we use ‘perfect shuffling’?

Thanks,
Ed
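
One way to see what "imperfect" shuffling means is to model the buffer in plain Python. The sketch below is a simplified model of a fixed-size shuffle buffer (fill the buffer, emit a random element, refill the freed slot), not TensorFlow's actual implementation; the name buffered_shuffle and the sizes used are assumptions for illustration. It shows that an element can be emitted at most buffer_size - 1 positions earlier than its original position, so a small buffer only mixes the data locally.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Return items in the order a fixed-size shuffle buffer would emit them.

    Simplified model (an assumption, not library code): fill the buffer,
    then for each new item emit a random buffered element and store the
    new item in the freed slot.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer, output = [], []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        slot = rng.randrange(buffer_size)
        output.append(buffer[slot])  # emit a random buffered element
        buffer[slot] = item          # the new item takes the freed slot
    rng.shuffle(buffer)              # input exhausted: drain in random order
    output.extend(buffer)
    return output


n, small = 10_000, 100
order = buffered_shuffle(range(n), small, seed=0)
# How far ahead of its original position did any element get emitted?
print(max(value - position for position, value in enumerate(order)))
```

With buffer_size equal to the dataset size, this model reduces to an ordinary full shuffle; with buffer_size of 1, the data comes out in its original order.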

How about gathering metrics on your dataset to determine the shuffle buffer size?
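
As a rough sketch of what such a metric could look like, the code below sweeps a few buffer sizes for a 10,000-element dataset and reports how far, on average, elements move from their original position, as a fraction of the dataset size. Both the metric and the helper names (emission_order, mean_relative_displacement) are assumptions chosen for illustration, and the buffer is a simplified pure-Python model rather than the library's implementation.

```python
import random


def emission_order(n_items, buffer_size, rng):
    """Order in which indices 0..n_items-1 leave a shuffle buffer (simplified model)."""
    buffer, order = [], []
    for index in range(n_items):
        if len(buffer) < buffer_size:
            buffer.append(index)          # still filling the buffer
        else:
            slot = rng.randrange(buffer_size)
            order.append(buffer[slot])    # emit a random buffered index
            buffer[slot] = index          # refill the freed slot
    rng.shuffle(buffer)                   # drain the remainder in random order
    order.extend(buffer)
    return order


def mean_relative_displacement(n_items, buffer_size, seed=0):
    """Average |output position - input position|, divided by the dataset size."""
    order = emission_order(n_items, buffer_size, random.Random(seed))
    total = sum(abs(position - index) for position, index in enumerate(order))
    return total / (n_items * n_items)


n = 10_000
for size in (1, 100, 1_000, n):
    print(size, round(mean_relative_displacement(n, size), 4))
```

For a uniformly random permutation this value is close to one third, so comparing a candidate buffer size against that baseline gives a sense of how far it is from a full shuffle. Whether the gap matters for accuracy depends on how ordered the data is to begin with, which this sketch does not measure.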