In Search of Data

Sorry that no one noticed this thread when you posted it. It may be too late for Roberto, but if anyone else sees this, here are some thoughts:

You can easily get the data files used in any of the assignments. Here’s a thread about how to get all the files for a given assignment. Then you just have to look at the various utility functions provided to see how to access the contents of the files.
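For example, many of the course data files are just HDF5 files, so you can open them directly with h5py. Here’s a minimal sketch, assuming a file laid out like the cat/not-cat dataset; the path and key names below are placeholders, so check the assignment’s own load utilities for the real ones:

```python
import h5py
import numpy as np

# Placeholder path and key names -- run list(f.keys()) on your
# actual file to find out what it really contains.
with h5py.File("datasets/train_catvnoncat.h5", "r") as f:
    print(list(f.keys()))                 # see what datasets the file holds
    train_x = np.array(f["train_set_x"])  # e.g. image data
    train_y = np.array(f["train_set_y"])  # e.g. labels

print(train_x.shape, train_y.shape)
```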

There are lots of sources of public datasets. ImageNet is a famous one for image data, and Kaggle, which you already found, is another great source of ML datasets. Here’s a thread with more links to datasets.
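If you go the Kaggle route, their official `kaggle` package lets you script the downloads. A quick sketch; the dataset slug below is a made-up placeholder, and you need an API token saved at ~/.kaggle/kaggle.json (created from your Kaggle account page):

```python
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads the token from ~/.kaggle/kaggle.json

# "some-user/some-dataset" is a placeholder -- copy the real slug
# from the dataset's page on Kaggle.
api.dataset_download_files("some-user/some-dataset", path="data/", unzip=True)
```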

2GB is not a large dataset by modern ML standards. In terms of memory and disk space, you should be able to handle that on your local computer, although training a complex model may require a GPU. I have no experience with running “real” training locally, though.

There are many cloud-based services for running your training. I have tried Google Colab, and it’s easy to get started there since it supports Jupyter Notebooks. Here’s a thread about how to get one of our assignment notebooks to run on Colab. One nice thing about Colab is that you can experiment with it for free and get access to GPUs and TPUs. The one catch is that if you are not a paying customer, you may have to wait to run your jobs when the paying users are busy.

There are other services, including AWS, that support running your training, but I have no personal experience with them. They will probably not be free, but it’s still way cheaper than building the same amount of compute power yourself. :nerd_face:
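One practical note if you do try Colab: after picking a GPU runtime (Runtime → Change runtime type), it’s worth confirming the accelerator is actually attached. A sketch assuming TensorFlow (PyTorch users would check torch.cuda.is_available() instead):

```python
import tensorflow as tf

# Lists any GPUs attached to this runtime; an empty list means you
# are still on the plain CPU runtime.
print(tf.config.list_physical_devices("GPU"))

# Optional: mount your Google Drive so a 2GB dataset persists across
# sessions instead of being re-uploaded every time.
from google.colab import drive
drive.mount("/content/drive")
```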