Creating my own dataset

I have started working on a problem and am preparing my own dataset from pictures taken with my phone. I need a dataset of m examples, where each example is an image with some text associated with it. So each example will be a tuple like this: (textString, Image, i)
where textString = the text string associated with the image
Image = the image itself
i = the index of the training example (the ith example).

It's been quite a while and it is incredibly frustrating. How am I supposed to save the pictures and label them? What file format should I use, and how? Searching on Google didn't help either. Please help me with this. Thank you so much.

Check this package out

HDF5 for Python — h5py 3.10.0 documentation

I have read in several places that the H5 file format is becoming kind of a standard for dealing with large numerical data. I haven't worked with it yet, though! Please try it and tell me how it goes :blush:
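
For anyone landing here later, this is a rough sketch of how the (textString, Image, i) layout could be stored with h5py. It is only one possible arrangement, not an official recipe. The dataset names "images" and "texts", the helper functions, and the file name are made up for illustration, and the random arrays stand in for real photos. With phone pictures you would first have to load them and resize them to a common shape, which is not shown here.

```python
import os
import tempfile

import h5py
import numpy as np


def save_dataset(path, images, texts):
    """Write m images and their m text labels into one HDF5 file.

    images: uint8 array of shape (m, height, width, 3)
    texts:  list of m Python strings, texts[i] belongs to images[i]
    """
    with h5py.File(path, "w") as f:
        # All images go into a single array-like dataset.
        f.create_dataset("images", data=images, compression="gzip")
        # Variable-length UTF-8 strings, one per image.
        f.create_dataset(
            "texts",
            data=np.array(texts, dtype=object),
            dtype=h5py.string_dtype(encoding="utf-8"),
        )


def load_example(path, i):
    """Read back the ith example as (textString, Image, i)."""
    with h5py.File(path, "r") as f:
        image = f["images"][i]
        text = f["texts"][i]
        # Depending on the h5py version, strings may come back as bytes.
        if isinstance(text, bytes):
            text = text.decode("utf-8")
    return text, image, i


# Placeholder data: 3 fake 64x64 RGB "photos" with made-up labels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 64, 64, 3), dtype=np.uint8)
texts = ["a cat", "street sign", "coffee cup"]

path = os.path.join(tempfile.mkdtemp(), "my_dataset.h5")
save_dataset(path, images, texts)

text, image, i = load_example(path, 1)
print(text, image.shape, i)
```

The row index plays the role of i, so the text and the image for one example always sit at the same position in their datasets.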