H5py files generation

Mohammad_Hamza · August 11, 2022, 7:04am

What are these h5py files, and how do we convert our images to h5py?

anon57530071 · August 11, 2022, 7:50am

Welcome to the community.

h5py is a Python library for accessing HDF5 (Hierarchical Data Format version 5) files. HDF5 is quite useful for numpy users, since we can store large, high-rank arrays directly and read or write just part of the data using numpy-style slicing. Image data is typically a 3-dimensional array (height, width, channels), so it can be stored in HDF5 easily.
For more detail, please see this.
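
As a rough illustration of the slice-style access described above, here is a minimal sketch. The file name `demo.h5`, the dataset name `images`, and the random array standing in for real image data are all assumptions made up for this example.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical data: 10 fake RGB "images" of 32x32 pixels (not real photos)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)

# Hypothetical file location inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# Write the whole 4-dimensional array as a single dataset
with h5py.File(path, "w") as f:
    f.create_dataset("images", data=images)

# Read back only part of the data with numpy-style slicing
with h5py.File(path, "r") as f:
    dset = f["images"]
    first_three = dset[0:3]          # only images 0..2 are loaded into memory
    top_left = dset[5, :16, :16, :]  # a 16x16 crop of image 5

print(first_three.shape, top_left.shape)
```

Only the requested slices are read from disk, which is what makes HDF5 practical for image sets that do not fit in memory.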

Mohammad_Hamza · August 11, 2022, 8:05am

How do we convert images on our PC to h5py files?

anon57530071 · August 11, 2022, 8:32am

Here is an example. You can do the same in your preferred language/platform.

import h5py
import matplotlib.image as mpimg
from matplotlib import pyplot as plt

# Read image data
img = mpimg.imread('./images/classification_kiank.png')
plt.imshow(img)

# Write into HDF5
with h5py.File('./images/image.h5', 'w') as f:
    dset = f.create_dataset('classification', data=img)

# Read back
with h5py.File('./images/image.h5', 'r') as f2:
    # [()] loads the whole dataset into a numpy array before the file is closed
    img2 = f2['classification'][()]
plt.imshow(img2)

Mohammad_Hamza · August 13, 2022, 7:05pm

How do we convert a whole folder of images, with train and test sets in it, into h5py?

anon57530071 · August 14, 2022, 4:10am

A big extra service for you. In this case, I created both training and test sets in HDF5, but you can create a single set and split it into two later if you like. You can also store labels in HDF5, in the same way that I create two datasets in one file. Again, HDF5 has a hierarchical structure, which is very useful.

The code itself is super straightforward: a simple loop of reads and writes.

If you need further assistance, please talk with your friend, Google.

import numpy as np
import h5py
import cv2
import glob
import random
from matplotlib import pyplot as plt

# Get image list
# This is just an example.  You can split train and test as you like

list_images = glob.glob('images/*.jpg')

TRAIN_SPLIT = 0.7
split_count = np.floor(len(list_images)*TRAIN_SPLIT).astype(int)

# shuffle the list so that the train/test split is random
random.shuffle(list_images)

# split list into training and test set
train_list = list_images[0:split_count]
test_list = list_images[split_count:]

train_length = len(train_list)
test_length = len(test_list)

# set train/test image size (images are resized to this shape if they differ)
IMAGE_WIDTH = 640
IMAGE_HEIGHT = 640

# Write into HDF5
with h5py.File('./images/image.h5', 'w') as f:

    # training set
    # cv2 returns arrays shaped (height, width, channels) with 8-bit values,
    # so store them as uint8 (dtype=int would typically use 8 bytes per value)
    train_set = f.create_dataset('train', shape=(train_length, IMAGE_HEIGHT, IMAGE_WIDTH, 3), dtype=np.uint8)
    for count, img_name in enumerate(train_list):
        # Note: cv2.imread returns the channels in BGR order
        img = cv2.imread(img_name, cv2.IMREAD_COLOR)
        # This is optional, but we usually expect all images in the train/test sets to have the same size
        img = cv2.resize(img, (IMAGE_WIDTH, IMAGE_HEIGHT))
        train_set[count] = img

    # test set
    test_set = f.create_dataset('test', shape=(test_length, IMAGE_HEIGHT, IMAGE_WIDTH, 3), dtype=np.uint8)
    for count, img_name in enumerate(test_list):
        # Note: cv2.imread returns the channels in BGR order
        img = cv2.imread(img_name, cv2.IMREAD_COLOR)
        # This is optional, but we usually expect all images in the train/test sets to have the same size
        img = cv2.resize(img, (IMAGE_WIDTH, IMAGE_HEIGHT))
        test_set[count] = img
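
The post above mentions that labels can live in the same file and that HDF5 is hierarchical. A minimal sketch of that idea follows. The group and dataset names (`train/images`, `train/labels`), the file name, the random arrays used in place of real images, and the label values are all hypothetical and chosen only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-ins for real data: 8 small RGB images and 8 class labels
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(8, 64, 64, 3), dtype=np.uint8)
train_labels = np.array([0, 1, 1, 0, 1, 0, 0, 1], dtype=np.int64)

# Hypothetical file location inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "dataset_with_labels.h5")

with h5py.File(path, "w") as f:
    # A group works like a folder inside the file
    train = f.create_group("train")
    train.create_dataset("images", data=train_images)
    train.create_dataset("labels", data=train_labels)

with h5py.File(path, "r") as f:
    # Datasets inside groups are addressed with path-like names
    x = f["train/images"][:]
    y = f["train/labels"][:]

print(x.shape, x.dtype, y.shape)
```

Keeping images and labels at the same index in sibling datasets means one slice, such as `[0:4]`, can be applied to both to get a matching batch.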
