Hello, I was working with the brain tumor challenge dataset on Kaggle and ran into a problem. When I load the images with the load_image_into_numpy_array function, I get this error:
ValueError: cannot reshape array of size 422500 into shape (650,650,3)
def load_image_into_numpy_array(path):
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(im_height, im_width, 3).astype(np.uint8)
When I removed the third dimension from the reshape call, that error went away, but further along I got: ValueError: could not broadcast input array from shape (650,650,3) into shape (650,650)
Please guide me through this problem. Thanks in advance.
Hi Rohit, as you know, 650 * 650 = 422500. You can change the shape to (650, 650, 1) if you need 3 dimensions, but there simply aren't enough elements for (650, 650, 3), which would need 650 * 650 * 3 = 1267500.
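The arithmetic can be checked directly. This is a minimal sketch that assumes only NumPy; `infer_channels` is a hypothetical helper for illustration, not part of the original function.

```python
import numpy as np

def infer_channels(n_elements: int, height: int, width: int) -> int:
    """Return how many channels a flat pixel buffer holds for a given height and width."""
    pixels = height * width
    if pixels == 0 or n_elements % pixels != 0:
        raise ValueError(f"{n_elements} elements do not fill a {height}x{width} image")
    return n_elements // pixels

# 650 * 650 = 422500, so the buffer from the error message holds one channel.
channels = infer_channels(422500, 650, 650)
print(channels)  # 1

# A buffer of that size reshapes to (650, 650, 1) but not to (650, 650, 3).
flat = np.zeros(422500, dtype=np.uint8)
print(flat.reshape(650, 650, channels).shape)  # (650, 650, 1)
```

A result of 1 suggests the file is single-channel (grayscale) rather than RGB.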
Thank you so much for your response, sir.
When I changed the dimensions from (650, 650, 3) to (650, 650, 1), I got another error:
TypeError: Invalid shape (650, 650, 1) for image data
What should I do now? Thanks for the help.
Hi Rohit (you can call me just by my username, ‘vsnupoudel’).
To make it work, you could try repeating the array of length 422500 three times. But this just repeats the same information. See numpy.repeat in the NumPy v1.22 Manual.
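As a rough sketch of that suggestion, assuming the grayscale data is already a 2-D NumPy array (a tiny stand-in array is used here instead of a real 650 x 650 slice):

```python
import numpy as np

# Stand-in for a (650, 650) grayscale slice.
gray = np.arange(6, dtype=np.uint8).reshape(2, 3)

# Add a trailing channel axis, then copy it three times along that axis.
rgb_like = np.repeat(gray[..., np.newaxis], 3, axis=-1)

print(rgb_like.shape)  # (2, 3, 3)
# All three channels are identical copies of the grayscale values.
print(np.array_equal(rgb_like[..., 0], rgb_like[..., 2]))  # True
```

The result has the shape an RGB pipeline expects, but it carries no extra information compared with the single channel.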
Where is this error popping up for you? I mean, which line raises it?
TypeError: Invalid shape (650, 650, 1) for image data
If it is a line outside the function, then maybe that part does not support grayscale images (with a single channel). Look into its support for grayscale images.
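That message is the one matplotlib's `imshow` raises for an array shape it does not accept; its documentation lists `(M, N)`, `(M, N, 3)`, and `(M, N, 4)`. If that is where the error comes from (an assumption, since the failing line was not posted), dropping the singleton channel axis before display is one way around it. `drop_single_channel` is a hypothetical helper.

```python
import numpy as np

def drop_single_channel(arr: np.ndarray) -> np.ndarray:
    """Turn (H, W, 1) into (H, W); leave every other shape untouched."""
    if arr.ndim == 3 and arr.shape[-1] == 1:
        return arr[..., 0]
    return arr

slice_hw1 = np.zeros((650, 650, 1), dtype=np.uint8)
print(drop_single_channel(slice_hw1).shape)  # (650, 650)
# The 2-D result could then be displayed with, e.g., plt.imshow(result, cmap="gray").
```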
Did you look directly at image.size? It depends a little on the preprocessing that was done specifically for that Kaggle competition, but MRI files are often 4D. Don’t just assume it’s 3, or that the file can be read easily with vanilla Python file operations.
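For ordinary 8-bit image files, one way to look at the data before reshaping is to print PIL's `size` and `mode`. The sketch below assumes Pillow and NumPy are installed; `load_image_as_rgb` is a hypothetical variant of the loader that takes raw bytes (in the original function these come from `tf.io.gfile.GFile(path, 'rb').read()`). It is not a substitute for a DICOM reader.

```python
from io import BytesIO

import numpy as np
from PIL import Image

def load_image_as_rgb(img_bytes: bytes) -> np.ndarray:
    """Decode image bytes and return an (H, W, 3) uint8 array, whatever the source mode."""
    image = Image.open(BytesIO(img_bytes))
    # image.size is (width, height); image.mode is e.g. "L" (1 channel) or "RGB" (3).
    print(image.size, image.mode)
    return np.asarray(image.convert("RGB"), dtype=np.uint8)

# Build a small grayscale PNG in memory to stand in for a file on disk.
buf = BytesIO()
Image.new("L", (4, 2), color=128).save(buf, format="PNG")

arr = load_image_as_rgb(buf.getvalue())
print(arr.shape)  # (2, 4, 3)
```

Letting PIL convert the mode avoids hard-coding a channel count in `reshape`.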
You might also take a peek at one of the submitted notebooks. This one, for example, seems to have code that reads the data files from that competition.
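def load_dicom(path):
    # dcmread is pydicom's current reader; older code called it read_file
    dicom = pydicom.dcmread(path)
    data = dicom.pixel_array
    # shift to zero, scale to [0, 1] (unless the slice is constant), then to 0..255
    data = data - np.min(data)
    if np.max(data) != 0:
        data = data / np.max(data)
    data = (data * 255).astype(np.uint8)
    return data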
train_dir = '../input/rsna-miccai-brain-tumor-radiogenomic-classification/train'
trainset = []
trainlabel = []
trainidt = []
for i in tqdm(range(len(train_df))):
    idt = train_df.loc[i, 'BraTS21ID']
    idt2 = ('00000' + str(idt))[-5:]
    path = os.path.join(train_dir, idt2, 'T1wCE')
    for im in os.listdir(path):
        img = load_dicom(os.path.join(path, im))
        img = cv.resize(img, (64, 64))
        image = img_to_array(img)
        image = image / 255.0
        trainset += [image]
        trainlabel += [train_df.loc[i, 'MGMT_value']]
        trainidt += [idt]
Notice that it relies on the Python package Pydicom: https://pydicom.github.io/
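The rescaling inside load_dicom can be tried on synthetic data without any DICOM files. This is only a sketch of the same steps on made-up intensity values; `to_uint8` is a hypothetical name, and the explicit cast to float is added here for clarity.

```python
import numpy as np

def to_uint8(data: np.ndarray) -> np.ndarray:
    """Shift to zero, scale to [0, 1], then stretch to 0..255."""
    data = data.astype(np.float64) - np.min(data)
    peak = np.max(data)
    if peak != 0:  # a constant slice would otherwise divide by zero
        data = data / peak
    return (data * 255).astype(np.uint8)

# Made-up 16-bit intensities standing in for dicom.pixel_array.
raw = np.array([[100, 2100], [1100, 4100]], dtype=np.uint16)
print(to_uint8(raw).tolist())  # [[0, 127], [63, 255]]

# A constant slice maps to all zeros instead of raising an error.
print(to_uint8(np.full((2, 2), 7, dtype=np.uint16)).tolist())  # [[0, 0], [0, 0]]
```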
PS: It looks like I’m giving you the same advice @paulinpaloalto gave you in this thread: Cannot batch tensors with different shapes in component 0. First element had shape [224,224,3] and element 2 had shape [224,224,4] (#7 by Rohit_Kumar)
You need to understand your data before you try running through code written for other datasets and purposes.
Thanks for your response.
When I ran image.size, it gave me this output:
422500
So how can I reshape it to (650, 650)?
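A value of 422500 looks like an element count (for instance NumPy's `.size`) rather than PIL's `(width, height)` tuple. Assuming the pixels are in a flat array of that length, a plain two-argument reshape is enough. The sketch uses a small non-square stand-in to show why the order of the arguments matters.

```python
import numpy as np

width, height = 4, 2              # PIL's image.size order: (width, height)
flat = np.arange(width * height)  # stand-in for the 422500 flat pixel values

# NumPy images are indexed (row, column), so height comes first.
image_2d = flat.reshape(height, width)
print(image_2d.shape)  # (2, 4)

# For the 650 x 650 case the same call is flat.reshape(650, 650).
print(np.zeros(422500).reshape(650, 650).shape)  # (650, 650)
```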
Thank you for your response.
When I saved the image arrays into a list and tried to display them with the code below, it gave me an error:
import os
import matplotlib.pyplot as plt

train_image_dir = '/content/computed-tomography-images-for-intracranial-hemorrhage-detection-and-segmentation-1.0.0/Patients_CT/049/brain'
for _, _, files in os.walk(train_image_dir):
    main_list1 = list(files)
train_images_np = []
for i in range(1, 4):
    image_path = os.path.join(train_image_dir, main_list1[i])
    train_images_np.append(load_image_into_numpy_array(image_path))
plt.rcParams['axes.grid'] = False
plt.rcParams['xtick.labelsize'] = False
plt.rcParams['ytick.labelsize'] = False
plt.rcParams['xtick.top'] = False
plt.rcParams['xtick.bottom'] = False
plt.rcParams['ytick.left'] = False
plt.rcParams['ytick.right'] = False
plt.rcParams['figure.figsize'] = [14, 7]
for idx, train_image_np in enumerate(train_images_np):
    plt.imshow(train_image_np)
plt.show()
Thanks in advance
@Rohit_Kumar Hi Rohit,
The problem is interesting.
Is it possible to share the link to the Kaggle notebook? Maybe save a copy, so I can edit it from my end.
If not, Google Colab also allows multiple users to edit at the same time.
Hello,
Can you please share the notebook you use with me as well?
Many thanks