Hi there,
I hope you’re doing well. I am almost finished with the Deep Learning Specialization and am currently working on the Programming Assignment for the Neural Machine Translation module (Sequence Models).
I noticed that from the second week of the Convolutional Neural Networks (CNNs) course onward, the data isn’t explicitly split into training, validation, and test sets. I was wondering why this split isn’t shown in the code. Is it because the pre-trained models (YOLO, for instance) were already trained and evaluated on their own splits before being provided to us? Similarly, with RNN models, for example in the project “Neural Machine Translation with Attention,” the model is trained on a human-readable dataset paired with equivalent standardized machine-readable data. The data is preprocessed, but I am curious about where the X and Y splits occur for training and testing.
Typically, when building models, the data is split into training and testing sets before being passed to the model. For instance, in object detection tasks with YOLO, the dataset is split into training and validation sets either by using functions like `train_test_split` from `sklearn.model_selection` or by organizing the data into separate folders (such as `train` and `test`). However, I couldn’t see this step in the code provided in the assignment.
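To make the question concrete, this is roughly the kind of step I expected to find. It is only a plain-Python sketch of the shuffle-and-split idea behind `train_test_split`, not code from the assignment; the helper `split_pairs` and the toy date pairs are made up for illustration.

```python
import random


def split_pairs(pairs, test_fraction=0.2, seed=0):
    """Shuffle (x, y) pairs and split them into train and test lists.

    Hypothetical helper: a simplified stand-in for what a library
    function such as train_test_split does.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(pairs)         # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy (human-readable, machine-readable) pairs, invented for this example.
dataset = [
    ("3 May 1979", "1979-05-03"),
    ("5 April 09", "2009-04-05"),
    ("21th of August 2016", "2016-08-21"),
    ("Tue 10 Jul 2007", "2007-07-10"),
    ("Saturday May 9 2018", "2018-05-09"),
]

train, test = split_pairs(dataset, test_fraction=0.2)

# Separate inputs (X) from targets (Y) for each subset.
X_train = [x for x, _ in train]
Y_train = [y for _, y in train]
X_test = [x for x, _ in test]
Y_test = [y for _, y in test]
```

With five pairs and `test_fraction=0.2`, four pairs end up in the training subset and one in the test subset.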
I understand that evaluation is performed when the `predict` function is defined, and the data is mapped into a designated directory in the project. However, I would still appreciate clarification on how and where the training and test splits occur in the provided code. Is this something I need to handle myself, or is it integrated within the pre-trained models?
Thank you for your time, and I look forward to your response!
Best regards,
Yosmery Gonzalez