403 Error in section 1.1 of Ungraded Lab Retrieval Metrics
Location: 1.1 The dataset
The 20 Newsgroups dataset is a classic collection of text posts on various topics, each labeled with a category. Let’s use the sklearn.datasets module to load it.
import pandas as pd
from sklearn.datasets import fetch_20newsgroups
# Load the 20 Newsgroups dataset
newsgroups_train = fetch_20newsgroups(subset='train', shuffle=True, random_state=42)
# Convert the dataset to a DataFrame for easier handling
df = pd.DataFrame({
    'text': newsgroups_train.data,
    'category': newsgroups_train.target
})
# Display some basic information about the dataset
print(df.head())
print("\nDataset Size:", df.shape)
print("\nNumber of Categories:", len(newsgroups_train.target_names))
print("\nCategories:", newsgroups_train.target_names)
Hey all! Sorry for the delay; this issue has been fixed. Please note that you may need to restore the lab to its original version to get the updated lab.
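In case a 403 shows up again while the dataset is downloading, here is a minimal sketch of one way to make the failure easier to read. It is not part of the lab: the helper names `describe_fetch_error`, `safe_fetch`, and `load_newsgroups_train` are made up for illustration, and it assumes the download error reaches your code as a `urllib.error.HTTPError`, which may vary with your scikit-learn version.

```python
from urllib.error import HTTPError


def describe_fetch_error(err):
    """Turn a download exception into a short, readable message (hypothetical helper)."""
    if isinstance(err, HTTPError) and err.code == 403:
        return "HTTP 403: the server refused the download (likely not a bug in your code)."
    return f"Download failed: {err}"


def safe_fetch(fetch):
    """Call fetch(); return (result, None) on success or (None, message) on failure."""
    try:
        return fetch(), None
    except OSError as err:  # HTTPError and URLError are both subclasses of OSError
        return None, describe_fetch_error(err)


def load_newsgroups_train():
    """Wrap the lab's fetch call; assumes scikit-learn is installed."""
    # Imported lazily so the helpers above depend only on the standard library.
    from sklearn.datasets import fetch_20newsgroups

    return safe_fetch(
        lambda: fetch_20newsgroups(subset='train', shuffle=True, random_state=42)
    )
```

Calling `data, problem = load_newsgroups_train()` gives you either the loaded dataset with `problem` set to `None`, or `None` plus a message you can paste into a forum post.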