Unit test case failure in C2_W4_Decision_Tree_with_Markdown

Hi,
During execution of

# UNIT TESTS    
split_dataset_test(split_dataset)

I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-197878478053> in <module>
     31 
     32 # UNIT TESTS
---> 33 split_dataset_test(split_dataset)

~/work/public_tests.py in split_dataset_test(target)
     79                 'right': np.array([2, 7, 9, 10])}
     80 
---> 81     assert np.allclose(right, expected['right']) and np.allclose(left, expected['left']), f"Wrong value when target is at index 0. \nExpected: {expected} \ngot: \{left:{left}, 'right': {right}\}"
     82 
     83

I think one of the unit tests is written incorrectly, and that is why it fails.

Test case from public_tests.py

# Case 3

X = (np.random.rand(11, 3) > 0.5) * 1 # Just random binary numbers
X_t = np.array([[0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0]])
X = np.concatenate((X, X_t.T), axis=1)

left, right = target(X, [1, 2, 3, 6, 7, 9, 10], 3)
expected = {'left': np.array([1, 3, 6]),
            'right': np.array([2, 7, 9, 10])}

assert np.allclose(right, expected['right']) and np.allclose(left, expected['left']), f"Wrong value when target is at index 0. \nExpected: {expected} \ngot: \{left:{left}, 'right': {right}\}"

print("\033[92m All tests passed.")

From the code above, you can see that the expected left and right array values are not correct.

From the code

X_t = np.array([[0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0]])

you can see that the indices with value 1 are 1, 3, 5, 6 and the indices with value 0 are 0, 2, 4, 7, 8, 9, 10,

which do not match the left and right arrays used for verification.

In code

expected = {'left': np.array([1, 3, 6]),
'right': np.array([2, 7, 9, 10])}

Can someone please fix this test case, or let me know if my understanding is incorrect?

Thanks

Your understanding of the issue seems to be correct. The expected output values in the test case do not match the actual logic of splitting the dataset based on the 0s and 1s in the target column.

@chris.favila, could you please look into this?

@chris.favila @nadtriana
Hello,
Do you have any ETA for this fix?

Hi Vishal and Nayid. Sorry for the late reply. We will look into it this week.

Hi Vishal. Can you send me your notebook (ipynb) file via DM? Thanks.

Sent you DM with file attached.

Hi Vishal. I looked at your solution and noticed that you’re not using the node_indices parameter in your code. Remember that the split_dataset() function uses this parameter to indicate which subset of the samples needs to be analyzed, so only indices listed in node_indices can appear in the left and right outputs. Check the function inputs and expected output for Case 2 in the in-notebook unit test.

This should guide you in revising your solution. Hope this helps!
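
To illustrate the role of node_indices, here is a minimal sketch. It is not the official assignment solution: it assumes that samples whose feature value is 1 go to the left branch and all others go to the right, which is consistent with the expected values in Case 3. The random columns from the test are replaced with zeros here because they do not affect the split on feature 3.

```python
import numpy as np

def split_dataset(X, node_indices, feature):
    """Split only the samples listed in node_indices on a binary feature."""
    left_indices = []
    right_indices = []
    # Iterate over the active subset, not over every row of X
    for i in node_indices:
        if X[i][feature] == 1:
            left_indices.append(i)   # assumed convention: value 1 goes left
        else:
            right_indices.append(i)  # value 0 goes right
    return left_indices, right_indices

# Rebuild the Case 3 input; the first three columns are placeholders
X_t = np.array([[0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0]])
X = np.concatenate((np.zeros((11, 3), dtype=int), X_t.T), axis=1)

left, right = split_dataset(X, [1, 2, 3, 6, 7, 9, 10], 3)
print(left, right)
```

Index 5 has value 1 in the last column, but it is not in node_indices, so it does not appear in left. Likewise, indices 0, 4, and 8 are left out of right. That is why the expected arrays in the test are [1, 3, 6] and [2, 7, 9, 10].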
