This part:
Information Gain from splitting the root on brown cap: 0.034851554559677034
Information Gain from splitting the root on tapering stalk shape: 0.12451124978365313
Information Gain from splitting the root on solitary: 0.2780719051126377
works just fine. I’m matching all three of those.
There seems to be another test immediately following those, though, and that's where I'm struggling. Here's a sample of my debugging output.
NODE INDICES GIVEN: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Data selected for feature 2: [1 1 0 0 1 1 0 1 0 0]
my left 1, [0, 1, 4, 5, 7]
my right 1, [2, 3, 6, 8, 9]
Left indices: [0, 1, 4, 5, 7]
Right indices: [2, 3, 6, 8, 9]
snode =1.0, sleft=0.7219280948873623, sright=0.7219280948873623, wleft=0.5, wright=0.5
Information gain: 0.2780719051126377
Information Gain from splitting the root on solitary: 0.2780719051126377
NODE INDICES GIVEN: [0, 1, 2, 3, 4]
Data selected for feature 0: [1 1 1 0 0]
my left 1, [0, 1, 2]
my right 1, [3, 4]
Left indices: [0, 1, 2]
Right indices: [3, 4]
snode =0.0, sleft=0.0, sright=0.0, wleft=0.6, wright=0.4
Information gain: 0.0
NODE INDICES GIVEN: [0, 1, 2, 3, 4]
Data selected for feature 0: [1 1 1 0 0]
my left 1, [0, 1, 2]
my right 1, [3, 4]
Left indices: [0, 1, 2]
Right indices: [3, 4]
snode =0.0, sleft=0.0, sright=0.0, wleft=0.6, wright=0.4
Information gain: 0.0
NODE INDICES GIVEN: [0, 1, 2, 3, 4]
Data selected for feature 0: [1 1 1 0 0]
my left 1, [0, 1, 2]
my right 1, [3, 4]
Left indices: [0, 1, 2]
Right indices: [3, 4]
snode =0.9709505944546686, sleft=0.9182958340544896, sright=1.0, wleft=0.6, wright=0.4
Information gain: 0.01997309402197489
NODE INDICES GIVEN: [0, 1, 2, 3, 4]
Data selected for feature 1: [0 0 0 0 1]
That's where it crashes. Here's the error I get:
TypeError Traceback (most recent call last)
<ipython-input-103-2a50df79e115> in <module>
9
10 # UNIT TESTS
---> 11 compute_information_gain_test(compute_information_gain)
~/work/public_tests.py in compute_information_gain_test(target)
105 assert np.isclose(result, 0.019973, atol=1e-6), f"Wrong information gain. Expected {0.019973} got: {result}"
106
--> 107 result = target(X, y, node_indexes, 1)
108 assert np.isclose(result, 0.170951, atol=1e-6), f"Wrong information gain. Expected {0.170951} got: {result}"
109
<ipython-input-102-3bf83ca050ed> in compute_information_gain(X, y, node_indices, feature)
18 """
19 # Split dataset
---> 20 left_indices, right_indices = split_dataset(X, node_indices, feature)
21
22 # Some useful variables
<ipython-input-100-8fc02917f9b6> in split_dataset(X, node_indices, feature)
53
54 #this step converts them to a list, because the QA tests fail if its not a list
---> 55 left_indices=list(left_indices) #this should be a list
56 right_indices=list(right_indices) #this should be a list
57 print("my left 1, ", left_indices)
TypeError: iteration over a 0-d array
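For what it's worth, I can reproduce that exact TypeError with a tiny snippet. It happens whenever a 0-d (scalar) NumPy array is passed to list(), which makes me suspect my indexing collapses to a scalar when only one row matches. Feature 1 in that last test is [0 0 0 0 1], so one side of the split would contain just a single index (this is my guess at the cause, not something I've confirmed in my own code yet):

```python
import numpy as np

# A 0-d (scalar) array cannot be iterated, which is exactly the error above.
# This can happen if a single-element result gets squeezed down to a scalar.
scalar_array = np.array(4)

try:
    list(scalar_array)
except TypeError as e:
    print(e)  # "iteration over a 0-d array"
```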
The error messages have actually been very helpful; I've made a lot of improvements based on them over the last few hours. That's how I've gotten to the point where I'm nearly able to start section 4.4!
This one, though, has me kind of discouraged. Is it really saying that the NumPy approach I used for split_dataset is inherently unacceptable to the autograder, and that I have to go back and redo everything?
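In case it helps to see what I'd fall back to, here's a plain-loop sketch of split_dataset that always returns Python lists, so a single-element split can never collapse into a 0-d array. (The function name and signature match the notebook; the body is just my hypothetical fallback, not what I currently have.)

```python
import numpy as np

def split_dataset(X, node_indices, feature):
    """Split node_indices by whether X[i, feature] is 1 (left) or 0 (right).

    Plain-loop fallback that always returns Python lists, so even a
    single-matching-row split stays a list instead of a 0-d array.
    """
    left_indices = [i for i in node_indices if X[i, feature] == 1]
    right_indices = [i for i in node_indices if X[i, feature] == 0]
    return left_indices, right_indices

# Sanity check against the debug output above, where feature 2 was
# [1 1 0 0 1 1 0 1 0 0] over indices 0-9 (toy X, column 2 only matters):
X = np.array([[0, 0, 1], [0, 0, 1], [0, 0, 0], [0, 0, 0], [0, 0, 1],
              [0, 0, 1], [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]])
left, right = split_dataset(X, list(range(10)), 2)
print(left)   # [0, 1, 4, 5, 7]
print(right)  # [2, 3, 6, 8, 9]
```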
Please help, if you have any suggestions!
Thanks,
Steven