Problem in finding accuracy UNQ_C5

Hi, I am facing the following error on the line that sets the values of v1 and v2 by calling the model. Please help.

The Siamese model requires inputs; calling it as model() doesn’t give it anything to operate on. Don’t take this hint too literally:

  • use model() to create v1, v2

PS: I can see there is a problem with your “chunk” of output targets. Take another look at the hint above…

  • use batch size chuncks [sic] of questions as Q1 & Q2 arguments of the data generator. e.g x[i:i + batch_size]

Sorry, I could not understand what you were referring to. I tried the following approaches:

  1. v1, v2 = model((q1, q2)) and got accuracy 0.4107422, not the expected 0.69

  2. v1, v2 = q1[i:i + batch_size], q2[i:i + batch_size] and got the error index 0 is out of bounds for axis 0 with size 0

  3. v1, v2 = q1, q2 and got accuracy 0.4951172, not the expected 0.69

  4. v1, v2 = model((test_Q1, test_Q2)) and got the error TypeError: JAX only supports number and bool dtypes, got dtype object in array

  5. v1, v2 = model((test_Q1[i:i + batch_size], test_Q2[i:i + batch_size])) and got the same TypeError: JAX only supports number and bool dtypes, got dtype object in array

I do not understand what I have done wrong or what the proper approach is.

PS: I found the reason for the accuracy mismatch in case 1: earlier I was using vocab['pad'], which was causing the issue, so I changed it to vocab['<PAD>'].

#1 above solves the problem of model() being called without arguments, and those are the right arguments.

The slicing syntax to take a “chunk” of batch_size starting at i is used twice, but not in the call to model(). First, it is applied to test_Q1 and test_Q2 where they are passed into the data_generator. This is what produces q1 and q2. Then, it is used again to extract the corresponding range from y.
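The loop structure described above can be sketched as follows. This is a minimal sketch with toy stand-ins for the model and the data generator (the real assignment versions differ in detail; classify_sketch, toy_model, and toy_data_generator are my own names), just to show where the two slices go:

```python
import numpy as np

def toy_data_generator(Q1, Q2, batch_size):
    # Toy stand-in for the course's data_generator:
    # yield one batch from the chunk it was handed.
    yield Q1[:batch_size], Q2[:batch_size]

def toy_model(inputs):
    # Toy stand-in for the Siamese model: pretend the inputs
    # are already the output embeddings.
    q1, q2 = inputs
    return q1.astype(float), q2.astype(float)

def classify_sketch(test_Q1, test_Q2, y, batch_size, model, data_generator,
                    threshold=0.7):
    accuracy = 0
    for i in range(0, len(test_Q1), batch_size):
        # First slice: pass batch_size chunks into the data generator...
        q1, q2 = next(data_generator(test_Q1[i:i + batch_size],
                                     test_Q2[i:i + batch_size],
                                     batch_size))
        # Second slice: take the matching chunk of the targets.
        y_test = y[i:i + batch_size]
        v1, v2 = model((q1, q2))
        for j in range(len(q1)):
            d = np.dot(v1[j], v2[j].T)   # similarity score per pair
            res = d > threshold          # predicted duplicate / not
            accuracy += int(res == y_test[j])
    return accuracy / len(test_Q1)

# Tiny usage example: identical unit vectors, so every pair is a match.
test_Q1 = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])
test_Q2 = test_Q1.copy()
y = np.array([1, 1, 1, 1])
acc = classify_sketch(test_Q1, test_Q2, y, batch_size=2,
                      model=toy_model, data_generator=toy_data_generator)
# each dot product is 1.0 > 0.7, so accuracy is 1.0
```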

I am also struggling with this. My accuracy is 0.546. I have called the model using:
v1, v2 = model((q1, q2)), as suggested above.
The shape of the resulting v1 and v2 is (512, 128). I am not sure if that is what it should be.

I defined y_test using:
y_test = y[i:i + batch_size]

When I run the unit test cell, it just says:

“default_example_check Wrong output for accuracy metric.
	Expected .
	Got .”

I’ve been staring at this for a while now and can’t really see where I might be going wrong, unless there is an error in my data generator, but that passed its unit test…

Before nested loops

batch_size: 512
v1.shape: (512, 128)
v2.shape: (512, 128)

After both nested loops, before division

accuracy = 7075

If you get 7075 but wrong expected output, check the denominator - mine is 10240. If you don’t get 7075, check the generator - HINT: Don’t forget to set shuffle - or the outer loop control. i iterates on the Q1 input in steps of batch_size.
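As a quick sanity check on those numbers (plain arithmetic, nothing assignment-specific): 20 full batches of 512 give the 10240 denominator, and 7075 / 10240 is exactly the expected value from the unit test.

```python
# 20 full batches of 512 test pairs
denominator = 20 * 512            # 10240
accuracy = 7075 / denominator
print(accuracy)                   # 0.69091796875
```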

The unit tests are available for inspection in w4_unittest.py, and this one has these test case parameters:

{
    "name": "default_example_check",
    "input": {
        "test_Q1": Q1_test,
        "test_Q2": Q2_test,
        "y": y_test,
        "threshold": 0.7,
        "model": model_mock(
            "./support_files/classify_fn/accuracy_metric_batch512.pkl"
        ).mocked_fn,
        "vocab": vocab,
        "data_generator": data_generator,
        "batch_size": 512,
    },
    "expected": 0.69091796875,
}

There is an error of omission in the exception handling, so the actual and expected values aren’t being printed out:

for test_case in test_cases:
    result = target(**test_case["input"])

    try:
        assert np.isclose(result, test_case["expected"])
        successful_cases += 1
    except:
        failed_cases.append({"name": test_case["name"]})
        print(
            f"{test_case['name']} Wrong output for accuracy metric.\n\tExpected .\n\tGot ."
        )
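A minimal sketch of how that except clause could interpolate the missing values into the message (the test_cases and results values here are made up for illustration; the real harness computes result by calling the learner’s function):

```python
import numpy as np

successful_cases = 0
failed_cases = []

# Made-up stand-ins so this snippet runs on its own:
test_cases = [{"name": "default_example_check", "expected": 0.69091796875}]
results = [0.55380857]

for test_case, result in zip(test_cases, results):
    try:
        assert np.isclose(result, test_case["expected"])
        successful_cases += 1
    except AssertionError:
        failed_cases.append({"name": test_case["name"]})
        # Interpolate the two values the original f-string omitted:
        print(
            f"{test_case['name']} Wrong output for accuracy metric.\n"
            f"\tExpected {test_case['expected']}.\n"
            f"\tGot {result}."
        )
```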

HTH

Thank you for your reply. It was very helpful in that it got me focused on the numerator only.

I checked everything and could find nothing wrong. Eventually I reread the comments and saw one that said “# use batch size chuncks of questions as Q1 & Q2 arguments of the data generator. e.g x[i:i + batch_size]”. This made no sense to me, but in desperation I tried it. That has solved the problem, BUT…

I now have no idea why. Why do I need to chunk up the inputs to the generator? Is that not what it does for itself? What am I missing here?


Glad you got the algebra to match the expected value. This function, classify, is designed for us to iterate through the input set in chunks. See the bulleted list of instructions:

Instructions:

  • Loop through the incoming data in batch_size chunks
  • Use the data generator to load q1, q2 a batch at a time

We’re not told why that design choice was made for us, but my guess is memory. 512 pairs seems a manageable size for the data_generator and model to deal with at one time. If you’re a glutton for punishment, you could bump up the batch_size parameter being passed in and see what happens. My guesses? Best case, it runs slower. Worst case, the server hits an out-of-memory error.
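On the “why chunk the inputs” question, one mechanical guess in addition to memory: in the loop the instructions describe, a fresh generator is created on every iteration, and next() always returns a fresh generator’s first batch. So without slicing you would see the same first batch_size rows on every pass; the x[i:i + batch_size] slice is what moves the window. A toy illustration (toy_generator is my own simplified stand-in, not the course’s data_generator):

```python
def toy_generator(data, batch_size):
    # Simplified stand-in: yields whatever data it was given,
    # batch_size items at a time.
    idx = 0
    while idx < len(data):
        yield data[idx:idx + batch_size]
        idx += batch_size

data = list(range(8))
batch_size = 2

# A fresh generator per iteration WITHOUT slicing: next() always
# restarts at the beginning, so the same first batch repeats.
unsliced = [next(toy_generator(data, batch_size))
            for i in range(0, len(data), batch_size)]
# -> [[0, 1], [0, 1], [0, 1], [0, 1]]

# A fresh generator per iteration WITH slicing: each one only
# sees its own chunk, so the window advances.
sliced = [next(toy_generator(data[i:i + batch_size], batch_size))
          for i in range(0, len(data), batch_size)]
# -> [[0, 1], [2, 3], [4, 5], [6, 7]]
```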

I am still getting an accuracy of 0.55380857.

I got q1, q2 as described in this thread:
q1, q2 = next(data_generator(test_Q1[i:i + batch_size], test_Q2[i:i + batch_size], batch_size, pad=vocab['<PAD>'], shuffle=False))

along with y_test = y[i:i + batch_size]

then d = fastnp.dot(v1[j], v2[j].T)

accuracy += (res == y[j])

accuracy = accuracy / len(test_Q1)

Sorry for sharing code, but obviously there is an issue with it, because it is not working for me.

Couple of observations:

  1. Compare the inline comment
# compute accuracy if y_test is equal 'res'

with the implementation you share above. Using the wrong argument in the equality operator would impact the numerator of the accuracy computation.

  2. Compare the inline comment
# compute accuracy using accuracy and total length of test questions

with the implementation above. Using the wrong variable for total length impacts the denominator.
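To see concretely why the comparison argument matters: inside the inner loop, j indexes within the current batch, so y[j] and y_test[j] only agree on the first batch. A toy check with made-up targets:

```python
y = [0, 0, 1, 1]                 # full target array
batch_size = 2
i = 2                            # start of the second batch
y_test = y[i:i + batch_size]     # [1, 1] -- this batch's targets

j = 0
# y[j] is 0 (first element of the FULL array), but
# y_test[j] is 1 (first element of THIS batch).
assert y[j] != y_test[j]
```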


Hi @ai_curious / @Mubsi, I am facing an issue in UNQ_C6.

I am getting the error below:

     38 Q1, Q2 = next(data_generator([Q1], [Q2], 1, vocab['<PAD>']))
     39 # Call the model
---> 40 v1, v2 = model((Q1, Q2))

TypeError: cannot unpack non-iterable NoneType object

My notebook includes a comment line…

    # Hint: use `vocab['<PAD>']` for the `pad` argument of the data generator
   Q1, Q2 = None

that I don’t see reflected in your code. I have found it best practice to comply with the hints and guidance provided in the notebooks. Maybe give it a try and let us know what happens.
