C2_W1 - Exercise 9 - errors

Chris_Tsalakopoulos · April 22, 2025, 2:56am

Hi,

There are some posts on this from a few years ago, and I believe I have covered what is discussed there, but I still have a couple of errors I can’t address. The test result is this:

I have allow_switches check done for both one and two edits:
edit_one_letter(word,allow_switches)
edit_one_letter(X,allow_switches)

My general logic looks correct:

1. Call edit_one_letter(word,allow_switches) and perform one edit on the original word.
2. The called function takes account of the allow_switches argument and, based on its value, performs the appropriate edits.
3. Take the returned set from step one above, use it as a list, iterate through each word, and perform a second edit with edit_one_letter(X,allow_switches) for each iteration (X holds the word). For each iteration, take the returned set and update this into a clean set from the first step.
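
The two-pass logic described above could be sketched roughly as follows. This is only an illustration of the shape of the second pass: `toy_edit_one` is a hypothetical stand-in that does deletes and optional adjacent switches only, and `two_step_edits` is a made-up name, so neither is the assignment's actual function.

```python
def toy_edit_one(word, allow_switches=True):
    """Hypothetical stand-in for edit_one_letter: deletes and optional adjacent switches only."""
    edits = {word[:i] + word[i + 1:] for i in range(len(word))}
    if allow_switches:
        edits |= {
            word[:i] + word[i + 1] + word[i] + word[i + 2:]
            for i in range(len(word) - 1)
        }
    return edits


def two_step_edits(word, allow_switches=True, edit_one=toy_edit_one):
    """Apply edit_one to the word, then again to every word from the first pass."""
    first_pass = edit_one(word, allow_switches)
    second_pass = set()
    for candidate in first_pass:
        # Union the second-pass results into one set, so duplicates collapse.
        second_pass |= edit_one(candidate, allow_switches)
    return second_pass


print(sorted(two_step_edits("at", allow_switches=True)))
print(sorted(two_step_edits("at", allow_switches=False)))
```

Because the results are accumulated in a set, the same word reached by two different edit paths is counted only once, which matters when comparing lengths against the expected values.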

I’ve tried removing allow_switches from the second edit step and still get two errors, but further away from expected outcomes. With allow_switches in for the second edit, I’m only 1 off the expected value in both errors.

Anyone have any guidance/thoughts on this, and what I might be missing?

paulinpaloalto · April 22, 2025, 4:31am

I added a print statement to show the length of the output and then ran the full set of tests and here’s what I see:

len(edit_two_set) 2654
len(edit_two_set) 7154
len(edit_two_set) 7130
len(edit_two_set) 14206
len(edit_two_set) 14352
 All tests passed

So it looks like you are only failing the last two tests with the length being less by 1. You can read the test cases by opening the file w1_unittest.py. What I see is that the 5 tests are:

word = 'a', allow = True
word = 'at', allow = True
word = 'at', allow = False
word = 'cat', allow = False
word = 'cat', allow = True

So you pass some tests with allow = T or F, so that’s not the problem. You only fail when the input word has 3 letters, apparently. So what in your logic would handle that case differently? Why would your logic omit one output word in the case that the input has 3 letters?

It would be worth printing the length as I did, so that you can compare and confirm the theory that your length values are correct for the first 3 tests.
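
A small harness along these lines is one way to print the lengths for all five cases in one go. It is a sketch under assumptions: `report_lengths` and `toy_edit` are hypothetical names, and `toy_edit` is a trivial placeholder so the block runs on its own. In the notebook you would pass your own edit_two_letters instead.

```python
def report_lengths(edit_fn, cases):
    """Call edit_fn on each (word, allow_switches) pair and return the result sizes."""
    sizes = []
    for word, allow in cases:
        result = edit_fn(word, allow)
        sizes.append(len(result))
        print(f"word={word!r} allow_switches={allow} -> len {len(result)}")
    return sizes


def toy_edit(word, allow_switches=True):
    """Placeholder only: every single-character deletion of word (ignores allow_switches)."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


# The same five (word, allow_switches) pairs listed above.
cases = [("a", True), ("at", True), ("at", False), ("cat", False), ("cat", True)]
sizes = report_lengths(toy_edit, cases)
```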

Chris_Tsalakopoulos · April 22, 2025, 5:11am

Hey Paul, thanks for your analysis.
I’ll run some further tests based on your suggestions and see what comes out.

Chris_Tsalakopoulos · April 22, 2025, 6:31am

Hi Paul,

I ran some tests, and the lengths all match except for the last two test cases in unittest, that is:
len(edit_two_set) 2654
len(edit_two_set) 7154
len(edit_two_set) 7130
len(edit_two_set) 14205 (should be 14206)
len(edit_two_set) 14351 (should be 14352)

So, as you state, it’s failing by one value on the 3 letter word for the second edit step, in both True & False cases. So it wouldn’t be the ‘allow_switches’ logic.

My 3 letter edits in general work, as they have been tested by unittest in the previous question.

Something in the second edit step seems to be causing the issue for 3 letter words. I seem to be iterating correctly through each word on the second step. I used print statements to check a sample of the iterative steps, and it looked fine. At any rate, if that were the issue here I would likely get errors elsewhere.

Unittest doesn’t have the full list of expected outcomes, otherwise I could run a set comparison and see what words were actually missing. It only has a sample head and tail subset of the total outputs.
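
If a trusted full output were available, the set comparison mentioned above could look something like this. The helper name `compare_word_sets` and the word sets are made up for illustration; they are not taken from the unit tests.

```python
def compare_word_sets(expected, actual):
    """Return (missing, extra): words only in expected, and words only in actual."""
    missing = sorted(expected - actual)
    extra = sorted(actual - expected)
    return missing, extra


# Toy data for illustration only; in practice `expected` would come from a
# known-good implementation and `actual` from the function under test.
expected = {"ab", "ba", "abc"}
actual = {"ab", "abc"}

missing, extra = compare_word_sets(expected, actual)
print(f"missing: {missing}, extra: {extra}")
```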

I also tried an “if X == '': continue” to skip over any null values in the iteration steps for the second edit (there is one I saw), but this made no difference either way.

So, I’m not sure…I’ll have to think further about it. Unless you have any further suggestions?

paulinpaloalto · April 22, 2025, 2:48pm

They don’t show the full set of output words and with > 14k of them that’s probably not going to be easy to deal with in any case. I copied their “first 10” and “last 10” logic inside the function so that we can at least get some visibility and here’s what I see:

len(edit_two_set) 2654
First 10 strings ['', 'fac', 'acw', 'eha', 'eal', 'agd', 'nas', 'aun', 'avh', 'yar']
Last 10 strings ['wua', 'ray', 'haw', 'ow', 'fwa', 'vai', 'akd', 'eam', 'ars', 'bc']
len(edit_two_set) 7154
First 10 strings ['', 'afot', 'hayt', 'zot', 'agd', 'atvm', 'aun', 'uto', 'astg', 'avte']
Last 10 strings ['waft', 'bo', 'mtb', 'ghat', 'pax', 'afft', 'acta', 'auth', 'uhat', 'aul']
len(edit_two_set) 7130
First 10 strings ['', 'afot', 'hayt', 'zot', 'agd', 'atvm', 'aun', 'uto', 'astg', 'avte']
Last 10 strings ['waft', 'bo', 'mtb', 'ghat', 'pax', 'afft', 'acta', 'auth', 'uhat', 'aul']
len(edit_two_set) 14206
First 10 strings ['csatf', 'cuakt', 'ckayt', 'jckt', 'cixt', 'hayt', 'cvd', 'hcawt', 'caje', 'zot']
Last 10 strings ['adcat', 'cptat', 'cfsat', 'focat', 'uctt', 'cxap', 'cmatr', 'cjrat', 'csatj', 'uhat']
len(edit_two_set) 14352
First 10 strings ['csatf', 'cuakt', 'ckayt', 'jckt', 'cixt', 'hayt', 'cvd', 'hcawt', 'caje', 'zot']
Last 10 strings ['cptat', 'cfsat', 'focat', 'uctt', 'cxap', 'cmatr', 'acta', 'cjrat', 'csatj', 'uhat']
 All tests passed

I guess we may not get lucky enough that the bug is at the beginning or at the end, but it’s worth a try.

If that doesn’t shed any light, maybe it’s time for the “in case of emergency, break glass” method and we should just look at your code. We can’t do that in a public thread, but please check your DMs for a message from me.

Deepti_Prasad · April 22, 2025, 2:57pm

@Chris_Tsalakopoulos

If the code for your previous graded functions is correct, then fixing your edit two letters code requires you to call those functions as described in this LinkedIn comment post.

I would also want you to check your process data code (the first graded function) too; the same link has a further response on how to call that code as well.

Go through that post. If you still encounter the issue, you can share a screenshot of your graded code with any of us by DM.

Regards
DP

Deepti_Prasad · April 23, 2025, 4:11am

After going through the graded function code for process data, edit one letter and edit two letters, I noticed you have hard-coded parts of it, especially in edit two letters. Even in edit one letter it seems you copy-pasted code, as I can see ###START CODE HERE twice in the screenshots.

The main issue does lie in edit two letters, but I would recommend taking a fresh copy and redoing the assignment.

The autograder for edit one letter might fail because you hard-coded each type of letter edit in if and else conditions and then called the edit one set. You are supposed to call that code in a single line.

For edit two letters, you first created a set of words using the edit one letter set and allow switches.

You then used it again to check X in the same code, and finally created the edit two letter set from that first set. This is incorrect.

First create edit_one_l by applying the edit one letter function to the word and the switch flag (this is what you called as the first set).
After this you just need a for loop.

For every cur word in edit_one_l, build the edit two set as s | t, where s is your edit two set and t is the edit one letter set for the cur word and allow switches.

If you are still failing the test, then let Paul see your other code from before edit one letter; there might be a very minute error in the conditions for the delete, replace or insert letter functions.

paulinpaloalto · April 23, 2025, 4:14am

I have not figured out why yet, but the problem is in the replace_letter function. If I replace that code with my code and leave everything else the same, Chris’s edit_two_letters function passes the tests.

paulinpaloalto · April 23, 2025, 4:17am

Here is the code Chris has for that:

{moderator edit - solution code removed}

My code looks very different in that I used straight indexing and enumerations, not a doubly nested for loop.

Of course his code passes the unit test, so this is also a bug in the unit tests.

It’s getting late in my timezone and I’m not sure I will be able to spend more time on this tonight. Sorry, but I’ll see how far I get.

paulinpaloalto · April 23, 2025, 4:27am

I think that line is wrong. string replace will replace all instances of a given phrase. What if the letter being replaced occurs more than once? You need to do the operation by index, not by value, right?
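
To make the by-value versus by-index distinction concrete, here is a minimal sketch using plain Python string operations. The variable names are illustrative only. Note that passing a count of 1 to `str.replace` does not help either, because it still picks the first matching character rather than the one at the intended index.

```python
word = "caa"
index = 2      # the position we actually want to change
letter = "t"

# By value: every 'a' is replaced, not just the one at `index`.
by_value = word.replace(word[index], letter)

# By value with count=1: only the FIRST 'a' is replaced, which is index 1, not 2.
by_value_once = word.replace(word[index], letter, 1)

# By index: rebuild the string around the chosen position.
by_index = word[:index] + letter + word[index + 1:]

print(by_value, by_value_once, by_index)
```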

Deepti_Prasad · April 23, 2025, 4:34am

Thanks Paul, this does help.

So here his code goes in the wrong direction where he sets those conditions for incorrect words and letters.

@Chris_Tsalakopoulos, in the replace letter graded function code:

First, you need to avoid taking the len of the word and letters.

While your statement that splits the word into general substrings is correct, you need to follow the code pattern below:

A list comprehension or for loop which form strings by replacing letters. This can be of the form:
[f(a,b,c) for a, b in splits if condition for c in string] recalling it as split_l

The next instruction tells you:
Step 2: generate all possible strings by replacing one letter in the word

Use the same form of condition I mentioned before to replace one general substring, but the difference here is that the letter is chosen from a string of letters and applied to split_l.

Your third instruction says:
Step 3: remove the input word
Here, use the whole condition to check the words in replace letter; you are supposed to .remove the word, not append the new word.
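
For readers unfamiliar with the comprehension shape quoted above, `[f(a,b,c) for a, b in splits if condition for c in string]`, one possible instantiation is sketched below. It follows the style of the widely published Norvig spelling corrector and is not the assignment's reference solution; the names `splits_of` and `replace_by_split` are hypothetical.

```python
import string


def splits_of(word):
    """All (left, right) pairs such that left + right == word."""
    return [(word[:i], word[i:]) for i in range(len(word) + 1)]


def replace_by_split(word, letters=string.ascii_lowercase):
    """Replace exactly one position with each letter, then drop the input word."""
    candidates = {
        left + c + right[1:]          # f(a, b, c): swap the first char of `right`
        for left, right in splits_of(word)
        if right                      # condition: there must be a char to replace
        for c in letters
    }
    candidates.discard(word)          # replacing a letter with itself gives the input
    return sorted(candidates)


result = replace_by_split("at")
print(len(result), result[:5])
```

Because each replacement is built from the split point, the position is fixed by index and repeated letters elsewhere in the word are left alone.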

Deepti_Prasad · April 23, 2025, 4:35am

Also, about the letter condition choice, Paul: he seems not to have tried to replace all the possible strings. The condition is incorrect and not as per the instructions below the exercise header.

paulinpaloalto · April 23, 2025, 2:13pm

Sorry!!! I got confused last night and posted the replace_letter source code on the public thread, when I meant to post it on the DM thread that we have been working on in parallel.

Deepti_Prasad · April 23, 2025, 3:03pm

@Chris_Tsalakopoulos

Take a fresh copy and redo the assignment. Remember, as Paul mentioned, although your code might still get you the expected results in some tests, it leads to bugs in later test cells, resulting in mismatched, unexpected output.

So try to stick to writing code as per the instructions already given in the assignment.

Let us know if you are still stuck, we are happy to help you.

paulinpaloalto · April 23, 2025, 4:46pm

You can also just start by rewriting the replace_letter code based more closely on the instructions that they give you. The evidence suggests that is the only real problem, although I notice that you also used the “nested loop” implementation in some of the other “letter” functions.

My next project is to come up with a test case that fails with your current version of the replace_letter code, so that I can file a bug to enhance the unit tests in the notebook to catch the problem. Not sure how much time I’ll have to work on that today, but I will let you know as soon as I have such a test case.

paulinpaloalto · April 23, 2025, 5:28pm

Consider this code:

word = "caa"
letter = 't'
new_word = word.replace(word[2], letter)
print(f"new_word {new_word}")

Running that gives this result:

new_word ctt

Needless to say, this is double plus ungood.

Now it turns out, for reasons that I cannot yet explain, if you use your broken implementation, the only missing word in both of the “cat” cases (switch True or False) is “tac”. So for every other word that would be eliminated by the above bug, there must be another path with edit distance two that gets you to the same result.

More news later, I hope …

Chris_Tsalakopoulos · April 24, 2025, 5:12am

Thanks Deepti and Paul for the suggestions.

I’ve had to attend to some job tasks most of yesterday and today, and haven’t had a chance to try the suggestions.

I plan to work on this tomorrow. I will let you know how I go.

Thanks again.

paulinpaloalto · April 24, 2025, 2:46pm

It is easy to come up with a test case that will fail with your implementation of replace_letter: just include a repeated letter.

Here’s what happens:

replace_l = replace_letter_chris(word='caa',
                              verbose=True)
replace_s = set(replace_l)
print(f"len(replace_l) {len(replace_l)}")
print(f"len(replace_s) {len(replace_s)}")

Running that gives:

Input word = caa 
split_l = [['', 'caa'], ['c', 'aa'], ['ca', 'a']] 
replace_l ['aaa', 'baa', 'cbb', 'cbb', 'ccc', 'ccc', 'cdd', 'cdd', 'cee', 'cee', 'cff', 'cff', 'cgg', 'cgg', 'chh', 'chh', 'cii', 'cii', 'cjj', 'cjj', 'ckk', 'ckk', 'cll', 'cll', 'cmm', 'cmm', 'cnn', 'cnn', 'coo', 'coo', 'cpp', 'cpp', 'cqq', 'cqq', 'crr', 'crr', 'css', 'css', 'ctt', 'ctt', 'cuu', 'cuu', 'cvv', 'cvv', 'cww', 'cww', 'cxx', 'cxx', 'cyy', 'cyy', 'czz', 'czz', 'daa', 'eaa', 'faa', 'gaa', 'haa', 'iaa', 'jaa', 'kaa', 'laa', 'maa', 'naa', 'oaa', 'paa', 'qaa', 'raa', 'saa', 'taa', 'uaa', 'vaa', 'waa', 'xaa', 'yaa', 'zaa']
len(replace_l) 75
len(replace_s) 50

Here’s the output I get with my code that passes the grader:

Input word = caa 
split_l = [('', 'caa'), ('c', 'aa'), ('ca', 'a')] 
replace_l ['aaa', 'baa', 'cab', 'cac', 'cad', 'cae', 'caf', 'cag', 'cah', 'cai', 'caj', 'cak', 'cal', 'cam', 'can', 'cao', 'cap', 'caq', 'car', 'cas', 'cat', 'cau', 'cav', 'caw', 'cax', 'cay', 'caz', 'cba', 'cca', 'cda', 'cea', 'cfa', 'cga', 'cha', 'cia', 'cja', 'cka', 'cla', 'cma', 'cna', 'coa', 'cpa', 'cqa', 'cra', 'csa', 'cta', 'cua', 'cva', 'cwa', 'cxa', 'cya', 'cza', 'daa', 'eaa', 'faa', 'gaa', 'haa', 'iaa', 'jaa', 'kaa', 'laa', 'maa', 'naa', 'oaa', 'paa', 'qaa', 'raa', 'saa', 'taa', 'uaa', 'vaa', 'waa', 'xaa', 'yaa', 'zaa']
len(replace_l) 75
len(replace_s) 75
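
The counts above can be reproduced with a short standalone sketch. The original code was removed from the thread, so `replace_by_value` below is only a guess at the kind of bug being discussed, and both function names are hypothetical.

```python
import string


def replace_by_value(word):
    """Reconstruction of the bug: str.replace swaps every occurrence of word[i]."""
    out = []
    for i in range(len(word)):
        for c in string.ascii_lowercase:
            if c != word[i]:
                out.append(word.replace(word[i], c))
    return sorted(out)


def replace_by_index(word):
    """Index-based version: only position i changes."""
    out = []
    for i in range(len(word)):
        for c in string.ascii_lowercase:
            if c != word[i]:
                out.append(word[:i] + c + word[i + 1:])
    return sorted(out)


for fn in (replace_by_value, replace_by_index):
    result = fn("caa")
    print(fn.__name__, "len(list)", len(result), "len(set)", len(set(result)))
```

Comparing the list length with the set length, as done above, is a cheap check that would flag this class of bug on any input with a repeated letter.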

Chris_Tsalakopoulos · April 25, 2025, 2:18am

@Deepti_Prasad & @paulinpaloalto ,

Yes, the ‘replace’ method was the issue. As Paul showed, the ‘replace’ method replaces ALL the occurrences of the same character in a word. Therefore, if I have a word ‘caa’, and I attempt a replacement (with ‘t’) based on the letter at, say, the 1st position, the method will interpret that to mean that all occurrences of the character ‘a’ in the original word need to be replaced with ‘t’. So, I get ‘ctt’, which is wrong.

I just changed the approach in the code in the replace_letter function to identify the index position in the word and do the replacement by concatenating the word parts back together with the new letter in place at the identified position.

This worked straight away. I kept everything else the same and passed all tests in Exercise 9.

I could also try replacing the nested for loops with the recommended approach in the exercise. But I’ll do that later.

So thanks again, this was a hidden issue that was initially difficult to identify, but we got there in the end.
