C4W2 Assignment: Issue implementing it on Personal machine

Shantimohan_Elchuri · April 4, 2022, 5:48pm

I was trying to implement Text Summarization outside the portal on my machine. I was able to set it up to run on my Mac successfully, but when I tried to use my own data directories I ran into a problem. The issue occurs at Cell 42 (two cells below UNQ_C8). The code is:

!rm -f ~/model/model.pkl.gz
loop = training_loop(TransformerLM, train_batch_stream, eval_batch_stream)
loop.run(10)

Note the 1st line of code which deletes the model file from the home/model directory. Then the model is rebuilt in the same directory when you run this cell.

Then coming to Cell 43, the code is:

# Get the model architecture
model = TransformerLM(mode='eval')

# Load the pre-trained weights
model.init_from_file('model.pkl.gz', weights_only=True)

Here the pre-trained weights are not loaded from the model file in home/model directory but from the current code directory. So I changed it to pick the file from the home/model directory.
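
Before blaming the model file, it may help to rule out a plain path problem. The sketch below is not from the assignment: resolve_model_path is a hypothetical helper, and it assumes the path was written with a leading ~. The shell expands ~ for the !rm line, but Python file APIs generally do not, so the helper expands it explicitly and fails early if nothing is there.

```python
import os
from pathlib import Path


def resolve_model_path(path_str):
    """Expand a leading '~' and return an absolute Path.

    Hypothetical helper (not part of trax or the assignment).
    Raises FileNotFoundError if no regular file exists at the result.
    """
    path = Path(os.path.expanduser(path_str)).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No model file at {path}")
    return path
```

In the notebook, the result could then be passed on with something like model.init_from_file(str(resolve_model_path('~/model/model.pkl.gz')), weights_only=True). If that still raises the index error, the path is not the cause and the file contents are the next thing to check.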

No, it doesn’t work. It gives an index error. But it works with the model file in the code directory.

Does that mean the model rebuilt in Cell 42 is corrupted or wrong, or is it something else? Is there a way to correct this issue? This is preventing me from applying what I learnt to solve my own problems.

paulinpaloalto · April 4, 2022, 6:39pm

I am not familiar with how any of that code works, but there’s a pretty obvious thing to check:

My guess is simply that you are not properly understanding where the “current working directory” is for the various stages there. If you want to do things in your own environment, then you also need to develop the skills to solve this type of problem on your own. At the point that you get the failure, stop and run the following Linux “find” command to figure out where the new model file actually got created:

find ~ -name model.pkl.gz -print

It’s probably getting created somewhere you didn’t expect. The output of the find command should point out where it is.
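
The same check can be run from inside the notebook, which also shows the working directory the kernel is actually using. This is only a rough Python equivalent of the find command above; find_files and report_model_files are hypothetical names, not part of the assignment.

```python
import os
from pathlib import Path


def find_files(root, name):
    """Return sorted paths under root whose file name equals name."""
    matches = []
    # os.walk silently skips directories it cannot read.
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            matches.append(Path(dirpath) / name)
    return sorted(matches)


def report_model_files(root=None, name="model.pkl.gz"):
    """Print the kernel's working directory and every matching file."""
    root = Path.home() if root is None else Path(root)
    print("Current working directory:", Path.cwd())
    for path in find_files(root, name):
        print(path)
```

Calling report_model_files() in a cell right after the failure walks the whole home directory, so it can take a while on a large disk.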

Shantimohan_Elchuri · April 4, 2022, 10:02pm

I forgot to mention that I was running it on Mac. When I said “current working directory”, I meant the directory that is shown when you run the command ‘pwd’ in the terminal.

Secondly, I deleted the file model.pkl.gz in the pwd and the script threw file not found error. So the model was created in ‘~/model’ directory but the script was looking for the file in the pwd folder.

I also verified that through Finder.

So I hope I understand what I am working with. And regarding developing skills to solve the problem on my own, I have never worked in AI before, but I have been working with computers, both Windows and Unix, for about 40 years.

ThanQ…

ai_curious · April 4, 2022, 10:38pm

I have not yet been able to install trax on my Mac. Did you build from source? Or, which channel did you find to install from? x86 or ARM?

As far as I can tell, the code expects to read the model.pkl.gz from the same directory the Jupyter notebook was launched from. In the archive I downloaded there is no model subdirectory. Unfortunately my subscription has ended so I can’t access the Coursera env and because of my trax hard stop I can’t run locally either.

ai_curious · April 5, 2022, 12:50pm

btw as far as I can tell from the trax documentation, the init_from_file function loads a predefined model into memory, but does not write to disk. That requires model.save_to_file(). So if you successfully deleted the pretrained model’s model.pkl.gz file in the previous line, then you’re out of luck. I think the only reason this code ever ran is that there is no /model subdirectory, so the !rm command was a no-op.
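
One way to tell the two files apart is to compare their size and modification time: a file written by the local training run should carry a timestamp from that run. A minimal sketch, assuming the two candidate locations mentioned in this thread; describe_file is a hypothetical helper.

```python
from datetime import datetime
from pathlib import Path


def describe_file(path):
    """Return (exists, size_in_bytes, modified_time) for path.

    A leading '~' is expanded. Missing files give (False, None, None).
    """
    path = Path(path).expanduser()
    if not path.is_file():
        return (False, None, None)
    stat = path.stat()
    return (True, stat.st_size, datetime.fromtimestamp(stat.st_mtime))
```

Running describe_file on both "model.pkl.gz" and "~/model/model.pkl.gz" and printing the results should show which file is the shipped one and which was produced locally.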

Shantimohan_Elchuri · April 5, 2022, 2:31pm

I just ran C3-W1-Lab1-Introduction to Trax.ipynb after uncommenting the trax install statement in the 1st cell.

#!pip install trax==1.3.9 Use this version for this notebook

Just move the comment marker # to before ‘Use…’

That installed trax for me on Intel-Mac.

Shantimohan_Elchuri · April 5, 2022, 3:28pm

I extracted only the necessary code from the assignment file and ran into issues when I tried to run it. I am definitely missing something.

I have uploaded the notebook that I prepared. Can any of you run it and find out what I am doing wrong?

Download it into a separate folder of its own and run it.

Yes, my aim is to use some other data, and eventually my own data, for Text Summarization.

Text Summarization with TRAX.ipynb (59.3 KB)

ai_curious · April 5, 2022, 3:31pm

Thanks for clarifying. Mine is an M1 and I haven’t yet resigned myself to a virtual env running emulated stuff.

ai_curious · April 5, 2022, 3:42pm

Looks like different trax versions have caused this same issue for other people. Are you sure you’re running the same version locally as was used to create that pickle file originally?
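
A quick way to answer that from the notebook is to ask Python which version is installed in the kernel's environment. This sketch uses only the standard library (importlib.metadata, Python 3.8 or later); installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Printing installed_version("trax") and installed_version("jaxlib") in the same notebook that fails reports what that kernel really has, which may differ from what a terminal or the Anaconda UI shows.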

Shantimohan_Elchuri · April 5, 2022, 3:53pm

Ah! I am still undecided on the M1 Mac. But why do you think you will have to use virtual envs? I am using an Anaconda env. I don’t know whether that counts as a virtual one.

Shantimohan_Elchuri · April 5, 2022, 3:59pm

Coursera recommends using Trax version 1.3.9 (as mentioned in C3-W1-Lab1), so I have been using that. From the link you posted, it seems a version earlier than 1.3.7 had resolved the issue. My Anaconda UI doesn’t give any option to install another version, so I will have to do some research to do that.

Shantimohan_Elchuri · April 6, 2022, 2:09pm

@ai_curious I ran ‘!pip install trax==1.3.6’. It installed without error, but now jaxlib is corrupted. I can’t do anything with jaxlib in the Anaconda environment.

‘!pip install trax==1.3.9’ did install the required libs originally. But now doing the same for ver 1.3.7 didn’t update jaxlib.

So now my trax env has become unusable, and I am researching how to repair it.

Do you have any ideas?

ai_curious · April 6, 2022, 2:20pm

I use miniconda from the command line, so no suggestions about the Anaconda UI, sorry. I’m pretty confident pip uninstall can remove specific packages; then use pip install with a specific version to match the trax level.
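
For what it's worth, the uninstall-then-pin idea could be sketched like this. It is only an illustration: pin_commands and run_commands are hypothetical helpers, and which jaxlib version matches a given trax release is not something this thread settles, so that version is left for you to look up. Going through sys.executable keeps pip tied to the same environment the notebook kernel runs in.

```python
import subprocess
import sys


def pin_commands(package, version):
    """Build pip commands that remove a package and reinstall one exact version."""
    pip = [sys.executable, "-m", "pip"]
    return [
        pip + ["uninstall", "-y", package],
        pip + ["install", f"{package}=={version}"],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, run_commands(pin_commands("trax", "1.3.9")) would go back to the version the course recommends. Nothing is executed until run_commands is called, so the commands can be printed and inspected first.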

Milena_Djordjevic · September 19, 2022, 11:15am

I get similar errors when implementing this code locally. It seems that none of the trax versions is compatible: I’ve tried 1.3.1 (as mentioned in the assignments from the previous courses), 1.3.7, 1.3.9 and the latest, 1.4.1. It would be much appreciated if someone resolves this and posts the solution.
