This line of code converts the text into a tensor, so you should use it instead of tf.constant() (there is no tf.constant in the cell above Exercise 5), as per my previous instruction:
Your ChatGPT must love you b/c it keeps giving me errors.
Nevertheless, thank you for this! I would have NEVER known to phrase my prompt this way.
No problem. My prompt is actually pretty weak, since I didn’t think for two seconds when creating it. Pondering longer on it would definitely give you better results.
If you want to learn more about it, you should check out this free course.
What error do you get?
Hello
NameError Traceback (most recent call last)
Cell In[43], line 6
      3 temp = 0.0
      4 original_sentence = "I love languages"
----> 6 translation, logit, tokens = translate(trained_translator, original_sentence, temperature=temp)
      8 print(f"Temperature: {temp}\n\nOriginal sentence: {original_sentence}\nTranslation: {translation}\nTranslation tokens:{tokens}\nLogit: {logit:.3f}")

Cell In[42], line 24, in translate(model, text, max_length, temperature)
     20 original_sentence = "I love languages"
     22 # Assuming trained_translator is your trained model
     23 # Ensure that the translate function applies text_vectorizer internally if needed
---> 24 translation, logit, tokens = translate(trained_translator, text_vectorizer, original_sentence, 50, 0.0)
     25 ### END CODE HERE ###
     26
     27 # Concatenate all tokens into a tensor
     28 tokens = tf.concat(tokens, axis=-1)

NameError: name 'text_vectorizer' is not defined
Following your guidance, I’ve successfully applied the hint from the previous cell to convert the input text into a tensor and add a batch dimension with the following line of code:
texts = tf.convert_to_tensor(eng_sentence)[tf.newaxis]
This step helped me understand the importance of formatting the text input correctly for TensorFlow processing, ensuring it’s in a tensor format with the necessary batch dimension.
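For reference, here is a minimal, self-contained sketch of what that line does, assuming eng_sentence is a plain Python string (the variable name here is just an illustrative stand-in):

```python
import tensorflow as tf

eng_sentence = "I love languages"  # hypothetical example input

# Convert the Python string into a scalar string tensor, then add a
# leading batch dimension so the shape becomes (1,): a batch of one sentence.
texts = tf.convert_to_tensor(eng_sentence)[tf.newaxis]

print(texts.shape)  # (1,)
```

The batch dimension matters because Keras layers such as TextVectorization expect batched input, even for a single sentence.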
However, I’ve run into an issue where implementing the complete logic for the translate function and subsequently calling this function with the correct parameters seems to require modifications outside the designated START/END code comment tags. Specifically, the structure of my code includes defining the translate function and then calling it with the necessary arguments, like so:
def translate(model, text_vectorizer, text, max_length=50, temperature=0.0):
    # Function logic here
    pass  # Placeholder for implementation details
# Outside START/END tags
translation, logit, tokens = translate(trained_translator, text_vectorizer, original_sentence, 50, 0.0)
I understand the importance of adhering to the exercise’s constraints and guidelines, but I find myself uncertain about how to proceed without adjusting the code outside the specified comment tags to correctly define and utilize the translate function as intended.
Should modifications outside the START/END comment tags be considered acceptable for correctly implementing and demonstrating the functionality of the translate function, especially in light of the hint provided?
I’m just curious, as I’m not a mentor for this course. Would using ChatGPT to write your code be allowed under the Code of Conduct for this course?
Hi @TMosh
To be honest, I don’t know.
I don’t think this question is course-dependent (if ChatGPT is not allowed for DLS, then it is most probably not allowed in other courses either).
My personal view is that it is a grey area. I see two aspects of this question - copyright and cheating.
As for copyright: in my view, ChatGPT is a tool (like Google Drive, Gmail, VS Code, etc.). If you can save assignment notebooks in your Gmail or Google Drive, you are probably also fine asking ChatGPT questions about your code (or getting suggestions from the Copilot plugin in VS Code, using Copilot chat, etc.). In other words, you are not sharing the material with the public (it’s fine to keep a copy on Google Drive as long as it’s not visible to everyone on the internet) but with companies and the products they build, which is probably acceptable.
I’m not a lawyer; maybe there are distinctions in how the data is used (I rarely read or fully understand the Terms of Use) and other things lawyers might enlighten us about, so who knows.
As for cheating, in my personal view, it’s again a grey area. It could be compared to using a calculator, documentation, or Stack Overflow. Tools like these are not clearly limited: they could complete the assignment entirely, or just help with understanding the code, or help in some other way. In other words, it’s not clear if or when using ChatGPT counts as cheating.
What are your thoughts on that? Maybe @Mubsi or @Staff could clarify that for us?
Hi @RyeToast
Take note what is being asked from you:
In other words, there is no text_vectorizer, but there is english_vectorizer defined for you.
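As a hedged illustration of the point, here is a toy TextVectorization layer standing in for the notebook’s english_vectorizer (the adapted vocabulary below is made up, not the assignment’s data). The layer is applied to the whole batched sentence at once:

```python
import tensorflow as tf

# Hypothetical stand-in for the notebook's english_vectorizer: a small
# TextVectorization layer adapted on a toy corpus.
english_vectorizer = tf.keras.layers.TextVectorization()
english_vectorizer.adapt(["i love languages", "you love code"])

# Vectorize a whole batched sentence (shape (1,)), not word by word.
texts = tf.convert_to_tensor("i love languages")[tf.newaxis]
tokens = english_vectorizer(texts)

print(tokens.shape)  # (1, 3): one sentence of three token ids
```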
Cheers
Hi, @arvyzukai
My 6 attempts to resolve this within the allowed modification area have not been successful:
After following the provided instructions and your previous advice to use english_vectorizer instead of text_vectorizer, I successfully addressed the initial NameError. However, I’ve run into a ValueError related to processing the vectorized text through the LSTM layer of our model. The error message indicates that the LSTM layer cannot accept a RaggedTensor, which is the output format of our english_vectorizer when applied to the input text.
Given the constraint that we can only modify code within the ### START CODE HERE ### and ### END CODE HERE ### comments, I’m unsure how to proceed with converting the RaggedTensor to a regular tensor suitable for the LSTM layer. I understand that the LSTM expects a uniform tensor format.
How do I handle the RaggedTensor produced by english_vectorizer in this context?
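If a vectorizer really does return a RaggedTensor, the usual fix is RaggedTensor.to_tensor(), which pads to a uniform shape. This is a general sketch on toy data; whether the notebook’s layer is configured to emit ragged output is an assumption:

```python
import tensorflow as tf

# A toy ragged batch standing in for vectorizer output: two sentences
# with different numbers of tokens.
ragged = tf.ragged.constant([[3, 7, 2], [5, 1]])

# Pad to a uniform dense tensor that an LSTM/Embedding layer can accept.
dense = ragged.to_tensor(default_value=0)

print(dense.numpy())  # [[3 7 2] [5 1 0]]
```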
In addition, I’m stuck on the start-of-sequence (SOS) token’s ID. I understand that this ID is crucial for the translation process, but I’m unsure how to correctly identify or define it in our current setup. I’ve tried to find it in the text processing pipeline.
Here’s the relevant part of the code where I’m facing the issue:
# INITIAL STATE OF THE DECODER
# First token should be SOS token with shape (1,1)
# You need to define or get the sos_id from somewhere in your code
sos_id = ... # replace this with the correct value
next_token = tf.fill((1, 1), sos_id)
I think the SOS token’s ID might be obtained from the english_vectorizer or the TokenTextEncoder, but I’m not sure how to apply this in our current context.
How do I correctly identify the SOS token’s ID?
If this assignment is like many others DLAI has created that use language models, then the SOS token is likely defined as a constant somewhere within the notebook.
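If it is not a named constant, one common pattern is to look the marker up in the vectorizer’s vocabulary. This is a sketch only; the exact marker string ("[SOS]") and the vectorizer configuration below are assumptions about this notebook, not facts from it:

```python
import tensorflow as tf

# Hypothetical vectorizer whose vocabulary includes an explicit SOS marker;
# standardize=None keeps the bracketed token from being stripped.
vectorizer = tf.keras.layers.TextVectorization(standardize=None)
vectorizer.adapt(["[SOS] i love languages [EOS]"])

vocab = vectorizer.get_vocabulary()
sos_id = vocab.index("[SOS]")  # integer id of the SOS marker

# Seed the decoder with a (1, 1) tensor holding the SOS id.
next_token = tf.fill((1, 1), sos_id)
print(next_token.shape)  # (1, 1)
```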
Hi, @TMosh
Cc: @arvyzukai
Firstly, I’ve successfully identified the sos_token within the notebook, which I understand is crucial for initializing the decoding process in our translation model. I found instances of ‘SOS’ and ‘sos_token’ mentioned throughout the notebook and have determined the correct usage of the sos_token to initiate the translation function.
However, while progressing with the implementation of the translate function, which uses the trained_translator model to translate a given sentence from English to Portuguese, I encountered a NameError indicating that the trained_translator variable is not defined. This occurs at the step where the function is called with a test sentence:
translation, logit, tokens = translate(trained_translator, original_sentence, temperature=temp)
Despite reviewing the notebook to ensure that all steps related to defining or loading the trained_translator model were correctly followed, I seem to be missing the specific instructions or steps necessary to initialize or load the pre-trained model required for the translation function to execute properly.
In addition, we’re instructed to only modify the code between the START/END comment tags. However, I’m unsure about the exact changes I need to make within these tags. So far, I’ve replaced sos_token with sos_id to initialize the next_token variable, as shown below:
# First token should be SOS token with shape (1,1)
next_token = tf.constant(sos_id, shape=(1, 1), dtype=tf.int32)
However, I believe there should be more code within the START/END tags to complete the translation process, such as feeding the input text into the model, generating the next token, and repeating this process until the EOS token is generated or the maximum length is reached.
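That loop structure can be sketched as follows. This is a toy stand-in, not the assignment’s model: dummy_step below fakes one decoder step’s logits, and the SOS/EOS ids are made-up values for illustration:

```python
import tensorflow as tf

SOS_ID, EOS_ID, MAX_LENGTH = 1, 2, 50  # assumed ids, for illustration only

def dummy_step(token):
    # Stand-in for one decoder step: fake logits over a tiny vocabulary
    # of size 4 that always strongly favour the EOS token.
    return tf.one_hot([[EOS_ID]], depth=4) * 10.0

next_token = tf.fill((1, 1), SOS_ID)
tokens = []
for _ in range(MAX_LENGTH):
    logits = dummy_step(next_token)
    # Greedy choice (temperature 0): take the arg-max logit.
    next_token = tf.argmax(logits, axis=-1, output_type=tf.int32)
    tokens.append(next_token)
    # Stop once the EOS token is generated.
    if int(next_token[0, 0]) == EOS_ID:
        break

result = tf.concat(tokens, axis=-1)
print(result.numpy())  # [[2]] -- EOS on the first step with this dummy
```

The real assignment replaces dummy_step with a decoder call that also carries the encoder context and hidden state, but the generate/check-EOS/repeat skeleton is the same.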
Also, when I tried to call the translate function, I received a NameError indicating that trained_translator is not defined. I understand that trained_translator should be the trained model used for translation, but I’m unsure about where and how to define and train this model in relation to the translate function.
How do I complete the translate function, and how do I define and train the trained_translator model?
Sorry, I can only offer basic advice (such as “Don’t modify anything outside of the ‘START CODE HERE’ sections”), because I’m not a mentor for this course and I do not have access to the course materials.
Hi @RyeToast
Once again, I would advise you to review my previous post on this thread.
Every single line you have to implement can be compared with the “See how it works by running the following cell:” cell, which is given to you as an example.
trained_translator is defined for you in previous cells (section “3. Training”), so you either forgot to run them (you need to run all previous cells), deleted the definition, or something similar. In other words, it’s a global variable that was defined for you.
Again, look at the “See how it works by running the following cell:” cell; you did not implement it the way it was done in that cell.
In summary, review my previous post, which explains how you should implement the translate function and how it compares to the “See how it works by running the following cell:” cell.
Cheers
Hi, @arvyzukai
I’ve run into issues with the translate function, particularly around the use of english_vectorizer and handling input data formats for the LSTM layer within the model’s encoder.
Text Vectorization Issue: I understand that the english_vectorizer is used to convert raw English text into a format that our model can understand. However, I’m unsure about the exact process of applying this vectorizer to the input text within the translate function. Should I be applying the vectorizer to the entire input sentence at once, or should I be processing the sentence word by word?
LSTM Input Data Format: I’m also having trouble understanding the correct format for the input data to the LSTM layer within the model’s encoder. I know that LSTM layers expect input data in a 3D format, but I’m unsure how to reshape my vectorized text data to meet this requirement. Could you provide some guidance on this?
I’ve reviewed the course materials and example code thoroughly, but I’m still having trouble grasping these concepts.
Apologies for my ignorance.
I’ve been using GitHub Copilot to assist with the coding process, but unfortunately, it hasn’t been able to provide a solution for the issues I’m encountering within the START/END comment tags either.
The entire sentence. english_vectorizer handles the sentence word by word, or more precisely token by token.
This was implemented by you previously in Exercise 1 - Encoder, which you later included in the Translator (model) in Exercise 4. So now you just need to call it as model.encoder, and it converts the tokenized sentence (the output of english_vectorizer) into the format the LSTM needs and also processes it with the same LSTM you defined in Exercise 1.
Does that make sense?
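To illustrate the shape handoff the learner was asking about, here is a toy sketch with made-up sizes (not the assignment’s actual encoder): an Embedding layer turns (batch, time) token ids into the (batch, time, features) 3-D input the LSTM expects, so no manual reshaping is needed.

```python
import tensorflow as tf

batch, time_steps, vocab_size, units = 1, 3, 100, 8  # toy dimensions

# Token ids as produced by a vectorizer: shape (batch, time).
token_ids = tf.constant([[5, 17, 42]])

embed = tf.keras.layers.Embedding(vocab_size, units)
lstm = tf.keras.layers.LSTM(units, return_sequences=True)

x = embed(token_ids)   # (1, 3, 8): now 3-D, ready for the LSTM
context = lstm(x)      # (1, 3, 8): one output vector per time step

print(x.shape, context.shape)
```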
I recommend against this, because it doesn’t necessarily understand what you’re trying to do, and the Code of Conduct says you should only submit your own work.
Shout out to @arvyzukai ! I would have NEVER figured this out without your hints! Thanks for the guidance and support throughout this challenging yet incredibly enlightening assignment. The journey through the neural translation model has been intricate, fascinating, and at times frustrating.
Navigating through the nuances of text vectorization, encoder-decoder structures, and the LSTM layers presented a steep learning curve. With the examples you provided, I’ve gained a more profound understanding of how to preprocess input texts, manage the LSTM input data format, and iteratively generate translations.
Thank you, again!
I agree with @TMosh on this - preferably you should come up with solutions without any help, and ChatGPT should only be used as a last resort (after you have tried multiple times, read through the documentation, searched online, etc.).
The goal of these Courses is learning and the learning process is important in achieving this goal.
In other words, if you just submitted all assignment notebooks to ChatGPT to be completed for you and it did so perfectly, the “amount” of learning you would get from that process is surely far less than from attempting to complete the course yourself.
Cheers
Hello @arvyzukai
I was only partially passing the translate grader cell, since my translation for temperature 0 was not matching, but I still passed the assignment.
I vaguely remembered the instruction mentioning tf.zeros but ignored it, as I was following the previous cell; reading your comment confirmed my mistake.
The detailed response to the learner in this post helps a lot.
Regards
DP
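For anyone hitting the same temperature-0 mismatch, the usual distinction (a sketch only; how the assignment’s translate actually branches is an assumption) is deterministic arg-max at temperature 0 versus temperature-scaled sampling otherwise:

```python
import tensorflow as tf

logits = tf.constant([[0.1, 2.5, 0.3]])  # fake logits for one decode step
temperature = 0.0

if temperature == 0.0:
    # Deterministic: always pick the highest-logit token.
    next_id = tf.argmax(logits, axis=-1, output_type=tf.int32)
else:
    # Stochastic: sample from logits scaled by the temperature.
    next_id = tf.random.categorical(logits / temperature, num_samples=1)
    next_id = tf.cast(next_id[:, 0], tf.int32)

print(int(next_id[0]))  # 1 -- index of the largest logit
```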