C1M3_Assignment - No sentence-transformers model found

Hello,

At the very beginning of the notebook (in the second executable cell) I get the following error and cannot proceed any further with the assignment.

“No sentence-transformers model found with name BAAI/bge-base-en-v1.5. Creating a new one with mean pooling.”

I have already tried multiple times to restart the kernel, reboot the server, and get the latest version, but nothing seems to work.


Hi @bottarelli.lorenzo, welcome to the RAG course support zone :smile:
I launched the C1M3_Assignment myself and cannot reproduce the issue you describe, so it may have been temporary. Note that you have both HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE set to 1 (which is the default). These environment variables control the caching mechanism: when they are 1, models are loaded only from the local disk cache and never downloaded. I have seen issues when setting them to 0, but that is probably not what is happening in your case.
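For reference, here is a minimal sketch of what those flags do. The variable names are the real ones read by transformers and huggingface_hub; the surrounding code is just illustrative:

```python
import os

# With these set to "1" (the lab default), transformers and huggingface_hub
# read model files only from the local disk cache and never contact hf.co.
# If the cache is missing or corrupt, loading a model raises
# LocalEntryNotFoundError, as in the traceback below.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"

offline = os.environ.get("HF_HUB_OFFLINE") == "1"
print("offline mode:", offline)  # → offline mode: True
```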
Can you please go to the top-right menu of the Jupyter notebook, restore the default environment, and try again? I think that should resolve any problem with the existing data.
Please let me know if this resolves it; I could still raise the issue I see on my side with the developers.


@bottarelli.lorenzo

Can you post a screenshot of the full error? We did have issues with the labs, but they were updated just yesterday. This seems to be a completely new error, especially the "Local entry not found" part.

As @ondrous mentioned, restoring the original version should get you the updated notebook. As another checkpoint, I sometimes delete the old assignment file and reboot the environment, so that no stale cache files corrupt the freshly updated environment. Clearing your browser cache and history is also a good step before restoring the updated notebook.

Click on the 3 dots in the upper-right corner and click Restore original version.

I have already tried restoring the original version multiple times. I also deleted all the content of my lab to be sure the restore recreated every file from scratch, but nothing seems to work.

As for the full error, it is far too long to fit in a screenshot, so here is a full copy:

No sentence-transformers model found with name BAAI/bge-base-en-v1.5. Creating a new one with mean pooling.

---------------------------------------------------------------------------
LocalEntryNotFoundError                   Traceback (most recent call last)
File /usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:470, in cached_files(path_or_repo_id, filenames, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    468 if len(full_filenames) == 1:
    469     # This is slightly better for only 1 file
--> 470     hf_hub_download(
    471         path_or_repo_id,
    472         filenames[0],
    473         subfolder=None if len(subfolder) == 0 else subfolder,
    474         repo_type=repo_type,
    475         revision=revision,
    476         cache_dir=cache_dir,
    477         user_agent=user_agent,
    478         force_download=force_download,
    479         proxies=proxies,
    480         resume_download=resume_download,
    481         token=token,
    482         local_files_only=local_files_only,
    483     )
    484 else:

File /usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py:114, in validate_hf_hub_args.<locals>._inner_fn(*args, **kwargs)
    112     kwargs = smoothly_deprecate_use_auth_token(fn_name=fn.__name__, has_token=has_token, kwargs=kwargs)
--> 114 return fn(*args, **kwargs)

File /usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1008, in hf_hub_download(repo_id, filename, subfolder, repo_type, revision, library_name, library_version, cache_dir, local_dir, user_agent, force_download, proxies, etag_timeout, token, local_files_only, headers, endpoint, resume_download, force_filename, local_dir_use_symlinks)
   1007 else:
-> 1008     return _hf_hub_download_to_cache_dir(
   1009         # Destination
   1010         cache_dir=cache_dir,
   1011         # File info
   1012         repo_id=repo_id,
   1013         filename=filename,
   1014         repo_type=repo_type,
   1015         revision=revision,
   1016         # HTTP info
   1017         endpoint=endpoint,
   1018         etag_timeout=etag_timeout,
   1019         headers=hf_headers,
   1020         proxies=proxies,
   1021         token=token,
   1022         # Additional options
   1023         local_files_only=local_files_only,
   1024         force_download=force_download,
   1025     )

File /usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1115, in _hf_hub_download_to_cache_dir(cache_dir, repo_id, filename, repo_type, revision, endpoint, etag_timeout, headers, proxies, token, local_files_only, force_download)
   1114     # Otherwise, raise appropriate error
-> 1115     _raise_on_head_call_error(head_call_error, force_download, local_files_only)
   1117 # From now on, etag, commit_hash, url and size are not None.

File /usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1636, in _raise_on_head_call_error(head_call_error, force_download, local_files_only)
   1635 if local_files_only:
-> 1636     raise LocalEntryNotFoundError(
   1637         "Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable"
   1638         " hf.co look-ups and downloads online, set 'local_files_only' to False."
   1639     )
   1640 elif isinstance(head_call_error, (RepositoryNotFoundError, GatedRepoError)) or (
   1641     isinstance(head_call_error, HfHubHTTPError) and head_call_error.response.status_code == 401
   1642 ):
   1643     # Repo not found or gated => let's raise the actual error
   1644     # Unauthorized => likely a token issue => let's raise the actual error

LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
Cell In[2], line 1
----> 1 from utils import (
      2     generate_with_single_input,
      3     print_object_properties,
      4     display_widget,
      5     kill_processes_on_ports
      6 )
      7 import unittests
      9 # Kill processes on ports before importing flask_app and weaviate_server
     10 # WARNING: Running this cell twice may kill the active kernel

File ~/work/utils.py:13
     10 from sentence_transformers import SentenceTransformer
     12 # Load a pretrained model from Hugging Face
---> 13 model = SentenceTransformer("BAAI/bge-base-en-v1.5", cache_folder = ".models")
     15 # Custom transport to bypass SSL verification
     16 transport = httpx.HTTPTransport(local_address="0.0.0.0", verify=False)

File /usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py:339, in SentenceTransformer.__init__(self, model_name_or_path, modules, device, prompts, default_prompt_name, similarity_fn_name, cache_folder, trust_remote_code, revision, local_files_only, token, use_auth_token, truncate_dim, model_kwargs, tokenizer_kwargs, config_kwargs, model_card_data, backend)
    327         modules, self.module_kwargs = self._load_sbert_model(
    328             model_name_or_path,
    329             token=token,
   (...)
    336             config_kwargs=config_kwargs,
    337         )
    338     else:
--> 339         modules = self._load_auto_model(
    340             model_name_or_path,
    341             token=token,
    342             cache_folder=cache_folder,
    343             revision=revision,
    344             trust_remote_code=trust_remote_code,
    345             local_files_only=local_files_only,
    346             model_kwargs=model_kwargs,
    347             tokenizer_kwargs=tokenizer_kwargs,
    348             config_kwargs=config_kwargs,
    349             has_modules=has_modules,
    350         )
    352 if modules is not None and not isinstance(modules, OrderedDict):
    353     modules = OrderedDict([(str(idx), module) for idx, module in enumerate(modules)])

File /usr/local/lib/python3.10/dist-packages/sentence_transformers/SentenceTransformer.py:2061, in SentenceTransformer._load_auto_model(self, model_name_or_path, token, cache_folder, revision, trust_remote_code, local_files_only, model_kwargs, tokenizer_kwargs, config_kwargs, has_modules)
   2058 tokenizer_kwargs = shared_kwargs if tokenizer_kwargs is None else {**shared_kwargs, **tokenizer_kwargs}
   2059 config_kwargs = shared_kwargs if config_kwargs is None else {**shared_kwargs, **config_kwargs}
-> 2061 transformer_model = Transformer(
   2062     model_name_or_path,
   2063     cache_dir=cache_folder,
   2064     model_args=model_kwargs,
   2065     tokenizer_args=tokenizer_kwargs,
   2066     config_args=config_kwargs,
   2067     backend=self.backend,
   2068 )
   2069 pooling_model = Pooling(transformer_model.get_word_embedding_dimension(), "mean")
   2070 if not local_files_only:

File /usr/local/lib/python3.10/dist-packages/sentence_transformers/models/Transformer.py:87, in Transformer.__init__(self, model_name_or_path, max_seq_length, model_args, tokenizer_args, config_args, cache_dir, do_lower_case, tokenizer_name_or_path, backend)
     84 if config_args is None:
     85     config_args = {}
---> 87 config, is_peft_model = self._load_config(model_name_or_path, cache_dir, backend, config_args)
     88 self._load_model(model_name_or_path, config, cache_dir, backend, is_peft_model, **model_args)
     90 if max_seq_length is not None and "model_max_length" not in tokenizer_args:

File /usr/local/lib/python3.10/dist-packages/sentence_transformers/models/Transformer.py:152, in Transformer._load_config(self, model_name_or_path, cache_dir, backend, config_args)
    148     from peft import PeftConfig
    150     return PeftConfig.from_pretrained(model_name_or_path, **config_args, cache_dir=cache_dir), True
--> 152 return AutoConfig.from_pretrained(model_name_or_path, **config_args, cache_dir=cache_dir), False

File /usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py:1197, in AutoConfig.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1194 trust_remote_code = kwargs.pop("trust_remote_code", None)
   1195 code_revision = kwargs.pop("code_revision", None)
-> 1197 config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
   1198 has_remote_code = "auto_map" in config_dict and "AutoConfig" in config_dict["auto_map"]
   1199 has_local_code = "model_type" in config_dict and config_dict["model_type"] in CONFIG_MAPPING

File /usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:608, in PretrainedConfig.get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    606 original_kwargs = copy.deepcopy(kwargs)
    607 # Get config dict associated with the base config file
--> 608 config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
    609 if config_dict is None:
    610     return {}, kwargs

File /usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py:667, in PretrainedConfig._get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    663 configuration_file = kwargs.pop("_configuration_file", CONFIG_NAME) if gguf_file is None else gguf_file
    665 try:
    666     # Load from local folder or from cache or download from model Hub and cache
--> 667     resolved_config_file = cached_file(
    668         pretrained_model_name_or_path,
    669         configuration_file,
    670         cache_dir=cache_dir,
    671         force_download=force_download,
    672         proxies=proxies,
    673         resume_download=resume_download,
    674         local_files_only=local_files_only,
    675         token=token,
    676         user_agent=user_agent,
    677         revision=revision,
    678         subfolder=subfolder,
    679         _commit_hash=commit_hash,
    680     )
    681     if resolved_config_file is None:
    682         return None, kwargs

File /usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:312, in cached_file(path_or_repo_id, filename, **kwargs)
    254 def cached_file(
    255     path_or_repo_id: Union[str, os.PathLike],
    256     filename: str,
    257     **kwargs,
    258 ) -> Optional[str]:
    259     """
    260     Tries to locate a file in a local folder and repo, downloads and cache it if necessary.
    261 
   (...)
    310     ```
    311     """
--> 312     file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
    313     file = file[0] if file is not None else file
    314     return file

File /usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py:543, in cached_files(path_or_repo_id, filenames, cache_dir, force_download, resume_download, proxies, token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_gated_repo, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash, **deprecated_kwargs)
    540     # Here we only raise if both flags for missing entry and connection errors are True (because it can be raised
    541     # even when `local_files_only` is True, in which case raising for connections errors only would not make sense)
    542     elif _raise_exceptions_for_missing_entries:
--> 543         raise OSError(
    544             f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load the files, and couldn't find them in the"
    545             f" cached files.\nCheck your internet connection or see how to run the library in offline mode at"
    546             " 'https://huggingface.co/docs/transformers/installation#offline-mode'."
    547         ) from e
    548 # snapshot_download will not raise EntryNotFoundError, but hf_hub_download can. If this is the case, it will be treated
    549 # later on anyway and re-raised if needed
    550 elif isinstance(e, HTTPError) and not isinstance(e, EntryNotFoundError):

OSError: We couldn't connect to 'https://huggingface.co' to load the files, and couldn't find them in the cached files.
Check your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

I just ran the code after restoring the original version and didn't encounter any error; please see the screenshot :backhand_index_pointing_down:t2:

Notice that the error message mentions there could be an internet connection issue.

Could you try a different browser, and follow these steps.

  1. Clear the cache and browsing history on your system/laptop.
  2. Log back in to your DLAI learning platform account.
  3. Open the notebook from your classroom page.
  4. Click on the 3 dots in the upper-right corner and click Restore original version. Do not switch browsers until the new updated notebook loads, and do not refresh repeatedly; wait a few seconds between refreshes if the notebook takes time to load (it should take 30-50 s at most).
  5. Now run each cell from the beginning, one by one, up to the import cell. That cell warns that re-running it immediately can kill the kernel, so once you run it, wait for it to finish while keeping the browser active.

Let me know if you still encounter this error.

regards

Dr. Deepti

Sorry, I forgot to mention what may be a fundamental piece of information: I am doing this course on Coursera. Maybe the problem is not solved yet on that platform?

@bottarelli.lorenzo

That confusion about the platform is understandable.

I was informed only last night that all labs on Coursera as well as the DLAI learning platform, including the ungraded labs, were updated. I also ran the Module 3 assignment lab successfully after restoring the original version, as you can see here.

For Coursera, please follow these steps:

  1. First, delete your cache and browsing history on your system/laptop.
  2. Log back in to Coursera and open the notebook from your classroom page.
  3. Delete the assignment file by clicking on File ==> Open section.
  4. Click :red_question_mark:, then click Reboot, then click Get the Latest Version.
  5. Wait until the notebook loads in the environment and do not switch browsers.
  6. Once loaded, connect your kernel.
  7. Run the cells individually, one by one.
  8. Remember that the import cell usually takes time to complete, so wait for it to finish. Do not double-click a cell that is still running (showing * for a long time), as this kills the kernel and throws an error.

Please try these steps and let me know.

Regards

Dr. Deepti.

I have tried the steps you suggested a couple of times, but unfortunately the problem persists.

I will inform the staff and get back to you.

I am also having the exact same issue. Do we have a solution or a workaround?

Kindly wait for a staff update.


Hi! I fixed this issue on Coursera earlier today; you will need to refresh your workspace in order to get the new utils.py file.

Let me know if this helps you.


Sorry, but it still doesn't work.

Just to be clear:

  1. Using a terminal, I deleted ALL files and folders in my home directory, including hidden ones. See screenshot.

  2. Rebooted.

  3. Got the latest version. All files and folders were regenerated to the latest version. See screenshot.

  4. The problem still persists. See screenshot.


I tried refreshing the workspace and the problem still persists


Kind reminder that the problem is still there @Deepti_Prasad @lucas.coutinho

If it helps, my Lab ID is omsckaqsrypy.

@bottarelli.lorenzo

I am able to run the code successfully on DLAI; sadly, I currently don't have access to this course on Coursera, so I can't tell why it is still causing issues for some learners.

But I wanted to help as best I could.

When I checked your earlier detailed error, it indicates that when the `import utils` code ran, the library couldn't find any cached files for the model.

I then checked the utils.py file to see whether the model was set up for offline use, and it was.

On DLAI it does download the model files into a cache folder; see the screenshot :backhand_index_pointing_down:t2:

So can you open the notebook from your classroom page, open the utils.py file, and post a screenshot of the same part here? Then I can check whether there is any difference between the two that might explain why you are still encountering this error.

Regards

Dr. Deepti

Here's a screenshot of utils.py. It looks like they are the same.


Ohh :pensive_face:, thank you @mannyiyer for the image.

Then I think we will have to wait for @lucas.coutinho to look at this thread, as I am still unable to open the RAG course on Coursera and so cannot reproduce the error.

As the error message itself mentions, setting local_files_only to False would allow the model to be downloaded online; that is another way to work around this. But Lucas presumably chose to load the model offline to prevent killing the port processes.
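To make the diagnosis concrete, here is a small, hypothetical helper for checking whether the model is actually present in the `.models` cache folder that utils.py points at. The `cache_has_model` name is my own invention, and the check assumes the standard huggingface_hub cache layout, where a repo is stored under a folder named `models--{org}--{name}`:

```python
from pathlib import Path

def cache_has_model(repo_id: str, cache: Path = Path(".models")) -> bool:
    """Rough check: huggingface_hub caches a repo under a directory named
    'models--{org}--{name}' inside the cache folder."""
    return (cache / ("models--" + repo_id.replace("/", "--"))).is_dir()

# If this prints False while HF_HUB_OFFLINE=1, the SentenceTransformer load
# in utils.py will fail with the LocalEntryNotFoundError shown above.
print(cache_has_model("BAAI/bge-base-en-v1.5"))
```

If the folder is missing, deleting `.models` and refreshing the workspace (so the lab re-ships a populated cache) is the offline-friendly fix; enabling downloads is the online one.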

So let us wait for lucas to respond.

If he doesn't respond, one of you can send me the utils.py file from Coursera via personal DM, and I will take a look at the issue.

But I guarantee :100: that Lucas will resolve this issue once he checks both your comments.

Apologies for this setback. Since the courses and code are deployed across multiple platforms, a change on one end can cause recurring issues that need constant correction.

Regards

Dr. Deepti