L1 notebook throws error: Download Error

Here is the error I got when running this cell (Cell In[12]):

url = "https://www.youtube.com/watch?v=jGwO_UgTS7I"
save_dir = "docs/youtube/"
loader = GenericLoader(
    YoutubeAudioLoader([url], save_dir),
    OpenAIWhisperParser()
)
docs = loader.load()

Error:

JSONDecodeError                           Traceback (most recent call last)
File /usr/local/lib/python3.9/site-packages/openai/api_requestor.py:669, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
    668 try:
--> 669     data = json.loads(rbody)
    670 except (JSONDecodeError, UnicodeDecodeError) as e:

File /usr/local/lib/python3.9/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File /usr/local/lib/python3.9/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of s (a str instance
    334 containing a JSON document).
    335
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File /usr/local/lib/python3.9/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

APIError                                  Traceback (most recent call last)
Cell In[12], line 7
      2 save_dir="docs/youtube/"
      3 loader = GenericLoader(
      4     YoutubeAudioLoader([url],save_dir),
      5     OpenAIWhisperParser()
      6 )
----> 7 docs = loader.load()

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/generic.py:90, in GenericLoader.load(self)
     88 def load(self) -> List[Document]:
     89     """Load all documents."""
---> 90     return list(self.lazy_load())

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/generic.py:86, in GenericLoader.lazy_load(self)
     84     """Load documents lazily. Use this when working at a large scale."""
     85     for blob in self.blob_loader.yield_blobs():
---> 86         yield from self.blob_parser.lazy_parse(blob)

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/parsers/audio.py:51, in OpenAIWhisperParser.lazy_parse(self, blob)
     49 # Transcribe
     50 print(f"Transcribing part {split_number+1}!")
---> 51 transcript = openai.Audio.transcribe("whisper-1", file_obj)
     53 yield Document(
     54     page_content=transcript.text,
     55     metadata={"source": blob.source, "chunk": split_number},
     56 )

File /usr/local/lib/python3.9/site-packages/openai/api_resources/audio.py:57, in Audio.transcribe(cls, model, file, api_key, api_base, api_type, api_version, organization, **params)
     55 requestor, files, data = cls._prepare_request(file, file.name, model, **params)
     56 url = cls._get_url("transcriptions")
---> 57 response, _, api_key = requestor.request("post", url, files=files, params=data)
     58 return util.convert_to_openai_object(
     59     response, api_key, api_version, organization
     60 )

File /usr/local/lib/python3.9/site-packages/openai/api_requestor.py:226, in APIRequestor.request(self, method, url, params, headers, files, stream, request_id, request_timeout)
    205 def request(
    206     self,
    207     method,
    (...)
    214     request_timeout: Optional[Union[float, Tuple[float, float]]] = None,
    215 ) -> Tuple[Union[OpenAIResponse, Iterator[OpenAIResponse]], bool, str]:
    216     result = self.request_raw(
    217         method.lower(),
    218         url,
    (...)
    224         request_timeout=request_timeout,
    225     )
--> 226     resp, got_stream = self._interpret_response(result, stream)
    227     return resp, got_stream, self.api_key

File /usr/local/lib/python3.9/site-packages/openai/api_requestor.py:619, in APIRequestor._interpret_response(self, result, stream)
    611     return (
    612         self._interpret_response_line(
    613             line, result.status_code, result.headers, stream=True
    614         )
    615         for line in parse_stream(result.iter_lines())
    616     ), True
    617 else:
    618     return (
--> 619         self._interpret_response_line(
    620             result.content.decode("utf-8"),
    621             result.status_code,
    622             result.headers,
    623             stream=False,
    624         ),
    625         False,
    626     )

File /usr/local/lib/python3.9/site-packages/openai/api_requestor.py:671, in APIRequestor._interpret_response_line(self, rbody, rcode, rheaders, stream)
    669     data = json.loads(rbody)
    670 except (JSONDecodeError, UnicodeDecodeError) as e:
--> 671     raise error.APIError(
    672         f"HTTP code {rcode} from API ({rbody})", rbody, rcode, headers=rheaders
    673     ) from e
    674 resp = OpenAIResponse(data, rheaders)
    675 # In the future, we might add a "status" parameter to errors
    676 # to better handle the "error while streaming" case.

APIError: HTTP code 504 from API (
504 Gateway Time-out
504 Gateway Time-out
)

I am also getting this error. Have you sorted it out?

I have this error too. Can anyone help? Thank you!

I also see the same exceptions when running the LangChain for LLMs / Document Loading notebook:

JSONDecodeError
APIError: HTTP code 504 from API

Trying a shorter 1-minute YouTube video didn’t resolve the 504 timeout. Any resolution? Thanks
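For anyone else blocked on the 504 while this is being looked at: the failure surfaces as an openai.error.APIError raised from a gateway timeout during the Whisper transcription call, so one possible stopgap is simply to retry the load a few times with a pause in between. This is only a sketch, not a confirmed fix; it assumes the url, save_dir, and loader objects from the failing cell are already defined and that the timeout is transient on the API side.

import time
import openai

# Sketch only: retry loader.load() a few times, assuming the HTTP 504 from
# the Whisper endpoint is transient. Reuses the `loader` object defined in
# the failing cell above.
last_err = None
for attempt in range(3):
    try:
        docs = loader.load()
        break
    except openai.error.APIError as err:
        last_err = err
        wait = 30 * (attempt + 1)
        print(f"Attempt {attempt + 1} failed ({err}); retrying in {wait}s...")
        time.sleep(wait)
else:
    raise last_err

Note that each retry re-runs the whole loader, so the audio may be downloaded again before transcription is attempted.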

Hi,

I have the same error… any ideas on a solution?

Simon

Hello,
I am also facing the same error. It looks to me like a library mismatch issue, though I am not sure. We don't have privileges to update the libraries. This is a powerful application, so please help rectify the issue at the earliest. If you need help with troubleshooting, do reach out.

I don't think anyone is looking into it; it would be great if someone did.

Thank you.

I am facing the same error.

I am getting a 403 error instead and wonder whether anyone else has seen the same?

[download] Got error: HTTP Error 403: Forbidden

The full stack trace is as follows:

[youtube] Extracting URL: https://www.youtube.com/watch?v=jGwO_UgTS7I
[youtube] jGwO_UgTS7I: Downloading webpage
[youtube] jGwO_UgTS7I: Downloading android player API JSON
[info] jGwO_UgTS7I: Downloading 1 format(s): 140
[dashsegments] Total fragments: 7
[download] Destination: docs/youtube//Stanford CS229: Machine Learning Course, Lecture 1 - Andrew Ng (Autumn 2018).m4a
[download] Got error: HTTP Error 403: Forbidden
---------------------------------------------------------------------------
DownloadError                             Traceback (most recent call last)
Cell In[15], line 7
      2 save_dir="docs/youtube/"
      3 loader = GenericLoader(
      4     YoutubeAudioLoader([url],save_dir),
      5     OpenAIWhisperParser()
      6 )
----> 7 docs = loader.load()

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/generic.py:90, in GenericLoader.load(self)
     88 def load(self) -> List[Document]:
     89     """Load all documents."""
---> 90     return list(self.lazy_load())

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/generic.py:85, in GenericLoader.lazy_load(self)
     81 def lazy_load(
     82     self,
     83 ) -> Iterator[Document]:
     84     """Load documents lazily. Use this when working at a large scale."""
---> 85     for blob in self.blob_loader.yield_blobs():
     86         yield from self.blob_parser.lazy_parse(blob)

File /usr/local/lib/python3.9/site-packages/langchain/document_loaders/blob_loaders/youtube_audio.py:45, in YoutubeAudioLoader.yield_blobs(self)
     42 for url in self.urls:
     43     # Download file
     44     with yt_dlp.YoutubeDL(ydl_opts) as ydl:
---> 45         ydl.download(url)
     47 # Yield the written blobs
     48 loader = FileSystemBlobLoader(self.save_dir, glob="*.m4a")

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:3369, in YoutubeDL.download(self, url_list)
   3366     raise SameFileError(outtmpl)
   3368 for url in url_list:
-> 3369     self.__download_wrapper(self.extract_info)(
   3370         url, force_generic_extractor=self.params.get('force_generic_extractor', False))
   3372 return self._download_retcode

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:3344, in YoutubeDL.__download_wrapper.<locals>.wrapper(*args, **kwargs)
   3341 @functools.wraps(func)
   3342 def wrapper(*args, **kwargs):
   3343     try:
-> 3344         res = func(*args, **kwargs)
   3345     except UnavailableVideoError as e:
   3346         self.report_error(e)

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:1507, in YoutubeDL.extract_info(self, url, download, ie_key, extra_info, process, force_generic_extractor)
   1505             raise ExistingVideoReached()
   1506         break
-> 1507     return self.__extract_info(url, self.get_info_extractor(key), download, extra_info, process)
   1508 else:
   1509     extractors_restricted = self.params.get('allowed_extractors') not in (None, ['default'])

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:1518, in YoutubeDL._handle_extraction_exceptions.<locals>.wrapper(self, *args, **kwargs)
   1516 while True:
   1517     try:
-> 1518         return func(self, *args, **kwargs)
   1519     except (DownloadCancelled, LazyList.IndexError, PagedList.IndexError):
   1520         raise

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:1615, in YoutubeDL.__extract_info(self, url, ie, download, extra_info, process)
   1613 if process:
   1614     self._wait_for_video(ie_result)
-> 1615     return self.process_ie_result(ie_result, download, extra_info)
   1616 else:
   1617     return ie_result

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:1674, in YoutubeDL.process_ie_result(self, ie_result, download, extra_info)
   1672 if result_type == 'video':
   1673     self.add_extra_info(ie_result, extra_info)
-> 1674     ie_result = self.process_video_result(ie_result, download=download)
   1675     self._raise_pending_errors(ie_result)
   1676     additional_urls = (ie_result or {}).get('additional_urls')

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:2779, in YoutubeDL.process_video_result(self, info_dict, download)
   2777 downloaded_formats.append(new_info)
   2778 try:
-> 2779     self.process_info(new_info)
   2780 except MaxDownloadsReached:
   2781     max_downloads_reached = True

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:3247, in YoutubeDL.process_info(self, info_dict)
   3243 dl_filename = existing_video_file(full_filename, temp_filename)
   3244 if dl_filename is None or dl_filename == temp_filename:
   3245     # dl_filename == temp_filename could mean that the file was partially downloaded with --no-part.
   3246     # So we should try to resume the download
-> 3247     success, real_download = self.dl(temp_filename, info_dict)
   3248     info_dict['__real_download'] = real_download
   3249 else:

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:2970, in YoutubeDL.dl(self, name, info, subtitle, test)
   2968 if new_info.get('http_headers') is None:
   2969     new_info['http_headers'] = self._calc_headers(new_info)
-> 2970 return fd.download(name, new_info, subtitle)

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/common.py:444, in FileDownloader.download(self, filename, info_dict, subtitle)
    441     self.to_screen(f'[download] Sleeping {sleep_interval:.2f} seconds ...')
    442     time.sleep(sleep_interval)
--> 444 ret = self.real_download(filename, info_dict)
    445 self._finish_multiline_status()
    446 return ret, True

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/dash.py:60, in DashSegmentsFD.real_download(self, filename, info_dict)
     56         return fd.real_download(filename, info_dict)
     58     args.append([ctx, fragments_to_download, fmt])
---> 60 return self.download_and_append_fragments_multiple(*args, is_fatal=lambda idx: idx == 0)

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/fragment.py:382, in FragmentFD.download_and_append_fragments_multiple(self, *args, **kwargs)
    380 max_progress = len(args)
    381 if max_progress == 1:
--> 382     return self.download_and_append_fragments(*args[0], **kwargs)
    383 max_workers = self.params.get('concurrent_fragment_downloads', 1)
    384 if max_progress > 1:

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/fragment.py:521, in FragmentFD.download_and_append_fragments(self, ctx, fragments, info_dict, is_fatal, pack_func, finish_func, tpe, interrupt_trigger)
    519     break
    520 try:
--> 521     download_fragment(fragment, ctx)
    522     result = append_fragment(
    523         decrypt_fragment(fragment, self._read_fragment(ctx)), fragment['frag_index'], ctx)
    524 except KeyboardInterrupt:

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/fragment.py:466, in FragmentFD.download_and_append_fragments.<locals>.download_fragment(fragment, ctx)
    463     self.report_retry(err, count, retries, frag_index, fatal)
    464     ctx['last_error'] = err
--> 466 for retry in RetryManager(self.params.get('fragment_retries'), error_callback):
    467     try:
    468         ctx['fragment_count'] = fragment.get('fragment_count')

File /usr/local/lib/python3.9/site-packages/yt_dlp/utils.py:6141, in RetryManager.__iter__(self)
   6139 yield self
   6140 if self.error:
-> 6141     self.error_callback(self.error, self.attempt, self.retries)

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/fragment.py:463, in FragmentFD.download_and_append_fragments.<locals>.download_fragment.<locals>.error_callback(err, count, retries)
    461 if fatal and count > retries:
    462     ctx['dest_stream'].close()
--> 463 self.report_retry(err, count, retries, frag_index, fatal)
    464 ctx['last_error'] = err

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/common.py:389, in FileDownloader.report_retry(self, err, count, retries, frag_index, fatal)
    387 """Report retry"""
    388 is_frag = False if frag_index is NO_DEFAULT else 'fragment'
--> 389 RetryManager.report_retry(
    390     err, count, retries, info=self.__to_screen,
    391     warn=lambda msg: self.__to_screen(f'[download] Got error: {msg}'),
    392     error=IDENTITY if not fatal else lambda e: self.report_error(f'\r[download] Got error: {e}'),
    393     sleep_func=self.params.get('retry_sleep_functions', {}).get(is_frag or 'http'),
    394     suffix=f'fragment{"s" if frag_index is None else f" {frag_index}"}' if is_frag else None)

File /usr/local/lib/python3.9/site-packages/yt_dlp/utils.py:6148, in RetryManager.report_retry(e, count, retries, sleep_func, info, warn, error, suffix)
   6146 if count > retries:
   6147     if error:
-> 6148         return error(f'{e}. Giving up after {count - 1} retries') if count > 1 else error(str(e))
   6149     raise e
   6151 if not count:

File /usr/local/lib/python3.9/site-packages/yt_dlp/downloader/common.py:392, in FileDownloader.report_retry.<locals>.<lambda>(e)
    387 """Report retry"""
    388 is_frag = False if frag_index is NO_DEFAULT else 'fragment'
    389 RetryManager.report_retry(
    390     err, count, retries, info=self.__to_screen,
    391     warn=lambda msg: self.__to_screen(f'[download] Got error: {msg}'),
--> 392     error=IDENTITY if not fatal else lambda e: self.report_error(f'\r[download] Got error: {e}'),
    393     sleep_func=self.params.get('retry_sleep_functions', {}).get(is_frag or 'http'),
    394     suffix=f'fragment{"s" if frag_index is None else f" {frag_index}"}' if is_frag else None)

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:1015, in YoutubeDL.report_error(self, message, *args, **kwargs)
   1010 def report_error(self, message, *args, **kwargs):
   1011     '''
   1012     Do the same as trouble, but prefixes the message with 'ERROR:', colored
   1013     in red if stderr is a tty file.
   1014     '''
-> 1015     self.trouble(f'{self._format_err("ERROR:", self.Styles.ERROR)} {message}', *args, **kwargs)

File /usr/local/lib/python3.9/site-packages/yt_dlp/YoutubeDL.py:955, in YoutubeDL.trouble(self, message, tb, is_error)
    953     else:
    954         exc_info = sys.exc_info()
--> 955     raise DownloadError(message, exc_info)
    956 self._download_retcode = 1

[download] Got error: HTTP Error 403: Forbidden
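
In case it helps while the lab is being updated: the 403 Forbidden here is raised inside yt_dlp while it fetches the audio fragments, and this error commonly appears when the installed yt-dlp release has fallen behind YouTube's frequently changing endpoints. If your environment allows installing packages (which may not be the case in the hosted lab), upgrading yt-dlp from a notebook cell, restarting the kernel, and re-running the download cell is a reasonable thing to try; this is a suggestion, not a confirmed fix.

# Possible workaround sketch (assumes you are allowed to modify the environment):
!pip install --upgrade yt-dlp
# Then restart the kernel and re-run the cell that calls loader.load().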

Hello @Mun_Chung_Wong ,

The team has updated our labs.
Could you kindly start afresh and let me know if the issue still persists?

With regards,
Nilosree Sengupta

Thanks Nilosree.

I've restarted the kernel, but the same HTTP 403 error persists.
Is there a new version of the notebook I should load? If so, I can't find it when I do
File → Open…
It just shows one notebook, presumably the same one I have been using for this module.

I have also tried a different link, to Andrew's lecture 2, but got the same error:
https://www.youtube.com/watch?v=4b4MUYve_U8

Didn't mean to show that video preview page in my reply post :wink:

Hello @Mun_Chung_Wong ,

You’re welcome.
I also reproduced it on my end and got the same error. I have reported this to our team.
It will be fixed soon!

With regards,
Nilosree Sengupta


Hello, I am facing the same error.
[download] Got error: HTTP Error 403: Forbidden

Hello @shigaraki_G ,

There are some permission issues, and our team is working on them.
In the meantime, please continue with the other parts of the course.
Happy learning!

With regards,
Nilosree Sengupta

Thank you for the reply. I have completed the course too.

Hello @shigaraki_G ,

You’re welcome!
That’s great!

With regards,
Nilosree Sengupta

Hello @Mun_Chung_Wong , @shigaraki_G ,

The notebook version is updated now and the issue has been fixed by the team.

With regards,
Nilosree Sengupta

Thank you!

Hello @shigaraki_G ,

You’re welcome!
Happy Learning!

With regards,
Nilosree Sengupta

Hi @nilosreesengupta,
I am still facing this issue, although I see that it has been fixed according to the conversation in this thread.
Attaching a screenshot for reference: