I used the code below to transcribe a YouTube video to text, as described in the course:
from langchain_community.document_loaders.generic import GenericLoader, FileSystemBlobLoader
from langchain_community.document_loaders.parsers import OpenAIWhisperParser
from langchain_community.document_loaders.blob_loaders.youtube_audio import YoutubeAudioLoader
url = ""
save_dir = ""
loader = GenericLoader(
    YoutubeAudioLoader([url], save_dir),  # fetch from YouTube
    # FileSystemBlobLoader(save_dir, glob="*.m4a"),  # fetch locally
    OpenAIWhisperParser()
)
docs = loader.load()
docs[0].page_content[0:500]
On running the code, I see the audio file created at the desired location:
[youtube] Extracting URL:
[youtube] 5HcDJ8e9NwY: Downloading webpage
[youtube] 5HcDJ8e9NwY: Downloading tv client config
[youtube] 5HcDJ8e9NwY: Downloading tv player API JSON
[youtube] 5HcDJ8e9NwY: Downloading ios player API JSON
[youtube] 5HcDJ8e9NwY: Downloading m3u8 information
[info] 5HcDJ8e9NwY: Downloading 1 format(s): 140
[download] Data Quality Explained.m4a has already been downloaded
[download] 100% of 3.59MiB
[ExtractAudio] Not converting audio
Quality Explained.m4a; file is already in target format m4a
Transcribing part 1!
Transcribing part 1!
When I then run docs[0].page_content[0:500], I get the error NameError: name 'docs' is not defined.
Also, no .txt file containing the audio transcription is created.
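On the missing .txt file: I am not sure whether the loader is supposed to write one. Here is a minimal sketch of how I would save the text myself once docs is defined, assuming loader.load() only returns Document objects in memory (my assumption, not something the course states). The Document class here is a hypothetical stand-in so the snippet runs on its own, and save_transcript is my own helper name:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Document:
    # Hypothetical stand-in for the LangChain Document; only page_content is used.
    page_content: str


def save_transcript(docs, out_path):
    """Join the text of all documents and write it to a UTF-8 text file."""
    if not docs:
        # An empty list would mean the transcription step returned nothing.
        raise ValueError("docs is empty; nothing to save")
    text = "\n".join(doc.page_content for doc in docs)
    path = Path(out_path)
    path.write_text(text, encoding="utf-8")
    return path
```

With the real loader, I would expect to call it as save_transcript(docs, "transcript.txt") right after docs = loader.load(), but that only works if load() finishes and docs actually exists.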
Please help, as I am not sure what I am doing incorrectly.