Dataset for Stack Overflow questions and answers from BigQuery

Hello,

Could you please let me know how can I download the dataset of Stack Overflow questions and answers from BigQuery?
I would like to use this dataset locally.

Hi @Sujan_Chandra_Roy,

I believe Google hosts the dataset on their own platform, and it can be used only through their services. I might be wrong, though. You can check this as a start.

Further google searches regarding this might be helpful to you.
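For what it's worth, the Stack Overflow data is available as a BigQuery public dataset (`bigquery-public-data.stackoverflow`), so one way to get a local copy is to query it and save the result. A minimal sketch, assuming `google-cloud-bigquery` and `pandas` are installed, billing is enabled on your GCP project, and `GOOGLE_APPLICATION_CREDENTIALS` points at a key file:

```python
# Sketch: pull a small sample of the public Stack Overflow dataset
# and save it locally as CSV. Only runs the query when credentials
# are configured via GOOGLE_APPLICATION_CREDENTIALS.
import os

STACKOVERFLOW_TABLE = "bigquery-public-data.stackoverflow.posts_questions"

def build_query(table, limit=10):
    # Select a few columns and cap the row count to keep the query cheap.
    return f"SELECT title, answer_count FROM `{table}` LIMIT {limit}"

if os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"):
    from google.cloud import bigquery

    client = bigquery.Client()  # project is taken from the credentials
    df = client.query(build_query(STACKOVERFLOW_TABLE)).to_dataframe()
    df.to_csv("stackoverflow_sample.csv", index=False)
else:
    print("Set GOOGLE_APPLICATION_CREDENTIALS to run the query.")
```

The full dataset is very large, so for local use you would typically filter or sample rather than download every table.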

Best,
Mubsi

Hi @Mubsi, I'd appreciate any help with the following error:
ValueError: This library only supports credentials from google-auth-library-python. See google-auth — google-auth 1.30.0 documentation for help on authentication with this library.

When I try to run the cell:

```python
bq_client = bigquery.Client(project=PROJECT_ID,
                            credentials=credentials)
```

Hi @amiguel,

Are you trying to run this locally ?

Yes @Mubsi, I tried to run it locally. I hadn't tried running it from the Jupyter notebook in the lesson, which now seems to work correctly. I was wondering whether it will only work in the lesson's environment, or can we reproduce it and use it elsewhere?

Hi @amiguel,

The notebooks on the platform use various APIs. Those APIs are hidden away from the learners for various reasons. The notebooks are designed in a way to load those hidden APIs.

You can run these notebooks locally or in any other environment you like. All you would have to do is install the necessary packages, get your own personal API key, and change/add the code in the notebook that loads your API credentials.
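That ValueError typically means the `credentials` object passed to `bigquery.Client` was not built with the `google-auth` library. A minimal sketch of loading credentials from a service-account key file instead (the key path and project ID below are hypothetical placeholders for your own values):

```python
# Sketch: build a BigQuery client from a service-account key file
# using google-auth, which is the credentials type bigquery.Client
# expects. Only attempts the connection if the key file exists.
import os

KEY_PATH = "service-account-key.json"  # hypothetical path to your key
PROJECT_ID = "your-project-id"         # hypothetical GCP project ID

def make_bq_client(key_path, project_id):
    # Load the key with google.oauth2.service_account so the resulting
    # credentials come from google-auth, avoiding the ValueError.
    from google.oauth2 import service_account
    from google.cloud import bigquery

    credentials = service_account.Credentials.from_service_account_file(key_path)
    return bigquery.Client(project=project_id, credentials=credentials)

if os.path.exists(KEY_PATH):
    bq_client = make_bq_client(KEY_PATH, PROJECT_ID)
else:
    print(f"Place your service-account key at {KEY_PATH} to create the client.")
```

You can create and download such a key from the IAM & Admin section of the Google Cloud console.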

Understood. I guess I will have to delve into the internals of getting an API key for BigQuery, which presumably can be obtained from the Google Cloud Platform, and then use it all just by swapping in my Project ID and other client credentials, right?

Exactly.

The way the credentials are set up on the platform is somewhat different from how they are normally initialised.

Basically, you need to have your Google Cloud project set up, and then authenticate your credentials in the notebook. Once those are set up, the remaining code would be the same as in the labs.

You can do a Google search on how to authenticate your GCP APIs in a Jupyter notebook.
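As a starting point for that search: one common pattern is to run `gcloud auth application-default login` once in a terminal, after which `google.auth.default()` picks up those credentials inside a notebook. A sketch, assuming `google-auth` is installed:

```python
# Sketch: fetch Application Default Credentials in a notebook after
# running `gcloud auth application-default login` in a terminal.
def get_default_credentials():
    # google.auth.default() searches the environment variable,
    # gcloud ADC file, and GCE metadata server, in that order.
    import google.auth

    credentials, project = google.auth.default()
    return credentials, project

try:
    creds, project = get_default_credentials()
    print("Authenticated for project:", project)
except Exception as err:  # library missing or no ADC configured
    print("Authentication not set up:", err)
```

These default credentials can then be passed straight to `bigquery.Client(project=..., credentials=...)`.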


I had to delve into some Vertex AI Jupyter Notebook tutorials to further understand how to assemble the bits and pieces. I was able to set a Project ID, authenticate via a Google passcode, create buckets, and can now access the flower images for training. I believe the tutorial covers much of the same ground as the short course syllabus.