C4_W4_Lab3_CelebA Problem with TPU

I have a problem with the TPUs on the C4_W4_Lab_3_CelebA_GAN_Experiments notebook. First of all, TensorFlow is not installed by default when using the TPU runtime, so I installed it with !pip install tensorflow (I also tried ==2.16.1 and ==2.18). Then I get this error:
TPUs not found in the cluster. Failed in initialization: No OpKernel was registered to support Op 'ConfigureDistributedTPU' used by {{node ConfigureDistributedTPU}} with these attrs: [tpu_cancellation_closes_chips=2, embedding_config="", tpu_embedding_config="", compilation_failure_closes_chips=false, enable_whole_mesh_compilations=false, is_global_init=false]
Registered devices: [CPU]
Registered kernels:

  [[ConfigureDistributedTPU]] [Op:__inference__tpu_init_fn_4]
When running this code:
try:
    tpu_cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='local')  # TPU detection
    tf.config.experimental_connect_to_cluster(tpu_cluster_resolver)
    tf.tpu.experimental.initialize_tpu_system(tpu_cluster_resolver)
    print("All devices: ", tf.config.list_logical_devices('TPU'))
    print(f'Running on a TPU w/{tpu_cluster_resolver.num_accelerators()["TPU"]} cores')
except ValueError:
    raise BaseException('ERROR: Not connected to a TPU runtime; please make sure you have successfully chosen TPU runtime from the Edit/Notebook settings menu')

strategy = tf.distribute.TPUStrategy(tpu_cluster_resolver)
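For reference, a more defensive variant of this probe degrades gracefully instead of raising when the TPU kernel is missing. This is only a sketch: `pick_strategy` is a hypothetical helper, not part of the course notebook, and it returns `("none", None)` when TensorFlow itself is not installed so the probe stays importable anywhere.

```python
def pick_strategy():
    """Probe for a TPU and fall back to GPU, then CPU.

    Sketch only (hypothetical helper, not the notebook's code).
    Returns a (label, strategy) pair; ("none", None) if TensorFlow is absent.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return ("none", None)
    try:
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="local")
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return ("tpu", tf.distribute.TPUStrategy(resolver))
    except Exception:
        # Deliberately broad: any failure here means no usable TPU kernel,
        # which is exactly the NotFoundError reported in this thread.
        pass
    if tf.config.list_physical_devices("GPU"):
        return ("gpu", tf.distribute.MirroredStrategy())
    return ("cpu", tf.distribute.get_strategy())

label, strategy = pick_strategy()
print("Using strategy:", label)
```

On a runtime where the `ConfigureDistributedTPU` kernel is not registered, this prints `gpu` or `cpu` instead of crashing, which at least lets the rest of the lab run.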

I also tried print(os.environ.get('COLAB_TPU_ADDR')) and got None. I had the same problem with the official TensorFlow notebook for setting up TPUs on Google Colab.
I really have no idea how to solve this issue.
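Building on that `COLAB_TPU_ADDR` check, a quick stdlib-only pre-check can inspect the runtime's TPU-related environment variables before touching TensorFlow at all. A sketch, with the caveat that the exact set of variables varies by Colab/Cloud TPU runtime generation (the list below is an assumption, not a spec):

```python
import os

# Environment variables Colab / Cloud TPU runtimes have used to advertise a
# TPU; which ones are set depends on the runtime generation (assumption).
tpu_vars = {var: os.environ.get(var)
            for var in ("COLAB_TPU_ADDR", "TPU_NAME", "TPU_WORKER_ID")}
for var, value in tpu_vars.items():
    print(var, "=", value)  # all None on a CPU-only runtime
```

If every variable prints None, the runtime almost certainly has no TPU attached, matching the `Registered devices: [CPU]` line in the error above.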

Did you connect your Colab notebook to a TPU? There is an option for TPU selection. Based on the error, your Colab is not connected to a TPU.

You should see a Connect option that offers a TPU high-RAM runtime; select that.


With a slightly different version I get:

I selected TPU as shown in the image :frowning:
Any idea? Thank you so much. I am currently doing France's youth AI olympiads and need access to powerful machines.

Perhaps you have used up all your free TPU allowance!

I also tried it on another account that I never use and had the same problem. Same with one of my friends who tried with the Pro TPUs.

When I am using Colab, on the right side, there is a “Connect” dropdown menu, and it has a setting to “Change runtime type”. That’s where I see the option to select a TPU as a “hardware accelerator”.

If you need a lot of computing resource, it’s probably not going to be free. That’s what “Purchase additional compute units” means.

hi @Altrastorique

Based on @TMosh's image I noticed you are using a v2-8 TPU, but from what I remember while working on TensorFlow, we used to use the T4 TPU.

Try connecting Colab to the T4 TPU, and as the mentors have mentioned, even TPU usage has an assignment quota for every 12-hour cycle based on the model you are working on.

My response only addresses the output showing that your notebook is not connected to a TPU. Will it help you cover the model you are working on? Can't say!!! We have no idea what you are working on.

Hope this resolves your connectivity issue.

Hi @Deepti_Prasad I am having the same issue. I went through your suggestions, but there is no T4 TPU option in my Colab. Here is a screenshot of the available options.

go with T4 GPU?

That doesn’t work either. It gives the following error.

Are you logged into your account? It shows the kernel is not registered. I didn't encounter this issue; it also points to system configuration.

Yes, I am logged into my account. These are the errors popping up.

No OpKernel was registered to support Op 'ConfigureDistributedTPU' used by {{node ConfigureDistributedTPU}} with these attrs: [compilation_failure_closes_chips=false, is_global_init=false, enable_whole_mesh_compilations=false, embedding_config="", tpu_embedding_config="", tpu_cancellation_closes_chips=2]
Registered devices: [CPU, GPU]
Registered kernels:

  [[ConfigureDistributedTPU]] [Op:__inference__tpu_init_fn_4]

NotFoundError: TPUs not found in the cluster. Failed in initialization: No OpKernel was registered to support Op 'ConfigureDistributedTPU' used by {{node ConfigureDistributedTPU}} with these attrs: [compilation_failure_closes_chips=false, is_global_init=false, enable_whole_mesh_compilations=false, embedding_config="", tpu_embedding_config="", tpu_cancellation_closes_chips=2]
Registered devices: [CPU, GPU]

Can I know two things?

First, are you running the Colab provided by the course or your own Colab?

Also, is your Colab connected to your Google Drive? The reason I am asking is that the error you encountered could be due to either a missing TensorFlow package or a version mismatch.

You can also try clearing your browser cache and history, then logging in again and retrying.

I also remember encountering this issue once in a while when I was using the Safari browser; I then switched to the Chrome browser.

Honestly, until we have your proper system configuration details, there is not much I can do.

I will try to inform the L.T. of the course; he should probably be able to find the issue.

Hi everyone! Thank you for reporting this! We will look into the problem and update you asap.

I’m running my own saved copy of the lab. I cleared the browsing history and cache, then tried again. However, nothing has changed. Thank you so much for your assistance.

So you are running locally? And did you encounter this issue in the course-provided Colab?

Yes, I ran the code in the Colab notebook provided by the course itself and encountered the same issue.


Can you check which TensorFlow version and Keras version it is running on?

TensorFlow version: 2.18.0
Keras version: 3.8.0
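For completeness, those versions can be read without importing the full frameworks, e.g. via the standard library's importlib.metadata. A sketch (`pkg_version` is a hypothetical helper), which prints "not installed" when a package is absent:

```python
from importlib import metadata

def pkg_version(pkg):
    """Return the installed version of `pkg`, or None if it is absent."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return None

for pkg in ("tensorflow", "keras"):
    print(f"{pkg} version:", pkg_version(pkg) or "not installed")
```

This avoids the slow `import tensorflow` just to check a version, and works even when the import itself is what fails.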