Ungraded Lab: KeyError: 'COLAB_TPU_ADDR'

When I execute

tpu_grpc_url = "grpc://" + os.environ["COLAB_TPU_ADDR"]

I get this error message:

KeyError: 'COLAB_TPU_ADDR'

It looks like there is no 'COLAB_TPU_ADDR' environment variable.
The hardware accelerator is set to TPU.
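For context, `os.environ[...]` raises `KeyError` whenever the variable is unset, which is exactly this error. A minimal stdlib-only sketch of a safer probe (the helper name is just for illustration, not part of the lab code):

```python
import os

def tpu_grpc_url():
    """Return the TPU gRPC URL, or None when COLAB_TPU_ADDR is unset."""
    # .get() returns None for a missing key instead of raising KeyError
    addr = os.environ.get("COLAB_TPU_ADDR")
    return None if addr is None else "grpc://" + addr

# On a runtime where the variable is not set, this returns None
# rather than raising KeyError:
print(tpu_grpc_url())
```

This doesn't fix the underlying problem (the variable simply isn't set on newer Colab TPU runtimes), but it shows why the lab code fails at that line.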

Any ideas?

Thanks

Try the Connect option in the upper right-hand corner, below Comment: click on the arrow and select a runtime that has TPU in it. Maybe this fixes your problem.

I have tried this approach, but the error remains.
Please take a look at the traceback:

KeyError                                  Traceback (most recent call last)
<ipython-input-6-e93ad5be9dcf> in <cell line: 1>()
----> 1 tpu_grpc_url = "grpc://" + os.environ["COLAB_TPU_ADDR"]
      2 tpu_cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu_grpc_url)
      3 tf.config.experimental_connect_to_cluster(tpu_cluster_resolver)
      4 tf.tpu.experimental.initialize_tpu_system(tpu_cluster_resolver)
      5 strategy = tf.distribute.experimental.TPUStrategy(tpu_cluster_resolver)

/usr/lib/python3.10/os.py in __getitem__(self, key)
    678         except KeyError:
    679             # raise KeyError with the original key value
--> 680             raise KeyError(key) from None
    681         return self.decodevalue(value)
    682 

KeyError: 'COLAB_TPU_ADDR'

Thanks in advance,
Allan Freitas

Hello,

Which lab is this?

Have you tried resetting the kernel and re-running the lab?

@allansdefreitas, it looks like the code for that lab is a little out of date. You no longer need to use "COLAB_TPU_ADDR" to connect to the TPU.

Also, you need to choose the TPU v2 runtime from the Edit/Notebook settings menu. It is hard to get free TPU time on Colab these days, so you may have to try multiple times and/or at different times of day.

I will put in a request to the staff to update the code to work with the current Colab backend, but in the meantime, you can comment out these two lines of code:

tpu_grpc_url = "grpc://" + os.environ["COLAB_TPU_ADDR"]
tpu_cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu_grpc_url)

and replace them with these lines:

try:
  tpu_cluster_resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
  print(f'Running on a TPU w/{tpu_cluster_resolver.num_accelerators()["TPU"]} cores')
except ValueError:
  raise BaseException('ERROR: Not connected to a TPU runtime; please make sure you have successfully chosen TPU runtime from the Edit/Notebook settings menu')

If you successfully select the TPU runtime from the Edit/Notebook settings menu, this new code will connect you to the TPU and print a message telling you how many cores your TPU has; otherwise it will raise an error saying it could not connect to a TPU runtime.

To make your code even more up-to-date, you can also change the "strategy = ..." line to remove the ".experimental", like this:

strategy = tf.distribute.TPUStrategy(tpu_cluster_resolver) 

This isn't strictly necessary, but it will avoid a warning message saying you no longer need to use .experimental.
