C4 W2 Lab 4: Error: cannot install apache-beam

In the Colab “C4_W2_Lab_4_Apache_Beam_and_Tensorflow.ipynb”, the pip install step for the requirements (shown immediately below) fails with the error that follows it.

# Install the required packages
!pip install -r ./molecules/requirements.txt

The resulting error:

ERROR: Cannot install apache-beam[gcp]==2.40.0, apache-beam[gcp]==2.41.0, apache-beam[gcp]==2.42.0, apache-beam[gcp]==2.43.0, apache-beam[gcp]==2.44.0, apache-beam[gcp]==2.45.0, apache-beam[gcp]==2.46.0, apache-beam[gcp]==2.47.0, apache-beam[gcp]==2.48.0 and dill==0.3.3 because these package versions have conflicting dependencies.

I made no changes to the Colab file and simply ran each cell in order. What should I do to fix it?

Below is the full output from running the pip install cell:

Collecting tensorflow-transform==1.10.0
  Using cached tensorflow_transform-1.10.0-py3-none-any.whl (439 kB)
Collecting dill==0.3.3
  Using cached dill-0.3.3-py2.py3-none-any.whl (81 kB)
Collecting pydot<2,>=1.2
  Using cached pydot-1.4.2-py2.py3-none-any.whl (21 kB)
Collecting tensorflow-metadata<1.11.0,>=1.10.0
  Using cached tensorflow_metadata-1.10.0-py3-none-any.whl (50 kB)
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.48.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.5 MB)
Collecting tfx-bsl<1.11.0,>=1.10.0
  Using cached tfx_bsl-1.10.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (21.6 MB)
Collecting numpy<2,>=1.16
  Using cached numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Collecting absl-py<2.0.0,>=0.9
  Using cached absl_py-1.4.0-py3-none-any.whl (126 kB)
Collecting protobuf<4,>=3.13
  Using cached protobuf-3.20.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
Collecting pyarrow<7,>=6
  Using cached pyarrow-6.0.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25.6 MB)
Collecting tensorflow!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,<2.10,>=1.15.5
  Using cached tensorflow-2.9.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (511.8 MB)
Collecting fastavro<2,>=0.23.6
  Using cached fastavro-1.8.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB)
Collecting typing-extensions>=3.7.0
  Downloading typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Collecting httplib2<0.23.0,>=0.8
  Using cached httplib2-0.22.0-py3-none-any.whl (96 kB)
Collecting pymongo<5.0.0,>=3.8.0
  Using cached pymongo-4.4.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (586 kB)
Collecting pytz>=2018.3
  Using cached pytz-2023.3-py2.py3-none-any.whl (502 kB)
Collecting objsize<0.7.0,>=0.6.1
  Downloading objsize-0.6.1-py3-none-any.whl (9.3 kB)
Collecting hdfs<3.0.0,>=2.1.0
  Using cached hdfs-2.7.0-py3-none-any.whl (34 kB)
Collecting proto-plus<2,>=1.7.1
  Downloading proto_plus-1.22.3-py3-none-any.whl (48 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.1/48.1 KB 3.8 MB/s eta 0:00:00
Collecting orjson<4.0
  Using cached orjson-3.9.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (138 kB)
Collecting python-dateutil<3,>=2.8.0
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting requests<3.0.0,>=2.24.0
  Downloading requests-2.31.0-py3-none-any.whl (62 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 KB 7.4 MB/s eta 0:00:00
Collecting fasteners<1.0,>=0.3
  Downloading fasteners-0.18-py3-none-any.whl (18 kB)
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.47.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.5 MB)
Collecting httplib2<0.22.0,>=0.8
  Downloading httplib2-0.21.0-py3-none-any.whl (96 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.8/96.8 KB 11.1 MB/s eta 0:00:00
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.46.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.4 MB)
Collecting pymongo<4.0.0,>=3.8.0
  Downloading pymongo-3.13.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (506 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 506.0/506.0 KB 30.1 MB/s eta 0:00:00
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.45.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
  Using cached apache_beam-2.44.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.2 MB)
Collecting cloudpickle~=2.2.0
  Downloading cloudpickle-2.2.1-py3-none-any.whl (25 kB)
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.43.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.2 MB)
Collecting objsize<0.6.0,>=0.5.2
  Downloading objsize-0.5.2-py3-none-any.whl (8.2 kB)
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.42.0-cp37-cp37m-manylinux2010_x86_64.whl (11.0 MB)
Collecting cloudpickle~=2.1.0
  Downloading cloudpickle-2.1.0-py3-none-any.whl (25 kB)
Collecting apache-beam[gcp]<3,>=2.40
  Using cached apache_beam-2.41.0-cp37-cp37m-manylinux2010_x86_64.whl (10.9 MB)
  Using cached apache_beam-2.40.0-cp37-cp37m-manylinux2010_x86_64.whl (10.9 MB)
INFO: pip is looking at multiple versions of absl-py to determine which version is compatible with other requirements. This could take a while.
Collecting absl-py<2.0.0,>=0.9
  Using cached absl_py-1.3.0-py3-none-any.whl (124 kB)
INFO: pip is looking at multiple versions of dill to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of tensorflow-transform to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install apache-beam[gcp]==2.40.0, apache-beam[gcp]==2.41.0, apache-beam[gcp]==2.42.0, apache-beam[gcp]==2.43.0, apache-beam[gcp]==2.44.0, apache-beam[gcp]==2.45.0, apache-beam[gcp]==2.46.0, apache-beam[gcp]==2.47.0, apache-beam[gcp]==2.48.0 and dill==0.3.3 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested dill==0.3.3
    apache-beam[gcp] 2.48.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.47.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.46.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.45.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.44.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.43.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.42.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.41.0 depends on dill<0.3.2 and >=0.3.1.1
    The user requested dill==0.3.3
    apache-beam[gcp] 2.40.0 depends on dill<0.3.2 and >=0.3.1.1

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
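
For anyone reading the log above, the conflict can be checked by hand. The following is only a minimal sketch, not how pip resolves versions internally: it compares plain dotted version numbers against the range pip reports for apache-beam[gcp] 2.40.0 through 2.48.0 (dill>=0.3.1.1 and <0.3.2). The helper names `parse_version` and `beam_accepts_dill` are made up for illustration.

```python
def parse_version(text):
    """Turn a plain dotted version such as '0.3.1.1' into a tuple of ints.

    Trailing zeros are dropped so that '0.3.2' and '0.3.2.0' compare equal.
    Only simple numeric versions are handled (no 'rc1', 'post1', etc.).
    """
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


# Range copied from the pip output: dill<0.3.2 and >=0.3.1.1
DILL_LOWER = parse_version("0.3.1.1")  # inclusive lower bound
DILL_UPPER = parse_version("0.3.2")    # exclusive upper bound


def beam_accepts_dill(version):
    """Return True if `version` lies inside the range apache-beam asks for."""
    return DILL_LOWER <= parse_version(version) < DILL_UPPER


for candidate in ("0.3.1.1", "0.3.2", "0.3.3"):
    print(candidate, beam_accepts_dill(candidate))
```

Under these assumptions, 0.3.3 falls outside the range, which matches the conflict pip reports for the pinned dill==0.3.3. A specifier such as dill>=0.3.3 only admits versions at or above 0.3.3, so it cannot overlap with the range either. Whether the rest of the lab works with a dill version inside the range is not something the log confirms.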

I tried removing the version pin from dill, but it didn’t help and broke downstream code.

Hello @allan4040

Have you tried loosening the version range as the error message suggests? Instead of specifying dill==0.3.3, you can use dill>=0.3.3

If that fails, you can also remove dill==0.3.3 from your requirements.txt altogether and let pip find a compatible version automatically.

Regards
Isaak

Thank you, Isaak. To troubleshoot, I specified just “dill” with no version, which should let pip pick a compatible version of dill. However, it still failed downstream: the later cell that runs “import dill as pickle” did not find the library and raised “ModuleNotFoundError: No module named dill”.

Maybe I missed resetting the runtime or something. At this point I’ll just move on. I’m finding that Course 4 has a lot of old libraries and dependencies, and really needs an update by the course authors.
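
A quick way to narrow down a ModuleNotFoundError like this one is to ask the running kernel whether it can locate the module at all. This is only a sketch using the standard library; `is_importable` is a hypothetical helper name, and it handles top-level module names only.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate `module_name`.

    Intended for top-level names such as 'dill'; nothing is imported,
    the module is only looked up on the import path.
    """
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter the notebook kernel is running, then check dill.
print("interpreter:", sys.executable)
print("dill importable:", is_importable("dill"))
```

If this reports that dill is not importable right after the install cell, one possible explanation is that the install step did not actually complete, since pip installs nothing when dependency resolution fails.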

Hello @allan4040
Did the solution I listed earlier also fail? If so, you can try preventing pip from installing the dependencies:
pip install -r ./molecules/requirements.txt --no-deps

Yes, I tried everything you recommended, and each attempt fails:

  • “dill>=0.3.3” fails.
  • Removing it fails.
  • Running without dependencies breaks everything downstream.

I’m sorry, but at this point I don’t have more time to spend debugging this. You’re welcome to try debugging the Colab yourself, or the authors can look into it. It simply does not run properly out of the box as of this writing, and I recommend that the authors update, debug, and verify the Colab. Many of the Course 4 labs unfortunately seem to have version issues and appear to be out of date.

@Isaak_Kamau same issue here. It’s been close to a month and this issue hasn’t been addressed yet.

ERROR: Cannot install apache-beam[gcp]==2.40.0, apache-beam[gcp]==2.41.0, apache-beam[gcp]==2.42.0, apache-beam[gcp]==2.43.0, apache-beam[gcp]==2.44.0, apache-beam[gcp]==2.45.0, apache-beam[gcp]==2.46.0, apache-beam[gcp]==2.47.0, apache-beam[gcp]==2.48.0 and dill==0.3.3 because these package versions have conflicting dependencies.

Hi everyone. Thank you for bringing this to our attention. We are looking into this and will update the lab asap. Sorry for the inconvenience!

Hi everyone! The issue should now be fixed. Kindly reopen the lab from your classroom to see the changes. Thanks!