ValueError: bad marshal data (unknown type code)

I wanted to share how I fixed the marshal data issue (and other compatibility problems for this course), at least on a Mac.

Note that I already had two versions of Python on my Mac, which is common because older versions of macOS ship with a 2.7.x system Python. To test your setup, run ‘python -V’ and ‘python3 -V’ in the Terminal. On my machine the first returns 2.7.18 and the second returns 3.9.10. If you don’t have this dual Python setup, the steps below may not work quite the same for you.

What versions of Python and TensorFlow do we need? First we need to know which versions of Python and TensorFlow the assignment actually expects. To check, run the following in a cell of the course’s Jupyter notebook so you know what to aim for.

import sys
import tensorflow as tf

# Print the versions the notebook environment is running
print("TensorFlow version")
print(tf.version.VERSION)

print("Python version")
print(sys.version)

For the issue in this post, we need Python 3.7.6 and TensorFlow 2.3.0.
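
Once you know the target versions, it can help to have a quick check you can re-run locally after each step below. This is only a minimal sketch: the helper names `parse_version` and `version_matches` are my own (not from the course), and the "3.7" target is simply the Python version quoted above.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '3.7.6' into a tuple of ints: (3, 7, 6)."""
    return tuple(int(number) for number in re.findall(r"\d+", text))


def version_matches(found, required):
    """Return True when `found` agrees with every component listed in `required`."""
    required_parts = parse_version(required)
    return parse_version(found)[: len(required_parts)] == required_parts


if __name__ == "__main__":
    # Build 'major.minor.patch' for the interpreter running this script.
    python_found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if version_matches(python_found, "3.7") else "MISMATCH"
    print("Python", python_found, status)
```

The same function works for TensorFlow: pass it `tf.version.VERSION` and "2.3.0" once TensorFlow is installed.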

Step 1: getting the right Python version. Install pyenv, a Python version manager that lets you install multiple versions of Python and switch between them. I did this with Homebrew. To install Homebrew, run the following in the Mac’s Terminal (cmd-space, then type Terminal):

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

To install pyenv, run:

brew install pyenv

To install the relevant version of python, run:

pyenv install 3.7.6

Issue 1: this didn’t work! The install failed for me with an error whose final line was “ret = sendfile(in, out, offset, &sbytes, &sf, flags);”. Apparently this has something to do with Xcode. This GitHub page contained a magical bit of code which fixed it for me. The original used --patch 3.8.0, but it worked for me after just changing that to --patch 3.7.6:

CFLAGS="-I$(brew --prefix openssl)/include -I$(brew --prefix bzip2)/include -I$(brew --prefix readline)/include -I$(xcrun --show-sdk-path)/usr/include" LDFLAGS="-L$(brew --prefix openssl)/lib -L$(brew --prefix readline)/lib -L$(brew --prefix zlib)/lib -L$(brew --prefix bzip2)/lib" \
pyenv install --patch 3.7.6 < <(curl -sSL https://github.com/python/cpython/commit/8ea6353.patch\?full_index\=1)

This successfully installed the right Python version.

Step 2: setting this as the main Python version. To set as the global version run the following:

pyenv global 3.7.6

Issue 2: setting global pyenv didn’t work. To check whether the above has worked, run ‘python3 -V’: you should get 3.7.6 rather than whatever your version was beforehand. (It also changes the plain python call, i.e. ‘python -V’ should now point to 3.7.6 too, but I try to stay away from the system’s Python calls so will continue to call python3.) In my case this didn’t work: python3 still reported the old version.

To fix this, I found a second GitHub page with magical code on it (the pyenv page). Run the following to temporarily point python3 at the new pyenv-installed version of Python rather than your usual python3:

export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"    # if `pyenv` is not already on PATH
eval "$(pyenv init --path)"
eval "$(pyenv init -)"

Check this works with ‘python3 -V’ and you should now get 3.7.6.
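
If the version number still looks wrong, it can be useful to see which python3 executable the shell actually finds. The sketch below assumes pyenv is in its default location (~/.pyenv, or wherever PYENV_ROOT points) and that it exposes its Pythons through a "shims" directory under that root. The helper name `is_pyenv_shim` is my own.

```python
import os
import shutil


def is_pyenv_shim(executable_path, pyenv_root):
    """Return True if `executable_path` sits directly inside <pyenv_root>/shims."""
    if executable_path is None:
        return False
    shims_dir = os.path.join(os.path.abspath(pyenv_root), "shims")
    return os.path.dirname(os.path.abspath(executable_path)) == shims_dir


if __name__ == "__main__":
    # Fall back to pyenv's default root when PYENV_ROOT is not set.
    pyenv_root = os.environ.get("PYENV_ROOT", os.path.expanduser("~/.pyenv"))
    python3_path = shutil.which("python3")  # first python3 found on PATH, or None
    print("python3 resolves to:", python3_path)
    print("managed by pyenv:", is_pyenv_shim(python3_path, pyenv_root))
```

If the second line prints False, the PATH changes above probably haven’t taken effect in the current Terminal session.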

Step 3: getting the right TensorFlow. When you switch to a new pyenv-installed version of Python you need to re-install all your packages. The package installer pip comes with Python, so you can simply call ‘pip3 install package-name’ to get the packages you want. You’ll also need to install any others the assignment uses (like pandas in this case).

Make sure pip3 is upgraded first, otherwise you likely won’t find the right TensorFlow version. Run:

pip3 install --upgrade pip

To install the correct TensorFlow version:

pip3 install tensorflow==2.3.0

Note the order here matters: TensorFlow 2.3.0 isn’t available for Python 3.9 and later, so you have to downgrade to Python 3.7.6 first, get your system to point at the right install and then install the downgraded TensorFlow.
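
One way to be sure the package lands in the interpreter you just switched to, rather than in whichever one pip3 happens to belong to, is to call pip through that interpreter. This is a sketch and not something from the course material: `pip_install_command` and `install` are hypothetical helper names, and by default nothing is installed (it only prints the command).

```python
import subprocess
import sys


def pip_install_command(package, version):
    """Build a pip command that installs into the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", "{}=={}".format(package, version)]


def install(package, version, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    command = pip_install_command(package, version)
    print(" ".join(command))
    if not dry_run:
        subprocess.check_call(command)
    return command


if __name__ == "__main__":
    # Dry run by default: shows the command without changing anything.
    install("tensorflow", "2.3.0")
```

Run it with python3 after the pyenv setup above, and switch dry_run to False when the printed path points at the 3.7.6 install.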

Step 4: testing your code. Finally, test your code by running the following from within your assignment 4 directory. Note that you will need all the files from that folder to work locally. This post by @paulinpaloalto details how to do that. If your terminal doesn’t have a command prompt, see here as well.

python3 file_name.py

Warning 1: in my case at least, the Python version reverts to the system defaults (i.e. 2.7.18 for python calls and 3.9.10 for python3 calls) when I close the Terminal. In a new Terminal you have to set the pyenv global again and re-run the four lines of code above to get back to working with 3.7.6 (or whatever version you need).

Warning 2: I generally work in Sublime Text, but that uses Python 3.9.10 regardless of what I’m doing. I could likely fix this but haven’t, so for assignments where the version of Python or TensorFlow matters I just work out of the Terminal.