C1W3: ProteinGAN ValueError: The parameter init="pca" cannot be used with metric="precomputed"

ProteinGAN is a very cool idea.

I get an error in cell 13.

Here is the code:

from sklearn.manifold import TSNE

#Loading calculated distances
distance_matrix = pd.read_csv("dist_out.dist", delimiter=r'\s+', skiprows=[0], header=None, index_col=0)
distance_matrix.columns = distance_matrix.index.values

#Using TSNE to compress all pair wise distances between sequences into two components which then could be plotted.
tsne = TSNE(n_components=2, metric='precomputed')
coordinates_2d = tsne.fit_transform(distance_matrix.values)

Here is the complete stack trace:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-df2fbea3129e> in <cell line: 9>()
      7 #Using TSNE to compress all pair wise distances between sequences into two components which then could be plotted.
      8 tsne = TSNE(n_components=2, metric='precomputed')
----> 9 coordinates_2d = tsne.fit_transform(distance_matrix.values)

1 frames
/usr/local/lib/python3.10/dist-packages/sklearn/manifold/_t_sne.py in _fit(self, X, skip_num_points)
    864         if self.metric == "precomputed":
    865             if isinstance(self.init, str) and self.init == "pca":
--> 866                 raise ValueError(
    867                     'The parameter init="pca" cannot be used with metric="precomputed".'
    868                 )

ValueError: The parameter init="pca" cannot be used with metric="precomputed".

Any idea how I can work around this?

Andy

Hi @Andy_Davidson,
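It looks like scikit-learn changed the default for the TSNE init parameter from 'random' to 'pca' in version 1.2, and PCA initialization can't be used with metric='precomputed': PCA needs the original feature vectors, not a distance matrix.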

To fix this, add init='random' to the TSNE constructor call, like this:

tsne = TSNE(n_components=2, metric='precomputed', init='random')

I’ll let the staff know so they can update the notebook.
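
For anyone who wants to check the fix outside the notebook, here is a minimal sketch. It assumes a small synthetic distance matrix in place of dist_out.dist. The helper name embed_precomputed and the perplexity and random_state values are illustrative choices only, not part of the course notebook.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_precomputed(distances, perplexity=10.0, random_state=0):
    """Project a square distance matrix to 2-D with t-SNE.

    init='random' is passed explicitly, because the 'pca' default in
    scikit-learn >= 1.2 is rejected when metric='precomputed'.
    """
    tsne = TSNE(
        n_components=2,
        metric="precomputed",
        init="random",
        perplexity=perplexity,      # must be smaller than the number of samples
        random_state=random_state,  # fixed seed, for repeatable output only
    )
    return tsne.fit_transform(distances)


# Synthetic stand-in for the notebook's distance matrix (assumption):
# pairwise Euclidean distances between 40 random 5-D points.
rng = np.random.default_rng(0)
points = rng.normal(size=(40, 5))
diff = points[:, None, :] - points[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))

coordinates_2d = embed_precomputed(distances)
print(coordinates_2d.shape)
```

In the notebook itself, the equivalent call would take distance_matrix.values as the input. The default perplexity of 30 should be fine there as long as the matrix has more than 30 sequences.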

Thanks Wendy, your fix works.
