C4W3_Lab1_KubeflowPipelines: kubectl apply -k fails

Hello, I am having trouble installing Kubeflow Pipelines on top of my running Kubernetes cluster in kind.

Cluster was created using:
kind create cluster --image kindest/node:v1.21.2

Cluster is successfully up and running, but when I try to run the following sequence of commands:

export PIPELINE_VERSION=1.7.0
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"

I get a timeout error:

I printed the terminal output of kustomize version, kind version, kubectl version, and kubectl get nodes in case there are any version issues to be spotted. Any input would be great as to what the problem might be and how to fix it.

Thanks,
Evan

After launching dockerd, please create your kind cluster like this and then run rest of the steps:
kind create cluster --image=kindest/node:v1.21.2

Hello balaji,

Thanks for your response, but as you can see in my post above, that is the initial way I created my kind cluster, with the image kindest/node:v1.21.2 that you mentioned. The kubectl apply -k command for the Kubeflow Pipelines manifest was still failing.

Evan

Thanks for the follow up.

Your screenshot isn’t aligned with the commands you shared in the original post. Note the --timeout=60 flag: it isn’t in
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
but your screenshot has it.

Please remove that flag and try again.
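To be explicit about where the flag belongs: --timeout is an option of kubectl wait (how long to block for the CRD to become established), not of the kubectl apply -k step. The original sequence already has this right:

```shell
# No --timeout here; the apply just submits the manifests
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
# --timeout only belongs on the wait, bounding how long it blocks
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
```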

Still not working

If I try to go to the GitHub URL in the browser, I keep getting a 404 "webpage not found" error. Has the link to the .yaml files changed?

This is the path for browsing:

Could you please try with the following?
kfp==1.8.13 (this means that the export PIPELINE_VERSION should also change)

$ pip install -U kfp

My versions:
kind == 0.14.0
kustomize == 4.5.4
kubectl == 1.24.2

Running in an existing Python virtual environment; kfp version 1.8.13 installed successfully:
%pip install -U kfp

Then, following up with the kubectl apply -k command, I am still getting the same error:

The URL you used points to master.

Please run the following commands:

kind delete cluster
kind create cluster --image=kindest/node:v1.21.2
export PIPELINE_VERSION=1.8.13
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
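To make the ref mechanics explicit: the path you browsed has no ?ref= query, so it resolves against master, while the commands above pin both kustomize targets to the released version. A local sketch of how those targets are assembled (plain string interpolation, nothing cluster-side):

```shell
# Build the two kustomize remote targets from PIPELINE_VERSION.
# Without the ?ref= query, the same path would resolve against master.
export PIPELINE_VERSION=1.8.13
BASE="github.com/kubeflow/pipelines/manifests/kustomize"
CLUSTER_SCOPED="${BASE}/cluster-scoped-resources?ref=${PIPELINE_VERSION}"
PLATFORM_AGNOSTIC="${BASE}/env/platform-agnostic-pns?ref=${PIPELINE_VERSION}"
echo "$CLUSTER_SCOPED"
echo "$PLATFORM_AGNOSTIC"
```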

Following the commands you sent in the last post I get the output:

But the path that you sent over earlier in the hyperlink above points to this link:

The path in the notebook points to this, which receives a 404 error:
github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources

Both of them are failing. Which link is it, 1 or 2?

Let’s stay with the last set of commands I gave you.

Can you try modifying one command based on this?

kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION&timeout=300"
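For context (my understanding of kustomize remote targets, worth checking against the kustomize docs): the URL accepts extra query parameters, and timeout raises the git fetch timeout in seconds, which helps on slow connections. The modified target is just the original with one more parameter:

```shell
# Same cluster-scoped target as before, with the fetch timeout raised to 300s
export PIPELINE_VERSION=1.8.13
TARGET="github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=${PIPELINE_VERSION}&timeout=300"
echo "$TARGET"
```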

Thank you for your quick help; I appreciate it! I had been stuck on this for a bit.

You’re welcome. I’ll ask the staff to add this to the notebook as well.

Ok great, I’m sure it will help some others out.
Another thing: I am not sure whether ‘pip install -U kfp’ was necessary to make it work, but I did not see it mentioned anywhere in the notebook.

Thanks again!

After applying the previous commands, I waited all day for the pods to all be up and running, but after 6+ hours they never reached Running status:

I’m not sure what the problem was.

What is your hardware configuration in terms of RAM and number of CPU cores?

I’m on Apple M1 Silicon (2020):
8-core CPU
16 GB RAM

Could we try the following?

$ kubectl get pods -n kubeflow

For each pod that’s not ready, see the logs like this (let’s start with mysql):

$ kubectl logs mysql-f7b9b7dd4-2478k -n kubeflow
# I've manually trimmed to only the last few lines
2022-08-04T06:14:04.628127Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2022-08-04T06:14:04.628157Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2022-08-04T06:14:04.629407Z 0 [Note] InnoDB: 5.7.33 started; log sequence number 12664279
2022-08-04T06:14:04.630284Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2022-08-04T06:14:04.630656Z 0 [Note] Plugin 'FEDERATED' is disabled.
2022-08-04T06:14:04.636129Z 0 [Note] InnoDB: Buffer pool(s) load completed at 220804  6:14:04
2022-08-04T06:14:04.640858Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2022-08-04T06:14:04.640882Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2022-08-04T06:14:04.641970Z 0 [Warning] CA certificate ca.pem is self signed.
2022-08-04T06:14:04.642032Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2022-08-04T06:14:04.642805Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2022-08-04T06:14:04.642860Z 0 [Note] IPv6 is available.
2022-08-04T06:14:04.642883Z 0 [Note]   - '::' resolves to '::';
2022-08-04T06:14:04.642912Z 0 [Note] Server socket created on IP: '::'.
2022-08-04T06:14:04.646930Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2022-08-04T06:14:04.663204Z 0 [Note] Event Scheduler: Loaded 0 events
2022-08-04T06:14:04.663530Z 0 [Note] mysqld: ready for connections.
Version: '5.7.33'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  MySQL Community Server (GPL)
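Side note: the mysql log above actually ends in a healthy "ready for connections", so for the other pods it may be quicker to sweep everything that isn't Running in one pass. A sketch using read-only kubectl queries (standard flags, but treat it as a starting point, not the official lab procedure):

```shell
# List every pod in the kubeflow namespace that is not in the Running
# phase, then dump the tail of its logs. "|| true" tolerates pods whose
# containers never started and therefore have no logs at all.
for p in $(kubectl get pods -n kubeflow \
    --field-selector=status.phase!=Running -o name); do
  echo "== $p =="
  kubectl logs "$p" -n kubeflow --tail=10 || true
done
```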

Hello balaji,

5 troublesome pods with large numbers of restarts,
3 that were currently not running,
and 2 pods that never ran at all, stuck in the Pending state (mysql / workflow-controller).

I ran the cluster again and applied the troubleshooting method you recommended, observing the logs. This is the output:

Why does mysql pod not have any logs?
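A likely explanation, given that mysql was stuck in Pending earlier in the thread: a Pending pod has never had a container started, so there is nothing for kubectl logs to return. The scheduling events (for example "Insufficient memory" on a resource-constrained kind node) live on the pod object instead:

```shell
# A Pending pod has no container logs; its scheduling events show up
# under "Events" in the describe output. The pod name is the one from
# the earlier logs command.
kubectl describe pod mysql-f7b9b7dd4-2478k -n kubeflow
# For pods that crashed and restarted, the previous container's logs
# are still retrievable:
# kubectl logs <pod> -n kubeflow --previous
```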