Hello, I am having trouble installing Kubeflow Pipelines on top of my running Kubernetes cluster in kind.
Cluster was created using:
kind cluster create --image kindest/node:v1.21.2
Cluster is successfully up and running, but when I try to run the following sequence of commands:
export PIPELINE_VERSION=1.7.0
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
I get a timeout error:
I printed the terminal output of kustomize version, kind version, kubectl version, and kubectl get nodes in case there is a version issue. Any input on what the problem might be and how to fix it would be great.
Thanks,
Evan
After launching dockerd, please create your kind cluster like this and then run the rest of the steps:
kind create cluster --image=kindest/node:v1.21.2
Hello balaji,
Thanks for your response, but as you can see in my post above, that is how I initially created my kind cluster, with the kindest/node:v1.21.2 image you mentioned, and the kubectl apply -f command for the Kubeflow Pipelines manifest was still failing.
Evan
Thanks for the follow up.
Your screenshot isn’t aligned with the commands you shared in the original post. See the --timeout=60 flag. It’s not there in
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
but your screenshot has it.
Please remove that flag and try again.
If I try to open the GitHub URL in the browser, I keep getting a 404 (page not found) error. Has the link to the .yaml files changed?
This is the path for browsing:
Could you please try with the following?
kfp==1.8.13 (this means that the export PIPELINE_VERSION should also change)
$ pip install -U kfp
My versions:
kind = 0.14.0
kustomize = 4.5.4
kubectl = 1.24.2
Running in an existing Python virtual environment, kfp installed successfully at version 1.8.13:
%pip install -U kfp
Following up with the kubectl apply -f command, I am still getting the same error:
The URL you used points to master.
Please run the following commands:
kind delete cluster
kind create cluster --image=kindest/node:v1.21.2
export PIPELINE_VERSION=1.8.13
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
Following the commands you sent in the last post I get the output:
-
But the path that you sent over earlier in the hyperlink above points to this link:
-
The path in the notebook points to this, which returns a 404 error:
github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources
Both of them are failing. Which link is it, 1 or 2?
Let’s stay with the last set of commands I gave you.
Can you try modifying one command based on this?
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION&timeout=300"
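For anyone adapting the command above, here is a minimal sketch of how the remote URL is put together. The function name kfp_manifest_url is hypothetical (it is not part of kubectl or kustomize), and my understanding is that kustomize reads the timeout query parameter as the fetch timeout in seconds, so please treat that as an assumption and check it against your kustomize version.

```shell
# Hypothetical helper: build a kustomize remote-target URL for a
# Kubeflow Pipelines manifest directory.
#   $1: path under manifests/kustomize (e.g. cluster-scoped-resources)
#   $2: git ref (e.g. the value of PIPELINE_VERSION)
#   $3: optional fetch timeout, appended as &timeout=...
kfp_manifest_url() {
  url="github.com/kubeflow/pipelines/manifests/kustomize/$1?ref=$2"
  if [ -n "${3:-}" ]; then
    url="$url&timeout=$3"
  fi
  printf '%s\n' "$url"
}

# Usage against a live cluster (needs kubectl and network access):
#   kubectl apply -k "$(kfp_manifest_url cluster-scoped-resources "$PIPELINE_VERSION" 300)"
```

Keeping the whole URL in double quotes matters here, because an unquoted & would be interpreted by the shell as "run in the background".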
Thank you for your quick help. I appreciate it; I had been stuck on this for a bit!
You’re welcome. I’ll ask the staff to add this to the notebook as well.
OK, great. I’m sure it will help some others out.
One more thing: I am not sure whether ‘pip install -U kfp’ was necessary to make it work, but I did not see it mentioned anywhere in the notebook.
Thanks again!
After applying the previous commands, I waited all day for the pods to come up, but after 6+ hours they never reached the Running status:
I’m not sure what the problem is.
What is your hardware configuration in terms of RAM and number of CPU cores?
I’m on Apple M1 Silicon 2020
8 Core CPU
16 GB RAM
Could we try the following?
$ kubectl get pods -n kubeflow
For each pod that’s not ready, see the logs like this (let’s start with mysql):
$ kubectl logs mysql-f7b9b7dd4-2478k -n kubeflow
# I've manually trimmed to only the last few lines
2022-08-04T06:14:04.628127Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2022-08-04T06:14:04.628157Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2022-08-04T06:14:04.629407Z 0 [Note] InnoDB: 5.7.33 started; log sequence number 12664279
2022-08-04T06:14:04.630284Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2022-08-04T06:14:04.630656Z 0 [Note] Plugin 'FEDERATED' is disabled.
2022-08-04T06:14:04.636129Z 0 [Note] InnoDB: Buffer pool(s) load completed at 220804 6:14:04
2022-08-04T06:14:04.640858Z 0 [Note] Found ca.pem, server-cert.pem and server-key.pem in data directory. Trying to enable SSL support using them.
2022-08-04T06:14:04.640882Z 0 [Note] Skipping generation of SSL certificates as certificate files are present in data directory.
2022-08-04T06:14:04.641970Z 0 [Warning] CA certificate ca.pem is self signed.
2022-08-04T06:14:04.642032Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2022-08-04T06:14:04.642805Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2022-08-04T06:14:04.642860Z 0 [Note] IPv6 is available.
2022-08-04T06:14:04.642883Z 0 [Note] - '::' resolves to '::';
2022-08-04T06:14:04.642912Z 0 [Note] Server socket created on IP: '::'.
2022-08-04T06:14:04.646930Z 0 [Warning] Insecure configuration for --pid-file: Location '/var/run/mysqld' in the path is accessible to all OS users. Consider choosing a different directory.
2022-08-04T06:14:04.663204Z 0 [Note] Event Scheduler: Loaded 0 events
2022-08-04T06:14:04.663530Z 0 [Note] mysqld: ready for connections.
Version: '5.7.33' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MySQL Community Server (GPL)
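If there are many pods to go through, the per-pod check above can be scripted. This is only a sketch: the function name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) with headers suppressed.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` style lines on
# stdin and print the names of pods that are not fully ready.
# Assumes column 1 is NAME, column 2 is READY (e.g. 0/1), column 3 is STATUS.
not_ready_pods() {
  awk '{
    if ($3 == "Completed") next          # finished jobs are not a problem
    split($2, ready, "/")                # ready[1] = ready count, ready[2] = total
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Usage against a live cluster (needs kubectl and cluster access):
#   kubectl get pods -n kubeflow --no-headers | not_ready_pods |
#     while read -r pod; do
#       echo "== $pod =="
#       kubectl logs "$pod" -n kubeflow --tail=20
#     done
```

The --tail=20 flag does the trimming to the last few lines that I did by hand above.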
Hello balaji,
5 troublesome pods with a large number of restarts,
3 that were not running at the time,
and 2 pods that never ran and are stuck in the Pending state (mysql and workflow controller).
I ran the cluster again and applied the troubleshooting method you recommended, observing the logs. This is the output:
Why does the mysql pod not have any logs?