Kubeflow deployment: part 1

ROBIN DONG 2021-07-30 08:02

By following the document, I tried to deploy the management cluster of Kubeflow. But after running make apply-cluster, it reported:

The management cluster name "kubeflow-mgmt" is valid.
# Delete the directory so any resources that have been removed
# from the manifests will be pruned
rm -rf build/cluster
mkdir -p build/cluster
kustomize build ./cluster -o build/cluster
# Create the cluster
anthoscli apply -f build/cluster
I0723 14:53:19.329785   24546 main.go:230] reconcile serviceusage.cnrm.cloud.google.com/Service container.googleapis.com
I0723 14:53:23.236897   24546 main.go:230] reconcile container.cnrm.cloud.google.com/ContainerCluster kubeflow-mgmt
Unexpected error: error reconciling objects: error reconciling ContainerCluster:gcp-wow-rwds-ai-mlchapter-dev/kubeflow-mgmt: error creating GKE cluster kubeflow-mgmt: googleapi: Error 400: Project "gcp-wow-rwds-ai-mlchapter-dev" has no network named "default".
make: *** [apply-cluster] Error 1

The reason for this error is that Kubeflow can only use the network named “default” in GCP as its VPC, and this project has no such network. This issue is still open and has been pointed to anthos.
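
Before running make apply-cluster, it may help to check whether the project has a network named "default" at all. The following is only a minimal sketch and not part of the original guide: the helper has_default_network and the PROJECT variable are hypothetical names, and it assumes the gcloud CLI is installed and authenticated.

```shell
#!/bin/sh
# has_default_network is a hypothetical helper: it reads network names from
# stdin, one per line, and succeeds only if one of them is exactly "default".
has_default_network() {
  grep -qx "default"
}

# Only query GCP when gcloud is available and PROJECT (an assumed variable) is set.
if command -v gcloud >/dev/null 2>&1 && [ -n "${PROJECT:-}" ]; then
  if gcloud compute networks list --project "$PROJECT" --format="value(name)" \
      | has_default_network; then
    echo "network 'default' exists in $PROJECT"
  else
    echo "no network named 'default' in $PROJECT; expect the Error 400 above"
  fi
fi
```

The name check is kept separate from the gcloud call so it can be tried on any list of names, for example: printf 'default\n' | has_default_network.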

Workaround: Create a new GKE cluster manually, and set MGMT_NAME to the name of the existing cluster

export MGMT_NAME=kubeflow-exp

Then make apply-cluster would work properly.
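
The post does not say how the cluster was created manually, so the following is only a sketch of one possible way to do it with gcloud. The project, zone, and network names are hypothetical placeholders, as are the run and create_mgmt_cluster helpers; only the cluster name kubeflow-exp comes from the post. A real management cluster may need further settings that are not shown here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# All values below are hypothetical placeholders except kubeflow-exp,
# which is the cluster name used in the post.
PROJECT="${PROJECT:-my-gcp-project}"
ZONE="${ZONE:-us-central1-a}"
NETWORK="${NETWORK:-kubeflow-net}"
MGMT_NAME="${MGMT_NAME:-kubeflow-exp}"

# run prints the command when DRY_RUN is 1 (the default) and executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_mgmt_cluster() {
  # Create a custom VPC, since the project has no network named "default".
  run gcloud compute networks create "$NETWORK" \
    --project "$PROJECT" --subnet-mode=auto
  # Create the GKE cluster on that network.
  run gcloud container clusters create "$MGMT_NAME" \
    --project "$PROJECT" --zone "$ZONE" --network "$NETWORK"
}

create_mgmt_cluster
```

Setting DRY_RUN=0 makes the script execute the commands instead of printing them.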
