GitOps For Kubeflow Using Argo CD
This guide describes how to set up Kubeflow with a GitOps methodology using Argo CD.
What is GitOps?
GitOps is a Continuous Delivery methodology centered on using Git as the single source of truth for declarative infrastructure and application code. The Git repo defines the desired state of an application using declarative specifications, and a GitOps tool like Argo CD reconciles the differences between the manifests defined in the Git repo and the live system. As a result, GitOps enforces an operating model where all changes are observable and verifiable through Git commits. The declarative specifications streamline deployments because developers do not need to write scripts to build and deploy their application. Once the application is deployed, debugging is simpler because the Git commit history gives developers a clear change log. Even if the live system has drifted from the desired state in the source repo, the GitOps methodology provides the tools to converge the live system with that desired state through the declarative specs. Finally, once a breaking commit is found, rollback is as simple as syncing a previous known-good Git commit. Together, these benefits reduce the time developers spend managing deployments, so they can focus on other features.
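As a rough illustration of the rollback point above, the sketch below syncs an Argo CD application to an earlier commit. It assumes the argocd CLI is installed and logged in, and that an application (such as the kubeflow application created later in this guide) already exists. The function name rollback_to_commit is hypothetical, not part of Argo CD.

```shell
# Sketch: roll an Argo CD application back to a known-good Git commit.
# Assumptions: the argocd CLI is installed and logged in, and the
# application named in $1 already exists in Argo CD.
rollback_to_commit() {
  local app_name="$1"      # Argo CD application name
  local good_commit="$2"   # Git commit SHA that is known to work
  if [ -z "$app_name" ] || [ -z "$good_commit" ]; then
    echo "usage: rollback_to_commit APP_NAME COMMIT_SHA" >&2
    return 1
  fi
  # Sync the live cluster state to the manifests at that commit.
  argocd app sync "$app_name" --revision "$good_commit"
}
```

For example, rollback_to_commit kubeflow COMMIT_SHA would sync the kubeflow application to that commit. If automated sync is enabled for the application, reverting the commit in Git is likely the more appropriate route.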
Argo CD and GitOps
Argo CD is a Kubernetes-native Declarative Continuous Delivery tool that follows the GitOps methodology. Along with all the benefits of using GitOps, Argo CD offers:
- Integrations with templating tools like ksonnet, Helm, and Kustomize, in addition to plain YAML files, to define the desired state of an application
- Automated or manual syncing of applications to their desired state
- Intuitive UI to provide observability into the state of applications
- Extensive CLI to integrate Argo CD with any CI system
- Enterprise-ready features like auditability, compliance, security, RBAC, and SSO
Before you start
This guide assumes you have a Kubernetes cluster. If you don’t, follow this guide to create one.
Install Argo CD
Follow the Argo CD getting started guide up to the ‘Create an application from a git repository location’ step.
- For example, if your local machine runs macOS and you want to deploy Argo CD and Kubeflow to the same GKE cluster, you could run:
- Install Argo CD to the Kubernetes cluster
ARGO_CD_LATEST=$(curl --silent "https://api.github.com/repos/argoproj/argo-cd/releases/latest" | grep '"tag_name"' | sed -E 's/.*"([^"]+)".*/\1/')
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/$ARGO_CD_LATEST/manifests/install.yaml
- Install the Argo CD CLI
brew install argoproj/tap/argocd
- Set extra permissions for Argo CD
Since the GKE cluster has RBAC enabled, you will need to grant your account the ability to create new cluster roles by running:
kubectl create clusterrolebinding YOURNAME-cluster-admin-binding --clusterrole=cluster-admin --user=YOUREMAIL@gmail.com
- Expose the Argo CD API server
kubectl port-forward service/argocd-server -n argocd 8080:443
You can read about other options to connect to your Argo CD instance here.
- Log in using the CLI as the admin user
The initial password for the admin user is autogenerated to be the pod name of the Argo CD API server. This can be retrieved with the command:
kubectl get pods -n argocd -l app=argocd-server -o name | cut -d'/' -f 2
Using the above password, log in to Argo CD by running:
argocd login localhost:8080
After logging in, change the password using the command:
argocd account update-password
argocd relogin
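The install steps above can be sketched as a few helper functions, which may be handy when repeating the setup. This is only a sketch under stated assumptions: kubectl, gcloud, and argocd are on your PATH, Argo CD lives in the argocd namespace, and the API server is port-forwarded to localhost:8080. The function names are hypothetical.

```shell
# Sketch: helper functions mirroring the install steps above.
# Assumptions: kubectl, gcloud and argocd are installed, and Argo CD
# was installed into the "argocd" namespace.

# Block until every Argo CD deployment reports the Available condition.
wait_for_argocd() {
  local timeout_seconds="${1:-300}"
  kubectl wait --for=condition=Available deployment --all \
    --namespace argocd --timeout="${timeout_seconds}s"
}

# Grant cluster-admin to the account gcloud is currently using (GKE only).
grant_cluster_admin() {
  local binding_name="$1"
  local user_email
  user_email="$(gcloud config get-value account)"
  if [ -z "$user_email" ]; then
    echo "no active gcloud account found" >&2
    return 1
  fi
  kubectl create clusterrolebinding "$binding_name" \
    --clusterrole=cluster-admin --user="$user_email"
}

# Log in as admin using the autogenerated initial password, which this
# guide describes as the name of the argocd-server pod.
argocd_admin_login() {
  local server="${1:-localhost:8080}"
  local password
  password="$(kubectl get pods -n argocd -l app=argocd-server -o name \
    | cut -d'/' -f 2 | head -n 1)"
  if [ -z "$password" ]; then
    echo "could not find the argocd-server pod" >&2
    return 1
  fi
  # --insecure skips TLS verification; only reasonable here because the
  # connection goes through a local port-forward.
  argocd login "$server" --username admin --password "$password" --insecure
}
```

A possible sequence is wait_for_argocd, then grant_cluster_admin YOURNAME-cluster-admin-binding, then (with the port-forward running) argocd_admin_login. Passing the password on the command line exposes it to other local users, so change it afterwards as described above.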
Create Kubeflow deployment repo
- Create a git repo to store your Kubeflow configuration.
- If you did not use the kfctl.sh script to create your Kubernetes cluster and generate the Kubernetes resources:
- Run the following script to download kfctl.sh:
mkdir ${KUBEFLOW_SRC}
cd ${KUBEFLOW_SRC}
export KUBEFLOW_TAG=v0.4.1
curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash
- KUBEFLOW_SRC: the directory where you want the Kubeflow source to be downloaded.
- KUBEFLOW_TAG: a tag corresponding to the version to check out, such as master for the latest code.
- Note: you can also just clone the repository using git.
- Run the following scripts to set up your Kubeflow ksonnet application:
${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform none
cd ${KFAPP}
${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
- KFAPP: the name of a directory where you want Kubeflow configurations to be stored. This directory will be created when you run init.
- Add the environment to your ksonnet application:
- If you are deploying Kubeflow in the same cluster as Argo CD, run:
cd ks_app
ks env add default --server https://kubernetes.default.svc --namespace kubeflow
otherwise run:
ks env add default
argocd cluster add CONTEXTNAME
- CONTEXTNAME: the context name of the Kubernetes cluster you want to deploy into.
- Add the contents of the KFAPP directory to the git repo you created in the first step.
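The last step above could look roughly like the following. This is a sketch, assuming the generated application directory is not yet a git repository, the remote repo already exists and is empty, and git has a user name and email configured. The function name publish_app_dir is hypothetical.

```shell
# Sketch: commit a generated Kubeflow application directory and push it
# to the deployment repo. The body runs in a subshell (parentheses), so
# the "cd" does not change the caller's working directory.
publish_app_dir() (
  app_dir="$1"    # directory holding the generated configuration
  repo_url="$2"   # ssh or https URL of the (empty) git repo
  if [ -z "$app_dir" ] || [ -z "$repo_url" ]; then
    echo "usage: publish_app_dir APP_DIR REPO_URL" >&2
    exit 1
  fi
  cd "$app_dir" || exit 1
  # Stop at the first failing git command.
  git init &&
    git add . &&
    git commit -m "Add Kubeflow ksonnet application" &&
    git remote add origin "$repo_url" &&
    git push -u origin HEAD
)
```

Pushing HEAD avoids assuming a particular default branch name. The repo root then contains the ks_app directory, which is the path passed to Argo CD in the next section.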
Deploying Kubeflow
Run the following commands to create the Kubeflow application in Argo CD and then sync the manifests in your git repo to your cluster:
export KUBEFLOW_SRC_URL='Replace with a ssh or https git endpoint'
argocd app create kubeflow --name kubeflow --repo $KUBEFLOW_SRC_URL --path ks_app --env default
argocd app sync kubeflow
- You can view the Kubeflow application by running:
argocd app get kubeflow
or from the Argo CD UI.
- NOTE: There is a known issue with the IAP component that prevents the envoy service from becoming synced and causes all subsequent syncs to fail. As a workaround for this issue, we recommend that you sync individual resources by adding the --resource flag to your sync command.
Once the sync has finished, you can access your Kubeflow UI at https://<KFAPP>.endpoints.<PROJECT>.cloud.goog/
- It can take 10–15 minutes for the endpoint to become available. Kubeflow needs to provision a signed SSL certificate and register a DNS name.
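Rather than polling argocd app get by hand, you may be able to block until the application is healthy. The sketch below assumes your argocd CLI version provides the app wait subcommand with the --health and --timeout flags; the function name wait_for_app and the 900-second default are hypothetical choices. It waits on health only, since the known issue above can keep the application from ever reporting as fully synced.

```shell
# Sketch: wait until an Argo CD application reports a healthy state.
# Assumptions: the argocd CLI is logged in and supports "app wait".
wait_for_app() {
  local app_name="${1:-kubeflow}"
  local timeout_seconds="${2:-900}"   # 15 minutes by default
  argocd app wait "$app_name" --health --timeout "$timeout_seconds"
}
```

Calling wait_for_app with no arguments waits on the kubeflow application. Note that this only covers resources Argo CD tracks; the SSL certificate and DNS registration can still take additional time.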
Going Forward
When you commit a change that modifies the ksonnet application directory of your Kubeflow repository (the ks_app directory if you used the kfctl.sh script), Argo CD will detect that your application is out of sync with your git repo. To sync the new resource, you can run:
argocd app sync kubeflow --resource GROUP:KIND:NAME
or sync it from the Argo CD UI.
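When several resources changed, the per-resource sync above can be wrapped in a small loop. This is a sketch; the function name sync_resources is hypothetical, and each argument after the application name is expected in the GROUP:KIND:NAME form shown above.

```shell
# Sketch: sync a list of individual resources, one at a time.
# Assumption: the argocd CLI is logged in and the application exists.
sync_resources() {
  local app_name="$1"
  if [ "$#" -lt 2 ]; then
    echo "usage: sync_resources APP_NAME GROUP:KIND:NAME..." >&2
    return 1
  fi
  shift   # the remaining arguments are the resources to sync
  local resource
  for resource in "$@"; do
    # Stop at the first resource that fails to sync.
    argocd app sync "$app_name" --resource "$resource" || return 1
  done
}
```

For example, sync_resources kubeflow apps:Deployment:NAME would sync a single Deployment, where NAME is a placeholder for a resource in your application.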
More Argo CD configuration
Please go to the Argo CD documentation to read more about how to configure other features like auto-sync, SSO, RBAC, and more!
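As one example of such configuration, auto-sync can be turned on from the CLI. This is a minimal sketch, assuming your argocd CLI version supports the --sync-policy flag on app set; the function name enable_auto_sync is hypothetical.

```shell
# Sketch: enable automated sync for an Argo CD application.
# Assumption: the argocd CLI is logged in and the application exists.
enable_auto_sync() {
  local app_name="${1:-kubeflow}"
  argocd app set "$app_name" --sync-policy automated
}
```

Given the known IAP issue described earlier, automated syncs of the whole application may keep failing, so it is worth confirming that a manual sync succeeds before enabling this.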