3 Install Onyxia
Once the k8s cluster and its reverse proxy are installed, we can start to install Onyxia.
3.1 What is Onyxia?
Onyxia is a web application that provides a catalog of services and helps users run them on a k8s cluster. For now, the catalog contains mainly data-oriented services (analytics, data visualization, machine learning, etc.). You can find the official git repo of Onyxia here.
Onyxia contains two main components:
- onyxia-api, in charge of the API of Onyxia (a wrapper around the k8s API used for helm install).
- onyxia-web, in charge of the UI of Onyxia.
3.2 Why Onyxia?
The services in the Onyxia catalog can be deployed into any k8s cluster, even without Onyxia. But configuring and deploying these services with command-line tools is not very user friendly for data professionals such as analysts and scientists. Onyxia provides a user-friendly web interface that hides the complexity of managing services in a Kubernetes cluster. It allows data scientists to focus on data analysis tasks rather than on operating the k8s cluster.
3.3 What services does Onyxia provide?
- Quick and easy deployment of all services in the catalog (e.g. jupyter, r-studio, mlflow, etc.), through API calls that run helm install.
- All courses available on the training page.
- Flexible resource configuration on all services (e.g. CPU, GPU, RAM, and disk (ephemeral or PVCs)).
- Possibility to add initialization scripts to the deployments.
- Possibility to create your own service catalog.
- Flexible user authentication system via Keycloak, which can use various backends (e.g. AD, openldap, rdbms, etc.) and provides SSO through OpenID Connect.
- MinIO for S3 object storage.
- Vault for secret management.
3.4 Requirements
Onyxia runs inside a k8s cluster like any other service (it is deployed via a helm chart), so you can easily deploy an instance of Onyxia with helm install. Before diving into the deployment of Onyxia, several requirements must be met. These requirements are detailed in the following sections of the Onyxia installation.
- A domain name must be registered in the form of a wildcard DNS. This is necessary to allow users to access the deployed services on the k8s cluster via proxy/reverse-proxy (ingress-nginx).
- Keycloak must be installed and configured with the proper settings to store users in a PostgreSQL database, and to use assumeRoleWithWebIdentity, allowing for single sign-on (SSO) with Onyxia, MinIO, and Vault.
- MinIO must be installed and configured with the correct policy to allow Onyxia to create buckets and objects. (Optional)
- Vault must be installed and configured with the correct policy to allow Onyxia to view, access, and manage secrets. (Optional)
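The wildcard DNS requirement can be checked before going further. The sketch below assumes a Linux host where `getent` is available; the function name `wildcard_dns_ok` and the probe label are hypothetical, and the check only tells you whether an arbitrary sub-domain resolves from the machine where you run it.

```shell
#!/bin/sh
# Sketch: succeed if an arbitrary sub-domain of the given domain resolves,
# which is what a wildcard DNS record should guarantee.
wildcard_dns_ok() {
    domain="$1"
    # A label that is very unlikely to have its own dedicated DNS record.
    probe="wildcard-probe-$$.$domain"
    getent hosts "$probe" > /dev/null
}

# Usage (datalab.casd.local is the example domain of this guide):
#   wildcard_dns_ok datalab.casd.local && echo "wildcard DNS looks fine"
```

If the function fails, services created by Onyxia under new sub-domains will most likely not be reachable by name.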
3.5 Install Onyxia
The commands below install an instance of Onyxia with the minimum configuration, basic.yaml.
helm repo add inseefrlab https://inseefrlab.github.io/helm-charts
# the host name datalab.casd.local will be the domain name that allows us to connect to onyxia
helm install onyxia inseefrlab/onyxia -f basic.yaml
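The content of basic.yaml is not reproduced in this section. As a rough sketch, assuming the chart accepts the same `ingress` keys as the OIDC example in section 3.7, a minimal values file could be generated as follows. The function name `write_basic_values` is hypothetical, and the real basic.yaml may contain more settings.

```shell
#!/bin/sh
# Sketch only: write a minimal values file for the onyxia chart.
# Assumption: the ingress keys mirror the oidc.yaml example of section 3.7.
write_basic_values() {
    host="$1"   # host name users will open in their browser
    out="$2"    # path of the values file to create
    cat > "$out" <<EOF
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: $host
EOF
}

# Usage (not executed here, requires helm and access to the cluster):
#   write_basic_values datalab.casd.local basic.yaml
#   helm repo add inseefrlab https://inseefrlab.github.io/helm-charts
#   helm repo update
#   helm install onyxia inseefrlab/onyxia -f basic.yaml
```

Generating the file from a function keeps the host name in a single place, which helps when the same domain is reused in later configuration files.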
3.6 Access Onyxia web UI
Note that all network traffic to the services exposed through the ingress of the k8s cluster is handled by the nginx proxy/reverse-proxy. You therefore need to send all your requests to the node hosting the worker that runs the nginx service. Because we choose
For testing purposes, if you don't have a DNS server, you can edit your /etc/hosts file to fake it.
The IP address must be that of one of the workers on which the ingress is running.
Add the line below to /etc/hosts:
10.50.5.59 datalab.casd.local
To test it, you can run:
curl datalab.casd.local
This workaround is not recommended for production, because /etc/hosts does not support wildcards. This means that newly created services whose host names are not listed in /etc/hosts are not accessible.
See Chapter 4 for installing a wildcard DNS server.
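The limitation of the /etc/hosts workaround can be illustrated with a small helper that checks whether a hosts-format file maps a given name. This is only a sketch: the function name `hosts_has_entry` and the service host name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if a hosts-format file maps the given host name.
# Names are matched exactly, so an entry for the main domain does not
# cover the sub-domains of services created later.
hosts_has_entry() {
    file="$1"
    name="$2"
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            sub(/#.*/, "")                   # drop trailing comments
            for (i = 2; i <= NF; i++)        # field 1 is the IP address
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Usage:
#   hosts_has_entry /etc/hosts datalab.casd.local && echo "mapped"
#   hosts_has_entry /etc/hosts my-service.datalab.casd.local || echo "not mapped"
```

With only the single line above in /etc/hosts, the first call succeeds and the second fails, which is why every new service would need its own entry.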
3.7 Enable OIDC in Onyxia
You need to have an OIDC service up and running. If not, please follow 06.Setup_oidc.md to install a Keycloak instance. After that, you need to add new configuration to the helm deployment.
You can find a full example in oidc.yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: datalab.casd.local
ui:
  image:
    name: inseefrlab/onyxia-web
  env:
    KEYCLOAK_REALM: casd-onyxia
    KEYCLOAK_CLIENT_ID: onyxia-client
    KEYCLOAK_URL: https://keycloak.datalab.casd.local/auth
api:
  env:
    keycloak.realm: casd-onyxia
    keycloak.disable-trust-manager: "true"
    keycloak.auth-server-url: https://keycloak.datalab.casd.local/auth
    authentication.mode: "openidconnect"
    springdoc.swagger-ui.oauth.clientId: onyxia-client
Then run the command below to apply the new configuration:
helm upgrade onyxia inseefrlab/onyxia -f oidc.yaml
This configures onyxia-api to connect to the Keycloak server for authentication. You have to check two points:
1. If your certificate is self-signed, you need to add keycloak.disable-trust-manager: "true" to ask onyxia-api to disable the certificate check.
2. Make sure onyxia-api is able to connect to https://keycloak.datalab.casd.local/auth. DNS must be set up correctly.
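One way to verify that the Keycloak URL is reachable is to fetch the OpenID Connect discovery document of the realm. The sketch below assumes `curl` is installed; the function names `oidc_discovery_url` and `check_keycloak` are hypothetical. Running it from your workstation only validates DNS and connectivity from that machine, not from inside the onyxia-api pod.

```shell
#!/bin/sh
# Build the OpenID Connect discovery URL of a Keycloak realm.
oidc_discovery_url() {
    base="${1%/}"   # drop one trailing slash, if any
    realm="$2"
    printf '%s/realms/%s/.well-known/openid-configuration\n' "$base" "$realm"
}

# Fetch the discovery document. Returns non-zero if the host cannot be
# resolved, the connection fails, or the server answers with an HTTP error.
# -k skips certificate verification, mirroring keycloak.disable-trust-manager;
# only use it with self-signed certificates in a test setup.
check_keycloak() {
    curl -ksSf --max-time 10 "$(oidc_discovery_url "$1" "$2")" > /dev/null
}

# Usage:
#   check_keycloak https://keycloak.datalab.casd.local/auth casd-onyxia \
#       && echo "Keycloak realm reachable"
```

A failure here usually points to one of the two items above: a DNS name that does not resolve, or a certificate problem if you remove the `-k` option.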