3 Install Onyxia

Once the k8s cluster and its reverse proxy are installed, we can start installing Onyxia.

3.1 What is Onyxia?

Onyxia is a web application that provides a catalog of services and helps users run them on a k8s cluster. For now, the catalog mainly contains data-oriented services (analytics, data visualization, machine learning, etc.). You can find the official git repo of Onyxia here.

Onyxia contains two main components:

  • onyxia-api, in charge of the API of Onyxia (a wrapper around the k8s API that performs the helm installs).
  • onyxia-web, in charge of the UI of Onyxia.

3.2 Why Onyxia?

The services in the Onyxia catalog can be deployed into any k8s cluster, even without Onyxia. But configuring and deploying these services with command-line tools is not very user friendly for data professionals such as analysts and data scientists. Onyxia provides a user-friendly web interface that abstracts away the complexity of managing services in a Kubernetes cluster, so data scientists can focus on data analysis rather than on operating the k8s cluster.

3.3 What services does Onyxia provide?

  • Quick and easy deployment of all services in the catalog (e.g. jupyter, r-studio, mlflow, etc.), done through API calls that run helm install.
  • Access to all courses available on the training page.
  • Flexible resource configuration for all services (e.g. CPU, GPU, RAM, and disk (ephemeral or PVCs)).
  • Possibility to add initialization scripts to the deployments.
  • Possibility to create your own service catalog.
  • Flexible user authentication via Keycloak, which can use various backends (e.g. AD, OpenLDAP, RDBMS, etc.) and provides SSO through OpenID Connect.
  • MinIO for S3 object storage.
  • Vault for secret management.

3.4 Requirements

Onyxia runs inside a k8s cluster like any other service and is deployed via a helm chart, so you can easily deploy an instance of Onyxia with helm install. Before diving into the deployment, several requirements must be met. They are detailed in the following sections of the Onyxia installation.

  • A domain name must be registered with a wildcard DNS record. This is necessary to allow users to access the services deployed on the k8s cluster through the proxy/reverse-proxy (ingress-nginx).
  • Keycloak must be installed and configured with the proper settings to store users in a PostgreSQL database, and to use assumeRoleWithWebIdentity allowing for single sign-on (SSO) with Onyxia, MinIO, and Vault.
  • MinIO must be installed and configured with the correct policy to allow Onyxia to create buckets and objects. (Optional)
  • Vault must be installed and configured with the correct policy to allow Onyxia to view, access, and manage secrets. (Optional)
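
The wildcard DNS requirement can be checked before installing anything. The sketch below is only an illustration: the function name check_wildcard_dns is hypothetical, the domain in the example call is the one used later in this guide, and it assumes getent is available on the machine (as on most Linux systems). It resolves a host name that was never declared explicitly, which should only succeed if a wildcard record answers for it.

```shell
# Hypothetical helper: checks whether an arbitrary sub-domain of the
# given domain resolves, which suggests a wildcard DNS record exists.
check_wildcard_dns() {
  domain="$1"
  # A label that is very unlikely to be declared explicitly.
  probe="wildcard-probe-$$.${domain}"
  if getent hosts "$probe" >/dev/null 2>&1; then
    echo "ok: ${probe} resolves"
  else
    echo "fail: ${probe} does not resolve"
    return 1
  fi
}

# Example call (domain taken from this guide, adapt it to yours):
# check_wildcard_dns datalab.casd.local
```

A failure here does not say which part is wrong (registration, DNS server, or local resolver); it only shows that the wildcard is not visible from the machine running the check.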

3.5 Install Onyxia

The commands below install an instance of Onyxia with the minimum configuration, basic.yaml:

helm repo add inseefrlab https://inseefrlab.github.io/helm-charts
# the host name datalab.casd.local is the domain name that allows us to connect to Onyxia
helm install onyxia inseefrlab/onyxia -f basic.yaml
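
The content of basic.yaml is not reproduced in this section. As a minimal sketch, and assuming it only needs to enable the ingress for the host name used in this guide (the same ingress block that appears in the OIDC example later in this chapter), the file could be created as follows. The values are assumptions to adapt to your own cluster.

```shell
# Assumed minimal values file: enables the ingress and declares the
# host name under which Onyxia is reachable (adapt to your domain).
cat > basic.yaml <<'EOF'
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: datalab.casd.local
EOF
```

After the install, helm status onyxia and kubectl get pods show whether the release was deployed and whether its pods are running.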

3.6 Access Onyxia web UI

Note that all network traffic to the services exposed through the ingress of the k8s cluster is handled by the nginx proxy/reverse-proxy. So you need to send all your requests to the node hosting the worker that runs the nginx service.

For testing purposes, if you don't have a DNS server, you can edit your /etc/hosts to fake it. The IP address must be that of one of the workers on which the ingress is running.

Add the line below to /etc/hosts:

10.50.5.59   datalab.casd.local

To test it, you can run:

curl datalab.casd.local

This workaround is not recommended for production because /etc/hosts does not support wildcards. This means that newly created services whose host names are not in /etc/hosts are not accessible.
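
For a quick manual test of a host name that is not listed in /etc/hosts, curl can be told which IP address to use for that name with its --resolve option. The sketch below assumes the ingress listens on port 80 on the worker; the function name check_via_ingress and the service host name in the example call are hypothetical.

```shell
# Hypothetical helper: sends one HTTP request for the given host name
# to the given ingress IP, without touching /etc/hosts or DNS.
check_via_ingress() {
  host="$1"
  ip="$2"
  # --resolve pins host:80 to the given IP for this request only;
  # -w prints the HTTP status code returned by the ingress.
  curl -sS -o /dev/null -w '%{http_code}\n' \
    --resolve "${host}:80:${ip}" "http://${host}/"
}

# Example call (IP from this guide, service host name is made up):
# check_via_ingress my-service.datalab.casd.local 10.50.5.59
```

This only helps for manual checks. Browsers and the services themselves still need real wildcard resolution.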

See Chapter 4 for installing a wildcard DNS server.

3.7 Enable OIDC in Onyxia

You need to have an OIDC service up and running. If not, please follow 06.Setup_oidc.md to install a Keycloak instance. After that, you need to add the new configuration to the helm deployment.

You can find a full example in oidc.yaml

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: datalab.casd.local
ui:
  image:
    name: inseefrlab/onyxia-web
  env:
    KEYCLOAK_REALM: casd-onyxia
    KEYCLOAK_CLIENT_ID: onyxia-client
    KEYCLOAK_URL: https://keycloak.datalab.casd.local/auth
api:
  env:
    keycloak.realm: casd-onyxia
    keycloak.disable-trust-manager: "true"
    keycloak.auth-server-url: https://keycloak.datalab.casd.local/auth
    authentication.mode: "openidconnect"
    springdoc.swagger-ui.oauth.clientId: onyxia-client

Then run the command below to apply the new configuration:

helm upgrade onyxia inseefrlab/onyxia -f oidc.yaml

This configures onyxia-api to connect to the Keycloak server for authentication. You have to check two points:

  1. If your certificate is self-signed, you need to add keycloak.disable-trust-manager: "true" to make onyxia-api skip the certificate check.
  2. Make sure onyxia-api is able to connect to https://keycloak.datalab.casd.local/auth. DNS must be set up correctly.
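
One way to check the second point is to request the OpenID Connect discovery document of the realm. The sketch below assumes the Keycloak base URL and realm from oidc.yaml above; the function name oidc_discovery_url is hypothetical. The -k flag in the example call skips certificate verification, which mirrors keycloak.disable-trust-manager and is only appropriate for a self-signed test setup.

```shell
# Hypothetical helper: builds the OpenID Connect discovery URL of a
# Keycloak realm from the server base URL and the realm name.
oidc_discovery_url() {
  base="${1%/}"   # drop a trailing slash, if any
  realm="$2"
  printf '%s/realms/%s/.well-known/openid-configuration\n' "$base" "$realm"
}

# Example call:
# curl -k "$(oidc_discovery_url https://keycloak.datalab.casd.local/auth casd-onyxia)"
```

Run from a workstation, this only validates the workstation's DNS. Running the same curl from a pod inside the cluster checks what onyxia-api actually sees.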