JupyterHub Deployment

Spawning Single-User Notebook Servers

Published

March 12, 2024

Modified

June 13, 2024

Components

Disambiguation for the Jupyter ecosystem:

Component         Description
Jupyter Notebook  Specification for .ipynb file …self-hosted graphical user-interface (GUI)
JupyterLab        Web GUI server for Jupyter notebooks …support extensions (self-hosted)
JupyterHub        Multi-user remote access to JupyterLab (as well as Jupyter notebooks)

JupyterHub 1 …multi-user service…

  • …provides remote access to a shared pre-configured Jupyter resource
  • …manages multiple user sessions of interactive computing
  • …connects users with the back-end infrastructure required for their sessions
  • …supports pluggable authentication with several authentication protocols
  • …enables hosting user sessions in a containerized environment

Jupyter server 2 3 …single-user notebook server, started on demand per user

  • …back-end to Jupyter web applications …APIs & REST endpoint
  • …core service …client for kernels
  • …launches & monitors kernels on behalf of the user
  • …one protocol for web apps & kernels

Installation

Basic installation 4

dnf install -y python3 python3-pip nodejs npm
echo 'export PATH=$PATH:/usr/local/bin' > /etc/profile.d/uselocal.sh
python3 -m pip install jupyterhub
npm install -g configurable-http-proxy
# needed if running the notebook servers in the same environment
python3 -m pip install jupyterlab notebook  

Create a dummy user account …JupyterHub authentication configuration 5

# create a dummy user account
groupadd dummy
useradd -g dummy dummy
passwd dummy

# generate default configuration
mkdir {/etc,/srv}/jupyterhub
cd /etc/jupyterhub
jupyterhub --generate-config

# allow user dummy access
echo 'c.Authenticator.allowed_users = {"dummy"}' >> /etc/jupyterhub/jupyterhub_config.py

# start the service in foreground
jupyterhub -f /etc/jupyterhub/jupyterhub_config.py

Architecture

Workflow for starting a single-user Jupyter server:

  1. Users log in to JupyterHub to request a Jupyter notebook server
  2. JupyterHub requests resources from the resource management system
  3. A Spawner starts a Jupyter notebook on the allocated resource
  4. Users connect to the notebook server running on a resource
  5. Users may share the notebook URL with co-workers

JupyterHub sub-systems…

  • Hub
    • …manage user accounts and authentication
    • …uses a Spawner to launch single-user notebook servers
  • Proxy
    • …public-facing component
    • …dynamically routes HTTP requests to the Hub or notebook servers
  • Spawner
    • …responsible for launching a single-user notebook server
    • …different spawners for resources like Docker, Slurm, etc.
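
The dynamic routing done by the Proxy can be pictured as a table of path prefixes, where the longest matching prefix wins and everything else falls through to the Hub. The following is a toy model of that idea only, not the configurable-http-proxy implementation; the class name, the Hub address and the user server address are assumptions made for illustration.

```python
# Toy model of prefix-based routing (illustration only, not the
# configurable-http-proxy code). All addresses are made-up examples.
HUB = "http://127.0.0.1:8081"  # assumed address of the Hub


class RoutingTable:
    def __init__(self, default_target):
        # The default route "/" catches every request without a better match.
        self.routes = {"/": default_target}

    @staticmethod
    def _normalize(path):
        # Use a trailing slash so "/user/alice" cannot match "/user/alicex".
        return path.rstrip("/") + "/"

    def add_route(self, prefix, target):
        self.routes[self._normalize(prefix)] = target

    def remove_route(self, prefix):
        key = self._normalize(prefix)
        if key != "/":  # never drop the default route to the Hub
            self.routes.pop(key, None)

    def resolve(self, path):
        candidate = self._normalize(path)
        matches = [p for p in self.routes if candidate.startswith(p)]
        # The longest matching prefix wins.
        return self.routes[max(matches, key=len)]


table = RoutingTable(HUB)
# Added once a single-user server for "alice" has been spawned.
table.add_route("/user/alice", "http://10.0.0.5:40001")
print(table.resolve("/hub/login"))       # falls through to the Hub
print(table.resolve("/user/alice/lab"))  # goes to alice's notebook server
```

Removing the route again (for example when the server stops) sends requests for that prefix back to the Hub.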

Security

Designed for semi-trusted users

  • …single-user servers are placed in a single domain, behind a proxy
  • …protections are not applied between single-user servers (single security domain)

The security overview 6 describes how to protect users from each other

Spawners

Single-user notebook servers are launched & monitored by a spawner 7

  • …user notebook spawned (started) after login to JupyterHub
  • …server instance owned by the login user (…started by the spawner)
  • The spawner represents an abstract interface 8 able to take three actions…
    • …start a server
    • …poll whether a process is running
    • …stop a server
  • JUPYTERHUB_SERVICE_URL 9 environment variable…
    • …used to pass IP-address and port to the single-user notebook
    • …single-user notebook returns the full URL to JupyterHub for connection
    • every single-user notebook instance requires a dedicated port
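
The three actions and the JUPYTERHUB_SERVICE_URL hand-over can be sketched with a small stand-in class. This is a simplified, synchronous illustration and not the JupyterHub Spawner base class (whose methods are coroutines); the class name ToyProcessSpawner and the placeholder child process are assumptions. The poll convention follows the Spawner documentation: None while the server runs, an integer exit status otherwise.

```python
import os
import subprocess
import sys


class ToyProcessSpawner:
    """Illustrative stand-in for the start / poll / stop contract."""

    def __init__(self, port, ip="127.0.0.1"):
        self.ip = ip
        self.port = port  # every instance needs its own dedicated port
        self.proc = None

    def start(self):
        env = dict(os.environ)
        # Pass IP-address and port to the child via the environment.
        env["JUPYTERHUB_SERVICE_URL"] = f"http://{self.ip}:{self.port}"
        # Placeholder child process; a real spawner would launch
        # jupyterhub-singleuser here instead of a sleeping interpreter.
        self.proc = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"], env=env
        )
        # Return the URL where the server is expected to be reachable.
        return f"http://{self.ip}:{self.port}"

    def poll(self):
        # None: still running; integer: exit status; 0 if never started.
        if self.proc is None:
            return 0
        return self.proc.poll()

    def stop(self):
        if self.proc is not None and self.proc.poll() is None:
            self.proc.terminate()
            self.proc.wait(timeout=10)
```

A caller would invoke start() once, call poll() periodically, and call stop() when the user shuts the server down.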

Remove lingering single-user notebooks with the jupyterhub-idle-culler 10
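
The decision behind idle culling can be sketched as a comparison of each server's last activity against an inactivity timeout. The function below only illustrates that decision under assumed inputs (a mapping of user name to last-activity timestamp); the real service obtains activity data from the Hub and stops the servers through the Hub REST API.

```python
from datetime import datetime, timedelta, timezone


def select_idle_servers(last_activity, timeout_seconds, now=None):
    """Return the users whose server has been idle longer than the timeout.

    last_activity: dict mapping user name -> timezone-aware datetime
    (hypothetical input format chosen for this sketch).
    """
    now = now or datetime.now(timezone.utc)
    limit = timedelta(seconds=timeout_seconds)
    return sorted(
        user for user, seen in last_activity.items() if now - seen > limit
    )


now = datetime(2024, 6, 13, 12, 0, tzinfo=timezone.utc)
activity = {
    "alice": now - timedelta(hours=2),   # idle for two hours
    "bob": now - timedelta(minutes=5),   # recently active
}
print(select_idle_servers(activity, timeout_seconds=3600, now=now))
```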

A selection of different spawner implementations is available…

Spawner            Description
wrapspawner 11     Runtime parametrization of multiple spawners
systemdspawner 12  Spawns single-user notebook servers using systemd
dockerspawner      Launches a single-user notebook in a Docker container
sshspawner         Spawns single-user notebook servers over SSH
batchspawner       Starts single-user notebooks on an HPC cluster resource

JupyterHub Remote Spawn 13

  • …prerequisite …connect with TCP port 443 of JupyterHub service
  • …allows users to spawn a single-user notebook server on any machine
  • …mounts home with FUSE based sshfs …data transfer with SFTP
  • …uses Chisel 14 to set up a TCP tunnel over HTTP secured via SSH

Docker

dockerspawner 15 launches a notebook server using a Docker container

Simple deployment of a Docker based demo environment 16

dnf config-manager \
  --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf install -y git docker-ce docker-ce-cli containerd.io docker-compose-plugin
systemctl enable --now docker
git clone https://github.com/jupyterhub/jupyterhub-deploy-docker
cd jupyterhub-deploy-docker/basic-example
docker pull quay.io/jupyter/base-notebook:latest
docker compose up -d

SSH

sshspawner 17 spawns single-user notebook servers on a remote host over SSH

  • …uses AsyncSSH 18 for SSH login on a remote host
  • …requires per user SSH keys to login to the remote host
  • Script get_port.py 19
    • …executed by the SSHSpawner.remote_random_port
    • …selects unoccupied port on the remote host
    • …returns IP-address & port for a single-user notebook
  • SSHSpawner.start uses returned IP-address & port…
    • …uses option --hub-api-url to set JUPYTERHUB_API_URL
    • …creates a script to be executed with bash -s on the remote
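
Selecting an unoccupied port is commonly done by binding a socket to port 0 and letting the operating system choose. The snippet below is a minimal sketch of that general technique; it is not a copy of get_port.py, and the function name is an assumption.

```python
import socket


def pick_free_port(host=""):
    """Ask the OS for a currently unused TCP port on the given interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the kernel to assign any free port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


print(pick_free_port("127.0.0.1"))
```

The port is only guaranteed to be free at the moment of the check. Another process may claim it before the notebook server binds to it, so a caller should be prepared to retry.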

Batch

batchspawner 20 used to interface with HPC resource management systems…

  • Supports multiple workload management systems (including Slurm)
    • …requires a local client to the resource manager (sbatch, etc.)
    • …or additional component to use SSH to launch notebooks 21
  • MOSS 22 …extension to the batchspawner.SlurmSpawner
    • …allows the user to select Slurm resources to use
    • …presents a web-interface to select Slurm resource options

Install using pip:

pip3 install batchspawner

Configure JupyterHub to use batchspawner for Slurm…

  • …it is required to import batchspawner before using it
  • …details in the batchspawner.SlurmSpawner 23 implementation
  • Essentially an abstraction for interfacing with typical resource managers
    • …including submit commands, job state evaluation and cancellation
    • …launch of a single-user notebook facilitated with a custom job script

# /etc/jupyterhub/jupyterhub_config.py
import batchspawner
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.start_timeout = 7200
c.SlurmSpawner.startup_poll_interval = 5.0
c.SlurmSpawner.http_timeout = 7200
c.SlurmSpawner.batch_script = """#!/bin/bash
#SBATCH --output={{homedir}}/jupyterhub_slurmspawner_%j.log
#SBATCH --job-name=spawner-jupyterhub
#SBATCH --chdir={{homedir}}
#SBATCH --export={{keepvars}}
#SBATCH --get-user-env=L
set -euo pipefail
trap 'echo SIGTERM received' TERM
{{prologue}}
export JUPYTERHUB_SERVICE_URL=http://0.0.0.0:12345
/usr/local/bin/jupyterhub-singleuser
{{epilogue}}
"""

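The abstraction over submit, job state evaluation and cancellation can be pictured as three command templates plus a parser for the state query output. The sketch below is an illustration under assumptions and not the batchspawner code: the command strings use standard Slurm options (sbatch --parsable, squeue -h -j -o, scancel), while the helper names and the simplified state mapping are made up for this example.

```python
import shlex

# Command templates modeled on common Slurm usage (assumed for this sketch).
SUBMIT_CMD = "sbatch --parsable"
QUERY_CMD = "squeue -h -j {job_id} -o '%T %B'"  # state and executing host
CANCEL_CMD = "scancel {job_id}"


def build_command(template, **values):
    """Fill in a command template and split it into an argument list."""
    return shlex.split(template.format(**values))


def parse_job_state(squeue_output):
    """Map the query output to (state, host) with a simplified state model."""
    parts = squeue_output.split()
    if not parts:
        # No output: the job is not known to the scheduler (any more).
        return ("unknown", None)
    state = parts[0]
    if state in ("PENDING", "CONFIGURING"):
        return ("pending", None)
    if state in ("RUNNING", "COMPLETING"):
        host = parts[1] if len(parts) > 1 else None
        return ("running", host)
    return ("unknown", None)


print(build_command(QUERY_CMD, job_id=42))
print(parse_job_state("RUNNING node01"))
```

A spawner built this way runs the submit command with the job script on standard input, polls with the query command until the job is running, and uses the cancel command to stop the server.
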
Footnotes

  1. JupyterHub
    https://jupyterhub.readthedocs.io/en/stable/
    https://github.com/jupyterhub/jupyterhub
    https://discourse.jupyter.org↩︎

  2. Single-user Server, JupyterHub
    https://jupyterhub.readthedocs.io/en/latest/explanation/singleuser.html↩︎

  3. Jupyter Server, Github
    https://github.com/jupyter-server/jupyter_server↩︎

  4. Tutorial Quickstart, JupyterHub Documentation
    https://jupyterhub.readthedocs.io/en/stable/tutorial/quickstart.html↩︎

  5. Authentication and User Basics, JupyterHub Documentation
    https://jupyterhub.readthedocs.io/en/stable/tutorial/getting-started/authenticators-users-basics.html↩︎

  6. Security Overview, JupyterHub Project
    https://jupyterhub.readthedocs.io/en/latest/explanation/websecurity.html#web-security↩︎

  7. JupyterHub Spawner Documentation
    https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html↩︎

  8. JupyterHub Spawner Default Implementation, GitHub
    https://github.com/jupyterhub/jupyterhub/blob/HEAD/jupyterhub/spawner.py↩︎

  9. Note on IPs and ports, JupyterHub Documentation
    https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html#note-on-ips-and-ports↩︎

  10. JupyterHub Idle Culler Service, GitHub
    https://github.com/jupyterhub/jupyterhub-idle-culler↩︎

  11. wrapspawner for JupyterHub, GitHub
    https://github.com/jupyterhub/wrapspawner↩︎

  12. systemdspawner for JupyterHub, GitHub
    https://github.com/jupyterhub/systemdspawner↩︎

  13. Remote Spawn, RWTH Aachen
    https://git.rwth-aachen.de/jupyter/remote-spawn↩︎

  14. Chisel, GitHub
    https://github.com/jpillora/chisel↩︎

  15. Dockerspawner for JupyterHub, GitHub
    https://github.com/jupyterhub/dockerspawner↩︎

  16. JupyterHub Docker Example, GitHub
    https://github.com/jupyterhub/jupyterhub-deploy-docker↩︎

  17. sshspawner NERSC, GitHub
    https://github.com/NERSC/sshspawner↩︎

  18. AsyncSSH: Asynchronous SSH for Python
    https://asyncssh.readthedocs.io/en/latest/index.html↩︎

  19. sshspawner Scripts, NERSC, GitHub
    https://github.com/NERSC/sshspawner/blob/master/scripts/get_port.py↩︎

  20. batchspawner for JupyterHub, GitHub
    https://github.com/jupyterhub/batchspawner↩︎

  21. batchspawner Comet Extension, GitHub
    https://gist.github.com/zonca/55f7949983e56088186e99db53548ded↩︎

  22. JupyterHub MOdular Slurm Spawner, GitHub
    https://github.com/silx-kit/jupyterhub_moss↩︎

  23. Implementation Slurm Spawner, GitHub
    https://github.com/jupyterhub/batchspawner/blob/main/batchspawner/batchspawner.py#L675↩︎