gitea-pages/jupyterhub.md at 205f174ba7e20b8cc6311d03eaf571bbb26baaaf

Spencer Bliven 205f174ba7 Jupyterhub docs: discuss adding #SBATCH options

2022-07-11 14:06:28 +02:00

title	last_updated	sidebar	permalink
Jupyterhub on Merlin	31 July 2019	merlin6_sidebar	/merlin6/jupyterhub.html

Jupyterhub provides Jupyter notebooks that are launched on Merlin cluster nodes and accessed through a web portal.

Accessing Jupyterhub and launching a session

The service is available inside of PSI (or through a VPN connection) at

https://merlin-jupyter.psi.ch:8000

Login: You will be presented with a Login web page for authenticating with your PSI account.
Spawn job: The Spawner Options page allows you to specify the properties (Slurm partition, running time, ...) of the batch job that will run your Jupyter notebook. Once you click the Spawn button, your job is submitted to the Slurm batch system. If the cluster is not overloaded and the requested resources are available, your job will usually start within 30 seconds.
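
If a session takes longer than expected to start, you can check from a Merlin terminal whether the job is still waiting in the queue. The following is a minimal sketch using the standard Slurm squeue command; the helper name show_my_jobs is hypothetical, and the name under which the notebook job appears in the queue is not documented here.

```shell
# List the current user's jobs in the Slurm queue.
# show_my_jobs is a hypothetical helper name used only for this sketch.
show_my_jobs() {
    if command -v squeue >/dev/null 2>&1; then
        # -u restricts the listing to the given user
        squeue -u "${USER:-$(id -un)}"
    else
        echo "squeue not found: run this on a node with the Slurm client tools"
    fi
}

show_my_jobs
```

A job in state PD is still pending; a job in state R is running.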

Jupyter software environments - running different kernels

Your notebooks can run within different software environments which are offered by a number of available Jupyter kernels.

For example, this test installation provides two environments targeted at data science:

tensorflow-1.13.1_py37: contains Tensorflow, Keras, scikit-learn, Pandas, numpy, dask, and dependencies. This environment is stable.
talos_py36: also contains the Talos package. This environment is experimental and subject to updates and changes.

When you create a new notebook you will be asked to specify which kernel you want to use. It is also possible to switch the kernel of a running notebook, but you will lose the state of the current kernel, so you will have to recalculate the notebook cells with this new kernel.

These environments are also available for standard work in a shell session. To activate one in a normal Merlin terminal session, use the module command (see Using Pmodules) to load Anaconda Python, then use the conda command to switch to the desired environment:

module use unstable
module load anaconda/2019.07
conda activate tensorflow-1.13.1_py37

When the anaconda module has been loaded, you can list the available environments by executing

conda info -e
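
Before activating an environment in a script, it can help to confirm that the name actually appears in that listing. The sketch below assumes the usual `conda info -e` output layout (comment lines starting with #, then one environment per line with the name in the first column); env_is_listed is a hypothetical helper, not part of conda.

```shell
# Succeed if the environment name given as $1 appears in `conda info -e`
# output read from standard input. env_is_listed is a hypothetical helper.
env_is_listed() {
    # Skip comment lines; compare the first column with the requested name.
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Usage on Merlin, after loading the anaconda module:
#   conda info -e | env_is_listed talos_py36 && conda activate talos_py36
```

Environments installed under a custom path are listed without a name, so this check only applies to named environments.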

You can find more information on the conda package management tool at its official documentation site: https://conda.io/projects/conda/en/latest/commands.html

Using your own custom made environments with jupyterhub

Python environments can take up a lot of space because of the many dependencies they install. Always install your extra environments to the data area belonging to your account, e.g. /data/user/${USER}/conda-envs

In order for jupyterhub (and jupyter in general) to recognize the provided environment as a valid kernel, make sure that you include the nb_conda_kernels package in your environment. This package provides the necessary activation and the dependencies.

Example:

conda create -c conda-forge -p /data/user/${USER}/conda-envs/my-test-env python=3.7 nb_conda_kernels

After this, your new kernel will be visible as my-test-env inside of your jupyterhub session.
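
If you create several environments, a small wrapper keeps them all under the same data area. This is only a sketch: env_prefix and create_env_command are hypothetical helper names, and the script prints the conda command for review instead of running it.

```shell
# Build the install prefix for a personal environment in the user data area.
# env_prefix and create_env_command are hypothetical helpers for this sketch.
env_prefix() {
    printf '/data/user/%s/conda-envs/%s\n' "${USER:-$(id -un)}" "$1"
}

# Print (not run) the conda command that would create the environment.
# $1 is the environment name, $2 the Python version (defaults to 3.7).
create_env_command() {
    printf 'conda create -c conda-forge -p %s python=%s nb_conda_kernels\n' \
        "$(env_prefix "$1")" "${2:-3.7}"
}

create_env_command my-test-env 3.7
```

Copy the printed line into your terminal once the anaconda module is loaded and the path looks right.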

Requesting additional resources

The Spawner Options page covers the most common options. These are used to create a submission script for the Jupyterhub job and submit it to the Slurm queue. Additional customization is possible through the 'Optional user defined line to be added to the batch launcher script' option. This line is added to the submission script after the other #SBATCH lines. Parameters can be passed to Slurm by starting the line with #SBATCH, as described in Running Slurm Scripts. Some ideas:

Request additional memory

#SBATCH --mem=100G

Request multiple GPUs (gpu partition only)

#SBATCH --gpus=2

Log additional information

hostname; date; echo $USER

Output is found in ~/jupyterhub_batchspawner_<jobid>.log.
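
Since every session writes its own log file, the most recent one is usually the file of interest. The sketch below picks the newest file matching the name pattern above and shows its last lines; latest_spawner_log is a hypothetical helper, and the -nt file test is assumed to be supported by your shell (it is in bash and dash).

```shell
# Print the path of the most recently modified spawner log in a directory
# (defaults to the home directory). latest_spawner_log is a hypothetical helper.
latest_spawner_log() {
    dir="${1:-$HOME}"
    newest=""
    for f in "$dir"/jupyterhub_batchspawner_*.log; do
        [ -e "$f" ] || continue    # the pattern matched no file
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    fi
}

log="$(latest_spawner_log)"
if [ -n "$log" ]; then
    tail -n 20 "$log"
else
    echo "No jupyterhub_batchspawner_*.log file found in $HOME"
fi
```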

Contact

In case of problems or requests, please either submit a PSI Service Now incident containing "Merlin Jupyterhub" as part of the subject, or contact us by mail through merlin-admins@lists.psi.ch.
