title | last_updated | sidebar | permalink |
---|---|---|---|
Jupyterhub on Merlin | 31 July 2019 | merlin6_sidebar | /merlin6/jupyterhub.html |
Jupyterhub provides Jupyter notebooks that are launched on Merlin cluster nodes and can be accessed through a web portal.
Accessing Jupyterhub and launching a session
The service is available inside of PSI (or through a VPN connection) at
https://merlin-jupyter.psi.ch:8000
- Login: You will be presented with a Login web page for authenticating with your PSI account.
- Spawn job: The Spawner Options page allows you to specify the properties (Slurm partition, running time, ...) of the batch job that will run your Jupyter notebook. Once you click the Spawn button, your job is sent to the Slurm batch system. If the cluster is not currently overloaded and the resources you requested are available, your job will usually start within 30 seconds.
Jupyter software environments - running different kernels
Your notebooks can run within different software environments which are offered by a number of available Jupyter kernels.
For example, this test installation provides two environments targeted at data science:
- tensorflow-1.13.1_py37: contains Tensorflow, Keras, scikit-learn, Pandas, numpy, dask, and dependencies. This environment is stable.
- talos_py36: also contains the Talos package. This environment is experimental and subject to updates and changes.
When you create a new notebook you will be asked to specify which kernel you want to use. It is also possible to switch the kernel of a running notebook, but the state of the current kernel is lost, so you will have to re-run the notebook cells with the new kernel.
These environments are also available for standard work in a shell session. You can activate an environment in a normal Merlin terminal session by using the module command (q.v. using Pmodules) to load Anaconda Python, and then using the conda command to switch to the desired environment:
module use unstable
module load anaconda/2019.07
conda activate tensorflow-1.13.1_py37
When the anaconda module has been loaded, you can list the available environments by executing
conda info -e
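Before activating an environment it can help to check that it is actually listed. The following is a minimal sketch, not part of the Merlin installation: `has_conda_env` is a hypothetical helper name, and it assumes the usual `conda info -e` layout in which header lines start with `#` and the first column of every other line is the environment name.

```shell
#!/bin/bash
# has_conda_env NAME (hypothetical helper): read `conda info -e` style
# output on stdin and succeed only if NAME appears in the first column.
has_conda_env() {
    awk -v name="$1" '
        /^#/ { next }                # skip the "# conda environments:" header
        $1 == name { found = 1 }     # first column is assumed to be the env name
        END { exit !found }          # exit status 0 if found, 1 otherwise
    '
}

# Intended use after `module load anaconda/2019.07`:
#   conda info -e | has_conda_env talos_py36 && conda activate talos_py36
```

The helper only parses text, so it can be tried on saved output as well as on a live `conda info -e` call.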
You can get more info on the use of the conda package management tool at its official [documentation site](https://conda.io/projects/conda/en/latest/commands.html).
Using your own custom made environments with jupyterhub
Python environments can take up a lot of space due to the many dependencies that will be installed. You should always install your extra environments to the data area belonging to your account, e.g. /data/user/${USER}/conda-envs.
In order for jupyterhub (and jupyter in general) to recognize the provided environment as a valid kernel, make sure that you include the nb_conda_kernels package in your environment. This package provides the necessary activation and the dependencies.
Example:
conda create -c conda-forge -p /data/user/${USER}/conda-envs/my-test-env python=3.7 nb_conda_kernels
After this, your new kernel will be visible as my-test-env inside of your jupyterhub session.
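If you create several environments, a small wrapper can keep them all under the data area and always include nb_conda_kernels. This is only a sketch: the function names `conda_env_prefix` and `conda_create_cmd` are hypothetical, the default Python version of 3.7 is simply copied from the example above, and the command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/bash
# conda_env_prefix NAME (hypothetical helper): print the install prefix
# for a personal environment under the per-user data area.
conda_env_prefix() {
    printf '/data/user/%s/conda-envs/%s\n' "${USER}" "$1"
}

# conda_create_cmd NAME [PYTHON_VERSION] (hypothetical helper): print,
# but do not run, the conda command that creates the environment.
# nb_conda_kernels is always included so that the kernel is recognized.
conda_create_cmd() {
    local name="$1" pyver="${2:-3.7}"
    printf 'conda create -c conda-forge -p %s python=%s nb_conda_kernels\n' \
        "$(conda_env_prefix "${name}")" "${pyver}"
}

# Intended use:
#   conda_create_cmd my-test-env            # review the printed command
#   du -sh "$(conda_env_prefix my-test-env)"  # check disk usage afterwards
```

Printing the command instead of running it is a deliberate choice here, since environment creation can download a large amount of data.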
Requesting additional resources
The Spawner Options page covers the most common options. These are used to create a submission script for the jupyterhub job and submit it to the Slurm queue. Additional customization can be implemented using the 'Optional user defined line to be added to the batch launcher script' option. This line is added to the submission script after the other #SBATCH lines. Parameters can be passed to Slurm by starting the line with #SBATCH, as in Running Slurm Scripts. Some ideas:
Request additional memory
#SBATCH --mem=100G
Request multiple GPUs (gpu partition only)
#SBATCH --gpus=2
Log additional information
hostname; date; echo $USER
Output is found in ~/jupyterhub_batchspawner_<jobid>.log.
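Since each session writes its own log file, finding the newest one by hand can be tedious. The sketch below assumes only the file name pattern given above; `latest_jupyter_log` is a hypothetical helper name, and it picks the log by modification time rather than by job id.

```shell
#!/bin/bash
# latest_jupyter_log (hypothetical helper): print the most recently
# modified batch spawner log in the home directory.
# Returns a non-zero status if no such log exists.
latest_jupyter_log() {
    local newest
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "${HOME}"/jupyterhub_batchspawner_*.log 2>/dev/null | head -n 1)
    [ -n "${newest}" ] && printf '%s\n' "${newest}"
}

# Intended use:
#   tail -n 20 "$(latest_jupyter_log)"
```

If the extra line in the Spawner Options was `hostname; date; echo $USER`, its output should appear in that file.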
Contact
In case of problems or requests, please either submit a PSI Service Now incident containing "Merlin Jupyterhub" as part of the subject, or contact us by mail through merlin-admins@lists.psi.ch.