# Jupyterhub Troubleshooting

In case of problems or requests, please either submit a **[PSI Service
Now](https://psi.service-now.com/psisp)** incident containing *"Merlin
Jupyterhub"* as part of the subject, or contact us by mail through
<merlin-admins@lists.psi.ch>.
## General steps for troubleshooting

### Investigate the Slurm output file

Your Jupyterhub session runs as a normal batch job on the cluster, and each
launch creates a Slurm output file in your *home* directory named like
`jupyterhub_batchspawner_{$JOBID}.log`, where the `$JOBID` part is the Slurm job
ID of your job. After a failed launch, investigate the contents of that file.
An error message can usually be found towards the end of the file, often
including a Python traceback.
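As a quick way to carry out this check, the following sketch prints the end of the most recently modified spawner log. The function name `latest_jupyterhub_log` and the 40-line tail are illustrative choices, not part of the Merlin setup; the glob only relies on the `jupyterhub_batchspawner_*.log` naming described above.

```bash
# Hypothetical helper: print the path of the newest spawner log in a directory
# (defaults to $HOME). Prints nothing if no matching log exists.
latest_jupyterhub_log() {
    local dir="${1:-$HOME}" newest="" f
    for f in "$dir"/jupyterhub_batchspawner_*.log; do
        [ -e "$f" ] || continue          # the glob matched nothing
        # keep the file with the most recent modification time
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    [ -z "$newest" ] || printf '%s\n' "$newest"
}

log="$(latest_jupyterhub_log)"
if [ -n "$log" ]; then
    echo "== last 40 lines of $log =="
    tail -n 40 "$log"                    # errors are usually near the end
else
    echo "no jupyterhub_batchspawner_*.log found in $HOME"
fi
```

Selecting by modification time rather than by job ID avoids assumptions about how job IDs are ordered.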
### Investigate Python environment interferences

Jupyterhub just runs a Jupyter notebook executable as your user inside the
batch job. A frequent source of errors is a user's local Python environment
settings getting mixed up with the environment that Jupyter needs to launch.
Typical causes are:

- setting `PYTHONPATH` inside `~/.bash_profile` or any other startup script
- having installed packages to your local user area (e.g. using
  `pip install --user <some-package>`). Such installations can interfere with
  the environment offered by the `module` system on our cluster (based on
  anaconda). You can list such packages by executing `pip list --user`.
  They are usually located in `~/.local/lib/pythonX.Y/...`.

You can investigate the launching of a notebook interactively by logging in to
Merlin6 and running a jupyter command in the correct environment.
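```bash
# Make the unstable modules visible to 'module load'
module use unstable
module load anaconda/2019.07
# Activate the environment in which the jupyter command should run
conda activate jupyterhub-1.0.0_py36
# Print the config, data and runtime directories that jupyter searches
jupyter --paths
```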
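Alongside the interactive launch test above, it may help to check for the two interference sources listed earlier. The following is a minimal sketch, assuming a bash shell. `check_python_interference` is a hypothetical helper name, and the list of startup files it searches is an assumption that may not cover every script you source.

```bash
# Hypothetical helper: report settings that commonly interfere with Jupyter.
check_python_interference() {
    # 1. PYTHONPATH set in the current shell (e.g. by a startup script)
    if [ -n "${PYTHONPATH:-}" ]; then
        echo "PYTHONPATH is set: $PYTHONPATH"
    else
        echo "PYTHONPATH is not set"
    fi

    # 2. Startup scripts that mention PYTHONPATH (file list is an assumption)
    grep -Hn "PYTHONPATH" "$HOME/.bash_profile" "$HOME/.bashrc" \
        "$HOME/.profile" 2>/dev/null || true

    # 3. Packages from 'pip install --user' end up below ~/.local/lib/pythonX.Y
    local found=0 d
    for d in "$HOME"/.local/lib/python*; do
        if [ -d "$d" ]; then
            echo "User-level Python directory: $d"
            found=1
        fi
    done
    [ "$found" -eq 1 ] || echo "No user-level Python directories in $HOME/.local/lib"
}

check_python_interference
```

If a user-level directory is reported, `pip list --user` shows which packages it contains.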
## Known Problems and workarounds

### Spawner times out

If the cluster is very full, it may be difficult to launch a session, and the
spawner can time out while your job waits in the queue. We always reserve some
slots for interactive Jupyterhub use, but these slots may already be taken, or
the resources you requested may currently not be available.

Inside a Merlin6 terminal shell, you can run standard Slurm commands like
`sinfo` and `squeue` to get an overview of how full the cluster is.
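A minimal sketch of such a check, assuming the standard Slurm client options `sinfo -s`, `squeue -u` and `squeue --start`. The wrapper name `cluster_overview` is hypothetical and not something provided on Merlin6.

```bash
# Hypothetical wrapper: compact overview of cluster load and of your own jobs.
cluster_overview() {
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "Slurm client tools not found; run this in a Merlin6 terminal" >&2
        return 1
    fi
    local user="${USER:-$(id -un)}"
    # One summary line per partition, including allocated/idle node counts
    sinfo -s
    # Your own jobs; pending jobs show the reason they are waiting
    squeue -u "$user"
    # Slurm's estimated start times for your pending jobs
    squeue -u "$user" --start
}
```

Calling `cluster_overview` in a Merlin6 terminal then shows whether your session job is still pending. The start times reported by `squeue --start` are only estimates.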
### Your user environment is not among the kernels offered for choice

Refer to our documentation about [using your own custom-made
environments with jupyterhub](jupyterhub.md).
### Cannot save notebook - *xsrf argument missing*

You cannot save your notebook anymore and you get this error:

```text
'_xsrf' argument missing from POST
```
This issue occurs very rarely. The following workaround exists:

Go to the Jupyterhub file browsing window and open another
notebook using the same kernel in another browser window. The issue
should then go away. For more information, refer to [this GitHub
thread](https://github.com/nteract/hydrogen/issues/922#issuecomment-405456346).
<!-- ## Error HTTP 500 when starting the spawner -->

<!-- The spawner screen shows after launching an error message like the following: -->
<!-- `Internal server error (Spawner failed to start [status=..]. The logs may contain details)` -->