Anaconda
For all python-based applications and analysis scripts we use virtual environments administered with conda/mamba.
Activate base environment
To bootstrap the base environment as a starting point, run source /sf/cristallina/applications/conda/envs/miniconda/bin/activate. With the base environment loaded, we can initialize our shell using conda init so that this environment is available at the next login.
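The bootstrap described above as one command sequence (a sketch using the paths given above; note that conda init appends an initialization block to your shell startup file):

```shell
# One-off bootstrap: load the shared base environment
source /sf/cristallina/applications/conda/envs/miniconda/bin/activate

# Make conda available automatically at the next login
# (this modifies your ~/.bashrc)
conda init bash
```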
Cristallina environments
All conda environments for cristallina are located at /sf/cristallina/applications/conda/envs. Their respective descriptions are in /sf/cristallina/applications/conda/env_specs, which is also tracked in git.psi.ch. At the moment there is no automatic tracking of environment changes into updated specifications.
The main environments are:
- slic (/sf/cristallina/applications/conda/envs/slic)
- analysis_edge (/sf/cristallina/applications/conda/envs/analysis_edge)
used for instrument control and data analysis, respectively. We try to track recent upstream packages, but please only update if you ensure a working state afterwards and document the changes.
To be able to use the shortcut for activation, e.g. conda activate slic, the environment directory needs to be added to your .condarc file as follows:
envs_dirs:
- /sf/cristallina/applications/conda/envs
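Instead of editing .condarc by hand, the same entry can be added with conda's config subcommand (a one-liner sketch; it appends to envs_dirs in your ~/.condarc):

```shell
# Register the shared environment directory
conda config --append envs_dirs /sf/cristallina/applications/conda/envs

# Verify the result
conda config --show envs_dirs
```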
Temporary env fix
For the moment, we create new environments with Sven's conda installation, which we load with source /sf/daq/source-conda. We currently save these new environments in /sf/cristallina/applications/it/envs/
To create a new env there from a yml file, we use mamba env create --prefix ./{environment name} -f /sf/cristallina/applications/conda/env_specs/{environment yaml}.yml
To use these new envs, one needs to add /sf/cristallina/applications/it/envs/ to the envs_dirs in ~/.condarc
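The temporary workflow above, sketched end to end; myenv and myenv.yml are placeholder names, not real environments:

```shell
# Load Sven's conda installation
source /sf/daq/source-conda

# Create the environment in the temporary location
cd /sf/cristallina/applications/it/envs/
mamba env create --prefix ./myenv \
    -f /sf/cristallina/applications/conda/env_specs/myenv.yml

# Make the new location known to conda (only needed once)
conda config --append envs_dirs /sf/cristallina/applications/it/envs/

conda activate myenv
```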
Sometimes, if one has installed multiple environments, mamba does not solve the new env from the yaml, or fails to install with a message like File not valid: SHA256 doesn't match expectation .... Locally cached packages in the home directory are usually the cause. In that case it is best to clear the package caches with mamba clean -a and possibly also delete the cached packages in ~/.mamba/pkgs/
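The cache cleanup as commands (a sketch; the ~/.mamba/pkgs path may differ depending on your setup):

```shell
# Remove index caches, lock files, unused packages and tarballs
mamba clean -a -y

# If stale cached packages remain, remove the cache directory as well
rm -rf ~/.mamba/pkgs
```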
Bitshuffle
Only specific bitshuffle builds work well with specific python versions, so they need to be stated explicitly when installing. For example, python=3.12 needs bitshuffle=0.5.2=py312h5fdea32_5 to work well. When you want to update python, check /sf/daq/config/env-yml for the ymls of other beamlines; maybe somebody has already done the work of finding out which build is good.
You can also check whether your bitshuffle works by running /sf/daq/bin/shitbuffle with your environment activated.
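Pinning the full build string looks like this (a sketch; myenv is a placeholder environment name, the pin is the example from above):

```shell
# Install python together with the matching bitshuffle build
mamba install -n myenv -c conda-forge \
    python=3.12 bitshuffle=0.5.2=py312h5fdea32_5

# Sanity check afterwards with the environment activated
conda activate myenv
/sf/daq/bin/shitbuffle
```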
Keeping things up-to-date
As the complete ecosystem, both external python packages and internal ones (e.g. sf_datafiles, jungfrau-utils, ...), moves rather quickly, we want to track upstream bug fixes and changes in a timely manner.
This requires a bit of careful work though, because we generally use a rather large set of interdependent packages whose compatibility is not always guaranteed. Fortunately the toolchain around conda, mamba and conda-forge largely takes care of this. If incompatibilities arise, most upstream maintainers are also happy to accept pull requests for small compatibility improvements rather quickly.
So a good compromise is to check monthly whether there are only small updates (e.g. from numpy 1.26.1 to numpy 1.26.2) or larger changes (e.g. from numpy 1.26.2 to numpy 2.0). Small updates we can simply perform; for larger updates we first generate a snapshot of the environment (conda env export > this_environment.yml) to be able to go back. Then we perform the update and check for incompatibilities, i.e. whether the basic components still run as they should. For now these snapshots are saved at /sf/cristallina/applications/conda/env_specs/ .
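The snapshot-then-update cycle as commands (a sketch; the date-suffixed snapshot filename is only a suggested convention, not an existing one):

```shell
# 1. Snapshot the current state so we can go back
conda activate analysis_edge
conda env export > /sf/cristallina/applications/conda/env_specs/analysis_edge_$(date +%Y-%m-%d).yml

# 2. Perform the update, then test the basic components
mamba update -n analysis_edge --all -c conda-forge

# 3. If something breaks, remove the environment and
#    recreate it from the snapshot:
# mamba env remove -n analysis_edge
# mamba env create -n analysis_edge -f analysis_edge_<date>.yml
```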
In the past, difficult candidates were jungfrau-utils (because it requires bitshuffle), larger jupyter updates (which can cause difficulties with extensions), jupyter-collaboration (which still has rough edges), and other packages with large version changes. Don't hesitate though: many more bugs get fixed than are newly introduced, so it is almost always worth keeping up-to-date.
Commands
To update we use mamba as a faster solver for conda:
mamba update -n analysis_edge --all -c conda-forge
File privileges
By default, newly created files are read-only for everyone else. To allow other people in the group (e.g. the p-group in RA) to write, add umask 0002 to ~/.bashrc and apply it with source ~/.bashrc. This does not change files that were already created; for those one needs to chmod them (e.g. chmod 777 filename for everyone).
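The effect of the umask can be checked directly; this small demonstration is generic and assumes nothing Cristallina-specific:

```shell
# Show how umask affects the permissions of newly created files
tmpdir=$(mktemp -d)
cd "$tmpdir"

umask 0002                      # group members may write new files
touch group_writable
stat -c '%a' group_writable     # prints 664 (rw-rw-r--)

umask 0022                      # common default: group is read-only
touch group_readonly
stat -c '%a' group_readonly     # prints 644 (rw-r--r--)
```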
Available space
The /sf/cristallina drive only has a 250GB quota (including the backups, which take some time to appear!). Having too many environments or saving other large data there can therefore cause issues. To check the currently available space: df -h | grep cristallina .
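To find out which environments take the most space, a per-directory summary can help (a sketch; adjust the path if your environments live elsewhere):

```shell
# Size of each environment, smallest to largest
du -sh /sf/cristallina/applications/conda/envs/* | sort -h
```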