# Alphafold

Alphafold consists of three parts:

1. A conda environment containing the dependencies
2. The alphafold module itself, containing the current code and submission scripts
3. The database

## Database

All the download scripts work from merlin except the pdb-mmcif script, which uses rsync. The port provided by alphafold is closed by PSI, and the US mirror does not work nicely. An alternative that works:

`rsync -rlpt -v -z --info=progress2 --delete rsync.ebi.ac.uk::pub/databases/pdb/data/structures/divided/mmCIF/ $DIR`

Tip: Make sure to use tmux sessions for the downloads.

Tip: Double check the read permissions for users after copying/downloading the database. This was causing errors last time!

## Conda Environment

Alphafold was installed based on Spencer's instructions from older installs and the original git repo.

Change: There is no central conda env anymore, but rather a version-based conda env setup.

The conda env should be installed from the environment.yml file, which combines conda-forge, bioconda and pip installations. Unfortunately, no environment.yml file has been provided by alphafold/DeepMind so far.

Miniconda is used for the installation. Using the central conda installation might cause problems (openAFS hardlink issues); if the central version is used, make sure to install from an openAFS host (e.g. pmod7). Also, the central conda is very old and may need to be updated.

After creating the env from the yml file, jax and jaxlib need to be installed into the conda env separately; so far this does not work directly from the environment.yml file. Also, the original git repo currently contains a lot of contradicting descriptions concerning the jaxlib versions.
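Because jax and jaxlib are installed by hand after the env is created, it can help to confirm which versions actually ended up in the active env. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is hypothetical and not part of alphafold. Run it with the Python interpreter of the activated conda env.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the interpreter running this script.
            versions[name] = None
    return versions


# Report the two packages that have to be installed manually.
for name, version in installed_versions(["jax", "jaxlib"]).items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

A `NOT INSTALLED` line for either package suggests the manual install step was skipped or went into a different env.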
```
# OLD VERSIONS
conda create --name alphafold python==3.8
conda update -n base conda
source miniconda3/etc/profile.d/conda.sh
conda activate alphafold
conda install -y -c conda-forge openmm==7.5.1 cudnn==8.2.1.32 cudatoolkit==11.0.3 pdbfixer==1.7
conda install -y -c bioconda hmmer==3.3.2 hhsuite==3.3.0 kalign2==2.04
pip install absl-py==0.13.0 biopython==1.79 chex==0.0.7 dm-haiku==0.0.4 \
    dm-tree==0.1.6 immutabledict==2.0.0 jax==0.2.14 ml-collections==0.1.0 \
    numpy==1.19.5 scipy==1.7.0 tensorflow==2.5.0 pandas==1.3.4
pip install --upgrade jax jaxlib==0.1.69+cuda111 \
    -f https://storage.googleapis.com/jax-releases/jax_releases.html
```

NEW VERSION (2.3.2, current state): create the conda env from the environment.yml file, with the following content:

```
channels:
  - pytorch
  - conda-forge
  - defaults
  - anaconda
  - bioconda
dependencies:
  - python==3.10
  - pip
  - openmm==7.7.0
  - cudnn  # Change version if not compatible with current system
  - cudatoolkit
  - pdbfixer
  - hmmer==3.4
  - hhsuite==3.3.0
  - kalign2==2.04
  - pip:
    - absl-py==1.0.0
    - biopython==1.79
    - chex==0.0.7
    - dm-haiku==0.0.10
    - immutabledict==2.0.0
    - ml-collections==0.1.0
    - numpy==1.24.3
    - scipy==1.11.1
    - tensorflow-cpu==2.13.0
    - jax==0.4.14
    - pandas==2.0.3
    - dm-tree==0.1.8
```

## Alphafold Code

In the file run_alphafold.py, the default of the flag --use_gpu_relax needs to be set to True. So far this is done manually! Not sure if this is really necessary.

```
# FROM:
flags.DEFINE_boolean('use_gpu_relax', None , 'Whether to relax on GPU. '
# TO:
flags.DEFINE_boolean('use_gpu_relax', True, 'Whether to relax on GPU. '
```

## Alphafold module

Add the version to files/variants. The version number should match a GitHub tag (e.g. `v2.0.1`) or else have the commit hash as `$V_RELEASE`.

As admin user:

```
cd MX/alphafold
./build
```

## Testing

Here's an example sequence:

```
mkdir example
cd example
cat > query.fasta <<EOF
>dummy_sequence
GWSTELEKHREELKEFLKKEGITNVEIRIDNGRLEVRVEGGTERLKRFLEELRQKLEKKGYTVDIKIE
EOF
module use MX unstable
module load alphafold/2.1.1
sbatch alphafold_merlin.sh query.fasta
```
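
Since unreadable database files caused errors before (see the tip in the database section), a quick permission check before submitting the test job may save a failed run. The following is a minimal sketch, not part of alphafold: the function name `find_unreadable` is hypothetical, and it assumes that users reach the database through the "other" permission bits. If access is granted through a group instead, the checked bits would need to be adjusted.

```python
import os
import stat


def find_unreadable(root):
    """Return sorted paths below root that other users cannot read.

    Files need the world-read bit; directories need world-read and
    world-execute so that they can be listed and entered.
    """
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.stat(path).st_mode
            except OSError:
                # Broken symlink or entry we cannot stat: report it as well.
                problems.append(path)
                continue
            required = stat.S_IROTH
            if stat.S_ISDIR(mode):
                required |= stat.S_IXOTH
            if mode & required != required:
                problems.append(path)
    return sorted(problems)


# Example call (the path is a placeholder for the database directory):
# for path in find_unreadable("/path/to/database"):
#     print(path)
```

An empty result means every file and directory below the given root carries the checked permission bits; the root directory itself is not checked.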