AlphaFold
AlphaFold is an artificial intelligence (AI) system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment.
License
This is not an officially supported Google product.
Copyright 2022 DeepMind Technologies Limited.
AlphaFold Code License
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Installation
AlphaFold is available on the cluster either as a module or as a Singularity container.
The databases are located in /ceph/hpc/software/alphafold.
Singularity containers are available in the standard directory:
ls /ceph/hpc/software/containers/singularity/images/alpha*
Output:
/ceph/hpc/software/containers/singularity/images/alphafold-2.2.4.sif
/ceph/hpc/software/containers/singularity/images/alphafold-2.3.2.sif
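When several versions are installed, a script can pick the newest image instead of hard-coding a file name. The sketch below assumes the images follow the alphafold-<version>.sif naming shown in the listing; `latest_alphafold_sif` is a hypothetical helper, not part of the cluster tooling.

```shell
#!/bin/bash
# Hypothetical helper: print the alphafold-*.sif file with the highest
# version number found directly inside the given directory.
latest_alphafold_sif() {
    local dir="$1"
    # sort -V (GNU coreutils) compares version strings numerically,
    # so 2.10.0 sorts after 2.9.0 rather than before it.
    find "$dir" -maxdepth 1 -name 'alphafold-*.sif' | sort -V | tail -n 1
}

# Usage on the cluster (directory taken from the listing above):
# SING=$(latest_alphafold_sif /ceph/hpc/software/containers/singularity/images)
```

The function prints nothing when no image matches, so a caller may want to check that the result is non-empty before using it.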
Example with a container:
export SING=/ceph/hpc/software/containers/singularity/images/alphafold-2.2.4.sif
export ALPHAFOLD_DATA_DIR=/ceph/hpc/software/alphafold
singularity run --env TF_FORCE_UNIFIED_MEMORY=1,XLA_PYTHON_CLIENT_MEM_FRACTION=4.0,OPENMM_CPU_THREADS=14,ALPHAFOLD_DATA_DIR=/ceph/hpc/software/alphafold \
-B /ceph/hpc/software/alphafold \
--pwd /app/alphafold \
--nv $SING \
--uniprot_database_path=${ALPHAFOLD_DATA_DIR}/uniprot/uniprot.fasta \
--pdb_seqres_database_path=${ALPHAFOLD_DATA_DIR}/pdb_seqres/pdb_seqres.txt \
--uniref90_database_path=${ALPHAFOLD_DATA_DIR}/uniref90/uniref90.fasta \
--mgnify_database_path=${ALPHAFOLD_DATA_DIR}/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=${ALPHAFOLD_DATA_DIR}/pdb_mmcif/data_dir \
--obsolete_pdbs_path=${ALPHAFOLD_DATA_DIR}/pdb_mmcif/obsolete.dat \
--bfd_database_path=${ALPHAFOLD_DATA_DIR}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata \
--pdb70_database_path=${ALPHAFOLD_DATA_DIR}/pdb70 \
--uniclust30_database_path=${ALPHAFOLD_DATA_DIR}/uniclust30/uniclust30_2018_08 \
--use_gpu_relax=True \
--fasta_paths=/ceph/hpc/home/user/af_test/actn.fasta \
--max_template_date=2020-05-14 \
--model_preset=monomer \
--data_dir=$ALPHAFOLD_DATA_DIR \
--db_preset=full_dbs \
--output_dir=$PWD "$@"
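Because the command above passes many database paths, a mistyped path is usually only discovered after the job has started. A minimal pre-flight check, assuming the paths can be tested with a plain existence test, could look like the following; `check_paths` is a hypothetical helper introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper: report every argument that does not exist on disk.
# Returns 0 if all paths exist, 1 if at least one is missing.
check_paths() {
    local status=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage (paths taken from the container example):
# check_paths "$ALPHAFOLD_DATA_DIR/uniref90/uniref90.fasta" \
#             "$ALPHAFOLD_DATA_DIR/mgnify/mgy_clusters_2018_12.fa" \
#             "$ALPHAFOLD_DATA_DIR/pdb_mmcif/obsolete.dat" || exit 1
```

Some arguments may be database prefixes rather than files or directories, so the check is best limited to paths that are known to exist as such.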
The same command as an SBATCH script:
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres=gpu:4
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 10:00:00
export SING=/ceph/hpc/software/containers/singularity/images/alphafold-2.2.4.sif
export ALPHAFOLD_DATA_DIR=/ceph/hpc/software/alphafold
singularity run --env TF_FORCE_UNIFIED_MEMORY=1,XLA_PYTHON_CLIENT_MEM_FRACTION=4.0,OPENMM_CPU_THREADS=14,ALPHAFOLD_DATA_DIR=/ceph/hpc/software/alphafold \
-B /ceph/hpc/software/alphafold \
--pwd /app/alphafold \
--nv $SING \
--uniprot_database_path=${ALPHAFOLD_DATA_DIR}/uniprot/uniprot.fasta \
--pdb_seqres_database_path=${ALPHAFOLD_DATA_DIR}/pdb_seqres/pdb_seqres.txt \
--uniref90_database_path=${ALPHAFOLD_DATA_DIR}/uniref90/uniref90.fasta \
--mgnify_database_path=${ALPHAFOLD_DATA_DIR}/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=${ALPHAFOLD_DATA_DIR}/pdb_mmcif/data_dir \
--obsolete_pdbs_path=${ALPHAFOLD_DATA_DIR}/pdb_mmcif/obsolete.dat \
--bfd_database_path=${ALPHAFOLD_DATA_DIR}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_a3m.ffdata \
--pdb70_database_path=${ALPHAFOLD_DATA_DIR}/pdb70 \
--uniclust30_database_path=${ALPHAFOLD_DATA_DIR}/uniclust30/uniclust30_2018_08 \
--use_gpu_relax=True \
--fasta_paths=/ceph/hpc/home/user/af_test/actn.fasta \
--max_template_date=2020-05-14 \
--model_preset=monomer \
--data_dir=$ALPHAFOLD_DATA_DIR \
--db_preset=full_dbs \
--output_dir=$PWD "$@"
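The SBATCH script requests 8 cores (`-c 8`) while OPENMM_CPU_THREADS is fixed at 14. One way to keep the two consistent is to derive the thread count from the allocation. The sketch below relies on SLURM_CPUS_PER_TASK, which Slurm sets when `-c`/`--cpus-per-task` is given; the helper names `openmm_threads` and `af_env` are hypothetical.

```shell
#!/bin/bash
# Print the number of CPU threads to use: the Slurm allocation if present,
# otherwise 1 (for example when the script is run outside a job).
openmm_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Build the comma-separated value for `singularity run --env`,
# reusing the settings from the script above with a derived thread count.
af_env() {
    echo "TF_FORCE_UNIFIED_MEMORY=1,XLA_PYTHON_CLIENT_MEM_FRACTION=4.0,OPENMM_CPU_THREADS=$(openmm_threads)"
}

# Usage inside the SBATCH script:
# singularity run --env "$(af_env)" -B /ceph/hpc/software/alphafold ...
```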
Available modules:
module spider Alpha
...
Output:
----- /cvmfs/sling.si/modules/el7/modules/all ----------------------
AlphaFold/2.1.2-foss-2021a
AlphaFold/2.2.2-foss-2021a-CUDA-11.3.1
AlphaFold/2.3.0-foss-2021b-CUDA-11.4.1
AlphaFold/2.3.4-foss-2022a-CUDA-11.7.0-ColabFold (D)
Example with a module for the GPU partition:
```shell
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres=gpu:4
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 10:00:00
module load AlphaFold/2.2.2-foss-2021a-CUDA-11.3.1
alphafold \
--data_dir=/ceph/hpc/software/alphafold/ \
--fasta_paths=$(pwd)/sample.fasta \
--output_dir=$(pwd) \
--model_preset=multimer \
--db_preset=full_dbs \
--use_gpu_relax=True \
--max_template_date=2020-06-05
```
Example with a module for the CPU partition:
#!/bin/bash
#SBATCH -p cpu
#SBATCH -N 1
#SBATCH -c 8
#SBATCH -t 10:00:00
module load AlphaFold/2.1.2-foss-2021a
alphafold \
--data_dir=/ceph/hpc/software/alphafold/ \
--fasta_paths=$(pwd)/sample.fasta \
--output_dir=$(pwd) \
--model_preset=multimer \
--db_preset=full_dbs \
--use_gpu_relax=False \
--max_template_date=2020-06-05
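The `--model_preset` value should match the input: the multimer preset is meant for a FASTA file with several sequences, the monomer preset for a single one. A quick check before submitting, assuming each record starts with a `>` header line, might look like this; `count_fasta_records` and `suggest_preset` are hypothetical helpers, and the one-versus-many rule is only a rule of thumb.

```shell
#!/bin/bash
# Count FASTA records by counting header lines that start with '>'.
count_fasta_records() {
    # grep -c exits non-zero when the count is 0; '|| true' keeps the
    # function usable under `set -e` while still printing the count.
    grep -c '^>' "$1" || true
}

# Rule of thumb (an assumption, not an AlphaFold feature):
# more than one record suggests multimer, otherwise monomer.
suggest_preset() {
    local n
    n=$(count_fasta_records "$1")
    if [ "$n" -gt 1 ]; then
        echo multimer
    else
        echo monomer
    fi
}

# Usage:
# alphafold --model_preset="$(suggest_preset sample.fasta)" ...
```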