Evaluation Guide

We evaluate generated structures on three metrics: designability, diversity, and novelty. This guide explains how to compute them from the inference output.

Install tools

We use Foldseek in our evaluation. Follow the instructions to install it.

Install FoldSeek

FoldSeek is used to evaluate novelty by searching for similar proteins. GitHub repo: https://github.com/steineggerlab/foldseek

To start, follow the installation instructions on the GitHub page. Each Linux build below lists a check command: if avx2 appears in its output, your device supports AVX2. The AVX2 build is recommended if your device supports both AVX2 and SSE2.

# Linux AVX2 build (check using: cat /proc/cpuinfo | grep avx2)
wget https://mmseqs.com/foldseek/foldseek-linux-avx2.tar.gz; tar xvzf foldseek-linux-avx2.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Linux SSE2 build (check using: cat /proc/cpuinfo | grep sse2)
wget https://mmseqs.com/foldseek/foldseek-linux-sse2.tar.gz; tar xvzf foldseek-linux-sse2.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Linux ARM64 build
wget https://mmseqs.com/foldseek/foldseek-linux-arm64.tar.gz; tar xvzf foldseek-linux-arm64.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# MacOS
wget https://mmseqs.com/foldseek/foldseek-osx-universal.tar.gz; tar xvzf foldseek-osx-universal.tar.gz; export PATH=$(pwd)/foldseek/bin/:$PATH

# Conda installer (Linux and macOS)
conda install -c conda-forge -c bioconda foldseek
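The CPU check in the comments above can also be scripted. The following is a minimal sketch, assuming a Linux-style /proc/cpuinfo; the function name pick_foldseek_build is hypothetical and is not part of FoldSeek or this repository.

```python
from typing import Optional


def pick_foldseek_build(cpuinfo_text: str) -> Optional[str]:
    """Return 'avx2', 'sse2', or None based on the CPU flags in cpuinfo_text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux, feature flags are listed on lines starting with "flags".
        if line.lower().startswith("flags"):
            _, _, value = line.partition(":")
            flags.update(value.split())
    if "avx2" in flags:
        return "avx2"  # preferred when both AVX2 and SSE2 are available
    if "sse2" in flags:
        return "sse2"
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(pick_foldseek_build(handle.read()))
    except OSError:
        print("could not read /proc/cpuinfo on this system")
```

A result of None means neither flag was found, for example on ARM64 or macOS, where the other builds listed above apply.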

Next, create the database. FoldSeek downloads the files from https://www.rcsb.org. In the command below, tmp is a temporary folder, and the database files are downloaded to the parent folder of tmp. This guide uses path/to/FoldSeek_PDB_Database as an example. Change to that folder first, and remember its path: it is needed again later.

cd path/to/FoldSeek_PDB_Database

Then create the database. In the command below, PDB is the name of the database to download, pdb is its local 'nickname' (passed to all_metric_calculation.py as the database), and tmp is the folder for temporary files.

foldseek databases PDB pdb tmp

Then run

foldseek

If foldseek prints its usage information, the installation succeeded.
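The same check can be done from Python before launching the evaluation. This is a small sketch using the standard-library shutil.which; the helper name find_tool is hypothetical.

```python
import shutil
from typing import Optional


def find_tool(name: str = "foldseek") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if missing."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    if path:
        print(f"foldseek found at {path}")
    else:
        print("foldseek not found on PATH; re-check the 'export PATH' step")
```

If the tool is not found in a new shell, the export PATH command from the install step probably needs to be added to your shell profile.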

Start Evaluation

all_metric_calculation.py computes the metrics. Run it as follows:

cd ReQFlow
python analysis/all_metric_calculation.py --inference_dir path/to/dir  --script_path abs/path/to/run_foldseek_parallel.sh  --dataset_dir /path/to/FoldSeek_PDB_Database

The script requires several command-line arguments for configuration. Here's a breakdown:

--inference_dir (Required): The directory containing the inference results from QFlow. e.g. path/to/ReQFlow/Inference_Results/15D_11M_2024Y_12h_00m_52s
--script_path (Required): The ABSOLUTE path to the run_foldseek_parallel.sh script. e.g. path/to/ReQFlow/analysis/run_foldseek_parallel.sh
--dataset_dir (Required): The directory containing the FoldSeek database created during the FoldSeek installation. e.g. path/to/FoldSeek_PDB_Database
--database (Optional): The database to use for FoldSeek (e.g., pdb). Defaults to pdb.
--type (Optional): Type of evaluation (qflow, FrameFlow, FoldFlow, FrameDiff, Genie2, RFdiffusion). Defaults to qflow.
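For reference, the options above map naturally onto an argparse parser. The sketch below only mirrors the documented flags and defaults; it is not the parser used inside all_metric_calculation.py, and restricting --type with choices is an assumption.

```python
import argparse

# Evaluation types as listed in the documentation above.
EVAL_TYPES = ["qflow", "FrameFlow", "FoldFlow", "FrameDiff", "Genie2", "RFdiffusion"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("--inference_dir", required=True,
                        help="directory containing the inference results")
    parser.add_argument("--script_path", required=True,
                        help="absolute path to run_foldseek_parallel.sh")
    parser.add_argument("--dataset_dir", required=True,
                        help="directory containing the FoldSeek database")
    parser.add_argument("--database", default="pdb",
                        help="FoldSeek database name")
    # Assumption: the real script may not validate --type against a fixed list.
    parser.add_argument("--type", default="qflow", choices=EVAL_TYPES,
                        help="type of evaluation")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Omitting --database and --type yields the defaults pdb and qflow, matching the list above.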

Because of the FoldSeek issue described in https://github.com/steineggerlab/foldseek/issues/323, we take the TM-score from the E-value column of the FoldSeek output.
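To illustrate reading the TM-score from the E-value column, here is a minimal sketch. It assumes FoldSeek's default 12-column tab-separated output (query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits); the output format actually requested by run_foldseek_parallel.sh may differ, and the function name is hypothetical.

```python
import csv
import io
from typing import Dict

# Assumption: zero-based index of the E-value column in the default output.
EVALUE_COLUMN = 10


def max_tmscore_per_query(tsv_text: str) -> Dict[str, float]:
    """Return, for each query, the largest value found in the E-value column."""
    best: Dict[str, float] = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) <= EVALUE_COLUMN:
            continue  # skip blank or truncated lines
        query = row[0]
        score = float(row[EVALUE_COLUMN])
        if query not in best or score > best[query]:
            best[query] = score
    return best


if __name__ == "__main__":
    # Made-up rows for illustration only; not real search results.
    example = "\n".join([
        "\t".join(["q1", "t1", "0.5", "100", "0", "0", "1", "100", "1", "100", "0.62", "300"]),
        "\t".join(["q1", "t2", "0.4", "100", "0", "0", "1", "100", "1", "100", "0.71", "250"]),
    ])
    print(max_tmscore_per_query(example))
```

Keeping only the maximum per query matches the description of summary_tmscore.csv, which stores the highest TM-score found for each generated protein.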

Note: to avoid potential issues, the script deletes unrelated files in the inference directory (it 'resets' the folder). In addition, FoldSeek uses a large number of CPU cores. By default, the script uses 50% of all cores; you can change this at line 96 of run_foldseek_parallel.sh.
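The 50% default can be pictured with the following sketch. It is written in Python for illustration only and is not the logic in run_foldseek_parallel.sh; the function name and the fraction parameter are hypothetical.

```python
import os
from typing import Optional


def default_worker_count(total_cores: Optional[int] = None, fraction: float = 0.5) -> int:
    """Return the number of cores to use: a fraction of all cores, at least 1."""
    if total_cores is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        total_cores = os.cpu_count() or 1
    return max(1, int(total_cores * fraction))


if __name__ == "__main__":
    print(default_worker_count())
```

On a shared machine, a smaller fraction leaves more cores for other users.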

The script will generate some files. Important ones include:

Metrics.txt: the final results. This is the most important file.
All_Results_Origin.csv: all original information (e.g. length, modeled PDB path, RMSD, TM-score, ESMF path).
All_Sampled_PDB.txt: all PDB files generated by the model. Mainly used for FoldSeek.
All_Sampled_PDB_Designable.txt: same as the previous file, but contains only the designable ones.
All_Sampled_PDB_and_Length.csv: lengths and PDB files. Used for concatenation.
All_Sampled_PDB_and_Length_Designable.csv: same as the previous file, but contains only the designable ones.
Average_Times_per_Sample.png: a visualization of sampling time.
summary_tmscore.csv: generated by FoldSeek; the maximum TM-score between each given protein and the proteins in the PDB database.
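After a run, a quick way to confirm that the evaluation finished is to check that the files listed above exist. The sketch below assumes they are all written to a single output directory, which you pass in; that location and the helper name missing_outputs are assumptions, not documented behavior.

```python
import os
from typing import List

# File names taken from the list above.
EXPECTED_OUTPUTS = [
    "Metrics.txt",
    "All_Results_Origin.csv",
    "All_Sampled_PDB.txt",
    "All_Sampled_PDB_Designable.txt",
    "All_Sampled_PDB_and_Length.csv",
    "All_Sampled_PDB_and_Length_Designable.csv",
    "Average_Times_per_Sample.png",
    "summary_tmscore.csv",
]


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected output files that are not present in output_dir."""
    return [
        name for name in EXPECTED_OUTPUTS
        if not os.path.isfile(os.path.join(output_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_outputs(".")
    print("all outputs present" if not missing else f"missing: {missing}")
```

An empty list means every expected file was found; a missing Metrics.txt suggests the run did not finish.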