diff --git a/data/readme.md b/data/readme.md
index 285920d..a59c214 100644
--- a/data/readme.md
+++ b/data/readme.md
@@ -2,5 +2,5 @@
 1. You can download the sequence pair dataset used to train **PPLM** through [pplm_dataset](https://drive.google.com/file/d/1Xdb3SG0CRY49WqH4jUJhM-yqsLOejz7_/view?usp=share_link)
2. You can access the original **protein-protein interaction** dataset from [D-SCRIPT](https://github.com/samsledje/D-SCRIPT/tree/main/data). The corrected pair lists by remove duplicate, erroneous, and invalid negative samples are provided in the **ppi** folder.
-3. You can access the original **protein-protein binding affinity** dataset from [PPB-Affinity]{https://github.com/ChenPy00/PPB-Affinity}. To prevent potential data leakage, we resplited the five-fold cross-validation list by considering the structure similarity, and the list of PDB IDs for each fold is provided in the **affinity** folder.
+3. You can access the original **protein-protein binding affinity** dataset from [PPB-Affinity](https://github.com/ChenPy00/PPB-Affinity). To prevent potential data leakage, we resplited the five-fold cross-validation list by considering the structure similarity, and the list of PDB IDs for each fold is provided in the **affinity** folder.
 4. You can download the **inter-protein contact prediction** dataset through [contact_dataset](https://drive.google.com/file/d/1Xdb3SG0CRY49WqH4jUJhM-yqsLOejz7_/view?usp=share_link).