fixed the issue related to the cmap-mode type

Kapil Devkota
2023-01-04 03:45:19 -05:00
parent a8c4e471c5
commit 5fd07bc9fa
163 changed files with 181852 additions and 13 deletions

0
.coverage Normal file → Executable file

BIN
.environment.yml.swp Executable file

Binary file not shown.

0
.flake8 Normal file → Executable file

0
.github/workflows/autorun-tests.yml vendored Normal file → Executable file

0
.github/workflows/pypi_publish.yml vendored Normal file → Executable file

0
.gitignore vendored Normal file → Executable file

0
.pre-commit-config.yaml Normal file → Executable file

0
.readthedocs.yml Normal file → Executable file

0
CHANGELOG.md Normal file → Executable file

0
CITATION.cff Normal file → Executable file

0
LICENSE Normal file → Executable file

0
README.md Normal file → Executable file

0
backup/train_bak.py Normal file → Executable file

0
backup/train_bak_1.py Normal file → Executable file

View File

@@ -0,0 +1 @@
kdevko01@minotaur.csail.mit.edu.3391666:1648490224

0
bash_files/.train_foldseek_after.sh.swp Normal file → Executable file

0
bash_files/.train_original.sh.swp Normal file → Executable file

0
bash_files/.train_original_foldseek_emb.sh.swp Normal file → Executable file

0
bash_files/fseek_after_human_model_dscript/results.log Normal file → Executable file

9814
bash_files/original_human_model_dscript-v2/results.log Normal file → Executable file

File diff suppressed because it is too large.

0
bash_files/original_human_model_dscript/results.log Normal file → Executable file
View File

@@ -0,0 +1,6 @@
[2023-01-04-03:41:40] D-SCRIPT Version 0.2.2
[2023-01-04-03:41:40] Called as: /net/scratch3.mit.edu/scratch3-3/kdevko01/conda/.conda/envs/main/bin/dscript train --train seqs-pairs/pairs/human_train.tsv --test seqs-pairs/pairs/human_test.tsv --embedding embeddings/human.h5 --topsy-turvy --glider-weight 0.2 --glider-thres 0.925 -o topsyturvy_cmap-regression/results.log --save-prefix topsyturvy_cmap-regression/ep_ --lr 0.0005 --lambda 0.05 --num-epoch 10 --weight-decay 0 --batch-size 25 --pool-width 9 --kernel-width 7 --dropout-p 0.2 --projection-dim 100 --hidden-dim 50 --kernel-width 7 --device 7
[2023-01-04-03:41:40] Using CUDA device 7 - NVIDIA A100 80GB PCIe
[2023-01-04-03:41:40] Loaded 843584 training pairs
[2023-01-04-03:41:40] Loaded 52725 test pairs
[2023-01-04-03:41:40] Loading embeddings...

View File

@@ -4,17 +4,20 @@ TOPSY_TURVY=
TRAIN=seqs-pairs/pairs/human_train.tsv
TEST=seqs-pairs/pairs/human_test.tsv
EMBEDDING=embeddings/human.h5
OUTPUT_FOLDER=fseek_after_human_model_dscript_cmap_ot
OUTPUT_FOLDER=dscript
#fseek_after_human_model_dscript_cmap
OUTPUT_PREFIX=results-
FOLDSEEK_FASTA=../../foldseek_emb/r1_foldseekrep_seq.fa
FOLDSEEK_VOCAB=../data/foldseek_vocab.json
SAMPLER=../data/models/sampler/iter_9.sav
SAMPLER="../data/models/sampler/sampler-run-Mon-26-Dec-2022-12:07:02-PM-EST/iter_999.sav"
CMAP_TRAIN=../data/pairs/cmap_train_lt_400.tsv
CMAP_TEST=../data/pairs/cmap_test_lt_400.tsv
CMAP_LANG_EMB=../lynnfiles/new_cmap_embed
CMAP_EMB=../data/embeddings/cmap_filtered_lt_400.h5
CMAP_MODE=ot
while getopts "d:t:T:e:vo:p:s:c:C:l:m:" args; do
FOLDSEEK_CMD=
CMAP_CMD=
while getopts "d:t:T:e:vo:p:s:Xc:C:l:m:fF:M:" args; do
case $args in
d) DEVICE=${OPTARG}
;;
@@ -24,7 +27,7 @@ while getopts "d:t:T:e:vo:p:s:c:C:l:m:" args; do
;;
e) EMBEDDING=${OPTARG}
;;
v) TOPSY_TURVY="--topsy-turvy --glider-weight 0.2 --glider-thres 0.925"; OUTPUT_FOLDER=fseek_after_human_model_topsyturvy_cmap_ot;
v) TOPSY_TURVY="--topsy-turvy --glider-weight 0.2 --glider-thres 0.925"; OUTPUT_FOLDER=topsyturvy; #fseek_after_human_model_topsyturvy_cmap;
;;
o) OUTPUT_FOLDER=${OPTARG}
;;
@@ -32,6 +35,8 @@ while getopts "d:t:T:e:vo:p:s:c:C:l:m:" args; do
;;
s) SAMPLER=${OPTARG}
;;
X) CMAP_CMD="--run-cmap"
;;
c) CMAP_TRAIN=${OPTARG}
;;
C) CMAP_TEST=${OPTARG}
@@ -40,9 +45,28 @@ while getopts "d:t:T:e:vo:p:s:c:C:l:m:" args; do
;;
m) CMAP_EMB=${OPTARG}
;;
M) CMAP_MODE=${OPTARG}
;;
f) FOLDSEEK_CMD="--allow_foldseek"
;;
F) FOLDSEEK_FASTA=${OPTARG}
;;
esac
done
# Set up the foldseek command
if [ -n "${FOLDSEEK_CMD}" ]; then FOLDSEEK_CMD="${FOLDSEEK_CMD} --foldseek_fasta ${FOLDSEEK_FASTA} --foldseek_vocab ${FOLDSEEK_VOCAB} --add_foldseek_after_projection"; OUTPUT_FOLDER="${OUTPUT_FOLDER}_fseek-after"; fi
# Set up the cmap command
if [ -n "${CMAP_CMD}" ]
then
CMAP_CMD="${CMAP_CMD} --contact-map-train ${CMAP_TRAIN} --contact-map-test ${CMAP_TEST} --contact-map-mode ${CMAP_MODE} --contact-map-embedding ${CMAP_LANG_EMB} --contact-maps ${CMAP_EMB} --contact-map-lr 0.0001 --contact-map-lambda 0.1"
if [ "${CMAP_MODE}" = "ot" ]; then CMAP_CMD="${CMAP_CMD} --contact-map-sampler ${SAMPLER} --ot-cmap-nsamples 100"; fi
OUTPUT_FOLDER="${OUTPUT_FOLDER}_cmap-${CMAP_MODE}"
fi
# Create the output folder
if [ ! -d "${OUTPUT_FOLDER}" ]; then mkdir -p "${OUTPUT_FOLDER}"; fi
#./train_foldseek_after-cmap-ot.sh -v -s ../data/models/sampler/sampler-run-Mon-26-Dec-2022-12\:07\:02-PM-EST/iter_999.sav -d 3
@@ -53,4 +77,14 @@ dscript train --train $TRAIN --test $TEST --embedding $EMBEDDING $TOPSY_TURVY \
--lr 0.0005 --lambda 0.05 --num-epoch 10 \
--weight-decay 0 --batch-size 25 --pool-width 9 \
--kernel-width 7 --dropout-p 0.2 --projection-dim 100 \
--hidden-dim 50 --kernel-width 7 --device $DEVICE --run-cmap --contact-map-train ${CMAP_TRAIN} --contact-map-test ${CMAP_TEST} --contact-map-mode ot --contact-map-embedding ${CMAP_LANG_EMB} --contact-maps ${CMAP_EMB} --contact-map-sampler ${SAMPLER} --ot-cmap-nsamples 100 --contact-map-lr 0.0001 --contact-map-lambda 0.1 # --allow_foldseek --foldseek_fasta ${FOLDSEEK_FASTA} --foldseek_vocab ${FOLDSEEK_VOCAB} --add_foldseek_after_projection ## need to add the foldseek part
--hidden-dim 50 --kernel-width 7 --device $DEVICE ${CMAP_CMD} ${FOLDSEEK_CMD}
# Training CMAP commands: Without FOLDSEEK
## OT
# Topsy-Turvy: ./train_foldseek_after-cmap-ot.sh -v -X -M ot -d 3
# D-SCRIPT: ./train_foldseek_after-cmap-ot.sh -X -M ot -d 3
## Regression
# Topsy-Turvy: ./train_foldseek_after-cmap-ot.sh -v -X -M regression -d 3
# D-SCRIPT: ./train_foldseek_after-cmap-ot.sh -X -M regression -d 3
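
The optional flag groups in this script are built as plain strings and expanded unquoted, so any path containing whitespace would be split into several arguments. The same logic can be sketched with a bash array, which keeps each path as a single argument. This is only a minimal sketch, assuming bash: the function names `output_folder_name` and `build_cmap_args` and the array `CMAP_ARGS` are hypothetical and not part of the repository. The flags, default values, and folder suffixes are copied from the diff above.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; flags are taken from the script in this diff.

# Print the output folder name with the suffixes the script appends.
#   $1 = base folder (e.g. "dscript" or "topsyturvy")
#   $2 = contact-map mode, or empty when -X was not given
#   $3 = "yes" when -f (foldseek) was given
output_folder_name() {
    local folder="$1" cmap_mode="$2" use_foldseek="$3"
    # The foldseek suffix comes first, matching the order used in the script.
    if [ "$use_foldseek" = "yes" ]; then folder="${folder}_fseek-after"; fi
    if [ -n "$cmap_mode" ]; then folder="${folder}_cmap-${cmap_mode}"; fi
    printf '%s\n' "$folder"
}

# Fill the global array CMAP_ARGS with the contact-map flags for one mode.
#   $1 = mode, $2 = train pairs, $3 = test pairs,
#   $4 = language-model embedding, $5 = contact maps, $6 = sampler file
build_cmap_args() {
    local mode="$1" train_pairs="$2" test_pairs="$3"
    local lang_emb="$4" maps="$5" sampler="$6"
    CMAP_ARGS=(--run-cmap
        --contact-map-train "$train_pairs" --contact-map-test "$test_pairs"
        --contact-map-mode "$mode"
        --contact-map-embedding "$lang_emb" --contact-maps "$maps"
        --contact-map-lr 0.0001 --contact-map-lambda 0.1)
    # The sampler flags are only added in "ot" mode, as in the script.
    if [ "$mode" = "ot" ]; then
        CMAP_ARGS+=(--contact-map-sampler "$sampler" --ot-cmap-nsamples 100)
    fi
}
```

With this shape, the final call would pass `"${CMAP_ARGS[@]}"` to `dscript train` in place of the unquoted string variable.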

0
data/foldseek_vocab.json Normal file → Executable file

0
data/models/sampler/logs.txt Normal file → Executable file

0
data/pairs/cmap_debug_test.tsv Normal file → Executable file

0
data/pairs/cmap_debug_train.tsv Normal file → Executable file

0
data/pairs/cmap_test.tsv Normal file → Executable file

0
data/pairs/cmap_test_lt_400.tsv Normal file → Executable file

0
data/pairs/cmap_train.tsv Normal file → Executable file

0
data/pairs/cmap_train_lt_400.tsv Normal file → Executable file

0
data/pairs/ecoli_test.tsv Normal file → Executable file

Can't render this file because it is too large.

0
data/pairs/fly_test.tsv Normal file → Executable file

Can't render this file because it is too large.

0
data/pairs/human_test.tsv Normal file → Executable file

Can't render this file because it is too large.

0
data/pairs/human_test_40.tsv Normal file → Executable file

Some files were not shown because too many files have changed in this diff.