Delete README.ipynb

This commit is contained in:
Saoge123
2023-06-16 20:54:32 +08:00
committed by GitHub
parent 10f0046066
commit 2d2cb76753

@@ -1,176 +0,0 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "be277d37-8457-46b8-9291-416e190a13b4",
"metadata": {},
"source": [
"# PocketFlow: an autoregressive flow model incorporating chemical knowledge for generating drug-like molecules inside protein pockets"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "90caacb8-99ec-4ce1-831a-9ed0ff2e5a34",
"metadata": {},
"source": [
"Requirements:\n",
"* Python 3.8\n",
"* PyTorch 1.12\n",
"* PyTorch Geometric 2.1.0\n",
"* RDKit\n",
"* Open Babel\n",
"* PyMOL"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "9f56bea6-1554-406b-8837-9d0a8ca321b8",
"metadata": {
"tags": []
},
"source": [
"### Molecular generation\n",
"Molecules can be generated by running the following command. The pocket PDB file (`-pkt`) and the model checkpoint file (`--ckpt`) are required; all other parameters are optional.\n",
"~~~\n",
"python main_generate.py -pkt test_samples/test_pocket10/1bvr_C_rec_pocket10-surf.pdb --ckpt ckpt/ZINC-pretrained-255000.pt -n 100 -d cuda:0 --root_path gen_results --name 1bvr -at 1.0 -bt 1.0 --max_atom_num 35 -ft 0.5 -cm True --with_print True\n",
"~~~\n",
"All generation parameters:\n",
"~~~\n",
"usage: main_generate.py [-h] [-pkt POCKET] [--ckpt CKPT] [-n NUM_GEN] [--name NAME] [-d DEVICE] [-at ATOM_TEMPERATURE] [-bt BOND_TEMPERATURE] [--max_atom_num MAX_ATOM_NUM] [-ft FOCUS_THRESHOLD] [-cm CHOOSE_MAX]\n",
" [--min_dist_inter_mol MIN_DIST_INTER_MOL] [--bond_length_range BOND_LENGTH_RANGE] [-mdb MAX_DOUBLE_IN_6RING] [--with_print WITH_PRINT] [--root_path ROOT_PATH] [--readme README]\n",
"\n",
"optional arguments:\n",
" -h, --help show this help message and exit\n",
" -pkt POCKET, --pocket POCKET\n",
" the pdb file of pocket in receptor\n",
" --ckpt CKPT the path of saved model\n",
" -n NUM_GEN, --num_gen NUM_GEN\n",
" the number of generateive molecule\n",
" --name NAME receptor name\n",
" -d DEVICE, --device DEVICE\n",
" cuda:x or cpu\n",
" -at ATOM_TEMPERATURE, --atom_temperature ATOM_TEMPERATURE\n",
" temperature for atom sampling\n",
" -bt BOND_TEMPERATURE, --bond_temperature BOND_TEMPERATURE\n",
" temperature for bond sampling\n",
" --max_atom_num MAX_ATOM_NUM\n",
" the max atom number for generation\n",
" -ft FOCUS_THRESHOLD, --focus_threshold FOCUS_THRESHOLD\n",
" the threshold of probility for focus atom\n",
" -cm CHOOSE_MAX, --choose_max CHOOSE_MAX\n",
" whether choose the atom that has the highest prob as focus atom\n",
" --min_dist_inter_mol MIN_DIST_INTER_MOL\n",
" inter-molecular dist cutoff between protein and ligand.\n",
" --bond_length_range BOND_LENGTH_RANGE\n",
" the range of bond length for mol generation.\n",
" -mdb MAX_DOUBLE_IN_6RING, --max_double_in_6ring MAX_DOUBLE_IN_6RING\n",
" --with_print WITH_PRINT\n",
" whether print SMILES in generative process\n",
" --root_path ROOT_PATH\n",
" the root path for saving results\n",
" --readme README, -rm README\n",
" description of this genrative task\n",
"\n",
"~~~"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "099520c6-01f4-48f7-a0e9-c7ad600746ba",
"metadata": {
"tags": []
},
"source": [
"### Splitting the pocket\n",
"The pocket structure can be split from the protein structure based on the pose of the ligand:\n",
"~~~python\n",
"from pocket_flow import SplitPocket, Protein, Ligand\n",
"\n",
"pro = Protein('/path/to/protein.pdb')\n",
"lig = Ligand('/path/to/ligand.sdf')\n",
"dist_cutoff = 10  # distance cutoff from the ligand for selecting pocket atoms\n",
"pocket_block, _ = SplitPocket._split_pocket_with_surface_atoms(pro, lig, dist_cutoff)\n",
"# write the pocket block to a PDB file; the with-block closes the file\n",
"with open('/path/to/pocket.pdb', 'w') as f:\n",
"    f.write(pocket_block)\n",
"~~~"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "ced6f00a",
"metadata": {},
"source": []
},
{
"attachments": {},
"cell_type": "markdown",
"id": "e431e26f-5154-4944-b088-548ed48e4f63",
"metadata": {},
"source": [
"### Dataset\n",
"The raw [CrossDocked2020](https://bits.csb.pitt.edu/files/crossdock2020/) dataset is large and needs about 50 GB of disk space. You can download the processed data from [Pocket2Mol](https://github.com/pengxingang/Pocket2Mol/blob/main/data/README.md).\n",
"\n",
"~~~python\n",
"from pocket_flow import CrossDocked2020\n",
"\n",
"# take the last whitespace-separated field of each non-empty line\n",
"with open('data/unexcept_element_sample_new.csv') as f:\n",
"    unexpected_sample = [line.split()[-1] for line in f if line.strip()]\n",
"cs2020 = CrossDocked2020(\n",
" './data/crossdocked_pocket10/',\n",
" './data/crossdocked_pocket10/index.pkl',\n",
" unexpected_sample=unexpected_sample\n",
" )\n",
"cs2020.run(\n",
" dataset_name='crossdocked_pocket10_processed_35Atoms.lmdb',\n",
" max_ligand_atom=35,\n",
" only_backbone=False,\n",
" lmdb_path='./data/'\n",
" )\n",
"~~~"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "a8a1d9d7-76aa-4566-b74f-49f52d987f82",
"metadata": {},
"source": [
"The pretraining dataset of PocketFlow was selected from [ZINC 3D](https://zinc.docking.org/tranches/home/). You can download ZINC 3D and then use `make_pretrain_data.py` to produce the pretraining dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f74ff65e-dc40-40f9-a45a-0a3f7d272bac",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}