Compare commits

171 Commits
3.1.4 ... 4.1

Author SHA1 Message Date
Peter Schmidtke
9c7be61644 Merge pull request #86 from jordansafer/master
Fix header typo for easier parsing
2023-02-22 04:16:15 -05:00
Peter Schmidtke
f05cdb7cdd Merge pull request #97 from Discngine/mypocket
Adding support for definition of explicit pockets
2023-02-22 04:04:43 -05:00
pschmidtke
115128de62 fix for mmcif structures & explicit pockets 2023-02-22 10:01:48 +01:00
pschmidtke
ee1cc7373f missing voronoi.c 2023-02-20 14:20:43 +01:00
pschmidtke
ac18b146ff functional in PDB format 2023-02-20 14:16:06 +01:00
pschmidtke
9c2cbca91a parameter parsing finished 2023-02-20 08:00:22 +01:00
pschmidtke
c00ab205b6 backup 2023-02-18 22:39:41 +01:00
pschmidtke
86a4a97ac7 first steps in fparams 2023-02-16 07:23:36 +01:00
pschmidtke
ce4997d6e6 introducing P parameter (WIP) 2023-02-13 09:27:34 +01:00
pschmidtke
a0bbee64f8 Explicit pockets with various alternates handling 2023-02-02 11:43:24 +01:00
Peter Schmidtke
d82104b99a Dropping for security reasons 2022-10-24 13:47:59 +02:00
Jordan Safer
5ef19079e9 Fix header typo for easier parsing 2022-10-11 12:18:32 -04:00
pschmidtke
cd00b961e6 adding linux built molfileplugin 2022-05-10 11:13:07 +02:00
pschmidtke
84967334d8 fixes on alternate locations in cif & output as cif 2022-05-10 10:58:46 +02:00
pschmidtke
9906237e6d dropping dependency on /tmp 2021-11-17 21:37:18 +01:00
pschmidtke
71637e923f update for osx 2021-11-17 16:21:03 +01:00
pschmidtke
127c60a49b last fixes for cif support for models 2021-11-16 21:47:15 +01:00
pschmidtke
3a60a4c86e adding mac 2021-11-16 21:41:09 +01:00
pschmidtke
91e3d2dbbe debug env with molfileplugin for linux 2021-11-16 21:40:22 +01:00
pschmidtke
f99a6ed96d pdbxpluging update 2021-11-15 21:46:24 +01:00
pschmidtke
855bad8cce update molfilepluging linux & osx 2021-11-10 15:16:39 +01:00
pschmidtke
5408f01005 support for model flag even no model is available 2021-11-09 22:02:24 +01:00
pschmidtke
2acab799f4 adding sample NMR structure 2021-11-09 22:01:36 +01:00
pschmidtke
edfd87c188 support for OSX & residue selection on mmcif 2021-11-09 21:10:03 +01:00
pschmidtke
5e3b4f31dc adding new dockerfile 2021-11-04 07:47:56 +01:00
pschmidtke
8bcf81635e adding dockerfile for debian-slim distrib 2021-11-03 18:31:49 +01:00
Peter Schmidtke
38a1c49440 Update README.md 2021-02-02 21:47:44 +01:00
Peter Schmidtke
efd7b0ebb7 Update README.md 2021-02-02 21:46:03 +01:00
Peter Schmidtke
0a84309db4 Merge pull request #59 from Discngine/mmcif
Documentation stuff missing
2021-02-02 21:44:54 +01:00
pschmidtke
6e80f9203c Merge branch 'master' into mmcif 2021-02-02 21:43:55 +01:00
pschmidtke
e22d38054a adding missing doc in fparams 2021-02-02 21:42:35 +01:00
Peter Schmidtke
439da234a3 Merge pull request #55 from HTian1997/master
Update GETTINGSTARTED.md
2021-02-02 21:14:01 +01:00
Peter Schmidtke
b68d547fd5 Merge pull request #53 from Discngine/mmcif
Enabling Mmcif support in fpocket
2021-02-02 21:11:49 +01:00
pschmidtke
e57afc2bad working now? 2021-02-02 21:07:36 +01:00
pschmidtke
9617bda214 missing asa header in descriptors 2021-02-02 20:59:37 +01:00
pschmidtke
8e81c0c807 minor ui adjustments 2021-02-02 20:52:24 +01:00
Hao Tian
7b500b4dd6 Update GETTINGSTARTED.md 2020-08-27 12:40:59 -05:00
ShorkarMael
209f93aa69 Update c-cpp.yml
working version for pytest
2020-07-31 09:13:18 +02:00
ShorkarMael
c50e921077 Update c-cpp.yml
test
2020-07-31 09:11:00 +02:00
ShorkarMael
a2c1745242 Update c-cpp.yml
test
2020-07-31 09:00:40 +02:00
ShorkarMael
877014bd60 Update c-cpp.yml 2020-07-31 08:54:03 +02:00
ShorkarMael
9d11cbb2fb Update c-cpp.yml
test
2020-07-31 08:42:50 +02:00
ShorkarMael
296fb4789c Update c-cpp.yml
test
2020-07-31 00:35:03 +02:00
ShorkarMael
e89a312df2 Update c-cpp.yml
python version
2020-07-31 00:28:00 +02:00
ShorkarMael
0d1a74253c Update c-cpp.yml
test
2020-07-31 00:21:43 +02:00
ShorkarMael
f4208748ed Update c-cpp.yml
test
2020-07-31 00:19:04 +02:00
ShorkarMael
b91c92d5b4 Update c-cpp.yml
test
2020-07-31 00:11:04 +02:00
ShorkarMael
e191b547dc Update c-cpp.yml
test
2020-07-31 00:05:44 +02:00
ShorkarMael
73ce917ea0 Update c-cpp.yml
test
2020-07-30 23:57:19 +02:00
ShorkarMael
10fd5e5cef Update c-cpp.yml
v2
2020-07-30 23:52:22 +02:00
ShorkarMael
b715bec165 Update c-cpp.yml
test
2020-07-30 23:51:17 +02:00
ShorkarMael
2a1702fceb Update c-cpp.yml
test
2020-07-30 22:20:37 +02:00
ShorkarMael
28031f4cf8 Update c-cpp.yml
1
2020-07-30 21:11:20 +02:00
ShorkarMael
84af7678e1 Update c-cpp.yml
pytest
2020-07-30 21:02:10 +02:00
ShorkarMael
f58b5ffa43 Update c-cpp.yml
test
2020-07-30 20:53:11 +02:00
ShorkarMael
1f01eefe25 Update c-cpp.yml
test
2020-07-30 20:49:07 +02:00
ShorkarMael
17d63260d5 Update c-cpp.yml
Rmv init
2020-07-30 20:02:26 +02:00
ShorkarMael
1572782a39 Update c-cpp.yml
source
2020-07-30 19:51:30 +02:00
ShorkarMael
888a6eee19 Update c-cpp.yml
test
2020-07-30 19:46:35 +02:00
ShorkarMael
76d21c256b Update c-cpp.yml
init
2020-07-30 19:40:31 +02:00
ShorkarMael
fad1337ad5 Update c-cpp.yml
init
2020-07-30 19:37:20 +02:00
ShorkarMael
fce9c66825 Update c-cpp.yml
source
2020-07-30 19:31:32 +02:00
ShorkarMael
cffeb8f14d Update c-cpp.yml
conda setup
2020-07-30 19:26:34 +02:00
ShorkarMael
3363536b08 Update c-cpp.yml
test
2020-07-30 19:23:28 +02:00
ShorkarMael
267e63050c Update c-cpp.yml
test
2020-07-30 19:18:41 +02:00
ShorkarMael
620f3431c6 Update c-cpp.yml
test
2020-07-30 19:09:35 +02:00
ShorkarMael
74471a1123 Update c-cpp.yml
conda changes
2020-07-30 19:03:18 +02:00
ShorkarMael
927c988b09 Merge branch 'mmcif' of https://github.com/Discngine/fpocket into mmcif 2020-07-30 08:36:22 -07:00
ShorkarMael
0a92618189 bug fixes for writing 2020-07-30 08:36:06 -07:00
pschmidtke
06e8ee4ab0 update osx version 2020-07-30 11:37:08 +02:00
pschmidtke
9a7a1d321c shared library update OSX86_64 2020-07-30 11:34:06 +02:00
pschmidtke
f31205a0b8 correct corrupted makefile 2020-07-30 11:29:25 +02:00
Peter Schmidtke
ff35953eb5 Update makefile
enabling proper make fpocket
2020-07-30 11:24:05 +02:00
ShorkarMael
1565cc0701 Merge branch 'mmcif' of https://github.com/Discngine/fpocket into mmcif 2020-07-30 02:15:36 -07:00
ShorkarMael
653472798e writepocket.h upadate 2020-07-30 02:15:31 -07:00
ShorkarMael
bbb138378e Merge branch 'master' into mmcif 2020-07-29 23:16:44 +02:00
ShorkarMael
0e15bc672d added a test for multiple chains as ligand 2020-07-29 13:43:09 -07:00
ShorkarMael
521f6a2701 added mmcif doc and multiple chains as ligand 2020-07-29 10:15:23 -07:00
ShorkarMael
faf35bb9c0 added long chain args and update log 2020-07-23 17:18:41 -07:00
ShorkarMael
98fdedb1c5 tests for mmcif added 2020-07-23 09:14:33 -07:00
ShorkarMael
e635a3e359 PYMOL and VMD reading special 2020-07-23 02:07:48 -07:00
ShorkarMael
70917d144b documentation on writing params 2020-07-22 09:25:31 -07:00
ShorkarMael
5b177bc7bd write mode selection 2020-07-22 09:14:26 -07:00
ShorkarMael
01b2099469 writing mmcif fonctionnal and separate pockets 2020-07-22 05:16:21 -07:00
ShorkarMael
ae1f26089c insertion code reading 2020-07-17 05:01:36 -07:00
ShorkarMael
fca9e379c5 longer chain names 2020-07-17 01:18:55 -07:00
ShorkarMael
3ef844937b take into account 2 letters chain names 2020-07-16 11:31:45 -07:00
ShorkarMael
6b87b664ab show info 2020-07-16 02:28:30 -07:00
ShorkarMael
81618554df updated pdbxplugin 2020-07-16 02:12:02 -07:00
ShorkarMael
78a6e541fb added missing sample 2020-07-15 06:42:38 -07:00
ShorkarMael
4287ca1a27 added file format detection in main 2020-07-15 06:36:21 -07:00
ShorkarMael
d0e25956ab fatal error fixed 2020-07-13 09:01:34 -07:00
ShorkarMael
62d1a12438 mmcif_reading works but outputs fatal error 2020-07-13 07:58:46 -07:00
ShorkarMael
e82a452044 molfile_plugin src modif 2020-07-08 15:46:15 -07:00
Peter Schmidtke
de94aada19 Merge pull request #51 from Discngine/explicit-pocket
Explicit pocket and keep chains
2020-07-08 11:28:54 +02:00
ShorkarMael
858073a4e1 fixed typo and deleted mod 2POR 2020-07-05 11:53:26 -07:00
ShorkarMael
93772970c9 deleted the prints 2020-07-03 17:44:01 -07:00
ShorkarMael
f0e7efa86d small changes 2020-07-03 17:03:16 -07:00
ShorkarMael
642bc8f12b reading mmcif with read_mmcif.c 2020-07-03 02:47:32 -07:00
ShorkarMael
e5b2d2d9af deleted useless print 2020-06-25 09:27:06 -07:00
ShorkarMael
294b1a6e07 added doc ,tests and keep chains 2020-06-25 09:21:45 -07:00
ShorkarMael
70de53b855 params added to desciption 2020-06-25 01:38:16 -07:00
ShorkarMael
df613de93a chain as a ligand working (pytest included) 2020-06-24 09:54:54 -07:00
ShorkarMael
7f0b864de2 del the chain chose as ligand 2020-06-22 01:48:45 -07:00
ShorkarMael
370a8aa016 first release explicit chain as ligand 2020-06-18 11:57:03 -07:00
Peter Schmidtke
a0eaccea0a Update c-cpp.yml 2020-06-16 00:00:32 +02:00
Peter Schmidtke
3e96d21c37 Update c-cpp.yml 2020-06-15 23:58:37 +02:00
Peter Schmidtke
e4c728043d Update c-cpp.yml 2020-06-15 23:56:03 +02:00
Peter Schmidtke
fe4dd1c68f Update c-cpp.yml 2020-06-15 23:50:03 +02:00
Peter Schmidtke
846465d8c3 adding pytest to github actions
lilely not working
2020-06-15 23:48:38 +02:00
Peter Schmidtke
a5c7a69125 Update c-cpp.yml 2020-06-15 23:45:08 +02:00
Peter Schmidtke
6e0d0330f3 install netcdf 2020-06-15 23:43:38 +02:00
Peter Schmidtke
0f3bf034ff making only fpocket 2020-06-15 23:37:15 +02:00
Peter Schmidtke
32aaabe217 integration of github actions
linux and mac builds - without netcdf support for now
2020-06-15 23:34:39 +02:00
pschmidtke
5ae071daf4 adding git attributes 2020-06-15 23:27:22 +02:00
Peter Schmidtke
2bfc2a15a6 Merge pull request #46 from Discngine/testToDelete
Test to delete
2020-06-15 23:20:50 +02:00
Peter Schmidtke
2b97e041f7 gitter badge update 2020-06-15 15:49:32 +02:00
ShorkarMael
7766583153 Merge branch 'master' into testToDelete 2020-06-15 02:54:35 -07:00
ShorkarMael
ebcca8a0aa test in the same function 2020-06-15 02:51:09 -07:00
ShorkarMael
7a334f524a added a test for long parameter drop chains 2020-06-15 02:26:09 -07:00
ShorkarMael
546c99a41b fixed the long option for drop chains 2020-06-15 02:12:51 -07:00
ShorkarMael
180e480a4b fixed the 0 pocket issue 2020-06-14 14:13:44 -07:00
pschmidtke
e22d9ae636 dropping deprecated maintainer from dockerfile 2020-06-14 13:28:54 +02:00
Peter Schmidtke
e2f4513ec8 Merge pull request #49 from Discngine/docker
docker file and documentation update
2020-06-13 23:41:11 +02:00
pschmidtke
ed7ff872e4 docker file and documentation update 2020-06-13 23:40:21 +02:00
Peter Schmidtke
db835c175b trigger build only on PR's on master 2020-06-13 19:45:57 +02:00
Peter Schmidtke
64f94d710a correction on gitter badge placement 2020-06-13 18:53:43 +02:00
Peter Schmidtke
2d4c972862 adding gitter badge 2020-06-13 18:51:16 +02:00
pschmidtke
bd700f5c2c adding explicit pocket to test cases 2020-06-13 18:27:10 +02:00
Peter Schmidtke
9b973a7470 Adding build status badges 2020-06-13 18:11:50 +02:00
Peter Schmidtke
933f62c93d Update azure-pipelines.yml for Azure Pipelines 2020-06-13 18:09:09 +02:00
Peter Schmidtke
fcbad192f1 Update azure-pipelines.yml for Azure Pipelines 2020-06-13 18:06:03 +02:00
Peter Schmidtke
ca599f8be0 Update azure-pipelines.yml for Azure Pipelines 2020-06-13 18:02:43 +02:00
Peter Schmidtke
3f9d19f2dd Update azure-pipelines.yml for Azure Pipelines 2020-06-13 17:59:08 +02:00
Peter Schmidtke
18b6dd8144 Update azure-pipelines.yml for Azure Pipelines 2020-06-13 17:53:56 +02:00
Peter Schmidtke
2531c98bb4 Update azure-pipelines.yml for Azure Pipelines 2020-06-13 17:48:11 +02:00
Peter Schmidtke
65380b5ecd Update azure-pipelines.yml for Azure Pipelines 2020-06-13 17:45:51 +02:00
Peter Schmidtke
246181b712 adding conda pytest to Pipeline 2020-06-13 17:34:31 +02:00
Peter Schmidtke
e1ad76a8a7 installing netcdf dependency 2020-06-13 17:30:59 +02:00
Peter Schmidtke
e99e1574ea Set up CI with Azure Pipelines
[skip ci]
2020-06-13 17:24:06 +02:00
ShorkarMael
23ff71386e documentation update with delete chains 2020-06-12 08:21:13 -07:00
Peter Schmidtke
dbe7470906 Update issue templates 2020-06-12 11:33:44 +02:00
ShorkarMael
ea2dfc4960 test 2P0R 2020-06-11 06:16:47 -07:00
ShorkarMael
cd84135740 Merge branch 'master' into testToDelete 2020-06-11 05:32:46 -07:00
ShorkarMael
110f92f8b5 droping multiple chains 2020-06-11 05:09:34 -07:00
Peter Schmidtke
587003c60a Merge pull request #45 from Discngine/testing
Testing
2020-06-09 23:00:38 +02:00
pschmidtke
a32bd136bc adding reference results to repo 2020-06-09 22:59:09 +02:00
pschmidtke
9687d920f1 test cases added 2020-06-09 22:58:46 +02:00
pschmidtke
ada630181b readme update for testing 2020-06-09 22:58:36 +02:00
ShorkarMael
fe804441c9 deletes one chain and find pockets works 2020-06-09 07:28:09 -07:00
ShorkarMael
8b1e011254 delete 1 selected chain still has bugs 2020-06-08 16:51:26 -07:00
ShorkarMael
35d3f17520 Merge branch 'testToDelete' of https://github.com/Discngine/fpocket into testToDelete 2020-06-07 07:37:04 -07:00
ShorkarMael
39a447bac6 test1 2020-06-07 07:36:58 -07:00
pschmidtke
cc5ca7595d simple test case 2020-06-06 21:56:58 +02:00
pschmidtke
56e9ea2735 reference output 1uyd 2020-06-06 21:56:29 +02:00
pschmidtke
0bd608de42 adding conda environment for testing 2020-06-06 19:22:26 +02:00
pschmidtke
43acf4898e putting back the readme 2020-06-06 19:22:16 +02:00
pschmidtke
c75129ebcc cleaning up 2020-06-06 19:12:44 +02:00
Peter Schmidtke
bfeeb399a9 Merge pull request #43 from Discngine/documentation
adding conda install instructions
2020-06-06 16:14:40 +02:00
pschmidtke
e274275fd8 adding conda install instructions 2020-06-06 16:13:54 +02:00
Peter Schmidtke
39f2df9ba9 Merge pull request #42 from Discngine/documentation
Documentation
2020-06-06 16:07:31 +02:00
pschmidtke
16ed338bc1 moving old documentation to deprecated 2020-06-06 16:04:20 +02:00
pschmidtke
fdb4de240a documentation test 2020-06-06 16:00:27 +02:00
pschmidtke
e30a9f3641 fpocket advanced doc 2020-06-06 14:24:58 +02:00
pschmidtke
9f10e283c5 mdpocket basic features 2020-06-05 23:43:38 +02:00
pschmidtke
6e7e62d0f6 documentation update 2020-06-05 22:46:51 +02:00
pschmidtke
8f72305ee4 adding dpocket sample files & images 2020-06-05 21:47:06 +02:00
pschmidtke
707de5ad96 update main readme 2020-06-05 20:49:13 +02:00
pschmidtke
f91bd7a1fc documentation revamp 1 2020-06-05 20:34:34 +02:00
ShorkarMael
6a54364b67 test 2020-06-02 09:26:03 -07:00
pschmidtke
65c809d04b explaining explicit pocket detection params 2020-06-02 17:36:07 +02:00
1954 changed files with 468739 additions and 57017 deletions

1
.dockerignore Normal file
@@ -0,0 +1 @@
*.o

2
.gitattributes vendored Normal file
@@ -0,0 +1,2 @@
*.tcl linguist-detectable=false
*.c linguist-detectable=true

38
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
@@ -0,0 +1,38 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
**Smartphone (please complete the following information):**
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]
**Additional context**
Add any other context about the problem here.

@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

10
.github/ISSUE_TEMPLATE/question.md vendored Normal file
@@ -0,0 +1,10 @@
---
name: Question
about: Custom question
title: ''
labels: question
assignees: ''
---

28
.github/workflows/c-cpp.yml vendored Normal file
@@ -0,0 +1,28 @@
name: C/C++ CI
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build-and-test-linux:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: netcdf
      run: sudo apt-get install libnetcdf-dev
    - name: make
      run: make fpocket
    - name: Set up Python 3.7
      uses: actions/setup-python@v2
      with:
        python-version: 3.7
    - name: create conda environment
      run: conda env update -f ./tests/environment.yml
    - name : activate
      run : |
        eval "$(conda shell.bash hook)"
        conda activate fpocket_test
        pytest

7
.gitignore vendored
@@ -1,5 +1,8 @@
nbproject
*.o
*_out
.vscode
src/qhull/bin/
src/qhull/bin/
src/qhull/lib/libqhullstatic_r.a
src/qhull/lib/libqhullstatic.a
*.pyc
vmd/plugins

22
Dockerfile-debian-slim Normal file
@@ -0,0 +1,22 @@
FROM debian:bullseye-slim
RUN groupadd -r fpocket && useradd --no-log-init -r -g fpocket fpocket
RUN apt update -y && apt install -y gcc g++ make libnetcdf-dev && rm -rf /var/lib/apt/lists/*
# all of this mess is essentially to have a minimalistic build at the end
COPY makefile /opt/fpocket/
COPY src /opt/fpocket/src
COPY man /opt/fpocket/man
COPY headers /opt/fpocket/headers
COPY obj /opt/fpocket/obj
COPY scripts /opt/fpocket/scripts
COPY bin /opt/fpocket/bin
COPY plugins/LINUXAMD64 /opt/fpocket/plugins/LINUXAMD64
COPY plugins/include /opt/fpocket/plugins/include
COPY plugins/noarch /opt/fpocket/plugins/noarch
WORKDIR /opt/fpocket
RUN make && make install && make clean
USER fpocket
WORKDIR /tmp
CMD ["fpocket"]

29
Dockerfile-molfile-debug Normal file
@@ -0,0 +1,29 @@
FROM ubuntu:latest
RUN groupadd -r fpocket && useradd --no-log-init -r -g fpocket fpocket
ENV DEBIAN_FRONTEND=noninteractive
ENV PLUGINDIR=compiled
RUN apt update -y && apt install -y vim gdb gcc g++ make libnetcdf-dev && rm -rf /var/lib/apt/lists/*
# all of this mess is essentially to have a minimalistic build at the end
COPY vmd /vmd
WORKDIR /vmd/plugins
RUN make LINUXAMD64 && make distrib
COPY makefile /opt/fpocket/
COPY src /opt/fpocket/src
COPY data/sample /opt/fpocket/sample
COPY man /opt/fpocket/man
COPY headers /opt/fpocket/headers
COPY obj /opt/fpocket/obj
COPY scripts /opt/fpocket/scripts
COPY bin /opt/fpocket/bin
COPY plugins/LINUXAMD64 /opt/fpocket/plugins/LINUXAMD64
COPY plugins/include /opt/fpocket/plugins/include
COPY plugins/noarch /opt/fpocket/plugins/noarch
WORKDIR /opt/fpocket
RUN cp -r /vmd/plugins/molfile_plugin/compiled/LINUXAMD64/molfile/* plugins/LINUXAMD64/molfile/
#RUN make && make install && make clean
USER fpocket
WORKDIR /tmp
CMD ["fpocket"]

173
README.md
@@ -1,19 +1,31 @@
# fpocket project
![fpocket logo](doc/images/fpocket_logo.png)
[![Build Status](https://dev.azure.com/3decision/fpocket/_apis/build/status/Discngine.fpocket?branchName=master)](https://dev.azure.com/3decision/fpocket/_build/latest?definitionId=2&branchName=master)
[![Join the chat at https://gitter.im/fpocket/community](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/fpocket-official/community?utm_source=badge&utm_medium=badge&utm_content=badge)
The fpocket suite of programs is a very fast open source protein pocket detection algorithm based on Voronoi tessellation. The platform is suited for the scientific community willing to develop new scoring functions and extract pocket descriptors on a large scale level.
Detailed documentation is available here: [User Manual](doc/MANUAL.md).
The documentation below here is just a quick & rough overview.
## Content
fpocket: the original pocket prediction on a single protein structure
mdpocket: extension of fpocket to analyse conformational ensembles of proteins (MD trajectories for instance)
dpocket: extract pocket descriptors
tpocket: test your pocket scoring function
* __fpocket__ : the original pocket prediction on a single protein structure
* __mdpocket__ : extension of fpocket to analyse conformational ensembles of proteins (MD trajectories for instance)
* __dpocket__ : extract pocket descriptors
* __tpocket__ : test your pocket scoring function
## What's new compared to fpocket 2.0 (old sourceforge repo)
fpocket:
__fpocket__:
- fpocket now supports mmCIF as input and output format together with the classical PDB format
- support was added to define / delete and handle protein chains or sets of them to enable characterization of protein protein binding epitopes
- is now able to consider explicit pockets when you want to calculate properties for a known binding site
- cli changed a bit
- pocket flexibility using temperature factors is better considered (less very flexible pockets on very solvent exposed areas)
- druggability score has been reoptimized vs original paper. Yields now slightly better results than the original implementation.
- compiler bug on newer compilers fixed
mdpocket:
- can now read Gromacs XTC, netcdf and dcd trajectories
- can also read prmtop topologies
@@ -22,12 +34,11 @@ mdpocket:
## Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
### Prerequisites
### Prerequisites (if you want to compile it)
The most recent versions (starting with fpocket 3.0) make use of the molfile plugin from VMD. This plugin is shipped with fpocket. However, now you need to install the netcdf library on your system. This is typically called netcdf-devel or so, depending on your linux distribution.
fpocket needs to be compiled to run on your machine. For this you'll need the gnu c compiler (or another one, but didn't test with others than GCC).
fpocket needs to be compiled to run on your machine. For this you'll need the gnu c compiler (or another one).
To install the netcdf development library on ubuntu, type:
```
sudo apt-get install libnetcdf-dev
@@ -37,6 +48,44 @@ on a RHEL based distribution something like this should do:
sudo yum install netcdf-devel.x86_64
```
on OSX:
Install MacPorts https://www.macports.org/ for instance (needed for netcdf install)
```bash
sudo port install netcdf
export LIBRARY_PATH=/opt/local/lib
```
### Docker Image
#### Using the official fpocket docker image
The following command will pull the latest fpocket docker image from the dockerhub.
```bash
docker pull fpocket/fpocket
```
#### Building the docker image
You can create a docker image with fpocket using the provided Dockerfile of the repo (obviously you'd need docker to do that):
```bash
docker build -t fpocket/fpocket .
```
#### Using the docker image
This will build fpocket into your local fpocket/fpocket image. You can then run fpocket/mdpocket etc using:
```bash
docker run -v `pwd`:/WORKDIR fpocket/fpocket fpocket -f data/sample/1UYD.pdb
```
Here you mount your current directory with your input files into the preconfigured `/WORKDIR` in the docker container and then run fpocket on a file in that mounted folder.
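As a rough illustration of the mount described above, the same command can be assembled from a script. This is only a sketch: `build_docker_command` is a hypothetical helper that is not part of fpocket, and it assumes the `fpocket/fpocket` image name and the `/WORKDIR` mount point mentioned in this README.

```python
from pathlib import Path
import shlex


def build_docker_command(host_dir, structure_file, image="fpocket/fpocket"):
    """Build the argument list for running fpocket inside the docker image.

    host_dir is mounted at /WORKDIR in the container (mount point taken
    from the README text; the helper itself is hypothetical).
    """
    host_dir = Path(host_dir).resolve()
    return [
        "docker", "run",
        "-v", f"{host_dir}:/WORKDIR",  # mount the folder holding the input files
        image,
        "fpocket", "-f", structure_file,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so docker is not required here.
    cmd = build_docker_command(Path.cwd(), "data/sample/1UYD.pdb")
    print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` would start the container; that step is left out so the sketch can be tried without docker installed.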
### Installing
Download the sources from github via the website or using git clone and then build and deploy fpocket using the following commands.
@@ -51,93 +100,91 @@ sudo make install
```
#### Compiling on Mac
Install MacPorts https://www.macports.org/ for instance (needed for netcdf install)
```
sudo port install netcdf
export LIBRARY_PATH=/opt/local/lib
git clone https://github.com/Discngine/fpocket.git
cd fpocket
make ARCH=MACOSXX86_64
sudo make install
```
#### Using conda
There's also a conda package of fpocket available thanks to Simon Bray. You can install fpocket using conda with:
```
conda config --add channels conda-forge
conda install fpocket
```
End with an example of getting some data out of the system or using it for a little demo
#### Testing your installation
## Running the tests
In order to test if the compilation went well you can compare results from fpocket sample files to reference results shipped with fpocket. The easiest way to do that is by using pytest. If you do not have pytest yet, you can install the required library using the conda environment file in the tests folder:
The source code of fpocket is shipped with samples. They can be found in the data/sample folder. Try to run fpocket against the 1uyd sample to check if it's running OK.
```bash
conda env create -f tests/environment.yml
conda activate fpocket_test
```
Once your conda environment is activated, you can run
```
cd data/sample
fpocket -f 1UYD.pdb
```
fpocket should state when it's beginning to search pocket and also when it's ending the search. Upon completion the folder should now contain a folder called 1UYD_out. Check whether the folder exists and the pdb files contain data and the pocket info file contains results.
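The manual check described above could be scripted along these lines. This is a minimal sketch, not part of fpocket: `check_fpocket_output` is a hypothetical helper, and the `*_info.txt` pattern for the pocket info file is an assumption that may need adjusting for your fpocket version.

```python
from pathlib import Path


def check_fpocket_output(out_dir, info_pattern="*_info.txt"):
    """Return a list of problems found in an fpocket output folder (empty if none)."""
    out_dir = Path(out_dir)
    if not out_dir.is_dir():
        return [f"{out_dir} does not exist"]

    problems = []
    # Look for PDB files anywhere below the output folder.
    pdb_files = sorted(out_dir.rglob("*.pdb"))
    if not pdb_files:
        problems.append("no .pdb files found")
    # Assumption: the pocket info file matches info_pattern.
    info_files = sorted(out_dir.glob(info_pattern))
    if not info_files:
        problems.append(f"no file matching {info_pattern} found")
    # Every file that was found should contain some data.
    for path in pdb_files + info_files:
        if path.stat().st_size == 0:
            problems.append(f"{path.name} is empty")
    return problems


if __name__ == "__main__":
    for problem in check_fpocket_output("1UYD_out"):
        print(problem)
```

An empty list means the folder exists and the files that were found are non-empty; it says nothing about whether the detected pockets are correct.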
## User Manual
For now the user manual (still the one from fpocket 2.0) can be found in the doc folder. When I have some time to kill (or if somebody else has) we could add that here somewhere.
## Frequent issues encountered
### netcdf issues
```
cannot find -lnetcdf
```
mdpocket supports reading and writing NETCDF formatted files. In order to use this you need to install the netcdf development libraries on your system.
In centos this can be achieved like this :
```
yum install -y epel-release #if the epel repo is not yet activated on your system
yum install -y netcdf-devel
pytest
```
Run make again after installing this library. Mdpocket should build just fine now.
If everything works fine you should get something like this output here:
```bash
(fpocket_test) Mac-Pro:fpocket peter$ pytest
============================================================= test session starts ==============================================================
platform darwin -- Python 3.7.7, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
rootdir: /Users/peter/Documents/Work/fpocket_git/fpocket
collected 4 items
tests/test_fpocket.py .... [100%]
============================================================== 4 passed in 40.92s ==============================================================
### stdc++ issues
```
cannot find -lstdc++
```
You need to install the stdc++ static libraries to build fpocket & mdpocket. On centos 7.4 this can be done like this :
```
yum install -y libstdc++-static
If something fails in there you'll have a rather verbose and red output ... trust me you'll notice and panic ;)
### Running fpocket
You can run fpocket using the following command line as an example:
```bash
fpocket -f 1uyd.pdb
```
### linking to molfile plugin issues
If you observe an error similar to this one
fpocket now also eats cif as input, so this would work as well. Make sure to use proper file extensions
```bash
fpocket -f 1uyd.cif
```
ld: warning: ignoring file plugins/MACOSXX86/molfile/libmolfile_plugin.a, file was built for archive which is not the architecture being linked (x86_64): plugins/MACOSXX86/molfile/libmolfile_plugin.a
Undefined symbols for architecture x86_64:
"_molfile_parm7plugin_init", referenced from:
_read_topology in topology.o
"_molfile_parm7plugin_register", referenced from:
_read_topology in topology.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [bin/fpocket] Error 1
make: *** [all] Error 2
This will detect all pockets on the input pdb file, named 1uyd.pdb
If you want to get all command line args for fpocket, simply type `fpocket`
### Running mdpocket
To detect all pockets and create a pocket frequency grid on a sample input trajectory in an xtc format for instance you can run:
```bash
mdpocket --trajectory_file input.xtc --trajectory_format xtc -f topology.pdb
```
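For batch runs, the mdpocket call above can be put together programmatically. The sketch below is hypothetical and only uses the three options shown in this README (`--trajectory_file`, `--trajectory_format`, `-f`); deriving the format name from the file extension is an assumption that matches the `xtc` example and may not hold for other formats.

```python
from pathlib import Path


def build_mdpocket_command(trajectory, topology, trajectory_format=None):
    """Build the mdpocket argument list for one trajectory (hypothetical helper)."""
    if trajectory_format is None:
        # Assumption: the format name equals the file extension, as for "xtc".
        trajectory_format = Path(trajectory).suffix.lstrip(".").lower()
    if not trajectory_format:
        raise ValueError("cannot infer the trajectory format; pass it explicitly")
    return [
        "mdpocket",
        "--trajectory_file", str(trajectory),
        "--trajectory_format", trajectory_format,
        "-f", str(topology),
    ]


if __name__ == "__main__":
    print(" ".join(build_mdpocket_command("input.xtc", "topology.pdb")))
```

When the extension does not match the format name mdpocket expects, pass `trajectory_format` explicitly.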
then the statically built libmolfile_plugin is not compatible with your machine. First check that the ARCH variable set in the first line of the Makefile of fpocket actually reflects the architecture you want. For now I'm trying to support 64 bit linux systems (LINUXAMD64) and 64 bit OSX systems built with clang (MACOSXX86). So both should work out of the box. If they do not, you might need to build the molfile plugin for your architecture. All available system architectures for the molfile plugin can be found in the plugins folder tree: [plugins directory](https://github.com/Discngine/fpocket/tree/master/plugins).
Here you can find more information on how to build the molfile plugin on CentOS 7.4:
[compile molfile plugin on centos 7.4 - Discngine blog post](https://www.discngine.com/blog/2019/5/25/building-the-vmd-molfile-plugin-from-source-code)
Once built, copy the architecture folder into the fpocket/plugins directory and make sure to declare this architecture in the ARCH variable in the Makefile. Finally run make again.
If you manage to build for other architectures and it works, I'd be happy to accept PR's with the relevant plugin architectures as I cannot build all of them on my own ;).
## Detailed User Manual
You can access the detailed user manual here: [User Manual](doc/MANUAL.md)
## Contributing
Please read [CONTRIBUTING.md](https://gist.github.com/PurpleBooth/b24679402957c63ec426) for details on our code of conduct, and the process for submitting pull requests to us.
## Authors
* **Peter Schmidtke** - *Initial work* - [pschmidtke](https://github.com/pschmidtke)
* **Vincent Le Guilloux** - *Initial work* - [leguilv](https://github.com/leguilv)
* **Mael Shorkar** - *Chain handling, MMCIF support* - [shorkarmael](https://github.com/shorkarmael)
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details
## Acknowledgments
* to be filled

49
azure-pipelines.yml Normal file
@@ -0,0 +1,49 @@
# C/C++ with GCC
# Build your C/C++ project with GCC using make.
# Add steps that publish test results, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/apps/c-cpp/gcc
trigger: none # will disable CI builds entirely i.e. merges, check-ins do not trigger this build
pr:
- master # trigger on a PR to master
jobs:
- job: build_and_test_linux
  pool:
    vmImage: 'ubuntu-latest'
  steps:
  - script: conda env create --file tests/environment.yml --name fpocket_test
    displayName: Create Anaconda environment
  - script: conda env list
    displayName: environment installation verification
  - task: Bash@3
    inputs:
      targetType: 'inline'
      script: |
        eval "$(conda shell.bash hook)"
        sudo apt-get install libnetcdf-dev
        conda activate fpocket_test
        make
        pytest
    displayName: Active
- job: build_and_test_mac
  pool:
    vmImage: 'macOS-10.15'
  steps:
  - bash: sudo chown -R $USER $CONDA
  - script: conda env create --file tests/environment.yml --name fpocket_test
    displayName: Create Anaconda environment
  - script: conda env list
    displayName: environment installation verification
  - task: Bash@3
    inputs:
      targetType: 'inline'
      script: |
        eval "$(conda shell.bash hook)"
        conda activate fpocket_test
        make fpocket ARCH=MACOSXX86_64
        pytest
    displayName: Active

@@ -1,38 +0,0 @@
============================================================================
FPOCKET BUG REPORT
============================================================================
Your name :
Your email address :
System Configuration
- ---------------------
Architecture (example: Intel Pentium, AMD Opteron) :
Operating System (example: openSuse, Mac OS X) :
Fpocket version (example: Fpocket-1.0) :
Compiler used (example: gcc 2.7.2) :
Please enter a FULL description of your problem:
- ------------------------------------------------
Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
- ----------------------------------------------------------------------
If you know how this problem might be fixed, list the solution below:
- ---------------------------------------------------------------------
=============================================================================
!!! Please send this bug report to fpocket-support@lists.sourceforge.net !!!
!!! The development team will try to fix your problem as soon as possible !!!
=============================================================================

@@ -1,16 +0,0 @@
>>>>>>>>>>>>>>> FPocket V1.0 -> V1.1 <<<<<<<<<<<<<<<
>>>>>>>>>>>>>>> FPocket V1.0 -> V1.0 <<<<<<<<<<<<<<<
***** MAJOR CHANGES
19-02-2010 (vl) Created (not too soon... ;) )
*****MINOR CHANGES
19-02-2010 (vl) Created (not too soon... ;) )
***** GENERAL COMMENTS

@@ -1,32 +0,0 @@
# -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.
# Need the 2.59 version of autoconf
AC_PREREQ(2.59)
# Intialize autoconf, specifiting package name, version and authors
AC_INIT([fpocket], 1.0, [P. Schmidtke <pschmidtke@mmb.pcb.ub.es> and V. Le Guilloux <vincent.le-guilloux@univ-orleans.fr>])
# Check that we run autoconf in the right directory
AC_CONFIG_SRCDIR([src/fpmain.c])
# Auxiliary scripts such as install-sh and depcomp should be in builx-aux
# AC_CONFIG_AUX_DIR([build-aux])
# Initialize automake (foreign package for the moment)
AM_INIT_AUTOMAKE([-Wall -Werror foreign])
# Check for C compiler and make utility
AC_PROG_CC
AC_PROG_MAKE_SET
# Declare output header and output files (Makefiles)
AC_CONFIG_HEADERS([config.h])
AC_CONFIG_FILES([
Makefile
man/Makefile
bin/Makefile
])
# Launch the tool
AC_OUTPUT

7
data/.gitignore vendored Normal file
@@ -0,0 +1,7 @@
nbproject
*.o
*.out
.vscode
src/qhull/bin/
src/qhull/lib/libqhullstatic_r.a
src/qhull/lib/libqhullstatic.a

3697
data/sample/1ATP.pdb Normal file

File diff suppressed because it is too large

4894
data/sample/1QNH.cif Normal file

File diff suppressed because it is too large

3678
data/sample/1UYD.cif Normal file

File diff suppressed because it is too large

1969
data/sample/1UYD_wrote.cif Normal file

File diff suppressed because it is too large

1952
data/sample/1UYD_wrote.pdb Normal file

File diff suppressed because it is too large

8291
data/sample/1g50.cif Normal file

File diff suppressed because it is too large

6764
data/sample/1g50.pdb Normal file

File diff suppressed because it is too large

1555
data/sample/1orc.cif Normal file

File diff suppressed because it is too large

877
data/sample/1orc.pdb Normal file
@@ -0,0 +1,877 @@
HEADER GENE REGULATING PROTEIN 30-OCT-95 1ORC
TITLE CRO REPRESSOR INSERTION MUTANT K56-[DGEVK]
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: CRO REPRESSOR INSERTION MUTANT K56-[DGEVK];
COMPND 3 CHAIN: A;
COMPND 4 ENGINEERED: YES;
COMPND 5 MUTATION: YES;
COMPND 6 OTHER_DETAILS: RESULTS IN A 71-RESIDUE STABLE "MONOMER"
COMPND 7 MUTANT
SOURCE MOL_ID: 1;
SOURCE 2 ORGANISM_SCIENTIFIC: ENTEROBACTERIA PHAGE LAMBDA;
SOURCE 3 ORGANISM_TAXID: 10710;
SOURCE 4 GENE: CRO MUTANT K56-[DGEVK];
SOURCE 5 EXPRESSION_SYSTEM: ESCHERICHIA COLI;
SOURCE 6 EXPRESSION_SYSTEM_TAXID: 562;
SOURCE 7 EXPRESSION_SYSTEM_GENE: CRO MUTANT K56-[DGEVK]
KEYWDS GENE REGULATING PROTEIN
EXPDTA X-RAY DIFFRACTION
AUTHOR R.A.ALBRIGHT,M.C.MOSSING,B.W.MATTHEWS
REVDAT 2 24-FEB-09 1ORC 1 VERSN
REVDAT 1 23-DEC-96 1ORC 0
JRNL AUTH R.A.ALBRIGHT,M.C.MOSSING,B.W.MATTHEWS
JRNL TITL HIGH-RESOLUTION STRUCTURE OF AN ENGINEERED CRO
JRNL TITL 2 MONOMER SHOWS CHANGES IN CONFORMATION RELATIVE TO
JRNL TITL 3 THE NATIVE DIMER.
JRNL REF BIOCHEMISTRY V. 35 735 1996
JRNL REFN ISSN 0006-2960
JRNL PMID 8547253
JRNL DOI 10.1021/BI951958N
REMARK 1
REMARK 1 REFERENCE 1
REMARK 1 AUTH M.C.MOSSING,R.T.SAUER
REMARK 1 TITL STABLE, MONOMERIC VARIANTS OF LAMBDA CRO OBTAINED
REMARK 1 TITL 2 BY INSERTION OF A DESIGNED BETA-HAIRPIN SEQUENCE
REMARK 1 REF SCIENCE V. 250 1712 1990
REMARK 1 REFN ISSN 0036-8075
REMARK 2
REMARK 2 RESOLUTION. 1.54 ANGSTROMS.
REMARK 3
REMARK 3 REFINEMENT.
REMARK 3 PROGRAM : TNT
REMARK 3 AUTHORS : TRONRUD,TEN EYCK,MATTHEWS
REMARK 3
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH (ANGSTROMS) : 1.54
REMARK 3 RESOLUTION RANGE LOW (ANGSTROMS) : 20.00
REMARK 3 DATA CUTOFF (SIGMA(F)) : 0.000
REMARK 3 COMPLETENESS FOR RANGE (%) : NULL
REMARK 3 NUMBER OF REFLECTIONS : 9834
REMARK 3
REMARK 3 USING DATA ABOVE SIGMA CUTOFF.
REMARK 3 CROSS-VALIDATION METHOD : NULL
REMARK 3 FREE R VALUE TEST SET SELECTION : NULL
REMARK 3 R VALUE (WORKING + TEST SET) : NULL
REMARK 3 R VALUE (WORKING SET) : 0.178
REMARK 3 FREE R VALUE : NULL
REMARK 3 FREE R VALUE TEST SET SIZE (%) : NULL
REMARK 3 FREE R VALUE TEST SET COUNT : NULL
REMARK 3
REMARK 3 USING ALL DATA, NO SIGMA CUTOFF.
REMARK 3 R VALUE (WORKING + TEST SET, NO CUTOFF) : NULL
REMARK 3 R VALUE (WORKING SET, NO CUTOFF) : 0.1780
REMARK 3 FREE R VALUE (NO CUTOFF) : NULL
REMARK 3 FREE R VALUE TEST SET SIZE (%, NO CUTOFF) : NULL
REMARK 3 FREE R VALUE TEST SET COUNT (NO CUTOFF) : NULL
REMARK 3 TOTAL NUMBER OF REFLECTIONS (NO CUTOFF) : 9834
REMARK 3
REMARK 3 NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT.
REMARK 3 PROTEIN ATOMS : 500
REMARK 3 NUCLEIC ACID ATOMS : 0
REMARK 3 HETEROGEN ATOMS : 0
REMARK 3 SOLVENT ATOMS : 59
REMARK 3
REMARK 3 WILSON B VALUE (FROM FCALC, A**2) : NULL
REMARK 3
REMARK 3 RMS DEVIATIONS FROM IDEAL VALUES. RMS WEIGHT COUNT
REMARK 3 BOND LENGTHS (A) : 0.015 ; NULL ; NULL
REMARK 3 BOND ANGLES (DEGREES) : 2.300 ; NULL ; NULL
REMARK 3 TORSION ANGLES (DEGREES) : NULL ; NULL ; NULL
REMARK 3 PSEUDOROTATION ANGLES (DEGREES) : NULL ; NULL ; NULL
REMARK 3 TRIGONAL CARBON PLANES (A) : NULL ; NULL ; NULL
REMARK 3 GENERAL PLANES (A) : NULL ; NULL ; NULL
REMARK 3 ISOTROPIC THERMAL FACTORS (A**2) : NULL ; NULL ; NULL
REMARK 3 NON-BONDED CONTACTS (A) : NULL ; NULL ; NULL
REMARK 3
REMARK 3 INCORRECT CHIRAL-CENTERS (COUNT) : NULL
REMARK 3
REMARK 3 BULK SOLVENT MODELING.
REMARK 3 METHOD USED : NULL
REMARK 3 KSOL : NULL
REMARK 3 BSOL : NULL
REMARK 3
REMARK 3 RESTRAINT LIBRARIES.
REMARK 3 STEREOCHEMISTRY : NULL
REMARK 3 ISOTROPIC THERMAL FACTOR RESTRAINTS : NULL
REMARK 3
REMARK 3 OTHER REFINEMENT REMARKS: BOTH TERMINI OF CRO K56-[DGEVK]ARE
REMARK 3 DISORDERED. RESIDUES 1, 2, 62, 63, 64, 65, AND 66 HAVE NO
REMARK 3 INTERPRETABLE DENSITY AND ARE NOT INCLUDED IN THE MODEL.
REMARK 3 RESIDUES 3 AND 61 EXTEN AWAY FROM THE GLOBULAR PORTION OF THE
REMARK 3 MOLECULE AND ARE MOST PROBABLY IN MULTIPLE CONFORMATIONS. THE
REMARK 3 GLOBULAR PORTION OF THE MOLECULE (RESIDUES 4 - 59) IS
REMARK 3 EXCEPTIONALLY WELL-DEFINED IN THE DENSITY.
REMARK 4
REMARK 4 1ORC COMPLIES WITH FORMAT V. 3.15, 01-DEC-08
REMARK 100
REMARK 100 THIS ENTRY HAS BEEN PROCESSED BY BNL.
REMARK 200
REMARK 200 EXPERIMENTAL DETAILS
REMARK 200 EXPERIMENT TYPE : X-RAY DIFFRACTION
REMARK 200 DATE OF DATA COLLECTION : 26-SEP-93
REMARK 200 TEMPERATURE (KELVIN) : NULL
REMARK 200 PH : NULL
REMARK 200 NUMBER OF CRYSTALS USED : NULL
REMARK 200
REMARK 200 SYNCHROTRON (Y/N) : N
REMARK 200 RADIATION SOURCE : NULL
REMARK 200 BEAMLINE : NULL
REMARK 200 X-RAY GENERATOR MODEL : NULL
REMARK 200 MONOCHROMATIC OR LAUE (M/L) : M
REMARK 200 WAVELENGTH OR RANGE (A) : 1.5418
REMARK 200 MONOCHROMATOR : NULL
REMARK 200 OPTICS : NULL
REMARK 200
REMARK 200 DETECTOR TYPE : AREA DETECTOR
REMARK 200 DETECTOR MANUFACTURER : XUONG-HAMLIN MULTIWIRE
REMARK 200 INTENSITY-INTEGRATION SOFTWARE : SDMS
REMARK 200 DATA SCALING SOFTWARE : NULL
REMARK 200
REMARK 200 NUMBER OF UNIQUE REFLECTIONS : 9834
REMARK 200 RESOLUTION RANGE HIGH (A) : NULL
REMARK 200 RESOLUTION RANGE LOW (A) : NULL
REMARK 200 REJECTION CRITERIA (SIGMA(I)) : 0.000
REMARK 200
REMARK 200 OVERALL.
REMARK 200 COMPLETENESS FOR RANGE (%) : 95.8
REMARK 200 DATA REDUNDANCY : 4.600
REMARK 200 R MERGE (I) : 0.02900
REMARK 200 R SYM (I) : NULL
REMARK 200 <I/SIGMA(I)> FOR THE DATA SET : NULL
REMARK 200
REMARK 200 IN THE HIGHEST RESOLUTION SHELL.
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE HIGH (A) : NULL
REMARK 200 HIGHEST RESOLUTION SHELL, RANGE LOW (A) : NULL
REMARK 200 COMPLETENESS FOR SHELL (%) : NULL
REMARK 200 DATA REDUNDANCY IN SHELL : NULL
REMARK 200 R MERGE FOR SHELL (I) : NULL
REMARK 200 R SYM FOR SHELL (I) : NULL
REMARK 200 <I/SIGMA(I)> FOR SHELL : NULL
REMARK 200
REMARK 200 DIFFRACTION PROTOCOL: NULL
REMARK 200 METHOD USED TO DETERMINE THE STRUCTURE: NULL
REMARK 200 SOFTWARE USED: NULL
REMARK 200 STARTING MODEL: NULL
REMARK 200
REMARK 200 REMARK: NULL
REMARK 280
REMARK 280 CRYSTAL
REMARK 280 SOLVENT CONTENT, VS (%): 40.87
REMARK 280 MATTHEWS COEFFICIENT, VM (ANGSTROMS**3/DA): 2.08
REMARK 280
REMARK 280 CRYSTALLIZATION CONDITIONS: NULL
REMARK 290
REMARK 290 CRYSTALLOGRAPHIC SYMMETRY
REMARK 290 SYMMETRY OPERATORS FOR SPACE GROUP: P 21 21 21
REMARK 290
REMARK 290 SYMOP SYMMETRY
REMARK 290 NNNMMM OPERATOR
REMARK 290 1555 X,Y,Z
REMARK 290 2555 -X+1/2,-Y,Z+1/2
REMARK 290 3555 -X,Y+1/2,-Z+1/2
REMARK 290 4555 X+1/2,-Y+1/2,-Z
REMARK 290
REMARK 290 WHERE NNN -> OPERATOR NUMBER
REMARK 290 MMM -> TRANSLATION VECTOR
REMARK 290
REMARK 290 CRYSTALLOGRAPHIC SYMMETRY TRANSFORMATIONS
REMARK 290 THE FOLLOWING TRANSFORMATIONS OPERATE ON THE ATOM/HETATM
REMARK 290 RECORDS IN THIS ENTRY TO PRODUCE CRYSTALLOGRAPHICALLY
REMARK 290 RELATED MOLECULES.
REMARK 290 SMTRY1 1 1.000000 0.000000 0.000000 0.00000
REMARK 290 SMTRY2 1 0.000000 1.000000 0.000000 0.00000
REMARK 290 SMTRY3 1 0.000000 0.000000 1.000000 0.00000
REMARK 290 SMTRY1 2 -1.000000 0.000000 0.000000 17.38500
REMARK 290 SMTRY2 2 0.000000 -1.000000 0.000000 0.00000
REMARK 290 SMTRY3 2 0.000000 0.000000 1.000000 24.15500
REMARK 290 SMTRY1 3 -1.000000 0.000000 0.000000 0.00000
REMARK 290 SMTRY2 3 0.000000 1.000000 0.000000 19.58500
REMARK 290 SMTRY3 3 0.000000 0.000000 -1.000000 24.15500
REMARK 290 SMTRY1 4 1.000000 0.000000 0.000000 17.38500
REMARK 290 SMTRY2 4 0.000000 -1.000000 0.000000 19.58500
REMARK 290 SMTRY3 4 0.000000 0.000000 -1.000000 0.00000
REMARK 290
REMARK 290 REMARK: NULL
REMARK 300
REMARK 300 BIOMOLECULE: 1
REMARK 300 SEE REMARK 350 FOR THE AUTHOR PROVIDED AND/OR PROGRAM
REMARK 300 GENERATED ASSEMBLY INFORMATION FOR THE STRUCTURE IN
REMARK 300 THIS ENTRY. THE REMARK MAY ALSO PROVIDE INFORMATION ON
REMARK 300 BURIED SURFACE AREA.
REMARK 350
REMARK 350 COORDINATES FOR A COMPLETE MULTIMER REPRESENTING THE KNOWN
REMARK 350 BIOLOGICALLY SIGNIFICANT OLIGOMERIZATION STATE OF THE
REMARK 350 MOLECULE CAN BE GENERATED BY APPLYING BIOMT TRANSFORMATIONS
REMARK 350 GIVEN BELOW. BOTH NON-CRYSTALLOGRAPHIC AND
REMARK 350 CRYSTALLOGRAPHIC OPERATIONS ARE GIVEN.
REMARK 350
REMARK 350 BIOMOLECULE: 1
REMARK 350 AUTHOR DETERMINED BIOLOGICAL UNIT: MONOMERIC
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A
REMARK 350 BIOMT1 1 1.000000 0.000000 0.000000 0.00000
REMARK 350 BIOMT2 1 0.000000 1.000000 0.000000 0.00000
REMARK 350 BIOMT3 1 0.000000 0.000000 1.000000 0.00000
REMARK 465
REMARK 465 MISSING RESIDUES
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE
REMARK 465 EXPERIMENT. (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN
REMARK 465 IDENTIFIER; SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)
REMARK 465
REMARK 465 M RES C SSSEQI
REMARK 465 MET A 1
REMARK 465 GLU A 2
REMARK 465 LYS A 62
REMARK 465 LYS A 63
REMARK 465 THR A 64
REMARK 465 THR A 65
REMARK 465 ALA A 66
REMARK 470
REMARK 470 MISSING ATOM
REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS(M=MODEL NUMBER;
REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;
REMARK 470 I=INSERTION CODE):
REMARK 470 M RES CSSEQI ATOMS
REMARK 470 LYS A 21 CG CD CE NZ
REMARK 500
REMARK 500 GEOMETRY AND STEREOCHEMISTRY
REMARK 500 SUBTOPIC: COVALENT BOND LENGTHS
REMARK 500
REMARK 500 THE STEREOCHEMICAL PARAMETERS OF THE FOLLOWING RESIDUES
REMARK 500 HAVE VALUES WHICH DEVIATE FROM EXPECTED VALUES BY MORE
REMARK 500 THAN 6*RMSD (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN
REMARK 500 IDENTIFIER; SSEQ=SEQUENCE NUMBER; I=INSERTION CODE).
REMARK 500
REMARK 500 STANDARD TABLE:
REMARK 500 FORMAT: (10X,I3,1X,2(A3,1X,A1,I4,A1,1X,A4,3X),1X,F6.3)
REMARK 500
REMARK 500 EXPECTED VALUES PROTEIN: ENGH AND HUBER, 1999
REMARK 500 EXPECTED VALUES NUCLEIC ACID: CLOWNEY ET AL 1996
REMARK 500
REMARK 500 M RES CSSEQI ATM1 RES CSSEQI ATM2 DEVIATION
REMARK 500 GLU A 54 CD GLU A 54 OE2 0.069
REMARK 500
REMARK 500 REMARK: NULL
REMARK 500
REMARK 500 GEOMETRY AND STEREOCHEMISTRY
REMARK 500 SUBTOPIC: COVALENT BOND ANGLES
REMARK 500
REMARK 500 THE STEREOCHEMICAL PARAMETERS OF THE FOLLOWING RESIDUES
REMARK 500 HAVE VALUES WHICH DEVIATE FROM EXPECTED VALUES BY MORE
REMARK 500 THAN 6*RMSD (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN
REMARK 500 IDENTIFIER; SSEQ=SEQUENCE NUMBER; I=INSERTION CODE).
REMARK 500
REMARK 500 STANDARD TABLE:
REMARK 500 FORMAT: (10X,I3,1X,A3,1X,A1,I4,A1,3(1X,A4,2X),12X,F5.1)
REMARK 500
REMARK 500 EXPECTED VALUES PROTEIN: ENGH AND HUBER, 1999
REMARK 500 EXPECTED VALUES NUCLEIC ACID: CLOWNEY ET AL 1996
REMARK 500
REMARK 500 M RES CSSEQI ATM1 ATM2 ATM3
REMARK 500 ASP A 9 CB - CG - OD2 ANGL. DEV. = -5.6 DEGREES
REMARK 500 ASP A 47 CB - CG - OD1 ANGL. DEV. = 6.5 DEGREES
REMARK 500 ASP A 47 CB - CG - OD2 ANGL. DEV. = -5.5 DEGREES
REMARK 500 ASP A 56A CB - CG - OD2 ANGL. DEV. = -6.3 DEGREES
REMARK 500
REMARK 500 REMARK: NULL
REMARK 525
REMARK 525 SOLVENT
REMARK 525
REMARK 525 THE SOLVENT MOLECULES HAVE CHAIN IDENTIFIERS THAT
REMARK 525 INDICATE THE POLYMER CHAIN WITH WHICH THEY ARE MOST
REMARK 525 CLOSELY ASSOCIATED. THE REMARK LISTS ALL THE SOLVENT
REMARK 525 MOLECULES WHICH ARE MORE THAN 5A AWAY FROM THE
REMARK 525 NEAREST POLYMER CHAIN (M = MODEL NUMBER;
REMARK 525 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE
REMARK 525 NUMBER; I=INSERTION CODE):
REMARK 525
REMARK 525 M RES CSSEQI
REMARK 525 HOH A 148 DISTANCE = 5.42 ANGSTROMS
DBREF 1ORC A 1 66 UNP P03040 RCRO_LAMBD 1 66
SEQADV 1ORC GLU A 54 UNP P03040 INSERTION
SEQADV 1ORC VAL A 55 UNP P03040 INSERTION
SEQADV 1ORC LYS A 56 UNP P03040 INSERTION
SEQADV 1ORC ASP A 56A UNP P03040 INSERTION
SEQADV 1ORC GLY A 56B UNP P03040 INSERTION
SEQRES 1 A 71 MET GLU GLN ARG ILE THR LEU LYS ASP TYR ALA MET ARG
SEQRES 2 A 71 PHE GLY GLN THR LYS THR ALA LYS ASP LEU GLY VAL TYR
SEQRES 3 A 71 GLN SER ALA ILE ASN LYS ALA ILE HIS ALA GLY ARG LYS
SEQRES 4 A 71 ILE PHE LEU THR ILE ASN ALA ASP GLY SER VAL TYR ALA
SEQRES 5 A 71 GLU GLU VAL LYS ASP GLY GLU VAL LYS PRO PHE PRO SER
SEQRES 6 A 71 ASN LYS LYS THR THR ALA
FORMUL 2 HOH *57(H2 O)
HELIX 1 1 LEU A 7 PHE A 14 1 8
HELIX 2 2 GLN A 16 LEU A 23 1 8
HELIX 3 3 GLN A 27 HIS A 35 1 9
SHEET 1 A 3 LYS A 39 ILE A 44 0
SHEET 2 A 3 VAL A 50 LYS A 56 -1 N VAL A 55 O LYS A 39
SHEET 3 A 3 GLU A 56C PRO A 57 -1 N LYS A 56E O GLU A 54
CISPEP 1 PHE A 58 PRO A 59 0 -0.65
CRYST1 34.770 39.170 48.310 90.00 90.00 90.00 P 21 21 21 4
ORIGX1 1.000000 0.000000 0.000000 0.00000
ORIGX2 0.000000 1.000000 0.000000 0.00000
ORIGX3 0.000000 0.000000 1.000000 0.00000
SCALE1 0.028760 0.000000 0.000000 0.00000
SCALE2 0.000000 0.025530 0.000000 0.00000
SCALE3 0.000000 0.000000 0.020700 0.00000
ATOM 1 N GLN A 3 12.772 36.309 7.065 1.00100.00 N
ATOM 2 CA GLN A 3 12.632 37.265 8.163 1.00 48.14 C
ATOM 3 C GLN A 3 13.732 37.165 9.263 1.00 52.27 C
ATOM 4 O GLN A 3 14.053 36.101 9.788 1.00 44.13 O
ATOM 5 CB GLN A 3 11.223 37.222 8.746 1.00 74.94 C
ATOM 6 CG GLN A 3 10.520 38.577 8.616 1.00100.00 C
ATOM 7 CD GLN A 3 10.533 39.326 9.931 1.00100.00 C
ATOM 8 OE1 GLN A 3 9.824 38.926 10.900 1.00100.00 O
ATOM 9 NE2 GLN A 3 11.382 40.376 9.990 1.00 83.72 N
ATOM 10 N ARG A 4 14.321 38.294 9.634 1.00 25.53 N
ATOM 11 CA ARG A 4 15.360 38.275 10.642 1.00 18.68 C
ATOM 12 C ARG A 4 14.763 38.034 12.028 1.00 21.79 C
ATOM 13 O ARG A 4 13.620 38.418 12.306 1.00 21.16 O
ATOM 14 CB ARG A 4 16.095 39.616 10.699 1.00 18.77 C
ATOM 15 CG ARG A 4 16.778 40.003 9.403 1.00 24.51 C
ATOM 16 CD ARG A 4 17.273 41.464 9.421 1.00 32.20 C
ATOM 17 NE ARG A 4 18.202 41.755 8.324 1.00 36.56 N
ATOM 18 CZ ARG A 4 17.806 41.955 7.070 1.00 84.82 C
ATOM 19 NH1 ARG A 4 16.515 41.892 6.756 1.00 46.36 N
ATOM 20 NH2 ARG A 4 18.690 42.219 6.117 1.00 31.29 N
ATOM 21 N ILE A 5 15.549 37.392 12.890 1.00 17.55 N
ATOM 22 CA ILE A 5 15.183 37.125 14.284 1.00 14.87 C
ATOM 23 C ILE A 5 16.203 37.872 15.172 1.00 16.47 C
ATOM 24 O ILE A 5 17.401 37.815 14.843 1.00 17.02 O
ATOM 25 CB ILE A 5 15.285 35.595 14.525 1.00 15.08 C
ATOM 26 CG1 ILE A 5 14.275 34.892 13.658 1.00 40.26 C
ATOM 27 CG2 ILE A 5 15.049 35.246 15.983 1.00 14.47 C
ATOM 28 CD1 ILE A 5 14.495 33.386 13.673 1.00 45.11 C
ATOM 29 N THR A 6 15.800 38.526 16.285 1.00 13.28 N
ATOM 30 CA THR A 6 16.814 39.214 17.093 1.00 11.96 C
ATOM 31 C THR A 6 17.671 38.212 17.779 1.00 14.04 C
ATOM 32 O THR A 6 17.266 37.055 17.907 1.00 14.20 O
ATOM 33 CB THR A 6 16.173 40.127 18.156 1.00 14.58 C
ATOM 34 OG1 THR A 6 15.450 39.272 19.028 1.00 17.32 O
ATOM 35 CG2 THR A 6 15.146 41.042 17.451 1.00 14.30 C
ATOM 36 N LEU A 7 18.849 38.618 18.202 1.00 13.73 N
ATOM 37 CA LEU A 7 19.728 37.670 18.894 1.00 13.16 C
ATOM 38 C LEU A 7 19.041 37.135 20.139 1.00 15.87 C
ATOM 39 O LEU A 7 19.090 35.912 20.454 1.00 12.79 O
ATOM 40 CB LEU A 7 21.048 38.383 19.263 1.00 12.68 C
ATOM 41 CG LEU A 7 21.982 37.558 20.140 1.00 17.98 C
ATOM 42 CD1 LEU A 7 22.427 36.311 19.389 1.00 20.47 C
ATOM 43 CD2 LEU A 7 23.218 38.409 20.407 1.00 18.75 C
ATOM 44 N LYS A 8 18.371 38.033 20.873 1.00 14.10 N
ATOM 45 CA LYS A 8 17.689 37.619 22.103 1.00 14.00 C
ATOM 46 C LYS A 8 16.593 36.611 21.795 1.00 19.65 C
ATOM 47 O LYS A 8 16.429 35.627 22.507 1.00 19.85 O
ATOM 48 CB LYS A 8 17.076 38.822 22.826 1.00 15.93 C
ATOM 49 CG LYS A 8 16.367 38.544 24.133 1.00 20.77 C
ATOM 50 CD LYS A 8 15.511 39.771 24.498 1.00 38.02 C
ATOM 51 CE LYS A 8 15.092 39.901 25.953 1.00 83.24 C
ATOM 52 NZ LYS A 8 14.141 41.009 26.161 1.00 99.56 N
ATOM 53 N ASP A 9 15.797 36.854 20.746 1.00 12.13 N
ATOM 54 CA ASP A 9 14.716 35.901 20.441 1.00 12.04 C
ATOM 55 C ASP A 9 15.261 34.577 19.956 1.00 16.77 C
ATOM 56 O ASP A 9 14.676 33.565 20.239 1.00 17.35 O
ATOM 57 CB ASP A 9 13.754 36.452 19.381 1.00 16.98 C
ATOM 58 CG ASP A 9 12.817 37.489 19.951 1.00 30.47 C
ATOM 59 OD1 ASP A 9 12.548 37.571 21.130 1.00 29.85 O
ATOM 60 OD2 ASP A 9 12.329 38.272 19.045 1.00 30.89 O
ATOM 61 N TYR A 10 16.369 34.570 19.227 1.00 13.45 N
ATOM 62 CA TYR A 10 16.968 33.316 18.763 1.00 12.08 C
ATOM 63 C TYR A 10 17.425 32.470 19.933 1.00 13.52 C
ATOM 64 O TYR A 10 17.190 31.235 19.970 1.00 14.35 O
ATOM 65 CB TYR A 10 18.152 33.638 17.846 1.00 14.33 C
ATOM 66 CG TYR A 10 18.675 32.467 17.052 1.00 13.99 C
ATOM 67 CD1 TYR A 10 19.677 31.649 17.566 1.00 14.46 C
ATOM 68 CD2 TYR A 10 18.162 32.193 15.781 1.00 15.06 C
ATOM 69 CE1 TYR A 10 20.159 30.568 16.820 1.00 14.84 C
ATOM 70 CE2 TYR A 10 18.645 31.118 15.037 1.00 17.89 C
ATOM 71 CZ TYR A 10 19.654 30.307 15.551 1.00 13.52 C
ATOM 72 OH TYR A 10 20.113 29.235 14.810 1.00 18.91 O
ATOM 73 N ALA A 11 18.053 33.115 20.912 1.00 14.43 N
ATOM 74 CA ALA A 11 18.530 32.389 22.104 1.00 17.07 C
ATOM 75 C ALA A 11 17.358 31.828 22.884 1.00 17.90 C
ATOM 76 O ALA A 11 17.431 30.725 23.401 1.00 17.23 O
ATOM 77 CB ALA A 11 19.353 33.293 23.015 1.00 13.38 C
ATOM 78 N MET A 12 16.267 32.569 22.941 1.00 13.72 N
ATOM 79 CA MET A 12 15.110 32.087 23.658 1.00 13.95 C
ATOM 80 C MET A 12 14.489 30.885 22.958 1.00 16.33 C
ATOM 81 O MET A 12 13.821 30.024 23.529 1.00 15.90 O
ATOM 82 CB MET A 12 14.085 33.240 23.737 1.00 16.34 C
ATOM 83 CG MET A 12 14.166 34.057 24.981 1.00 55.28 C
ATOM 84 SD MET A 12 13.007 35.441 24.890 1.00100.00 S
ATOM 85 CE MET A 12 13.748 36.594 26.079 1.00 98.88 C
ATOM 86 N ARG A 13 14.591 30.843 21.623 1.00 14.91 N
ATOM 87 CA ARG A 13 14.011 29.725 20.874 1.00 13.59 C
ATOM 88 C ARG A 13 14.894 28.499 20.818 1.00 14.85 C
ATOM 89 O ARG A 13 14.377 27.403 20.833 1.00 16.17 O
ATOM 90 CB ARG A 13 13.631 30.085 19.422 1.00 13.23 C
ATOM 91 CG ARG A 13 12.343 30.937 19.399 1.00 25.46 C
ATOM 92 CD ARG A 13 12.232 31.960 18.236 1.00 26.76 C
ATOM 93 NE ARG A 13 10.993 32.722 18.331 1.00 32.94 N
ATOM 94 CZ ARG A 13 10.675 33.572 19.327 1.00 49.00 C
ATOM 95 NH1 ARG A 13 11.504 33.835 20.326 1.00 49.46 N
ATOM 96 NH2 ARG A 13 9.491 34.194 19.329 1.00 62.76 N
ATOM 97 N PHE A 14 16.201 28.679 20.678 1.00 11.35 N
ATOM 98 CA PHE A 14 17.091 27.571 20.478 1.00 12.06 C
ATOM 99 C PHE A 14 18.062 27.296 21.591 1.00 16.73 C
ATOM 100 O PHE A 14 18.752 26.272 21.544 1.00 17.92 O
ATOM 101 CB PHE A 14 17.937 27.839 19.221 1.00 12.10 C
ATOM 102 CG PHE A 14 17.051 27.815 18.032 1.00 21.70 C
ATOM 103 CD1 PHE A 14 16.447 26.617 17.644 1.00 20.65 C
ATOM 104 CD2 PHE A 14 16.771 28.999 17.344 1.00 23.31 C
ATOM 105 CE1 PHE A 14 15.616 26.604 16.530 1.00 30.57 C
ATOM 106 CE2 PHE A 14 15.924 29.013 16.241 1.00 24.51 C
ATOM 107 CZ PHE A 14 15.349 27.803 15.854 1.00 29.75 C
ATOM 108 N GLY A 15 18.187 28.213 22.551 1.00 12.81 N
ATOM 109 CA GLY A 15 19.126 28.023 23.641 1.00 16.07 C
ATOM 110 C GLY A 15 20.448 28.715 23.368 1.00 15.58 C
ATOM 111 O GLY A 15 20.833 28.977 22.216 1.00 15.80 O
ATOM 112 N GLN A 16 21.160 29.060 24.429 1.00 19.00 N
ATOM 113 CA GLN A 16 22.448 29.783 24.302 1.00 23.07 C
ATOM 114 C GLN A 16 23.545 28.961 23.693 1.00 19.82 C
ATOM 115 O GLN A 16 24.387 29.465 22.952 1.00 22.09 O
ATOM 116 CB GLN A 16 22.898 30.455 25.608 1.00 31.37 C
ATOM 117 CG GLN A 16 21.973 31.648 25.860 1.00 44.42 C
ATOM 118 CD GLN A 16 22.477 32.563 26.928 1.00 89.98 C
ATOM 119 OE1 GLN A 16 21.674 33.132 27.677 1.00100.00 O
ATOM 120 NE2 GLN A 16 23.797 32.701 27.003 1.00 83.25 N
ATOM 121 N THR A 17 23.503 27.676 23.943 1.00 14.81 N
ATOM 122 CA THR A 17 24.522 26.836 23.410 1.00 14.49 C
ATOM 123 C THR A 17 24.469 26.755 21.886 1.00 18.16 C
ATOM 124 O THR A 17 25.485 26.941 21.176 1.00 18.87 O
ATOM 125 CB THR A 17 24.387 25.482 24.095 1.00 29.61 C
ATOM 126 OG1 THR A 17 24.840 25.669 25.428 1.00 37.58 O
ATOM 127 CG2 THR A 17 25.221 24.420 23.405 1.00 37.41 C
ATOM 128 N LYS A 18 23.271 26.451 21.353 1.00 15.02 N
ATOM 129 CA LYS A 18 23.171 26.378 19.902 1.00 14.97 C
ATOM 130 C LYS A 18 23.473 27.761 19.265 1.00 19.02 C
ATOM 131 O LYS A 18 24.128 27.857 18.233 1.00 17.30 O
ATOM 132 CB LYS A 18 21.799 25.902 19.488 1.00 15.62 C
ATOM 133 CG LYS A 18 21.565 26.030 17.979 1.00 17.87 C
ATOM 134 CD LYS A 18 20.302 25.331 17.522 1.00 20.77 C
ATOM 135 CE LYS A 18 19.938 25.641 16.061 1.00 21.05 C
ATOM 136 NZ LYS A 18 21.070 25.348 15.174 1.00 23.25 N
ATOM 137 N THR A 19 22.998 28.826 19.929 1.00 13.94 N
ATOM 138 CA THR A 19 23.232 30.176 19.408 1.00 15.11 C
ATOM 139 C THR A 19 24.728 30.452 19.247 1.00 17.10 C
ATOM 140 O THR A 19 25.195 30.883 18.197 1.00 16.28 O
ATOM 141 CB THR A 19 22.562 31.204 20.345 1.00 14.30 C
ATOM 142 OG1 THR A 19 21.163 30.972 20.435 1.00 16.12 O
ATOM 143 CG2 THR A 19 22.748 32.585 19.777 1.00 18.66 C
ATOM 144 N ALA A 20 25.504 30.162 20.312 1.00 15.42 N
ATOM 145 CA ALA A 20 26.961 30.386 20.308 1.00 16.54 C
ATOM 146 C ALA A 20 27.601 29.564 19.216 1.00 18.55 C
ATOM 147 O ALA A 20 28.435 30.047 18.448 1.00 16.40 O
ATOM 148 CB ALA A 20 27.581 30.040 21.665 1.00 18.19 C
ATOM 149 N LYS A 21 27.178 28.327 19.124 1.00 15.87 N
ATOM 150 CA LYS A 21 27.755 27.471 18.068 1.00 14.12 C
ATOM 151 C LYS A 21 27.417 27.978 16.665 1.00 23.53 C
ATOM 152 O LYS A 21 28.261 27.983 15.775 1.00 22.00 O
ATOM 153 CB LYS A 21 27.380 25.988 18.165 1.00 19.74 C
ATOM 154 N ASP A 22 26.163 28.359 16.428 1.00 13.98 N
ATOM 155 CA ASP A 22 25.792 28.797 15.096 1.00 13.60 C
ATOM 156 C ASP A 22 26.555 30.021 14.674 1.00 17.42 C
ATOM 157 O ASP A 22 26.763 30.257 13.483 1.00 18.55 O
ATOM 158 CB ASP A 22 24.285 29.133 15.132 1.00 17.16 C
ATOM 159 CG ASP A 22 23.463 27.892 15.121 1.00 21.24 C
ATOM 160 OD1 ASP A 22 23.956 26.806 14.917 1.00 21.84 O
ATOM 161 OD2 ASP A 22 22.212 28.101 15.342 1.00 21.83 O
ATOM 162 N LEU A 23 26.917 30.836 15.671 1.00 16.70 N
ATOM 163 CA LEU A 23 27.573 32.091 15.415 1.00 17.48 C
ATOM 164 C LEU A 23 29.059 31.985 15.508 1.00 21.83 C
ATOM 165 O LEU A 23 29.783 32.927 15.155 1.00 32.31 O
ATOM 166 CB LEU A 23 27.082 33.234 16.362 1.00 17.26 C
ATOM 167 CG LEU A 23 25.591 33.560 16.187 1.00 21.17 C
ATOM 168 CD1 LEU A 23 25.236 34.773 17.030 1.00 23.34 C
ATOM 169 CD2 LEU A 23 25.395 33.916 14.721 1.00 27.21 C
ATOM 170 N GLY A 24 29.528 30.890 16.014 1.00 19.23 N
ATOM 171 CA GLY A 24 30.950 30.717 16.194 1.00 19.43 C
ATOM 172 C GLY A 24 31.518 31.656 17.260 1.00 25.53 C
ATOM 173 O GLY A 24 32.607 32.153 17.082 1.00 29.65 O
ATOM 174 N VAL A 25 30.822 31.922 18.393 1.00 18.56 N
ATOM 175 CA VAL A 25 31.348 32.806 19.455 1.00 18.47 C
ATOM 176 C VAL A 25 31.227 31.997 20.731 1.00 29.23 C
ATOM 177 O VAL A 25 30.609 30.971 20.707 1.00 35.08 O
ATOM 178 CB VAL A 25 30.567 34.121 19.584 1.00 20.81 C
ATOM 179 CG1 VAL A 25 30.841 34.963 18.347 1.00 28.96 C
ATOM 180 CG2 VAL A 25 29.091 33.779 19.602 1.00 30.28 C
ATOM 181 N TYR A 26 31.767 32.399 21.827 1.00 16.42 N
ATOM 182 CA TYR A 26 31.548 31.539 22.976 1.00 16.61 C
ATOM 183 C TYR A 26 30.335 32.063 23.742 1.00 26.02 C
ATOM 184 O TYR A 26 29.975 33.190 23.562 1.00 20.21 O
ATOM 185 CB TYR A 26 32.816 31.460 23.844 1.00 15.69 C
ATOM 186 CG TYR A 26 33.397 32.834 24.133 1.00 15.36 C
ATOM 187 CD1 TYR A 26 33.008 33.568 25.261 1.00 19.73 C
ATOM 188 CD2 TYR A 26 34.370 33.356 23.286 1.00 20.92 C
ATOM 189 CE1 TYR A 26 33.566 34.821 25.529 1.00 22.18 C
ATOM 190 CE2 TYR A 26 34.915 34.612 23.529 1.00 20.77 C
ATOM 191 CZ TYR A 26 34.528 35.334 24.653 1.00 23.19 C
ATOM 192 OH TYR A 26 35.125 36.565 24.868 1.00 37.78 O
ATOM 193 N GLN A 27 29.724 31.279 24.636 1.00 15.04 N
ATOM 194 CA GLN A 27 28.470 31.644 25.313 1.00 16.50 C
ATOM 195 C GLN A 27 28.439 32.914 26.073 1.00 18.25 C
ATOM 196 O GLN A 27 27.478 33.680 26.054 1.00 21.76 O
ATOM 197 CB GLN A 27 27.876 30.468 26.136 1.00 20.07 C
ATOM 198 CG AGLN A 27 27.570 29.232 25.290 0.50 12.45 C
ATOM 199 CG BGLN A 27 26.388 30.644 26.494 0.50 28.90 C
ATOM 200 CD AGLN A 27 26.956 28.056 26.052 0.50 20.66 C
ATOM 201 CD BGLN A 27 26.083 30.009 27.823 0.50 30.43 C
ATOM 202 OE1AGLN A 27 27.168 26.882 25.699 0.50 22.61 O
ATOM 203 OE1BGLN A 27 25.473 30.610 28.715 0.50 34.69 O
ATOM 204 NE2AGLN A 27 26.206 28.365 27.097 0.50 23.95 N
ATOM 205 NE2BGLN A 27 26.508 28.766 27.938 0.50 41.63 N
ATOM 206 N SER A 28 29.464 33.122 26.809 1.00 15.52 N
ATOM 207 CA SER A 28 29.586 34.339 27.606 1.00 16.10 C
ATOM 208 C SER A 28 29.401 35.614 26.777 1.00 21.89 C
ATOM 209 O SER A 28 28.767 36.563 27.222 1.00 20.41 O
ATOM 210 CB SER A 28 30.919 34.313 28.380 1.00 14.82 C
ATOM 211 OG SER A 28 30.982 35.513 29.123 1.00 26.67 O
ATOM 212 N ALA A 29 29.961 35.637 25.564 1.00 17.59 N
ATOM 213 CA ALA A 29 29.883 36.770 24.667 1.00 17.67 C
ATOM 214 C ALA A 29 28.484 37.014 24.183 1.00 27.65 C
ATOM 215 O ALA A 29 28.089 38.132 23.981 1.00 20.63 O
ATOM 216 CB ALA A 29 30.757 36.537 23.465 1.00 21.94 C
ATOM 217 N ILE A 30 27.751 35.952 23.955 1.00 23.64 N
ATOM 218 CA ILE A 30 26.369 36.053 23.551 1.00 36.96 C
ATOM 219 C ILE A 30 25.512 36.766 24.653 1.00 26.56 C
ATOM 220 O ILE A 30 24.675 37.683 24.424 1.00 21.49 O
ATOM 221 CB ILE A 30 25.777 34.653 23.188 1.00 26.00 C
ATOM 222 CG1 ILE A 30 26.348 34.051 21.899 1.00 22.78 C
ATOM 223 CG2 ILE A 30 24.260 34.754 23.064 1.00 24.42 C
ATOM 224 CD1 ILE A 30 26.113 34.938 20.703 1.00 24.43 C
ATOM 225 N ASN A 31 25.684 36.302 25.886 1.00 26.96 N
ATOM 226 CA ASN A 31 24.929 36.844 27.012 1.00 24.13 C
ATOM 227 C ASN A 31 25.256 38.335 27.274 1.00 17.11 C
ATOM 228 O ASN A 31 24.450 39.220 27.547 1.00 18.57 O
ATOM 229 CB ASN A 31 25.209 35.902 28.208 1.00 28.18 C
ATOM 230 CG ASN A 31 24.270 36.176 29.328 1.00100.00 C
ATOM 231 OD1 ASN A 31 24.697 36.533 30.445 1.00 89.90 O
ATOM 232 ND2 ASN A 31 22.981 36.068 28.999 1.00 64.11 N
ATOM 233 N LYS A 32 26.471 38.639 27.080 1.00 16.17 N
ATOM 234 CA LYS A 32 26.812 40.008 27.294 1.00 17.05 C
ATOM 235 C LYS A 32 26.216 40.905 26.218 1.00 31.66 C
ATOM 236 O LYS A 32 25.732 41.995 26.518 1.00 29.57 O
ATOM 237 CB LYS A 32 28.306 40.081 27.162 1.00 21.07 C
ATOM 238 CG LYS A 32 28.902 41.412 27.606 1.00 52.04 C
ATOM 239 CD LYS A 32 30.396 41.492 27.303 1.00 81.78 C
ATOM 240 CE LYS A 32 31.140 42.468 28.196 1.00100.00 C
ATOM 241 NZ LYS A 32 30.873 43.865 27.856 1.00 43.46 N
ATOM 242 N ALA A 33 26.285 40.437 24.956 1.00 19.80 N
ATOM 243 CA ALA A 33 25.747 41.180 23.787 1.00 21.92 C
ATOM 244 C ALA A 33 24.297 41.500 23.968 1.00 21.92 C
ATOM 245 O ALA A 33 23.806 42.585 23.666 1.00 22.77 O
ATOM 246 CB ALA A 33 25.948 40.430 22.479 1.00 18.69 C
ATOM 247 N ILE A 34 23.591 40.542 24.444 1.00 18.08 N
ATOM 248 CA ILE A 34 22.199 40.740 24.698 1.00 21.87 C
ATOM 249 C ILE A 34 22.059 41.741 25.810 1.00 43.97 C
ATOM 250 O ILE A 34 21.320 42.721 25.716 1.00 33.65 O
ATOM 251 CB ILE A 34 21.594 39.409 25.069 1.00 29.26 C
ATOM 252 CG1 ILE A 34 21.436 38.525 23.825 1.00 21.75 C
ATOM 253 CG2 ILE A 34 20.239 39.555 25.760 1.00 34.93 C
ATOM 254 CD1 ILE A 34 20.948 37.106 24.189 1.00 26.89 C
ATOM 255 N HIS A 35 22.822 41.512 26.856 1.00 29.05 N
ATOM 256 CA HIS A 35 22.815 42.399 28.012 1.00 39.46 C
ATOM 257 C HIS A 35 23.018 43.838 27.625 1.00 23.68 C
ATOM 258 O HIS A 35 22.323 44.672 28.118 1.00 38.17 O
ATOM 259 CB HIS A 35 23.911 42.003 29.015 1.00 30.65 C
ATOM 260 CG HIS A 35 24.280 43.102 29.942 1.00100.00 C
ATOM 261 ND1 HIS A 35 25.418 43.879 29.708 1.00 68.44 N
ATOM 262 CD2 HIS A 35 23.672 43.529 31.112 1.00 35.25 C
ATOM 263 CE1 HIS A 35 25.494 44.750 30.724 1.00100.00 C
ATOM 264 NE2 HIS A 35 24.463 44.573 31.588 1.00 46.69 N
ATOM 265 N ALA A 36 23.981 44.104 26.778 1.00 20.84 N
ATOM 266 CA ALA A 36 24.334 45.408 26.286 1.00 24.23 C
ATOM 267 C ALA A 36 23.305 45.982 25.291 1.00 36.12 C
ATOM 268 O ALA A 36 23.444 47.124 24.846 1.00 33.39 O
ATOM 269 CB ALA A 36 25.665 45.379 25.545 1.00 19.30 C
ATOM 270 N GLY A 37 22.302 45.192 24.903 1.00 24.14 N
ATOM 271 CA GLY A 37 21.290 45.659 23.941 1.00 29.84 C
ATOM 272 C GLY A 37 21.854 46.026 22.565 1.00 35.89 C
ATOM 273 O GLY A 37 21.472 46.990 21.917 1.00 30.09 O
ATOM 274 N ARG A 38 22.782 45.260 22.076 1.00 17.99 N
ATOM 275 CA ARG A 38 23.322 45.550 20.761 1.00 19.82 C
ATOM 276 C ARG A 38 22.220 45.138 19.756 1.00 18.28 C
ATOM 277 O ARG A 38 21.405 44.264 20.065 1.00 20.34 O
ATOM 278 CB ARG A 38 24.653 44.808 20.523 1.00 19.50 C
ATOM 279 CG ARG A 38 25.693 45.271 21.551 1.00 25.48 C
ATOM 280 CD ARG A 38 26.965 44.505 21.472 1.00 24.57 C
ATOM 281 NE ARG A 38 27.490 44.392 20.140 1.00 20.97 N
ATOM 282 CZ ARG A 38 28.580 43.686 19.958 1.00 26.05 C
ATOM 283 NH1 ARG A 38 29.167 43.092 20.984 1.00 23.69 N
ATOM 284 NH2 ARG A 38 29.095 43.570 18.741 1.00 23.73 N
ATOM 285 N LYS A 39 22.188 45.755 18.596 1.00 16.48 N
ATOM 286 CA LYS A 39 21.140 45.482 17.629 1.00 17.09 C
ATOM 287 C LYS A 39 21.666 44.466 16.675 1.00 16.45 C
ATOM 288 O LYS A 39 22.313 44.817 15.683 1.00 18.25 O
ATOM 289 CB LYS A 39 20.764 46.788 16.930 1.00 19.95 C
ATOM 290 CG LYS A 39 20.222 47.767 17.972 1.00 24.06 C
ATOM 291 CD LYS A 39 20.513 49.217 17.614 1.00 32.96 C
ATOM 292 CE LYS A 39 19.981 50.191 18.673 1.00 45.86 C
ATOM 293 NZ LYS A 39 19.454 51.472 18.095 1.00 53.67 N
ATOM 294 N ILE A 40 21.410 43.183 17.021 1.00 13.44 N
ATOM 295 CA ILE A 40 21.938 42.065 16.280 1.00 15.20 C
ATOM 296 C ILE A 40 20.773 41.250 15.766 1.00 13.22 C
ATOM 297 O ILE A 40 19.848 40.929 16.498 1.00 12.59 O
ATOM 298 CB ILE A 40 22.888 41.267 17.200 1.00 10.03 C
ATOM 299 CG1 ILE A 40 24.136 42.108 17.439 1.00 12.39 C
ATOM 300 CG2 ILE A 40 23.375 40.058 16.441 1.00 12.12 C
ATOM 301 CD1 ILE A 40 24.876 41.612 18.651 1.00 16.76 C
ATOM 302 N PHE A 41 20.831 40.957 14.469 1.00 13.05 N
ATOM 303 CA PHE A 41 19.729 40.244 13.782 1.00 13.58 C
ATOM 304 C PHE A 41 20.273 39.045 13.004 1.00 17.16 C
ATOM 305 O PHE A 41 21.284 39.108 12.333 1.00 15.68 O
ATOM 306 CB PHE A 41 18.882 41.229 12.858 1.00 21.85 C
ATOM 307 CG PHE A 41 18.442 42.514 13.603 1.00 18.36 C
ATOM 308 CD1 PHE A 41 17.305 42.513 14.419 1.00 24.71 C
ATOM 309 CD2 PHE A 41 19.182 43.703 13.501 1.00 20.79 C
ATOM 310 CE1 PHE A 41 16.918 43.656 15.133 1.00 21.54 C
ATOM 311 CE2 PHE A 41 18.813 44.845 14.214 1.00 17.68 C
ATOM 312 CZ PHE A 41 17.672 44.826 15.021 1.00 18.87 C
ATOM 313 N LEU A 42 19.580 37.919 13.130 1.00 13.76 N
ATOM 314 CA LEU A 42 20.040 36.694 12.507 1.00 13.18 C
ATOM 315 C LEU A 42 19.086 36.217 11.421 1.00 17.74 C
ATOM 316 O LEU A 42 17.873 36.362 11.506 1.00 17.28 O
ATOM 317 CB LEU A 42 19.994 35.580 13.591 1.00 15.42 C
ATOM 318 CG LEU A 42 21.233 35.539 14.492 1.00 18.78 C
ATOM 319 CD1 LEU A 42 21.341 36.862 15.260 1.00 23.71 C
ATOM 320 CD2 LEU A 42 21.058 34.350 15.470 1.00 25.37 C
ATOM 321 N THR A 43 19.673 35.634 10.383 1.00 16.26 N
ATOM 322 CA THR A 43 18.880 35.050 9.288 1.00 15.99 C
ATOM 323 C THR A 43 19.275 33.561 9.186 1.00 17.09 C
ATOM 324 O THR A 43 20.473 33.256 9.056 1.00 17.62 O
ATOM 325 CB THR A 43 19.298 35.726 7.940 1.00 22.81 C
ATOM 326 OG1 THR A 43 18.835 37.049 8.024 1.00 25.25 O
ATOM 327 CG2 THR A 43 18.677 35.005 6.742 1.00 31.16 C
ATOM 328 N ILE A 44 18.298 32.679 9.252 1.00 15.60 N
ATOM 329 CA ILE A 44 18.568 31.274 9.131 1.00 20.91 C
ATOM 330 C ILE A 44 18.383 30.861 7.696 1.00 15.87 C
ATOM 331 O ILE A 44 17.359 31.193 7.095 1.00 18.91 O
ATOM 332 CB ILE A 44 17.632 30.429 10.008 1.00 30.28 C
ATOM 333 CG1 ILE A 44 17.843 30.827 11.458 1.00 46.11 C
ATOM 334 CG2 ILE A 44 17.844 28.902 9.819 1.00 25.10 C
ATOM 335 CD1 ILE A 44 16.714 30.328 12.352 1.00 70.44 C
ATOM 336 N ASN A 45 19.384 30.184 7.182 1.00 14.10 N
ATOM 337 CA ASN A 45 19.327 29.689 5.821 1.00 20.16 C
ATOM 338 C ASN A 45 18.803 28.263 5.824 1.00 19.07 C
ATOM 339 O ASN A 45 18.874 27.527 6.819 1.00 17.69 O
ATOM 340 CB ASN A 45 20.686 29.784 5.149 1.00 20.11 C
ATOM 341 CG ASN A 45 21.184 31.231 5.162 1.00 24.66 C
ATOM 342 OD1 ASN A 45 20.402 32.125 4.864 1.00 26.94 O
ATOM 343 ND2 ASN A 45 22.436 31.444 5.569 1.00 26.52 N
ATOM 344 N ALA A 46 18.251 27.870 4.705 1.00 22.61 N
ATOM 345 CA ALA A 46 17.669 26.544 4.653 1.00 24.21 C
ATOM 346 C ALA A 46 18.690 25.448 4.868 1.00 24.78 C
ATOM 347 O ALA A 46 18.367 24.365 5.260 1.00 22.23 O
ATOM 348 CB ALA A 46 16.890 26.345 3.375 1.00 22.88 C
ATOM 349 N ASP A 47 19.940 25.755 4.671 1.00 21.26 N
ATOM 350 CA ASP A 47 20.948 24.771 4.860 1.00 17.60 C
ATOM 351 C ASP A 47 21.370 24.710 6.266 1.00 20.70 C
ATOM 352 O ASP A 47 22.319 24.028 6.584 1.00 25.05 O
ATOM 353 CB ASP A 47 22.175 25.020 3.980 1.00 23.87 C
ATOM 354 CG ASP A 47 22.912 26.289 4.380 1.00 32.11 C
ATOM 355 OD1 ASP A 47 22.589 27.015 5.289 1.00 28.98 O
ATOM 356 OD2 ASP A 47 23.950 26.520 3.643 1.00 37.94 O
ATOM 357 N GLY A 48 20.729 25.468 7.113 1.00 19.80 N
ATOM 358 CA GLY A 48 21.127 25.428 8.525 1.00 23.59 C
ATOM 359 C GLY A 48 22.139 26.456 8.966 1.00 24.53 C
ATOM 360 O GLY A 48 22.305 26.655 10.164 1.00 27.05 O
ATOM 361 N SER A 49 22.816 27.109 8.050 1.00 19.46 N
ATOM 362 CA SER A 49 23.797 28.088 8.500 1.00 16.49 C
ATOM 363 C SER A 49 23.061 29.352 8.896 1.00 20.85 C
ATOM 364 O SER A 49 21.918 29.501 8.514 1.00 19.20 O
ATOM 365 CB SER A 49 24.762 28.375 7.397 1.00 16.59 C
ATOM 366 OG SER A 49 24.021 28.847 6.295 1.00 21.47 O
ATOM 367 N VAL A 50 23.714 30.240 9.682 1.00 15.49 N
ATOM 368 CA VAL A 50 23.094 31.441 10.161 1.00 14.53 C
ATOM 369 C VAL A 50 23.925 32.627 9.792 1.00 16.31 C
ATOM 370 O VAL A 50 25.113 32.563 9.905 1.00 20.11 O
ATOM 371 CB VAL A 50 22.977 31.330 11.699 1.00 15.44 C
ATOM 372 CG1 VAL A 50 22.459 32.568 12.367 1.00 17.50 C
ATOM 373 CG2 VAL A 50 22.009 30.175 11.994 1.00 19.25 C
ATOM 374 N TYR A 51 23.309 33.678 9.343 1.00 16.72 N
ATOM 375 CA TYR A 51 24.067 34.887 9.017 1.00 21.29 C
ATOM 376 C TYR A 51 23.627 35.971 10.032 1.00 20.19 C
ATOM 377 O TYR A 51 22.451 36.117 10.309 1.00 20.05 O
ATOM 378 CB TYR A 51 23.801 35.345 7.558 1.00 21.31 C
ATOM 379 CG TYR A 51 24.146 36.813 7.269 1.00 22.84 C
ATOM 380 CD1 TYR A 51 25.466 37.186 7.021 1.00 29.04 C
ATOM 381 CD2 TYR A 51 23.158 37.789 7.193 1.00 27.80 C
ATOM 382 CE1 TYR A 51 25.841 38.500 6.745 1.00 29.93 C
ATOM 383 CE2 TYR A 51 23.506 39.112 6.911 1.00 28.96 C
ATOM 384 CZ TYR A 51 24.837 39.466 6.685 1.00 28.90 C
ATOM 385 OH TYR A 51 25.164 40.777 6.417 1.00 75.56 O
ATOM 386 N ALA A 52 24.576 36.653 10.690 1.00 14.21 N
ATOM 387 CA ALA A 52 24.167 37.639 11.639 1.00 13.99 C
ATOM 388 C ALA A 52 24.736 38.973 11.230 1.00 16.81 C
ATOM 389 O ALA A 52 25.857 39.062 10.734 1.00 17.86 O
ATOM 390 CB ALA A 52 24.691 37.252 13.008 1.00 15.37 C
ATOM 391 N GLU A 53 23.966 40.019 11.468 1.00 16.09 N
ATOM 392 CA GLU A 53 24.423 41.342 11.157 1.00 16.97 C
ATOM 393 C GLU A 53 24.076 42.266 12.282 1.00 15.88 C
ATOM 394 O GLU A 53 23.093 42.080 12.971 1.00 18.62 O
ATOM 395 CB GLU A 53 23.888 41.872 9.817 1.00 19.03 C
ATOM 396 CG GLU A 53 22.411 42.061 9.859 1.00 24.25 C
ATOM 397 CD GLU A 53 21.897 42.057 8.458 1.00 35.71 C
ATOM 398 OE1 GLU A 53 22.525 42.491 7.526 1.00 27.95 O
ATOM 399 OE2 GLU A 53 20.769 41.448 8.336 1.00 36.91 O
ATOM 400 N GLU A 54 24.926 43.267 12.481 1.00 15.46 N
ATOM 401 CA GLU A 54 24.720 44.221 13.546 1.00 16.14 C
ATOM 402 C GLU A 54 24.543 45.641 12.961 1.00 21.40 C
ATOM 403 O GLU A 54 25.130 46.003 11.948 1.00 25.19 O
ATOM 404 CB GLU A 54 25.935 44.261 14.482 1.00 14.41 C
ATOM 405 CG GLU A 54 25.673 45.177 15.692 1.00 18.40 C
ATOM 406 CD GLU A 54 26.800 45.153 16.699 1.00 26.54 C
ATOM 407 OE1 GLU A 54 27.830 44.520 16.500 1.00 30.33 O
ATOM 408 OE2 GLU A 54 26.550 45.857 17.788 1.00 25.55 O
ATOM 409 N VAL A 55 23.667 46.370 13.573 1.00 21.13 N
ATOM 410 CA VAL A 55 23.440 47.727 13.144 1.00 27.58 C
ATOM 411 C VAL A 55 24.110 48.583 14.182 1.00 27.92 C
ATOM 412 O VAL A 55 23.780 48.514 15.389 1.00 25.31 O
ATOM 413 CB VAL A 55 21.968 48.051 13.114 1.00 29.65 C
ATOM 414 CG1 VAL A 55 21.807 49.529 12.757 1.00 38.38 C
ATOM 415 CG2 VAL A 55 21.329 47.161 12.055 1.00 23.49 C
ATOM 416 N LYS A 56 25.062 49.374 13.733 1.00 28.14 N
ATOM 417 CA LYS A 56 25.762 50.207 14.693 1.00 58.37 C
ATOM 418 C LYS A 56 26.180 51.499 14.044 1.00 33.42 C
ATOM 419 O LYS A 56 26.815 51.453 12.959 1.00 31.02 O
ATOM 420 CB LYS A 56 27.010 49.460 15.125 1.00 49.75 C
ATOM 421 CG LYS A 56 27.697 49.942 16.393 1.00 39.95 C
ATOM 422 CD LYS A 56 28.812 48.968 16.750 1.00100.00 C
ATOM 423 CE LYS A 56 29.778 49.430 17.831 1.00100.00 C
ATOM 424 NZ LYS A 56 30.915 48.498 18.004 1.00100.00 N
ATOM 425 N ASP A 56A 25.831 52.621 14.696 1.00 53.90 N
ATOM 426 CA ASP A 56A 26.191 53.931 14.169 1.00 49.50 C
ATOM 427 C ASP A 56A 25.702 54.051 12.772 1.00 54.12 C
ATOM 428 O ASP A 56A 26.476 54.298 11.863 1.00 49.28 O
ATOM 429 CB ASP A 56A 27.710 54.134 14.031 1.00 47.57 C
ATOM 430 CG ASP A 56A 28.484 53.954 15.317 1.00100.00 C
ATOM 431 OD1 ASP A 56A 28.021 54.228 16.433 1.00 92.41 O
ATOM 432 OD2 ASP A 56A 29.701 53.474 15.101 1.00100.00 O
ATOM 433 N GLY A 56B 24.457 53.821 12.567 1.00 39.18 N
ATOM 434 CA GLY A 56B 24.042 53.992 11.222 1.00 28.07 C
ATOM 435 C GLY A 56B 24.539 52.950 10.315 1.00 36.07 C
ATOM 436 O GLY A 56B 23.996 52.846 9.210 1.00 57.51 O
ATOM 437 N GLU A 56C 25.519 52.134 10.743 1.00 28.59 N
ATOM 438 CA GLU A 56C 25.964 51.070 9.818 1.00 40.34 C
ATOM 439 C GLU A 56C 25.580 49.623 10.131 1.00 27.57 C
ATOM 440 O GLU A 56C 25.320 49.256 11.286 1.00 35.60 O
ATOM 441 CB GLU A 56C 27.401 51.139 9.280 1.00 47.14 C
ATOM 442 CG GLU A 56C 27.803 52.562 8.881 1.00 68.41 C
ATOM 443 CD GLU A 56C 27.611 52.798 7.413 1.00100.00 C
ATOM 444 OE1 GLU A 56C 27.850 51.956 6.548 1.00100.00 O
ATOM 445 OE2 GLU A 56C 27.166 54.002 7.176 1.00100.00 O
ATOM 446 N VAL A 56D 25.610 48.868 9.016 1.00 30.28 N
ATOM 447 CA VAL A 56D 25.278 47.464 8.936 1.00 30.80 C
ATOM 448 C VAL A 56D 26.504 46.588 8.684 1.00 65.18 C
ATOM 449 O VAL A 56D 27.082 46.538 7.616 1.00 31.22 O
ATOM 450 CB VAL A 56D 24.104 47.217 7.973 1.00 46.38 C
ATOM 451 CG1 VAL A 56D 24.562 47.175 6.517 1.00 64.25 C
ATOM 452 CG2 VAL A 56D 23.430 45.913 8.324 1.00 41.05 C
ATOM 453 N LYS A 56E 26.927 45.885 9.709 1.00 32.07 N
ATOM 454 CA LYS A 56E 28.089 45.034 9.600 1.00 33.83 C
ATOM 455 C LYS A 56E 27.784 43.617 10.073 1.00 26.57 C
ATOM 456 O LYS A 56E 26.887 43.381 10.856 1.00 23.11 O
ATOM 457 CB LYS A 56E 29.150 45.562 10.580 1.00 24.98 C
ATOM 458 CG LYS A 56E 29.528 47.024 10.411 1.00 56.26 C
ATOM 459 CD LYS A 56E 30.733 47.223 9.465 1.00100.00 C
ATOM 460 CE LYS A 56E 30.415 47.974 8.156 1.00100.00 C
ATOM 461 NZ LYS A 56E 30.878 47.300 6.922 1.00100.00 N
ATOM 462 N PRO A 57 28.618 42.711 9.677 1.00 31.74 N
ATOM 463 CA PRO A 57 28.533 41.324 10.100 1.00 30.22 C
ATOM 464 C PRO A 57 28.844 41.214 11.584 1.00 24.56 C
ATOM 465 O PRO A 57 29.574 42.010 12.109 1.00 22.29 O
ATOM 466 CB PRO A 57 29.695 40.627 9.395 1.00 56.22 C
ATOM 467 CG PRO A 57 30.582 41.730 8.819 1.00100.00 C
ATOM 468 CD PRO A 57 29.703 42.961 8.705 1.00 57.16 C
ATOM 469 N PHE A 58 28.273 40.220 12.248 1.00 17.63 N
ATOM 470 CA PHE A 58 28.556 39.967 13.639 1.00 18.93 C
ATOM 471 C PHE A 58 28.895 38.479 13.733 1.00 25.61 C
ATOM 472 O PHE A 58 28.195 37.659 13.133 1.00 26.61 O
ATOM 473 CB PHE A 58 27.305 40.192 14.482 1.00 17.79 C
ATOM 474 CG PHE A 58 27.530 39.734 15.890 1.00 25.73 C
ATOM 475 CD1 PHE A 58 28.200 40.537 16.816 1.00 25.33 C
ATOM 476 CD2 PHE A 58 27.080 38.479 16.310 1.00 25.86 C
ATOM 477 CE1 PHE A 58 28.407 40.112 18.135 1.00 17.65 C
ATOM 478 CE2 PHE A 58 27.279 38.039 17.628 1.00 20.68 C
ATOM 479 CZ PHE A 58 27.961 38.851 18.537 1.00 23.05 C
ATOM 480 N PRO A 59 29.937 38.104 14.484 1.00 25.63 N
ATOM 481 CA PRO A 59 30.784 38.992 15.209 1.00 25.99 C
ATOM 482 C PRO A 59 31.647 39.704 14.228 1.00 49.28 C
ATOM 483 O PRO A 59 31.863 39.264 13.106 1.00 28.39 O
ATOM 484 CB PRO A 59 31.738 38.121 15.985 1.00 25.40 C
ATOM 485 CG PRO A 59 31.704 36.771 15.325 1.00 25.66 C
ATOM 486 CD PRO A 59 30.415 36.709 14.528 1.00 30.42 C
ATOM 487 N SER A 60 32.184 40.798 14.653 1.00 49.75 N
ATOM 488 CA SER A 60 33.009 41.550 13.738 1.00100.00 C
ATOM 489 C SER A 60 34.406 41.011 13.645 1.00100.00 C
ATOM 490 O SER A 60 34.934 40.769 12.543 1.00 95.05 O
ATOM 491 CB SER A 60 33.118 42.996 14.188 1.00100.00 C
ATOM 492 OG SER A 60 34.194 43.115 15.114 1.00100.00 O
ATOM 493 N ASN A 61 34.965 40.865 14.851 1.00100.00 N
ATOM 494 CA ASN A 61 36.322 40.475 15.074 1.00100.00 C
ATOM 495 C ASN A 61 36.615 40.874 16.519 1.00100.00 C
ATOM 496 O ASN A 61 36.488 42.053 16.901 1.00100.00 O
ATOM 497 CB ASN A 61 37.166 41.359 14.114 1.00100.00 C
ATOM 498 CG ASN A 61 38.647 41.049 13.964 1.00100.00 C
ATOM 499 OD1 ASN A 61 39.421 41.895 13.468 1.00100.00 O
ATOM 500 ND2 ASN A 61 39.046 39.835 14.348 1.00100.00 N
TER 501 ASN A 61
HETATM 502 O HOH A 100 16.567 43.265 4.042 1.00 34.53 O
HETATM 503 O HOH A 101 20.456 38.947 9.389 1.00 23.99 O
HETATM 504 O HOH A 102 12.849 38.495 16.591 1.00 22.30 O
HETATM 505 O HOH A 103 13.926 40.856 20.884 1.00 41.05 O
HETATM 506 O HOH A 104 18.819 40.954 20.655 1.00 19.73 O
HETATM 507 O HOH A 105 22.693 27.151 27.634 1.00 70.09 O
HETATM 508 O HOH A 106 21.061 42.196 21.789 1.00 41.39 O
HETATM 509 O HOH A 107 18.782 42.676 18.500 1.00 18.57 O
HETATM 510 O HOH A 108 16.220 42.191 21.534 1.00 50.13 O
HETATM 511 O HOH A 109 17.337 35.292 25.464 1.00 39.26 O
HETATM 512 O HOH A 112 10.684 35.559 22.027 1.00 57.92 O
HETATM 513 O HOH A 113 20.062 26.858 12.663 1.00 44.33 O
HETATM 514 O HOH A 114 21.057 25.359 22.646 1.00 20.07 O
HETATM 515 O HOH A 115 10.987 37.712 13.387 1.00 53.83 O
HETATM 516 O HOH A 116 21.175 25.792 25.461 1.00 45.82 O
HETATM 517 O HOH A 117 26.070 29.114 11.027 1.00 22.17 O
HETATM 518 O HOH A 118 11.734 37.820 3.909 1.00 69.47 O
HETATM 519 O HOH A 119 11.313 40.892 19.018 1.00 47.45 O
HETATM 520 O HOH A 120 9.440 37.991 19.579 1.00 59.14 O
HETATM 521 O HOH A 121 12.308 39.937 22.921 1.00 66.69 O
HETATM 522 O HOH A 123 31.715 37.994 27.757 1.00 35.80 O
HETATM 523 O HOH A 124 33.332 38.551 25.542 1.00 72.97 O
HETATM 524 O HOH A 125 28.935 43.074 23.912 1.00 34.25 O
HETATM 525 O HOH A 126 29.860 40.122 23.720 1.00 32.71 O
HETATM 526 O HOH A 127 15.525 33.748 9.130 1.00 31.50 O
HETATM 527 O HOH A 128 31.988 41.342 17.207 1.00 59.78 O
HETATM 528 O HOH A 129 29.634 43.327 14.724 1.00 33.16 O
HETATM 529 O HOH A 130 22.330 39.552 29.588 1.00 65.61 O
HETATM 530 O HOH A 131 27.765 44.818 28.495 1.00 38.62 O
HETATM 531 O HOH A 133 24.462 47.460 17.919 1.00 23.88 O
HETATM 532 O HOH A 134 24.899 49.304 19.604 1.00 47.82 O
HETATM 533 O HOH A 135 26.021 28.115 30.069 1.00 59.83 O
HETATM 534 O HOH A 136 18.018 27.146 28.371 1.00 70.16 O
HETATM 535 O HOH A 138 16.935 29.765 26.527 1.00 48.67 O
HETATM 536 O HOH A 139 18.048 29.690 2.604 1.00 43.98 O
HETATM 537 O HOH A 141 31.065 26.910 15.705 1.00 64.63 O
HETATM 538 O HOH A 142 30.020 29.019 13.276 1.00 57.01 O
HETATM 539 O HOH A 143 29.845 26.873 22.152 1.00 79.81 O
HETATM 540 O HOH A 146 13.383 39.438 6.579 1.00 63.02 O
HETATM 541 O HOH A 147 20.711 27.622 2.096 1.00 49.53 O
HETATM 542 O HOH A 148 14.196 30.133 28.935 1.00 79.60 O
HETATM 543 O HOH A 150 28.792 35.220 12.803 1.00 70.46 O
HETATM 544 O HOH A 151 27.559 30.392 9.833 1.00 74.79 O
HETATM 545 O HOH A 152 28.329 26.679 12.467 1.00 58.31 O
HETATM 546 O HOH A 154 27.463 36.350 10.486 1.00 61.38 O
HETATM 547 O HOH A 156 18.107 32.722 3.682 1.00 53.96 O
HETATM 548 O HOH A 161 25.605 26.383 11.780 1.00 58.16 O
HETATM 549 O HOH A 162 16.433 43.785 10.736 1.00 59.32 O
HETATM 550 O HOH A 163 10.518 36.164 16.394 1.00 64.24 O
HETATM 551 O HOH A 166 19.795 28.946 27.147 1.00 45.05 O
HETATM 552 O HOH A 171 13.409 41.652 13.265 1.00 61.60 O
HETATM 553 O HOH A 174 27.287 32.431 11.584 1.00 64.21 O
HETATM 554 O HOH A 180 23.741 29.905 2.072 1.00 58.63 O
HETATM 555 O HOH A 181 32.794 51.457 17.245 1.00 65.72 O
HETATM 556 O HOH A 183 9.101 40.801 20.870 1.00 71.78 O
HETATM 557 O AHOH A 301 13.464 41.125 8.469 0.50 20.23 O
HETATM 558 O BHOH A 301 12.554 42.700 8.853 0.50 26.40 O
HETATM 559 O AHOH A 303 22.944 52.797 14.104 0.50 34.59 O
HETATM 560 O BHOH A 303 22.676 52.579 15.869 0.50 32.63 O
MASTER 259 0 0 3 3 0 0 6 559 1 0 6
END

33103
data/sample/1ttv.cif Normal file

File diff suppressed because it is too large Load Diff

32275
data/sample/1ttv.pdb Normal file

File diff suppressed because it is too large Load Diff

8073
data/sample/2P0R.cif Normal file

File diff suppressed because it is too large Load Diff

6218
data/sample/2P0R.pdb Normal file

File diff suppressed because it is too large Load Diff

8073
data/sample/2P0R_mod.cif Normal file

File diff suppressed because it is too large Load Diff

6218
data/sample/2P0R_mod.pdb Normal file

File diff suppressed because it is too large Load Diff

5456
data/sample/2P0R_wrote.cif Normal file

File diff suppressed because it is too large Load Diff

5439
data/sample/2P0R_wrote.pdb Normal file

File diff suppressed because it is too large Load Diff

3015
data/sample/3LKF.pdb Normal file

File diff suppressed because it is too large Load Diff

30184
data/sample/3VI4.cif Normal file

File diff suppressed because it is too large Load Diff

24880
data/sample/3vi4.pdb Normal file

File diff suppressed because it is too large Load Diff

6295
data/sample/4bdf.cif Normal file

File diff suppressed because it is too large Load Diff

5106
data/sample/4bdf.pdb Normal file

File diff suppressed because it is too large Load Diff

18555
data/sample/5RGF.cif Normal file

File diff suppressed because it is too large Load Diff

16636
data/sample/5rgf.pdb Normal file

File diff suppressed because it is too large Load Diff

20071
data/sample/6TL9.cif Normal file

File diff suppressed because it is too large Load Diff

4494
data/sample/6X3P.cif Normal file

File diff suppressed because it is too large Load Diff

17899
data/sample/6tl9.pdb Normal file

File diff suppressed because it is too large Load Diff

3007
data/sample/6x3p.pdb Normal file

File diff suppressed because it is too large Load Diff

4756
data/sample/7TAA.pdb Normal file

File diff suppressed because it is too large Load Diff


@@ -0,0 +1,3 @@
data/sample/3LKF.pdb pc
data/sample/1ATP.pdb atp
data/sample/7TAA.pdb abc


@@ -1,21 +0,0 @@
#! /bin/sh
D=1.73
m=3.0
M=6.0
i=36
n=2
out="desc_all_D"$D"_m"$m"_M"$M"_i"$i"_n"$n
mkdir $out
echo "== Describing training set"
./bin/dpocket -f data/train-d.txt -D $D -m $m -M $M -i $i -n $n
mv dpout_fpocketnp.txt dpout_fpocketnp_train.txt
mv dpout_fpocketp.txt dpout_fpocketp_train.txt
mv dpout_fpocket*_train.txt $out
cat $out/dpout_fpocketp_train.txt > $out/all_train.txt
awk 'NR > 1' $out/dpout_fpocketnp_train.txt >> $out/all_train.txt
sed 's/[ ]\{1,100\}/;/g' $out/all_train.txt > $out/all_train.csv

985
doc/GETTINGSTARTED.md Normal file

@@ -0,0 +1,985 @@
# Getting started & Advanced guides
##### fpocket
* [fpocket basics](#fpocket-simple-pocket-detection)
* [fpocket advanced](#fpocket-advanced)
##### mdpocket
* [mdpocket basics](#mdpocket-pocket-detection-on-md-trajectories)
* [mdpocket advanced](#mdpocket-advanced)
##### dpocket
* [dpocket basics](#dpocket-descriptor-extraction)
* [dpocket advanced](#dpocket-advanced)
##### tpocket
* [tpocket basics](#tpocket-scoring-ranging-and-evaluation)
* [tpocket advanced](#tpocket-advanced)
##### other
* [pocket descriptors](#pocket-descriptors)
* [cofactor definitions](#Cofactor-definition)
* [customizing fpocket/mdpocket](#Customizing-fpocket)
## fpocket - simple pocket detection
To run the following examples, we use several sample input files (see the `data/sample/` directory).
### Example
Here you have a very simple and straightforward example of how to run fpocket on a single PDB file downloaded from the RCSB PDB. The following command line will execute fpocket on the 1UYD.pdb file situated in the sample directory.
`fpocket -f sample/1UYD.pdb`
It is also possible to run fpocket on a PDBx/mmcif file type, for example :
`fpocket -f sample/2P0R.cif`
It is mandatory to give a PDB input file using the -f flag on the command line. If nothing is given, fpocket prints the fpocket usage/help to the screen. fpocket will use standard parameters for the detection of pockets. For more information about these parameters see the [advanced fpocket features](#fpocket-advanced).
If fpocket works properly the output on the screen should look like this :
```bash
=========== Pocket hunting begins ==========
=========== Pocket hunting ends ============
```
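The same run can be scripted. Below is a minimal Python sketch, assuming the `fpocket` executable is on your `PATH`; the helper names (`fpocket_command`, `expected_output_dir`, `run_fpocket`) are hypothetical and not part of fpocket. It relies only on the `-f` flag and on the fact that results for `1UYD.pdb` land in a `1UYD_out/` folder next to the input.

```python
import shutil
import subprocess
from pathlib import Path


def fpocket_command(structure: str) -> list[str]:
    """Build the command line for a single-structure run (-f flag)."""
    return ["fpocket", "-f", structure]


def expected_output_dir(structure: str) -> Path:
    """Folder where results are expected: <name>_out/ next to the input."""
    path = Path(structure)
    return path.with_name(path.stem + "_out")


def run_fpocket(structure: str) -> Path:
    """Run fpocket on one structure and return the expected output folder."""
    if shutil.which("fpocket") is None:
        raise FileNotFoundError("fpocket executable not found on PATH")
    subprocess.run(fpocket_command(structure), check=True)
    return expected_output_dir(structure)


# Only print what would be executed; call run_fpocket(...) to really run it.
print(fpocket_command("sample/1UYD.pdb"))
print(expected_output_dir("sample/1UYD.pdb"))
```

Calling `run_fpocket("sample/1UYD.pdb")` raises an error if the executable is missing or if fpocket exits with a non-zero status.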
If you now look in the sample directory, you will notice that fpocket created a folder named 1UYD_out/. This folder contains all the output from fpocket, which is what you are actually interested in. If you just want a quick look at the results, go to the 1UYD_out directory and launch the 1UYD_VMD.sh script. This script will launch the VMD molecular visualizer and load the protein with binding site information coming from fpocket.
![VMD with fpocket output](images/vmd1.png)
The illustration above is roughly what you will see if you launch the VMD script. VMD is well suited for representing the volume of alpha spheres and their respective centers. Usually the visual volume information is not of primary importance, as the larger alpha spheres tend to reach far out of the protein and smaller alpha spheres are not visible because they are covered by larger ones. As can be seen in the Main VMD window, the visualization script loads 3 structures; all of them are explained in more detail in the output section of this chapter.
If you have already looked more closely at the methodological aspects of this algorithm (we invite you to read the paper), a natural question would be how to represent apolar and polar alpha spheres. Currently the color code represents only the residue ID (rank of the cavity). If you want to see characteristics of alpha spheres, we invite you to change their representation. This can be done by clicking Graphics -> Representations. Another window will show up. There you select the first molecule (1UYD_out.pdb), as shown in the figure below.
![VMD representations](images/vmd2.png)
A script for fast visualization using PyMOL is also provided. PyMOL provides nice features for browsing and selecting different pockets, using the predefined selection patterns on the right side of the main window. However, PyMOL does not interpret the pqr file format well, so alpha sphere volumes are not accurate and only alpha sphere centers can be shown.
![PyMOL with fpocket output](images/pymol1.png)
### Basic input
#### Mandatory (1 OR 2):
1: flag -f : one standard PDB or PDBx/mmcif file name.
2: flag -F : one text file containing a simple list of PDB paths
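As a hedged illustration of the second input mode, the Python sketch below writes such a list file. It assumes the "simple list" means one structure path per line; the helper name `write_pdb_list` and the file name `pdb_list.txt` are made up for this example.

```python
import tempfile
from pathlib import Path


def write_pdb_list(structures, list_file):
    """Write one structure path per line (assumed layout of the -F list file)."""
    lines = [str(s) for s in structures]
    Path(list_file).write_text("\n".join(lines) + "\n")
    return lines


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "pdb_list.txt"
    write_pdb_list(["sample/1UYD.pdb", "sample/3LKF.pdb"], list_path)
    print(list_path.read_text(), end="")
```

The resulting file would then be passed as `fpocket -F pdb_list.txt`.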
#### Optional:
For more details on optional fpocket arguments see [advanced fpocket features](#fpocket-advanced).
### Output
Fpocket output is made of many files. To have a detailed overview of those files, see [advanced fpocket features](#fpocket-advanced).
Is there something else? No, you are done. Congratulations, you have successfully performed your first pocket prediction with fpocket...without any accidents we hope. As you might have seen, usage of fpocket is rather simple, although it is command line based software. Furthermore you should have seen that fpocket is very fast, well, let's say if you do not run it on a P1 100 MHz.
As mentioned before, fpocket provides many more possibilities, especially for filtering out unwanted pockets and clustering alpha spheres. For all these issues and usage of these more advanced features, refer to [advanced fpocket features](#fpocket-advanced).
## mdpocket pocket detection on MD trajectories
The fpocket developer team is proud to present a very new feature as part of the fpocket software package. As programmers are very creative people, as you might know, we called this program mdpocket, as an acronym of Molecular Dynamics pocket (very original, isn't it?). In the next paragraphs we will refer to Molecular Dynamics as MD.
Well, mdpocket is freely available software that allows you to do the following very nice things in a quite fast way :
* pocket detection on MD trajectories (I already said this one)
* visualization of transient pockets (oh, will we have all the Pharma people on our back?)
* extraction of pocket descriptors during the MD trajectory (like pocket volume for example)
* get a static image of pocket occurrences during the MD trajectory (you may not see the usefulness of this right away, but it will become clearer later)
* perform on the fly energy calculations within a detected pocket
If you are already used to running and analyzing MD trajectories, you know that there is a bunch of different software available to perform calculation and analysis of MD trajectories. Mdpocket is able to read plain PDB files describing the conformations of a protein, but it can now also read Amber crd files, Gromacs xtc, netcdf, and CHARMM and NAMD dcd files (that was a nightmare to integrate and compile, so please make use of it).
### Example
It is VERY IMPORTANT to first align (superimpose) all snapshots onto each other. Why? Well, you have to do this due to the methodology used behind mdpocket. For more information on how mdpocket works feel free to read the mdpocket paper:
(http://bioinformatics.oxfordjournals.org/content/27/23/3276.long)
Below is an example for Amber, but you can do the same with gromacs tools, mdtraj in python, or VMD when analyzing NAMD trajectories, for instance.
With Amber you can do the structural alignment and transformation using the freely available ptraj or cpptraj program and the following steps:
- 1: create a ptraj input file with the following content :
trajin ../md_1.x.gz 1 250 10
trajin ../md_2.x.gz 1 250 10
trajin ../md_3.x.gz 1 250 10
reference ../reference.pdb
strip !:1-208
rms reference :25-88,120-196@CA,C,N,O
trajout trajectory_superimposed.dcd charmm
go
- 2: Run ptraj using the following command:
`ptraj your_topology.top < ptraj_input_file.ptr`
A few words about what we are doing here. First, the ptraj input reads trajectory files. In this example, the trajectory is split up in 3 files. Each file has 250 snapshots. Here we only read every tenth snapshot of the 250. We set a reference PDB structure for the alignment.
The strip command allows you to drop residues, here everything other than the protein (solvent, counter ions etc...).
Next, we align each snapshot on the reference structure, using only the backbone atoms (CA, C, N, O) of residues 25-88 and 120-196.
The output is written to trajectory_superimposed.dcd. Here we write a dcd file just for demonstration purposes; you can write mdcrd or netcdf files as well with ptraj.
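To make the effect of the `trajin` arguments concrete, here is a small Python sketch that counts the snapshots ending up in the superimposed trajectory. It assumes `trajin <file> 1 250 10` reads frames from 1 to 250 inclusive in steps of 10; the function name `frames_read` is hypothetical.

```python
def frames_read(start: int, stop: int, offset: int) -> list[int]:
    """Frame numbers read by one trajin line (start..stop inclusive, every offset-th)."""
    return list(range(start, stop + 1, offset))


# The three trajin lines of the example all use the same start/stop/offset.
trajectory_files = ["../md_1.x.gz", "../md_2.x.gz", "../md_3.x.gz"]
per_file = frames_read(1, 250, 10)
total = len(per_file) * len(trajectory_files)

print(f"frames per file: {len(per_file)} (first {per_file[0]}, last {per_file[-1]})")
print(f"snapshots written to trajectory_superimposed.dcd: {total}")
```

Under that assumption each file contributes 25 snapshots, so mdpocket would analyze 75 snapshots in total.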
Now, here we are, we can run mdpocket (finally...):
`mdpocket --trajectory_file trajectory_superimposed.dcd --trajectory_format dcd -f reference.pdb`
NB: you still have to provide a pdb file containing the actual topology of the structure as most of the supported MD formats only store coordinates, but no information on the actual atom & residue types of the structure.
The following part will take a while, depending on the number of atoms in your system and the number of snapshots you analyze. On average, for a sample MD of 4000 snapshots (3258 atoms), 0.4 seconds of calculation time were necessary for the analysis of 1 snapshot on one core of a 2.66 GHz Intel Quad with 4 GB of RAM.
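For a rough idea of the waiting time, the figure quoted above can be scaled linearly with the number of snapshots. This is only a back-of-the-envelope sketch in Python: it assumes a similar system size and hardware, and the function name `estimated_runtime_seconds` is made up.

```python
def estimated_runtime_seconds(n_snapshots: int, seconds_per_snapshot: float = 0.4) -> float:
    """Linear estimate based on the 0.4 s per snapshot quoted for a 3258-atom system."""
    return n_snapshots * seconds_per_snapshot


seconds = estimated_runtime_seconds(4000)
print(f"4000 snapshots: about {seconds:.0f} s ({seconds / 60:.1f} min)")
```

Larger systems or slower machines will take longer per snapshot, so treat the result as an order of magnitude only.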
Mdpocket will print some information and the current progress of the calculation. Once finished you will be able to find the following output files in your current folder :
* `mdpout_freq_grid.dx`: This is an output grid file. The grid contains only a measure of frequency of how many times the pocket was open during an MD trajectory. This, averaged by the number of snapshots, gives a range of possible iso-values between 0 and 1. Currently we provide both types of grid files (frequency & density), as both have proven their usefulness during in-house studies. However, the frequency grid file is usually much easier to interpret.
This representation already gives you a lot of information, especially about existing paths during an MD. For mechanistic studies this can often be enough. However, if you want to do measurements of the volume (for example) of a certain pocket, you have to select this region first. As VMD and the grid file are not really suitable for selection, mdpocket also provides PDB output files (`mdpout_dens_iso_8.pdb` and `mdpout_freq_iso_0_5.pdb`, described further below) in addition to the second grid file:
* `mdpout_dens_grid.dx`: This is one of the two grid output files coming from mdpocket. Briefly, a grid is superposed to all alpha spheres of all snapshots and the number of alpha spheres around each grid point is counted. This output is very useful as a working file for a first crude visualization using PyMOL or VMD. In the following example we will show VMD as the visualization of grids is easier and less heavy with it. Open VMD and load the DX file. You should have something like this (colors are different) :
![VMD with mdpocket output](images/vmd3.png)
Well, this is nice, but you can hardly see anything interpretable in there. In order to see more clearly, we recommend changing the representation by going to Graphics -> Representations as shown in the following illustration:
![VMD with mdpocket output](images/vmd4.png)
Now you can basically play with the Isovalue slider to get more or less conserved cavities during the MD trajectory. The unit of this isovalue can be expressed as the number of Voronoi vertices (alpha sphere centers) in an 8 Å³ cube around each grid point per snapshot. The more a cavity is conserved (or dense), the higher this value. Thus, you will usually get internal pockets and protein internal channels. If you are interested in very superficial or transient binding sites, you should decrease the isovalue until you see them.
* `mdpout_dens_iso_8.pdb`: This file contains all grid points having 3 or more Voronoi Vertices in the 8 Å³ volume around the grid point for each snapshot. Using PyMOL you can now select and save only the grid points of the pocket you are interested in. Save these points to another pdb file. Let us call this file my_pocket.pdb. The choice of the correct grid points for your pocket definition depends completely on you. As a rule of thumb, we would recommend using a high isovalue (like 5) if you want to show open channels in a protein or protein internal binding pockets. You should lower this isovalue (maybe to 2 or 3) if you are interested in transient phenomena (opening, closing of paths, transient pockets etc...). Refer to advanced features to know how to extract these pdb files with other iso values.
* `mdpout_freq_iso_0_5.pdb`: This is similar to the previous pdb file, just being produced on the frequency grid with a cut-off of 0.5.
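The idea behind the frequency grid and its 0.5 cut-off can be illustrated with a toy Python example. This is a sketch of the concept as described above (count how often a grid point is part of an open pocket, divide by the number of snapshots, keep points at or above the cut-off), not mdpocket's actual implementation; the function names and the tiny three-point "grid" are hypothetical.

```python
def frequency_grid(open_per_snapshot):
    """Fraction of snapshots in which each grid point is flagged as open (0..1)."""
    n_snapshots = len(open_per_snapshot)
    n_points = len(open_per_snapshot[0])
    counts = [0] * n_points
    for snapshot in open_per_snapshot:
        for i, is_open in enumerate(snapshot):
            counts[i] += is_open
    return [count / n_snapshots for count in counts]


def points_at_or_above(values, cutoff):
    """Indices of grid points whose value reaches the chosen iso-value."""
    return [i for i, value in enumerate(values) if value >= cutoff]


# Four snapshots, three grid points; 1 means the point lies in an open pocket.
snapshots = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
freq = frequency_grid(snapshots)
kept = points_at_or_above(freq, 0.5)
print(freq, kept)
```

In this toy case the first point is always open, the second only in one snapshot out of four, and the third in half of them, so only the first and third points survive the 0.5 cut-off.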
In order to measure the pocket around your previously defined pocket during the MD trajectory you have to rerun mdpocket in a slightly different way:
`mdpocket --trajectory_file trajectory_superimposed.dcd --trajectory_format dcd -f reference.pdb --selected_pocket my_pocket.pdb`
As you can see, you now have to pass your pocket definition using the --selected_pocket flag of mdpocket. To see how to define your pocket, see the section [Pocket Selection](#pocket-selection). The -v flag is optional; it is just there to provide reasonably good volume calculations in a reasonably good execution time. As in the first mdpocket run, you should see some output first and the progress of mdpocket through all your snapshots. Once finished you will find some other output files in your folder:
* `mdpout_mdpocket.pdb`: This is a pdb file that contains all Voronoi vertices in the selected pocket zone for each snapshot. Each snapshot is handled as a separate model (like an NMR structure) and can thus be viewed as MD using PyMOL. Show the surface of the vertices and you can visualize the movement of your pocket. Be careful, VMD does not read this file, as from one snapshot to the other a different number and type of Voronoi vertices can be part of the model.
* `mdpout_mdpocket_atoms.pdb`: This is a pdb file similar to the previous output, but this time containing all receptor atoms defining the binding pocket.
* `mdpout_descriptors.txt`: Last but not least, maybe the most important file, containing the pocket descriptors. You will find for each snapshot the pocket volume, the number of alpha spheres and all other default fpocket descriptors:
snapshot pock_volume nb_AS mean_as_ray ...
1 793.47 183 3.76
2 726.95 158 3.86
3 711.87 213 3.59
4 700.82 172 3.61
5 762.24 196 3.85
6 618.31 193 3.77
This output file can be easily analyzed using R, gnuplot or other suitable software. An example R output for the pocket volume would be:
![Pocket volume plot](images/volume.png)
If you want to reproduce this, simply launch R and type:
```R
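# Read the whitespace-separated descriptor table (first line is the header)
r <- read.table("mdpout_descriptors.txt", header = TRUE)
ylim <- c(400, 1200)
# Raw pocket volume per snapshot
plot(r[, "pock_volume"], type = "l", ylim = ylim, main = "", xlab = "", ylab = "")
# Overlay a smoothed trend (spline, 40 degrees of freedom) on the same axes
par(new = TRUE)
plot(smooth.spline(r[, "pock_volume"], df = 40), col = "red", lwd = 3, ylim = ylim, type = "l", xlab = "snapshot", ylab = "volume")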
```
On this figure you can see a clear volume increase of the pocket at the beginning of the trajectory. Now you can check which phenomenon causes this increase by analyzing the mdpout_mdpocket.pdb output in PyMOL. Not shown in this example, mdpocket now also provides measurements of the polar and apolar surface area (van der Waals + 1.4 Å probe) of the pocket.
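If you prefer Python over R for a first look at the descriptors, the sketch below parses the table and summarizes the pocket volume. It embeds the six example rows shown above instead of reading a real file, assumes every column is numeric, and uses a hypothetical helper name (`read_descriptors`); a real descriptor file contains more columns than the four listed here.

```python
import io
import statistics

# The example rows from above; a real run would open mdpout_descriptors.txt instead.
SAMPLE = """\
snapshot pock_volume nb_AS mean_as_ray
1 793.47 183 3.76
2 726.95 158 3.86
3 711.87 213 3.59
4 700.82 172 3.61
5 762.24 196 3.85
6 618.31 193 3.77
"""


def read_descriptors(handle):
    """Parse a whitespace-separated table with a header line into a list of dicts."""
    header = handle.readline().split()
    rows = []
    for line in handle:
        fields = line.split()
        if fields:  # skip blank lines
            rows.append(dict(zip(header, map(float, fields))))
    return rows


rows = read_descriptors(io.StringIO(SAMPLE))
volumes = [row["pock_volume"] for row in rows]
largest = max(rows, key=lambda row: row["pock_volume"])

print(f"snapshots: {len(volumes)}")
print(f"mean pocket volume: {statistics.mean(volumes):.2f}")
print(f"largest pocket at snapshot {int(largest['snapshot'])}")
```

To analyze a real run, replace `io.StringIO(SAMPLE)` with `open("mdpout_descriptors.txt")`.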
### Pocket Selection
In order to be able to track some nifty properties of your cavities, like the solvent accessible surface area, the volume or other fpocket descriptors, you have to select the zone you are interested in. This process is crucial and can deeply influence subsequent results.
But first of all, what is a selected pocket here? Here, this means a PDB file containing dummy atoms at the positions of grid points that overlap with grid points in the pocket grid you calculated in the first run (frequency or density grid). How can you obtain these dummy atoms? This can be done in two different ways.
__The fast way:__ The first, easy and not very accurate way is to use the default pdb files coming from the first run of mdpocket to detect the pocket grids. If you read this manual with huge attention and did not fall asleep in between, then you remember that mdpocket provides two files called `mdpout_freq_iso_0_5.pdb` and `mdpout_dens_iso_8.pdb`. These files contain dummy atoms at the positions of grid points having a given value or higher (iso-value of 0.5 and 8 respectively). Now you can use one of these files (depending on whether you are more comfortable with one or the other grid) and open it in a molecular viewer that is able to edit structures. PyMOL is an excellent choice for this task. Simply select all dummy atoms in the zone of interest (the pocket you want to track) and then create an object with this selection. In the end, the result should look something like this:
![pocket selection](images/pymol2.png)
Here the red cloud corresponds to the grid points I have selected by hand. You can now save the grid points that you selected as a PDB file and use this as an input for tracking the properties of the cavity.
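If you would rather script the selection than click it, one possible approach is to keep only the dummy atoms within a given distance of a point inside your pocket. The Python sketch below is an alternative to the manual PyMOL selection, not an mdpocket feature: the helper names, the sphere center, the radius, and the atom and residue labels of the synthetic test lines are all assumptions. It relies only on the fixed PDB columns 31-54 for the x, y, z coordinates.

```python
import math


def atom_xyz(line):
    """Coordinates from the fixed PDB columns 31-54 (x, y, z; 8 characters each)."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


def select_within(pdb_lines, center, radius):
    """Keep ATOM/HETATM lines whose coordinates lie within radius of center."""
    selected = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and math.dist(atom_xyz(line), center) <= radius:
            selected.append(line)
    return selected


def make_dummy_atom(serial, x, y, z):
    """Build a synthetic HETATM line for testing (labels are illustrative only)."""
    return "HETATM%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (serial, "C", "PTH", "A", serial, x, y, z)


grid_points = [
    make_dummy_atom(1, 0.0, 0.0, 0.0),
    make_dummy_atom(2, 1.0, 0.0, 0.0),
    make_dummy_atom(3, 5.0, 5.0, 5.0),
]
pocket = select_within(grid_points, center=(0.0, 0.0, 0.0), radius=2.0)
print(f"{len(pocket)} of {len(grid_points)} grid points selected")
```

With a real run you would read the lines of `mdpout_freq_iso_0_5.pdb` (or the density-based file), write the selected lines to `my_pocket.pdb`, and pass that file to `--selected_pocket`. A sphere is a crude shape, so checking the result visually is still recommended.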
__The better way:__ To get a good estimate of the volume and extent of the pocket, you will notice that the default output PDB files for the two grids are not always sufficient because of their predefined iso-values. This is why you should extract the grid points as a PDB file using your own choice of iso-value. As a general rule, take the iso-value as low as possible: you should still be able to distinguish the different pockets in the density grid, but their volume should not be very tiny!
You can extract these grid points using a python script that is available in the scripts directory of the fpocket distribution, called `extractIso.py`. Simply execute it with `python extractIso.py` to see how to use it.
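To make the idea of iso-value extraction concrete, here is a minimal Python sketch of what such an extraction does. It is not the script shipped with fpocket. It assumes the grid is a plain OpenDX scalar grid with an axis-aligned `delta` block and values stored with z varying fastest, and the function names and the `DUM` residue name are made up for this illustration.

```python
def parse_dx(text):
    """Parse a minimal OpenDX scalar grid into (counts, origin, deltas, values)."""
    counts, origin, deltas, values = None, None, [], []
    reading = False
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue
        if reading:
            try:
                row = [float(token) for token in tokens]
            except ValueError:
                reading = False  # trailing "attribute"/"object" lines end the data
                continue
            values.extend(row)
        elif "gridpositions" in tokens:
            counts = tuple(int(token) for token in tokens[-3:])
        elif tokens[0] == "origin":
            origin = tuple(float(token) for token in tokens[1:4])
        elif tokens[0] == "delta":
            deltas.append(tuple(float(token) for token in tokens[1:4]))
        elif "data" in tokens and "follows" in tokens:
            reading = True
    return counts, origin, deltas, values


def points_above(counts, origin, deltas, values, iso):
    """Return coordinates of grid points whose value is >= iso."""
    nx, ny, nz = counts
    # Assumption: axis-aligned grid, so only the diagonal of the delta block matters.
    spacing = (deltas[0][0], deltas[1][1], deltas[2][2])
    points = []
    for index, value in enumerate(values):
        if value < iso:
            continue
        ix, rest = divmod(index, ny * nz)  # x varies slowest
        iy, iz = divmod(rest, nz)          # z varies fastest
        points.append(tuple(origin[a] + i * spacing[a]
                            for a, i in enumerate((ix, iy, iz))))
    return points


def dummy_atom_lines(points):
    """Format points as PDB HETATM dummy atoms (placeholder residue name DUM)."""
    template = ("{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   "
                "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}")
    return [template.format("HETATM", serial % 100000, " C", "", "DUM", "A",
                            1, "", x, y, z, 0.0, 0.0)
            for serial, (x, y, z) in enumerate(points, start=1)]
```

A typical use would be to read `mdpout_freq_grid.dx` into a string, call `parse_dx`, and pass the result to `points_above` with the iso-value you picked by visual inspection.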
### Basic Input
#### Mandatory (running mode 1 - detecting pockets):
##### either:
--trajectory_file : input trajectory file in one of the supported formats
--trajectory_format : (dcd,xtc,netcdf,crd,crdbox,dtr,trr)
-f : topology of the structure as input PDB file
##### or :
-L: a mdpocket input file, this file has to contain the paths to the PDB files of all snapshots (one path per line)
#### Mandatory (running mode 2 - calculating descriptors):
##### either:
--trajectory_file : input trajectory file in one of the supported formats
--trajectory_format : (dcd,xtc,netcdf,crd,crdbox,dtr,trr)
-f : topology of the structure as input PDB file
--selected_pocket : a PDB file containing the sitepoints in the pocket to be selected
##### or :
-L: a mdpocket input file, this file has to contain the paths to the PDB files of all snapshots (one path per line)
--selected_pocket : a PDB file containing the sitepoints in the pocket to be selected
#### Optional:
-o : the prefix you want to give to mdpocket output files
Note that mdpocket determines its running mode from the input given by the user. Thus if you do not provide a pocket of interest using the --selected_pocket flag, mdpocket will automatically perform cavity detection only. mdpocket offers many more optional parameters to guide the pocket detection. All fpocket parameters for pocket clustering and filtering are also available in mdpocket. For this see [advanced mdpocket features](#mdpocket-advanced).
### Output (running mode 1 - pocket detection)
* `mdpout_dens_grid.dx`: A dx formatted grid output. This grid contains the number of Voronoi vertices seen per snapshot nearby the grid point. It can be easily visualized using VMD.
* `mdpout_freq_grid.dx`: Similar to the previous file, this grid file contains the frequency of opening of a pocket at each grid point. It can be visualized using VMD.
* `mdpout_dens_iso_8.pdb`: A pdb file of all grid point positions corresponding to grid points having 8 or more Voronoi vertices nearby per snapshot. This file is provided in order to be able to edit the grid points using PyMOL and select only the points defining the pocket of interest. This pocket of interest should be used as input of mdpocket in the 2nd running mode. If you want to extract gridpoints with other isovalues, use the provided `extractISO.py` file in the scripts directory.
* `mdpout_freq_iso_0_5.pdb`: A pdb file of all grid point positions corresponding to grid points that are 50% of the trajectory overlapping with a pocket. This file is provided in order to be able to edit the grid points using PyMOL and select only the points defining the pocket of interest. This pocket of interest should be used as input of mdpocket in the 2nd running mode. If you want to extract gridpoints with other isovalues, use the provided `extractISO.py` file in the scripts directory.
### Output (running mode 2 - pocket characterization)
* `mdpout_mdpocket.pdb`: A pdb file containing all Voronoi vertices within the selected pocket region for all snapshots. This file is an NMR-like file, containing each snapshot as a separate model. This file is best viewed using PyMOL and can be used to create pocket motion movies.
* `mdpout_mdpocket_atoms.pdb`: A pdb file containing all receptor atoms surrounding the selected pocket region. Like the previous output file, this is an NMR-like file, containing each snapshot as a separate model. This file can be viewed with VMD and PyMOL.
* `mdpout_descriptors.txt`: A text file containing the fpocket pocket descriptors of the selected pocket region for each snapshot. This file can be easily analyzed using standard statistical software like R.
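If you prefer Python over R for a first look at `mdpout_descriptors.txt`, a sketch along the following lines can summarize one descriptor over the trajectory. It assumes the file is whitespace-separated with one header row. The column name `pock_volume` used in the usage comment is an assumption, so check the header of your own file first.

```python
import statistics


def column_stats(handle, column):
    """Return (mean, standard deviation) of one column of a whitespace-separated table."""
    header = handle.readline().split()
    index = header.index(column)  # raises ValueError if the column is missing
    values = [float(line.split()[index]) for line in handle if line.strip()]
    return statistics.mean(values), statistics.stdev(values)


# Usage (column name is an assumption, not taken from this manual):
# with open("mdpout_descriptors.txt") as handle:
#     mean_volume, sd_volume = column_stats(handle, "pock_volume")
```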
## dpocket descriptor extraction
Until now you have seen what the majority of cavity detection algorithms can do. So apart from speed and hopefully prediction results, nothing distinguishes fpocket from other algorithms like ligsite, sitemap, sitefinder, pocketpicker, pass ...
This is only partially true, because the fpocket package contains dpocket. D stands for describing. One purpose a cavity detection algorithm can be used for is the extraction of descriptors of the physico-chemical environment of the cavity. dpocket allows you to do this in a very simple and straightforward way. As extracting binding pocket descriptors from only one protein would be somewhat meaningless for studying pocket characteristics, dpocket enables analysis of multiple structures. So scripting and automation are no longer necessary to do this kind of thing. But let's have a closer look, again using a very simple example you can try on your workstation.
### Example
Here we go. dpocket requires one single input file. This input file must be a text file containing the following information:
- 1: the PDB file of the protein you want to analyze and
- 2: the ID of the ligand you would like to have as reference in order to define an explicitly defined binding pocket. The file used in this example (data/sample/test_dpocket.txt) looks like this :
```
data/sample/3LKF.pdb pc1
data/sample/1ATP.pdb atp
data/sample/7TAA.pdb abc
```
Here we analyze three pdb files. Note that the ligand name must be separated from the pdb file name by a tab character. You can launch dpocket on this sample file using the following command:
`dpocket -f sample/test_dpocket.txt`
dpocket will yield three result files in the current directory. By default these files are:
- dpout_explicitp.txt
- dpout_fpocketnp.txt
- dpout_fpocketp.txt
If you want to change the naming of these files, use the `-o` flag on the command line to define a new prefix for the dpocket output files; for example, `my_test` as prefix would yield `my_test_explicitp.txt`. The three output files contain the pocket descriptors implemented in fpocket for each pocket:
- __fpocketp.txt__: describes all binding pockets found by fpocket that match one of the detection criteria. In other words, fpocket found several pockets in the protein, and this file contains the descriptors of the pockets that are considered to be the binding pocket according to some detection criteria.
- __fpocketnp.txt__: describes, on the contrary, all pockets found by fpocket that are not considered to be the actual pocket according to the detection criteria.
- __explicitp.txt__: describes the explicitly defined pockets. By explicitly defined, we mean that the pocket is defined as all vertices/atoms situated within a given distance of the ligand (4Å by default), regardless of what fpocket found during the algorithm.
The output files are tab separated ASCII text files that are easy to parse using statistical software such as R. Thus statistical analysis of pocket descriptors becomes a very straightforward and easy process. Basically, the first two files might be used to establish a new scoring function as they describe what fpocket finds, while the last file could be used for a more detailed and accurate analysis of the exact part of the protein that interacts with the ligand.
For more details of the output refer to the output section below, or to [advanced dpocket features](#dpocket-advanced).
### Basic input
#### Mandatory:
flag -f : a dpocket input file, this file has to contain the path to the PDB file, as well as the residue name of the reference ligand, separated by a tab.
#### Optional:
flag -o : the prefix you want to give to dpocket output files
dpocket offers many more optional parameters to guide the pocket detection. For this see [advanced dpocket features](#dpocket-advanced).
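Because the separator in the dpocket input file must be a tab, it can be convenient to generate the file from a script rather than typing it in an editor that may silently convert tabs to spaces. The helper below is only a sketch and its name is made up for this example.

```python
def dpocket_input_text(entries):
    """Build the text of a dpocket input file from (pdb_path, ligand_resname) pairs."""
    lines = []
    for pdb_path, ligand in entries:
        if "\t" in pdb_path or "\t" in ligand:
            raise ValueError("fields must not contain tab characters")
        lines.append(f"{pdb_path}\t{ligand}")  # exactly one tab between the two fields
    return "\n".join(lines) + "\n"


sample = dpocket_input_text([
    ("data/sample/3LKF.pdb", "pc1"),
    ("data/sample/1ATP.pdb", "atp"),
    ("data/sample/7TAA.pdb", "abc"),
])
```

Writing `sample` to a text file gives an input equivalent to the one shown in the example above.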
### Output
Refer to [advanced dpocket features](#dpocket-advanced) for a detailed description of the dpocket output files.
To conclude this first very easy dpocket run, you can see that you have a very fast and reliable tool to extract descriptors of binding pockets and “non binding pockets” on a large scale. These descriptor files provide an excellent basis for further statistical analysis and model building, which leads immediately to your wish to write a new scoring function for ranking pockets using the different descriptors. Well, fpocket, dpocket and tpocket are very useful tools to do exactly this! So go ahead. Let's suppose you have processed several thousand PDB files and statistically analyzed the significance of all descriptors. You have set up a new scoring function. Now you have an external test set of PDB files you haven't tested. How can you evaluate your scoring function? This is actually also a very easy task, using tpocket.
## tpocket scoring ranging and evaluation
As already mentioned in the previous paragraph, tpocket can be used to rapidly evaluate cavity scoring functions. If you are for example in the pharmaceutical industry and you want to set up the ultimate druggability prediction score, you might be able to do this with fpocket and dpocket. Afterwards you can actually test your method using tpocket. T stands for testing here.
Something fancy we did not tell you about before is that you can also test your scoring function on apo structures using tpocket. The only requirement is that the holo and apo structures are aligned, so that the apo and holo pockets are superposed. But let's explain this with an example. Of course, testing a holo dataset is even easier: you just need to provide the residue name of the ligand and tpocket will do the rest.
### Example tpocket on apo structures
If you had a look at the fpocket paper, you might have seen that the algorithm was validated on a dataset of 48 proteins previously used to evaluate several pocket detection algorithms. As fpocket programmers are, by definition, very nice people, they have included this data set (holo and aligned apo structures) in the distribution of fpocket, released as `fpocket-1.0-data` with the original fpocket 1 release. [The tar.gz is available on sourceforge](https://sourceforge.net/projects/fpocket/files/fpocket-1.0/fpocket-src-1.0/fpocket-data-1.0.tgz/download)
So let us use this set as an example here. When you extract the dataset in your folder you should have a data folder containing, among others, two files, `pp_apo-t.txt` and `pp_cplx-t.txt`. The first file is a tpocket input file used to assess the capacity of the scoring function to correctly rank known binding sites on apo structures. The second file is also a tpocket input file, but this time for known binding sites on holo structures. Here is a part of `pp_apo-t.txt`:

    data/pp_data/unbound/1QIF-1ACJ.pdb data/pp_data/complex/1ACJ.pdb tha
    data/pp_data/unbound/3APP-1APU.pdb data/pp_data/complex/1APU.pdb iva
    data/pp_data/unbound/1HSI-1IDA.pdb data/pp_data/complex/1IDA.pdb qnd
    data/pp_data/unbound/1PSN-1PSO.pdb data/pp_data/complex/1PSO.pdb iva
    data/pp_data/unbound/1L3F-2TMN.pdb data/pp_data/complex/2TMN.pdb po3
    data/pp_data/unbound/3TMS-1BID.pdb data/pp_data/complex/1BID.pdb UMP
    data/pp_data/unbound/8ADH-1CDO.pdb data/pp_data/complex/1CDO.pdb NAD
    data/pp_data/unbound/1HXF-1DWD.pdb data/pp_data/complex/1DWD.pdb MID

Here the first column contains the path to the apo structure, aligned to the holo structure, which is given in the second column. Using a holo dataset, the first and the second column would be the same. The third column indicates the PDB HETATM code of the ligand in the holo structure that is situated in the binding site.
You can use this file to run tpocket using the following command line :
`tpocket -L data/pp_apo-t.txt`
Let us continue with this first, more interesting example, which covers a lot of structures. After some calculation time, tpocket will provide two standard output files. The moment has come: you will finally know whether you discovered the ultimate method of druggability prediction, or sugar binding site prediction, or whatever. The first file is called `stats_g.txt` by default. It contains global statistics about the prediction using all evaluation criteria available in tpocket, for example how many binding sites you found among the three top-ranked cavities. For illustration, only the first of the six tables available in this file is shown below:

    Ratio of good predictions (dist = 4A)
    -------------------------------------
    Rank <= 1 : 0.69
    Rank <= 2 : 0.83
    Rank <= 3 : 0.94
    Rank <= 4 : 0.94
    Rank <= 5 : 0.94
    Rank <= 6 : 0.94
    Rank <= 7 : 0.94
    Rank <= 8 : 0.94
    Rank <= 9 : 0.94
    Rank <= 10 : 0.94
    -------------------------------------
    Mean distance : 2.924573
    Mean relative overlap : 39.373226

This table summarizes the capacity of your scoring function to identify the binding sites of the 48 apo structures using the criteria published in the original PocketPicker paper. Not shown here, tpocket provides two other, maybe more accurate, measures for a correctly identified binding site. These measures are explained in more detail in the [advanced tpocket features section](#tpocket-advanced), as they can be a bit more tricky.
The second output file provides more detailed statistics about each structure analyzed. This file, called `stats_p.txt`, enables the user to analyze more closely why scoring might not work well on a specific structure. Here is an extract of the first columns and lines of this file:

    LIG | COMPLEXE | APO           | NB_PCK | OVLP1 | OVLP2 | DIST_CM | POS1 | POS2 | POS3
    THA   1ACJ.pdb   1QIF-1ACJ.pdb   22       79.31   78.33   0.00      1      1      0
    IVA   1APU.pdb   3APP-1APU.pdb   4        0.00    0.00    3.43      0      0      1
    QND   1IDA.pdb   1HSI-1IDA.pdb   4        82.69   81.65   3.19      1      1      1
    IVA   1PSO.pdb   1PSN-1PSO.pdb   9        80.00   51.38   3.49      1      1      1
    PO3   2TMN.pdb   1L3F-2TMN.pdb   10       58.33   72.00   2.69      1      1      1
    UMP   1BID.pdb   3TMS-1BID.pdb   15       63.64   60.78   3.52      1      1      1
    NAD   1CDO.pdb   8ADH-1CDO.pdb   18       0.00    0.00    3.41      0      0      1
    MID   1DWD.pdb   1HXF-1DWD.pdb   10       93.48   81.37   3.86      1      1      1

Using this output you have a detailed view of what worked and what did not work for all criteria. For instance, in this example, fpocket detects all apo binding sites well, apart from the first one, using the PocketPicker criterion for binding site identification (DIST_CM). POS3 corresponds to the rank of the cavity using the scoring function of fpocket. You also get information about the number of pockets per protein and the exact overlap with the actual pocket.
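The "Rank <= k" ratios of the global statistics can be recomputed from the per-structure ranks, which is handy when you want the same figure for a subset of structures. The sketch below assumes that a rank of 0 means the binding site was not identified for that structure. This is how the extract above reads, but treat it as an assumption.

```python
def ratio_within_rank(ranks, max_rank):
    """Fraction of structures whose binding site is ranked between 1 and max_rank."""
    if not ranks:
        raise ValueError("no ranks given")
    hits = sum(1 for rank in ranks if 0 < rank <= max_rank)  # 0 is treated as "not found"
    return hits / len(ranks)


# POS3 column of the eight rows shown in the extract above
pos3 = [0, 1, 1, 1, 1, 1, 1, 1]
top1_ratio = ratio_within_rank(pos3, 1)  # 7 of 8 structures, i.e. 0.875
```

Note that this covers only the eight rows of the extract, so it will not match the figures of the full 48-structure table.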
Now if you want to assess your scoring function on holo structures, you can also use tpocket. This time you only have to provide `pp_cplx-t.txt`, also provided within the data tar.gz file. As you can see, this file is very similar to `pp_apo-t.txt`. The only difference is that the first column repeats the path to the complex structure, like this:

    data/pp_data/complex/1acj.pdb data/pp_data/complex/1acj.pdb tha
    data/pp_data/complex/1apu.pdb data/pp_data/complex/1apu.pdb iva
    data/pp_data/complex/1ida.pdb data/pp_data/complex/1ida.pdb qnd
    data/pp_data/complex/1pso.pdb data/pp_data/complex/1pso.pdb iva
    data/pp_data/complex/2tmn.pdb data/pp_data/complex/2tmn.pdb po3
    data/pp_data/complex/1bid.pdb data/pp_data/complex/1bid.pdb ump
    data/pp_data/complex/1cdo.pdb data/pp_data/complex/1cdo.pdb nad

### Basic Input
#### Mandatory:
flag -L : a tpocket input file, this file has to contain the paths to the PDB files (apo, holo or holo, holo if you want to test fpocket only on holo structures), as well as the residue name of the reference ligand, separated by tabs.
#### Optional:
flag -o : the prefix you want to give to tpocket detailed statistics
flag -e : the prefix you want to give to tpocket general statistics
tpocket offers many more optional parameters to guide the pocket detection. For this see the [advanced tpocket features section](#tpocket-advanced).
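Before launching a long tpocket run it can be worth checking that every line of the input file really has three fields. The following sketch does this and also tells apo/holo lines from holo/holo lines. It splits on any whitespace, so it assumes your paths contain no spaces, and the names `TpocketEntry` and `parse_tpocket_input` are invented for this example.

```python
from typing import NamedTuple


class TpocketEntry(NamedTuple):
    apo_path: str
    holo_path: str
    ligand: str

    @property
    def holo_only(self):
        """True when both structure columns name the same file (holo dataset)."""
        return self.apo_path == self.holo_path


def parse_tpocket_input(text):
    """Parse tpocket input text into entries, failing loudly on malformed lines."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 fields, got {len(fields)}")
        entries.append(TpocketEntry(*fields))
    return entries
```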
### Output
Using standard parameters on the example tpocket list given in the example paragraph above, tpocket returns two output files:
* `stats_p.txt`: This file contains the detailed statistics of tpocket. The name and the ligand of the analyzed PDB structure are repeated, as well as the exact overlap of the binding pocket identified by fpocket with the actual binding pocket (identified with the help of the ligand, called OVLP here). You will see two different overlaps in the output. For further information refer to the [advanced tpocket features section](#tpocket-advanced). Furthermore, the distance criterion used in the Chemistry Central Journal paper on PocketPicker is reported (DIST_CM). Finally, you get exact information about the rank of the cavity using the fpocket scoring function.
* `stats_g.txt`: Second, tpocket provides more general statistics about pocket identification on the dataset provided. For both overlap criteria the ranking performance (the capacity of the fpocket scoring to correctly rank a binding site having a certain minimum overlap with the actual binding site) is printed into this file. Thus, the statistics in this file give you a rapid overview of the global performance of your method.
To summarize, tpocket is a very fast way to test fpocket's performance on your own dataset and to test your own scoring functions for ranking identified binding sites.
You have finished the Getting started section. We hope that you notice the usefulness (hopefully;) of this package of programs for the research of new features, descriptors and scoring functions in the binding site identification field. Well, this was only a very fast overview of the very basic features of fpocket, dpocket and tpocket. If you want to dive into the development of your own pocket descriptors and scoring functions, or if you want to change the pocket detection parameters for your purposes, continue with the Advanced features section, next.
# Advanced Features
You want to know more about fpocket? This is the section for you. Here we tried to compile in a (we hope) comprehensive manner the most important details of fpocket, dpocket and tpocket that you can access from the command line. It is essential to know that fpocket's performance was assessed, and its scoring function established, for the standard parameters. The performance of pocket detection and scoring is highly dependent on these parameters, so keep in mind that you might have to adapt the scoring to your specific problem.
Note that this section does not provide too much information about the theoretical background of the way fpocket works. In order to learn more about this read the Materials & Methods of the [freely available paper on the BMC Bioinformatics](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-168) website. Nevertheless, we tried to keep it as clear as possible, using some application examples.
## fpocket advanced
### Input command line arguments
#### Mandatory:
The simplest way to run fpocket is either by providing a single pdb file, or by providing a list of pdb files stored in a simple text file. You will need one of these two inputs to run fpocket:
-f string : one standard PDB filename that you want to analyze with fpocket
##### or:
-F string : filename of a simple list of pdb files.
#### Optional:
-m float: (default 3.4Å) This flag enables the user to modify the minimum radius an alpha sphere might have in a binding pocket. An alpha sphere is a contact sphere that touches 4 atoms in 3D space without having any internal atoms. The default value filters out alpha spheres that are too small (protein internal). If you want to analyze internal interstices, lower this parameter. On the contrary, if you want to analyze more solvent exposed cavities, you can raise this parameter in order to filter out cavities that are too buried.
-M float: (default 6.2Å) Here you can modify the maximum radius of alpha spheres in a pocket. An alpha sphere is a contact sphere that touches 4 atoms in 3D space without having any internal atoms. The default value filters out contact spheres that are too large and lie on the protein surface. If you want to analyze very flat and solvent exposed surface depressions, raise this parameter. For analysis of buried parts of the protein you can lower this parameter. Higher radii might be more interesting for identification of protein protein binding sites or polysaccharide binding sites. Smaller radii enable detection of buried cavities for small organic molecules (drugs, for instance).
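To illustrate what the -m and -M radii apply to, the sketch below computes the sphere passing through four atom centers and checks it against the two default limits. It is a didactic example only: it does not show how fpocket obtains its alpha spheres, and the function names are invented here.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def contact_sphere(p0, p1, p2, p3):
    """Center and radius of the sphere through four non-coplanar points."""
    # Equal distance to p0 and p gives 2 (p - p0) . c = |p|^2 - |p0|^2 for each p.
    rows = [[2.0 * (p[i] - p0[i]) for i in range(3)] for p in (p1, p2, p3)]
    rhs = [sum(p[i] ** 2 - p0[i] ** 2 for i in range(3)) for p in (p1, p2, p3)]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("points are coplanar, no unique sphere")
    center = []
    for col in range(3):  # Cramer's rule, one coordinate at a time
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / d)
    return tuple(center), math.dist(center, p0)


def is_empty(center, radius, atoms, tolerance=1e-6):
    """True if no atom center lies strictly inside the sphere."""
    return all(math.dist(center, atom) >= radius - tolerance for atom in atoms)


def radius_accepted(radius, min_radius=3.4, max_radius=6.2):
    """Apply the -m / -M style radius window (defaults taken from the text above)."""
    return min_radius <= radius <= max_radius
```

For four points placed 4Å from a common center, `contact_sphere` returns a radius of 4.0, which `radius_accepted` keeps with the default limits.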
-l int: (None) If you have an input PDB file of an NMR structure or one with multiple models you can specify which model (conformation) you'd like to analyse
-C char: (default s) The clustering method to be used here. By default a pairwise single linkage clustering is used here.
's': pairwise single linkage clustering,
'm': pairwise maximum- (or complete-) linkage clustering,
'a': pairwise average-linkage clustering,
'c': pairwise centroid-linkage clustering
-e char: (default e) The distance measure used for the clustering algorithm.
'e': Euclidean distance
'b': City-block distance
'c': correlation
'a': absolute value of the correlation
'u': uncentered correlation
'x': absolute uncentered correlation
's': Spearman's rank correlation
'k': Kendall's tau
-i int: (default 15) This flag indicates the minimum number of alpha spheres a pocket must contain in order to appear in the results provided by fpocket. This parameter enables filtering of cavities that are too small. Thus, if you also want to analyze smaller cavities, lower this parameter; if you are only interested in huge cavities, like NADP binding sites, you can raise it in order to retain only very few pockets in the end. To give you an idea, a rather big cavity, like a NADP binding site, can have hundreds of alpha spheres. Thus, the default value also keeps smaller binding sites.
-A int: (default 3) Fpocket distinguishes between two types of alpha spheres. Polar alpha spheres and apolar alpha spheres. This flag ranges from 0 to 4 and modifies the definition of the alpha sphere type. By default, an alpha sphere contacting at least 3 apolar atoms (having an electronegativity below 2.8) is considered as apolar. If this is not the case it is considered as polar.
-D float: (default 2.4Å) This parameter changed compared to previous versions of fpocket, as we completely replaced the clustering algorithms. This distance is now used to cut the sub-trees of the hierarchical clustering at the desired distance. The bigger the distance, the larger the clusters you'll get.
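For the default single linkage method, cutting the hierarchy at a distance is equivalent to grouping all points that are connected through chains of neighbours closer than that distance. The sketch below shows this on plain coordinates with a small union-find. It illustrates the principle under that assumption and is not the clustering code used by fpocket, for instance in how a distance exactly equal to the cutoff is treated.

```python
import math


def single_linkage_clusters(points, cutoff):
    """Group point indices into single linkage clusters cut at the given distance."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= cutoff:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
clusters = single_linkage_clusters(centers, 2.4)  # first three points chain together
```

Raising the cutoff can only merge clusters, which is why a bigger -D value gives larger clusters.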
-p float: (default 0.0) This is another parameter for filtering unwanted pockets. It defines the minimum ratio between the number of apolar alpha spheres and the total number of alpha spheres a pocket must have in order to be kept in the results list. That is to say, by default every pocket is kept (0.0). Now, if you would like to keep only rather hydrophobic pockets, raise this parameter and very polar cavities will be filtered out. This parameter is a ratio, not a percentage, thus it ranges from 0 to 1.
-v int: (default 2500) By default, pocket volumes are calculated using a Monte Carlo algorithm. Basically, the algorithm picks a random point in space, checks whether it is included in any alpha sphere, and stores this status. This is repeated N times, and the volume of the pocket is estimated from the ratio between the number of hits and the number of iterations, scaled by the size of the box. This parameter defines the number of iterations to perform. Of course, the higher the value, the greater the accuracy, but the slower the calculation.
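The Monte Carlo estimate described for -v can be written in a few lines. The sketch below follows the description above. The choice of bounding box (sphere centers extended by their radii) and the function name are assumptions of this example, not a statement about fpocket's internals.

```python
import random


def monte_carlo_volume(spheres, iterations=2500, seed=None):
    """Estimate the volume of a union of spheres given as ((x, y, z), radius) pairs."""
    rng = random.Random(seed)
    # Bounding box enclosing every sphere (assumption: centers extended by radii).
    lows = [min(center[i] - radius for center, radius in spheres) for i in range(3)]
    highs = [max(center[i] + radius for center, radius in spheres) for i in range(3)]
    box_volume = 1.0
    for low, high in zip(lows, highs):
        box_volume *= high - low

    hits = 0
    for _ in range(iterations):
        point = [rng.uniform(low, high) for low, high in zip(lows, highs)]
        inside = any(
            sum((point[i] - center[i]) ** 2 for i in range(3)) <= radius * radius
            for center, radius in spheres
        )
        if inside:
            hits += 1
    # Fraction of hits, scaled by the size of the box.
    return box_volume * hits / iterations
```

With 2500 iterations the estimate for a single sphere typically fluctuates by a few percent between runs, which gives a feeling for the accuracy versus speed trade-off mentioned above.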
-b (none): (NOT USED BY DEFAULT) This option allows the user to choose a discrete algorithm to calculate the volume of each pocket instead of the Monte Carlo method. This algorithm puts each pocket into a grid with cells of dimension (1/N*X ; 1/N*Y ; 1/N*Z), N being the value given with this option, and X, Y and Z being the box dimensions, determined from the coordinates of the vertices. Then, a triple loop over the three dimensions is used to estimate the volume, checking whether each point visited lies in one of the pocket's vertices. This parameter defines the grid discretization. If this parameter is given, this algorithm is used instead of the Monte Carlo algorithm.
Warning: Although this algorithm can be more accurate, a high value might dramatically slow down the program, as this algorithm has a maximum complexity of N*N*N*nb_vertices, and a minimum of N*N*N !!!
-d (none): Option allowing you to output pockets and properties in a condensed format. This will put to the stdout pocket properties in a tab separated string and write pocket files in a subfolder
-r string: (None) This parameter allows you to run fpocket in a restricted mode. Let's suppose you have a very shallow or large pocket with a ligand inside, and the automatic pocket prediction always splits up your pocket or finds only a part of it. Specifying your ligand residue with -r allows you to detect and characterize your ligand binding site explicitly. For instance for `1UYD.pdb` you can specify `-r 1224:PU8:A` (residue number of the ligand: residue name of the ligand: chain of the ligand)
-P string: (None) Binding site delimited by the user through the input. You can indicate which amino acids are part of the binding site you'd like to "identify" and calculate descriptors for. fpocket will run its usual alpha sphere detection, and clustering will be guided to collect all alpha spheres in contact with the residues of interest. You should enter a string of residues with residue numbers, insertion codes & chain codes: 'residuenumber1:insertioncode1:chaincode1.residuenumber2:insertioncode2:chaincode2.residuenumber3:insertioncode3:chaincode3'. Insertion codes can be empty. For instance, `-P 107::A.138::A.51::A.98::A.55::A.93::A` describes part of the HSP90 binding site of 4cwr. NB: If you use an mmcif file as input, you need to use the automatically assigned residue number instead of the author defined number for this to work.
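Typing the -P string by hand is error-prone for larger sites, so it may help to generate it from a residue list. The helper below is a small sketch that follows the format described above, and its name is made up for this example.

```python
def pocket_residue_string(residues):
    """Build a -P argument from (residue_number, insertion_code, chain) tuples."""
    parts = []
    for number, insertion_code, chain in residues:
        # An empty insertion code yields the "107::A" form shown above.
        parts.append(f"{number}:{insertion_code}:{chain}")
    return ".".join(parts)


hsp90_site = pocket_residue_string(
    [(107, "", "A"), (138, "", "A"), (51, "", "A"),
     (98, "", "A"), (55, "", "A"), (93, "", "A")]
)
```

Here `hsp90_site` reproduces the example string given for 4cwr.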
-y string: (filename) EXPERIMENTAL: here you can specify a topology filename in the Amber prmtop format. This can then be used by fpocket & mdpocket to calculate energy grids for your pockets. NB: you have to specify the -x flag to run energy calculations
-x None: (None) EXPERIMENTAL: specify this flag if you want to run energy calculations on calculated pockets. That's not fully functional and only one or two probes are currently generated and output density grids written. Use with caution
-c char : (Default is none): Use this flag to choose which chains you want to delete before running fpocket. The selected chains can be specified with ',' or ':' delimiters, for example you can use it '-c B,D' or '-c B:D'. You can delete up to 20 different chains.
-k char : (Default is none): Use this flag to choose which chains you want to keep before running fpocket. The selected chains can be specified with ',' or ':' delimiters, for example you can use it '-k A,B,E' or '-k A:B:E'. You can keep up to 20 different chains.
-a char : (Default is none): With this flag you can select a chain you want to be considered as a ligand. Works the same way as the "-r" flag but with a whole chain. Only a single chain can be chosen, for example '-a D'.
-w char : (Default is 'd') : With this flag you can choose the output file format: 'd' -> default (same output format as input) | 'b' or "both" -> both pdb and mmcif | 'p' or "pdb" -> pdb | 'm' or "cif" or "mmcif" -> mmcif, for example "-w cif" or "-w p"
### Output files description
fpocket writes its output directly in the directory of the data file, creating a directory using the name of the PDB file followed by the `_out` extension. Here, the command `ll sample/3LKF_out` for the current sample run would show something like this:

    total 332
    -rw-r--r-- 1 peter users 769 Nov 29 00:14 3LKF.pml
    -rw-r--r-- 1 peter users 698 Nov 29 00:14 3LKF.tcl
    -rwxr-xr-x 1 peter users 30 Nov 29 00:14 3LKF_PYMOL.sh
    -rwxr-xr-x 1 peter users 41 Nov 29 00:14 3LKF_VMD.sh
    -rw-r--r-- 1 peter users 245835 Nov 29 00:14 3LKF_out.pdb
    -rw-r--r-- 1 peter users 6725 Nov 29 00:14 3LKF_pockets.info
    -rw-r--r-- 1 peter users 49355 Nov 29 00:14 3LKF_pockets.pqr
    -rw-r--r-- 1 peter users 4073 Nov 29 00:14 3LKF_info.txt
    drwxr-xr-x 2 peter users 4096 Nov 29 00:14 pockets

As you can see, fpocket provides a lot of files and another subdirectory. However, the majority of these files are there for easy visualization of binding pockets. Let's explain the content and utility of each file:
* `3LKF_info.txt`: this file contains human readable information (descriptors) about the pockets found on the protein. Notably this file contains a pocket score (likelihood that this is a small molecule binding site) and a druggability score (how druggable the binding site is). Here is an extract:
Pocket 1 :
Score : 0.490
Druggability Score : 0.019
Number of Alpha Spheres : 21
Total SASA : 19.687
Polar SASA : 7.611
Apolar SASA : 12.076
Volume : 270.934
Mean local hydrophobic density : 3.000
Mean alpha sphere radius : 3.816
Mean alp. sph. solvent access : 0.519
Apolar alpha sphere proportion : 0.190
Hydrophobicity score: 23.889
...
* `3LKF.pml`: this is a PyMOL script for visualization of binding pockets using PyMOL
* `3LKF.tcl`: this is a tcl script for visualization of binding pockets using VMD
* `3LKF_PYMOL.sh`: this is the executable script to launch fast visualization using PYMOL
* `3LKF_VMD.sh`: this is the executable script to launch fast visualization using VMD
* `3LKF_out.pdb`: this is the most important file, it contains the initial PDB structure given as input. Non cofactor HETATM occurrences are stripped from this file compared to the original PDB input file. The PDB file contains the centers of alpha spheres as dummy atoms using the HETATM definition. These alpha sphere centers are appended at the end of the PDB file, using the STP residue name (for site point). Apolar alpha spheres carry the atom name APOL, polar alpha spheres the atom name POL. Pockets are sets of alpha spheres. They can be distinguished by residue number. Thus residue STP 1 would be the first binding pocket according to fpocket. To show this more clearly here is an extract of `3LKF_out.pdb`:
ATOM 2349 CD LYS A 299 9.679 16.827 105.636 0.00 0.00 C 0
ATOM 2350 CE LYS A 299 10.371 16.314 104.370 0.00 0.00 C 0
ATOM 2351 NZ LYS A 299 11.749 15.794 104.597 0.00 0.00 N 0
ATOM 2352 OXT LYS A 299 5.240 20.009 107.670 0.58 9.64 O 0
HETATM 1 APOL STP C 1 27.849 33.435 123.906 0.00 0.00 Ve
HETATM 2 APOL STP C 1 29.108 33.195 122.206 0.00 0.00 Ve
HETATM 3 APOL STP C 1 28.611 33.141 119.797 0.00 0.00 Ve
HETATM 4 APOL STP C 1 26.830 32.143 118.779 0.00 0.00 Ve
* `3LKF_pockets.pqr`: This file contains all alpha sphere centers, like the `3LKF_out.pdb` file, but contains no information about the protein structure. Furthermore, the pqr format allows the van der Waals radius of atoms to be written explicitly in this file. Here this possibility was used to output the radii of the alpha spheres of a pocket. By loading this pqr file, one can analyze the volume recognized by fpocket more precisely. Note that currently only VMD supports reading this format correctly. PyMOL is able to read pqr files, but does not interpret van der Waals radii.
* `pockets/`: Well, again a subdirectory. But I promise, it's the last one. For development purposes or easy analysis, fpocket provides this directory, which in the current example contains:
pocket0_atm.pdb pocket2_vert.pqr pocket5_atm.pdb pocket7_vert.pqr
pocket0_vert.pqr pocket3_atm.pdb pocket5_vert.pqr pocket8_atm.pdb
pocket1_atm.pdb pocket3_vert.pqr pocket6_atm.pdb pocket8_vert.pqr
pocket1_vert.pqr pocket4_atm.pdb pocket6_vert.pqr pocket9_atm.pdb
pocket2_atm.pdb pocket4_vert.pqr pocket7_atm.pdb pocket9_vert.pqr
* `*_atm.pdb`: These files contain only the atoms contacted by alpha spheres in the given pocket. Complementing this information, the `*_vert.pqr` files contain only the centers and radii of the alpha spheres within the respective pocket. As the extensions indicate, atoms are written in the PDB file format and alpha sphere centers in the PQR file format.
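The pocket numbering described above can be recovered with a few lines of Python. This is a minimal sketch, assuming the output follows the standard fixed-column PDB layout; `read_pocket_spheres` and `_stp_line` are hypothetical helper names, and the third demo sphere (a POL sphere in pocket 2) is invented for illustration.

```python
from collections import defaultdict

def read_pocket_spheres(pdb_lines):
    """Group alpha sphere centers (HETATM records named STP) by pocket number."""
    pockets = defaultdict(list)
    for line in pdb_lines:
        # Assumption: standard PDB columns (residue name in 18-20, number in 23-26).
        if not line.startswith("HETATM") or line[17:20] != "STP":
            continue
        pocket_id = int(line[22:26])
        kind = line[12:16].strip()  # APOL or POL
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        pockets[pocket_id].append((kind, xyz))
    return dict(pockets)

def _stp_line(serial, kind, pocket_id, x, y, z):
    """Build a fixed-column HETATM record for the demo (not fpocket's own writer)."""
    return (f"HETATM{serial:5d} {kind:<4s} STP C{pocket_id:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  0.00  0.00          Ve")

demo = [
    "ATOM   2351  NZ  LYS A 299      11.749  15.794 104.597  0.00  0.00           N",
    _stp_line(1, "APOL", 1, 27.849, 33.435, 123.906),
    _stp_line(2, "APOL", 1, 29.108, 33.195, 122.206),
    _stp_line(3, "POL", 2, 10.0, 11.0, 12.0),  # hypothetical sphere, for illustration
]
pockets = read_pocket_spheres(demo)
for pocket_id, spheres in sorted(pockets.items()):
    print(pocket_id, len(spheres))
```

Parsing by column rather than by whitespace avoids problems when neighbouring fields touch, which can happen with large coordinates or serial numbers.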
### A word on druggability
With the [Understanding Druggability paper](https://pubs.acs.org/doi/abs/10.1021/jm100574m) we introduced an alternative scoring function in fpocket that lets you assess the likelihood of a binding site to bind small drug-like molecules. Since the publication the score has been retrained and its performance improved (this later work has not been published). Roughly, a druggability score of 0 or close to 0 means the pocket is predicted to be non-druggable with drug-like molecules. If the score is above 0.5 there might be a chance to find drug-like molecules.
The druggability assessment is done using some of the pocket descriptors extracted by fpocket. A low score does BY NO MEANS indicate that no molecule binds to a pocket. For example, a peptide binding site will bind peptides, but peptides won't necessarily be considered drug-like molecules.
## mdpocket advanced
A lot of the functionality of mdpocket has already been covered in the Getting started section. However, there is at least one little functionality that you can access via mdpocket that you don't know about yet.
### Detect transient druggable binding pockets
The current version of fpocket contains two methods to score pockets. The first one is the original fpocket score, published with the first release and the scientific paper. Later, a second pocket score was added. This score, called the druggability score, intends to assess how likely the identified pocket is to bind drug-like molecules. The drug score is a value between 0 and 1: 0 signifies that the pocket is unlikely to bind a drug-like molecule and 1 that it is very likely to bind one.

In combination with mdpocket, the drug score is useful when you want to assess whether “druggable” pockets appear somewhere during an MD trajectory. You can do this during the first exploratory mdpocket run (without studying a particular pocket) by specifying the `-S` flag on the command line when calling mdpocket. With this flag, mdpocket does the following: for each snapshot fpocket is run normally and a druggability score is associated with each pocket. Voronoi vertices near grid points are used to map the drug score to each grid point (instead of counting the vertices, we increment by the drug score of the pocket).

We thus recommend analyzing the frequency grid when running mdpocket with `-S`. You will immediately notice that far fewer pockets are found in the grid at higher iso-values. This can also help you focus initially on your drug binding site (if you are coming from big pharma), which is very handy for the tedious pocket selection by hand.
If you want to draw conclusions about the “mean druggability” of some pockets using the frequency grid, be aware that the mean drug score you see there (the iso-value) is strongly underestimated compared to the values you obtain on crystal structures.
Last, but very important: if you plan to run an mdpocket calculation using `-S`, you should use the default fpocket pocket detection parameters. Using different parameters, such as those for channels, makes strictly no sense, as the druggability score was trained using the default fpocket parameters.
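The difference between counting vertices and incrementing by the drug score can be illustrated as follows. This is only a sketch of the idea: it assumes each Voronoi vertex is mapped to its nearest grid point and leaves out any normalisation by the number of snapshots, so the real mdpocket mapping may differ. The function name and data layout are hypothetical.

```python
def accumulate_grid(snapshots, origin, spacing, use_drug_score=False):
    """Accumulate one value per grid point over all snapshots.

    snapshots: list of snapshots, each a list of pockets,
               each pocket a tuple (drug_score, [vertex_xyz, ...]).
    """
    grid = {}
    for pockets in snapshots:
        for drug_score, vertices in pockets:
            # With the drug score option, add the pocket's score instead of 1.
            increment = drug_score if use_drug_score else 1.0
            for vertex in vertices:
                # Assumption: nearest-grid-point mapping.
                index = tuple(round((c - o) / spacing)
                              for c, o in zip(vertex, origin))
                grid[index] = grid.get(index, 0.0) + increment
    return grid

snapshots = [
    [(0.8, [(1.1, 0.0, 0.0)])],  # snapshot 1: one pocket, one vertex
    [(0.2, [(0.9, 0.1, 0.0)])],  # snapshot 2: same location, lower drug score
]
counts = accumulate_grid(snapshots, (0.0, 0.0, 0.0), 1.0)
scores = accumulate_grid(snapshots, (0.0, 0.0, 0.0), 1.0, use_drug_score=True)
print(counts[(1, 0, 0)], scores[(1, 0, 0)])
```

A grid point visited in every snapshot by a poorly druggable pocket accumulates far less in the second mode, which is why fewer pockets survive at high iso-values.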
### Detect different types of pockets
Fpocket was initially created to detect small molecule binding sites on proteins. That is what most people are interested in (a big assumption, we know). But as we want to please as many of you, distinguished fpocket users, as possible, we try to keep fpocket as flexible as possible via these various (probably a bit opaque) command line arguments. These arguments become very interesting when one is interested in a different type of pocket detection. For instance, detecting channels and gas pathways in a protein is a completely different topic compared to finding drug binding sites.
If one wants to identify transient internal pockets and channels one could modify the pocket detection parameters for fpocket / mdpocket. Here we give examples of typical parameters and what type of pockets you are likely to get back from fpocket / mdpocket :
* __Detect small molecule binding sites__ : Use the default parameters (don't specify anything)
* __Detect putative channels and small cavities__ : `-m 2.8 -M 5.5 -i 3`
* __Detect pockets where sterically water binding is possible__ : `-m 3.5 -M 5.5 -i 3`
* __Detect rather big, external pockets__ : `-m 3.5 -M 10.0 -i 3`
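If you switch between these parameter sets often, it can help to keep them in one place. The sketch below only builds an argument list and does not run anything. The preset names are hypothetical labels, and the `-f` input flag is assumed to be the usual way of passing a PDB file to fpocket.

```python
# Parameter sets taken from the list above; the keys are hypothetical labels.
PRESETS = {
    "small_molecule": [],  # default parameters
    "channels": ["-m", "2.8", "-M", "5.5", "-i", "3"],
    "water": ["-m", "3.5", "-M", "5.5", "-i", "3"],
    "big_external": ["-m", "3.5", "-M", "10.0", "-i", "3"],
}

def fpocket_command(pdb_path, preset):
    """Return the argument list for one fpocket run (assumes the -f input flag)."""
    return ["fpocket", "-f", pdb_path] + PRESETS[preset]

print(" ".join(fpocket_command("3LKF.pdb", "channels")))
```

The resulting list can be handed to `subprocess.run` once fpocket is installed.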
### Additional scripts
In order to facilitate some simple conversion, extraction and input file creation tasks, the fpocket distribution contains some additional Python scripts. They can be useful for specific tasks but have nothing to do with the pocket detection itself. This is why they are not included as standalone programs here.
* `createMDPocketInputFile.py`: This is a standard Python script (it should work out of the box on any machine with Python installed) that takes the path of all the snapshot PDB files of an MD trajectory as input and creates a valid mdpocket input file (an alphanumerically sorted list of paths). We recommend using this script if you need a valid mdpocket input file without worrying about how to order your file names alphanumerically.
* `extractISO.py`: This is a Python script that makes use of the numpy library. If you do not have numpy installed this will not work. However, installing numpy is a rather good idea as this is a very nice library ;). The script takes as input an mdpocket dx grid file, a filename (the one you want for the output) and the wanted isovalue. The script writes the coordinates of all grid points from the dx file whose grid value is greater than or equal to the wanted isovalue to the output file.
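The isovalue selection itself is simple once the grid is in memory. The following is a sketch of that selection step only, not the code of `extractISO.py`: it assumes a regular grid with a single spacing `delta` and values stored in the usual OpenDX order, where the last axis (z) varies fastest. Reading the dx file is left out.

```python
def points_above_iso(values, counts, origin, delta, iso):
    """Return coordinates of grid points whose value is >= iso.

    values: flat list of grid values, z index varying fastest (assumption).
    counts: (nx, ny, nz) number of grid points per axis.
    """
    nx, ny, nz = counts
    selected = []
    for flat_index, value in enumerate(values):
        if value < iso:
            continue
        # Convert the flat index back to (ix, iy, iz).
        ix, rest = divmod(flat_index, ny * nz)
        iy, iz = divmod(rest, nz)
        selected.append((origin[0] + ix * delta,
                         origin[1] + iy * delta,
                         origin[2] + iz * delta))
    return selected

values = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 1.0, 0.0]  # a tiny 2x2x2 grid
points = points_above_iso(values, (2, 2, 2), (10.0, 20.0, 30.0), 1.0, 0.5)
print(points)
```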
## dpocket advanced
### Input command line arguments
#### Mandatory:
* `-f` : a dpocket input file. This file has to contain the path to the PDB file, as well as the residue name (PDB HET residue tag, like “hem” for heme) of the reference ligand, separated by a tab.

See the [Getting started section of dpocket](#dpocket-descriptor-extraction) for an example of such a file.
#### Optional:
* `-o` : (default dpout) the prefix you want to give to dpocket output files. The default produces three output files named `dpout_fpocketnp.txt`, `dpout_fpocketp.txt` and `dpout_explicitp.txt`.
* `-e` : Use the first explicit interface definition (default): we define the explicit pocket as being all atoms contacted by alpha spheres situated within a distance of d Å from any ligand atom.
* `-E` : Use the second explicit interface definition: we define the explicit pocket as being all atoms situated within a distance of d Å from any ligand atom.
* `-d` : The distance criterion used for the explicit pocket definition.

Last, all optional parameters used by fpocket are also accessible on the command line through dpocket. Refer to the preceding paragraph for details about fpocket parameters.
### Output files description
As shown in the example, dpocket creates 3 output files. Let's describe them in a bit more detail here:
* `dpout_explicitp.txt`: This file contains all pocket descriptors implemented in fpocket of the explicitly defined binding pocket. What does this mean, explicitly? In the input you have associated a ligand identification to each PDB file. This ligand is used by fpocket in order to identify the actual binding pocket.
pdb ligand overlap lig_vol pocket_vol nb_alpha_spheres mean_asph_ray
data/3LKF.pdb PC 100.00 132.90 1678.64 29 3.94
data/1ATP.pdb ATP 100.00 322.62 2127.53 65 3.59
data/7TAA.pdb ABC 100.00 608.66 4977.48 97 4.20
Note that this is only an extract of this file. It contains many more columns (descriptors) that are not shown here. The first line is the header naming each column. Each following line starts with the PDB structure analyzed (`data/3LKF.pdb`) and the ligand used as reference (PC). Next comes the overlap between the actual and the found binding pocket, here 100% as this is an explicitly defined binding pocket. The remaining entries can be used as descriptors, like the ligand volume, the pocket volume, the number of alpha spheres in the binding pocket, the mean alpha sphere radius ... For a complete list of all descriptors implemented in fpocket, refer to the Advanced features [Pocket descriptors section](#pocket-descriptors).
The volumes calculated here are not accurate at all. As volumes are generally overestimated by alpha sphere approaches, especially for open binding pockets, this calculation is made available but uses the minimum sampling. A more accurate calculation requires significantly more calculation time. If you want more accurate volumes, you can provide a higher sampling via the `-v` flag on the command line.
* `dpout_fpocketnp.txt`: This file contains the same kind of descriptors as the preceding one, but this time for pockets identified by fpocket, that are “non binding pockets”. Non binding pockets means here, that the pockets do not correspond to the pocket where the reference ligand binds. Be careful, this does not necessarily mean that other pockets do not bind anything.
* `dpout_fpocketp.txt`: The last file is formatted the same way as the two preceding ones. This file contains the binding pocket, this time identified by fpocket and not explicitly by the ligand.
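Because the three files share one layout, a single small reader covers them all. This is a sketch assuming whitespace-separated columns with one header line and no spaces inside paths or ligand names; the demo rows are the extract shown above, and `read_dpocket_table` is a hypothetical helper name.

```python
def read_dpocket_table(lines):
    """Turn a dpocket output table into a list of dicts keyed by column name."""
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(header, fields)))
    return rows

extract = [
    "pdb ligand overlap lig_vol pocket_vol nb_alpha_spheres mean_asph_ray",
    "data/3LKF.pdb PC 100.00 132.90 1678.64 29 3.94",
    "data/1ATP.pdb ATP 100.00 322.62 2127.53 65 3.59",
    "data/7TAA.pdb ABC 100.00 608.66 4977.48 97 4.20",
]
rows = read_dpocket_table(extract)
for row in rows:
    # Values are kept as strings; convert the ones you need.
    print(row["pdb"], float(row["pocket_vol"]) / float(row["lig_vol"]))
```

With a real file, pass `open("dpout_explicitp.txt").read().splitlines()` instead of the extract.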
## tpocket advanced
This program of the fpocket package is certainly very useful for rapidly testing new scoring methods on a large dataset of protein-ligand complexes. However, the results, the purpose, and the advantages and drawbacks of this methodology can be difficult to understand. To make this easier, we first provide some more fundamental information before addressing more practical questions about tpocket.
### Input command line arguments
#### Mandatory:
* `-L` : a tpocket input file. This file has to contain the paths to the PDB files (apo, holo or holo,holo if you want to test fpocket only on holo structures), as well as the residue name (PDB HET residue tag, like “hem” for heme) of the reference ligand, separated by tabs.
#### Optional:
* `-o` : (default ./stats_p.txt) The filename you want to give to the tpocket detailed statistics.
* `-e` : (default ./stats_g.txt) The filename you want to give to the tpocket global statistics.
* `-d` : Distance criterion used for one of the 3 definitions of a pocket: all atoms situated at a distance lower than or equal to d will be considered as part of the actual pocket.
* `-k` : Keep the fpocket output for each PDB test.

Last, all optional parameters used by fpocket are also accessible on the command line through tpocket. Refer to [fpocket advanced](#fpocket-advanced) for fpocket parameters.
### Actual pocket definition for evaluation of fpocket
Delimiting, and more generally defining, the exact binding pocket of a protein in an automated way is not that easy. Finding a criterion that correctly evaluates the ability of fpocket to detect the actual binding site of a protein is consequently even more difficult.
Tpocket makes use of 6 different ways to determine if a pocket found by fpocket could be considered as the actual binding pocket, with respect to a given ligand:
* 1 The actual binding site is reduced to a single point, the barycenter of the pocket (calculated using alpha spheres). The binding pocket is defined as the pocket whose barycenter is situated within 4 Å of any ligand atom. It corresponds to the Ppc discussed in the paper.
* 2 The actual binding pocket is defined by the set of atoms that are in contact with alpha spheres that are near (< 3 Å) the actual ligand. This set of atoms is then compared to all atoms contacted by all Voronoi vertices included in each pocket found by fpocket. WARNING: this is currently not safely usable for a holo/apo dataset.
* 3 The actual binding pocket is defined by the set of atoms that are near (4 Å) the actual ligand. The same procedure as for the first definition is then applied to say whether a pocket can be considered as the actual pocket or not. WARNING: this is currently not safely usable for a holo/apo dataset.
* 4 The actual binding pocket is defined by the set of alpha spheres near (< 3 Å) the actual ligand. Then, for a given pocket, we calculate the correspondence between the alpha spheres in the pocket and the alpha spheres in the actual binding pocket. If this ratio exceeds a certain value (25%), we consider this pocket as being the actual pocket.
* 5 For a given pocket, we calculate the proportion of ligand atoms that are near (< 3 Å) at least one alpha sphere of the pocket. If this proportion exceeds a certain value (50%), we consider this pocket as being the actual pocket.
* 6 A combination of the 4th and 5th criteria described above. If both the 4th and 5th criteria are satisfied, then this criterion is satisfied too. This corresponds to the MOc (Mutual Overlap criterion) discussed in the paper.

The reason why we chose 3 Å for criteria 2, 4 and 5 is quite simple: as the minimum radius of an alpha sphere in the current algorithm is 3 Å, a ligand atom situated at a distance lower than or equal to this value can be considered as included in this alpha sphere, and therefore detected. Of course, this applies to alpha spheres with a larger radius too.
All of these criteria have their strengths and weaknesses, which is why we chose to implement all of them.
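To make criteria 1, 4, 5 and 6 more concrete, here is a simplified sketch, not tpocket's actual code. It works on alpha sphere centers only, and criterion 4 is approximated as the fraction of the pocket's alpha spheres lying near the ligand. The thresholds are the ones quoted above and are exposed as parameters; all function names are hypothetical.

```python
import math

def min_dist(point, others):
    """Smallest distance between one point and a list of points."""
    return min(math.dist(point, other) for other in others)

def barycenter_criterion(sphere_centers, ligand_atoms, cutoff=4.0):
    """Criterion 1: pocket barycenter within cutoff of any ligand atom."""
    n = len(sphere_centers)
    bary = tuple(sum(c[i] for c in sphere_centers) / n for i in range(3))
    return min_dist(bary, ligand_atoms) <= cutoff

def sphere_overlap(sphere_centers, ligand_atoms, cutoff=3.0):
    """Simplified criterion 4: fraction of alpha spheres near the ligand."""
    near = sum(1 for c in sphere_centers if min_dist(c, ligand_atoms) < cutoff)
    return near / len(sphere_centers)

def ligand_coverage(sphere_centers, ligand_atoms, cutoff=3.0):
    """Criterion 5: fraction of ligand atoms near at least one alpha sphere."""
    covered = sum(1 for a in ligand_atoms if min_dist(a, sphere_centers) < cutoff)
    return covered / len(ligand_atoms)

def mutual_overlap(sphere_centers, ligand_atoms,
                   min_overlap=0.25, min_coverage=0.5):
    """Criterion 6: both the overlap and the coverage thresholds are met."""
    return (sphere_overlap(sphere_centers, ligand_atoms) >= min_overlap
            and ligand_coverage(sphere_centers, ligand_atoms) >= min_coverage)

ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
spheres = [(0.0, 0.0, 2.0), (10.0, 0.0, 0.0)]
print(sphere_overlap(spheres, ligand), ligand_coverage(spheres, ligand))
print(mutual_overlap(spheres, ligand), barycenter_criterion(spheres, ligand))
```

In this toy case the mutual overlap criterion accepts the pocket while the barycenter criterion rejects it, because the distant second sphere drags the barycenter away from the ligand.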
## Pocket descriptors
In order to discriminate an interesting pocket from many uninteresting ones, fpocket uses descriptors for each pocket. A scoring function using these descriptors was trained to identify what we generally call a “binding site”. All descriptors implemented in fpocket are gathered here. The ones currently used for scoring are marked with a *, and the ones tagged normalized have an equivalent normalized descriptor, i.e. scaled to the [0, 1] range, with the highest (resp. lowest) value of the descriptor set to 1 (resp. 0).
### Number of alpha spheres (normalized)
As the title says, this is surely the simplest descriptor. The number of alpha spheres generally reflects, more or less proportionally, the size of the cavity.
### Density of the cavity (normalized)
This descriptor tends to measure the density and “buriedness” of a pocket. It is nothing else than the mean value of all pairwise distances between alpha spheres in the binding pocket. Thus, a small value indicates a rather compact and therefore rather buried pocket. Larger values indicate more extended and exposed cavities.
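As an illustration of this definition, the mean pairwise distance can be computed as below. This is a sketch on alpha sphere centers only; returning 0 for pockets with fewer than two spheres is an assumption, not fpocket's documented behaviour.

```python
import math
from itertools import combinations

def mean_pairwise_distance(centers):
    """Mean distance over all pairs of alpha sphere centers."""
    pairs = list(combinations(centers, 2))
    if not pairs:
        return 0.0  # assumption: fewer than two spheres gives 0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
density = mean_pairwise_distance(centers)
print(density)
```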
### Polarity Score (normalized)
In contrast to hydrophobicity, this descriptor tries to measure the hydrophilic character of a binding pocket. A polarity score is assigned to each residue of the binding pocket (as published on http://www.info.univ-angers.fr/~gh/Idas/proprietes.htm). The final polarity score is the mean of the polarity scores of all residues in the binding pocket. This is extremely approximate, so it should not be over-interpreted. Each residue is evaluated only once.
### Mean local hydrophobic density (normalized)
This descriptor tries to identify whether the binding pocket contains local parts that are rather hydrophobic. For each apolar alpha sphere, the number of apolar alpha sphere neighbors is determined by searching for overlapping apolar alpha spheres. The sum of all apolar alpha sphere neighbors is divided by the total number of apolar alpha spheres in the pocket. Last, this score is normalized relative to the other binding pockets.
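The non-normalized part of this calculation can be sketched as follows. The overlap test used here (center distance smaller than the sum of the two radii) is an assumption; fpocket's exact neighbor definition may differ.

```python
import math

def mean_local_hydrophobic_density(apolar_spheres):
    """Average number of overlapping apolar neighbors per apolar alpha sphere.

    apolar_spheres: list of (center_xyz, radius) tuples.
    """
    n = len(apolar_spheres)
    if n == 0:
        return 0.0
    neighbors = 0
    for i, (center_i, radius_i) in enumerate(apolar_spheres):
        for j, (center_j, radius_j) in enumerate(apolar_spheres):
            # Assumption: two spheres are neighbors when they overlap.
            if i != j and math.dist(center_i, center_j) < radius_i + radius_j:
                neighbors += 1
    return neighbors / n

apolar = [((0.0, 0.0, 0.0), 3.0), ((4.0, 0.0, 0.0), 3.0), ((20.0, 0.0, 0.0), 3.0)]
local_density = mean_local_hydrophobic_density(apolar)
print(local_density)
```

Here the first two spheres overlap and the third is isolated, giving two neighbor counts spread over three spheres.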
### Proportion of apolar alpha spheres (normalized)
This descriptor, returned as percentage, reflects the proportion of apolar alpha spheres among all alpha spheres of one pocket identified by fpocket. This can reflect somehow the hydrophobic/-philic character of a binding pocket.
### Druggability Score
The druggability score is a numerical value between 0 and 1 associated with each pocket using a logistic function. This score intends to assess the likelihood of the pocket to bind a small drug-like molecule. A low score indicates that drug-like molecules are unlikely to bind to this pocket. A druggability score of 0.5 (the threshold) indicates that binding of prodrugs or drug-like molecules may be possible. A score of 1 indicates that binding of drug-like molecules is very likely. The theoretical basis of the score is described in the druggability paper referenced in the section “A word on druggability”.
### Maximum distance between two alpha sphere (normalized)
This descriptor stores the maximum distance found between two alpha spheres in a given pocket.
### Hydrophobicity Score
This descriptor is based on a residue-based hydrophobicity scale published by Monera et al. in the Journal of Peptide Science 1, 319-329 (1995). For all residues involved in the binding site, the mean hydrophobicity score is calculated and used as a descriptor for the whole pocket. Each residue is evaluated only once.
### Charge Score
According to (http://www.info.univ-angers.fr/~gh/Idas/proprietes.htm) the charge of each amino acid in the binding site is tracked. The mean charge for all amino acids in contact with at least one alpha sphere of the pocket is calculated to form this charge score. Each residue is evaluated only once.
### Volume Score
Similarly to other descriptors, this one is based on data published on (http://www.info.univ-angers.fr/~gh/Idas/proprietes.htm). This data summarizes the relative volumes of the different amino acids. To calculate this descriptor, the mean volume score of all amino acids in contact with at least one alpha sphere of the pocket is calculated. Each residue is evaluated only once.
### Composition of amino acids
As the name indicates, fpocket tracks the amino acid composition of binding pockets. If at least one atom of a residue is in contact with at least one alpha sphere of a binding pocket, the residue is counted as part of the binding site. This descriptor is returned as a cumulative list: for instance, you can find 2 valines, 3 glutamates etc. in the binding site.
Occurrences of amino acids in the different descriptor outputs are given in the following order : Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr.
### Pocket volume
As indicated by the name, this descriptor tries to evaluate the volume of a binding pocket using a Monte-Carlo algorithm that calculates the full volume occupied by all alpha spheres in a given pocket. The number of iterations of this algorithm can be controlled using fpocket input parameters.
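The principle of such a Monte-Carlo estimate is shown in the sketch below: random points are drawn in the bounding box of the alpha spheres, and the fraction that falls inside at least one sphere is scaled by the box volume. This illustrates the general technique under simple assumptions (uniform sampling, fixed seed); fpocket's own sampling scheme may differ.

```python
import random

def monte_carlo_volume(spheres, n_samples=20000, seed=0):
    """Estimate the volume of a union of spheres given as (center_xyz, radius)."""
    rng = random.Random(seed)
    low = [min(c[i] - r for c, r in spheres) for i in range(3)]
    high = [max(c[i] + r for c, r in spheres) for i in range(3)]
    box_volume = 1.0
    for i in range(3):
        box_volume *= high[i] - low[i]
    hits = 0
    for _ in range(n_samples):
        p = [rng.uniform(low[i], high[i]) for i in range(3)]
        # A point counts once, even if several spheres contain it.
        if any(sum((p[i] - c[i]) ** 2 for i in range(3)) <= r * r
               for c, r in spheres):
            hits += 1
    return box_volume * hits / n_samples

unit_sphere = [((0.0, 0.0, 0.0), 1.0)]
estimate = monte_carlo_volume(unit_sphere)
print(estimate)  # close to 4/3 * pi for a unit sphere
```

Raising `n_samples` reduces the statistical error at the cost of run time, which is the same trade-off as the sampling parameter mentioned above.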
### Polar Surface Area
This descriptor provides an estimation of the polar surface area of the pocket based on information of the receptor atoms. The method used to calculate the area only provides an approximation, but should be good enough to get some rather relevant estimates.
### Apolar Surface Area
See polar surface area in the previous point, only for apolar atoms.
### Total Surface Area
The sum of the polar and apolar surface area of the pocket, that is to say the receptor side surface area of the pocket.
### B-factor score (normalized)
Please handle this score with great care on native crystal structures. This score is based on the mean B-factor of all atoms of the binding pocket (atoms that are contacted by at least one alpha sphere). As the B-factor does not necessarily reflect flexibility in crystal structures, this score is somewhat abusive. However, one could imagine performing molecular dynamics or another method to determine the relative flexibility of atoms and storing this information in the B-factor column of the PDB file.
This descriptor is normalized with other pockets of the same protein.
### List of abbreviations used in dpocket & mdpocket output
- pdb : pdb file name
- lig : ligand HET ID
- overlap : overlap of atoms in the actual pocket versus atoms in the pocket identified with fpocket
- PP-crit : binary PocketPicker criterion (1 if the ligand is < 4 Å from the center of mass of the alpha spheres, 0 otherwise)
- PP-dst : the minimum distance between the center of mass of the pocket and the ligand
- crit4 : proportion of ligand atoms that have at least one vertex lying within 3 Å
- crit5 : proportion of alpha spheres that lie within 3 Å of any ligand atom
- crit6 : binary criterion that is 1 if crit4 >= 0.5 and crit5 >= 0.2, 0 otherwise
- crit6_continue : a continuous measure of crit6, but this is experimental and we currently don't use it...
- lig_vol : volume of the ligand
- pock_vol : volume of the pocket
- nb_AS : number of alpha spheres
- nb_AS_norm : number of alpha spheres normalized by all pockets on the protein
- mean_as_ray : mean alpha sphere radius
- mean_as_solv_acc : mean alpha sphere solvent accessibility
- apol_as_prop : proportion of apolar alpha spheres in the pocket
- apol_as_prop_norm : normalized proportion of apolar alpha spheres
- mean_loc_hyd_dens : mean local hydrophobic density
- mean_loc_hyd_dens_norm : normalized mean local hydrophobic density
- polarity_score_norm : normalized polarity score
- flex : measure of the flexibility of the pocket (B-factor based)
- prop_polar_atm : proportion of polar atoms
- as_density : alpha sphere density
- as_density_norm : normalized alpha sphere density
- as_max_dst : maximum distance between the center of mass and all alpha spheres
- as_max_dst_norm : normalized as_max_dst
- drug_score : druggability score
- pock_asa : solvent accessible surface area of the pocket
- pock_pol_asa : polar solvent accessible surface area of the pocket
- pock_apol_asa : apolar solvent accessible surface area of the pocket
- pock_asa22 : accessible surface area using a probe of 2.2 A instead of 1.4
- pock_pol_asa22 : see pock_pol_asa and pock_asa22
- pock_apol_asa22 : see pock_apol_asa and pock_asa22
## Cofactor definition
fpocket, dpocket and tpocket contain in the current release a fixed set of cofactors. So far so good, but what for? Cofactors are often structurally necessary or must be present in the protein structure for ligand binding. The PDB nomenclature, however, treats them as ordinary hetero atoms, using the HETATM tag. This is the tag that fpocket uses to identify and eliminate crystallographic waters and possible ligands of holo protein structures. In order to force fpocket to keep the cofactor you are interested in, that is to say, to consider it as an integral part of the protein structure for binding pocket detection, a list of HETATM names is defined at the beginning of the `rpdb.c` file (https://github.com/Discngine/fpocket/blob/master/src/rpdb.c#L39) under the name `static const char *ST_keep_hetatm[]`. The next line of code defines the number of cofactors in this list: `static const int ST_nb_keep_hetatm = 111;`
If you would like to add a new cofactor, you have to modify this code. First, add the desired HETATM tag to the end of the `ST_keep_hetatm` list. Thus, for example, `"MSE"` will become `"MSE","PTE"` if your cofactor has the HETATM tag PTE. Do not forget to increment the `ST_nb_keep_hetatm` variable to `112`, otherwise this cofactor will not be taken into account.
Next you have to recompile the program, before being able to use this new definition.
In future releases this cofactor definition will be done dynamically with an external list.
The following list summarizes the cofactors fpocket considers as recurrent in the PDB and useful to keep systematically in protein structures.
## Customizing fpocket
This section introduces several ways of customizing fpocket by modifying the source code. We will first give all the instructions needed to recompile and rebuild the full package so that any modification of the source code is taken into account. Then, we will describe how to write a new scoring function, and how to write your own descriptors and include them in the dpocket output. We will not show the full content of the functions to modify, as we want to stay as concise as possible.
### How to rebuild the package
After any modification to the fpocket source code, you will need to rebuild the package so that the modification is taken into account. Here is the current procedure to do so:
Go to your fpocket codebase and run:
```bash
make uninstall
make clean
```
Then, you will have to perform the installation process again to rebuild the package.
### Writing your own scoring function
Writing your own scoring function using currently implemented descriptors is a simple task, provided that you are not afraid to write one line of C code. Currently, the fpocket algorithm sorts pockets by their score. Each score is calculated by a single function. The source file `src/pscoring.c` contains the definition of this function, which has the following prototype:
```C
float score_pocket(s_desc *pdesc) ;
```
The function takes as argument a pointer to a structure that contains all descriptors currently available in fpocket, and is called for each pocket to be scored. All available descriptors have been described previously, and you can check the exact name given to each of them in the header file `headers/descriptors.h`, which defines the `s_desc` structure.
Let's say that you just want to score pockets according to the number of alpha spheres of each pocket. To do so, you just have to change the content of the `score_pocket` function and return the right value:
```C
float score_pocket(s_desc *pdesc)
{
    float score = (float) pdesc->nb_asph ;
    return score ;
}
```
Although this example is really simple, you may now understand that you can write any kind of scoring function, like a linear or non-linear combination of descriptors derived from a regression model or any other method. The only limitation is the use of available descriptors implemented in fpocket.
Of course, although the current scoring function performs very well using only 4 of the available descriptors, you may want to implement your own set. The next section will give you the basics to do so.
### Writing your own descriptor
So what if you want to write your own descriptors? Well this will be a little more difficult than writing your own scoring function, but nothing is impossible!
Suppose that we want to add a new (and very simple) descriptor: the maximum alpha sphere radius in a given pocket.
First of all, you have to add the variable that will store your descriptor to the structure containing all descriptors. This has to be done in the `headers/descriptors.h` header file, in the definition of the structure `s_desc`. We will add the following line:
```C
typedef struct s_desc
{   ...
    float as_max_r ;
} s_desc ;
```
After adding our variable, we need to give it a default value for when no calculation has been performed, let's say -1. This is done in the function `reset_desc` located in the same file:
```C
void reset_desc(s_desc *desc)
{   ...
    desc->as_max_r = -1.0 ;
}
```
Let's now implement our descriptor. Go to the `src/descriptors.c` source file. In this file, you will find the main function that calculates descriptors based on a list of atoms and a list of alpha spheres. Here is the prototype of this function:
```C
void set_descriptors( s_atm **tatoms, int natoms,
                      s_vvertice **tvert, int nvert,
                      s_desc *desc) ;
```
As you can see, the function takes as arguments a list of atoms, a list of vertices, and an input/output descriptor structure that will actually store all calculated descriptors. When descriptors have to be calculated on a given pocket, we first get all atoms and vertices of the pocket, and we call this function using those atoms and vertices as arguments. The calculation then uses information on atoms and vertices to calculate descriptors.
Based on those parameters, you will have to write your own code in this function and update the `desc` variable given as argument accordingly, so that the descriptor value is stored. Let's do this. You will probably notice that the current code is not fully modular. This is because of computational optimization: fully modular code sometimes requires additional loops and processing compared to optimized code. Anyway, the task is still very simple. Let's go into the part of the code that will do the job.
```C
void set_descriptors( s_atm **tatoms, int natoms,
                      s_vvertice **tvert, int nvert,
                      s_desc *desc)
{   ...
    float as_max_r = -1.0 ; /* Declare and initialize the descriptor */
    ...
    for(i = 0 ; i < nvert ; i++) {
        /* Loop through all vertices and update descriptors */
        vcur = tvert[i] ;
        if(vcur->ray > as_max_r) as_max_r = vcur->ray ;
        ...
    }
    ...
    desc->as_max_r = as_max_r ; /* Store the descriptor */
}
```
That's it, your descriptor is implemented, as the descriptors of each pocket are automatically calculated using this function at the end of the fpocket algorithm. Thus, it can now be used in the scoring function described previously, after rebuilding the package of course.
### Normalizing your descriptors
An advantage of normalization is that two descriptors generated from pockets of two different proteins can be compared with each other to a certain degree, depending on the normalization process. For example, if we normalize the number of alpha spheres between 0 and 1 (well, here it's more a scaling than a normalization), the largest pocket of any protein will always have 1 as the value of the normalized descriptor.
To do so, we can't use the exact same process as for adding a regular descriptor, because the descriptors of all pockets need to be calculated before the normalization step. Consequently, the calculation of all normalized descriptors is currently performed in the `src/pocket.c` source file. In this file, the function `set_normalized_descriptors` does the job, and has the following prototype:
```C
void set_normalized_descriptors(c_lst_pockets *pockets)
```
As you can see, it simply takes as argument a list of pockets (in fact a simple linked list), i.e. all pockets found in a given protein. Of course, each pocket contained in this structure has a descriptor structure associated with it.
Let's now go deeper into the code and implement a normalized version of the new descriptor so that it ranges between 0 and 1. The first step is similar to the first step needed to implement a new descriptor: you need to add a variable that will store this normalized descriptor in the `s_desc` structure:
```C
typedef struct s_desc
{   ...
    float as_max_r ;
    float as_max_r_norm ;
} s_desc ;
```
You can now add the default initialization of this descriptor:
```C
void reset_desc(s_desc *desc)
{   ...
    desc->as_max_r = -1.0 ;
    desc->as_max_r_norm = -1.0 ;
}
```
Let's implement the descriptor now. Go to the `src/pocket.c` source file, function `set_normalized_descriptors`. To calculate the normalized descriptor, we need the min and max values of the non-normalized descriptor. Next, we have to loop over the pocket list, update the min and max if necessary, and perform the normalization in a second loop. So easy:
```C
void set_normalized_descriptors(c_lst_pockets *pockets)
{   ...
    /* Declare min and max */
    float as_max_r_m = 1000,    /* Initialize to a large value */
          as_max_r_M = -1.0 ;   /* Initialize to a small value */
    ...
    cur = pockets->first ;
    /* Perform a first processing step, e.g. to set min and max */
    while(cur) {
        dcur = cur->pocket->pdesc ;
        if(cur == pockets->first) {
            ...
            /* If it is the first pocket, min = max = pocket */
            as_max_r_m = as_max_r_M = dcur->as_max_r ;
        }
        else {
            ...
            /* If it is the Nth != 1 pocket, check and update
               min and max if necessary */
            if(dcur->as_max_r > as_max_r_M)
                as_max_r_M = dcur->as_max_r ;
            else if(dcur->as_max_r < as_max_r_m)
                as_max_r_m = dcur->as_max_r ;
        }
        cur = cur->next ;
    }
    /* Perform a second loop to do the actual normalisation */
    cur = pockets->first ;
    while(cur) {
        dcur = cur->pocket->pdesc ;
        ...
        /* Note: this divides by zero if all pockets share the same value */
        dcur->as_max_r_norm = (dcur->as_max_r - as_max_r_m)
                              / (as_max_r_M - as_max_r_m) ;
        cur = cur->next ;   /* Advance, otherwise the loop never ends */
    }
}
```
And that's it. Normalizing a descriptor takes a little more effort than adding a plain one, but we believe it is still not much work.
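For readers who want to try the two-pass idea outside of fpocket, here is a minimal, self-contained sketch. The types `sketch_desc` and `sketch_node` and the function `sketch_normalize_as_max_r` are hypothetical stand-ins, not the real fpocket structures, and the choice of 0.0 when all pockets share the same value is an assumption made here to avoid a division by zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for fpocket's descriptor and pocket
   list node types; the real structures hold many more fields. */
typedef struct sketch_desc {
    float as_max_r ;       /* raw descriptor */
    float as_max_r_norm ;  /* normalized descriptor, expected in [0, 1] */
} sketch_desc ;

typedef struct sketch_node {
    sketch_desc desc ;
    struct sketch_node *next ;
} sketch_node ;

/* Min-max normalize as_max_r over a non-empty singly linked list. */
void sketch_normalize_as_max_r(sketch_node *first)
{
    sketch_node *cur ;
    float min, max, range ;

    assert(first != NULL) ;  /* precondition: at least one pocket */

    /* First pass: the first pocket sets min = max, the others update them */
    min = max = first->desc.as_max_r ;
    for(cur = first->next ; cur != NULL ; cur = cur->next) {
        if(cur->desc.as_max_r > max) max = cur->desc.as_max_r ;
        else if(cur->desc.as_max_r < min) min = cur->desc.as_max_r ;
    }

    /* Second pass: scale every value into [0, 1] */
    range = max - min ;
    for(cur = first ; cur != NULL ; cur = cur->next) {
        /* Assumption: if all values are equal, store 0.0 instead of
           dividing by zero */
        cur->desc.as_max_r_norm = (range > 0.0f)
            ? (cur->desc.as_max_r - min) / range
            : 0.0f ;
    }
}
```

With three pockets holding 4.0, 2.0 and 6.0, the normalized values come out as 0.5, 0.0 and 1.0.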
At this point your descriptor is implemented and can be used by a scoring function, but it is not yet written to the dpocket output. The next section shows you how to do that; it's very easy.
### Including your descriptor in dpocket
Although it would be possible, we haven't taken the time to build a system that automatically detects new descriptors and adds them to the dpocket output.
So let's do this manually. The dpocket output format is defined by three macros in the `dpocket.h` header file:
```C
#define M_DP_OUTP_HEADER "pdb lig ..."
#define M_DP_OUTP_FORMAT "%s %s ..."
#define M_DP_OUTP_VAR(fc, l, ovlp, status, dst, lv, d) fc, l, ...
```
The first macro defines the header of the output file. The second macro is the format string passed to `fprintf` for each output line. Finally, the last macro is the list of variables, with `d` being the pointer to the descriptor structure defined previously. Basically, writing the dpocket output requires two main steps: writing the header, then looping over the pockets to write the descriptors of each one.
To include our descriptor in the dpocket output, we just need to add the header label of the descriptor, the output format of the descriptor, and the descriptor itself. These three steps modify the first, the second, and the third macro, respectively. The only difficulty is to keep the three positions (header, format and variable) in correspondence: the column position of a descriptor in the header, for example the number of alpha spheres, must match its position in the format and in the variable list. For example, if we want to add our new descriptor `as_max_r` at the first position of the dpocket output, we get:
```C
#define M_DP_OUTP_HEADER "as_max_r pdb lig ..."
#define M_DP_OUTP_FORMAT "%3.5f %s %s ..."
#define M_DP_OUTP_VAR(fc, l, ovlp, status, dst, lv, d) d->as_max_r, fc, l, ovlp, ...
```
That's all. Remember to be careful with this step: adding a new descriptor to dpocket is really easy in theory, but losing the correspondence between the header, format and variable columns is easy too, in which case interpretation, visualization and analysis of the dpocket output can become difficult or even meaningless.
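The header/format/variable pattern can be tried in isolation with the following sketch. The macros `SKETCH_OUTP_HEADER`, `SKETCH_OUTP_FORMAT` and `SKETCH_OUTP_VAR`, the `sketch_out_desc` type and the `sketch_format_row` function are hypothetical, shortened stand-ins for illustration only; they are not the real `dpocket.h` macros, which carry many more columns.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical three-column versions of the dpocket output macros.
   Column 1 is the new descriptor in all three of them. */
#define SKETCH_OUTP_HEADER "as_max_r pdb lig"
#define SKETCH_OUTP_FORMAT "%3.5f %s %s"
#define SKETCH_OUTP_VAR(fc, l, d) (d)->as_max_r, fc, l

/* Hypothetical descriptor structure holding only the new descriptor. */
typedef struct sketch_out_desc {
    float as_max_r ;
} sketch_out_desc ;

/* Format one output row into buf; returns what snprintf returns, i.e.
   the number of characters the full row needs (excluding the '\0'). */
int sketch_format_row(char *buf, size_t size, const char *fc,
                      const char *l, const sketch_out_desc *d)
{
    assert(buf != NULL && fc != NULL && l != NULL && d != NULL) ;
    /* The variable macro expands to the argument list of the format */
    return snprintf(buf, size, SKETCH_OUTP_FORMAT, SKETCH_OUTP_VAR(fc, l, d)) ;
}
```

Writing the file then amounts to printing `SKETCH_OUTP_HEADER` once and calling `sketch_format_row` for each pocket. Note that there is no space between the macro name and its opening parenthesis in `SKETCH_OUTP_VAR(fc, l, d)`; with a space, C would treat it as an object-like macro and the parameter list would become part of the replacement text.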
### Including your descriptor in mdpocket
Adding a descriptor to mdpocket works pretty much the same way as in dpocket, so write your own descriptor as described previously for dpocket. The only difference is the last step: instead of modifying the `dpocket.h` macros, you should modify the macros of `mdpocket.h`. They are constructed exactly the same way and are even somewhat easier to handle because they are shorter:
```C
#define M_MDP_OUTP_HEADER "snapshot pock_volume nb_AS..."
#define M_MDP_OUTP_FORMAT "%d %4.2f %d %4.2f %4.2f %4.2f..."
#define M_MDP_OUTP_VAR(i, d) i, d->volume ...
```
Simply add the header of your descriptor to the output header macro, the output format to the format macro and the variable to the variable macro, exactly as in the previously described `dpocket.h` file.

152
doc/INSTALLATION.md Normal file

@@ -0,0 +1,152 @@
# Installation
## Prerequisites
fpocket currently offers two different ways to visualize binding pockets. Both are based on commonly used molecular visualization tools: VMD and PyMol. To visualize results you need to install at least one of the two, or any other tool able to read standard PDB files (Chimera, MOE, Maestro, etc.).
Currently, VMD offers better rendering and performance, while PyMol offers better handling of binding pockets. You can download VMD for free from http://www.ks.uiuc.edu/Research/vmd/. PyMol can be downloaded freely from https://pymol.org/2/.
## Dependencies
fpocket relies on Qhull. In the officially released version, fpocket ships with Qhull, and Qhull is compiled automatically when compiling and installing fpocket. Since the 3.0 release of fpocket, the following libraries are also required to compile fpocket:
- libnetcdf
- libstdc++
## System Requirements
fpocket is available for Linux/Unix type OS's, and also MacOSX (so basically all OS's that don't completely suck).
In order to run fpocket, you should have at minimum a Pentium III 500 MHz (does that still exist?) with 128 MB of RAM (lol). This program was co-developed and tested under the following Linux distributions: openSUSE 10.3 (and newer), CentOS 5.2, Fedora Core 7, Ubuntu 8.10, as well as Mac OS X (10.5, 10.6, 10.14.6). You need a valid C compiler like gcc or clang (for Mac).
## Getting Started
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
### Prerequisites
The most recent versions (starting with fpocket 3.0) make use of the molfile plugin from VMD. This plugin is shipped with fpocket. However, you now need to install the netcdf library on your system. The package is typically called netcdf-devel or similar, depending on your Linux distribution.
fpocket needs to be compiled to run on your machine. For this you'll need the GNU C compiler (another compiler may work, but we didn't test any other than GCC).
To install the netcdf development package on Ubuntu, type:
```
sudo apt-get install libnetcdf-dev
```
On a RHEL-based distribution, something like this should do:
```
sudo yum install netcdf-devel.x86_64
```
### Installing
Download the sources from github via the website or using git clone and then build and deploy fpocket using the following commands.
#### Compiling on Linux
```bash
git clone https://github.com/Discngine/fpocket.git
cd fpocket
make
sudo make install
```
#### Compiling on OSX
First install netcdf, for instance using MacPorts (https://www.macports.org/):
```bash
sudo port install netcdf
export LIBRARY_PATH=/opt/local/lib
git clone https://github.com/Discngine/fpocket.git
cd fpocket
make ARCH=MACOSXX86_64
sudo make install
```
## Running the tests
The source code of fpocket is shipped with samples. They can be found in the data/sample folder. Try to run fpocket against the 1uyd sample to check if it's running OK.
```
cd data/sample
fpocket -f 1UYD.pdb
```
fpocket should report when it starts searching for pockets and when it finishes. Upon completion, the current folder should contain a new folder called 1UYD_out. Check that this folder exists, that the PDB files in it contain data and that the pocket info file contains results.
## Frequent issues encountered
### netcdf issues
```
cannot find -lnetcdf
```
mdpocket supports reading and writing NETCDF formatted files. In order to use this you need to install the netcdf development libraries on your system.
#### Centos:
This can be achieved like this :
```
yum install -y epel-release #if the epel repo is not yet activated on your system
yum install -y netcdf-devel
```
#### Ubuntu:
```
sudo apt-get install libnetcdf-dev
```
#### OSX:
Install netcdf, for instance using MacPorts (https://www.macports.org/):
```
sudo port install netcdf
export LIBRARY_PATH=/opt/local/lib
```
Run make again after installing this library. Mdpocket / fpocket should build just fine now.
### stdc++ issues
```
cannot find -lstdc++
```
You need to install the stdc++ static libraries to build fpocket & mdpocket.
#### Centos:
On centos 7.4 this can be done like this :
```
yum install -y libstdc++-static
```
#### Ubuntu:
```
sudo apt-get install libstdc++6
```
### Linking to molfile plugin issues
If you observe an error similar to this one
```
ld: warning: ignoring file plugins/MACOSXX86/molfile/libmolfile_plugin.a, file was built for archive which is not the architecture being linked (x86_64): plugins/MACOSXX86/molfile/libmolfile_plugin.a
Undefined symbols for architecture x86_64:
"_molfile_parm7plugin_init", referenced from:
_read_topology in topology.o
"_molfile_parm7plugin_register", referenced from:
_read_topology in topology.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [bin/fpocket] Error 1
make: *** [all] Error 2
```
then the statically built libmolfile_plugin is not compatible with your machine. First check that the ARCH variable set in the first line of the fpocket Makefile actually reflects the architecture you want. For now I'm trying to support 64-bit Linux systems (LINUXAMD64) and OSX systems built with clang in 32 and 64 bit (MACOSXX86, MACOSXX86_64), so everything should work out of the box. If it does not, you might need to build the molfile plugin for your architecture. All system architectures available for the molfile plugin can be found in the plugins folder tree: [plugins directory](https://github.com/Discngine/fpocket/tree/master/plugins).
Here you can find more information on how to build the molfile plugin on CentOS 7.4:
[compile molfile plugin on centos 7.4 - Discngine blog post](https://www.discngine.com/blog/2019/5/25/building-the-vmd-molfile-plugin-from-source-code)
Once built, copy the architecture folder into the fpocket/plugins directory and make sure to declare this architecture in the ARCH variable in the Makefile. Finally run make again.
If you manage to build for other architectures and it works, I'd be happy to accept PR's with the relevant plugin architectures as I cannot build all of them on my own ;).
## Read next
* [Getting Started & Advanced Features](GETTINGSTARTED.md)

43
doc/INTRODUCTION.md Normal file

@@ -0,0 +1,43 @@
# Introduction
Thanks for taking the time to read this official user guide of fpocket. This guide presents the general functionality of the fpocket program and its derivatives dpocket, tpocket and mdpocket. Yes, fpocket is indeed a package of four distinct programs. fpocket is an acronym for “finding” pockets; dpocket stands for “describing” pockets, as it extracts physico-chemical descriptors of pockets; tpocket stands for “testing” pockets, as it is used to test, on a large scale, scoring functions developed with fpocket for ranking protein cavities against each other. mdpocket is named after pocket detection on molecular dynamics (MD) trajectories.
This is not a usual guide. You will find the elements of a usual user guide here, but we also included several examples in the getting started section, which should help you quickly understand how to work with fpocket. The getting started guide can be read as a mini tutorial on the basic functionality of this software.
Furthermore, we don't take ourselves too seriously, so the way this manual is written might not correspond to the industry standard ;)
## License & Copyright
This program is published under the MIT Licence. Basically do whatever you want with it.
Vincent Le Guilloux and Peter Schmidtke are the authors of fpocket, dpocket and tpocket (which perform protein cavity detection, cavity descriptor extraction and large-scale evaluation of cavity predictions). Peter Schmidtke is the author of mdpocket (which performs pocket detection and descriptor extraction on MD trajectories).
Contributions
The initial fpocket software was developed, validated, documented and distributed by Vincent Le Guilloux & Peter Schmidtke. Both contributed equally to this project. The initial work on fpocket was initiated and supervised by Pierre Tufféry.
Latest extensions were developed, validated, documented and distributed by Peter Schmidtke (mdpocket, druggability score, energy calculations) supervised by Xavier Barril.
## Publication & Citation
The methods paper about this software was published in BMC Bioinformatics. In order to cite fpocket in the future, please cite this paper :
- Vincent Le Guilloux, Peter Schmidtke and Pierre Tuffery, “Fpocket: An open source platform for ligand pocket detection”, BMC Bioinformatics 2009, 10:168
If you use the druggability score of fpocket, please cite :
- Peter Schmidtke & Xavier Barril “Understanding and predicting druggability. A high-throughput method for detection of drug binding sites.”, J Med Chem, 2010, 53(15):5858-67
Last, the mdpocket paper has been published too and can be cited using:
- Peter Schmidtke, Axel Bidon-Chanal, Javier Luque, Xavier Barril, “MDpocket: open-source cavity detection and characterization on molecular dynamics trajectories.”, Bioinformatics. 2011 Dec 1;27(23):3276-85
Contact
If you want to contact the fpocket developers please create a github issue here: https://github.com/Discngine/fpocket/issues
We are happy to receive any feedback, positive or negative, as long as it is constructive.
## Read next
* [Installation](INSTALLATION.md)
* [Getting Started](GETTINGSTARTED.md)

15
doc/MANUAL.md Normal file

@@ -0,0 +1,15 @@
# fpocket User Manual
fpocket is a protein pocket prediction algorithm. Given a PDB protein structure, it enables the user to identify potential binding sites. Based on Voronoi tessellation, this algorithm is very fast and particularly well suited for large-scale protein binding pocket screenings and for the development of scoring functions for binding pocket characterization. fpocket now also allows pocket detection on MD trajectories and assessment of the volume & the druggability of a binding site. Lastly, interaction energy calculations are now possible using fpocket & mdpocket.
## Notes
1. This program uses output coming from Qhull. Qhull is integrated within fpocket. More information about Qhull can be found in the paper : Barber, C.B., Dobkin, D.P., and Huhdanpaa, H.T., "The Quickhull algorithm for convex hulls," ACM Trans. on Mathematical Software, 22(4):469-483, Dec 1996, http://www.qhull.org
2. Part of this software includes code based on external code developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. The PDB parser of the Molfile Plugin of VMD was modified for the purposes of fpocket's PDB parsing. Furthermore, the molfile plugin now allows mdpocket to analyse various MD trajectory formats.
3. Within the whole documentation code and output from computer programs are represented and formatted in the following way : `ls -1 > out.txt`
4. This documentation, as well as the software itself, is under steady change. The fpocket developer team tries to provide useful and easy-to-understand documentation, something that is completely lacking in most scientific open source software nowadays. In our opinion, open source software is useless without documentation of the source code on one side and documentation of the software on the other. Thus, we welcome every suggestion to improve this documentation in terms of accuracy, clarity and completeness.
## Contents
* [Introduction](INTRODUCTION.md)
* [Installation](INSTALLATION.md)
* [Getting Started & Advanced Features](GETTINGSTARTED.md)
