1. Specify the input molecules
With ChemProt you can search for bioactive chemicals
annotated for specific proteins.
You can also search for bioactive compounds based on their structural similarity.
There are five ways to enter molecular structure information.
1) Draw the molecule in the JME window to the right and click on the "Import" button to translate it to SMILES.
2) Insert the SMILES strings of the molecules in the window to the left.
3) Read the SMILES strings from a file (*.smi).
4) Read the structures from an SDF file (*.sdf). Sample copy
5) Type the name of a compound in the "type compound name" field. Various
chemical names and synonyms can be used.
The SMILES entry window and the .smi file should preferably use the canonical
SMILES format. Hydrogens are not considered in the similarity search.
Multiple structures can be entered on separate lines as below.
CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CN(C)CCCC1(C2=C(CO1)C=C(C=C2)C#N)C3=CC=C(C=C3)F Citalopram
C1C2C(C(C1Cl)Cl)C3(C(=C(C2(C3(Cl)Cl)Cl)Cl)Cl)Cl Chlordane
CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5 Gleevec
OC1=CC=C(C=C1)C1=C(C(=O)C2=CC=C(OCCN3CCCCC3)C=C2)C2=C(S1)C=C(O)C=C2 Raloxifene
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C Testosterone
Identifiers can be left out; in that case, enter only the SMILES strings, one per line.
Note: The chirality of the compound can be used to retrieve specific
chemical-protein annotations, but it will not be considered in the structural
similarity search.
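The multi-line input format described above (one SMILES string per line, optionally followed by an identifier) can be illustrated with a short Python sketch. This is not ChemProt's own parser; the function name parse_smiles_lines is hypothetical, and the sketch assumes that the SMILES string is the first whitespace-delimited token on each line and that everything after it is the identifier.

```python
from typing import List, Optional, Tuple


def parse_smiles_lines(text: str) -> List[Tuple[str, Optional[str]]]:
    """Split multi-line input into (smiles, identifier) pairs.

    Hypothetical helper for illustration only; the identifier is None
    when a line contains a SMILES string and nothing else.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # A SMILES string contains no whitespace, so the first token is
        # the structure and the remainder (if any) is the identifier.
        parts = line.split(None, 1)
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else None
        records.append((smiles, name))
    return records


example = """CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C Testosterone
C1C2C(C(C1Cl)Cl)C3(C(=C(C2(C3(Cl)Cl)Cl)Cl)Cl)Cl
"""

records = parse_smiles_lines(example)
for smiles, name in records:
    print(name or "(no identifier)", smiles)
```

The third line of the example has no identifier, which corresponds to entering SMILES only.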
2. Structural similarity search
The chemical search is based on the assumption that two chemicals with similar
structures also have similar activity on a protein.
Two types of fingerprints can be used for the structural similarity search:
- MACCS fingerprints (composed of 166 queries; atom- or fragment-based)
- Pharmacophore fingerprints (Ph4; 3-point pharmacophore graph triangles,
GpiDAPH3, computed with MOE)
The structural similarity between two chemicals is measured using the Tanimoto
coefficient (Tc). The closer the coefficient is to 1, the higher the
similarity.
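As an illustration of how a Tanimoto coefficient is computed, the sketch below represents each fingerprint as the set of bit positions that are switched on. It uses the standard definition (shared bits divided by the total number of distinct bits); the bit positions are invented for the example and are not real MACCS or GpiDAPH3 fingerprints, and returning 0.0 for two empty fingerprints is a convention chosen here.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not bits_a and not bits_b:
        return 0.0  # convention assumed here for two empty fingerprints
    shared = len(bits_a & bits_b)
    # Tc = c / (a + b - c), where a and b are the numbers of bits set in
    # each fingerprint and c is the number of bits set in both.
    return shared / (len(bits_a) + len(bits_b) - shared)


# Made-up bit positions, for illustration only.
fp_query = {3, 17, 42, 88, 120}
fp_hit = {3, 17, 42, 88, 121}

tc = tanimoto(fp_query, fp_hit)
print(round(tc, 3))  # 4 shared bits out of 6 distinct bits -> 0.667
```

Identical fingerprints give a value of 1, and fingerprints with no bits in common give 0.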
To limit the size of the output, only chemicals with a Tc greater than 0.85 for
MACCS and greater than 0.60 for Ph4 are shown.
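The effect of these cutoffs can be sketched as a simple filter over already-computed Tc values. The cutoff values come from the text above; the names CUTOFFS and filter_hits, the compound labels, and the scores are hypothetical and serve only as an illustration, not as a description of the server's implementation.

```python
# Cutoffs as stated in the text; the dictionary itself is illustrative.
CUTOFFS = {"MACCS": 0.85, "Ph4": 0.60}


def filter_hits(scored, fingerprint):
    """Keep (name, tc) pairs whose Tc is strictly above the cutoff."""
    cutoff = CUTOFFS[fingerprint]
    return [(name, tc) for name, tc in scored if tc > cutoff]


# Invented compound labels and scores, for illustration only.
scored = [("cmpd_A", 0.91), ("cmpd_B", 0.85), ("cmpd_C", 0.72)]

maccs_hits = filter_hits(scored, "MACCS")
ph4_hits = filter_hits(scored, "Ph4")
print(maccs_hits)
print(ph4_hits)
```

Because the comparison is strict, a compound with a Tc of exactly 0.85 is not kept under the MACCS cutoff in this sketch.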
3. Databases
Chemical-protein information has been gathered from different databases. You can
retrieve the data from all databases in one layout (select "all") or restrict the
search to specific databases.
4. Submit the job
Click on the "Submit" button. The status of your job (either 'queued'
or 'running') will be displayed and continuously updated until it terminates and
the server output appears in the browser window. Each molecule is estimated to
take about 30 to 45 seconds to process.
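For a rough idea of the waiting time for a multi-molecule job, the per-molecule estimate can be multiplied by the number of input structures. The sketch below assumes the time scales linearly with the number of molecules and ignores any time spent in the queue; both are assumptions made for this example, and the function name is hypothetical.

```python
def estimated_wait_seconds(n_molecules: int) -> tuple:
    """Return a (low, high) estimate in seconds, assuming linear scaling."""
    # 30 to 45 seconds per molecule, as stated in the text; queue time
    # is not included.
    return (30 * n_molecules, 45 * n_molecules)


low, high = estimated_wait_seconds(6)
print(f"about {low} to {high} seconds for 6 molecules")
```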
At any time during the wait you may enter your e-mail address and simply leave
the window. Your job will continue; you will be notified by e-mail when it has
terminated. The e-mail message will contain the URL under which the results are
stored; they will remain on the server for 24 hours for you to collect them.