
Usage instructions

1. Specify the input molecules

With ChemProt, you can search for bioactive chemicals annotated for specific proteins.

You can also search for bioactive compounds based on their structural similarity.

There are 5 ways of entering molecular structure information.
1) Draw the molecule in the JME window to the right and click on the "Import" button to translate it to SMILES.
2) Insert the SMILES strings of the molecules in the window to the left.
3) Read the SMILES strings from a file (*.smi).
4) Read the structures from an SDF file (*.sdf); a sample copy is available.
5) Type the name of a compound in the "type compound name" field. Various chemical names and synonyms can be used.

Entries in the SMILES window and in the .smi file should preferably be in canonical SMILES format. Hydrogens are not considered in the similarity search.
Multiple structures can be entered on separate lines as below.

CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CN(C)CCCC1(C2=C(CO1)C=C(C=C2)C#N)C3=CC=C(C=C3)F Citalopram
C1C2C(C(C1Cl)Cl)C3(C(=C(C2(C3(Cl)Cl)Cl)Cl)Cl)Cl Chlordane
CC1=C(C=C(C=C1)NC(=O)C2=CC=C(C=C2)CN3CCN(CC3)C)NC4=NC=CC(=N4)C5=CN=CC=C5 Gleevec
OC1=CC=C(C=C1)C1=C(C(=O)C2=CC=C(OCCN3CCCCC3)C=C2)C2=C(S1)C=C(O)C=C2 Raloxifene
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C Testosterone

Identifiers can be omitted; in that case, enter only the SMILES strings, one per line.
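
As an illustration of the input layout described above, the following minimal Python sketch splits each line into a SMILES string and an optional identifier. The function name `parse_smiles_lines` is hypothetical, and the assumption that the first whitespace-separated token is the SMILES string is inferred from the examples above; this is not the server's actual parser.

```python
from typing import List, Optional, Tuple


def parse_smiles_lines(text: str) -> List[Tuple[str, Optional[str]]]:
    """Split 'SMILES [identifier]' lines into (smiles, identifier) pairs.

    Assumption: the SMILES string is the first whitespace-separated token
    and everything after it on the line is the identifier.
    """
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split once on the first run of whitespace
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else None
        records.append((smiles, name))
    return records


example = """CC(=O)OC1=CC=CC=C1C(=O)O Aspirin
CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C
"""
records = parse_smiles_lines(example)
for smiles, name in records:
    print(smiles, "->", name if name is not None else "(no identifier)")
```

The second example line carries no identifier, so its name is returned as `None`.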

Note: The chirality of the compound can be used to retrieve specific chemical-protein annotations, but it will not be considered in the structural similarity search.

2. Structural similarity search

The chemical search is based on the assumption that 2 chemicals with similar structures also have similar activity on a protein.

Two types of fingerprints can be used for the structural similarity search:

- MACCS fingerprints (composed of 166 queries; atom- or fragment-based)
- Pharmacophore fingerprints (3-point pharmacophore graph triangles, GpiDAPH3, computed with MOE)

The structural similarity between 2 chemicals is measured using the Tanimoto coefficient (Tc). The closer the coefficient is to 1, the higher the similarity.
To limit the size of the output, only chemicals with a Tc greater than 0.85 for MACCS fingerprints and greater than 0.60 for pharmacophore (Ph4) fingerprints are shown.
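
To make the Tanimoto coefficient concrete, here is a small Python sketch that computes Tc for two fingerprints represented as sets of "on" bit positions, then applies the cutoffs quoted above. The set representation, the function names and the strict "greater than" comparison are assumptions made for illustration; the server's own implementation is not described here.

```python
from typing import Set

# Cutoffs quoted in the text; a strict "greater than" comparison is assumed.
MACCS_CUTOFF = 0.85
PH4_CUTOFF = 0.60


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either set."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def is_reported(tc: float, cutoff: float) -> bool:
    """Return True if a hit with this Tc would pass the given cutoff."""
    return tc > cutoff


# Toy fingerprints (hypothetical bit positions, not real MACCS keys).
query = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
close_hit = {1, 2, 3, 4, 5, 6, 7, 8, 9}   # 9 shared bits out of 10 in total
distant_hit = {1, 2, 3, 11, 12, 13, 14}   # 3 shared bits out of 14 in total

for label, fp in (("close_hit", close_hit), ("distant_hit", distant_hit)):
    tc = tanimoto(query, fp)
    print(f"{label}: Tc = {tc:.2f}, shown for MACCS: {is_reported(tc, MACCS_CUTOFF)}")
```

With these toy sets, the first hit shares 9 of 10 bits and passes the MACCS cutoff, while the second shares 3 of 14 bits and is filtered out.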

3. Databases

Chemical-protein information has been gathered from different databases. You can retrieve the data from all the databases in one layout (select "all") or restrict the search to specific databases.

4. Submit the job

Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and constantly updated until it terminates and the server output appears in the browser window. Each molecule is estimated to take around 30 to 45 seconds to produce an output.

At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue; you will be notified by e-mail when it has terminated. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect them.

