
PyMOL



Getting started


If you are using UNIX/Linux, type pymol at the prompt. For Windows, click the PyMOL icon. PyMOL starts with two windows: the Viewer window, where the structures are displayed, and the External (Tcl/Tk) GUI window, which has a collection of pull-down menus and a command input field. This tutorial is intended for users who don't use PyMOL every day, so I will use the menus as much as possible to reduce the number of commands that the occasional user has to remember. If you are going to use PyMOL a lot, it is worthwhile to get hold of the manual and learn the commands. You can ease your work considerably through scripting.

We will start the tutorial by working on the PDB file 1duz. If you are working on the CBS network, you can retrieve the file using the command "getpdb" (at the UNIX prompt):

getpdb 1duz > 1duz.pdb

You can also download it from www.rcsb.org/pdb.

First, take a look at the PDB file itself. PDB files are text files, which contain a header with various information about the structure, followed by the actual coordinates of the atoms in the structure. In the header, you can find information on how to generate the biomolecule from the coordinates in the PDB file in the REMARK 300 and REMARK 350 fields:

REMARK 300                                                                      
REMARK 300 BIOMOLECULE: 1, 2                                                    
REMARK 300 THIS ENTRY CONTAINS THE CRYSTALLOGRAPHIC ASYMMETRIC UNIT             
REMARK 300 WHICH CONSISTS OF 6 CHAIN(S). SEE REMARK 350 FOR                     
REMARK 300 INFORMATION ON GENERATING THE BIOLOGICAL MOLECULE(S).                
REMARK 350                                                                      
REMARK 350 GENERATING THE BIOMOLECULE                                           
REMARK 350 COORDINATES FOR A COMPLETE MULTIMER REPRESENTING THE KNOWN           
REMARK 350 BIOLOGICALLY SIGNIFICANT OLIGOMERIZATION STATE OF THE                
REMARK 350 MOLECULE CAN BE GENERATED BY APPLYING BIOMT TRANSFORMATIONS          
REMARK 350 GIVEN BELOW.  BOTH NON-CRYSTALLOGRAPHIC AND                          
REMARK 350 CRYSTALLOGRAPHIC OPERATIONS ARE GIVEN.                               
REMARK 350                                                                      
REMARK 350 BIOMOLECULE: 1                                                       
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C                               
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000            
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000            
REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000            
REMARK 350 BIOMOLECULE: 2                                                       
REMARK 350 APPLY THE FOLLOWING TO CHAINS: D, E, F                               
REMARK 350   BIOMT1   2  1.000000  0.000000  0.000000        0.00000            
REMARK 350   BIOMT2   2  0.000000  1.000000  0.000000        0.00000            
REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        0.00000            
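
If you would rather not read the header by eye, the chain assignments can be pulled out of the REMARK 350 records with a few lines of Python. This is only a minimal sketch: the function name biomolecule_chains is made up for this example, the input is a shortened copy of the records above, and it assumes the "BIOMOLECULE:" and "APPLY THE FOLLOWING TO CHAINS:" wording shown here.

```python
import re

# Shortened copy of the REMARK 350 records shown above.
REMARK_350 = """\
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350 BIOMOLECULE: 2
REMARK 350 APPLY THE FOLLOWING TO CHAINS: D, E, F
REMARK 350   BIOMT1   2  1.000000  0.000000  0.000000        0.00000
"""


def biomolecule_chains(text):
    """Map each biomolecule number to the chains it is built from."""
    result = {}
    current = None
    for line in text.splitlines():
        if not line.startswith("REMARK 350"):
            continue
        body = line[10:].strip()
        match = re.match(r"BIOMOLECULE:\s*(\d+)", body)
        if match:
            # A new biomolecule block starts here.
            current = int(match.group(1))
            result.setdefault(current, [])
        elif body.startswith("APPLY THE FOLLOWING TO CHAINS:") and current is not None:
            chains = body.split(":", 1)[1]
            result[current].extend(c.strip() for c in chains.split(",") if c.strip())
    return result


chains_by_biomolecule = biomolecule_chains(REMARK_350)
print(chains_by_biomolecule)
```
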
In this case, the PDB file contains 6 chains, where the A, B and C chains make up the first biomolecule, and the D, E and F chains make up the second. A bit further down in the header, you can find the SEQRES fields, which give the sequence of the chains:
SEQRES   1 A  275  GLY SER HIS SER MET ARG TYR PHE PHE THR SER VAL SER          
SEQRES   2 A  275  ARG PRO GLY ARG GLY GLU PRO ARG PHE ILE ALA VAL GLY          
SEQRES   3 A  275  TYR VAL ASP ASP THR GLN PHE VAL ARG PHE ASP SER ASP          
SEQRES   4 A  275  ALA ALA SER GLN ARG MET GLU PRO ARG ALA PRO TRP ILE          
SEQRES   5 A  275  GLU GLN GLU GLY PRO GLU TYR TRP ASP GLY GLU THR ARG          
SEQRES   6 A  275  LYS VAL LYS ALA HIS SER GLN THR HIS ARG VAL ASP LEU          
SEQRES   7 A  275  GLY THR LEU ARG GLY TYR TYR ASN GLN SER GLU ALA GLY          
SEQRES   8 A  275  SER HIS THR VAL GLN ARG MET TYR GLY CYS ASP VAL GLY          
SEQRES   9 A  275  SER ASP TRP ARG PHE LEU ARG GLY TYR HIS GLN TYR ALA          
SEQRES  10 A  275  TYR ASP GLY LYS ASP TYR ILE ALA LEU LYS GLU ASP LEU          
SEQRES  11 A  275  ARG SER TRP THR ALA ALA ASP MET ALA ALA GLN THR THR          
SEQRES  12 A  275  LYS HIS LYS TRP GLU ALA ALA HIS VAL ALA GLU GLN LEU          
SEQRES  13 A  275  ARG ALA TYR LEU GLU GLY THR CYS VAL GLU TRP LEU ARG          
SEQRES  14 A  275  ARG TYR LEU GLU ASN GLY LYS GLU THR LEU GLN ARG THR          
SEQRES  15 A  275  ASP ALA PRO LYS THR HIS MET THR HIS HIS ALA VAL SER          
SEQRES  16 A  275  ASP HIS GLU ALA THR LEU ARG CYS TRP ALA LEU SER PHE          
SEQRES  17 A  275  TYR PRO ALA GLU ILE THR LEU THR TRP GLN ARG ASP GLY          
SEQRES  18 A  275  GLU ASP GLN THR GLN ASP THR GLU LEU VAL GLU THR ARG          
SEQRES  19 A  275  PRO ALA GLY ASP GLY THR PHE GLN LYS TRP ALA ALA VAL          
SEQRES  20 A  275  VAL VAL PRO SER GLY GLN GLU GLN ARG TYR THR CYS HIS          
SEQRES  21 A  275  VAL GLN HIS GLU GLY LEU PRO LYS PRO LEU THR LEU ARG          
SEQRES  22 A  275  TRP GLU                                                      
SEQRES   1 B  100  MET ILE GLN ARG THR PRO LYS ILE GLN VAL TYR SER ARG          
SEQRES   2 B  100  HIS PRO ALA GLU ASN GLY LYS SER ASN PHE LEU ASN CYS          
SEQRES   3 B  100  TYR VAL SER GLY PHE HIS PRO SER ASP ILE GLU VAL ASP          
SEQRES   4 B  100  LEU LEU LYS ASN GLY GLU ARG ILE GLU LYS VAL GLU HIS          
SEQRES   5 B  100  SER ASP LEU SER PHE SER LYS ASP TRP SER PHE TYR LEU          
SEQRES   6 B  100  LEU TYR TYR THR GLU PHE THR PRO THR GLU LYS ASP GLU          
SEQRES   7 B  100  TYR ALA CYS ARG VAL ASN HIS VAL THR LEU SER GLN PRO          
SEQRES   8 B  100  LYS ILE VAL LYS TRP ASP ARG ASP MET                          
SEQRES   1 C    9  LEU LEU PHE GLY TYR PRO VAL TYR VAL                          
SEQRES   1 D  275  GLY SER HIS SER MET ARG TYR PHE PHE THR SER VAL SER          
SEQRES   2 D  275  ARG PRO GLY ARG GLY GLU PRO ARG PHE ILE ALA VAL GLY          
SEQRES   3 D  275  TYR VAL ASP ASP THR GLN PHE VAL ARG PHE ASP SER ASP          
SEQRES   4 D  275  ALA ALA SER GLN ARG MET GLU PRO ARG ALA PRO TRP ILE          
SEQRES   5 D  275  GLU GLN GLU GLY PRO GLU TYR TRP ASP GLY GLU THR ARG          
SEQRES   6 D  275  LYS VAL LYS ALA HIS SER GLN THR HIS ARG VAL ASP LEU          
SEQRES   7 D  275  GLY THR LEU ARG GLY TYR TYR ASN GLN SER GLU ALA GLY          
SEQRES   8 D  275  SER HIS THR VAL GLN ARG MET TYR GLY CYS ASP VAL GLY          
SEQRES   9 D  275  SER ASP TRP ARG PHE LEU ARG GLY TYR HIS GLN TYR ALA          
SEQRES  10 D  275  TYR ASP GLY LYS ASP TYR ILE ALA LEU LYS GLU ASP LEU          
SEQRES  11 D  275  ARG SER TRP THR ALA ALA ASP MET ALA ALA GLN THR THR          
SEQRES  12 D  275  LYS HIS LYS TRP GLU ALA ALA HIS VAL ALA GLU GLN LEU          
SEQRES  13 D  275  ARG ALA TYR LEU GLU GLY THR CYS VAL GLU TRP LEU ARG          
SEQRES  14 D  275  ARG TYR LEU GLU ASN GLY LYS GLU THR LEU GLN ARG THR          
SEQRES  15 D  275  ASP ALA PRO LYS THR HIS MET THR HIS HIS ALA VAL SER          
SEQRES  16 D  275  ASP HIS GLU ALA THR LEU ARG CYS TRP ALA LEU SER PHE          
SEQRES  17 D  275  TYR PRO ALA GLU ILE THR LEU THR TRP GLN ARG ASP GLY          
SEQRES  18 D  275  GLU ASP GLN THR GLN ASP THR GLU LEU VAL GLU THR ARG          
SEQRES  19 D  275  PRO ALA GLY ASP GLY THR PHE GLN LYS TRP ALA ALA VAL          
SEQRES  20 D  275  VAL VAL PRO SER GLY GLN GLU GLN ARG TYR THR CYS HIS          
SEQRES  21 D  275  VAL GLN HIS GLU GLY LEU PRO LYS PRO LEU THR LEU ARG          
SEQRES  22 D  275  TRP GLU                                                      
SEQRES   1 E  100  MET ILE GLN ARG THR PRO LYS ILE GLN VAL TYR SER ARG          
SEQRES   2 E  100  HIS PRO ALA GLU ASN GLY LYS SER ASN PHE LEU ASN CYS          
SEQRES   3 E  100  TYR VAL SER GLY PHE HIS PRO SER ASP ILE GLU VAL ASP          
SEQRES   4 E  100  LEU LEU LYS ASN GLY GLU ARG ILE GLU LYS VAL GLU HIS          
SEQRES   5 E  100  SER ASP LEU SER PHE SER LYS ASP TRP SER PHE TYR LEU          
SEQRES   6 E  100  LEU TYR TYR THR GLU PHE THR PRO THR GLU LYS ASP GLU          
SEQRES   7 E  100  TYR ALA CYS ARG VAL ASN HIS VAL THR LEU SER GLN PRO          
SEQRES   8 E  100  LYS ILE VAL LYS TRP ASP ARG ASP MET                          
SEQRES   1 F    9  LEU LEU PHE GLY TYR PRO VAL TYR VAL                          
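
The SEQRES records above can also be turned into ordinary one-letter sequences. The following is a rough sketch, not a full PDB parser: the function name chain_sequences is invented for this example, only the two peptide records are used as input, and it assumes that every SEQRES line has a non-blank chain id so that splitting on whitespace is safe.

```python
from collections import defaultdict

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

# The two peptide records from the listing above.
SEQRES = """\
SEQRES   1 C    9  LEU LEU PHE GLY TYR PRO VAL TYR VAL
SEQRES   1 F    9  LEU LEU PHE GLY TYR PRO VAL TYR VAL
"""


def chain_sequences(text):
    """Return {chain id: one-letter sequence} from SEQRES records."""
    residues = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("SEQRES"):
            # Fields: record name, line number, chain id, chain length, residues...
            fields = line.split()
            residues[fields[2]].extend(fields[4:])
    return {
        chain: "".join(THREE_TO_ONE.get(name, "X") for name in names)
        for chain, names in residues.items()
    }


sequences = chain_sequences(SEQRES)
print(sequences["C"])
```

Residues that are not in the table are written as "X", which is a choice made for this sketch only.
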
The molecule we are looking at is an MHC molecule. (Molecule of the month at the PDB in February 2005: http://www.rcsb.org/pdb/molecules/pdb62_1.html.) MHC molecules consist of two protein chains, here A and B (and D and E), and a peptide, here C (and F), which in this case is 9 residues long. When you have the PDB file in your PyMOL working directory, open it by selecting "open" under the "File" menu. The structure is now shown in the Viewer window in the "lines" representation. You can see the two biomolecules plus a lot of water molecules. We only want to look at one molecule, so we define an object consisting of chains A-C by entering at the command prompt:

create molecule1, chain A or chain B or chain C

This creates an object named "molecule1" consisting of one molecule only, which now shows up under "1duz" on the list of objects in the Viewer window. (Note that the PyMOL language is case-sensitive, but upper case is not used for commands.) Next to the object names in the Viewer window are five buttons: A(ctions), S(how), H(ide), L(abels) and C(olor). Hide the 1duz object now by selecting H(ide) everything. If the "molecule1" object is not centered, choose "zoom" under A(ctions). This puts your selected object in the center of the display and makes it fill the whole window.

S(how) the "molecule1" object in different representations; try them all to see what they do. Note that you have to H(ide) the representations after use, or they will all be shown on top of each other. When you have seen them all, go back to the "cartoon" representation. Now experiment with C(olor) until you have found something you are pleased with. Rotate the molecule using the left mouse button and zoom (move in the Z direction) using the right mouse button until you are also pleased with the orientation of the molecule. There is a table with an overview of the different mouse functions in the lower right-hand corner of the Viewer window.

Now you are ready to create a publication-quality image. It's dead easy: just press "Ray" in the GUI window (or type "ray" at the prompt). This will take a few seconds, depending on the complexity of the image you want to ray-trace.
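
Since scripting was mentioned as a way to ease your work, here is a small sketch of how the menu steps above could be collected as typed commands. The Python function build_script is a made-up helper that only assembles the text; load, create, hide, zoom, show and ray are ordinary PyMOL commands. The object names and the choice of the cartoon representation follow this tutorial, and the menu equivalents in the comments are my own reading of the steps.

```python
def build_script(pdb_file, source, target, chains, representation="cartoon"):
    """Return PyMOL commands (one per line) mirroring the menu steps above."""
    selection = " or ".join(f"chain {chain}" for chain in chains)
    commands = [
        f"load {pdb_file}, {source}",        # open the PDB file as object `source`
        f"create {target}, {selection}",     # new object holding the chosen chains
        f"hide everything, {source}",        # H(ide) everything on the original
        f"zoom {target}",                    # zoom under A(ctions)
        f"hide everything, {target}",        # clear earlier representations
        f"show {representation}, {target}",  # S(how) the chosen representation
        "ray",                               # ray-trace the current view
    ]
    return "\n".join(commands)


script = build_script("1duz.pdb", "1duz", "molecule1", ["A", "B", "C"])
print(script)
```

The printed lines can be typed one at a time at the PyMOL prompt, or saved in a text file (for instance figure.pml, a name chosen here only as an example) and run as a script.
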

Changing settings

If you choose "Edit all" under "Settings" in the GUI window, you can see all the settings that PyMOL allows you to change. If you like slimmer sticks, change the corresponding value (stick_radius). The background color can be changed to any RGB color by changing "bg_rgb". The three numbers from 0-1 represent the amount of red, green and blue, respectively. Try changing the background color to red.
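
The same settings can be changed from the command prompt with the "set" command. The sketch below builds the command text in Python so that the 0-1 range of the three color numbers can be checked first. The helper bg_rgb_command is hypothetical, and the stick radius of 0.15 is just an example value, not a recommendation.

```python
def bg_rgb_command(red, green, blue):
    """Build a `set bg_rgb` command after checking that each value is in 0-1."""
    for value in (red, green, blue):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"RGB components must be between 0 and 1, got {value}")
    return f"set bg_rgb, [{red:.2f}, {green:.2f}, {blue:.2f}]"


commands = [
    "set stick_radius, 0.15",  # slimmer sticks (example value)
    bg_rgb_command(1, 0, 0),   # full red, no green, no blue
]
print("\n".join(commands))
```

For a named color such as red, typing "bg_color red" at the prompt should have the same effect as the second command.
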

Saving

To save your image, choose "Save Image" in the "File" menu in the GUI window. If you have to leave PyMOL now, but expect to come back and work some more on your image later, you can save your PyMOL session, so you won't have to start all over again next time. Do this by selecting "Save session" or "Save session as" under the "File" menu. It is a good idea to save your session before doing something radical; in this way it works as an "undo" button.

Selections

This example has demonstrated many of the most often used features in PyMOL, but in order to create more complex figures, you will need to know a little more about selections.

If you wish to do something with just a subset of the atoms in an object, you can create either a named selection or a new object consisting of the subset you are interested in. Selections and objects play slightly different roles in PyMOL. For most purposes, I recommend creating new objects with the selections you wish to work with. Objects are created by the command "create" which has the syntax

create name, selection

where

name = object to create (or modify)
selection = atoms to include in the object

To understand how selections work, let's have another look at the PDB file. After the header records, the actual coordinates of the atoms are listed in the following format:

ATOM      1  N   GLY A   1      14.752  -6.145  13.692  1.00 25.90           N  
ATOM      2  CA  GLY A   1      15.556  -5.512  12.629  1.00 26.24           C  
ATOM      3  C   GLY A   1      15.173  -4.015  12.551  1.00 25.34           C  
ATOM      4  O   GLY A   1      14.683  -3.476  13.556  1.00 26.37           O  
ATOM      5  N   SER A   2      15.385  -3.415  11.390  1.00 24.50           N  
ATOM      6  CA  SER A   2      15.117  -1.999  11.027  1.00 23.82           C  
ATOM      7  C   SER A   2      13.621  -1.646  10.953  1.00 21.82           C  
ATOM      8  O   SER A   2      12.801  -2.540  10.698  1.00 20.40           O  
ATOM      9  CB  SER A   2      15.721  -1.709   9.635  1.00 24.56           C  
ATOM     10  OG  SER A   2      17.138  -1.917   9.705  1.00 29.71           O  
ATOM     11  N   HIS A   3      13.271  -0.356  11.200  1.00 20.13           N  
ATOM     12  CA  HIS A   3      11.821   0.031  11.153  1.00 18.92           C  
ATOM     13  C   HIS A   3      11.657   1.450  10.570  1.00 18.05           C  
ATOM     14  O   HIS A   3      12.661   2.139  10.422  1.00 17.09           O  
ATOM     15  CB  HIS A   3      11.225   0.000  12.561  1.00 18.31           C  
ATOM     16  CG  HIS A   3      11.161  -1.410  13.108  1.00 17.87           C  
ATOM     17  ND1 HIS A   3      10.185  -2.297  12.725  1.00 18.42           N  
ATOM     18  CD2 HIS A   3      11.937  -2.071  13.989  1.00 19.01           C  
ATOM     19  CE1 HIS A   3      10.362  -3.467  13.333  1.00 19.54           C  
ATOM     20  NE2 HIS A   3      11.415  -3.344  14.121  1.00 19.01           N  
In the first column is the field "ATOM", which indicates that the rest of the line contains atomic coordinate information. The next column is the atom number, then comes the atom name (in PyMOL called "name"), then the residue name (PyMOL: resn), the chain id (PyMOL: chain) and the residue number (PyMOL: resi). The next three columns are the x, y and z coordinates of the atom, followed by the occupancy (usually 1.00) and the B-factor. The last column is rarely used (and is missing in some PDB files), but gives the element of the atom.
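
To make the column layout concrete, here is a minimal Python sketch that cuts one ATOM line into the fields just described. PDB files are fixed-width, so the fields are taken by character position rather than by splitting on spaces. The Atom type and the parse_atom function are names invented for this example; the field names name, resn, chain and resi follow the PyMOL names given above.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int       # atom number
    name: str         # atom name (PyMOL: name)
    resn: str         # residue name (PyMOL: resn)
    chain: str        # chain id (PyMOL: chain)
    resi: int         # residue number (PyMOL: resi)
    x: float
    y: float
    z: float
    occupancy: float
    b: float          # B-factor
    element: str      # empty if the file lacks the last column


def parse_atom(line):
    """Split one fixed-width ATOM record into its fields."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        resn=line[17:20].strip(),
        chain=line[21],
        resi=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b=float(line[60:66]),
        element=line[76:78].strip(),
    )


# Second record from the listing above.
line = "ATOM      2  CA  GLY A   1      15.556  -5.512  12.629  1.00 26.24           C"
atom = parse_atom(line)
print(atom.name, atom.resn, atom.chain, atom.resi, atom.b)
```
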

B-factors

The B-factor column is very useful. The B-factors can be used to illustrate the flexibility of the structure (the higher the B-factor, the more flexible the structure). The structure can be colored by B-factor if you choose "spectrum" in the C(olor) menu, and then "b-factors". Try this. You can see that the structure is most rigid in the center and most flexible in the loop regions. Because a structure can be colored by the numbers in the B-factor column, you can replace the B-factors with values for other features that you wish to depict on your structure. This could for example be a measure of conservation, which would turn your image into a type of 3-D logo plot. But that was a side track.
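
Replacing the B-factors is a matter of overwriting six characters in each ATOM line. The sketch below shows one way to do it in Python. The function names and the per-residue scores are made up for illustration (the scores are not real conservation data), and values have to fit the six-character field, so they should stay below 1000.

```python
def set_b_factor(line, value):
    """Return the ATOM/HETATM line with its B-factor field replaced by value."""
    if not line.startswith(("ATOM", "HETATM")):
        return line
    padded = line.rstrip("\n").ljust(66)
    # The B-factor occupies the six characters after the occupancy field.
    return padded[:60] + f"{value:6.2f}" + padded[66:]


def with_scores(pdb_lines, scores, default=0.0):
    """Write a per-residue score, keyed by (chain, resi), into the B-factor column."""
    rewritten = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]))
            line = set_b_factor(line, scores.get(key, default))
        rewritten.append(line)
    return rewritten


original = "ATOM      2  CA  GLY A   1      15.556  -5.512  12.629  1.00 26.24           C"
example_scores = {("A", 1): 0.5}  # invented value for residue 1 of chain A
rescored = with_scores([original], example_scores)[0]
print(rescored)
```

If the rewritten lines are saved as a new PDB file and opened in PyMOL, coloring by "spectrum" and "b-factors" should then show the new values instead of the flexibility.
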

Selection syntax

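I will demonstrate the selection syntax through examples. To create an object consisting of just the carbon atoms named CA, CB, CG and CD in the structure (note that this is not every carbon atom; the carbonyl C, for example, is left out), enter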

create carbons, name ca+cb+cg+cd

Try this. You will notice that it chooses these carbons in all the chains (including chains D-F). If you only want to choose the carbons in chains A-C, you can either delete the 1duz object and then recreate the carbons object, or you can limit your selection to chains A-C:

create carbons, (name ca+cb+cg+cd) and (chain A or chain B or chain C)

Note that if you have a (+)-separated list of identifiers, no spaces are allowed. If you want to select a range of residues, such as the first 10 residues of the N-terminus, you can use a (-):

create Nterminal, (resi 1-10) and (chain A)

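(In the PyMOL version used for this tutorial you cannot use both (+) and (-) in the same go: resi 1-10+36 is NOT allowed. Newer versions may accept such mixed lists, so check with your own installation.)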

To create an object consisting of just the backbone atoms, enter

create bb, name c+o+n+ca
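
To see what these selections mean, it can help to write the same logic out in plain Python. The sketch below is only an analogue and says nothing about how PyMOL works internally: the functions plus_list and select are invented for this example, the atom list is a tiny toy subset, and atom names are compared without regard to case, which is an assumption based on the lower-case names used in the commands above. Items in a (+)-separated list are alternatives, while the different criteria are combined with "and".

```python
# Toy subset of atoms: (name, resn, chain, resi)
ATOMS = [
    ("N", "GLY", "A", 1),
    ("CA", "GLY", "A", 1),
    ("CB", "SER", "A", 2),
    ("CA", "VAL", "A", 12),
    ("CA", "MET", "B", 1),
    ("CA", "GLY", "D", 1),
]


def plus_list(text):
    """Split a (+)-separated identifier list such as 'ca+cb+cg+cd'."""
    return {item.upper() for item in text.split("+")}


def select(atoms, names=None, chains=None, resi=None):
    """Return the atoms that match every criterion given (criteria are AND-ed)."""
    picked = []
    for name, resn, chain, number in atoms:
        if names is not None and name.upper() not in names:
            continue
        if chains is not None and chain not in chains:
            continue
        if resi is not None and not (resi[0] <= number <= resi[1]):
            continue
        picked.append((name, resn, chain, number))
    return picked


# (name ca+cb+cg+cd) and (chain A or chain B or chain C)
carbons = select(ATOMS, names=plus_list("ca+cb+cg+cd"), chains={"A", "B", "C"})
# (resi 1-10) and (chain A)
nterminal = select(ATOMS, chains={"A"}, resi=(1, 10))
print(carbons)
print(nterminal)
```
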

Selection algebra

You have already seen how to include residues that are either in chain A or chain B or chain C in the above example. Here is a list of other useful selection operators and modifiers:
Operator            Effect

not s1              Selects atoms that are not included in s1
                    PyMOL ex: create sidechains, not bb

s1 and s2           Selects atoms included in both s1 and s2

s1 or s2            Selects atoms included in either s1 or s2

s1 around X         Selects atoms with centers within X Angstroms of the center of
                    any atom in s1

s1 expand X         Expands s1 by all atoms within X Angstroms of the center of
                    any atom in s1

s1 within X of s2   Selects atoms in s1 that are within X Angstroms of s2

byres s1            Expands selection to complete residues

byobject s1         Expands selection to complete objects

neighbor s1         Selects atoms directly bonded to s1
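
The distance-based operators are easiest to understand with a small calculation. The following plain-Python sketch mimics "s1 within X of s2" and "s1 around X" on made-up coordinates. The function names, atom labels and coordinates are all invented for illustration, and treating an atom at exactly the cutoff distance as included is my assumption, not something taken from the PyMOL documentation.

```python
import math


def within(s1, s2, cutoff):
    """Atoms of s1 whose center is within cutoff Angstroms of any atom in s2."""
    return [a for a in s1 if any(math.dist(a[1], b[1]) <= cutoff for b in s2)]


def around(everything, s1, cutoff):
    """Atoms outside s1 whose center is within cutoff Angstroms of any atom in s1."""
    others = [a for a in everything if a not in s1]
    return within(others, s1, cutoff)


# Toy atoms: (label, (x, y, z)) with coordinates in Angstroms
s1 = [("P1", (0.0, 0.0, 0.0))]
others = [("W1", (3.0, 0.0, 0.0)), ("W2", (0.0, 5.0, 0.0))]

near = within(others, s1, 4.0)            # like: others within 4 of s1
surrounding = around(s1 + others, s1, 4.0)  # like: s1 around 4
print(near)
print(surrounding)
```

Here W1 lies 3 Angstroms from P1 and is kept, while W2 lies 5 Angstroms away and is dropped.
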


Your turn!!

Create an image like this one: (the surface is covering the A and B chains, and the peptide - the C chain - is shown in stick representation)

When your image has been approved by the instructor, have a go at this one: (same as above, but waters that are within 4 Angstroms of the peptide are included).

If you have your own favorite structure, you can try to create an image of that now.