
Probe Design for microarrays using OligoWiz 2.0

Exercise written by: Rasmus Wernersson


Overview

This exercise will show how to use the OligoWiz 2.0 server for designing probes for DNA microarrays.

The first part of the exercise will show how to use the basic functionality of OligoWiz 2.0 by using a small set of Bacillus subtilis transcripts. In the second part of the exercise we will explore the more advanced possibilities of OligoWiz 2.0: using sequence feature annotation (such as intron/exon structure) in the probe selection process.


Getting started.

  1. Open the OligoWiz 2.0 webpage in a new browser window

  2. Open the "instructions" help page from the main OligoWiz webpage

    Please keep the help page open throughout this exercise. The help page contains answers to most of the questions that might arise.

  3. Optional: Install Java 1.4.1 or later

    If you run this exercise from your own computer, you may need to install a suitable version of Java. Please follow the instructions on the OligoWiz webpage.

  4. Download the OligoWiz client (the Java program) from the OligoWiz webpage

  5. Launch the OligoWiz 2.0 client

    On Windows and Mac simply double-click on the JAR file. On Unix systems you'll need to start the program from the command line:

    java -jar OligoWiz-2.1.0.jar


Exercise 1: Bacillus subtilis

  1. Download the sample dataset

    Download the "30 Bacillus genes" sample dataset linked near the bottom of the OligoWiz webpage. This is a FASTA format file containing 30 transcripts from Bacillus subtilis.

  2. Inspect the data file

    Use a simple text editor to inspect the file (e.g. Notepad or Wordpad on Windows, TextEdit or BBEdit on Mac, NEdit on UNIX).
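    If you prefer a script to a text editor, the sketch below shows one way to list the entries of a FASTA file with their lengths. The helper name read_fasta and the toy records are illustrative assumptions, not part of OligoWiz or of the sample dataset.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:]  # header line: start a new record
            records[name] = ""
        elif name is not None:
            records[name] += line  # sequence line: append to current record
    return records


# Toy input standing in for the downloaded file (not real B. subtilis data).
example = """>gene_a
ATGGCTAGCTAG
GGCTA
>gene_b
ATGCCC
""".splitlines()

records = read_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

    To run it on the downloaded dataset, pass an open file handle to read_fasta instead of the toy lines.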

  3. Get ready to launch the query to the OligoWiz 2.0 server

    IMPORTANT: The OligoWiz help page you opened earlier contains a quick walk-through of how to launch a query, near the top of the page.

    You launch the query by completing the following steps:

    1. Specify the input file

       Notice: Some browsers may give the file a new extension (e.g. .txt). If you cannot see the file in the file-chooser dialog, select "All files" as the type of files you want to see.

    2. Accept the default result name suggested by OligoWiz - or choose a new one

    3. Select Bacillus subtilis (subspecies "subtilis") as the species database to use

       You can find the correct entry through the "Taxonomy" tree (Bacillus is a Firmicute), or through the alphabetically sorted tree (sorted by genus name).

    4. Once the correct database is selected, press the "?" button positioned just above the species database tree. This will bring up the "Databases" webpage (also linked from the main OligoWiz 2.0 webpage) and jump directly to the relevant database entry. [Launching an external browser may not work on some Linux/UNIX configurations; it does work on Windows and Mac.]

    5. Adjust parameters

      • Allow the length to vary between 65 and 70 bp (set Aim oligo length to 70 bp).
      • Let OligoWiz determine the optimal Tm (default)
      • Use default cross-hybridization settings
      • Select "Random primed" as the position score

    6. Launch the query by pressing "Submit query"

  4. Wait for OligoWiz to process and download the result

    The query should be complete within one minute. When everything is ready the query status changes to "completed - click to view". Double-click the query to view the result.

  5. Have a look at the sequence/probe inspection interface

    Try to select different sequences in the leftmost table, and have a look at the scoring graphs. Notice how the weighting of the different scores can be changed.

  6. Select probes for all the transcripts

    Press the "Place oligos..." button to open up the probe placement dialog. At this point we will ignore the filter-options and only experiment with the distance settings.

    Initially try setting the maximum number of probes per sequence to 1, and press "Apply to all". Inspect the result in the main window.

    Notice how the length of the probes may vary in order to minimize the variation in Tm.
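    The link between probe length and Tm can be illustrated with a small sketch. It is not the method OligoWiz uses: it relies on a simple GC-content approximation of Tm (one common variant, with the salt term dropped), and the helper names approx_tm and best_length are hypothetical. For a fixed start position it picks the length in the allowed range whose Tm lies closest to a target value.

```python
def approx_tm(seq: str) -> float:
    """Rough Tm estimate in degrees C from GC content and length.

    Assumption: simple GC-content formula, not a nearest-neighbour model.
    """
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    return 81.5 + 0.41 * (100.0 * gc / len(seq)) - 675.0 / len(seq)


def best_length(seq: str, start: int, target_tm: float,
                min_len: int, max_len: int) -> int:
    """Return the length in [min_len, max_len] whose Tm is closest to target_tm."""
    # Only lengths that still fit inside the sequence are considered.
    candidates = [n for n in range(min_len, max_len + 1) if start + n <= len(seq)]
    if not candidates:
        raise ValueError("no probe of the requested length fits at this position")
    return min(candidates,
               key=lambda n: abs(approx_tm(seq[start:start + n]) - target_tm))


# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
toy = "AT" * 10 + "GC" * 10
chosen = best_length(toy, start=0, target_tm=60.0, min_len=10, max_len=40)
print(chosen, round(approx_tm(toy[:chosen]), 2))
```

    Allowing the length to vary gives the selection step more freedom to bring each probe close to the common Tm, which is why the exported probes are not all the same length.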

  7. Export the probe sequences

    In the main window press the "Export oligos..." button to open up the export dialog. Select FASTA format as the file format and save all the probes to a file.

    The resultant file can be inspected in a text-editor.

  8. Experiment with the probe distance settings

    Try to select more than one probe per sequence by adjusting the distance parameters and pressing "Apply to all" again. Notice that the minimum distance criterion can be used to allow/disallow overlapping probes.

  9. Close the probe placement dialog before you proceed to the next exercise


Exercise 2: Intron-containing Yeast genes

In this exercise you will start out by extracting sequence and intron/exon annotation from a set of Yeast genes using the FeatureExtract server.

Extract sequence and annotation

  1. Open the FeatureExtract server in a new window

  2. Extract feature annotation from the Yeast mitochondrial genome

    Most GenBank entries can be accessed directly by their Accession ID. Paste the following ID into the leftmost text-box (titled: "Paste in list of GenBank accession IDs")

    NC_001224

    Run the FeatureExtract server with default setting by pressing "Submit query".

    Once completed, the server will show a brief summary of the result. Download the TAB format data file (right-click then "save as" in your browser). At this point it is a good idea to save the file under a descriptive filename, e.g. "yeast_mit.tab". NOTICE: Whatever filename you choose, please save the file with .tab as the extension.

    Have a look at the following two tables, which give an overview of the kind of data found in a TAB format file: Sample output tables.
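    A TAB file can also be checked with a short script. The sketch below assumes that each line holds a name, a sequence, and an annotation string of the same length as the sequence, separated by tab characters. That column layout is an assumption made for illustration; check it against the sample output tables before relying on it. The names Entry and read_tab, and the toy entry, are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    name: str
    sequence: str
    annotation: str


def read_tab(lines: Iterable[str]) -> List[Entry]:
    """Parse lines assumed to hold name, sequence, annotation (tab-separated)."""
    entries: List[Entry] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, sequence, annotation = fields[:3]
        # Assumption: one annotation letter per sequence position.
        if len(sequence) != len(annotation):
            raise ValueError(f"{name}: sequence and annotation differ in length")
        entries.append(Entry(name, sequence, annotation))
    return entries


# Toy line (not taken from NC_001224): two short exons around one intron.
toy_lines = ["toy_gene\tATGGGTAAGTAGGCT\t(EE)DIIIIA(EEE)\n"]

entries = read_tab(toy_lines)
for entry in entries:
    print(entry.name, len(entry.sequence), entry.annotation.count("I"))
```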

The FeatureExtract server has a lot of advanced functionality which is not used in this exercise. In particular, the ability to extract flanking regions and to extract datasets according to a specific feature (e.g. "promoter") is worth mentioning. Please refer to the "abstract" and "instructions" pages linked from the main FeatureExtract webpage for further information.

Run the Yeast mitochondrial transcripts on OligoWiz 2.0

  1. Create a new blank query

    Go back to the "Query" page in OligoWiz and press the button named "New" (located below the query table).

    Notice that "old" queries, including data files loaded from disk, also appear in the query table. Selecting an "old" query makes it possible to inspect the parameters and to use the "clone query" function.

  2. Select the data file you have created a moment ago as the input file.

    OligoWiz will auto detect if the file is in FASTA (sequence only) or TAB (sequence + annotation) format.

  3. Select S. cerevisiae as the species database to use

  4. Set parameters for designing short probes

    Load the pre-defined parameter set for short probes ("short-mers 24-26 bp"). Please notice that we actually need to use random-priming for the position score, since we're working with mitochondrial sequences.

  5. Submit the query, and wait for the result to download

  6. Use the "www..." button in the query table to inspect the online status of the query.

    If you supply OligoWiz with your email address, you'll receive an email containing a link to a similar webpage once the query is completed.

    Notice: Using the "www..." button is a good way to inspect error messages.

  7. Inspect the result

    Click on the completed query to load the data, and notice how the transcript-indicating bar has changed color. When working with annotation, exons are shown in green and introns in blue by default. (The color scheme can be changed through the File -> Annotation Color Scheme... menu item.)

    The number of transcripts is fairly small. Scroll through the transcripts until you find one named "BI2". This is one of the shorter genes and it contains one intron.

  8. Place probes

    Having selected an intron-containing transcript, bring up the probe placement dialog. If you have previously (for some reason) deselected the "Only consider regions annotated as exons" option, please turn it on once again.

    Set the distance criteria to allow for up to 25 probes per transcript and press "Apply". This will place probes ONLY within the currently selected transcript.

    Notice how - by default - probes are only placed fully within exons.

  9. More details about working with annotation

    TIP: Go to the OligoWiz help page by pressing "help ..." in the probe placement dialog. This will open up the help webpage at the relevant position. The help page contains additional information about the use of the probe placement dialog.

    The steps in the probe placement algorithm are as follows:

    For each sequence:

    1. If any filters have been defined, mask out probe positions that do not fulfill the criteria.
    2. Place a probe at the position with the highest Total score.
    3. Mask out surrounding positions, as defined by the minimum probe distance setting.
    4. If the maximum number of probes per sequence has not been reached, go to step 2.
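
    The four steps above can be sketched as a greedy loop. This is a minimal illustration, not the OligoWiz source code: the function name place_probes, the per-position score list, and the choice to measure distance between probe start positions are all assumptions.

```python
from typing import List, Optional


def place_probes(scores: List[float], allowed: List[bool],
                 max_probes: int, min_distance: int) -> List[int]:
    """Greedy placement: repeatedly take the best position that is still available."""
    # Step 1: positions rejected by the filters start out masked.
    available = list(allowed)
    placed: List[int] = []
    while len(placed) < max_probes:
        # Step 2: find the available position with the highest total score.
        best: Optional[int] = None
        for pos, score in enumerate(scores):
            if available[pos] and (best is None or score > scores[best]):
                best = pos
        if best is None:
            break  # every remaining position is masked
        placed.append(best)
        # Step 3: mask positions closer than min_distance to the new probe
        # (assumption: distance is measured between probe start positions).
        first = max(0, best - min_distance + 1)
        last = min(len(scores), best + min_distance)
        for pos in range(first, last):
            available[pos] = False
        # Step 4: the while condition repeats steps 2-3 until the limit is hit.
    return placed


# Toy scores for six candidate start positions; the last one fails a filter.
toy_scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.95]
toy_allowed = [True, True, True, True, True, False]
print(place_probes(toy_scores, toy_allowed, max_probes=3, min_distance=2))
print(place_probes(toy_scores, toy_allowed, max_probes=3, min_distance=1))
```

    With the larger minimum distance fewer probes fit, which mirrors what happens in the probe placement dialog when the distance settings are tightened or relaxed.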

    The letters used to annotate introns and exons are as follows:

    E:	Exon
    I:	Intron
    
    (:	Start of exon
    ):	End of exon
    D:	Donor site (start of intron)
    A:	Acceptor site (end of intron)
    	
  10. Example: Targeting probes at intronic regions

    Deselect the "Only consider regions annotated as exons" option

    Enter the following string in the Region include field:

    I+
    	
    This regular expression means "one or more of the letter I". Remember that I means "intron".

    Press "Apply to all"

    Notice how only intron-containing sequences will be assigned probes. Notice that it is possible to sort the contents of the two tables ("Entries" and "Oligo(s) for the selected entry") by clicking on the headers of the tables. This way it is possible to sort the entry list according to the number of probes placed in each transcript.
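
    To see what the I+ expression matches, it can be tried on a toy annotation string outside OligoWiz. The sketch below uses Python's re module; the annotation string is made up for illustration and simply follows the letter scheme listed in step 9.

```python
import re

# Toy annotation: one exon, one intron of six positions, one more exon.
annotation = "(" + "E" * 4 + ")" + "D" + "I" * 6 + "A" + "(" + "E" * 4 + ")"

# "I+" matches each maximal run of one or more I characters.
intron_runs = [(m.start(), m.end()) for m in re.finditer(r"I+", annotation)]

print(annotation)
print(intron_runs)
```

    Each (start, end) pair marks one contiguous stretch of intron positions; a sequence without any I in its annotation yields an empty list, which is why such sequences receive no probes in this step.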

  11. Targeting other specific regions

    Try to design probes targeting other special regions. Also try to adjust the minimum allowed distance between the probes to pack them tighter (e.g. 5bp).

    Please refer to the OligoWiz help page for a further description of regular expressions. The help page also contains a number of examples of the use of regular expressions in OligoWiz.

    A note on the difference between "Regional" and "Oligo" searches. Using "Oligo include" is the most basic way to define a filter: each oligo MUST satisfy the condition in order to be allowed. For example, you can just write a single letter the oligo must contain, and otherwise ignore all the regular expression stuff. Regional searches are a bit more complicated, in that you must define an entire region that is included (or excluded). This is why we needed I+ above and not only I.

    The reason for having these two kinds of filters is that they can be used to specify different types of include/exclude statements. For example, try to run a search with I in "Oligo include" as the only filter. Notice how probes are placed in introns - but ALSO in positions where the annotation string also contains other letters. You can click on an individual probe in the main window to inspect it (otherwise use the oligo(s).. table).
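
    The difference between the two filter types can be made concrete with a sketch. It encodes one plausible reading of the description above, not OligoWiz's actual implementation: an oligo filter only requires the pattern to match somewhere inside the oligo's own annotation, while a region filter requires the whole oligo to lie inside a matched region. The function names and the toy annotation are hypothetical.

```python
import re


def oligo_include(annotation: str, start: int, length: int, pattern: str) -> bool:
    """Oligo filter: the pattern must match somewhere within the oligo itself."""
    return re.search(pattern, annotation[start:start + length]) is not None


def region_include(annotation: str, start: int, length: int, pattern: str) -> bool:
    """Region filter: the oligo must lie entirely inside one matched region."""
    end = start + length
    return any(m.start() <= start and end <= m.end()
               for m in re.finditer(pattern, annotation))


# Toy annotation: exon, intron of six positions, exon.
annotation = "(" + "E" * 4 + ")" + "D" + "I" * 6 + "A" + "(" + "E" * 4 + ")"

# A 4-position oligo straddling the exon/intron boundary (annotation ")DII").
print(oligo_include(annotation, 5, 4, "I"))    # contains an I
print(region_include(annotation, 5, 4, "I+"))  # not fully inside the intron

# A 4-position oligo fully inside the intron (annotation "IIII").
print(region_include(annotation, 8, 4, "I+"))  # inside the I+ region
print(region_include(annotation, 8, 4, "I"))   # a single I is too short a region
```

    Under this reading, a region pattern of just I describes regions one position long, so no oligo fits inside them; I+ describes the whole intron.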

    Exercise

    • Target exons by disallowing the oligos to contain intronic positions.
    • Target Exon/Intron boundaries (Tip: Use the donor-site annotation letter D)
    • Target Intron/Exon boundaries
    • Target both acceptor and donor site (Tip: remember the OR operator: |).
    • Target exons 200bp downstream of an intron.

      Tip: use the Acceptor site as an anchor, see the help page about the wildcard (dot) operator and the range operator {min,max}.
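
    If the operators mentioned in the tips are unfamiliar, they can be tried on a toy string first. The sketch below uses Python's re module with deliberately small numbers. OligoWiz is a Java program, so its regular expression dialect may differ in details, and the annotation string is made up for illustration.

```python
import re

# Toy annotation: exon, intron of six positions, exon.
annotation = "(" + "E" * 4 + ")" + "D" + "I" * 6 + "A" + "(" + "E" * 4 + ")"

# OR operator: "D|A" matches either a donor or an acceptor letter.
sites = [(m.start(), m.group()) for m in re.finditer(r"D|A", annotation)]

# Wildcard plus range: "A.{3,5}" matches an A followed by 3 to 5 characters
# of any kind (the range is greedy, so it takes 5 when they are available).
window = re.search(r"A.{3,5}", annotation)

print(sites)
print(window.span(), window.group())
```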