
Usage instructions


Setting up a query

Interpreting the results/ placing probes

  • Export oligos/ probes


    Quick guide (Query)

    [Figure: Quick guide (query) diagram]

    Short walk through

    In order to use OligoWiz 2 for microarray probe design, the first step is to specify the target sequences (transcripts). This is done by supplying OligoWiz 2 with a file containing the intended target sequences in FASTA or TAB format. The TAB file may contain sequence annotation, such as exon/intron annotation. An annotation-containing TAB file can be custom made or generated from a GenBank file using the FeatureExtract 1.0 server.

    Next, a species database and a series of parameters must be set (a set of default settings can be loaded). When the desired parameters are set, a query can be launched by pressing the "submit" button. After calculating scores for all possible probes in the input sequences, the server returns an *.owz.gz file, which is saved to disk.

    The probe scores, together with the Total score (red curve) and sequence annotation, can be viewed in the graphical interface. The next step is to place the probes using the "Oligo placement" dialog. The number of probes per input sequence, the distance between probes and the minimum score criteria can be set. In addition, the probes can be placed with respect to the sequence annotation using regular expressions.

    Finally, the probes can be exported as either sense or antisense probes, with or without mismatch probes and in either tab or FASTA format.


    Getting started

    First, download the client program from the OligoWiz 2 web page and, if necessary, Java 1.4 or newer.
    On a Windows PC or Macintosh, just click on the program icon and wait a few seconds.
    On a UNIX or Linux operating system, type: java -jar OligoWiz2.0.jar


    Setting up a query


    Input formats

    The input for OligoWiz 2.0 contains the nucleotide sequences of the targets against which probes must be designed. Optionally, sequence annotation can be included in the input file. OligoWiz 2.0 takes two input formats, FASTA and TAB. Only the TAB format may contain sequence annotation.

    FASTA

    A FASTA file entry begins with a ">" followed by a sequence description, which is ended by a new line; the following lines contain the nucleotide sequence. A separate entry must be given in the FASTA file for each target sequence. The FASTA file must be in ASCII format, i.e. a plain text file, not an MS Word .doc file or the like. An example with 30 Bacillus subtilis coding regions can be found here.
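    The FASTA layout described above can be checked before submitting a query. The following is a minimal sketch in Python, not part of OligoWiz; the function name read_fasta is hypothetical, and upper-casing the sequence is a convenience assumed here rather than something the server requires.

```python
def read_fasta(text):
    """Split FASTA text into (description, sequence) pairs.

    Hypothetical helper for checking an input file; not OligoWiz code.
    """
    entries = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        if line.startswith(">"):
            # A new ">" line closes the previous entry, if there was one.
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line.upper())
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries
```

    Each returned pair corresponds to one target sequence, so the length of the list should equal the number of targets intended for the query.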

    Annotation containing tab file

    The tab file format is a tab-delimited file with one target sequence per line, containing: a sequence id, the nucleotide sequence followed by an optional sequence annotation string and comments.
    Example:
    Seq_x	ATGTCTACATATGAAGGTATGTAA	(EEEEEEEEEEEEEE)DIIIIIII	/comment

    A tab file can be generated automatically from a GenBank file using the FeatureExtract 1.0 server. A detailed description of the file format can be found here.
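    As an illustration of the tab format, the sketch below splits one line into its fields. It is a hedged example, not OligoWiz code: the function name parse_tab_line is hypothetical, and the length check assumes that the annotation string, when present, has exactly one character per nucleotide.

```python
def parse_tab_line(line):
    """Split one tab-format line into (id, sequence, annotation, comment).

    Hypothetical helper; annotation is None when the field is absent or empty.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        raise ValueError("expected at least a sequence id and a sequence")
    seq_id, sequence = fields[0], fields[1]
    annotation = fields[2] if len(fields) > 2 and fields[2] else None
    comment = fields[3] if len(fields) > 3 else ""
    # Assumption: one annotation character per sequence position.
    if annotation is not None and len(annotation) != len(sequence):
        raise ValueError(
            f"{seq_id}: annotation length {len(annotation)} "
            f"differs from sequence length {len(sequence)}"
        )
    return seq_id, sequence, annotation, comment
```

    Applied to the Seq_x example line above, the function returns the id, the 24-base sequence, the annotation string and the comment as separate values.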

    Loading input file

    The input file must be specified under "Main/Input FASTA or TAB file:". You may open a browse dialog by pressing the "..." button. When the input file has been specified, you will be prompted to specify an output OWZ file for the server output. The OWZ file may be used to load the query result from the "File/open file ..." menu at a later time.

    Setting parameters

    OligoWiz 2 requires a series of parameters in order to calculate scores for the probes. This involves setting the "Species" parameter as well as the Score parameters. The most important parameters to set are the species parameter and the oligo length. OligoWiz 2.0 also offers some default parameter settings that can be loaded (see Loading defaults), but the species parameter must always be set by the user.

    Species database

    In order to calculate an appropriate cross-hybridization score as well as low-complexity score, OligoWiz 2 requires information about the species in question. A number of databases are available, and can be found in the tree structure under "Species:" The species option is set by highlighting a leaf in the tree structure, e.g. "Eukaryote/Fungi/S. cerevisiae" as shown in the Quick guide (query).

    A more detailed description of the databases can be found here.
    For some higher organisms a UniGene database is available and is marked "(UniGene)"; otherwise the database is a genome database.

    Databases are continually added to the OligoWiz 2 server and the available databases are automatically read from the server every time the OligoWiz 2 client is started, or you press the "Connect" button in the top right corner of the query page.

    Loading defaults

    Default score parameters can be loaded from the "Pre-defined parameter sets" drop-down menu (point 3 in the quick guide (query) diagram). The defaults are loaded with the "Load" button and can then be viewed in the four pages "General", "Tm", "Cross-hybridization" and "Position", below. However, the user must always set the "Species" parameter.

    Score parameters

    The OligoWiz 2 client allows you to customize a series of parameters for the score calculations. These parameters can be set in the "Score parameters/info" field, using the four "page selectors" named General, Tm, Cross-hybridization and Position, which are described below.

    General

    Here you may:
    - Specify the length of the oligonucleotides. If the "Optimize oligo length to fit Tm" is selected, the aim length is used to calculate the optimal Tm for the delta-Tm score, i.e. the mean Tm for all oligos of this length is the aim Tm.
    - Specify the Min and Max oligo length to define the length interval within which OligoWiz selects oligos.

    Tm

    Here an optimal Tm may be specified, but we recommend you leave that to OligoWiz 2.0. When the server returns the result the aim Tm can be read from here. The hybridization chemistry may also be set to either DNA:DNA or RNA:RNA.

    Cross-hybridization

    The criteria for BLAST hits to be included in the Homology score calculation may be set.

    - "Minimum homology" (%) specifies the lowest degree of similarity to be taken into account for the homology calculation.

    - "Minimum length of homology stretch" (bp) specifies the shortest overlap a BLAST hit can have with an oligo without being excluded from the score calculation.

    Kane et al. (2000) show that for 50mers more than 75-80% homology or stretches over 15 bp can cause cross-hybridization.

    - "Maximum homology" and "(total) Max length cutoff" are used to filter BLAST hits caused by homology to the database version of the transcript in question, or by homology to very closely related paralogs.

    The "Max length cutoff" sets a maximum fraction of the entire length of the query sequence, which can be covered by an accepted BLAST hit exceeding "Maximum homology". Both "Maximum homology" and "(total) Max length cutoff" should be violated for a BLAST hit to be rejected.
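    The interplay of the four cross-hybridization settings can be sketched as a simple filter. This is an interpretation of the text above, not the server's actual code: the class BlastHit, the function hit_counts and all default values are hypothetical, and it is assumed here that a hit must satisfy both minimum criteria to be counted.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    identity_pct: float    # percent identity of the hit
    oligo_overlap_bp: int  # overlap between the hit and the oligo, in bp
    query_coverage: float  # fraction of the whole query covered by the hit


def hit_counts(hit, min_homology=75.0, min_overlap_bp=15,
               max_homology=97.0, max_length_cutoff=0.8):
    """Return True if the hit would contribute to the homology score.

    Illustrative defaults only; they are not OligoWiz's defaults.
    """
    # A hit is rejected as a "self" hit only if BOTH maximum criteria
    # are violated: very high identity AND coverage of most of the query.
    if (hit.identity_pct > max_homology
            and hit.query_coverage > max_length_cutoff):
        return False
    # Assumption: both minimum criteria must be met for the hit to count.
    return (hit.identity_pct >= min_homology
            and hit.oligo_overlap_bp >= min_overlap_bp)
```

    Note that a hit with identity above "Maximum homology" is still counted when it covers only a small fraction of the query, since such a hit is unlikely to be the transcript itself.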

    Position

    Position refers to the position within the query sequence. The score models the reverse transcription process that produces cDNA. There are five exclusive options.

    - "Poly-T primer" specifies a score based on a 3' poly-T primer. The further a probe is placed from the 3' end, the lower the score.

    - "Random primer" specifies a score based on a random primer that anneals at various places along the transcript. The chance of having a primer annealing upstream is larger for transcripts toward the 5' end, still taking into account that the reverse transcriptase can drop off the transcript and that the transcript may be degraded.

    - "5' preference" specifies a score that is one at the 5' end and decreases linearly toward the 3' end.

    - "3' preference" specifies a score that is one at the 3' end and decreases linearly toward the 5' end.

    - "Mid preference" specifies a score that is one in the middle of the sequence and decreases toward the ends.
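    The three preference options can be sketched as simple functions of the position along the sequence. This is a minimal illustration under stated assumptions, not the formulas used by the server: the function name and mode labels are hypothetical, the scores are assumed to reach zero at the far end, and "Mid preference" is assumed to fall off linearly. The poly-T and random primer models are not covered.

```python
def position_score(pos, length, mode):
    """Score a 0-based position counted from the 5' end of a sequence.

    mode is one of "5prime", "3prime" or "mid" (hypothetical labels).
    """
    if length < 1 or not 0 <= pos < length:
        raise ValueError("position outside the sequence")
    if length == 1:
        return 1.0
    frac = pos / (length - 1)  # 0.0 at the 5' end, 1.0 at the 3' end
    if mode == "5prime":
        return 1.0 - frac
    if mode == "3prime":
        return frac
    if mode == "mid":
        # Assumed triangular shape: 1 in the middle, 0 at both ends.
        return 1.0 - abs(2.0 * frac - 1.0)
    raise ValueError(f"unknown mode: {mode}")
```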

    Launching a job

    When the desired parameters have been set, a job may be launched at the OligoWiz server. The job is submitted by selecting a query in the query list and pressing the submit button. The query status can be followed in the status column. Furthermore, clicking the "www ..." button in the "inspect query" column will open a web page with server status for the query. This function may also be used to inspect error messages.
    To prepare a new query, press the "new" button, or press the "clone" button to duplicate the previous query.


    Interpreting the results/ placing probes


    View results

    When the server has returned the result, the text in the status column will change to "Completed - click to view". Double clicking the query row will open a result page. The result page/interface is described below.

    Result interface

    [Figure: the result interface, with elements labeled A to N as described below]

    A: Graphs represent scores (y-axis) along the input sequence (x-axis). The different graphs represent different score types indicated by their color (see H). For all scores, 1 is optimal and 0 is worst.

    B: Color bar representing the Total (weighted) score. The Total score is also represented as a graph (thick red line). On the color bar, red represents total scores higher than 0.8.

    C: Sequence bar. This bar represents the currently selected sequence entry (see I). The bar is color coded if sequence annotation has been supplied. Here green indicates exons and blue indicates introns.

    D: Sequence inspection box / probe movement tool. The box represents the length of a probe and can be moved up and down the sequence bar (see C) using the mouse. The sequence and annotation at the current position are shown below (see E). The eye (J) and hand (K) icons switch between sequence inspection and probe movement.

    E: Sequence and feature annotation of the currently selected oligos, or of the sequence position being inspected (see D).

    F: "Place oligos" button. Opens the dialog for automatic placement of probes.

    G: A selection of options for manual manipulation of the probes.

    H: Score management bar. Score weights and visual representation can be adjusted here.

    I: Table of all sequence entries. One row for each entry in the original FASTA or TAB file. The columns can be sorted by clicking the header.

    J: Eye icon. Puts the selection box (D) in sequence inspection mode.

    K: Hand icon. Puts the selection box (D) in probe adjustment mode.

    L: Visual representation of where the probes are located. Click on a probe to select it.

    M: Export oligos. Opens the dialog where probes can be exported; several file formats and advanced options are available.

    N: Oligo Table. Shows all probes for the currently selected sequence entry (see I).


    Score Weights

    The weight of each score can be adjusted individually (see section H above). For example, this can be used to ignore certain scores completely, or to put more weight on a parameter that is important for a particular study.
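    The effect of the weights can be illustrated with a weighted mean. This is only a sketch: the exact formula OligoWiz uses for the Total score is not given here, so the weighted mean is an assumption, and the function name and score names are hypothetical.

```python
def total_score(scores, weights):
    """Weighted mean of individual scores, each between 0 and 1.

    Assumed formula for illustration; a weight of 0 ignores a score.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one score must have a non-zero weight")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


scores = {"cross_hybridization": 1.0, "delta_tm": 0.5, "position": 0.0}
equal = {"cross_hybridization": 1.0, "delta_tm": 1.0, "position": 1.0}
no_position = {"cross_hybridization": 1.0, "delta_tm": 1.0, "position": 0.0}

print(total_score(scores, equal))        # all three scores count equally
print(total_score(scores, no_position))  # the position score is ignored
```

    Setting the position weight to zero raises the total in this example, because the poor position score no longer pulls the mean down.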


    Place oligos/probes

    [Figure: the probe placement dialog]

    The oligo placement dialog is the main tool for probe placement in OligoWiz 2. The dialog is opened by pressing the "Place oligos..." button (see section F above) or by using the "Oligos" menu.

    General oligo placement

    This box in the placement dialog governs the basic options: the distance between the probes and the Total score cut-off value.
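    To make the roles of the probe count, the distance and the score cut-off concrete, the sketch below places probes greedily by Total score. It is a hypothetical illustration, not the placement algorithm implemented in OligoWiz 2; the function name and parameters are assumptions.

```python
def place_probes(scores, max_probes, min_distance, min_score):
    """Greedily pick probe start positions from a list of Total scores.

    scores[i] is the Total score of the probe starting at position i.
    Illustrative only; OligoWiz's own placement may work differently.
    """
    # Visit candidate positions from the best score to the worst.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    placed = []
    for i in order:
        if len(placed) >= max_probes or scores[i] < min_score:
            break  # enough probes, or all remaining scores are too low
        # Keep the required distance to every probe placed so far.
        if all(abs(i - j) >= min_distance for j in placed):
            placed.append(i)
    return sorted(placed)
```

    Raising min_distance spreads the probes along the sequence, while raising min_score may leave a sequence with fewer probes than requested.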

    Replacement behavior

    This option governs whether all existing probes should be discarded prior to applying the search rules, or whether they should be kept. If old probes are kept, they are evaluated according to the search criteria before new probes are placed, thus disallowing new probes in certain positions.

    Filters

    One of the main new features of OligoWiz 2 is the ability to take sequence feature annotation like intron/exon structure into account. The annotation is in the form of an annotation string of the same length as the DNA sequence. Each position in the annotation string describes the annotation for the corresponding position in the DNA sequence, e.g. "E" for exon and "I" for intron.

    TAB files created using the FeatureExtract server contain the following type of annotation:

    Sequence:   ATGTCTACATATGAAGGTATGTAATG ... TCATCATTAGATTAGAGGAACATGGAATACAACAAAACT ... ATTGGGGTATGTACGGTTAA
    Annotation: (EEEEEEEEEEEEEE)DIIIIIIIII ... IIIIIIIIIIIIIIA(EEEEEEEEEEEEEEEEEEEEEEE ... EEEEEEEEEEEEEEEEEEE)
    

    (EEEE) blocks = exons. DIIIIA blocks = introns. Some positions have been skipped ( ... ) for readability.
    For details please refer to the FeatureExtract server: output format.

    Using the filters, it is possible to restrict the selection of probes to regions of the input genes that have a certain type of annotation. By default the search is restricted to exonic regions, i.e. the regions annotated as (EE .. EE).

    Custom filters can be defined using regular expressions (advanced string matching).

    About regular expressions

    (Perl style)
    A regular expression is an advanced description of a text string to search for. Several closely related flavors exist. Here we focus on regular expressions as interpreted by the programming language Perl, since this is the most generally accepted definition.

    In its simplest form, a regular expression is just a word or phrase, for example "human". If an alternative word is just as good, you may use "human|Homo sapiens", where the sign "|" (the pipe character) means OR. Several alternatives can be listed, separated by "|" characters.

    Sometimes however, there are too many alternatives, or the alternatives may not be known, and using a wildcard becomes handy. The wildcard in regular expressions is "." (dot).

    Sometimes the number of times a certain expression is repeated is uncertain. In this case the "*" (star) sign is used to indicate zero or more repeats. For example "TA*" will match all of the following: "T", "TA" and "TAAAA".

    If an expression is expected to be repeated a number of times within an interval, putting "{n,m}" after the expression can be used to specify this: n is the minimum number of repeats and m the maximum. Alternatively, "{n}" means exactly n times, and "{n,}" means at least n times.

    If more than a single character is to be repeated in the expression it must be enclosed by "(" and ")". For example "(TA){2,4}" will match "TATA", "TATATA" and "TATATATA".

    Regular expressions have a series of characters with special meanings, including "(", ")", "[", "]", ".", "*", "+", "{", "}", "^", "|" and "\". However, it may be necessary to search for these characters in the sequence annotation, for example in order to target the exon start and end characters "(" and ")". To do so, the special meaning of these characters must be bypassed. This is done by putting a "\" (backslash) in front of the special character. Example: "\(|\)" will search for "(" or ")".
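    The constructs described in this section can be tried out directly. The sketch below uses Python's re module as a stand-in for Perl; for the simple patterns shown here the two behave the same way. The helper name matches_whole is hypothetical.

```python
import re


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# Alternatives separated by the pipe character.
print(matches_whole(r"human|Homo sapiens", "Homo sapiens"))

# The dot is a wildcard for any single character.
print(matches_whole(r"T.G", "TAG"))

# "*" repeats the preceding character zero or more times.
print([matches_whole(r"TA*", s) for s in ("T", "TA", "TAAAA")])

# Parentheses group several characters; {2,4} repeats the group 2 to 4 times.
print([matches_whole(r"(TA){2,4}", s) for s in ("TA", "TATA", "TATATATA")])

# A backslash removes the special meaning of "(" and ")".
print(re.findall(r"\(|\)", "(EEEE)DIIA"))
```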

    Oligo or Regional search unit

    When a string search is done, the program that performs the search will return "TRUE" when it finds a string that matches the regular expression. Therefore, it is important to note which target units the search space is broken into. If the target unit is a line in a file, one will be able to retrieve the line(s) that had a match to the regular expression.

    In OligoWiz 2 two target units are available: oligo and regional. In the "Oligo Include"/"Oligo Exclude" fields the unit is the oligo. The regular expression written here will include or exclude oligos based on the annotation of each oligo. In the "Region Include"/"Region Exclude" fields each position in the input sequence is evaluated. The subsequent oligo selection is then done among oligos for which all bases have been evaluated as included.

    Both criteria can be used at the same time. In this case an oligo is required to meet both criteria.
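    The difference between the two search units can be sketched in code. The example below is an interpretation of the description above, not OligoWiz's implementation: the function names are hypothetical, and Python's re module stands in for Perl-style regular expressions.

```python
import re


def region_mask(annotation, include=None, exclude=None):
    """Evaluate every position of the annotation string (regional unit)."""
    # Without an include pattern, every position starts out as included.
    mask = [include is None] * len(annotation)
    if include:
        for m in re.finditer(include, annotation):
            for i in range(m.start(), m.end()):
                mask[i] = True
    if exclude:
        for m in re.finditer(exclude, annotation):
            for i in range(m.start(), m.end()):
                mask[i] = False
    return mask


def oligo_allowed(annotation, start, length, mask,
                  oligo_include=None, oligo_exclude=None):
    """Decide whether the oligo starting at 'start' passes both criteria."""
    window = annotation[start:start + length]
    if len(window) < length:
        return False  # the oligo would run past the end of the sequence
    # Regional unit: every base of the oligo must be in an included region.
    if not all(mask[start:start + length]):
        return False
    # Oligo unit: the patterns are searched within the oligo's own annotation.
    if oligo_include and not re.search(oligo_include, window):
        return False
    if oligo_exclude and re.search(oligo_exclude, window):
        return False
    return True
```

    With a region include pattern that matches exon blocks, an oligo that extends into an intron is rejected because at least one of its bases lies outside the included region.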

    A more detailed description of regular expressions can be found here.

    Examples of probe placement

    Example 1: Targeting exons

    Region include: \(E+\)
    

    Example 2: Targeting introns

    Region include: DI+A
    

    Example 3: Targeting long exons

    Region include: \(E{200,}\)
    
    Targets exons of length 202 or more (200 + exon start and end)

    Example 4: Exons near splice junctions

    Region include: E{1,100}\)D|A\(E{1,100}
    Oligo exclude:  D|A
    
    Targets exon regions up to 100 bp upstream of donor sites and up to 100 bp downstream of acceptor sites

    Example 5: Cover Exon/Intron junction

    Oligo include: E{4,}\)DI{4,}|I{4,}A\(E{4,}
    
    Ensures that at least 5 bp of exon and 5 bp of intron are both included.
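    The region patterns from Examples 1 to 4 can be tested on a toy annotation string to see which positions they cover. The sketch below is illustrative only: the annotation is invented (a short exon, a short intron and a 250 bp exon), the helper name matched_spans is hypothetical, and Python's re module stands in for Perl-style regular expressions.

```python
import re


def matched_spans(pattern, annotation):
    """Return the (start, end) span of every match; end is exclusive."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, annotation)]


# Toy annotation: exon of 8 E's, intron D + 6 I's + A, exon of 250 E's.
annotation = "(" + "E" * 8 + ")" + "D" + "I" * 6 + "A" + "(" + "E" * 250 + ")"

patterns = {
    "Example 1, exons": r"\(E+\)",
    "Example 2, introns": r"DI+A",
    "Example 3, long exons": r"\(E{200,}\)",
    "Example 4, near junctions": r"E{1,100}\)D|A\(E{1,100}",
}

for label, pattern in patterns.items():
    print(label, matched_spans(pattern, annotation))
```

    In this toy case the short first exon is matched by Example 1 but not by Example 3, and Example 4 covers only the first 100 exon positions downstream of the acceptor site.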

    Export oligos/ probes

    Export dialog

    When probes have been selected, they may be exported as either sense or antisense probes, with or without mismatch probes, and in either tab or FASTA format. By default a materials and methods section is auto-generated and included. This section describes all the parameters used.
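    The sense/antisense and tab/FASTA choices can be illustrated as follows. This is a minimal sketch, not the exporter in OligoWiz 2: the function names are hypothetical, the real export files contain more information (such as the materials and methods section), and mismatch probes are not covered because their design is not described here. The antisense probe is taken to be the reverse complement of the sense sequence.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def export_probes(probes, fmt="fasta", strand="sense"):
    """Format (name, sequence) pairs as FASTA or tab-delimited text.

    Simplified layout for illustration; not OligoWiz's export format.
    """
    lines = []
    for name, seq in probes:
        out = antisense(seq) if strand == "antisense" else seq
        if fmt == "fasta":
            lines.append(f">{name}\n{out}")
        elif fmt == "tab":
            lines.append(f"{name}\t{out}")
        else:
            raise ValueError(f"unknown format: {fmt}")
    return "\n".join(lines) + "\n"
```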

    The export dialog can be opened by clicking on the "Export oligos ..." button in the main interface (see M above) or by using the menu shortcut: File -> Export oligos ...


