
MaxAlign 1.1.ws1

Maximizing alignment area


WSDL MaxAlign/MaxAlign_1_1_ws1.wsdl
Schema definitions ../common/ws_common_1_0b.xsd
ws_maxalign_1_1_ws0.xsd

We recommend that first-time users load the WSDL file above into SoapUI and explore the Web Service operations in that environment. SoapUI is a desktop application for inspecting, invoking and developing Web Services over HTTP, and for functional, load and compliance testing of them. It can be downloaded free of charge from http://www.soapui.org/.

Other versions and implementations

Version   Last updated
1.1.ws1   2012-04-19 (this version, most recent)
1.1.ws0   2008-09-17

Examples of client side scripts using the service

Filename                   Type  Compatibility  Author
    Description
maxalign_ws.pl (2.5 KB)    Perl  1.1 ws0        Peter Wad Sackett
    This is a template example script. It reads an alignment in FASTA format and produces formatted output from the web service.
alignment.fsa (2.6 KB)
xml-compile.pl (3.2 KB)    Perl  NA             Peter Fischer Hallin
    Helper scripts used to initiate XML::Compile's proxies (WSDL+XSD).
test_maxalign.pl (5.1 KB)  Perl  1.1 ws0        Edita Bartaseviciute
    This script runs the MaxAlign 1.1.ws0 Web Service. It requires no input and is intended for testing in the EMBRACE WS Registry.
example.fsa (1.3 KB)
maxalign.pl (4.0 KB)       Perl  1.1 ws0        Edita Bartaseviciute
    This is a template example script. It reads an alignment in FASTA format and produces formatted output from the web service.

Documentation

    MaxAlign is a program that optimizes an alignment prior to further analysis. Specifically, it maximizes
    the number of nucleotide (or amino acid) symbols that are present in gap-free columns (the alignment
    area) by selecting the optimal subset of sequences to exclude from the alignment. MaxAlign can be used
    prior to phylogenetic and other bioinformatic analyses, as well as in other situations where this form
    of alignment improvement is useful.
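
To make the notion of alignment area concrete, here is a minimal Python sketch. It counts gap-free columns, multiplies by the number of sequences, and then tries every subset of sequences to find the one with the largest area. The exhaustive search is only an illustration for tiny inputs and is not claimed to be the method MaxAlign itself uses; the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def gap_free_columns(seqs):
    """Return the indices of columns that contain no '-' in any sequence."""
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [i for i in range(length) if all(s[i] != "-" for s in seqs)]


def alignment_area(seqs):
    """Number of symbols in gap-free columns: gap-free columns x sequences."""
    return len(gap_free_columns(seqs)) * len(seqs)


def best_subset(alignment):
    """Exhaustively search for the subset of sequences with the largest area.

    Illustration only: the number of subsets grows exponentially, so this
    is usable for a handful of sequences at most.
    """
    names = list(alignment)
    best_names = names
    best_area = alignment_area([alignment[n] for n in names])
    for size in range(1, len(names)):
        for subset in combinations(names, size):
            area = alignment_area([alignment[n] for n in subset])
            if area > best_area:  # keep the full set on ties
                best_names, best_area = list(subset), area
    return best_names, best_area


# Hypothetical toy alignment: seq3 introduces gaps in several columns.
alignment = {
    "seq1": "ACGT-ACG",
    "seq2": "ACGTTACG",
    "seq3": "A--T-AC-",
}
full_area = alignment_area(list(alignment.values()))
kept, kept_area = best_subset(alignment)
print(full_area, kept, kept_area)
```

In this toy case, excluding seq3 leaves fewer sequences but more gap-free columns, so the area grows.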

    Usage instructions

    1. Specify the input sequences
    
    All sequence headers/names MUST be different.
    All input sequences must be in one-letter amino acid or nucleotide code. The suggested alphabet
    (not case sensitive) is as follows: A C D E F G H I K L M N P Q R S T U V W Y -
    Gaps must be represented only by "-". Other symbols, e.g. B, J, X, are treated as nucleotides
    / amino acids, not as gaps.
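
The input rules above can be checked locally before submitting. The following Python sketch reports duplicate names and symbols outside the suggested alphabet. The function name `check_input` and the message wording are hypothetical, and the service may apply its own checks.

```python
# Suggested alphabet from the text above (not case sensitive), plus the gap symbol.
SUGGESTED = set("ACDEFGHIKLMNPQRSTUVWY-")


def check_input(records):
    """Check a list of (name, sequence) pairs and return a list of messages.

    Duplicate names violate the rule that all names must differ. Symbols
    outside the suggested alphabet are only reported as a note, because
    they are treated as ordinary nucleotides / amino acids.
    """
    problems = []
    seen = set()
    for name, seq in records:
        if name in seen:
            problems.append(f"duplicate sequence name: {name}")
        seen.add(name)
        unusual = sorted(set(seq.upper()) - SUGGESTED)
        if unusual:
            problems.append(
                f"{name}: symbols outside the suggested alphabet: {''.join(unusual)}"
            )
    return problems


messages = check_input([("a", "ACG-T"), ("a", "acx.t")])
for message in messages:
    print(message)
```

A symbol such as "." is worth catching this way: it would count as a residue, so a column containing it would not be seen as gapped.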

    Preserving selected sequences
    You might want to keep some sequences in your alignment, even at the cost of excluding some sites.
    You can do that by marking those sequences with a plus sign, "+", before their name, as in the example
    below:

    >+Sequence_1
    >Sequence_2

    Sequence_1 above will always be included in the output of MaxAlign, while the inclusion of Sequence_2
    will be evaluated. Make sure your sequence names do not start with a plus sign, "+", unless you
    want them to be marked.
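
As an illustration of the "+" convention, the sketch below reads FASTA text and reports which sequences are marked for preservation. It only inspects the headers locally; the parser and the function names `parse_fasta` and `_finish` are hypothetical helpers, not part of MaxAlign.

```python
def _finish(name, chunks):
    """Build a (name, sequence, keep) tuple; keep is True for '+' names."""
    keep = name.startswith("+")
    return (name[1:] if keep else name, "".join(chunks), keep)


def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence, keep) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append(_finish(name, chunks))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append(_finish(name, chunks))
    return records


fasta_text = ">+Sequence_1\nACG-\n>Sequence_2\nAC\nGT\n"
records = parse_fasta(fasta_text)
for name, seq, keep in records:
    print(name, seq, "always kept" if keep else "evaluated")
```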

    The MaxAlign web server is freely available at http://www.cbs.dtu.dk/services/MaxAlign where
    supplementary information can also be found. The program is also freely available as a stand-alone
    Perl package.

    
    WEB SERVICE OPERATION

    This Web Service is fully synchronous. There is one operation:

    1. maxalign

      Input:  The following parameters and data:

              * 'alignment'   [containing multiple 'sequence' elements]
                * 'sequence'
                    * 'id'         Unique identifier for the sequence
                    * 'comment'    Optional comment
                    * 'seq'        Protein or nucleotide sequences, with unique identifiers
                                   (mandatory). The sequences must be written using the
                                   one-letter amino acid/nucleotide code:
                                   'acdefghiklmnpqrstvwy' or 'ACDEFGHIKLMNPQRSTVWY'.
                                   Gaps are denoted with '-' (dash).
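
The nesting of the input elements can be sketched with Python's standard `xml.etree.ElementTree`. This builds only a bare fragment with the element names listed above. The SOAP envelope, the namespaces and the exact element order are defined by the WSDL and XSD files and are not reproduced here, so treat the layout as an assumption; `build_alignment` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_alignment(records):
    """Build an 'alignment' element from (id, comment, seq) tuples.

    Assumption: plain element names without namespaces; the real request
    must follow the service's WSDL and schema definitions.
    """
    alignment = ET.Element("alignment")
    for seq_id, comment, seq in records:
        sequence = ET.SubElement(alignment, "sequence")
        ET.SubElement(sequence, "id").text = seq_id
        if comment:  # 'comment' is optional, so omit it when empty
            ET.SubElement(sequence, "comment").text = comment
        ET.SubElement(sequence, "seq").text = seq
    return alignment


fragment = build_alignment([
    ("Sequence_1", "", "ACG-T"),
    ("Sequence_2", "second sequence", "ACGTT"),
])
xml_text = ET.tostring(fragment, encoding="unicode")
print(xml_text)
```

In practice a SOAP client library generated from the WSDL would build this structure for you; the fragment only shows how 'id', 'comment' and 'seq' sit inside each 'sequence'.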

      Output: The following parameters and data:

            * 'resultalignment'   [containing multiple 'sequence' elements]
              * 'sequence'
                  * 'id'         Unique identifier for the sequence
                  * 'comment'    Optional comment
                  * 'seq'        Protein sequences, with unique identifiers (mandatory)

            * 'originalsequencenumber'     Number of sequences in the input alignment
            * 'originalcolumnnumber'       Number of columns in the input alignment
            * 'originalungapcolumnnumber'  Number of columns with no gaps in the input alignment
            * 'originalalignmentarea'      Alignment area of the input alignment
            * 'resultsequencenumber'       Number of sequences in the output alignment;
                                           appears only if the alignment can be improved
            * 'resultungapcolumnnumber'    Number of columns with no gaps in the output alignment;
                                           appears only if the alignment can be improved
            * 'resultalignmentarea'        Alignment area of the output alignment;
                                           appears only if the alignment can be improved
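
To show how the numeric output fields relate to each other, the sketch below recomputes them locally for an input alignment and a candidate result. The dictionary keys reuse the field names listed above; `summarize` and `ungapped_columns` are hypothetical helpers, and adding all three result fields only when the area improves is an assumption based on the field descriptions.

```python
def ungapped_columns(seqs):
    """Count columns without '-' (assumes all sequences have equal length)."""
    return sum(1 for column in zip(*seqs) if "-" not in column)


def summarize(original, result):
    """Recompute the numeric output fields for two lists of aligned sequences."""
    stats = {
        "originalsequencenumber": len(original),
        "originalcolumnnumber": len(original[0]) if original else 0,
        "originalungapcolumnnumber": ungapped_columns(original),
    }
    stats["originalalignmentarea"] = (
        stats["originalungapcolumnnumber"] * len(original)
    )
    result_area = ungapped_columns(result) * len(result)
    # Assumption: result fields are reported only when the area improves.
    if result_area > stats["originalalignmentarea"]:
        stats["resultsequencenumber"] = len(result)
        stats["resultungapcolumnnumber"] = ungapped_columns(result)
        stats["resultalignmentarea"] = result_area
    return stats


original = ["ACGT-ACG", "ACGTTACG", "A--T-AC-"]
result = ["ACGT-ACG", "ACGTTACG"]  # hypothetical result: third sequence excluded
stats = summarize(original, result)
for key, value in stats.items():
    print(key, value)
```

Here the alignment area is the number of ungapped columns multiplied by the number of sequences, which is why dropping one sequence can still increase it.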

    CONTACT

    Technical questions concerning the Web Service should go to Karunakar Bayyapu, karun@cbs.dtu.dk 
    or Kristoffer Rapacki, rapacki@cbs.dtu.dk.