MaxAlign is a program that optimizes an alignment before it is used in downstream analyses. Specifically, it maximizes the number of nucleotide (or amino acid) symbols that are present in gap-free columns (the alignment area) by selecting the optimal subset of sequences to exclude from the alignment. MaxAlign can be used prior to phylogenetic and other bioinformatic analyses, as well as in other situations where this form of alignment improvement is useful.

Usage instructions

1. Specify the input sequences

All sequence headers/names MUST be different. All input sequences must be in one-letter amino acid or nucleotide code. The suggested alphabet (not case sensitive) is as follows:

A C D E F G H I K L M N P Q R S T U V W Y -

Gaps should be represented only by "-". Other symbols, e.g. B, J, X, will be treated as nucleotides / amino acids.

Preserving selected sequences

You might want to keep some sequences in your alignment, even at the cost of excluding some sites. You can do that by marking those sequences with a plus sign, "+", before their name, as in the example below:

>+Sequence_1
>Sequence_2

Sequence_1 above will always be included in the output of MaxAlign, while the inclusion of Sequence_2 will be evaluated. Make sure that your sequence names do not start with a plus sign, "+", unless you want them to be marked.

The MaxAlign web server is freely available at http://www.cbs.dtu.dk/services/MaxAlign, where supplementary information can also be found. The program is also freely available as a stand-alone Perl package.

WEB SERVICE OPERATION

This Web Service is fully synchronous. There is one operation:

1. maxalign

Input: The following parameters and data:

* 'alignment' [containing multiple 'sequence' elements]
  * 'sequence'
    * 'id' Unique identifier for the sequence
    * 'comment' Optional comment
    * 'seq' Protein or nucleotide sequences, with unique identifiers (mandatory).
The sequences must be written using the one-letter amino acid/nucleotide code: `acdefghiklmnpqrstvwy' or `ACDEFGHIKLMNPQRSTVWY'. Gaps are denoted with '-' (dash).

Output: The following parameters and data:

* 'resultalignment' [containing multiple 'sequence' elements]
  * 'sequence'
    * 'id' Unique identifier for the sequence
    * 'comment' Optional comment
    * 'seq' Protein sequences, with unique identifiers (mandatory)
* 'originalsequencenumber' Number of sequences in the input alignment
* 'originalcolumnnumber' Number of columns in the input alignment
* 'originalungapcolumnnumber' Number of columns with no gaps in the input alignment
* 'originalalignmentarea' Alignment area of the input alignment
* 'resultsequencenumber' Number of sequences in the output alignment; appears only if the alignment can be improved
* 'resultungapcolumnnumber' Number of columns with no gaps in the output alignment; appears only if the alignment can be improved
* 'resultalignmentarea' Alignment area of the output alignment

CONTACT

Technical questions concerning the Web Service should go to Karunakar Bayyapu, karun@cbs.dtu.dk or Kristoffer Rapacki, rapacki@cbs.dtu.dk.
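
The alignment area and the input rules described in this document (unique sequence names, "-" as the only gap symbol, "+" to mark sequences that must be kept) can be illustrated with a short Python sketch. This is not MaxAlign's own code and it does not implement the subset selection itself; it only computes the quantities reported by the service for a given set of sequences. All function names are hypothetical, and the check that all sequences have equal length is an assumption about what a valid alignment looks like.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (sequence name, aligned sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (name, sequence) pairs, preserving order."""
    records: List[Record] = []
    name = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_input(records: List[Record]) -> None:
    """Enforce the stated rule that all names differ.

    The equal-length check is an assumption of this sketch.
    """
    names = [name for name, _ in records]
    if len(set(names)) != len(names):
        raise ValueError("all sequence headers/names must be different")
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("aligned sequences must have the same length")


def preserved_names(records: List[Record]) -> List[str]:
    """Names marked with a leading '+', returned without the marker."""
    return [name[1:] for name, _ in records if name.startswith("+")]


def gap_free_columns(records: List[Record]) -> int:
    """Count columns in which no sequence has the gap symbol '-'."""
    seqs = [seq for _, seq in records]
    if not seqs:
        return 0
    return sum(1 for column in zip(*seqs) if "-" not in column)


def alignment_area(records: List[Record]) -> int:
    """Symbols in gap-free columns: gap-free columns times sequences."""
    return gap_free_columns(records) * len(records)


# Hypothetical three-sequence alignment; Sequence_1 is marked as preserved.
EXAMPLE = """>+Sequence_1
ACD-EF
>Sequence_2
ACDGEF
>Sequence_3
A--GEF
"""

records = parse_fasta(EXAMPLE)
check_input(records)

# Dropping Sequence_3 removes two gapped columns from consideration.
reduced = [r for r in records if r[0] != "Sequence_3"]
```

In this example the full alignment has 3 gap-free columns and an alignment area of 9, while the alignment without Sequence_3 has 5 gap-free columns and an alignment area of 10. Excluding Sequence_3 is therefore the kind of improvement the alignment area is meant to capture.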