deFUME 1.0 - Dynamic Exploration of Functional Metagenomics Sequencing Data
deFUME is an easy-to-use web-server for trimming, assembly and functional annotation of Sanger sequencing data derived from functional selection experiments. As input, the user simply provides raw Sanger sequencing chromatograms or pre-assembled sequencing projects. Upon submission, the web-server integrates multiple analysis steps into a single workflow: read trimming, assembly of reads into contigs, open reading frame prediction, BLAST search and enrichment with available metadata. As output, deFUME delivers a comprehensive sequence overview that includes functional annotations and sequence statistics. The following sections provide instructions for using the deFUME web-server.
As input, you can either upload raw chromatograms in ab1 file format (.ab1) or provide pre-assembled contigs as plain sequence in FASTA format. In the latter case, deFUME skips chromatogram trimming and assembly, and the submitted sequence is passed directly to functional annotation by BLAST and InterPro.
Input options for raw chromatogram reads
Chromatograms (.ab1 format) must be compressed into an archive file (zip or tar). In the deFUME interface, select the zip or tar file on your local disk and upload it. The archive may contain multiple .ab1 files. To compress multiple chromatogram (.ab1) files into one archive, you can use the following command: tar -cvzf YourCompressedFiles.tar.gz *.ab1
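For users who prefer a script to the command line, the same packaging step can be sketched in Python with the standard-library tarfile module. This is a minimal sketch, not part of deFUME: the function name pack_chromatograms and the directory layout are hypothetical, and only the .ab1 extension and the gzip-compressed tar archive are taken from the text above.

```python
import tarfile
from pathlib import Path


def pack_chromatograms(source_dir, archive_path):
    """Bundle every .ab1 file in source_dir into one gzip-compressed tar archive.

    Hypothetical helper. Returns the sorted list of file names that were added.
    """
    source = Path(source_dir)
    ab1_files = sorted(source.glob("*.ab1"))
    if not ab1_files:
        raise FileNotFoundError(f"no .ab1 files found in {source}")

    # Mode "w:gz" writes a gzip-compressed tar file, as `tar -cvzf` does.
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in ab1_files:
            # arcname stores only the file name, without the directory prefix.
            archive.add(path, arcname=path.name)
    return [path.name for path in ab1_files]
```

A call such as pack_chromatograms("reads", "YourCompressedFiles.tar.gz") would collect all .ab1 files from a local directory named reads (an assumed name) into an archive that can then be selected in the upload form.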
Input options for pre-assembled projects
An alternative to raw sequencing data is pre-assembled sequence: the user simply loads or copy-pastes the sequence in FASTA format into the specified input window. When this option is chosen, deFUME skips the phred assembly step. This option is useful for a variety of functional annotation analyses and extends the input to sequencing techniques other than Sanger sequencing, such as next-generation sequencing (NGS).
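Before pasting pre-assembled sequence into the input window, it can help to check locally that the text is well-formed FASTA. The sketch below is one way to do that and is not the validation deFUME itself performs. The function names are hypothetical, and the accepted alphabet (A, C, G, T, N) is an assumption made for illustration.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_nucleotide_fasta(text, alphabet="ACGTN"):
    """Return a list of problems found; an empty list means none were found.

    The default alphabet is an assumption, not a documented deFUME requirement.
    """
    problems = []
    records = parse_fasta(text)
    if not records:
        problems.append("no FASTA records found")
    for header, sequence in records:
        if not sequence:
            problems.append(f"record '{header}' has no sequence")
        unexpected = sorted(set(sequence.upper()) - set(alphabet))
        if unexpected:
            problems.append(
                f"record '{header}' contains unexpected characters: {''.join(unexpected)}"
            )
    return problems
```

Running check_nucleotide_fasta on the text to be submitted and getting an empty list back indicates that every record has a header and a sequence made up only of the assumed characters.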
Recommended input options
Specification of sequencing primer directionality
As a useful option, deFUME allows the user to specify the directionality of the primers used for Sanger sequencing. When the user specifies an identifier that matches part of the name of the chromatograms generated with a forward primer, this directionality will be visible in the output. Example: if a user's chromatograms are named FORW_01.ab1, FORW_02.ab1, FORW_03.ab1, … REV_01.ab1, REV_02.ab1, REV_03.ab1, …, etc., then specifying a “Forward primer identifier” of “FORW_” informs deFUME that every chromatogram with this identifier in its name was generated with a forward primer. This produces a more intuitive visualization of the output. If the user enters an identifier that does not match any chromatogram name or leaves the field empty, deFUME will choose the directionality in the output at random.
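To preview which chromatograms a given identifier would match before submitting, the matching rule described above can be sketched as a simple substring test. This is an illustration only: the function name is hypothetical, and treating the match as case-sensitive is an assumption, since the text only states that the identifier must match part of the file name.

```python
def split_by_forward_identifier(file_names, forward_id):
    """Split chromatogram file names into (forward, unmatched) lists.

    Hypothetical helper: a name counts as "forward" when forward_id occurs
    anywhere in it (case-sensitive, which is an assumption).
    """
    forward, unmatched = [], []
    for name in file_names:
        # An empty identifier matches nothing, mirroring an empty input field.
        if forward_id and forward_id in name:
            forward.append(name)
        else:
            unmatched.append(name)
    return forward, unmatched
```

If the forward list comes back empty for the identifier you plan to enter, the identifier does not match any file name, which is the case in which deFUME picks the displayed directionality at random.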
Email for InterPro queries
Assembled open reading frames are further annotated using the InterPro server. In order to use this service at EMBL-EBI a valid email address is required.
Advanced input options
Trimming of cloning vector sequences
For sequencing data generated from a cloning vector, providing the vector sequence in this field (in FASTA format) enables deFUME to remove vector sequence from the user's sequencing data. This is done prior to assembly and improves the accuracy of the assembly process.
Base calling error rate
Accuracy of base calls expressed as an error probability. The default probability is 0.01, which corresponds to a base call accuracy of 99% (1 error in 100 bases). Read more in the Wiki article or in the phred accuracy assessment and phred error probabilities.
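The relation between an error probability P and the phred quality score Q is Q = -10 * log10(P). The sketch below shows the conversion in both directions using this standard definition; the function names are hypothetical and are not part of deFUME.

```python
import math


def error_probability_to_phred(p_error):
    """Convert a base-calling error probability P into a phred score Q = -10 * log10(P)."""
    if not 0 < p_error <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10 * math.log10(p_error)


def phred_to_error_probability(quality):
    """Convert a phred score Q back into an error probability P = 10 ** (-Q / 10)."""
    return 10 ** (-quality / 10)
```

Under this definition, the error probability of 0.01 mentioned above corresponds to a phred score of 20, and a stricter setting of 0.001 corresponds to a score of 30.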
The Visual Output Page
After the job is submitted, the output page loads once processing is complete. While the job is processing, you can enter an email address to be notified when it completes.
The deFUME output page is a table with one assembled contig per row. It includes a visual, interactive overview of each assembled contig showing chromatogram areas, predicted open reading frames, BLAST results and InterPro hits. deFUME uses the D3.js, jQuery and jqGrid libraries to render the data.
Example output page
A small sample set containing a few assembled contigs is provided so that the user can try out the different filter capabilities of deFUME. To view the results of this sample set directly, click here.
Expand a contig
Expand an ORF
Associated GO Terms
Exploring your data
Left menu box
The menu on the right side contains additional filtering options. The following visual cues can be turned on and off to ease browsing.
Furthermore, the E-value cutoff can be adjusted interactively so that only hits with an E-value below this cutoff are shown.
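The effect of the cutoff can be reproduced on exported results with a one-line filter. The sketch below assumes each hit is a dictionary with an "evalue" key, which is a hypothetical record layout chosen for illustration and not deFUME's export format.

```python
def filter_hits_by_evalue(hits, cutoff):
    """Keep only the hits whose E-value lies strictly below the cutoff.

    Each hit is assumed to be a dict with an "evalue" key (hypothetical layout).
    """
    return [hit for hit in hits if hit["evalue"] < cutoff]
```

Lowering the cutoff keeps only the more significant hits, because a smaller E-value means fewer matches of that quality are expected by chance.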
To inspect the individual GO terms associated with an ORF, click the cell containing the 'Associated GO Terms' at the ORF level.
Exporting your data
Right menu box
On Contig and ORF level
Submit the job
Click on the "Submit" button. The status of your job (either 'queued' or 'running') will be displayed and continuously updated until the job terminates and the server output appears in the browser window. At any time during the wait you may enter your e-mail address and simply leave the window. Your job will continue, and you will be notified via e-mail when it has completed. The e-mail message will contain the URL under which the results are stored; they will remain on the server for 24 hours for you to collect.
deFUME is compatible with the major browsers, but be sure to use the latest version. deFUME was successfully tested with Chrome 39.0.2171.95, Firefox 4.0.5, Safari 6.1.3 and Internet Explorer 11.0.15. It will not, for example, render properly on Internet Explorer 10.
Please read the CBS access policies for information about limitations on the daily number of submissions.