Genome comparison of 20 Escherichia coli genomes

Helge Tippmann1, Henni Wallinbrock1, Trudy M. Wassenaar1,2, David W. Ussery1

1Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark, and 2Molecular Microbiology and Genomics Consultants, Zotzenheim, Germany.

There are currently 20 completely sequenced genomes of E. coli publicly available. These include 3 apathogenic strains, 4 enteropathogenic (EPEC), 3 enterotoxigenic (ETEC), 2 uropathogenic (UPEC), 2 enterohemorrhagic (EHEC), 2 enteroaggregative (EAEC), one enteroinvasive (EIEC), and 3 others. The complete genome sequences (chromosomes and plasmids) of these 20 strains were compared for gene content and gene synteny. Phylogenetic trees based on rRNA genes and MLST markers were constructed. Total gene content and the numbers of rRNA and tRNA genes were compared and, where necessary, corrected, as published annotations were not always accurate. Known virulence genes were related to the pathotype. Every sequenced strain contains at least 100 genes that are absent from all other sequenced strains (except for one of the two O157 and one of the two K-12 genomes), indicating a large degree of diversity within the species. Neither genome size nor the number of genes was conserved within a given pathotype. Comparative studies like these lead to insights into the genetic diversity and phylogenetic relationships of the various pathotypes within a species.
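
The strain-specific gene count mentioned above can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to a set of gene-family identifiers (for example by clustering predicted proteins), a step the abstract does not describe. The function name, strain labels, and gene identifiers below are hypothetical toy values, not data or methods from this study.

```python
from typing import Dict, Set


def strain_specific_genes(gene_sets: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each strain, return the gene families found in no other strain."""
    unique: Dict[str, Set[str]] = {}
    for strain, genes in gene_sets.items():
        # Union of the gene families of every other strain
        others = set().union(*(g for s, g in gene_sets.items() if s != strain))
        unique[strain] = genes - others
    return unique


# Hypothetical toy input: three strains, six gene families
toy = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g5", "g6"},
}

result = strain_specific_genes(toy)

# Gene families shared by all strains (a simple "core genome")
core = set.intersection(*toy.values())

for strain in sorted(result):
    print(strain, len(result[strain]), sorted(result[strain]))
print("core", sorted(core))
```

With real data, the outcome depends strongly on how gene families are defined (similarity and coverage thresholds), so counts such as "at least 100 strain-specific genes" are only comparable under a fixed clustering procedure.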


Send comments to helge@cbs.dtu.dk