Microbiology

Automated Annotation of Bacterial 16S rRNA Gene Sequences to the Species Level using the SmartGene Centroid Approach

S. Emler and G. Bloemberg. Poster presented at the Medical Biodefense Conference 2013 in Munich, Germany. October 22-25, 2013.

Citation: “... almost 5000 ... sequences were searched (BLAST) using the SmartGene Batch Processing technology against the Centroid Database, which is composed of one representative sequence per species ...; Centroid-unique (CU) matches were determined and considered valid, if the distance from the Centroid sequence was < median distance of all variants for this species .... All sequences were then further searched against the SmartGene Eubacteria Database, containing ... quality-filtered species variants annotated with confidence values (0-100%), with regard to their proximity to their respective centroids. ...Species assignment was successfully achieved in >90% of cases, exceeding what is usually achieved with methods used e.g. for microbiome annotation. Substantial improvement is derived from the two step approach, searching the Centroid database for accurate species discrimination, and the Eubacteria Database for closely matching species variants. ...”
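The validity rule for Centroid-unique matches quoted above can be illustrated with a minimal sketch. It is not SmartGene's implementation: the function name `is_valid_centroid_match` and the distance values are hypothetical, and the sketch assumes that distances from the centroid are already available as plain numbers for the query and for each known variant of the species.

```python
from statistics import median


def is_valid_centroid_match(query_distance, variant_distances):
    """Return True if the query is closer to the species centroid
    than the median distance of that species' known variants.

    Hypothetical helper: the poster states only the rule
    (distance < median distance of all variants), not how
    distances are computed.
    """
    return query_distance < median(variant_distances)


# Made-up distances of known variants from their species centroid.
variant_distances = [0.002, 0.004, 0.005, 0.009, 0.012]

# The median of the five values is 0.005, so a query at 0.003 passes
# and a query at 0.010 does not.
print(is_valid_centroid_match(0.003, variant_distances))
print(is_valid_centroid_match(0.010, variant_distances))
```

Under this reading, a query that fails the check would not be discarded; as the citation describes, all sequences are also searched against the Eubacteria Database of species variants in the second step.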