BROCC
eagle-i ID
http://eagle-i.itmat.upenn.edu/i/0000013a-b860-843e-d69a-d90d80000000
Resource Type
Properties
-
-
Resource Description
-
BROCC is a flexible software pipeline for classifying single-celled eukaryotes in microbiome samples. It interfaces easily with the popular QIIME pipeline.
BROCC classifies amplicons using BLAST searches against large and relatively uncurated databases. BROCC uses blastn, but output from other BLAST programs, such as blastx, can be substituted. BROCC first filters the input BLAST hits for sufficient coverage of, and identity to, the query sequence.
If too many of a query sequence's hits fall below the preset coverage threshold (default 70%), or if BLAST did not return a hit, the query is not classified and a message is written to the output file. BROCC then determines the identity and taxonomic hierarchy of each high-quality hit using a local, user-installed SQL database and NCBI’s EFetch tool.
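The filtering step described above can be sketched roughly as follows. This is an illustrative sketch only, not BROCC's actual code: the names `Hit`, `parse_blast_tabular`, and `filter_hits` are hypothetical, coverage is assumed to mean alignment length divided by query length, the 80% identity cutoff is a placeholder (only the 70% coverage default comes from the description), and the column order is assumed to be the standard BLAST tabular layout.

```python
from collections import namedtuple

# Hypothetical record type; only the columns needed for filtering are kept.
Hit = namedtuple("Hit", "query subject identity align_len")


def parse_blast_tabular(lines):
    """Yield Hit records from BLAST tabular output, skipping '#' comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumed standard column order: query, subject, % identity, alignment length, ...
        yield Hit(fields[0], fields[1], float(fields[2]), int(fields[3]))


def filter_hits(hits, query_len, min_identity=80.0, min_coverage=70.0):
    """Keep hits that meet both the identity and the coverage thresholds.

    min_identity is a placeholder value; min_coverage mirrors the 70% default.
    """
    kept = []
    for hit in hits:
        # Assumption: coverage = aligned length as a percentage of query length.
        coverage = 100.0 * hit.align_len / query_len
        if hit.identity >= min_identity and coverage >= min_coverage:
            kept.append(hit)
    return kept
```

A query whose filtered list comes back empty would be a candidate for the "not classified" message. How many low-coverage hits count as "too many" is not specified here, so the sketch leaves that rule out.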
BROCC then votes on the quality-filtered BLAST hits, starting at the species level. At each level of the taxonomy, the taxon with the most votes must surpass a user-specified threshold for that level to be accepted as a valid classification. If a sufficient majority is not reached, BROCC makes no classification at that level and moves to the next higher taxonomic level for another round of voting. The thresholds are independently configurable at the genus and species levels, and another threshold can be assigned for the remaining taxonomic levels.
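A minimal sketch of this bottom-up voting scheme is shown below. It is not taken from the BROCC source: the function name `consensus`, the representation of each hit as a rank-to-taxon dictionary, the rank list, and the threshold values in the usage note are all assumptions made for illustration. It also assumes every hit is resolved at every rank.

```python
from collections import Counter

# Ranks ordered from most to least specific; voting starts at species.
RANKS = ["species", "genus", "family", "order",
         "class", "phylum", "kingdom", "domain"]


def consensus(lineages, thresholds, default_threshold):
    """Return (rank, taxon) for the most specific rank with a sufficient majority.

    lineages: list of dicts mapping rank -> taxon name, one per BLAST hit.
    thresholds: per-rank vote fractions (e.g. for species and genus).
    default_threshold: fraction applied to every rank not in thresholds.
    """
    for rank in RANKS:
        votes = Counter(lin[rank] for lin in lineages if lin.get(rank))
        if not votes:
            continue
        taxon, count = votes.most_common(1)[0]
        total = sum(votes.values())
        needed = thresholds.get(rank, default_threshold)
        # The winning taxon must surpass the threshold for this rank.
        if float(count) / total > needed:
            return rank, taxon
        # Otherwise fall through to the next higher rank and vote again.
    return None
```

For example, with hypothetical thresholds `{"species": 0.7, "genus": 0.6}`, three hits split two-to-one between two species of the same genus fail the species vote (2/3 does not exceed 0.7) and are assigned at the genus level instead.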
BROCC also contains a user-modifiable list of high-level and partial assignments in its configuration file. These assignments are ignored at lower taxonomic levels, where they are uninformative and can distort voting, but are included at higher levels. For example, a sequence read with only a kingdom-level assignment is excluded from voting at every level below kingdom; once voting reaches the kingdom level, its vote is counted. If the proportion of high-level and partial assignments exceeds a given threshold (default 0.70), the query sequence is left unassigned and marked accordingly.
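The handling of generic assignments at a single voting level might look something like the sketch below. The function name `split_generic` and the example entries are hypothetical, and the real list lives in BROCC's configuration file; only the 0.70 default comes from the description.

```python
def split_generic(names, generic_names, max_generic_fraction=0.70):
    """Separate informative taxon names from generic ones at one voting level.

    Returns (informative, ignored_count, too_generic), where too_generic is True
    when the share of ignored names exceeds max_generic_fraction.
    """
    informative = [name for name in names if name not in generic_names]
    ignored = len(names) - len(informative)
    # Guard against an empty hit list before dividing.
    too_generic = bool(names) and float(ignored) / len(names) > max_generic_fraction
    return informative, ignored, too_generic
```

With a hypothetical generic list of `{"uncultured fungus"}`, three generic names out of four hits give a proportion of 0.75, which exceeds 0.70, so the query would be flagged as unassigned. Two out of four (0.50) would not be flagged, and the two informative names would go on to the vote.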
BROCC output consists of three files. The first contains classifications with standardized taxonomy (domain, kingdom, phylum, class, order, family, genus, species). The second contains the complete NCBI taxonomy, which includes subtaxa, supertaxa, and unranked intermediate taxonomic levels. The third contains a log of the voting record for each query sequence, including how many votes were cast, how many votes the winning taxon received, and how many generic classifications were ignored. This file also indicates which queries were left unclassified. Both taxonomy files are suitable for use in the QIIME pipeline (i.e., they are in the same format as the classifications output by the QIIME assign_taxonomy.py script).
BROCC is implemented in Python 2.7. It queries the NCBI taxonomy and requires local installations of SQL and BLAST.
-
-
Additional Name
-
BLAST Read and OTU Consensus Classifier
-
-
Used by
-
Bushman Laboratory
-
-
Version
-
1.1.0
-
-
Operating System
-
Unix
-
-
Data Input
-
Amplicon-based sequence sets
-
-
Data Input
-
BLAST results, output format 7
-
-
Data Output
-
QIIME-formatted taxonomy map
-
-
Data Output
-
Log file with full classification and voting details
-
-
Data Output
-
NCBI taxonomy file
-
-
Software purpose
-
Sequence-based phylogenetic analysis objective
-
-
Related Publication or Documentation
-
A tool kit for quantifying eukaryotic rRNA gene sequences from human microbiome samples
-
-
Website(s)
-
http://sourceforge.net/projects/brocc/
-
-
Related Technique
-
Metagenomics analysis
-
-
Developed by
-
Bushman, Frederic D., PhD
-
-
Developed by
-
Bittinger, Kyle, PhD
-
-
Developed by
-
0000013a-c257-7536-d69a-d90d80000000
-
-
Software license
-
GNU General Public License
-
-
Coded in
-
Python
