Bioinformatics at the University of Pennsylvania encompasses research, service, and education. The Penn Center for Bioinformatics (PCBI) is a base of operations that houses, nurtures, and catalyzes bioinformatics and computational biology research on campus. PCBI members are experts drawn from the School of Medicine and from across the Penn community. They conduct independent research and also consult and collaborate with other Penn faculty, contributing to research programs and Center grants. PCBI partners with the bioinformatics cores on campus, both by communicating faculty needs to them and through its own research. Finally, PCBI provides a home, teaching, and research projects for the Genomics and Computational Biology Graduate Group.
"BEERS was designed to benchmark RNA-Seq alignment algorithms and also algorithms that aim to reconstruct different isoforms and alternate splicing from RNA-Seq data. By default BEERS simulates either mouse or human paired-end RNA-Seq data modeled on the illumina platform. It starts with a large number of gene models (approx 500K) taken from about ten different published annotation efforts, and then chooses a fixed number of these genes at random (30,000 by default). This avoids biasing for or against any particular set of annotations. BEERS then introduces substitutions, indels, alternate spice forms, sequencing errors, and intron signal. BEERS can also simulate strand specific reads. BEERS does not simulate quality scores. There are four configuration files required (available on website). BEERS can also be configured to use any set of gene models. Pre-built indexes for human refseq are given on the website. Using these indexes will generate a much tamer set of transcripts."
"PaGE can be used to produce sets of differentially expressed genes with confidence measures attached. These lists are generated the False Discovery Rate method of controlling the false positives.
But PaGE is more than a differential expression analysis tool. PaGE is a tool to attach descriptive , dependable, and easily interpretable expression patterns to genes across multiple conditions, each represented by a set of replicated array experiments.
The input consists of (replicated) intensities from a collection of array experiments from two or more conditions (or from a collection of direct comparisons on 2-channel arrays). The output consists of patterns, one for each row identifier in the data file.
One condition is used as a reference to which the other types are compared. The length of a pattern equals the number of non-reference sample types. The symbols in the patterns are integers, where positive integers represent up-regulation as compared to the reference sample type and negative integers represent down-regulation.
The patterns are based on the false discovery rates for each position in the pattern, so that the number of positive and negative symbols that appear in each position of the pattern is as descriptive as the data variability allows.
The patterns generated are easily interpretable in that integers are used to represent different levels of up- or down-regulation as compared to the reference sample type."
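To make the pattern format concrete, here is a minimal sketch of how such a pattern could be read. It is not PaGE code: the function `describe_pattern`, the condition names, and the treatment of 0 as "no confident change" are assumptions for illustration. The description above specifies only that positive integers mean up-regulation and negative integers mean down-regulation relative to the reference.

```python
def describe_pattern(pattern, conditions, reference):
    """Translate an integer pattern into a per-condition description.

    Hypothetical helper, not part of PaGE. The pattern has one entry
    per non-reference condition, in the order the conditions are listed.
    """
    non_reference = [c for c in conditions if c != reference]
    if len(pattern) != len(non_reference):
        raise ValueError("pattern length must equal the number of non-reference conditions")

    description = {}
    for condition, level in zip(non_reference, pattern):
        if level > 0:
            description[condition] = f"up-regulated, level {level}"
        elif level < 0:
            description[condition] = f"down-regulated, level {-level}"
        else:
            # Assumption of this sketch: 0 marks a position with no call.
            description[condition] = "no confident change"
    return description


# Four conditions with "control" as the reference give a pattern of length 3.
result = describe_pattern((2, 0, -1), ["control", "A", "B", "C"], "control")
print(result)
```

A larger absolute value indicates a stronger level of up- or down-regulation relative to the reference. How many distinct levels appear in each position depends on the variability of the data.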