ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

The allele distribution in next-generation sequencing data sets is accurately described as the result of a stochastic branching process (2012)

Heinrich, V., Stange, J., Dickhaus, T., Imkeller, P., Kruger, U., Bauer, S., Mundlos, S., Robinson, P. N., Hecht, J., Krawitz, P. M.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2012-03-29

Description: With the availability of next-generation sequencing (NGS) technology, it is expected that sequence variants may be called on a genomic scale. Here, we demonstrate that a deeper understanding of the distribution of the variant call frequencies at heterozygous loci in NGS data sets is a prerequisite for sensitive variant detection. We model the crucial steps in an NGS protocol as a stochastic branching process and derive a mathematical framework for the expected distribution of alleles at heterozygous loci before measurement, that is, sequencing. We confirm our theoretical results by analyzing technical replicates of human exome data and demonstrate that the variance of allele frequencies at heterozygous loci is higher than expected under a simple binomial distribution. Due to this high variance, mutation callers relying on binomially distributed priors are less sensitive for heterozygous variants that deviate strongly from the expected mean frequency. Our results also indicate that error rates can be reduced to a greater degree by technical replicates than by increasing sequencing depth.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

2

Unknown

Enabling the democratization of the genomics revolution with a fully integrated web-based bioinformatics platform (2017)

Li, P.-E., Lo, C.-C., Anderson, J. J., Davenport, K. W., Bishop-Lilly, K. A., Xu, Y., Ahmed, S., Feng, S., Mokashi, V. P., Chain, P. S. G.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2017-01-10

Description: Continued advancements in sequencing technologies have fueled the development of new sequencing applications and promise to flood current databases with raw data. A number of factors prevent the seamless and easy use of these data, including the breadth of project goals, the wide array of tools that individually perform fractions of any given analysis, the large number of associated software/hardware dependencies, and the detailed expertise required to perform these analyses. To address these issues, we have developed an intuitive web-based environment with a wide assortment of integrated and cutting-edge bioinformatics tools in pre-configured workflows. These workflows, coupled with the ease of use of the environment, provide even novice next-generation sequencing users with the ability to perform many complex analyses with only a few mouse clicks and, within the context of the same environment, to visualize and further interrogate their results. This bioinformatics platform is an initial attempt at Empowering the Development of Genomics Expertise (EDGE) in a wide range of applications for microbial research.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

3

Unknown

Prioritizing and selecting likely novel miRNAs from NGS data (2016)

Backes, C., Meder, B., Hart, M., Ludwig, N., Leidinger, P., Vogel, B., Galata, V., Roth, P., Menegatti, J., Grässer, F., Ruprecht, K., Kahraman, M., Grossmann, T., Haas, J., Meese, E., Keller, A.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2016-04-08

Description: Small non-coding RNAs play a key role in many physiological and pathological processes. Since 2004, miRNA sequences have been catalogued in miRBase, which is currently in its 21st version. We investigated sequence and structural features of miRNAs annotated in miRBase and compared them between different versions of this reference database. We have identified that the two most recent releases (v20 and v21) are influenced by next-generation sequencing-based miRNA predictions and show significant deviation from miRNAs discovered prior to the high-throughput profiling period. From the analysis of miRBase, we derived a set of key characteristics to predict new miRNAs and applied the implemented algorithm to evaluate novel blood-borne miRNA candidates. We carried out 705 individual whole miRNA sequencings of blood cells and collected a total of 9.7 billion reads. Using miRDeep2 we initially predicted 1452 potentially novel miRNAs. After excluding false positives, 518 candidates remained. These novel candidates were ranked according to their distance to the features in the early miRBase versions, allowing for an easier selection of a subset of putative miRNAs for validation. Selected candidates were successfully validated by qRT-PCR and northern blotting. In addition, we implemented a web-server for ranking potential miRNA candidates, which is available at: www.ccb.uni-saarland.de/novomirank.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

4

Unknown

Identification of multi-loci hubs from 4C-seq demonstrates the functional importance of simultaneous interactions (2016)

Jiang, T., Raviram, R., Snetkova, V., Rocha, P. P., Proudhon, C., Badri, S., Bonneau, R., Skok, J. A., Kluger, Y.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2016-10-14

Description: The use of low-resolution single-cell DNA FISH and population-based high-resolution chromosome conformation capture techniques has highlighted the importance of pairwise chromatin interactions in gene regulation. However, it is unlikely that associations involving regulatory elements act in isolation from other interacting partners that also influence their impact. Indeed, the influence of multi-loci interactions remains something of an enigma, as beyond low-resolution DNA FISH we do not have the appropriate tools to analyze these. Here we present a method that uses standard 4C-seq data to identify multi-loci interactions from the same cell. We demonstrate the feasibility of our method using 4C-seq data sets that identify known pairwise and novel tri-loci interactions involving the Tcrb and Igk antigen receptor enhancers. We further show that the three Igk enhancers, MiE, 3'E and Ed, interact simultaneously in this super-enhancer cluster, which adds to our previous findings showing that loss of one element decreases interactions between all three elements as well as reducing their transcriptional output. These findings underscore the functional importance of simultaneous interactions and provide new insight into the relationship between enhancer elements. Our method opens the door for studying multi-loci interactions and their impact on gene regulation in other biological settings.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

5

Unknown

ChIP-Enrich: gene set enrichment testing for ChIP-seq data (2014)

Welch, R. P., Lee, C., Imbriano, P. M., Patil, S., Weymouth, T. E., Smith, R. A., Scott, L. J., Sartor, M. A.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2014-08-01

Description: Gene set enrichment testing can enhance the biological interpretation of ChIP-seq data. Here, we develop a method, ChIP-Enrich, for this analysis which empirically adjusts for gene locus length (the length of the gene body and its surrounding non-coding sequence). Adjustment for gene locus length is necessary because it is often positively associated with the presence of one or more peaks and because many biologically defined gene sets have an excess of genes with longer or shorter gene locus lengths. Unlike alternative methods, ChIP-Enrich can account for the wide range of gene locus length-to-peak presence relationships (observed in ENCODE ChIP-seq data sets). We show that ChIP-Enrich has a well-calibrated type I error rate using permuted ENCODE ChIP-seq data sets; in contrast, two commonly used gene set enrichment methods, Fisher's exact test and the binomial test implemented in Genomic Regions Enrichment of Annotations Tool (GREAT), can have highly inflated type I error rates and biases in ranking. We identify DNA-binding proteins, including CTCF, JunD and glucocorticoid receptor α (GRα), that show different enrichment patterns for peaks closer to versus further from transcription start sites. We also identify known and potential new biological functions of GRα. ChIP-Enrich is available as a web interface (http://chip-enrich.med.umich.edu) and a Bioconductor package.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

6

Unknown

Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation (2013)

Szatkiewicz, J. P., Wang, W., Sullivan, P. F., Wang, W., Sun, W.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2013-02-02

Description: Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

7

Unknown

RNA2DNAlign: nucleotide resolution allele asymmetries through quantitative assessment of RNA and DNA paired sequencing data (2016)

Movassagh, M., Alomran, N., Mudvari, P., Dede, M., Dede, C., Kowsari, K., Restrepo, P., Cauley, E., Bahl, S., Li, M., Waterhouse, W., Tsaneva-Atanasova, K., Edwards, N., Horvath, A.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2016-12-17

Description: We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign to 360 matching normal and tumor exomes and transcriptomes from 90 breast cancer patients from TCGA. Under high-confidence settings, RNA2DNAlign identified 2038 distinct SNV sites associated with one of the aforementioned asymmetries, the majority of which have not been linked to functionality before. The performance assessment shows very high specificity and sensitivity, due to the corroboration of signals across multiple matching datasets. RNA2DNAlign is freely available from http://github.com/HorvathLab/NGS as a self-contained binary package for 64-bit Linux systems.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

8

Unknown

Methods to increase reproducibility in differential gene expression via meta-analysis (2017)

Sweeney, T. E., Haynes, W. A., Vallania, F., Ioannidis, J. P., Khatri, P.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2017-01-10

Description: Findings from clinical and biological studies are often not reproducible when tested in independent cohorts. Due to the testing of a large number of hypotheses and relatively small sample sizes, results from whole-genome expression studies in particular are often not reproducible. Compared to single-study analysis, gene expression meta-analysis can improve reproducibility by integrating data from multiple studies. However, there are multiple choices in designing and carrying out a meta-analysis, and clear guidelines on best practices are scarce. Here, we hypothesized that studying subsets of very large meta-analyses would allow for systematic identification of best practices to improve reproducibility. We therefore constructed three very large gene expression meta-analyses from clinical samples, and then examined meta-analyses of subsets of the datasets (all combinations of datasets with up to N/2 samples and K/2 datasets) compared to a ‘silver standard’ of differentially expressed genes found in the entire cohort. We tested three random-effects meta-analysis models using this procedure. We showed relatively greater reproducibility with more-stringent effect size thresholds with relaxed significance thresholds; relatively lower reproducibility when imposing extraneous constraints on residual heterogeneity; and an underestimation of actual false positive rate by Benjamini–Hochberg correction. In addition, multivariate regression showed that the accuracy of a meta-analysis increased significantly with more included datasets even when controlling for sample size.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

9

Unknown

Systematic integration of RNA-Seq statistical algorithms for accurate detection of differential gene expression patterns (2015)

Moulos, P., Hatzis, P.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2015-03-01

Description: RNA-Seq is gradually becoming the standard tool for transcriptomic expression studies in biological research. Although considerable progress has been recorded in the development of statistical algorithms for the detection of differentially expressed genes using RNA-Seq data, the list of detected genes can differ significantly between algorithms. We present a new method (PANDORA) that combines multiple algorithms toward a summarized result, more efficiently reflecting true experimental outcomes. This is achieved through the systematic combination of several analysis algorithms, by weighting their outcomes according to their performance with realistically simulated data sets generated from real data. Results supported by the analysis of both simulated and real data from different organisms as well as correlation with PolII occupancy demonstrate that PANDORA improves the detection of differential expression. It accomplishes this by optimizing the tradeoff between standard performance measurements, such as precision and sensitivity.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press

10

Unknown

Sparse expression bases in cancer reveal tumor drivers (2015)

Logsdon, B. A., Gentles, A. J., Miller, C. P., Blau, C. A., Becker, P. S., Lee, S.-I.

Oxford University Press

In: Nucleic Acids Research

Publication date: 2015-02-18

Description: We define a new category of candidate tumor drivers in cancer genome evolution: ‘selected expression regulators’ (SERs), genes driving dysregulated transcriptional programs in cancer evolution. The SERs are identified from genome-wide tumor expression data with a novel method, namely SPARROW (SPARse selected expRessiOn regulators identified With penalized regression). SPARROW uncovers a previously unknown connection between cancer expression variation and driver events, by using a novel sparse regression technique. Our results indicate that SPARROW is a powerful complementary approach to identify candidate genes containing driver events that are hard to detect from sequence data, due to a large number of passenger mutations and lack of comprehensive sequence information from a sufficiently large number of samples. SERs identified by SPARROW reveal known driver mutations in multiple human cancers, along with known cancer-associated processes and survival-associated genes, better than popular methods for inferring gene expression networks. We demonstrate that when applied to acute myeloid leukemia expression data, SPARROW identifies an apoptotic biomarker (PYCARD) for an investigational drug, obatoclax. The PYCARD and obatoclax association is validated in 30 AML patient samples.

Keyword(s): Computational Methods, Genomics

Print ISSN: 0305-1048

Electronic ISSN: 1362-4962

Subject: Biology

Published by Oxford University Press
