ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Likelihood-based complex trait association testing for arbitrary depth sequencing data (2015)

Yan, S., Yuan, S., Xu, Z., Zhang, B., Zhang, B., Kang, G., Byrnes, A., Li, Y.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2015-09-11

Description: In next generation sequencing (NGS)-based genetic studies, researchers typically perform genotype calling first and then apply standard genotype-based methods for association testing. However, such a two-step approach ignores genotype calling uncertainty in the association testing step and may incur power loss and/or inflated type-I error. In the recent literature, a few robust and efficient likelihood-based methods, including both likelihood ratio tests (LRT) and score tests, have been proposed to carry out association testing without intermediate genotype calling. These methods take genotype calling uncertainty into account by directly incorporating the genotype likelihood function (GLF) of NGS data into association analysis. However, existing LRT methods are computationally demanding or do not allow covariate adjustment, while existing score tests are not applicable to markers with low minor allele frequency (MAF). We provide an LRT allowing flexible covariate adjustment, develop a statistically more powerful score test and propose a combination strategy (UNC combo) to leverage the advantages of both tests. We have carried out extensive simulations to evaluate the performance of our proposed LRT and score test. Simulations and real data analysis demonstrate the advantages of our proposed combination strategy: it offers a satisfactory trade-off in terms of computational efficiency, applicability (accommodating both common variants and variants with low MAF) and statistical power, particularly for the analysis of quantitative traits, where the power gain can be up to ~60% when the causal variant is of low frequency (MAF < 0.01). Availability and implementation: UNC combo and the associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/UNCcombo/ Contact: yunli@med.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

2

R2C: improving ab initio residue contact map prediction using dynamic fusion strategy and Gaussian noise filter (2016)

Yang, J., Jin, Q.-Y., Zhang, B., Shen, H.-B.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2016-08-11

Description: Motivation: Inter-residue contacts in proteins dictate the topology of protein structures. They are crucial for protein folding and structural stability. Accurate prediction of residue contacts, especially long-range contacts, is important to the quality of ab initio structure modeling since they can enforce strong restraints on structure assembly. Results: In this paper, we present a new Residue-Residue Contact predictor called R2C that combines machine learning-based and correlated mutation analysis-based methods, together with a two-dimensional Gaussian noise filter to enhance the long-range residue contact prediction. Our results show that the outputs from the machine learning-based method are concentrated, with better performance on short-range contacts, while for the correlated mutation analysis-based approach the predictions are widespread, with higher accuracy on long-range contacts. An effective query-driven dynamic fusion strategy proposed here takes full advantage of the two different methods, resulting in an impressive overall accuracy improvement. We also show that the contact map directly from the prediction model contains Gaussian noise, which has not been reported before. Unlike recent studies that tried to further enhance the quality of the contact map by removing its transitive noise, we designed a new two-dimensional Gaussian noise filter, which was especially helpful for reinforcing the long-range residue contact prediction. Tested on recent CASP10/11 datasets, the overall top L/5 accuracy of our final R2C predictor is 17.6%/15.5% higher than the pure machine learning-based method and 7.8%/8.3% higher than the correlated mutation analysis-based approach for the long-range residue contact prediction. Availability and Implementation: http://www.csbio.sjtu.edu.cn/bioinf/R2C/ Contact: hbshen@sjtu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
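The "top L/5 accuracy" quoted in the abstract can be illustrated with a minimal sketch. It is not taken from R2C; the function name, the dictionary input format, and the sequence-separation cutoff of 24 residues for "long-range" (a common CASP convention) are assumptions made here for illustration only.

```python
def top_l5_long_range_accuracy(scores, native_contacts, seq_len, min_sep=24):
    """Fraction of the top L/5 long-range predictions that are true contacts.

    scores: dict mapping (i, j) residue pairs (i < j) to a predicted score.
    native_contacts: set of (i, j) pairs that are contacts in the native structure.
    seq_len: protein length L.
    min_sep: minimum |i - j| for a pair to count as long-range (assumed cutoff).
    """
    # Keep only pairs separated by at least min_sep residues.
    long_range = [(score, pair) for pair, score in scores.items()
                  if pair[1] - pair[0] >= min_sep]
    # Rank by predicted score, highest first.
    long_range.sort(key=lambda item: item[0], reverse=True)
    # Evaluate the top L/5 predictions (at least one).
    top = long_range[:max(1, seq_len // 5)]
    if not top:
        return 0.0
    correct = sum(1 for _, pair in top if pair in native_contacts)
    return correct / len(top)
```

With this definition, a reported gain such as "17.6% higher" would be a difference between two such fractions computed for different predictors on the same targets.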

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

3

Empowering biologists with multi-omics data: colorectal cancer as a paradigm (2015)

Zhu, J., Shi, Z., Wang, J., Zhang, B.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2015-04-28

Description: Motivation: Recent completion of the global proteomic characterization of The Cancer Genome Atlas (TCGA) colorectal cancer (CRC) cohort resulted in the first tumor dataset with complete molecular measurements at DNA, RNA and protein levels. Using CRC as a paradigm, we describe the application of the NetGestalt framework to provide easy access and interpretation of multi-omics data. Results: The NetGestalt CRC portal includes genomic, epigenomic, transcriptomic, proteomic and clinical data for the TCGA CRC cohort, data from other CRC tumor cohorts and cell lines, and existing knowledge on pathways and networks, giving a total of more than 17 million data points. The portal provides features for data query, upload, visualization and integration. These features can be flexibly combined to serve various needs of the users, maximizing the synergy among omics data, human visualization and quantitative analysis. Using three case studies, we demonstrate that the portal not only provides user-friendly data query and visualization but also enables efficient data integration within a single omics data type, across multiple omics data types, and over biological networks. Availability and implementation: The NetGestalt CRC portal can be freely accessed at http://www.netgestalt.org. Contact: bing.zhang@vanderbilt.edu Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

4

customProDB: an R package to generate customized protein databases from RNA-Seq data for proteomics search (2013)

Wang, X., Zhang, B.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-11-29

Description: Database search is the most widely used approach for peptide and protein identification in mass spectrometry-based proteomics studies. Our previous study showed that sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in the samples and thus improve protein identification. More importantly, single nucleotide variations, short insertions and deletions, and novel junctions identified from RNA-Seq data make the protein database more complete and sample-specific. Here, we report an R package, customProDB, that enables the easy generation of customized databases from RNA-Seq data for proteomics search. This work bridges genomics and proteomics studies and facilitates cross-omics data integration. Availability and implementation: customProDB and related documents are freely available at http://bioconductor.org/packages/2.13/bioc/html/customProDB.html. Contact: bing.zhang@vanderbilt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

5

CasOT: a genome-wide Cas9/gRNA off-target searching tool (2014)

Xiao, A., Cheng, Z., Kong, L., Zhu, Z., Lin, S., Gao, G., Zhang, B.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2014-04-11

Description: The CRISPR/Cas or Cas9/guide RNA system is a newly developed, easily engineered and highly effective tool for gene targeting; it has considerable off-target effects in cultured human cells and in several organisms. However, the Cas9/guide RNA target site is too short for existing alignment tools to exhaustively and effectively identify potential off-target sites. CasOT is a local tool designed to find potential off-target sites in any given genome or user-provided sequence, with user-specified types of protospacer adjacent motif, and number of mismatches allowed in the seed and non-seed regions. Availability: http://eendb.zfgenetics.org/casot/ Contact: zfgenetics@gmail.com or bzhang@pku.edu.cn Supplementary Information: Supplementary data are available at Bioinformatics online.
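The kind of search the abstract describes (a user-chosen PAM plus separate mismatch limits for the seed and non-seed regions) can be sketched as a brute-force scan. This is an illustration only, not CasOT's code or interface: the function names, the default limits, the forward-strand-only scan, and the choice of the PAM-proximal 12 bases as the seed are all assumptions.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pam_matches(site_pam, pam_pattern):
    """True if site_pam fits the pattern, where 'N' in the pattern matches any base."""
    return len(site_pam) == len(pam_pattern) and all(
        p == "N" or p == s for s, p in zip(site_pam, pam_pattern))


def find_off_targets(genome, guide, pam="NGG", seed_len=12,
                     max_seed_mm=2, max_nonseed_mm=5):
    """Scan the forward strand for sites resembling guide + PAM.

    Returns (position, protospacer, pam, seed_mismatches, nonseed_mismatches)
    tuples. The seed is taken as the seed_len bases next to the PAM (assumption).
    """
    hits = []
    glen = len(guide)
    window = glen + len(pam)
    for i in range(len(genome) - window + 1):
        protospacer = genome[i:i + glen]
        site_pam = genome[i + glen:i + window]
        if not pam_matches(site_pam, pam):
            continue
        # Count mismatches separately in the PAM-proximal seed and the rest.
        seed_mm = count_mismatches(protospacer[-seed_len:], guide[-seed_len:])
        nonseed_mm = count_mismatches(protospacer[:-seed_len], guide[:-seed_len])
        if seed_mm <= max_seed_mm and nonseed_mm <= max_nonseed_mm:
            hits.append((i, protospacer, site_pam, seed_mm, nonseed_mm))
    return hits
```

A real tool would also scan the reverse complement and handle ambiguous bases in the genome; both are omitted here to keep the sketch short.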

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

6

AISAIC: a software suite for accurate identification of significant aberrations in cancers (2014)

Zhang, B., Hou, X., Yuan, X., Shih, I.-M., Zhang, Z., Clarke, R., Wang, R. R., Fu, Y., Madhavan, S., Wang, Y., Yu, G.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2014-01-29

Description: Accurate identification of significant aberrations in cancers (AISAIC) is a systematic effort to discover potential cancer-driving genes such as oncogenes and tumor suppressors. Two major confounding factors against this goal are the normal cell contamination and random background aberrations in tumor samples. We describe a Java AISAIC package that provides comprehensive analytic functions and a graphical user interface for integrating two statistically principled in silico approaches to address the aforementioned challenges in DNA copy number analyses. In addition, the package provides a command-line interface for users with scripting and programming needs to incorporate or extend AISAIC in their customized analysis pipelines. This open-source multiplatform software offers several attractive features: (i) it implements a user-friendly complete pipeline from processing raw data to reporting analytic results; (ii) it detects deletion types directly from copy number signals using a Bayes hypothesis test; (iii) it estimates the fraction of normal contamination for each sample; (iv) it produces an unbiased null distribution of random background alterations by iterative aberration-exclusive permutations; and (v) it identifies significant consensus regions and the percentage of homozygous/hemizygous deletions across multiple samples. AISAIC also provides users with a parallel computing option to leverage ubiquitous multicore machines. Availability and implementation: AISAIC is available as a Java application, with a user’s guide and source code, at https://code.google.com/p/aisaic/. Contact: yug@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press

7

KDDN: an open-source Cytoscape app for constructing differential dependency networks with significant rewiring (2015)

Tian, Y., Zhang, B., Hoffman, E. P., Clarke, R., Zhang, Z., Shih, I.-M., Xuan, J., Herrington, D. M., Wang, Y.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2015-01-10

Description: We have developed an integrated molecular network learning method, within a well-grounded mathematical framework, to construct differential dependency networks with significant rewiring. This knowledge-fused differential dependency networks (KDDN) method, implemented as a Java Cytoscape app, can be used to optimally integrate prior biological knowledge with measured data to simultaneously construct both common and differential networks, to quantitatively assign model parameters and significant rewiring p-values and to provide user-friendly graphical results. The KDDN algorithm is computationally efficient and provides users with parallel computing capability using ubiquitous multi-core machines. We demonstrate the performance of KDDN on various simulations and real gene expression datasets, and further compare the results with those obtained by the most relevant peer methods. The acquired biologically plausible results provide new insights into network rewiring as a mechanistic principle and illustrate KDDN’s ability to detect such rewiring efficiently and correctly. Although the principal application here involves microarray gene expression data, our methodology can be readily applied to other types of quantitative molecular profiling data. Availability: Source code and compiled package are freely available for download at http://apps.cytoscape.org/apps/kddn Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology, Computer Science, Medicine

Published by Oxford University Press
