ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

Indel and Carryforward Correction (ICC): a new analysis approach for processing 454 pyrosequencing data (2013)

Deng, W., Maust, B. S., Westfall, D. H., Chen, L., Zhao, H., Larsen, B. B., Iyer, S., Liu, Y., Mullins, J. I.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Pyrosequencing technology provides an important new approach to more extensively characterize diverse sequence populations and detect low frequency variants. However, the promise of this technology has been difficult to realize, as careful correction of sequencing errors is crucial to distinguish rare variants (~1%) in an infected host with high sensitivity and specificity. Results: We developed a new approach, referred to as Indel and Carryforward Correction (ICC), to cluster sequences without substitutions and locally correct only indel and carryforward sequencing errors within clusters to ensure that no rare variants are lost. ICC performs sequence clustering in the order of (i) homopolymer indel patterns only, (ii) indel patterns only and (iii) carryforward errors only, without the requirement of a distance cutoff value. Overall, ICC removed 93–95% of sequencing errors found in control datasets. On pyrosequencing data from a PCR fragment derived from 15 HIV-1 plasmid clones mixed at various frequencies as low as 0.1%, ICC achieved the highest sensitivity and similar specificity compared with other commonly used error correction and variant calling algorithms. Availability and implementation: Source code is freely available for download at http://indra.mullins.microbiol.washington.edu/ICC . It is implemented in Perl and supported on Linux, Mac OS X and MS Windows. Contact: jmullins@uw.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

2

Unknown

Tiki, at the head of a new superfamily of enzymes (2013)

Sanchez-Pulido, L., Ponting, C. P.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Tiki proteins appear to antagonize the Wnt signalling pathway by acting as Wnt proteases, thereby affecting Wnt solubility by its amino-terminal cleavage. Tiki1 protease activity was shown to be metal ion-dependent and was inhibited by chelating agents and thus was tentatively proposed to be a metalloprotease. Nevertheless, Tiki proteins exhibit no detectable sequence similarity to previously described metalloproteases, but instead have been reported as being homologues of TraB proteins (Pfam ID: PF01963), a widely distributed family of unknown function and structure. Here, we show that Tiki proteins are members of a new superfamily of domains contained not just in TraB proteins, but also in erythromycin esterase (Pfam ID: PF05139), DUF399 (domain of unknown function 399; Pfam ID: PF04187) and MARTX toxins that contribute to host invasion and pathogenesis by bacteria. We establish the core fold of this enzymatic domain and its catalytic residues. Contact: luis.sanchezpulido@dpag.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

3

Unknown

MeltDB 2.0-advances of the metabolomics software system (2013)

Kessler, N., Neuweger, H., Bonte, A., Langenkamper, G., Niehaus, K., Nattkemper, T. W., Goesmann, A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: The research area of metabolomics has achieved tremendous popularity and development in the last couple of years. Owing to its unique interdisciplinarity, it requires combining knowledge from various scientific disciplines. Advances in high-throughput technology and the consequently growing quality and quantity of data put new demands on applied analytical and computational methods. Exploration of the generated and analyzed datasets furthermore relies on powerful tools for data mining and visualization. Results: To cover and keep up with these requirements, we have created MeltDB 2.0, a next-generation web application addressing storage, sharing, standardization, integration and analysis of metabolomics experiments. New features improve both the efficiency and the effectiveness of the entire processing pipeline of chromatographic raw data from pre-processing to the derivation of new biological knowledge. First, the generation of high-quality metabolic datasets has been vastly simplified. Second, the new statistics tool box allows investigation of these datasets according to a wide spectrum of scientific and explorative questions. Availability: The system is publicly available at https://meltdb.cebitec.uni-bielefeld.de . A login is required but freely available. Contact: nkessler@cebitec.uni-bielefeld.de

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

4

Unknown

RACER: Rapid and accurate correction of errors in reads (2013)

Ilie, L., Molnar, M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: High-throughput next-generation sequencing technologies enable increasingly fast and affordable sequencing of genomes and transcriptomes, with a broad range of applications. The quality of the sequencing data is crucial for all applications. A significant portion of the data produced contains errors, and ever more efficient error correction programs are needed. Results: We propose RACER (Rapid and Accurate Correction of Errors in Reads), a new software program for correcting errors in sequencing data. RACER has better error-correcting performance than existing programs, is faster and requires less memory. To support our claims, we performed extensive comparison with the existing leading programs on a variety of real datasets. Availability: RACER is freely available for non-commercial use at www.csd.uwo.ca/~ilie/RACER/ . Contact: ilie@csd.uwo.ca Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

5

Unknown

SICOP: identifying significant co-interaction patterns (2013)

Spitz, A., Zweig, K. A., Horvat, E.-A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Interactions between various types of molecules that regulate crucial cellular processes are extensively investigated by high-throughput experiments and require dedicated computational methods for the analysis of the resulting data. In many cases, these data can be represented as a bipartite graph because they describe interactions between elements of two different types such as the influence of different experimental conditions on cellular variables or the direct interaction between receptors and their activators/inhibitors. One of the major challenges in the analysis of such noisy datasets is the statistical evaluation of the relationship between any two elements of the same type. Here, we present SICOP (significant co-interaction patterns), an implementation of a method that provides such an evaluation based on the number of their common interaction partners, their so-called co-interaction. This general network analytic method, proved successful in diverse fields, provides a framework for assessing the significance of this relationship by comparison with the expected co-interaction in a suitable null model of the same bipartite graph. SICOP takes into consideration up to two distinct types of interactions such as up- or downregulation. The tool is written in Java, accepts several common input formats and supports different output formats, facilitating further analysis and visualization. Its key features include a user-friendly interface, easy installation and platform independence. Availability: The software is open source and available at cna.cs.uni-kl.de/SICOP under the terms of the GNU General Public Licence (version 3 or later). Contact: agnes.horvat@iwr.uni-heidelberg.de or zweig@cs.uni-kl.de

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

6

Unknown

mpMoRFsDB: a database of molecular recognition features in membrane proteins (2013)

Gypas, F., Tsaousis, G. N., Hamodrakas, S. J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Molecular recognition features (MoRFs) are small, intrinsically disordered regions in proteins that undergo a disorder-to-order transition on binding to their partners. MoRFs are involved in protein–protein interactions and may function as the initial step in molecular recognition. The aim of this work was to collect, organize and store all membrane proteins that contain MoRFs. Membrane proteins constitute ~30% of fully sequenced proteomes and are responsible for a wide variety of cellular functions. MoRFs were classified according to their secondary structure, after interacting with their partners. We identified MoRFs in transmembrane and peripheral membrane proteins. The position of transmembrane protein MoRFs was determined in relation to a protein’s topology. All information was stored in a publicly available MySQL database with a user-friendly web interface. A Jmol applet is integrated for visualization of the structures. mpMoRFsDB provides valuable information related to disorder-based protein–protein interactions in membrane proteins. Availability: http://bioinformatics.biol.uoa.gr/mpMoRFsDB Contact: shamodr@biol.uoa.gr

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

7

Unknown

SPNConverter: a new link between static and dynamic complex network analysis (2013)

Dent, J. E., Yang, X., Nardini, C.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: The signaling Petri net (SPN) simulator, designed to provide insights into the trends of molecules’ activity levels in response to an external stimulus, contributes to the systems biology necessity of analyzing the dynamics of large-scale cellular networks. Implemented into the freely available software, BioLayout Express 3D, the simulator is publicly available and easy to use, provided the input files are prepared in the GraphML format, typically using the network editing software, yEd, and standards specific to the software. However, analysis of complex networks represented using other systems biology formatting languages (on which popular software, such as CellDesigner and Cytoscape, are based) requires manual manipulation, a step that is prone to error and limits the use of the SPN simulator in BioLayout Express 3D. To overcome this, we present a Cytoscape plug-in that enables users to automatically convert networks for analysis with the SPN simulator from the standard systems biology markup language. The automation of this step opens the SPN simulator to a far larger user group than has previously been possible. Availability and implementation: Distributed under the GNU General Public License Version 3 at http://apps.cytoscape.org/apps/spnconverter . Contact: christine@picb.ac.cn

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

8

Unknown

CoDNaS: a database of conformational diversity in the native state of proteins (2013)

Monzon, A. M., Juritz, E., Fornasari, M. S., Parisi, G.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Conformational diversity is a key concept in the understanding of different issues related with protein function such as the study of catalytic processes in enzymes, protein-protein recognition, protein evolution and the origins of new biological functions. Here, we present a database of proteins with different degrees of conformational diversity. Conformational Diversity of Native State (CoDNaS) is a redundant collection of three-dimensional structures for the same protein derived from protein data bank. Structures for the same protein obtained under different crystallographic conditions have been associated with snapshots of protein dynamism and consequently could characterize protein conformers. CoDNaS allows the user to explore global and local structural differences among conformers as a function of different parameters such as presence of ligand, post-translational modifications, changes in oligomeric states and differences in pH and temperature. Additionally, CoDNaS contains information about protein taxonomy and function, disorder level and structural classification offering useful information to explore the underlying mechanism of conformational diversity and its close relationship with protein function. Currently, CoDNaS has 122 122 structures integrating 12 684 entries, with an average of 9.63 conformers per protein. Availability: The database is freely available at http://www.codnas.com.ar/ . Contact: gusparisi@gmail.com

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

9

Unknown

Modeling nucleosome position distributions from experimental nucleosome positioning maps (2013)

Schopflin, R., Teif, V. B., Muller, O., Weinberg, C., Rippe, K., Wedemann, G.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Recent experimental advancements allow determining positions of nucleosomes for complete genomes. However, the resulting nucleosome occupancy maps are averages of heterogeneous cell populations. Accordingly, they represent a snapshot of a dynamic ensemble at a single time point with an overlay of many configurations from different cells. To study the organization of nucleosomes along the genome and to understand the mechanisms of nucleosome translocation, it is necessary to retrieve features of specific conformations from the population average. Results: Here, we present a method for identifying non-overlapping nucleosome configurations that combines binary-variable analysis and a Monte Carlo approach with a simulated annealing scheme. In this manner, we obtain specific nucleosome configurations and optimized solutions for the complex positioning patterns from experimental data. We apply the method to compare nucleosome positioning at transcription factor binding sites in different mouse cell types. Our method can model nucleosome translocations at regulatory genomic elements and generate configurations for simulations of the spatial folding of the nucleosome chain. Availability: Source code, precompiled binaries, test data and a web-based test installation are freely available at http://bioinformatics.fh-stralsund.de/nucpos/ Contact: gero.wedemann@fh-stralsund.de Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

10

Unknown

High-accuracy prediction of transmembrane inter-helix contacts and application to GPCR 3D structure modeling (2013)

Yang, J., Jang, R., Zhang, Y., Shen, H.-B.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Residue–residue contacts across the transmembrane helices dictate the three-dimensional topology of alpha-helical membrane proteins. However, contact determination through experiments is difficult because most transmembrane proteins are hard to crystallize. Results: We present a novel method (MemBrain) to derive transmembrane inter-helix contacts from amino acid sequences by combining correlated mutations and multiple machine learning classifiers. Tested on 60 non-redundant polytopic proteins using a strict leave-one-out cross-validation protocol, MemBrain achieves an average accuracy of 62%, which is 12.5% higher than the current best method from the literature. When applied to 13 recently solved G protein-coupled receptors, the MemBrain contact predictions helped increase the TM-score of the I-TASSER models by 37% in the transmembrane region. The number of foldable cases (TM-score >0.5) increased by 100%, where all G protein-coupled receptor templates and homologous templates with sequence identity >30% were excluded. These results demonstrate significant progress in contact prediction and a potential for contact-driven structure modeling of transmembrane proteins. Availability: www.csbio.sjtu.edu.cn/bioinf/MemBrain/ Contact: hbshen@sjtu.edu.cn or zhng@umich.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

11

Unknown

Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment (2013)

Yang, J., Roy, A., Zhang, Y.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Identification of protein–ligand binding sites is critical to protein function annotation and drug discovery. However, there is no method that could generate optimal binding site prediction for different protein types. Combination of complementary predictions is probably the most reliable solution to the problem. Results: We develop two new methods, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE), for complementary binding site predictions. The methods are tested on a set of 500 non-redundant proteins harboring 814 natural, drug-like and metal ion molecules. Starting from low-resolution protein structure predictions, the methods successfully recognize >51% of binding residues with average Matthews correlation coefficient (MCC) significantly higher (with P-value <10^-9 in Student's t-test) than other state-of-the-art methods, including COFACTOR, FINDSITE and ConCavity. When combining TM-SITE and S-SITE with other structure-based programs, a consensus approach (COACH) can increase MCC by 15% over the best individual predictions. COACH was examined in the recent community-wide COMEO experiment and consistently ranked as the best method in the last 22 individual datasets with the Area Under the Curve score 22.5% higher than the second best method. These data demonstrate a new robust approach to protein–ligand binding site recognition, which is ready for genome-wide structure-based function annotations. Availability: http://zhanglab.ccmb.med.umich.edu/COACH/ Contact: zhng@umich.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

12

Unknown

Inferring nucleosome positions with their histone mark annotation from ChIP data (2013)

Mammana, A., Vingron, M., Chung, H.-R.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: The nucleosome is the basic repeating unit of chromatin. It contains two copies each of the four core histones H2A, H2B, H3 and H4 and about 147 bp of DNA. The residues of the histone proteins are subject to numerous post-translational modifications, such as methylation or acetylation. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a technique that provides genome-wide occupancy data of these modified histone proteins, and it requires appropriate computational methods. Results: We present NucHunter, an algorithm that uses the data from ChIP-seq experiments directed against many histone modifications to infer positioned nucleosomes. NucHunter annotates each of these nucleosomes with the intensities of the histone modifications. We demonstrate that these annotations can be used to infer nucleosomal states with distinct correlations to underlying genomic features and chromatin-related processes, such as transcriptional start sites, enhancers, elongation by RNA polymerase II and chromatin-mediated repression. Thus, NucHunter is a versatile tool that can be used to predict positioned nucleosomes from a panel of histone modification ChIP-seq experiments and infer distinct histone modification patterns associated with different chromatin states. Availability: The software is available at http://epigen.molgen.mpg.de/nuchunter/ . Contact: chung@molgen.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

13

Unknown

Bayesian consensus clustering (2013)

Lock, E. F., Dunson, D. B.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: In biomedical research a growing number of platforms and technologies are used to measure diverse but related information, and the task of clustering a set of objects based on multiple sources of data arises in several applications. Most current approaches to multisource clustering either independently determine a separate clustering for each data source or determine a single ‘joint’ clustering for all data sources. There is a need for more flexible approaches that simultaneously model the dependence and the heterogeneity of the data sources. Results: We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings adhere loosely to an overall consensus clustering, and hence they are not independent. We describe a computationally scalable Bayesian framework for simultaneous estimation of both the consensus clustering and the source-specific clusterings. We demonstrate that this flexible approach is more robust than joint clustering of all data sources, and is more powerful than clustering each data source independently. We present an application to subtype identification of breast cancer tumor samples using publicly available data from The Cancer Genome Atlas. Availability: R code with instructions and examples is available at http://people.duke.edu/%7Eel113/software.html . Contact: Eric.Lock@duke.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

14

Unknown

Novel human lncRNA-disease association inference based on lncRNA expression profiles (2013)

Chen, X., Yan, G.-Y.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: More and more evidence has indicated that long non-coding RNAs (lncRNAs) play critical roles in many important biological processes. Therefore, mutations and dysregulations of these lncRNAs would contribute to the development of various complex diseases. Developing powerful computational models for identifying potential disease-related lncRNAs would benefit biomarker identification and drug discovery for human disease diagnosis, treatment, prognosis and prevention. Results: In this article, we proposed the assumption that similar diseases tend to be associated with functionally similar lncRNAs. Then, we further developed the method of Laplacian Regularized Least Squares for LncRNA–Disease Association (LRLSLDA) in the semisupervised learning framework. Although known disease–lncRNA associations in the database are rare, LRLSLDA still obtained an AUC of 0.7760 in the leave-one-out cross validation, significantly improving the performance of previous methods. We also illustrated that the performance of LRLSLDA is not sensitive (even robust) to parameter selection and that it can obtain a reliable performance in all the test classes. Plenty of potential disease–lncRNA associations were publicly released and some of them have been confirmed by recent results in biological experiments. It is anticipated that LRLSLDA could be an effective and important biological tool for biomedical research. Availability: The code of LRLSLDA is freely available at http://asdcd.amss.ac.cn/Software/Details/2 . Contact: xingchen@amss.ac.cn or yangy@amt.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

15

Unknown

Genome compression: a novel approach for large collections (2013)

Deorowicz, S., Danek, A., Grabowski, S.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Genomic repositories are rapidly growing, as witnessed by the 1000 Genomes or the UK10K projects. Hence, compression of multiple genomes of the same species has become an active research area in the past years. The well-known large redundancy in human sequences is not easy to exploit because of huge memory requirements from traditional compression algorithms. Results: We show how to obtain a compression ratio several times higher than the best reported results, on two large genome collections (1092 human and 775 plant genomes). Our inputs are variant call format files restricted to their essential fields. More precisely, our novel Ziv-Lempel-style compression algorithm squeezes a single human genome to ~400 KB. The key to high compression is to look for similarities across the whole collection, not just against one reference sequence, which is typical for existing solutions. Availability: http://sun.aei.polsl.pl/tgc (also as Supplementary Material) under a free license. Supplementary data: Supplementary data are available at Bioinformatics online. Contact: sebastian.deorowicz@polsl.pl

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

16

Unknown

Near-optimal experimental design for model selection in systems biology (2013)

Busetto, A. G., Hauser, A., Krummenacher, G., Sunnaker, M., Dimopoulos, S., Ong, C. S., Stelling, J., Buhmann, J. M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Biological systems are understood through iterations of modeling and experimentation. Not all experiments, however, are equally valuable for predictive modeling. This study introduces an efficient method for experimental design aimed at selecting dynamical models from data. Motivated by biological applications, the method enables the design of crucial experiments: it determines a highly informative selection of measurement readouts and time points. Results: We demonstrate formal guarantees of design efficiency on the basis of previous results. By reducing our task to the setting of graphical models, we prove that the method finds a near-optimal design selection with a polynomial number of evaluations. Moreover, the method exhibits the best polynomial-complexity constant approximation factor, unless P = NP. We measure the performance of the method in comparison with established alternatives, such as ensemble non-centrality, on example models of different complexity. Efficient design accelerates the loop between modeling and experimentation: it enables the inference of complex mechanisms, such as those controlling central metabolic operation. Availability: Toolbox ‘NearOED’ available with source code under GPL on the Machine Learning Open Source Software Web site (mloss.org). Contact: busettoa@inf.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

17

Unknown

omiRas: a Web server for differential expression analysis of miRNAs derived from small RNA-Seq data (2013)

Muller, S., Rycak, L., Winter, P., Kahl, G., Koch, I., Rotter, B.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Small RNA deep sequencing is widely used to characterize non-coding RNAs (ncRNAs) differentially expressed between two conditions, e.g. healthy and diseased individuals, and to reveal insights into molecular mechanisms underlying condition-specific phenotypic traits. The ncRNAome is composed of a multitude of RNAs, such as transfer RNA, small nucleolar RNA and microRNA (miRNA), to name a few. Here we present omiRas, a Web server for the annotation, comparison and visualization of interaction networks of ncRNAs derived from next-generation sequencing experiments of two different conditions. The Web tool allows the user to submit raw sequencing data and results are presented as: (i) static annotation results including length distribution, mapping statistics, alignments and quantification tables for each library as well as lists of differentially expressed ncRNAs between conditions and (ii) an interactive network visualization of user-selected miRNAs and their target genes based on the combination of several miRNA–mRNA interaction databases. Availability and Implementation: The omiRas Web server is implemented in Python, PostgreSQL and R, and can be accessed at: http://tools.genxpro.net/omiras/ . Contact: rotter@genxpro.de Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

18

Unknown

PARSEC: PAtteRn SEarch and Contextualization (2013)

Allot, A., Anno, Y.-N., Poidevin, L., Ripp, R., Poch, O., Lecompte, O.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: We present PARSEC (PAtteRn Search and Contextualization), a new open source platform for guided discovery, allowing localization and biological characterization of short genomic sites in entire eukaryotic genomes. PARSEC can search for a sequence or a degenerate pattern. The retrieved set of genomic sites can be characterized in terms of (i) conservation in model organisms, (ii) genomic context (proximity to genes) and (iii) function of neighboring genes. These modules allow the user to explore, visualize, filter and extract biological knowledge from a set of short genomic regions such as transcription factor binding sites. Availability: Web site implemented in Java, JavaScript and C++, with all major browsers supported. Freely available at lbgi.fr/parsec. Source code is freely available at sourceforge.net/projects/genomicparsec. Contact: odile.lecompte@unistra.fr Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

19

Unknown

SPOCS: software for predicting and visualizing orthology/paralogy relationships among genomes (2013)

Curtis, D. S., Phillips, A. R., Callister, S. J., Conlan, S., McCue, L. A.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: At the rate that prokaryotic genomes can now be generated, comparative genomics studies require a flexible method for quickly and accurately predicting orthologs among the rapidly changing set of genomes available. SPOCS implements a graph-based ortholog prediction method to generate a simple tab-delimited table of orthologs and, in addition, HTML files that provide a visualization of the predicted ortholog/paralog relationships, onto which gene/protein expression metadata may be overlaid. Availability and Implementation: A SPOCS web application is freely available at http://cbb.pnnl.gov/portal/tools/spocs.html . Source code for Linux systems is also freely available under an open source license at http://cbb.pnnl.gov/portal/software/spocs.html ; the Boost C++ libraries and BLAST are required. Contact: leeann.mccue@pnnl.gov

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

20

Unknown

CellMissy: a tool for management, storage and analysis of cell migration data produced in wound healing-like assays (2013)

Masuzzo, P., Hulstaert, N., Huyck, L., Ampe, C., Van Troys, M., Martens, L.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: Automated image processing has allowed cell migration research to evolve to a high-throughput research field. As a consequence, there is now an unmet need for data management in this domain. The absence of a generic management system for the quantitative data generated in cell migration assays results in each dataset being treated in isolation, making data comparison across experiments difficult. Moreover, by integrating quality control and analysis capabilities into such a data management system, the common practice of having to manually transfer data across different downstream analysis tools will be markedly sped up and made more robust. In addition, access to a data management solution creates gateways for data standardization, meta-analysis and structured public data dissemination. Here we present CellMissy, a cross-platform data management system for cell migration data with a focus on wound healing data. CellMissy simplifies and automates data management, storage and analysis from the initial experimental set-up to data exploration. Availability and implementation: CellMissy is a cross-platform open-source software developed in Java. Source code and cross-platform binaries are freely available under the Apache2 open source license at http://cellmissy.googlecode.com . Contact: lennart.martens@ugent.be Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

21

Unknown

On representative and illustrative comparisons with real data in bioinformatics: response to the letter to the editor by Smith et al. (2013)

Boulesteix, A.-L.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

22

Unknown

targetHub: a programmable interface for miRNA-gene interactions (2013)

Manyam, G., Ivan, C., Calin, G. A., Coombes, K. R.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: Motivation: With the expansion of high-throughput technologies, understanding different kinds of genome-level data is a common task. MicroRNA (miRNA) is increasingly profiled using high-throughput technologies (microarrays or next-generation sequencing). The downstream analysis of miRNA targets can be difficult. Although there are many databases and algorithms to predict miRNA targets, there are few tools to integrate miRNA–gene interaction data into high-throughput genomic analyses. Results: We present targetHub, a CouchDB database of miRNA–gene interactions. TargetHub provides a programmer-friendly interface to access miRNA targets. The Web site provides RESTful access to miRNA–gene interactions with an assortment of gene and miRNA identifiers. It can be a useful tool to integrate miRNA target interaction data directly into high-throughput bioinformatics analyses. Availability: TargetHub is available on the web at http://app1.bioinformatics.mdanderson.org/tarhub/_design/basic/index.html . Contact: coombes.3@osu.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

23

Unknown

Exploring the role of human miRNAs in virus-host interactions using systematic overlap analysis (2013)

Li, Z., Cui, X., Li, F., Li, P., Ni, M., Wang, S., Bo, X.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Motivation: Human miRNAs have recently been found to have important roles in viral replication. Understanding the patterns and details of human miRNA interactions during virus–host interactions may help uncover novel antiviral therapies. Based on the abundance of knowledge available regarding protein–protein interactions (PPI), virus–host protein interactions, experimentally validated human miRNA-target pairs and transcriptional regulation of human miRNAs, it is possible to explore the complex regulatory network that exists between viral proteins and human miRNAs at the system level. Results: By integrating current data regarding the virus–human interactome and human miRNA-target pairs, the overlap between targets of viral proteins and human miRNAs was identified and found to represent topologically important proteins (e.g. hubs or bottlenecks) at the global center of the human PPI network. Viral proteins and human miRNAs were also found to significantly target human PPI pairs. Furthermore, an overlap analysis of virus targets and transcription factors (TFs) of human miRNAs revealed that viral proteins preferentially target human miRNA TFs, representing a new pattern of virus–host interactions. Potential feedback loops formed by viruses, human miRNAs and miRNA TFs were also identified, and these may be exploited by viruses resulting in greater virulence and more effective replication strategies. Contact: boxc@bmi.ac.cn or ni.ming@163.com or sqwang@bmi.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

24

Unknown

Assessing association between protein truncating variants and quantitative traits (2013)

Rivas, M. A., Pirinen, M., Neville, M. J., Gaulton, K. J., Moutsianas, L., GoT2D Consortium, Lindgren, C. M., Karpe, F., McCarthy, M. I., Donnelly, P.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Motivation: In sequencing studies of common diseases and quantitative traits, power to test rare and low frequency variants individually is weak. To improve power, a common approach is to combine statistical evidence from several genetic variants in a region. Major challenges are how to do the combining and which statistical framework to use. General approaches for testing association between rare variants and quantitative traits include aggregating genotypes and trait values, referred to as ‘collapsing’, or using a score-based variance component test. However, little attention has been paid to alternative models tailored for protein truncating variants. Recent studies have highlighted the important role that protein truncating variants, commonly referred to as ‘loss of function’ variants, may have on disease susceptibility and quantitative levels of biomarkers. We propose a Bayesian modelling framework for the analysis of protein truncating variants and quantitative traits. Results: Our simulation results show that our models have an advantage over the commonly used methods. We apply our models to sequence and exome-array data and discover strong evidence of association between low plasma triglyceride levels and protein truncating variants at APOC3 (Apolipoprotein C3). Availability: Software is available from http://www.well.ox.ac.uk/~rivas/mamba Contact: donnelly@well.ox.ac.uk

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

25

Unknown

A pattern matching approach to the automatic selection of particles from low-contrast electron micrographs (2013)

Abrishami, V., Zaldivar-Peraza, A., de la Rosa-Trevin, J. M., Vargas, J., Oton, J., Marabini, R., Shkolnisky, Y., Carazo, J. M., Sorzano, C. O. S.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Motivation: Structural information of macromolecular complexes provides key insights into the way they carry out their biological functions. Achieving high-resolution structural details with electron microscopy requires the identification of a large number (up to hundreds of thousands) of single particles from electron micrographs, which is a laborious task if it has to be manually done and constitutes a hurdle towards high-throughput. Automatic particle selection in micrographs is far from being settled and new and more robust algorithms are required to reduce the number of false positives and false negatives. Results: In this article, we introduce an automatic particle picker that learns from the user the kind of particles he is interested in. Particle candidates are quickly and robustly classified as particles or non-particles. A number of new discriminative shape-related features as well as some statistical description of the image grey intensities are used to train two support vector machine classifiers. Experimental results demonstrate that the proposed method: (i) has a considerably low computational complexity and (ii) provides results better or comparable with previously reported methods at a fraction of their computing time. Availability: The algorithm is fully implemented in the open-source Xmipp package and downloadable from http://xmipp.cnb.csic.es . Contact: vabrishami@cnb.csic.es or coss@cnb.csic.es Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

26

Unknown

OncoSNP-SEQ: a statistical approach for the identification of somatic copy number alterations from next-generation sequencing of cancer genomes (2013)

Yau, C.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Recent major cancer genome sequencing studies have used whole-genome sequencing to detect various types of genomic variation. However, a number of these studies have continued to rely on SNP array information to provide additional results for copy number and loss-of-heterozygosity estimation and assessing tumour purity. OncoSNP-SEQ is a statistical model-based approach for inferring copy number profiles directly from high-coverage whole genome sequencing data that is able to account for unknown tumour purity and ploidy. Availability: MATLAB code is available at the following URL: https://sites.google.com/site/oncosnpseq/ . Contact: c.yau@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

27

Unknown

MIG: Multi-Image Genome viewer (2013)

McGowan, S. J., Hughes, J. R., Han, Z.-P., Taylor, S.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Multi-Image Genome (MIG) viewer is a web-based application for visualizing, querying and filtering many thousands of genome browser regions as well as for exporting the data in a variety of formats. This methodology has been used successfully to analyze ChIP-Seq data and RNA-Seq data and to detect somatic mutations in genome resequencing projects. Availability: MIG is available at https://mig.molbiol.ox.ac.uk/mig/ Contact: simon.mcgowan@imm.ox.ac.uk

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

28

Unknown

ASPeak: an abundance sensitive peak detection algorithm for RIP-Seq (2013)

Kucukural, A., Ozadam, H., Singh, G., Moore, M. J., Cenik, C.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Unlike DNA, RNA abundances can vary over several orders of magnitude. Thus, identification of RNA–protein binding sites from high-throughput sequencing data presents unique challenges. Although peak identification in ChIP-Seq data has been extensively explored, there are few bioinformatics tools tailored for peak calling on analogous datasets for RNA-binding proteins. Here we describe ASPeak (abundance sensitive peak detection algorithm), an implementation of an algorithm that we previously applied to detect peaks in exon junction complex RNA immunoprecipitation in tandem experiments. Our peak detection algorithm yields stringent and robust target sets enabling sensitive motif finding and downstream functional analyses. Availability: ASPeak is implemented in Perl as a complete pipeline that takes bedGraph files as input. ASPeak implementation is freely available at https://sourceforge.net/projects/as-peak under the GNU General Public License. ASPeak can be run on a personal computer, yet is designed to be easily parallelizable. ASPeak can also run on high performance computing clusters providing efficient speedup. The documentation and user manual can be obtained from http://master.dl.sourceforge.net/project/as-peak/manual.pdf . Contact: alper.kucukural@umassmed.edu or ccenik@stanford.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

29

Unknown

Kinannote, a computer program to identify and classify members of the eukaryotic protein kinase superfamily (2013)

Goldberg, J. M., Griggs, A. D., Smith, J. L., Haas, B. J., Wortman, J. R., Zeng, Q.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Motivation: Kinases of the eukaryotic protein kinase superfamily are key regulators of most aspects of eukaryotic cellular behavior and have provided several drug targets including kinases dysregulated in cancers. The rapid increase in the number of genomic sequences has created an acute need to identify and classify members of this important class of enzymes efficiently and accurately. Results: Kinannote produces a draft kinome and comparative analyses for a predicted proteome using a single line command, and it is currently the only tool that automatically classifies protein kinases using the controlled vocabulary of Hanks and Hunter [Hanks and Hunter (1995)]. A hidden Markov model in combination with a position-specific scoring matrix is used by Kinannote to identify kinases, which are subsequently classified using a BLAST comparison with a local version of KinBase, the curated protein kinase dataset from www.kinase.com . Kinannote was tested on the predicted proteomes from four divergent species. The average sensitivity and precision for kinome retrieval from the test species are 94.4 and 96.8%, respectively. The ability of Kinannote to classify identified kinases was also evaluated, and the average sensitivity and precision for full classification of conserved kinases are 71.5 and 82.5%, respectively. Kinannote has had a significant impact on eukaryotic genome annotation, providing protein kinase annotations for 36 genomes made public by the Broad Institute in the period spanning 2009 to the present. Availability: Kinannote is freely available at http://sourceforge.net/projects/kinannote . Contact: jmgold@broadinstitute.org Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

30

Unknown

Statistical agglomeration: peak summarization for direct infusion lipidomics (2013)

Smith, R., Anthonymuthu, T. S., Ventura, D., Prince, J. T.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Motivation: Quantification of lipids is a primary goal in lipidomics. In direct infusion/injection (or shotgun) lipidomics, accurate downstream identification and quantitation requires accurate summarization of repetitive peak measurements. Imprecise peak summarization multiplies downstream error by propagating into species identification and intensity estimation. To our knowledge, this is the first analysis of direct infusion peak summarization in the literature. Results: We present two novel peak summarization algorithms for direct infusion samples and compare them with an off-machine ad hoc summarization algorithm as well as with the proprietary Xcalibur algorithm. Our statistical agglomeration algorithm reduces peakwise error by 38% (mass/charge, m/z) and 44% (intensity) compared with the ad hoc method over three datasets. Pointwise error is reduced by 23% (m/z). Compared with Xcalibur, our statistical agglomeration algorithm produces 68% less m/z error and 51% less intensity error on average on two comparable datasets. Availability: The source code for Statistical Agglomeration and the datasets used are freely available for non-commercial purposes at https://github.com/optimusmoose/statistical_agglomeration . Modified Bin Agglomeration is freely available in MSpire, an open source mass spectrometry package at https://github.com/princelab/mspire/ . Contact: 2robsmith@gmail.com or jtprince@chem.byu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

31

Unknown

Graph-based peak alignment algorithms for multiple liquid chromatography-mass spectrometry datasets (2013)

Wang, J., Lam, H.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: Liquid chromatography coupled to mass spectrometry (LC-MS) is the dominant technological platform for proteomics. An LC-MS analysis of a complex biological sample can be visualized as a ‘map’ of which the positional coordinates are the mass-to-charge ratio (m/z) and chromatographic retention time (RT) of the chemical species profiled. Label-free quantitative proteomics requires the alignment and comparison of multiple LC-MS maps to ascertain the reproducibility of experiments or reveal proteome changes under different conditions. The main challenge in this task lies in correcting inevitable RT shifts. Similar, but not identical, LC instruments and settings can cause peptides to elute at very different times and sometimes in a different order, violating the assumptions of many state-of-the-art alignment tools. To meet this challenge, we developed LWBMatch, a new algorithm based on weighted bipartite matching. Unlike existing tools, which search for accurate warping functions to correct RT shifts, we directly seek a peak-to-peak mapping by maximizing a global similarity function between two LC-MS maps. For alignment tasks with large RT shifts (>500 s), an approximate warping function is determined by locally weighted scatterplot smoothing of potential matched features, detected using a novel voting scheme based on co-elution. For validation, we defined the ground truth for alignment success based on tandem mass spectrometry identifications from sequence searching. We showed that our method outperforms several existing tools in terms of precision and recall, and is capable of aligning maps from different instruments and settings. Availability: Available at https://sourceforge.net/projects/rt-alignment/ . Contact: kehlam@ust.hk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

32

Unknown

SW#-GPU-enabled exact alignments on genome scale (2013)

Korpar, M., Sikic, M.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-09-20

Description: We propose SW#, a new CUDA graphical processor unit-enabled and memory-efficient implementation of a dynamic programming algorithm for local alignment. It can be used as either a stand-alone application or a library. Although there are other graphical processor unit implementations of the Smith–Waterman algorithm, SW# is the only one publicly available that can produce sequence alignments on a genome-wide scale. For long sequences, it is at least a few hundred times faster than a CPU version of the same algorithm. Availability: Source code and installation instructions freely available for download at http://complex.zesoi.fer.hr/SW.html . Contact: mile.sikic@fer.hr Supplementary information: Supplementary results are available at Bioinformatics online.
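
Note: As a point of reference for the algorithm named in this abstract, the following is a minimal CPU sketch of Smith–Waterman local alignment scoring in Python. It is not the SW# implementation. The function name, the scoring values (match, mismatch, gap) and the linear gap penalty are illustrative assumptions, and only the best score is computed, keeping two rows of the dynamic programming matrix in memory.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Hypothetical helper for illustration: linear gap penalty, score only.
    Memory use is proportional to len(b) because only two rows are kept.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # row for the previous character of a
    for ch_a in a:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

The running time of this sketch is proportional to the product of the two sequence lengths, which is the cost that makes genome-scale inputs expensive on a CPU.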

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

33

Unknown

isomiRID: a framework to identify microRNA isoforms (2013)

de Oliveira, L. F. V., Christoff, A. P., Margis, R.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: MicroRNAs (miRNAs) have been extensively studied owing to their important regulatory roles in gene expression. An increasing number of reports describe extensive data mining in small RNA sequencing libraries to detect miRNA isoforms and also 5' and 3' post-transcriptional nucleotide additions, as well as edited miRNA sequences. A ready-to-use pipeline, isomiRID, was developed to standardize and automate the search for miRNA isoforms in high-throughput small RNA sequencing libraries. Availability: isomiRID is a command line Python script available at http://www.ufrgs.br/RNAi/isomiRID/ . Contact: rogerio.margis@ufrgs.br Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

34

Unknown

Analysis of base-pairing probabilities of RNA molecules involved in protein-RNA interactions (2013)

Iwakiri, J., Kameda, T., Asai, K., Hamada, M.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: Motivation: Understanding the details of protein–RNA interactions is important to reveal the functions of both the RNAs and the proteins. In these interactions, the secondary structures of the RNAs play an important role. Because RNA secondary structures in protein–RNA complexes are variable, considering the ensemble of RNA secondary structures is a useful approach. In particular, recent studies have supported the idea that, in the analysis of RNA secondary structures, the base-pairing probabilities (BPPs) of RNAs (i.e. the probabilities of forming a base pair in the ensemble of RNA secondary structures) provide richer and more robust information about the structures than a single RNA secondary structure, for example, the minimum free energy structure or a snapshot of structures in the Protein Data Bank. However, there has been no investigation of the BPPs in protein–RNA interactions. Results: In this study, we analyzed BPPs of RNA molecules involved in known protein–RNA complexes in the Protein Data Bank. Our analysis suggests that, in the tertiary structures, the BPPs (which are computed using only sequence information) for unpaired nucleotides with intermolecular hydrogen bonds (hbonds) to amino acids were significantly lower than those for unpaired nucleotides without hbonds. On the other hand, no difference was found between the BPPs for paired nucleotides with and without intermolecular hbonds. Those findings were commonly supported by three probabilistic models, which provide the ensemble of RNA secondary structures, including the McCaskill model based on Turner’s free energy of secondary structures. Contact: iwakiri@cb.k.u-tokyo.ac.jp or mhamada@cb.k.u-tokyo.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

35

Unknown

Oncofuse: a computational framework for the prediction of the oncogenic potential of gene fusions (2013)

Shugay, M., Ortiz de Mendibil, I., Vizmanos, J. L., Novo, F. J.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: Motivation: Gene fusions resulting from chromosomal aberrations are an important cause of cancer. The complexity of genomic changes in certain cancer types has hampered the identification of gene fusions by molecular cytogenetic methods, especially in carcinomas. This is changing with the advent of next-generation sequencing, which is detecting a substantial number of new fusion transcripts in individual cancer genomes. However, this poses the challenge of identifying those fusions with greater oncogenic potential amid a background of ‘passenger’ fusion sequences. Results: In the present work, we have used some recently identified genomic hallmarks of oncogenic fusion genes to develop a pipeline for the classification of fusion sequences, namely, Oncofuse. The pipeline predicts the oncogenic potential of novel fusion genes, calculating the probability that a fusion sequence behaves as ‘driver’ of the oncogenic process based on features present in known oncogenic fusions. Cross-validation and extensive validation tests on independent datasets suggest a robust behavior with good precision and recall rates. We believe that Oncofuse could become a useful tool to guide experimental validation studies of novel fusion sequences found during next-generation sequencing analysis of cancer transcriptomes. Availability and implementation: Oncofuse is a naive Bayes Network Classifier trained and tested using Weka machine learning package. The pipeline is executed by running a Java/Groovy script, available for download at www.unav.es/genetica/oncofuse.html . Contact: fnovo@unav.es Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

36

Unknown

MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples (2013)

Behr, J., Kahles, A., Zhong, Y., Sreedharan, V. T., Drewe, P., Ratsch, G.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-10-04

Description: Motivation: High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction. Results: We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate. Moreover, MITIE yields substantial performance gains when used with multiple samples. We applied our system to 38 Drosophila melanogaster modENCODE RNA-Seq libraries and estimated the sensitivity of reconstructing omitted transcript annotations and the specificity with respect to annotated transcripts. Our results corroborate that a well-motivated objective paired with appropriate optimization techniques leads to significant improvements over the state-of-the-art in transcriptome reconstruction. Availability: MITIE is implemented in C++ and is available from http://bioweb.me/mitie under the GPL license. Contact: Jonas_Behr@web.de and raetsch@cbio.mskcc.org Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

37

Unknown

A distance-based test of association between paired heterogeneous genomic data (2013)

Minas, C., Curry, E., Montana, G.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Due to rapid technological advances, a wide range of different measurements can be obtained from a given biological sample including single nucleotide polymorphisms, copy number variation, gene expression levels, DNA methylation and proteomic profiles. Each of these distinct measurements provides the means to characterize a certain aspect of biological diversity, and a fundamental problem of broad interest concerns the discovery of shared patterns of variation across different data types. Such data types are heterogeneous in the sense that they represent measurements taken at different scales or represented by different data structures. Results: We propose a distance-based statistical test, the generalized RV (GRV) test, to assess whether there is a common and non-random pattern of variability between paired biological measurements obtained from the same random sample. The measurements enter the test through the use of two distance measures, which can be chosen to capture a particular aspect of the data. An approximate null distribution is proposed to compute P-values in closed form and without the need to perform costly Monte Carlo permutation procedures. Compared with the classical Mantel test for association between distance matrices, the GRV test has been found to be more powerful in a number of simulation settings. We also demonstrate how the GRV test can be used to detect biological pathways in which genetic variability is associated with variation in gene expression levels in an ovarian cancer sample, and present results obtained from two independent cohorts. Availability: R code to compute the GRV test is freely available from http://www2.imperial.ac.uk/~gmontana Contact: g.montana@imperial.ac.uk Supplementary data: Supplementary data are available at Bioinformatics online.
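
Illustrative note: the abstract uses the classical Mantel test as its baseline. The Python sketch below shows only that baseline (a permutation test on the Pearson correlation between two distance matrices); it is not the GRV test, and the function names, the permutation count and the one-sided p-value convention are assumptions made for this illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def upper_triangle(dist, order):
    """Upper-triangle entries of a square distance matrix after relabelling by `order`."""
    n = len(order)
    return [dist[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    observed = pearson(a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_b together
        if pearson(a, upper_triangle(dist_b, order)) >= observed:
            hits += 1
    # add one to numerator and denominator so the p-value is never exactly zero
    return observed, (hits + 1) / (permutations + 1)
```

The permutation loop is the costly Monte Carlo step that, according to the abstract, the GRV test avoids through an approximate closed-form null distribution.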

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

38

Unknown

Identification of active transcription factor and miRNA regulatory pathways in Alzheimer's disease (2013)

Jiang, W., Zhang, Y., Meng, F., Lian, B., Chen, X., Yu, X., Dai, E., Wang, S., Liu, X., Li, X., Wang, L., Li, X.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: Alzheimer’s disease (AD) is a severe neurodegenerative disease of the central nervous system that may be caused by perturbation of regulatory pathways rather than the dysfunction of a single gene. However, the pathology of AD has yet to be fully elucidated. Results: In this study, we systematically analyzed AD-related mRNA and miRNA expression profiles as well as curated transcription factor (TF) and miRNA regulation to identify active TF and miRNA regulatory pathways in AD. By mapping differentially expressed genes and miRNAs to the curated TF and miRNA regulatory network as active seed nodes, we obtained a potential active subnetwork in AD. Next, by using the breadth-first-search technique, potential active regulatory pathways, which are the regulatory cascade of TFs, miRNAs and their target genes, were identified. Finally, based on the known AD-related genes and miRNAs, the hypergeometric test was used to identify active pathways in AD. As a result, nine pathways were found to be significantly activated in AD. A comprehensive literature review revealed that eight out of nine genes and miRNAs in these active pathways were associated with AD. In addition, we inferred that the pathway hsa-miR-146a→STAT1→MYC, which is the source of all nine significantly active pathways, may play an important role in AD progression, which should be further validated by biological experiments. Thus, this study provides an effective approach to finding active TF and miRNA regulatory pathways in AD and can be easily applied to other complex diseases. Contact: lixia@hrbmu.edu.cn or lw2247@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online.
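
Illustrative note: the hypergeometric test named in the abstract can be sketched in a few lines of Python. The abstract does not give the counts that entered the authors' test, so the parameter names and the example numbers here are hypothetical; the sketch only shows the standard upper-tail calculation.

```python
from math import comb


def hypergeom_pvalue(population, successes, draws, observed):
    """Upper-tail hypergeometric p-value P(X >= observed).

    population: total number of nodes considered (hypothetical background set)
    successes:  how many of those are known disease-related
    draws:      number of nodes in the pathway being tested
    observed:   disease-related nodes actually found in the pathway
    """
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, min(successes, draws) + 1)
    )
    return tail / total


# Hypothetical example: 10 nodes overall, 4 disease-related, a pathway of 3 nodes
# of which 2 are disease-related.
p = hypergeom_pvalue(population=10, successes=4, draws=3, observed=2)
```

A small p-value indicates that the pathway contains more known disease-related nodes than would be expected if its members were drawn at random from the background.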

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

39

Unknown

Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier (2013)

Sheppard, S., Lawson, N. D., Zhu, L. J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: 3' end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3' ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic filters have been applied in these cases, they typically result in a high proportion of both false-positive and -negative classifications. Therefore, there is a need to develop improved algorithms to better identify mis-priming events in oligo-dT primed sequences. Results: By analyzing sequence features flanking 3' ends derived from oligo-dT-based sequencing, we developed a naïve Bayes classifier to classify them as true or false/internally primed. The resulting algorithm is highly accurate, outperforms previous heuristic filters and facilitates identification of novel polyadenylation sites. Contact: nathan.lawson@umassmed.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

40

Unknown

A Turing test for artificial expression data (2013)

Maier, R., Zimmer, R., Kuffner, R.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: The lack of reliable, comprehensive gold standards complicates the development of many bioinformatics tools, particularly for the analysis of expression data and biological networks. Simulation approaches can provide provisional gold standards, such as regulatory networks, for the assessment of network inference methods. However, this just defers the problem, as it is difficult to assess how closely simulators emulate the properties of real data. Results: In analogy to Turing’s test discriminating humans and computers based on responses to questions, we systematically compare real and artificial systems based on their gene expression output. Different expression data analysis techniques such as clustering are applied to both types of datasets. We define and extract distributions of properties from the results, for instance, distributions of cluster quality measures or transcription factor activity patterns. Distributions of properties are represented as histograms to enable the comparison of artificial and real datasets. We examine three frequently used simulators that generate expression data from parameterized regulatory networks. We identify features distinguishing real from artificial datasets that suggest how simulators could be adapted to better emulate real datasets and, thus, become more suitable for the evaluation of data analysis tools. Availability: See http://www2.bio.ifi.lmu.de/~kueffner/attfad/ and the supplement for precomputed analyses; other compendia can be analyzed via the CRAN package attfad. The full datasets can be obtained from http://www2.bio.ifi.lmu.de/~kueffner/attfad/data.tar.gz . Contact: robert.kueffner@bio.ifi.lmu.de Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

41

Unknown

MLML: consistent simultaneous estimates of DNA methylation and hydroxymethylation (2013)

Qu, J., Zhou, M., Song, Q., Hong, E. E., Smith, A. D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: The two major epigenetic modifications of cytosines, 5-methylcytosine (5-mC) and 5-hydroxymethylcytosine (5-hmC), coexist with each other in a range of mammalian cell populations. Increasing evidence points to important roles of 5-hmC in demethylation of 5-mC and epigenomic regulation in development. Recently developed experimental methods allow direct single-base profiling of either 5-hmC or 5-mC. Meaningful analyses seem to require combining these experiments with bisulfite sequencing, but doing so naively produces inconsistent estimates of 5-mC or 5-hmC levels. Results: We present a method to jointly model read counts from bisulfite sequencing, oxidative bisulfite sequencing and Tet-Assisted Bisulfite sequencing, providing simultaneous estimates of 5-hmC and 5-mC levels that are consistent across experiment types. Availability: http://smithlab.usc.edu/software/mlml Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

42

Unknown

Incorporating prior knowledge into Gene Network Study (2013)

Wang, Z., Xu, W., San Lucas, F. A., Liu, Y.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Motivation: A major goal in genomic research is to identify genes that may jointly influence a biological response. From many years of intensive biomedical research, a large body of biological knowledge, or pathway information, has accumulated in available databases. There is a strong interest in leveraging these pathways to improve the statistical power and interpretability in studying gene networks associated with complex phenotypes. This prior information is a valuable complement to large-scale genomic data such as gene expression data generated from microarrays. However, it is a non-trivial task to effectively integrate available biological knowledge into gene expression data when reconstructing gene networks. Results: In this article, we developed and applied a Lasso method from a Bayesian perspective, a method we call prior Lasso (pLasso), for the reconstruction of gene networks. In this method, we partition edges between genes into two subsets: one subset of edges is present in known pathways, whereas the other has no prior information associated. Our method assigns different prior distributions to each subset according to a modified Bayesian information criterion that incorporates prior knowledge on both the network structure and the pathway information. Simulation studies have indicated that the method is more effective in recovering the underlying network than a traditional Lasso method that does not use the prior information. We applied pLasso to microarray gene expression datasets, where we used information from the Pathway Commons (PC) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as prior information for the network reconstruction, and successfully identified network hub genes associated with clinical outcome in cancer patients. Availability: The source code is available at http://nba.uth.tmc.edu/homepage/liu/pLasso . Contact: Yin.Liu@uth.tmc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

43

Unknown

Pclust: protein network visualization highlighting experimental data (2013)

Li, W., Kinch, L. N., Grishin, N. V.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: One approach to infer functions of new proteins from their homologs utilizes visualization of an all-against-all pairwise similarity network (A2ApsN) that exploits the speed of BLAST and avoids the complexity of multiple sequence alignment. However, identifying functions of the protein clusters in A2ApsN is never trivial, due to a lack of linking characterized proteins to their relevant information in current software packages. Given the database errors introduced by automatic annotation transfer, functional deduction should be made from proteins with experimental studies, i.e. ‘reference proteins’. Here, we present a web server, termed Pclust, which provides a user-friendly interface to visualize the A2ApsN, placing emphasis on such ‘reference proteins’ and providing access to their full information in source databases, e.g. articles in PubMed. The identification of ‘reference proteins’ and the ease of cross-database linkage will facilitate understanding the functions of protein clusters in the network, thus promoting interpretation of proteins of interest. Availability: The Pclust server is freely available at http://prodata.swmed.edu/pclust Contact: grishin@chop.swmed.edu Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

44

Unknown

WebRASP: a server for computing energy scores to assess the accuracy and stability of RNA 3D structures (2013)

Norambuena, T., Cares, J. F., Capriotti, E., Melo, F.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: The understanding of the biological role of RNA molecules has changed. Although it is widely accepted that RNAs play important regulatory roles without necessarily coding for proteins, the functions of many of these non-coding RNAs are unknown. Thus, determining or modeling the 3D structure of RNA molecules as well as assessing their accuracy and stability has become of great importance for characterizing their functional activity. Here, we introduce a new web application, WebRASP, that uses knowledge-based potentials for scoring RNA structures based on distance-dependent pairwise atomic interactions. This web server allows the users to upload a structure in PDB format, select several options to visualize the structure and calculate the energy profile. The server contains online help, tutorials and links to other related resources. We believe this server will be a useful tool for predicting and assessing the quality of RNA 3D structures. Availability and implementation: The web server is available at http://melolab.org/webrasp . It has been tested on the most popular web browsers and requires Java plugin for Jmol visualization. Contact: fmelo@bio.puc.cl

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

45

Unknown

Corrigendum of 'High throughput analysis of epistasis in genome-wide association studies with BiForce' (2013)

Gyenesei, A., Semple, C. A. M., Haley, C. S., Wei, W.-H.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Contact: Wenhua.Wei@igmm.ed.ac.uk

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

46

Unknown

Scaffold network generator: a tool for mining molecular structures (2013)

Matlock, M. K., Zaretzki, J. M., Swamidass, S. J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Scaffold network generator (SNG) is an open-source command-line utility that computes the hierarchical network of scaffolds that define a large set of input molecules. Scaffold networks are useful for visualizing, analysing and understanding the chemical data that is increasingly available through large public repositories like PubChem. For example, some groups have used scaffold networks to identify missed-actives in high-throughput screens of small molecules with bioassays. Substantially improving on existing software, SNG is robust enough to work on millions of molecules at a time with a simple command-line interface. Availability and implementation: SNG is accessible at http://swami.wustl.edu/sng Contact: swamidass@wustl.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

47

Unknown

The BioPAX Validator (2013)

Rodchenkov, I., Demir, E., Sander, C., Bader, G. D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: BioPAX is a community-developed standard language for biological pathway data. A key functionality required for efficient BioPAX data exchange is validation: detecting errors and inconsistencies in BioPAX documents. The BioPAX Validator is a command-line tool, Java library and online web service for BioPAX that performs >100 classes of consistency checks. Availability and implementation: The validator recognizes common syntactic errors and semantic inconsistencies and reports them in a customizable human readable format. It can also automatically fix some errors and normalize BioPAX data. Since its release, the validator has become a critical tool for the pathway informatics community, detecting thousands of errors and helping substantially increase the conformity and uniformity of BioPAX-formatted data. The BioPAX Validator is open source and released under LGPL v3 license. All sources, binaries and documentation can be found at sf.net/p/biopax, and the latest stable version of the web application is available at biopax.org/validator. Contact: igor.rodchenkov@utoronto.ca or gary.bader@utoronto.ca

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

48

Unknown

nCal: an R package for non-linear calibration (2013)

Fong, Y., Sebestyen, K., Yu, X., Gilbert, P., Self, S.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-10-04

Description: Non-linear calibration is a widely used method for quantifying biomarkers wherein concentration-response curves estimated using samples of known concentrations are used to predict the biomarker concentrations in the samples of interest. The R package nCal fills an important gap in the open source, stand-alone software for performing non-linear calibration. For curve fitting, nCal provides a new implementation of a robust, Bayesian hierarchical five-parameter logistic model. nCal supports a simple graphical user interface that can be used by laboratory scientists, and contains functionality for importing data from the multiplex bead array assay instrumentation. Availability: The R package ‘nCal’ is available from http://cran.r-project.org/web/packages/nCal/ under GPL-2 or later. Contact: yfong@fhcrc.org Supplementary information: Supplementary information is available in the form of an R package vignette at the above repository and an FAQ at http://research.fhcrc.org/youyifong/en/resources/ncal.html .
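
Illustrative note: the calibration idea in this abstract (fit a concentration-response curve on standards, then invert it for unknown samples) can be sketched in Python. The five-parameter logistic form below is one commonly used parameterization and is an assumption here; nCal's own parameterization and its Bayesian hierarchical fitting are not reproduced, and the parameter values are hypothetical.

```python
def five_pl(x, a, d, c, b, g):
    """Five-parameter logistic response at concentration x (x >= 0).

    a: response at zero concentration, d: response at infinite concentration,
    c: scale (concentration units), b: slope, g: asymmetry.
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, d, c, b, g):
    """Concentration whose expected response is y (y strictly between a and d)."""
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters for an increasing curve.
params = dict(a=50.0, d=20000.0, c=100.0, b=1.2, g=0.8)
response = five_pl(37.5, **params)
estimated = inverse_five_pl(response, **params)  # recovers about 37.5
```

In practice the parameters would come from fitting the standards; the inversion step is what turns a measured response into a predicted concentration.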

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

49

Unknown

Shimmer: detection of genetic alterations in tumors using next-generation sequence data (2013)

Hansen, N. F., Gartner, J. J., Mei, L., Samuels, Y., Mullikin, J. C.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Extensive DNA sequencing of tumor and matched normal samples using exome and whole-genome sequencing technologies has enabled the discovery of recurrent genetic alterations in cancer cells, but variability in stromal contamination and subclonal heterogeneity still present a severe challenge to available detection algorithms. Results: Here, we describe publicly available software, Shimmer, which accurately detects somatic single-nucleotide variants using statistical hypothesis testing with multiple testing correction. This program produces somatic single-nucleotide variant predictions with significantly higher sensitivity and accuracy than other available software when run on highly contaminated or heterogeneous samples, and it gives comparable sensitivity and accuracy when run on samples of high purity. Availability: http://www.github.com/nhansen/Shimmer Contact: nhansen@mail.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
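
Illustrative note: the abstract mentions hypothesis testing with multiple testing correction but does not say which correction is used. As an assumption for illustration only, the Python sketch below shows the Benjamini-Hochberg procedure, a widely used correction that controls the false discovery rate; the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-site p-values from a tumor-versus-normal test.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.05 for q in adjusted]
```

Sites whose adjusted value falls below the chosen false discovery rate would be reported as candidate somatic variants.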

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

50

Unknown

Mendel: the Swiss army knife of genetic analysis programs (2013)

Lange, K., Papp, J. C., Sinsheimer, J. S., Sripracha, R., Zhou, H., Sobel, E. M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Mendel is one of the few statistical genetics packages that provide a full spectrum of gene mapping methods, ranging from parametric linkage in large pedigrees to genome-wide association with rare variants. Our latest additions to Mendel anticipate and respond to the needs of the genetics community. Compared with earlier versions, Mendel is faster and easier to use and has a wider range of applications. Supported platforms include Linux, MacOS and Windows. Availability: Free from www.genetics.ucla.edu/software/mendel Contact: klange@ucla.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

51

Unknown

A hierarchical model of transcriptional dynamics allows robust estimation of transcription rates in populations of single cells with variable gene copy number (2013)

Woodcock, D. J., Vance, K. W., Komorowski, M., Koentges, G., Finkenstadt, B., Rand, D. A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: cis-regulatory DNA sequence elements, such as enhancers and silencers, function to control the spatial and temporal expression of their target genes. Although the overall levels of gene expression in large cell populations seem to be precisely controlled, transcription of individual genes in single cells is extremely variable in real time. It is, therefore, important to understand how these cis-regulatory elements function to dynamically control transcription at single-cell resolution. Recently, statistical methods have been proposed to back calculate the rates involved in mRNA transcription using parameter estimation of a mathematical model of transcription and translation. However, a major complication in these approaches is that some of the parameters, particularly those corresponding to the gene copy number and transcription rate, cannot be distinguished; therefore, these methods cannot be used when the copy number is unknown. Results: Here, we develop a hierarchical Bayesian model to estimate biokinetic parameters from live cell enhancer–promoter reporter measurements performed on a population of single cells. This allows us to investigate transcriptional dynamics when the copy number is variable across the population. We validate our method using synthetic data and then apply it to quantify the function of two known developmental enhancers in real time and in single cells. Availability: Supporting information is submitted with the article. Contact: d.j.woodcock@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

52

Unknown

Predicting the functional consequences of cancer-associated amino acid substitutions (2013)

Shihab, H. A., Gough, J., Cooper, D. N., Day, I. N. M., Gaunt, T. R.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: The number of missense mutations being identified in cancer genomes has greatly increased as a consequence of technological advances and the reduced cost of whole-genome/whole-exome sequencing methods. However, a high proportion of the amino acid substitutions detected in cancer genomes have little or no effect on tumour progression (passenger mutations). Therefore, accurate automated methods capable of discriminating between driver (cancer-promoting) and passenger mutations are becoming increasingly important. In our previous work, we developed the Functional Analysis through Hidden Markov Models (FATHMM) software and, using a model weighted for inherited disease mutations, observed improved performances over alternative computational prediction algorithms. Here, we describe an adaptation of our original algorithm that incorporates a cancer-specific model to potentiate the functional analysis of driver mutations. Results: The performance of our algorithm was evaluated using two separate benchmarks. In our analysis, we observed improved performances when distinguishing between driver mutations and other germ line variants (both disease-causing and putatively neutral mutations). In addition, when discriminating between somatic driver and passenger mutations, we observed performances comparable with the leading computational prediction algorithms: SPF-Cancer and TransFIC. Availability and implementation: A web-based implementation of our cancer-specific model, including a downloadable stand-alone package, is available at http://fathmm.biocompute.org.uk . Contact: fathmm@biocompute.org.uk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

53

Unknown

Exome-based analysis for RNA epigenome sequencing data (2013)

Meng, J., Cui, X., Rao, M. K., Chen, Y., Huang, Y.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Fragmented RNA immunoprecipitation combined with RNA sequencing enabled the unbiased study of RNA epigenome at a near single-base resolution; however, unique features of this new type of data call for novel computational techniques. Result: Through examining the connections of RNA epigenome sequencing data with two well-studied data types, ChIP-Seq and RNA-Seq, we unveiled the salient characteristics of this new data type. The computational strategies were discussed accordingly, and a novel data processing pipeline was proposed that combines several existing tools with a newly developed exome-based approach ‘exomePeak’ for detecting, representing and visualizing the post-transcriptional RNA modification sites on the transcriptome. Availability: The MATLAB package ‘exomePeak’ and additional details are available at http://compgenomics.utsa.edu/exomePeak/ . Contact: yufei.huang@utsa.edu or jmeng@mit.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

54

Unknown

Novel algorithms and the benefits of comparative validation (2013)

Smith, R., Ventura, D., Prince, J. T.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Contact: 2robsmith@gmail.com

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

55

Unknown

International Society for Computational Biology Honors Goncalo Abecasis with Top Bioinformatics/Computational Biology Award for 2013 (2013)

Fogg, C. N., Kovats, D. E.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

56

Unknown

HOMECAT: consensus homologs mapping for interspecific knowledge transfer and functional genomic data integration (2013)

Zorzan, S., Lorenzetto, E., Ettorre, M., Pontelli, V., Laudanna, C., Buffelli, M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Comparative studies are encouraged by the fast increase of data availability from the latest high-throughput techniques, in particular from functional genomic studies. Yet, the size of datasets, the challenge of finding complete sets of orthologs and, not least, the variety of identification formats make information integration challenging. With HOMECAT, we aim to facilitate cross-species relationship identification and data mapping by combining orthology predictions from several publicly available sources, a convenient interface for high-throughput data download and automatic identifier conversion into a Cytoscape plug-in that provides both integration with a large set of bioinformatics tools and a user-friendly interface. Availability: HOMECAT and the Supplementary Materials are freely available at http://www.cbmc.it/homecat/ . Contact: simone.zorzan@univr.it Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

57

Unknown

INstruct: a database of high-quality 3D structurally resolved protein interactome networks (2013)

Meyer, M. J., Das, J., Wang, X., Yu, H.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: INstruct is a database of high-quality, 3D, structurally resolved protein interactome networks in human and six model organisms. INstruct combines the scale of available high-quality binary protein interaction data with the specificity of atomic-resolution structural information derived from co-crystal evidence using a tested interaction interface inference method. Its web interface is designed to allow for flexible search based on standard and organism-specific protein and gene-naming conventions, visualization of protein architecture highlighting interaction interfaces and viewing and downloading custom 3D structurally resolved interactome datasets. Availability: INstruct is freely available on the web at http://instruct.yulab.org with all major browsers supported. Contact: haiyuan.yu@cornell.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

58

Unknown

OCSANA: optimal combinations of interventions from network analysis (2013)

Vera-Licona, P., Bonnet, E., Barillot, E., Zinovyev, A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Targeted therapies interfering with specifically one protein activity are promising strategies in the treatment of diseases like cancer. However, accumulated empirical experience has shown that targeting multiple proteins in signaling networks involved in the disease is often necessary. Thus, one important problem in biomedical research is the design and prioritization of optimal combinations of interventions to repress a pathological behavior, while minimizing side-effects. OCSANA (optimal combinations of interventions from network analysis) is a new software designed to identify and prioritize optimal and minimal combinations of interventions to disrupt the paths between source nodes and target nodes. When specified by the user, OCSANA seeks to additionally minimize the side effects that a combination of interventions can cause on specified off-target nodes. With the crucial ability to cope with very large networks, OCSANA includes an exact solution and a novel selective enumeration approach for the combinatorial interventions’ problem. Availability: The latest version of OCSANA, implemented as a plugin for Cytoscape and distributed under LGPL license, is available together with source code at http://bioinfo.curie.fr/projects/ocsana . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: paola.vera-licona@curie.fr

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

59

Unknown

CellH5: a format for data exchange in high-content screening (2013)

Sommer, C., Held, M., Fischer, B., Huber, W., Gerlich, D. W.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: High-throughput microscopy data require a diversity of analytical approaches. However, the construction of workflows that use algorithms from different software packages is difficult owing to a lack of interoperability. To overcome this limitation, we present CellH5, an HDF5 data format for cell-based assays in high-throughput microscopy, which stores high-dimensional image data along with inter-object relations in graphs. CellH5Browser, an interactive gallery image browser, demonstrates the versatility and performance of the file format on live imaging data of dividing human cells. CellH5 provides new opportunities for integrated data analysis by multiple software platforms. Availability: Source code is freely available at www.github.com/cellh5 under the GPL license and at www.bioconductor.org/packages/release/bioc/html/rhdf5.html under the Artistic-2.0 license. Demo datasets and the CellH5Browser are available at www.cellh5.org . A Fiji importer for cellh5 will be released soon. Contact: daniel.gerlich@imba.oeaw.ac.at or christoph.sommer@imba.oeaw.ac.at Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

60

Unknown

rRNA:mRNA pairing alters the length and the symmetry of mRNA-protected fragments in ribosome profiling experiments (2013)

O'Connor, P. B. F., Li, G.-W., Weissman, J. S., Atkins, J. F., Baranov, P. V.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Ribosome profiling is a new technique that allows monitoring locations of translating ribosomes on mRNA at a whole transcriptome level. A recent ribosome profiling study demonstrated that internal Shine–Dalgarno (SD) sequences have a major global effect on translation rates in bacteria: ribosomes pause at SD sites in mRNA. Therefore, it is important to understand how SD sites affect mRNA movement through the ribosome and generation of ribosome footprints. Results: Here, we provide evidence that, in addition to the pausing effect, internal SD sequences induce a caterpillar-like movement of mRNA through the ribosome cavity. Once an SD site binds to the ribosome, it remains attached to it while the ribosome decodes a few subsequent codons. This leads to asymmetric progressive elongation of ribosome footprints at the 3'-end. It is likely that internal SD sequences induce a pause not on a single, but on several adjacent codons. This finding is important for our understanding of mRNA movement through the ribosome and also should facilitate interpretation of ribosome profiling data. Contact: brave.oval.pan@gmail.com

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

61

Unknown

Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data (2013)

Birol, I., Raymond, A., Jackman, S. D., Pleasance, S., Coope, R., Taylor, G. A., Yuen, M. M. S., Keeling, C. I., Brand, D., Vandervalk, B. P., Kirk, H., Pandoh, P., Moore, R. A., Zhao, Y., Mungall, A. J., Jaquish, B., Yanchuk, A., Ritland, C., Boyle, B., Bousquet, J., Ritland, K., MacKay, J., Bohlmann, J., Jones, S. J. M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. Sequencing and assembling the large and highly repetitive spruce genome, though, pushes the boundaries of the current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pairs draft genome in 4.9 million scaffolds, with a scaffold N50 of 20 356 bp. We demonstrate how recent improvements in the sequencing technology, especially increasing read lengths and paired end reads from longer fragments, have a major impact on the assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies. Availability: The Picea glauca genome sequencing and assembly data are available through NCBI (Accession#: ALWZ0100000000 PID: PRJNA83435). http://www.ncbi.nlm.nih.gov/bioproject/83435 . Contact: ibirol@bcgsc.ca Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
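
Note on the scaffold N50 quoted in the white spruce abstract above: N50 is the length L such that scaffolds of length L or longer together contain at least half of the total assembled bases. The following is a minimal sketch of that standard definition, not code from the ABySS software; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Standard definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half the total.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, not data from the paper.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

For the toy list, the total is 300 and the two longest scaffolds (80 + 70) already cover half of it, so the N50 is 70. A larger N50 indicates a more contiguous assembly, which is why the abstract reports it alongside the scaffold count.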

62

Unknown

An HMM-based algorithm for evaluating rates of receptor-ligand binding kinetics from thermal fluctuation data (2013)

Ju, L., Wang, Y. D., Hung, Y., Wu, C.-F. J., Zhu, C.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Abrupt reduction/resumption of thermal fluctuations of a force probe has been used to identify association/dissociation events of protein–ligand bonds. We show that off-rate of molecular dissociation can be estimated by the analysis of the bond lifetime, while the on-rate of molecular association can be estimated by the analysis of the waiting time between two neighboring bond events. However, the analysis relies heavily on subjective judgments and is time-consuming. To automate the process of mapping out bond events from thermal fluctuation data, we develop a hidden Markov model (HMM)-based method. Results: The HMM method represents the bond state by a hidden variable with two values: bound and unbound. The bond association/dissociation is visualized and pinpointed. We apply the method to analyze a key receptor–ligand interaction in the early stage of hemostasis and thrombosis: the von Willebrand factor (VWF) binding to platelet glycoprotein Ibα (GPIbα). The numbers of bond lifetime and waiting time events estimated by the HMM are much more than those estimated by a descriptive statistical method from the same set of raw data. The kinetic parameters estimated by the HMM are in excellent agreement with those by a descriptive statistical analysis, but have much smaller errors for both wild-type and two mutant VWF-A1 domains. Thus, the computerized analysis allows us to speed up the analysis and improve the quality of estimates of receptor–ligand binding kinetics. Contact: jeffwu@isye.gatech.edu or cheng.zhu@bme.gatech.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
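
Note on the two-state HMM described in the receptor-ligand abstract above: the idea of a hidden bound/unbound variable that explains reduced or resumed thermal fluctuation can be illustrated with a small Viterbi decoder. This is a sketch under stated assumptions, not the authors' method: the abstract does not give the emission model, so zero-mean Gaussian emissions with a state-specific standard deviation are assumed here, and the parameters (`sd_bound`, `sd_unbound`, `p_stay`) are fixed by hand rather than estimated from data.

```python
import math

STATES = ("bound", "unbound")


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_bond_states(positions, sd_bound=1.0, sd_unbound=4.0, p_stay=0.9):
    """Most likely bound/unbound path for a probe-position trace.

    Assumed model (hypothetical): probe position is zero-mean Gaussian,
    with small spread when a bond is present and large spread otherwise.
    """
    sds = (sd_bound, sd_unbound)
    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)

    # Uniform prior over the two states at the first sample.
    score = [math.log(0.5) + log_gauss(positions[0], 0.0, sd) for sd in sds]
    back = []
    for x in positions[1:]:
        new_score, pointers = [], []
        for j, sd in enumerate(sds):
            candidates = [
                score[i] + (log_stay if i == j else log_switch) for i in range(2)
            ]
            best = max(range(2), key=lambda i: candidates[i])
            pointers.append(best)
            new_score.append(candidates[best] + log_gauss(x, 0.0, sd))
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(range(2), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Synthetic trace: large fluctuations, a quiet stretch, large fluctuations.
trace = [5.0, -6.0, 4.5, -5.5, 0.2, -0.1, 0.3, -0.2, 0.1, 6.0, -5.0, 5.5]
print(viterbi_bond_states(trace))
```

In this sketch the length of each decoded "bound" run would correspond to a bond lifetime and each "unbound" run between two bonds to a waiting time, which are the quantities the abstract uses to estimate off-rates and on-rates.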

63

Unknown

A powerful and efficient set test for genetic markers that handles confounders (2013)

Listgarten, J., Lippert, C., Kang, E. Y., Xiang, J., Kadie, C. M., Heckerman, D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Approaches for testing sets of variants, such as a set of rare or common variants within a gene or pathway, for association with complex traits are important. In particular, set tests allow for aggregation of weak signal within a set, can capture interplay among variants and reduce the burden of multiple hypothesis testing. Until now, these approaches did not address confounding by family relatedness and population structure, a problem that is becoming more important as larger datasets are used to increase power. Results: We introduce a new approach for set tests that handles confounders. Our model is based on the linear mixed model and uses two random effects—one to capture the set association signal and one to capture confounders. We also introduce a computational speedup for two random-effects models that makes this approach feasible even for extremely large cohorts. Using this model with both the likelihood ratio test and score test, we find that the former yields more power while controlling type I error. Application of our approach to richly structured Genetic Analysis Workshop 14 data demonstrates that our method successfully corrects for population structure and family relatedness, whereas application of our method to a 15 000 individual Crohn’s disease case–control cohort demonstrates that it additionally recovers genes not recoverable by univariate analysis. Availability: A Python-based library implementing our approach is available at http://mscompbio.codeplex.com . Contact: jennl@microsoft.com or lippert@microsoft.com or heckerma@microsoft.com Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

64

Unknown

Promoter-proximal CCCTC-factor binding is associated with an increase in the transcriptional pausing index (2013)

Paredes, S. H., Melgar, M. F., Sethupathy, P.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: It has been known for more than 2 decades that after RNA polymerase II (RNAPII) initiates transcription, it can enter into a paused or stalled state immediately downstream of the transcription start site before productive elongation. Recent advances in high-throughput genomic technologies facilitated the discovery that RNAPII pausing at promoters is a widespread physiologically regulated phenomenon. The molecular underpinnings of pausing are incompletely understood. The CCCTC-factor (CTCF) is a ubiquitous nuclear factor that has diverse regulatory functions, including a recently discovered role in promoting RNAPII pausing at splice sites. Results: In this study, we analyzed CTCF binding sites and nascent transcriptomic data from three different cell types, and found that promoter-proximal CTCF binding is significantly associated with RNAPII pausing. Contact: praveen_sethupathy@med.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

65

Unknown

A multi-layer inference approach to reconstruct condition-specific genes and their regulation (2013)

Wu, M., Liu, L., Hijazi, H., Chan, C.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: An important topic in systems biology is the reverse engineering of regulatory mechanisms through reconstruction of context-dependent gene networks. A major challenge is to identify the genes and the regulations specific to a condition or phenotype, given that regulatory processes are highly connected such that a specific response is typically accompanied by numerous collateral effects. In this study, we design a multi-layer approach that is able to reconstruct condition-specific genes and their regulation through an integrative analysis of large-scale information of gene expression, protein interaction and transcriptional regulation (transcription factor-target gene relationships). We establish the accuracy of our methodology against synthetic datasets, as well as a yeast dataset. We then extend the framework to the application of higher eukaryotic systems, including human breast cancer and Arabidopsis thaliana cold acclimation. Our study identified TACSTD2 (TROP2) as a target gene for human breast cancer and discovered its regulation by transcription factors CREB, as well as NFkB. We also predict KIF2C is a target gene for ER–/HER2– breast cancer and is positively regulated by E2F1. The predictions were further confirmed through experimental studies. Availability: The implementation and detailed protocol of the layer approach is available at http://www.egr.msu.edu/changroup/Protocols/Three-layer%20approach%20to%20reconstruct%20condition.html . Contact: krischan@egr.msu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

66

Unknown

Systematic tracking of dysregulated modules identifies novel genes in cancer (2013)

Srihari, S., Ragan, M. A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: Deciphering the modus operandi of dysregulated cellular mechanisms in cancer is critical to implicate novel cancer genes and develop effective anti-cancer therapies. Fundamental to this is meticulous tracking of the behavior of core modules, including complexes and pathways across specific conditions in cancer. Results: Here, we performed a straightforward yet systematic identification and comparison of modules across pancreatic normal and cancer tissue conditions by integrating PPI, gene-expression and mutation data. Our analysis revealed interesting change-patterns in gene composition and expression correlation particularly affecting modules responsible for genome stability. Although in most cases these changes indicated impairment of essential functions (e.g. of DNA damage repair), in several other cases we noticed strengthening of modules possibly abetting cancer. Some of these compensatory modules showed switches in transcription regulation and recruitment of tumor inducers (e.g. SOX2 through overexpression). In-depth analysis revealed novel genes in pancreatic cancer, which showed susceptibility to copy-number alterations (e.g. for USP15 in 17 of 67 cases), supported by literature evidence for their involvement in other tumors (e.g. USP15 in glioblastoma). Two of the identified genes, YWHAE and DISC1, further supported the nexus between neural genes and pancreatic carcinogenesis. Extension of this assessment to BRCA1 and BRCA2 breast tumors showed specific differences even across the two sub-types and revealed novel genes involved therein (e.g. TRIM5 and NCOA6). Availability: Our software CONTOURv1 is available at: http://bioinformatics.org.au/tools-data/ . Contact: m.ragan@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

67

Unknown

Learning gene network structure from time laps cell imaging in RNAi Knock downs (2013)

Failmezger, H., Praveen, P., Tresch, A., Frohlich, H.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: Motivation: As RNA interference is becoming a standard method for targeted gene perturbation, computational approaches to reverse engineer parts of biological networks based on measurable effects of RNAi become increasingly relevant. The vast majority of these methods use gene expression data, but little attention has been paid so far to other data types. Results: Here we present a method, which can infer gene networks from high-dimensional phenotypic perturbation effects on single cells recorded by time-lapse microscopy. We use data from the Mitocheck project to extract multiple shape, intensity and texture features at each frame. Features from different cells and movies are then aligned along the cell cycle time. Subsequently we use Dynamic Nested Effects Models (dynoNEMs) to estimate parts of the network structure between perturbed genes via a Markov Chain Monte Carlo approach. Our simulation results indicate a high reconstruction quality of this method. A reconstruction based on 22 gene knock downs yielded a network, where all edges could be explained via the biological literature. Availability: The implementation of dynoNEMs is part of the Bioconductor R-package nem. Contact: frohlich@bit.uni-bonn.de Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

68

Unknown

UPDtool: a tool for detection of iso- and heterodisomy in parent-child trios using SNP microarrays (2013)

Schroeder, C., Sturm, M., Dufke, A., Mau-Holzmann, U., Eggermann, T., Poths, S., Riess, O., Bonin, M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-06-06

Description: UPDtool is a computational tool for detection and classification of uniparental disomy (UPD) in trio SNP-microarray experiments. UPDs are rare events of chromosomal malsegregation and describe the condition of two homologous chromosomes or homologous chromosomal segments that were inherited from one parent. The occurrence of UPD can be of major clinical relevance. Though high-throughput molecular screening techniques are widely used, detection of UPDs and especially the subclassification remains complex. We developed UPDtool to detect and classify UPDs from SNP microarray data of parent–child trios. The algorithm was tested using five positive controls including both iso- and heterodisomic segmental UPDs and 30 trios from the HapMap project as negative controls. With UPDtool, we were able to correctly identify all occurrences of non-mosaic UPD within our positive controls, whereas no occurrence of UPD was found within our negative controls. In addition, the chromosomal breakage points could be determined more precisely than by microsatellite analysis. Our results were compared with both the gold standard, microsatellite analysis and SNPtrio, another program available for UPD detection. UPDtool is platform independent, light weight and flexible. Because of its simple input format, UPDtool may also be used with other high-throughput technologies (e.g. next-generation sequencing). Availability and implementation: UPDtool executables, documentation and examples can be downloaded from http://www.uni-tuebingen.de/uni/thk/de/f-genomik-software.html . Contact: christopher.schroeder@med.uni-tuebingen.de Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

69

Unknown

Joint haplotype phasing and genotype calling of multiple individuals using haplotype informative reads (2013)

Zhang, K., Zhi, D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Hidden Markov model, based on Li and Stephens model that takes into account chromosome sharing of multiple individuals, results in mainstream haplotype phasing algorithms for genotyping arrays and next-generation sequencing (NGS) data. However, existing methods based on this model assume that the allele count data are independently observed at individual sites and do not consider haplotype informative reads, i.e. reads that cover multiple heterozygous sites, which carry useful haplotype information. In our previous work, we developed a new hidden Markov model to incorporate a two-site joint emission term that captures the haplotype information across two adjacent sites. Although our model improves the accuracy of genotype calling and haplotype phasing, haplotype information in reads covering non-adjacent sites and/or more than two adjacent sites is not used because of the severe computational burden. Results: We develop a new probabilistic model for genotype calling and haplotype phasing from NGS data that incorporates haplotype information of multiple adjacent and/or non-adjacent sites covered by a read over an arbitrary distance. We develop a new hybrid Markov Chain Monte Carlo algorithm that combines the Gibbs sampling algorithm of HapSeq and Metropolis–Hastings algorithm and is computationally feasible. We show by simulation and real data from the 1000 Genomes Project that our model offers superior performance for haplotype phasing and genotype calling for population NGS data over existing methods. Availability: HapSeq2 is available at www.ssg.uab.edu/hapseq/ . Contact: dzhi@uab.edu or kzhang@uab.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

70

Unknown

PhosphoChain: a novel algorithm to predict kinase and phosphatase networks from high-throughput expression data (2013)

Chen, W.-M., Danziger, S. A., Chiang, J.-H., Aitchison, J. D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Protein phosphorylation is critical for regulating cellular activities by controlling protein activities, localization and turnover, and by transmitting information within cells through signaling networks. However, predictions of protein phosphorylation and signaling networks remain a significant challenge, lagging behind predictions of transcriptional regulatory networks into which they often feed. Results: We developed PhosphoChain to predict kinases, phosphatases and chains of phosphorylation events in signaling networks by combining mRNA expression levels of regulators and targets with a motif detection algorithm and optional prior information. PhosphoChain correctly reconstructed ~78% of the yeast mitogen-activated protein kinase pathway from publicly available data. When tested on yeast phosphoproteomic data from large-scale mass spectrometry experiments, PhosphoChain correctly identified ~27% more phosphorylation sites than existing motif detection tools (NetPhosYeast and GPS2.0), and predictions of kinase–phosphatase interactions overlapped with ~59% of known interactions present in yeast databases. PhosphoChain provides a valuable framework for predicting condition-specific phosphorylation events from high-throughput data. Availability: PhosphoChain is implemented in Java and available at http://virgo.csie.ncku.edu.tw/PhosphoChain/ or http://aitchisonlab.com/PhosphoChain Contact: john.aitchison@systemsbiology.org or jchiang@mail.ncku.edu.tw Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

71

Unknown

Twist-DNA: computing base-pair and bubble opening probabilities in genomic superhelical DNA (2013)

Jost, D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Local opening of the DNA double helix is required in many fundamental biological processes and is, in part, controlled by the degree of superhelicity imposed in vivo by the protein machinery. In particular, positions of superhelically destabilized regions correlate with regulatory sites along the genome. Based on a self-consistent linearization of a thermodynamic model of superhelical DNA introduced by Benham, we have developed a program that predicts the locations of these regions by efficiently computing base-pair and bubble opening probabilities in genomic DNA. The program allows visualization of results in standard genome browsers to compare DNA opening properties with other available datasets. Availability and implementation: Source codes freely available for download at http://www.cbp.ens-lyon.fr/doku.php?id=developpement:productions:logiciels:twistdna , implemented in Fortran90 and supported on any Unix-based operating system (Linux, Mac OS X). Contact: daniel.jost@ens-lyon.fr Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

72

Unknown

Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities (2013)

Taghavi, Z., Movahedi, N. S., Draghici, S., Chitsaz, H.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Identification of every single genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because there are typically millions of cells in a microbial sample, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is simply intractable for a large number of cells. However, there is hope for breaking this barrier, as the number of different cell types with distinct genome sequences is usually much smaller than the number of cells. Results: Here, we present a novel divide and conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample with a sequencing cost and computational complexity proportional to the number of genome types, rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide and conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach. Availability: Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/ . Contact: ztaghavi@wayne.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

73

Unknown

nhmmer: DNA homology search with profile HMMs (2013)

Wheeler, T. J., Eddy, S. R.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Sequence database searches are an essential part of molecular biology, providing information about the function and evolutionary history of proteins, RNA molecules and DNA sequence elements. We present a tool for DNA/DNA sequence comparison that is built on the HMMER framework, which applies probabilistic inference methods based on hidden Markov models to the problem of homology search. This tool, called nhmmer, enables improved detection of remote DNA homologs, and has been used in combination with Dfam and RepeatMasker to improve annotation of transposable elements in the human genome. Availability: nhmmer is a part of the new HMMER3.1 release. Source code and documentation can be downloaded from http://hmmer.org . HMMER3.1 is freely licensed under the GNU GPLv3 and should be portable to any POSIX-compliant operating system, including Linux and Mac OS/X. Contact: wheelert@janelia.hhmi.org

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

74

Unknown

deltaGseg: macrostate estimation via molecular dynamics simulations and multiscale time series analysis (2013)

Low, D. H. P., Motakis, E.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Binding free energy calculations obtained through molecular dynamics simulations reflect intermolecular interaction states through a series of independent snapshots. Typically, the free energies of multiple simulated series (each with slightly different starting conditions) need to be estimated. Previous approaches carry out this task by moving averages at certain decorrelation times, assuming that the system comes from a single conformation description of binding events. Here, we discuss a more general approach that uses statistical modeling, wavelets denoising and hierarchical clustering to estimate the significance of multiple statistically distinct subpopulations, reflecting potential macrostates of the system. We present the deltaGseg R package that performs macrostate estimation from multiple replicated series and allows molecular biologists/chemists to gain physical insight into the molecular details that are not easily accessible by experimental techniques. Availability: deltaGseg is a Bioconductor R package available at http://bioconductor.org/packages/release/bioc/html/deltaGseg.html . Contact: emotakis@hotmail.com

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

75

Unknown

DRAW+SneakPeek: Analysis workflow and quality metric management for DNA-seq experiments (2013)

Lin, C.-F., Valladares, O., Childress, D. M., Klevak, E., Geller, E. T., Hwang, Y.-C., Tsai, E. A., Schellenberg, G. D., Wang, L.-S.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: We report our new DRAW+SneakPeek software for DNA-seq analysis. DNA resequencing analysis workflow (DRAW) automates the workflow of processing raw sequence reads including quality control, read alignment and variant calling on high-performance computing facilities such as Amazon elastic compute cloud. SneakPeek provides an effective interface for reviewing dozens of quality metrics reported by DRAW, so users can assess the quality of data and diagnose problems in their sequencing procedures. Both DRAW and SneakPeek are freely available under the MIT license, and are available as Amazon machine images to be used directly on Amazon cloud with minimal installation. Availability: DRAW+SneakPeek is released under the MIT license and is available for academic and nonprofit use for free. The information about source code, Amazon machine images and instructions on how to install and run DRAW+SneakPeek locally and on Amazon elastic compute cloud is available at the National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site ( http://www.niagads.org/ ) and Wang lab Web site ( http://wanglab.pcbi.upenn.edu/ ). Contact: gerardsc@mail.med.upenn.edu or lswang@mail.med.upenn.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

76

Unknown

TFcheckpoint: a curated compendium of specific DNA-binding RNA polymerase II transcription factors (2013)

Chawla, K., Tripathi, S., Thommesen, L., Laegreid, A., Kuiper, M.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Gene regulatory network assembly and analysis requires high-quality knowledge sources that cover functional aspects of the various components of the gene regulatory machinery. A multiplicity of resources exists with information about mammalian transcription factors (TFs); yet, only few of these provide sufficiently accurate classifications of the functional roles of individual TFs, or standardized evidence that would justify the information on which these functional classifications are based. We compiled the list of all putative TFs from nine different resources, ignored factors such as general TFs, mediator complexes and chromatin modifiers, and for the remaining factors checked the available literature for references that support their function as a true sequence-specific DNA-binding RNA polymerase II TF (DbTF). The results are available in the TFcheckpoint database, an exhaustive collection of TFs annotated according to experimental and other evidence on their function as true DbTFs. TFcheckpoint.org provides a high-quality and comprehensive knowledge source for genome-scale regulatory network studies. Availability: The TFcheckpoint database is freely available at www.tfcheckpoint.org Contact: martin.kuiper@ntnu.no Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

77

Unknown

ROBNCA: robust network component analysis for recovering transcription factor activities (2013)

Noor, A., Ahmad, A., Serpedin, E., Nounou, M., Nounou, H.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of the gene expression data and prior information available about transcription factor (TF)–gene regulations. Most of the contemporary algorithms either exhibit the drawback of inconsistency and poor reliability, or suffer from prohibitive computational complexity. In addition, the existing algorithms do not possess the ability to counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. Results: We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). ROBNCA estimates the TF activity profiles as well as the TF–gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation and/or amount of outliers in case of synthetic data. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and is hundreds of times faster than NI-NCA. Availability: The ROBNCA software is available at http://people.tamu.edu/~amina/ROBNCA Contact: serpedin@ece.tamu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

78

Unknown

SAPA tool: finding protein regions by combination of amino acid composition, scaled profiles, patterns and rules (2013)

Maier, J., Adzhubei, A. A., Egge-Jacobsen, W.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Functional modules within protein sequences are often extracted by consensus sequence patterns representing a linear motif; however, other functional regions may only be described by combined features such as amino acid composition, profiles of amino acid properties and randomly distributed short sequence motifs. If only a small number of functional examples are well characterized, the researcher needs a tool to extract similar sequences for further investigation. Availability and Implementation: We provide the web application ‘SAPA tool’, which allows the user to search with combined properties, ranks the extracted target regions by an integrated score, estimates false discovery rates by using decoy sequences and provides them as a sequence file or spreadsheet. Source code, user manual and the web application implemented in Perl, HTML, CSS and JavaScript and running on Apache are freely available at http://sapa-tool.uio.no/sapa/ Contact: josef.maier@istls.de or w.m.egge-jacobsen@imbv.uio.no Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

79

Unknown

GALANT: a Cytoscape plugin for visualizing data as functional landscapes projected onto biological networks (2013)

Camilo, E., Bovolenta, L. A., Acencio, M. L., Rybarczyk-Filho, J. L., Castro, M. A. A., Moreira, J. C. F., Lemke, N.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Network-level visualization of functional data is a key aspect of both analysis and understanding of biological systems. In a continuing effort to create clear and integrated visualizations that facilitate the gathering of novel biological insights despite the overwhelming complexity of data, we present here the GrAph LANdscape VisualizaTion (GALANT), a Cytoscape plugin that builds functional landscapes onto biological networks. By using GALANT, it is possible to project any type of numerical data onto a network to create a smoothed data map resembling the network layout. As a Cytoscape plugin, GALANT is further improved by the functionalities of Cytoscape, the popular bioinformatics package for biological network visualization and data integration. Availability: http://www.lbbc.ibb.unesp.br/galant . Contact: esther@ibb.unesp.br Supplementary Information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

80

Unknown

BLAD: A comprehensive database of widely circulated beta-lactamases (2013)

Danishuddin, M., Hassan Baig, M., Kaushal, L., Khan, A. U.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: Motivation: Beta-lactamases confer resistance to a broad range of antibiotics and inhibitors by accumulating mutations. The number of beta-lactamases and their variants is steadily increasing. Horizontal gene transfer likely plays a major role in the dissemination of these markers to new environments and hosts. Moreover, information about the beta-lactamase classes and their variants was scattered. Categorizing all these classes and their associated variants along with their epidemiology and resistance pattern information on one platform could be helpful to researchers working on multidrug-resistant bacteria. Thus, the beta-lactamase database (BLAD) has been developed to provide comprehensive information (epidemiology and resistance pattern) on beta-lactamases. Beta-lactamase gene sequences in BLAD are linked with structural data, phenotypic data (i.e. antibiotic resistance) and literature references to experimental studies. In summary, BLAD integrates information that may provide insight into the epidemiology of multidrug resistance and enable the designing of novel drug candidates. Availability: The database can be accessed from the website www.blad.co.in . Contact: asad.k@rediffmail.com Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

81

Unknown

SubNet: a Java application for subnetwork extraction (2013)

Zhang, Q., Zhang, Z. D.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-09-20

Description: The extraction of targeted subnetworks is a powerful way to identify functional modules and pathways within complex networks. Here, we present SubNet, a Java-based stand-alone program for extracting subnetworks, given a basal network and a set of selected nodes. Designed with a graphical user-friendly interface, SubNet combines four different extraction methods, which offer the possibility to interrogate a biological network according to the question investigated. Of note, we developed a method based on the highly successful Google PageRank algorithm to extract the subnetwork using the node centrality metric, to which possible node weights of the selected genes can be incorporated. Availability: http://www.zdzlab.org/1/subnet.html Contact: zhengdong.zhang@einstein.yu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

82

Unknown

Density parameter estimation for finding clusters of homologous proteins--tracing actinobacterial pathogenicity lifestyles (2013)

Rottger, R., Kalaghatgi, P., Sun, P., Soares, S. d. C., Azevedo, V., Wittkop, T., Baumbach, J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: Homology detection is a long-standing challenge in computational biology. To tackle this problem, typically all-versus-all BLAST results are coupled with data partitioning approaches resulting in clusters of putative homologous proteins. One of the main problems, however, has been widely neglected: all clustering tools need a density parameter that adjusts the number and size of the clusters. This parameter is crucial but hard to estimate without gold standard data at hand. Developing a gold standard, however, is a difficult and time consuming task. Having a reliable method for detecting clusters of homologous proteins between a huge set of species would open opportunities for better understanding the genetic repertoire of bacteria with different lifestyles. Results: Our main contribution is a method for identifying a suitable and robust density parameter for protein homology detection without a given gold standard. Therefore, we study the core genome of 89 actinobacteria. This allows us to incorporate background knowledge, i.e. the assumption that a set of evolutionarily closely related species should share a comparably high number of evolutionarily conserved proteins (emerging from phylum-specific housekeeping genes). We apply our strategy to find genes/proteins that are specific for certain actinobacterial lifestyles, i.e. different types of pathogenicity. The whole study was performed with transitivity clustering, as it only requires a single intuitive density parameter and has been shown to be well applicable for the task of protein sequence clustering. Note, however, that the presented strategy generally does not depend on our clustering method but can easily be adapted to other clustering approaches. Availability: All results are publicly available at http://transclust.mmci.uni-saarland.de/actino_core/ or as Supplementary Material of this article. Contact: roettger@mpi-inf.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

83

Unknown

NURBS: a database of experimental and predicted nuclear receptor binding sites of mouse (2013)

Fang, Y., Liu, H.-X., Zhang, N., Guo, G. L., Wan, Y.-J. Y., Fang, J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Nuclear receptors (NRs) are a class of transcription factors playing important roles in various biological processes. An NR often impacts numerous genes, and different NRs share overlapping target networks. To fulfil the need for a database incorporating binding sites of different NRs at various conditions for easy comparison and visualization to improve our understanding of NR binding mechanisms, we have developed NURBS, a database of experimental and predicted nuclear receptor binding sites of mouse. NURBS currently contains binding sites across the whole mouse genome of 8 NRs identified in 40 chromatin immunoprecipitation with massively parallel DNA sequencing experiments. All datasets are processed using a widely used procedure and the same statistical criteria to ensure the binding sites derived from different datasets are comparable. NURBS also provides predicted binding sites using NR-HMM, a hidden Markov model (HMM). Availability: The GBrowse-based user interface of NURBS is freely accessible at http://shark.abl.ku.edu/nurbs/ . NR-HMM and all results can be downloaded for free at the website. Contact: jwfang@ku.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

84

Unknown

ImgLib2--generic image processing in Java (2013)

Pietzsch, T., Preibisch, S., Tomancak, P., Saalfeld, S.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

85

Unknown

Stability analysis of phylogenetic trees (2013)

Sheikh, S. I., Kahveci, T., Ranka, S., Gordon Burleigh, J.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: Phylogenetics, or reconstructing the evolutionary relationships of organisms, is critical for understanding evolution. A large number of heuristic algorithms for phylogenetics have been developed, some of which enable estimates of trees with tens of thousands of taxa. Such trees may not be robust, as small changes in the input data can cause major differences in the optimal topology. Tools that can assess the quality and stability of phylogenetic tree estimates and identify the most reliable parts of the tree are needed. Results: We define measures that assess the stability of trees, subtrees and individual taxa with respect to changes in the input sequences. Our measures consider changes at the finest granularity in the input data (i.e. individual nucleotides). We demonstrate the effectiveness of our measures on large published datasets. Our measures are computationally feasible for phylogenetic datasets consisting of tens of thousands of taxa. Availability: This software is available at http://bioinformatics.cise.ufl.edu/phylostab Contact: sheikh@cise.ufl.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

86

Unknown

An empirical Bayes approach for analysis of diverse periodic trends in time-course gene expression data (2013)

Kocak, M., Olusegun George, E., Pyne, S., Pounds, S.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: There is a substantial body of work in the biology literature that seeks to characterize the cyclic behavior of genes during cell division. Gene expression microarrays made it possible to measure the expression profiles of thousands of genes simultaneously in time-course experiments to assess changes in the expression levels of genes over time. In this context, the commonly used procedures for testing include the permutation test by de Lichtenberg et al. and Fisher’s G-test, both of which are designed to evaluate periodicity against noise. However, it is possible that a gene of interest may have expression that is neither cyclic nor just noise. Thus, there is a need for a new test for periodicity that can identify cyclic patterns against not only noise but also other non-cyclic patterns such as linear, quadratic or higher order polynomial patterns. Results: To address this weakness, we have introduced an empirical Bayes approach to test for periodicity and compare its performance in terms of sensitivity and specificity with that of the permutation test and Fisher’s G-test through extensive simulations and by application to a set of time-course experiments on the Schizosaccharomyces pombe cell-cycle gene expression. We use ‘conserved’ and ‘cycling’ genes by Lu et al. to assess the sensitivity and CESR genes by Chen et al. to assess the specificity of our new empirical Bayes method. Availability and implementation: The SAS Macro for our empirical Bayes test for periodicity is included in the supplementary materials along with a sample run of the MACRO program. Contact: mkocak1@uthsc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

87

Unknown

A Lasso multi-marker mixed model for association mapping with population structure correction (2013)

Rakitsch, B., Lippert, C., Stegle, O., Borgwardt, K.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: Exploring the genetic basis of heritable traits remains one of the central challenges in biomedical research. In traits with simple Mendelian architectures, single polymorphic loci explain a significant fraction of the phenotypic variability. However, many traits of interest seem to be subject to multifactorial control by groups of genetic loci. Accurate detection of such multivariate associations is non-trivial and often compromised by limited statistical power. At the same time, confounding influences, such as population structure, cause spurious association signals that result in false-positive findings. Results: We propose LMM-Lasso, a linear mixed model that allows for both multi-locus mapping and correction for confounding effects. Our approach is simple and free of tuning parameters; it effectively controls for population structure and scales to genome-wide datasets. LMM-Lasso simultaneously discovers likely causal variants and allows for multi-marker–based phenotype prediction from genotype. We demonstrate the practical use of LMM-Lasso in genome-wide association studies in Arabidopsis thaliana and linkage mapping in mouse, where our method achieves significantly more accurate phenotype prediction for 91% of the considered phenotypes. At the same time, our model dissects the phenotypic variability into components that result from individual single nucleotide polymorphism effects and population structure. Enrichment of known candidate genes suggests that the individual associations retrieved by LMM-Lasso are likely to be genuine. Availability: Code available under http://webdav.tuebingen.mpg.de/u/karsten/Forschung/research.html . Contact: rakitsch@tuebingen.mpg.de , ippert@microsoft.com or stegle@ebi.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

88

Unknown

iBAG: integrative Bayesian analysis of high-dimensional multiplatform genomics data (2013)

Wang, W., Baladandayuthapani, V., Morris, J. S., Broom, B. M., Manyam, G., Do, K.-A.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: Analyzing data from multi-platform genomics experiments combined with patients’ clinical outcomes helps us understand the complex biological processes that characterize a disease, as well as how these processes relate to the development of the disease. Current data integration approaches are limited in that they do not consider the fundamental biological relationships that exist among the data obtained from different platforms. Statistical Model: We propose an integrative Bayesian analysis of genomics data (iBAG) framework for identifying important genes/biomarkers that are associated with clinical outcome. This framework uses hierarchical modeling to combine the data obtained from multiple platforms into one model. Results: We assess the performance of our methods using several synthetic and real examples. Simulations show our integrative methods to have higher power to detect disease-related genes than non-integrative methods. Using the Cancer Genome Atlas glioblastoma dataset, we apply the iBAG model to integrate gene expression and methylation data to study their associations with patient survival. Our proposed method discovers multiple methylation-regulated genes that are related to patient survival, most of which have important biological functions in other diseases but have not been previously studied in glioblastoma. Availability: http://odin.mdacc.tmc.edu/~vbaladan/ . Contact: veera@mdanderson.org Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

89

Unknown

Scaffolding low quality genomes using orthologous protein sequences (2013)

Li, Y. I., Copley, R. R.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: The ready availability of next-generation sequencing has led to a situation where it is easy to produce very fragmentary genome assemblies. We present a pipeline, SWiPS (Scaffolding With Protein Sequences), that uses orthologous proteins to improve low quality genome assemblies. The protein sequences are used as guides to scaffold existing contigs, while simultaneously allowing the gene structure to be predicted by homology. Results: To perform, SWiPS does not depend on a high N50 or whole proteins being encoded on a single contig. We tested our algorithm on simulated next-generation data from Ciona intestinalis, real next-generation data from Drosophila melanogaster, a complex genome assembly of Homo sapiens and the low coverage Sanger sequence assembly of Callorhinchus milii. The improvements in N50 are of the order of ~20% for the C. intestinalis and H. sapiens assemblies, which is significant, considering the large size of intergenic regions in these eukaryotes. Using the CEGMA pipeline to assess the gene space represented in the genome assemblies, the number of genes retrieved increased by >110% for C. milii and from 20 to 40% for C. intestinalis. The scaffold error rates are low: 85–90% of scaffolds are fully correct, and >95% of local contig joins are correct. Availability: SWiPS is available freely for download at http://www.well.ox.ac.uk/~yli142/swips.html . Contact: yang.li@well.ox.ac.uk or copley@well.ox.ac.uk

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

90

Unknown

Bridging the scales: semantic integration of quantitative SBML in graphical multi-cellular models and simulations with EPISIM and COPASI (2013)

Sutterlin, T., Kolb, C., Dickhaus, H., Jager, D., Grabe, N.

Oxford University Press

In: Bioinformatics

Publication Date: 2013-01-17

Description: Motivation: Biological reality can be comprehensively represented in silico only in multi-scaled models. To this end, cell behavioural models addressing the multi-cellular level have to be semantically linked with mechanistic molecular models. These requirements have to be met by flexible software workflows solving the issues of different time scales, inter-model variable referencing and flexible sub-model embedding. Results: We developed a novel software workflow (EPISIM) for the semantic integration of Systems Biology Markup Language (SBML)-based quantitative models in multi-scaled tissue models and simulations. This workflow allows SBML-based models to be imported and accessed. SBML model species, reactions and parameters are semantically integrated in cell behavioural models (CBM) represented by graphical process diagrams. In this way, cellular states like proliferation and differentiation can be flexibly linked to gene-regulatory or biochemical reaction networks. For a multi-scale agent-based tissue simulation, executable code is automatically generated in which the different time scales of imported SBML models and CBM have been mapped. We demonstrate the capabilities of the novel software workflow by integrating Tyson’s cell cycle model in our model of human epidermal tissue homeostasis. Finally, we show the semantic interplay of the different biological scales during tissue simulation. Availability: The EPISIM platform is available as binary executables for Windows, Linux and Mac OS X at http://www.tiga.uni-hd.de . Supplementary data are available at http://www.tiga.uni-hd.de/supplements/SemSBMLIntegration.html . Contact: niels.grabe@bioquant.uni-heidelberg.de

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

91

Unknown

A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data (2013)

Teschendorff, A. E., Marabita, F., Lechner, M., Bartlett, T., Tegner, J., Gomez-Cabrero, D., Beck, S.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: The Illumina Infinium 450 k DNA Methylation Beadchip is a prime candidate technology for Epigenome-Wide Association Studies (EWAS). However, a difficulty associated with these beadarrays is that probes come in two different designs, characterized by widely different DNA methylation distributions and dynamic range, which may bias downstream analyses. A key statistical issue is therefore how best to adjust for the two different probe designs. Results: Here we propose a novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes. The strategy involves application of a three-state beta-mixture model to assign probes to methylation states, subsequent transformation of probabilities into quantiles and finally a methylation-dependent dilation transformation to preserve the monotonicity and continuity of the data. We validate our method on cell-line data, fresh frozen and paraffin-embedded tumour tissue samples and demonstrate that BMIQ compares favourably with two competing methods. Specifically, we show that BMIQ improves the robustness of the normalization procedure, reduces the technical variation and bias of type2 probe values and successfully eliminates the type1 enrichment bias caused by the lower dynamic range of type2 probes. BMIQ will be useful as a preprocessing step for any study using the Illumina Infinium 450 k platform. Availability: BMIQ is freely available from http://code.google.com/p/bmiq/ . Contact: a.teschendorff@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

92

Unknown

Drug-target interaction prediction by learning from local information and neighbors (2013)

Mei, J.-P., Kwoh, C.-K., Yang, P., Li, X.-L., Zheng, J.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: In silico methods provide efficient ways to predict possible interactions between drugs and targets. A supervised learning approach, the bipartite local model (BLM), has recently been shown to be effective in the prediction of drug–target interactions. However, for drug-candidate compounds or target-candidate proteins that currently have no known interactions available, its purely ‘local’ model cannot be learned, and hence BLM may fail to make correct predictions involving such new candidates. Results: We present a simple procedure called neighbor-based interaction-profile inferring (NII) and integrate it into the existing BLM method to handle the new candidate problem. Specifically, the inferred interaction profile is treated as label information and is used for model learning of new candidates. This functionality is particularly important in practice to find targets for new drug-candidate compounds and identify targeting drugs for new target-candidate proteins. Consistently good performance of the new BLM–NII approach has been observed in the experiment for the prediction of interactions between drugs and four categories of target proteins. Especially for nuclear receptors, BLM–NII achieves the most significant improvement as this dataset contains many drugs/targets with no interactions in the cross-validation. This demonstrates the effectiveness of the NII strategy and also shows the great potential of BLM–NII for prediction of compound–protein interactions. Contact: jpmei@ntu.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

93

Unknown

Efficient statistical significance approximation for local similarity analysis of high-throughput time series data (2013)

Xia, L. C., Ai, D., Cram, J., Fuhrman, J. A., Sun, F.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: Local similarity analysis of biological time series data helps elucidate the varying dynamics of biological systems. However, its applications to large scale high-throughput data are limited by slow permutation procedures for statistical significance evaluation. Results: We developed a theoretical approach to approximate the statistical significance of local similarity analysis based on the approximate tail distribution of the maximum partial sum of independent identically distributed (i.i.d.) random variables. Simulations show that the derived formula approximates the tail distribution reasonably well (starting at time points with no delay and with delay) and provides P-values comparable with those from permutations. The new approach enables efficient calculation of statistical significance for pairwise local similarity analysis, making possible all-to-all local association studies otherwise prohibitive. As a demonstration, local similarity analysis of human microbiome time series shows that core operational taxonomic units (OTUs) are highly synergetic and some of the associations are body-site specific across samples. Availability: The new approach is implemented in our eLSA package, which now provides pipelines for faster local similarity analysis of time series data. The tool is freely available from eLSA’s website: http://meta.usc.edu/softs/lsa . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: fsun@usc.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

94

Unknown

A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees (2013)

Silberstein, M., Weissbrod, O., Otten, L., Tzemach, A., Anisenia, A., Shtark, O., Tuberg, D., Galfrin, E., Gannon, I., Shalata, A., Borochowitz, Z. U., Dechter, R., Thompson, E., Geiger, D.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a new powerful online system that streamlines the linkage analysis of SNP data. It features a fully integrated flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes. Results: Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms, and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov Chain–Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman–Rubin Score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum-likelihood haplotyping. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system's capabilities through a study of a large complex pedigree affected with metabolic syndrome. Availability: Superlink-Online SNP is freely available for researchers at http://cbl-hap.cs.technion.ac.il/superlink-snp . The system source code can also be downloaded from the system website. Contact: omerw@cs.technion.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
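
Note on the record above: the description mentions running the same MCMC computation on multiple CPUs and evaluating the Gelman–Rubin score to set aside unreliable results. The following is a minimal sketch of that diagnostic in its textbook form, not the Superlink-Online SNP implementation. The function names `gelman_rubin` and `is_reliable` and the cutoff of 1.1 are assumptions made for illustration; the abstract does not state which threshold the system uses.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Return the potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0]) if chains else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # W: mean of the within-chain sample variances
    within = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means
    between = n * variance([mean(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero; the score is undefined")
    # Pooled estimate of the target variance
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


def is_reliable(chains, threshold=1.1):
    # threshold=1.1 is a common rule of thumb, assumed here for illustration
    return gelman_rubin(chains) < threshold


if __name__ == "__main__":
    agreeing = [[1, 2, 3, 4], [1, 2, 3, 4]]
    disagreeing = [[0, 1, 0, 1], [10, 11, 10, 11]]
    print(is_reliable(agreeing), is_reliable(disagreeing))
```

Chains that sample the same region give a score near or below 1, while chains stuck in different regions give a score well above the cutoff and would be set aside.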

95

Unknown

VAGUE: a graphical user interface for the Velvet assembler (2013)

Powell, D. R., Seemann, T.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Velvet is a popular open-source de novo genome assembly software tool, which is run from the Unix command line. Most of the problems experienced by new users of Velvet revolve around constructing syntactically and semantically correct command lines, getting input files into acceptable formats and assessing the output. Here, we present Velvet Assembler Graphical User Environment (VAGUE), a multi-platform graphical front-end for Velvet. VAGUE aims to make sequence assembly accessible to a wider audience and to facilitate better usage amongst existing users of Velvet. Availability and implementation: VAGUE is implemented in JRuby and targets the Java Virtual Machine. It is available under an open-source GPLv2 licence from http://www.vicbioinformatics.com/ . Contact: torsten.seemann@monash.edu

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

96

Unknown

PASS-bis: a bisulfite aligner suitable for whole methylome analysis of Illumina and SOLiD reads (2013)

Campagna, D., Telatin, A., Forcato, C., Vitulo, N., Valle, G.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: The sequencing of bisulfite-treated DNA (Bi-Seq) is becoming a gold standard for methylation studies. The mapping of Bi-Seq reads is complex and requires special alignment algorithms. This problem is particularly relevant for SOLiD color space, where the bisulfite conversion C/T changes two adjacent colors into 16 possible combinations. Here, we present an algorithm that efficiently aligns Bi-Seq reads obtained either from SOLiD or Illumina. An accompanying methylation-caller program creates a genomic view of methylated and unmethylated Cs on both DNA strands. Availability and implementation: The algorithm has been implemented as an option of the program PASS, freely available at http://pass.cribi.unipd.it . Contact: pass@cribi.unipd.it Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

97

Unknown

Defining and predicting structurally conserved regions in protein superfamilies (2013)

Huang, I. K., Pei, J., Grishin, N. V.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment. Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions. Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR . Contact: 91huangi@gmail.com or grishin@chop.swmed.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
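
Note on the record above: the description reports prediction quality as accuracy and Matthews correlation coefficient (MCC). As a reminder of how these two figures relate to a binary confusion matrix (SCR versus non-SCR positions), here is a small sketch using the standard definitions. The function name `accuracy_and_mcc` and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def accuracy_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and Matthews correlation coefficient from counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


if __name__ == "__main__":
    # Hypothetical counts, chosen only to make the arithmetic easy to follow
    acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
    print(f"accuracy={acc:.3f} MCC={mcc:.3f}")
```

Unlike accuracy, MCC takes all four cells of the confusion matrix into account, so it remains informative when one class is much more frequent than the other.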

98

Unknown

An effective framework for reconstructing gene regulatory networks from genetical genomics data (2013)

Flassig, R. J., Heise, S., Sundmacher, K., Klamt, S.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: Systems Genetics approaches, in particular those relying on genetical genomics data, put forward a new paradigm of large-scale genome and network analysis. These methods use naturally occurring multi-factorial perturbations (e.g. polymorphisms) in properly controlled and screened genetic crosses to elucidate causal relationships in biological networks. However, although genetical genomics data contain rich information, a clear dissection of causes and effects as required for reconstructing gene regulatory networks is not easily possible. Results: We present a framework for reconstructing gene regulatory networks from genetical genomics data where genotype and phenotype correlation measures are used to derive an initial graph which is subsequently reduced by pruning strategies to minimize false positive predictions. Applied to realistic simulated genetic data from a recent DREAM challenge, we demonstrate that our approach is simple yet effective and outperforms more complex methods (including the best performer) with respect to (i) reconstruction quality (especially for small sample sizes) and (ii) applicability to large data sets due to relatively low computational costs. We also present reconstruction results from real genetical genomics data of yeast. Availability: A MATLAB implementation (script) of the reconstruction framework is available at www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html Contact: klamt@mpi-magdeburg.mpg.de

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

99

Unknown

Non-redundant compendium of human ncRNA genes in GeneCards (2013)

Belinky, F., Bahir, I., Stelzer, G., Zimmerman, S., Rosen, N., Nativ, N., Dalah, I., Iny Stein, T., Rappaport, N., Mituyama, T., Safran, M., Lancet, D.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: Non-coding RNA (ncRNA) genes are increasingly acknowledged for their importance in the human genome. However, there is no comprehensive non-redundant database for all such human genes. Results: We leveraged the effective platform of GeneCards, the human gene compendium, together with the power of fRNAdb and additional primary sources, to judiciously unify all ncRNA gene entries obtainable from 15 different primary sources. Overlapping entries were clustered to unified locations based on an algorithm employing genomic coordinates. This allowed GeneCards’ gamut of relevant entries to rise ~5-fold, resulting in ~80 000 human non-redundant ncRNAs, belonging to 14 classes. Such ‘grand unification’ within a regularly updated data structure will assist future ncRNA research. Availability and implementation: All of these non-coding RNAs are included among the ~122 500 entries in GeneCards V3.09, along with pertinent annotation, automatically mined by its built-in pipeline from 100 data sources. This information is available at www.genecards.org . Contact: Frida.Belinky@weizmann.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
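
Note on the record above: the description says that overlapping entries were clustered to unified locations based on genomic coordinates. The abstract does not give the algorithm, so the following is only a generic interval-merging sketch of that idea, not the GeneCards procedure. The function name `cluster_overlapping`, the tuple layout `(chrom, start, end, name)` and the sample entries are all hypothetical.

```python
def cluster_overlapping(entries):
    """Group entries whose coordinate ranges overlap on the same chromosome.

    Each entry is a tuple (chrom, start, end, name) with inclusive coordinates.
    """
    clusters = []
    # Sorting by chromosome and start lets one pass find every overlap
    for chrom, start, end, name in sorted(entries):
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Overlaps the current cluster: extend it and record the member
            last["end"] = max(last["end"], end)
            last["members"].append(name)
        else:
            clusters.append(
                {"chrom": chrom, "start": start, "end": end, "members": [name]}
            )
    return clusters


if __name__ == "__main__":
    # Hypothetical entries from different sources
    entries = [
        ("chr1", 100, 200, "srcA:1"),
        ("chr1", 150, 300, "srcB:7"),
        ("chr1", 400, 500, "srcA:2"),
        ("chr2", 120, 180, "srcC:3"),
    ]
    for cluster in cluster_overlapping(entries):
        print(cluster)
```

A production pipeline would likely also consider strand and ncRNA class before unifying entries; those criteria are omitted here because the abstract does not describe them.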

100

Unknown

DOOSS: a tool for visual analysis of data overlaid on secondary structures (2013)

Golden, M., Martin, D.

Oxford University Press

In: Bioinformatics

Details

Publication Date: 2013-01-17

Description: Motivation: DOOSS (Data Overlaid On Secondary Structures) is a tool for visualizing annotated secondary structures of large single-stranded nucleotide sequences (such as full-length virus genomes). The purpose of this tool is to assist investigators in evaluating the biological relevance of secondary structures within particular sequences. Availability and implementation: DOOSS is written in Java and is available from: http://dooss.computingforbiology.org Contact: michaelgolden0@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
