ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

Confident difference criterion: a new Bayesian differentially expressed gene selection algorithm with applications (2015)

Fang Yu; Ming-Hui Chen; Lynn Kuo; Heather Talbott; John Davis

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-08

Description: Background: Recently, the Bayesian method becomes more popular for analyzing high dimensional gene expression data as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators. Results: In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387–404), we propose two new gene selection algorithms for general Bayesian models and name these new methods as the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using alibrated Bayes factors. Statistica Sinica. 2008;18:783–802) is established under the normal-normal-model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology. In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods. Conclusion: The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

2

Unknown

Automated interpretation of 3D laserscanned point clouds for plant organ segmentation (2015)

Mirwaes Wahabzada; Stefan Paulus; Kristian Kersting; Anne-Katrin Mahlein

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-09

Description: Background: Plant organ segmentation from 3D point clouds is a relevant task for plant phenotyping and plant growth observation. Automated solutions are required to increase the efficiency of recent high-throughput plant phenotyping pipelines. However, plant geometrical properties vary with time, among observation scales and different plant types. The main objective of the present research is to develop a fully automated, fast and reliable data driven approach for plant organ segmentation. Results: The automated segmentation of plant organs using unsupervised, clustering methods is crucial in cases where the goal is to get fast insights into the data or no labeled data is available or costly to achieve. For this we propose and compare data driven approaches that are easy-to-realize and make the use of standard algorithms possible. Since normalized histograms, acquired from 3D point clouds, can be seen as samples from a probability simplex, we propose to map the data from the simplex space into Euclidean space using Aitchisons log ratio transformation, or into the positive quadrant of the unit sphere using square root transformation. This, in turn, paves the way to a wide range of commonly used analysis techniques that are based on measuring the similarities between data points using Euclidean distance. We investigate the performance of the resulting approaches in the practical context of grouping 3D point clouds and demonstrate empirically that they lead to clustering results with high accuracy for monocotyledonous and dicotyledonous plant species with diverse shoot architecture. Conclusion: An automated segmentation of 3D point clouds is demonstrated in the present work. Within seconds first insights into plant data can be deviated – even from non-labelled data. This approach is applicable to different plant species with high accuracy. The analysis cascade can be implemented in future high-throughput phenotyping scenarios and will support the evaluation of the performance of different plant genotypes exposed to stress or in different environmental scenarios.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

3

Unknown

A simplicial complex-based approach to unmixing tumor progression data (2015)

Theodore Roman; Amir Nayyeri; Brittany Fasy; Russell Schwartz

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-13

Description: Background: Tumorigenesis is an evolutionary process by which tumor cells acquire mutations through successive diversification and differentiation. There is much interest in reconstructing this process of evolution due to its relevance to identifying drivers of mutation and predicting future prognosis and drug response. Efforts are challenged by high tumor heterogeneity, though, both within and among patients. In prior work, we showed that this heterogeneity could be turned into an advantage by computationally reconstructing models of cell populations mixed to different degrees in distinct tumors. Such mixed membership model approaches, however, are still limited in their ability to dissect more than a few well-conserved cell populations across a tumor data set. Results: We present a method to improve on current mixed membership model approaches by better accounting for conserved progression pathways between subsets of cancers, which imply a structure to the data that has not previously been exploited. We extend our prior methods, which use an interpretation of the mixture problem as that of reconstructing simple geometric objects called simplices, to instead search for structured unions of simplices called simplicial complexes that one would expect to emerge from mixture processes describing branches along an evolutionary tree. We further improve on the prior work with a novel objective function to better identify mixtures corresponding to parsimonious evolutionary tree models. We demonstrate that this approach improves on our ability to accurately resolve mixtures on simulated data sets and demonstrate its practical applicability on a large RNASeq tumor data set. Conclusions: Better exploiting the expected geometric structure for mixed membership models produced from common evolutionary trees allows us to quickly and accurately reconstruct models of cell populations sampled from those trees. In the process, we hope to develop a better understanding of tumor evolution as well as other biological problems that involve interpreting genomic data gathered from heterogeneous populations of cells.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

4

Unknown

MultiSETTER: web server for multiple RNA structure comparison (2015)

Petr ¿ech; David Hoksza; Daniel Svozil

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-13

Description: Background: Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred by a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. Results: In this paper, we present the updated version of the SETTER web server that implements a user friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as the list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. Conclusion: To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for a multiple RNA structure alignment. The MultiSETTER server offers the visual inspection of an alignment in 3D space which may reveal structural and functional relationships not captured by other multiple alignment methods based either on a sequence or on secondary structure motifs.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

5

Unknown

ImmunExplorer (IMEX): a software framework for diversity and clonality analyses of immunoglobulins and T cell receptors on the basis of IMGT/HighV-QUEST preprocessed NGS data (2015)

Susanne Schaller; Johannes Weinberger; Raul Jimenez-Heredia; Martin Danzer; Rainer Oberbauer; Christian Gabriel; Stephan Winkler

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-13

Description: Background: Today’s modern research of B and T cell antigen receptors (the immunoglobulins (IG) or antibodies and T cell receptors (TR)) forms the basis for detailed analyses of the human adaptive immune system. For instance, insights in the state of the adaptive immune system provide information that is essentially important in monitoring transplantation processes and the regulation of immune suppressiva. In this context, algorithms and tools are necessary for analyzing the IG and TR diversity on nucleotide as well as on amino acid sequence level, identifying highly proliferated clonotypes, determining the diversity of the cell repertoire found in a sample, comparing different states of the human immune system, and visualizing all relevant information. Results: We here present IMEX, a software framework for the detailed characterization and visualization of the state of human IG and TR repertoires. IMEX offers a broad range of algorithms for statistical analysis of IG and TR data, CDR and V-(D)-J analysis, diversity analysis by calculating the distribution of IG and TR, calculating primer efficiency, and comparing multiple data sets. We use a mathematical model that is able to describe the number of unique clonotypes in a sample taking into account the true number of unique sequences and read errors; we heuristically optimize the parameters of this model. IMEX uses IMGT/HighV-QUEST analysis outputs and includes methods for splitting and merging to enable the submission to this portal and to combine the outputs results, respectively. All calculation results can be visualized and exported. Conclusion: IMEX is an user-friendly and flexible framework for performing clonality experiments based on CDR and V-(D)-J rearranged regions, diversity analysis, primer efficiency, and various different visualization experiments. Using IMEX, various immunological reactions and alterations can be investigated in detail. IMEX is freely available for Windows and Unix platforms at http://bioinformatics.fh-hagenberg.at/immunexplorer/.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

6

Unknown

TMalphaDB and TMbetaDB: web servers to study the structural role of sequence motifs in α-helix and β-barrel domains of membrane proteins (2015)

Marc Perea; Ivar Lugtenburg; Eduardo Mayol; Arnau CordomíXavier DeupíLeonardo Pardo; Mireia Olivella

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-21

Description: Background: Membrane proteins represent over 25 % of human protein genes and account for more than 60 % of drug targets due to their accessibility from the extracellular environment. The increasing number of available crystal structures of these proteins in the Protein Data Bank permits an initial estimation of their structural properties.DescriptionWe have developed two web servers—TMalphaDB for α-helix bundles and TMbetaDB for β-barrels—to analyse the growing repertoire of available crystal structures of membrane proteins. TMalphaDB and TMbetaDB permit to search for these specific sequence motifs in a non-redundant structure database of transmembrane segments and quantify structural parameters such as ϕ and ψ backbone dihedral angles, χ 1 side chain torsion angle, unit bend and unit twist. Conclusions: The structural information offered by TMalphaDB and TMbetaDB permits to quantify structural distortions induced by specific sequence motifs, and to elucidate their role in the 3D structure. This specific structural information has direct implications in homology modeling of the growing sequences of membrane proteins lacking experimental structure. TMalphaDB and TMbetaDB are freely available at http://lmc.uab.cat/TMalphaDB and http://lmc.uab.cat/TMbetaDB.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

7

Unknown

Reliable scaling of position weight matrices for binding strength comparisons between transcription factors (2015)

Xiaoyan Ma; Daphne Ezer; Carmen Navarro; Boris Adryan

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-21

Description: Background: Scoring DNA sequences against Position Weight Matrices (PWMs) is a widely adopted method to identify putative transcription factor binding sites. While common bioinformatics tools produce scores that can reflect the binding strength between a specific transcription factor and the DNA, these scores are not directly comparable between different transcription factors. Other methods, including p-value associated approaches (Touzet H, Varré J-S. Efficient and accurate p-value computation for position weight matrices. Algorithms Mol Biol. 2007;2(1510.1186):1748–7188), provide more rigorous ways to identify potential binding sites, but their results are difficult to interpret in terms of binding energy, which is essential for the modeling of transcription factor binding dynamics and enhancer activities. Results: Here, we provide two different ways to find the scaling parameter λ that allows us to infer binding energy from a PWM score. The first approach uses a PWM and background genomic sequence as input to estimate λ for a specific transcription factor, which we applied to show that λ distributions for different transcription factor families correspond with their DNA binding properties. Our second method can reliably convert λ between different PWMs of the same transcription factor, which allows us to directly compare PWMs that were generated by different approaches. Conclusion: These two approaches provide computationally efficient ways to scale PWM scores and estimate the strength of transcription factor binding sites in quantitative studies of binding dynamics. Their results are consistent with each other and previous reports in most of cases.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

8

Unknown

Automatically visualise and analyse data on pathways using PathVisioRPC from any programming environment (2015)

Anwesha Bohler; Lars Eijssen; Martijn van Iersel; Christ Leemans; Egon Willighagen; Martina Kutmon; Magali Jaillard; Chris Evelo

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-24

Description: Background: Biological pathways are descriptive diagrams of biological processes widely used for functional analysis of differentially expressed genes or proteins. Primary data analysis, such as quality control, normalisation, and statistical analysis, is often performed in scripting languages like R, Perl, and Python. Subsequent pathway analysis is usually performed using dedicated external applications. Workflows involving manual use of multiple environments are time consuming and error prone. Therefore, tools are needed that enable pathway analysis directly within the same scripting languages used for primary data analyses. Existing tools have limited capability in terms of available pathway content, pathway editing and visualisation options, and export file formats. Consequently, making the full-fledged pathway analysis tool PathVisio available from various scripting languages will benefit researchers. Results: We developed PathVisioRPC, an XMLRPC interface for the pathway analysis software PathVisio. PathVisioRPC enables creating and editing biological pathways, visualising data on pathways, performing pathway statistics, and exporting results in several image formats in multiple programming environments.We demonstrate PathVisioRPC functionalities using examples in Python. Subsequently, we analyse a publicly available NCBI GEO gene expression dataset studying tumour bearing mice treated with cyclophosphamide in R. The R scripts demonstrate how calls to existing R packages for data processing and calls to PathVisioRPC can directly work together. To further support R users, we have created RPathVisio simplifying the use of PathVisioRPC in this environment. We have also created a pathway module for the microarray data analysis portal ArrayAnalysis.org that calls the PathVisioRPC interface to perform pathway analysis. This module allows users to use PathVisio functionality online without having to download and install the software and exemplifies how the PathVisioRPC interface can be used by data analysis pipelines for functional analysis of processed genomics data. Conclusions: PathVisioRPC enables data visualisation and pathway analysis directly from within various analytical environments used for preliminary analyses. It supports the use of existing pathways from WikiPathways or pathways created using the RPC itself. It also enables automation of tasks performed using PathVisio, making it useful to PathVisio users performing repeated visualisation and analysis tasks. PathVisioRPC is freely available for academic and commercial use at http://projects.bigcat.unimaas.nl/pathvisiorpc.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

9

Unknown

A convex formulation for joint RNA isoform detection and quantification from multiple RNA-seq samples (2015)

Elsa Bernard; Laurent Jacob; Julien Mairal; Eric Viara; Jean-Philippe Vert

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-20

Description: Background: Detecting and quantifying isoforms from RNA-seq data is an important but challenging task. The problem is often ill-posed, particularly at low coverage. One promising direction is to exploit several samples simultaneously. Results: We propose a new method for solving the isoform deconvolution problem jointly across several samples. We formulate a convex optimization problem that allows to share information between samples and that we solve efficiently. We demonstrate the benefits of combining several samples on simulated and real data, and show that our approach outperforms pooling strategies and methods based on integer programming. Conclusion: Our convex formulation to jointly detect and quantify isoforms from RNA-seq data of multiple related samples is a computationally efficient approach to leverage the hypotheses that some isoforms are likely to be present in several samples. The software and source code are available at http://cbio.ensmp.fr/flipflop.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

10

Unknown

A cascade computer model for mocrobicide diffusivity from mucoadhesive formulations (2015)

Yugyung Lee; Alok Khemka; Gayathri Acharya; Namita Giri; Chi Lee

BioMed Central

In: BMC Bioinformatics

add to mindlist on the mindlist

Details

Publication Date: 2015-08-20

Description: Background: The cascade computer model (CCM) was designed as a machine-learning feature platform for prediction of drug diffusivity from the mucoadhesive formulations. Three basic models (the statistical regression model, the K nearest neighbor model and the modified version of the back propagation neural network) in CCM operate sequentially in close collaboration with each other, employing the estimated value obtained from the afore-positioned base model as an input value to the next-positioned base model in the cascade.The effects of various parameters on the pharmacological efficacy of a female controlled drug delivery system (FcDDS) intended for prevention of women from HIV-1 infection were evaluated using an in vitro apparatus “Simulant Vaginal System” (SVS). We used computer simulations to explicitly examine the changes in drug diffusivity from FcDDS and determine the prognostic potency of each variable for in vivo prediction of formulation efficacy. The results obtained using the CCM approach were compared with those from individual multiple regression model. Results: CCM significantly lowered the percentage mean error (PME) and enhanced r 2 values as compared with those from the multiple regression models. It was noted that CCM generated the PME value of 21.82 at 48169 epoch iterations, which is significantly improved from the PME value of 29.91 % at 118344 epochs by the back propagation network model. The results of this study indicated that the sequential ensemble of the classifiers allowed for an accurate prediction of the domain with significantly lowered variance and considerably reduces the time required for training phase. Conclusion: CCM is accurate, easy to operate, time and cost-effective, and thus, can serve as a valuable tool for prediction of drug diffusivity from mucoadhesive formulations. CCM may yield new insights into understanding how drugs are diffused from the carrier systems and exert their efficacies under various clinical conditions.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext