ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Publication Date: 2015-08-08
    Description: Background: Recently, the Bayesian method has become more popular for analyzing high-dimensional gene expression data, as it allows us to borrow information across different genes and provides powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators. Results: In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387–404), we propose two new gene selection algorithms for general Bayesian models and name these new methods the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using calibrated Bayes factors. Statistica Sinica. 2008;18:783–802) is established under the normal-normal model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology. In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods. Conclusion: The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 2
    Publication Date: 2015-08-09
    Description: Background: Plant organ segmentation from 3D point clouds is a relevant task for plant phenotyping and plant growth observation. Automated solutions are required to increase the efficiency of recent high-throughput plant phenotyping pipelines. However, plant geometrical properties vary with time, among observation scales and different plant types. The main objective of the present research is to develop a fully automated, fast and reliable data-driven approach for plant organ segmentation. Results: The automated segmentation of plant organs using unsupervised clustering methods is crucial in cases where the goal is to get fast insights into the data or where labeled data is unavailable or costly to obtain. For this we propose and compare data-driven approaches that are easy to realize and make the use of standard algorithms possible. Since normalized histograms, acquired from 3D point clouds, can be seen as samples from a probability simplex, we propose to map the data from the simplex space into Euclidean space using Aitchison's log-ratio transformation, or into the positive quadrant of the unit sphere using the square root transformation. This, in turn, paves the way to a wide range of commonly used analysis techniques that are based on measuring the similarities between data points using Euclidean distance. We investigate the performance of the resulting approaches in the practical context of grouping 3D point clouds and demonstrate empirically that they lead to clustering results with high accuracy for monocotyledonous and dicotyledonous plant species with diverse shoot architecture. Conclusion: An automated segmentation of 3D point clouds is demonstrated in the present work. Within seconds, first insights into plant data can be derived, even from non-labelled data. This approach is applicable to different plant species with high accuracy. The analysis cascade can be implemented in future high-throughput phenotyping scenarios and will support the evaluation of the performance of different plant genotypes exposed to stress or in different environmental scenarios.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
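The two mappings named in this abstract can be illustrated with a minimal sketch. It assumes the centered log-ratio (clr) variant of Aitchison's transformation, which the abstract does not specify, and a small pseudo-count `eps` to avoid taking the logarithm of empty histogram bins. The function names are illustrative and are not taken from the paper.

```python
import math

def normalize(hist):
    """Scale a histogram so its bins sum to 1 (a point on the probability simplex)."""
    total = sum(hist)
    return [h / total for h in hist]

def clr(p, eps=1e-9):
    """Centered log-ratio transform (assumed variant of Aitchison's log-ratio).

    eps is an assumed pseudo-count that keeps log() defined for empty bins.
    """
    q = [x + eps for x in p]
    s = sum(q)
    logs = [math.log(x / s) for x in q]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def sqrt_transform(p):
    """Square root transform: maps the simplex onto the positive part of the unit sphere."""
    return [math.sqrt(x) for x in p]

def euclidean(a, b):
    """Plain Euclidean distance, usable after either transform."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    p1 = normalize([2, 3, 5])
    p2 = normalize([1, 1, 8])
    print(euclidean(clr(p1), clr(p2)))
    print(euclidean(sqrt_transform(p1), sqrt_transform(p2)))
```

After either transform, standard Euclidean tools such as k-means can be applied to the histograms. The square root image has unit length because the squared components sum back to the original probabilities.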
  • 3
    Publication Date: 2015-08-12
    Description: As internet technologies make their way into developing areas, so too does the possibility of education and training being delivered to the people living in those previously unserved areas. The growing catalogue of free, high-quality courseware, when combined with the newly acquired means of delivery, creates the potential for millions of people in the developing world to acquire a good education. Yet a good education obviously requires more than simply delivering information; students must also receive high-quality feedback on their assessments. They must be told how their performance compares with the ideal, and be shown how to close the gap between the two. However, delivering high-quality feedback is labor-intensive, and therefore expensive, and has long been recognized as a problematic issue by educators. This paper outlines a case study that uses a Learning Management System (LMS) to efficiently deliver detailed feedback that is informed by the principles of best practice. We make the case that the efficiencies of this method allow for large-scale courses with thousands of enrolments that are accessible to developing and developed areas alike. We explore the question: is computer-mediated feedback delivery efficient and effective, and might it be applied to large-scale courses at low cost?
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
  • 4
    Publication Date: 2015-08-13
    Description: Background: Tumorigenesis is an evolutionary process by which tumor cells acquire mutations through successive diversification and differentiation. There is much interest in reconstructing this process of evolution due to its relevance to identifying drivers of mutation and predicting future prognosis and drug response. Efforts are challenged by high tumor heterogeneity, though, both within and among patients. In prior work, we showed that this heterogeneity could be turned into an advantage by computationally reconstructing models of cell populations mixed to different degrees in distinct tumors. Such mixed membership model approaches, however, are still limited in their ability to dissect more than a few well-conserved cell populations across a tumor data set. Results: We present a method to improve on current mixed membership model approaches by better accounting for conserved progression pathways between subsets of cancers, which imply a structure to the data that has not previously been exploited. We extend our prior methods, which use an interpretation of the mixture problem as that of reconstructing simple geometric objects called simplices, to instead search for structured unions of simplices called simplicial complexes that one would expect to emerge from mixture processes describing branches along an evolutionary tree. We further improve on the prior work with a novel objective function to better identify mixtures corresponding to parsimonious evolutionary tree models. We demonstrate that this approach improves on our ability to accurately resolve mixtures on simulated data sets and demonstrate its practical applicability on a large RNASeq tumor data set. Conclusions: Better exploiting the expected geometric structure for mixed membership models produced from common evolutionary trees allows us to quickly and accurately reconstruct models of cell populations sampled from those trees. 
In the process, we hope to develop a better understanding of tumor evolution as well as other biological problems that involve interpreting genomic data gathered from heterogeneous populations of cells.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 5
    Publication Date: 2015-08-13
    Description: Background: Understanding the architecture and function of RNA molecules requires methods for comparing and analyzing their tertiary and quaternary structures. While structural superposition of short RNAs is achievable in a reasonable time, large structures represent a much bigger challenge. Therefore, we have developed a fast and accurate algorithm for RNA pairwise structure superposition called SETTER and implemented it in the SETTER web server. However, though biological relationships can be inferred by a pairwise structure alignment, key features preserved by evolution can be identified only from a multiple structure alignment. Thus, we extended the SETTER algorithm to the alignment of multiple RNA structures and developed the MultiSETTER algorithm. Results: In this paper, we present the updated version of the SETTER web server that implements a user-friendly interface to the MultiSETTER algorithm. The server accepts RNA structures either as a list of PDB IDs or as user-defined PDB files. After the superposition is computed, structures are visualized in 3D and several reports and statistics are generated. Conclusion: To the best of our knowledge, the MultiSETTER web server is the first publicly available tool for multiple RNA structure alignment. The MultiSETTER server offers visual inspection of an alignment in 3D space, which may reveal structural and functional relationships not captured by other multiple alignment methods based either on sequence or on secondary structure motifs.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 6
    Publication Date: 2015-08-13
    Description: Background: Modern research on B and T cell antigen receptors (the immunoglobulins (IG) or antibodies and T cell receptors (TR)) forms the basis for detailed analyses of the human adaptive immune system. For instance, insights into the state of the adaptive immune system provide information that is essential for monitoring transplantation processes and the regulation of immunosuppressants. In this context, algorithms and tools are necessary for analyzing the IG and TR diversity at the nucleotide as well as the amino acid sequence level, identifying highly proliferated clonotypes, determining the diversity of the cell repertoire found in a sample, comparing different states of the human immune system, and visualizing all relevant information. Results: We here present IMEX, a software framework for the detailed characterization and visualization of the state of human IG and TR repertoires. IMEX offers a broad range of algorithms for statistical analysis of IG and TR data, CDR and V-(D)-J analysis, diversity analysis by calculating the distribution of IG and TR, calculating primer efficiency, and comparing multiple data sets. We use a mathematical model that is able to describe the number of unique clonotypes in a sample, taking into account the true number of unique sequences and read errors; we heuristically optimize the parameters of this model. IMEX uses IMGT/HighV-QUEST analysis outputs and includes methods for splitting and merging to enable submission to this portal and to combine the output results, respectively. All calculation results can be visualized and exported. Conclusion: IMEX is a user-friendly and flexible framework for performing clonality experiments based on CDR and V-(D)-J rearranged regions, diversity analysis, primer efficiency, and various visualization experiments. Using IMEX, various immunological reactions and alterations can be investigated in detail. IMEX is freely available for Windows and Unix platforms at http://bioinformatics.fh-hagenberg.at/immunexplorer/.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 7
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Eliseo Ferrante, Ali Emre Turgut, Edgar Duéñez-Guzmán, Marco Dorigo, Tom Wenseleers. Division of labor is ubiquitous in biological systems, as evidenced by various forms of complex task specialization observed in both animal societies and multicellular organisms. Although clearly adaptive, the way in which division of labor first evolved remains enigmatic, as it requires the simultaneous co-occurrence of several complex traits to achieve the required degree of coordination. Recently, evolutionary swarm robotics has emerged as an excellent test bed to study the evolution of coordinated group-level behavior. Here we use this framework for the first time to study the evolutionary origin of behavioral task specialization among groups of identical robots. The scenario we study involves an advanced form of division of labor, common in insect societies and known as “task partitioning”, whereby two sets of tasks have to be carried out in sequence by different individuals. Our results show that task partitioning is favored whenever the environment has features that, when exploited, reduce switching costs and increase the net efficiency of the group, and that an optimal mix of task specialists is achieved most readily when the behavioral repertoires aimed at carrying out the different subtasks are available as pre-adapted building blocks. Nevertheless, we also show for the first time that self-organized task specialization could be evolved entirely from scratch, starting only from basic, low-level behavioral primitives, using a nature-inspired evolutionary method known as Grammatical Evolution. Remarkably, division of labor was achieved merely by selecting on overall group performance, and without providing any prior information on how the global object retrieval task was best divided into smaller subtasks. We discuss the potential of our method for engineering adaptively behaving robot swarms and interpret our results in relation to the likely path that nature took to evolve complex sociality and task specialization.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 8
    Publication Date: 2015-08-07
    Description: by Patrícia Santos-Oliveira, António Correia, Tiago Rodrigues, Teresa M Ribeiro-Rodrigues, Paulo Matafome, Juan Carlos Rodríguez-Manzaneque, Raquel Seiça, Henrique Girão, Rui D. M. Travasso. Sprouting angiogenesis, where new blood vessels grow from pre-existing ones, is a complex process where biochemical and mechanical signals regulate endothelial cell proliferation and movement. Therefore, a mathematical description of sprouting angiogenesis has to take into consideration biological signals as well as relevant physical processes, in particular the mechanical interplay between adjacent endothelial cells and the extracellular microenvironment. In this work, we introduce the first phase-field continuous model of sprouting angiogenesis capable of predicting sprout morphology as a function of the elastic properties of the tissues and the traction forces exerted by the cells. The model is very compact, consisting of only three coupled partial differential equations, and has the clear advantage of a reduced number of parameters. This model allows us to describe sprout growth as a function of the cell-cell adhesion forces and the traction force exerted by the sprout tip cell. In the absence of proliferation, we observe that the sprout either achieves a maximum length or, when the traction and adhesion are very large, breaks. Endothelial cell proliferation significantly alters sprout morphology, and we explore how different types of endothelial cell proliferation regulation are able to determine the shape of the growing sprout. The largest region in parameter space with well-formed long and straight sprouts is always obtained when the proliferation is triggered by endothelial cell strain and its rate grows with angiogenic factor concentration. We conclude that in this scenario the tip cell has the role of creating a tension in the cells that follow its lead. On those first stalk cells, this tension produces strain and/or empty spaces, inevitably triggering cell proliferation. The new cells occupy the space behind the tip, the tension decreases, and the process restarts. Our results highlight the ability of mathematical models to suggest relevant hypotheses with respect to the role of forces in sprouting, hence underlining the necessary collaboration between modelling and molecular biology techniques to improve the current state-of-the-art.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 9
    Publication Date: 2015-08-08
    Description: by Sayed-Rzgar Hosseini, Aditya Barve, Andreas Wagner. All biological evolution takes place in a space of possible genotypes and their phenotypes. The structure of this space defines the evolutionary potential and limitations of an evolving system. Metabolism is one of the most ancient and fundamental evolving systems, sustaining life by extracting energy from extracellular nutrients. Here we study metabolism’s potential for innovation by analyzing an exhaustive genotype-phenotype map for a space of 10^15 metabolisms that encodes all possible subsets of 51 reactions in central carbon metabolism. Using flux balance analysis, we predict the viability of these metabolisms on 10 different carbon sources which give rise to 1024 potential metabolic phenotypes. Although viable metabolisms with any one phenotype comprise a tiny fraction of genotype space, their absolute numbers exceed 10^9 for some phenotypes. Metabolisms with any one phenotype typically form a single network of genotypes that extends far or all the way through metabolic genotype space, where any two genotypes can be reached from each other through a series of single reaction changes. The minimal distance of genotype networks associated with different phenotypes is small, such that one can reach metabolisms with novel phenotypes (viable on new carbon sources) through one or few genotypic changes. Exceptions to these principles exist for those metabolisms whose complexity (number of reactions) is close to the minimum needed for viability. Increasing metabolic complexity enhances the potential for both evolutionary conservation and evolutionary innovation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
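The counts quoted in this abstract follow from simple subset arithmetic, which the sketch below reproduces. Encoding a genotype as a bitmask over the 51 reactions is an assumption made here for illustration. The abstract does not describe the authors' data structures, and no flux balance analysis is attempted.

```python
N_REACTIONS = 51       # reactions in central carbon metabolism (from the abstract)
N_CARBON_SOURCES = 10  # carbon sources tested (from the abstract)

def count_subsets(n):
    """Number of subsets of an n-element set."""
    return 2 ** n

def single_change_neighbors(genotype, n_reactions=N_REACTIONS):
    """Genotypes reachable by adding or removing exactly one reaction.

    A genotype is an int whose bit i is 1 if reaction i is present
    (hypothetical encoding, chosen for this example).
    """
    return [genotype ^ (1 << i) for i in range(n_reactions)]

def hamming_distance(a, b):
    """Number of single reaction changes separating two genotypes."""
    return bin(a ^ b).count("1")

if __name__ == "__main__":
    print(count_subsets(N_REACTIONS))       # 2251799813685248, on the order of 10^15
    print(count_subsets(N_CARBON_SOURCES))  # 1024 potential phenotypes
```

Under this encoding, every genotype has exactly 51 single-change neighbors, and the "distance" between two metabolisms is the Hamming distance between their bitmasks.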
  • 10
    Publication Date: 2015-08-19
    Description: by Pengxing Cao, Ada W. C. Yan, Jane M. Heffernan, Stephen Petrie, Robert G. Moss, Louise A. Carolan, Teagan A. Guarnaccia, Anne Kelso, Ian G. Barr, Jodie McVernon, Karen L. Laurie, James M. McCaw. Influenza is an infectious disease that primarily attacks the respiratory system. Innate immunity provides both a very early defense to influenza virus invasion and an effective control of viral growth. Previous modelling studies of virus–innate immune response interactions have focused on infection with a single virus and, while improving our understanding of viral and immune dynamics, have been unable to effectively evaluate the relative feasibility of different hypothesised mechanisms of antiviral immunity. In recent experiments, we have applied consecutive exposures to different virus strains in a ferret model, and demonstrated that viruses differed in their ability to induce a state of temporary immunity or viral interference capable of modifying the infection kinetics of the subsequent exposure. These results imply that virus-induced early immune responses may be responsible for the observed viral hierarchy. Here we introduce and analyse a family of within-host models of re-infection viral kinetics which allow for different viruses to stimulate the innate immune response to different degrees. The proposed models differ in their hypothesised mechanisms of action of the non-specific innate immune response. We compare these alternative models in terms of their abilities to reproduce the re-exposure data. Our results show that 1) a model with viral control mediated solely by a virus-resistant state, as commonly considered in the literature, is not able to reproduce the observed viral hierarchy; 2) the synchronised and desynchronised behaviour of consecutive virus infections is highly dependent upon the interval between primary virus and challenge virus exposures and is consistent with virus-dependent stimulation of the innate immune response. Our study provides the first mechanistic explanation for the recently observed influenza viral hierarchies and demonstrates the importance of understanding the host response to multi-strain viral infections. Re-exposure experiments provide a new paradigm in which to study the immune response to influenza and its role in viral control.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 11
    Publication Date: 2015-08-21
    Description: Background: Membrane proteins represent over 25% of human protein genes and account for more than 60% of drug targets due to their accessibility from the extracellular environment. The increasing number of available crystal structures of these proteins in the Protein Data Bank permits an initial estimation of their structural properties. Description: We have developed two web servers, TMalphaDB for α-helix bundles and TMbetaDB for β-barrels, to analyse the growing repertoire of available crystal structures of membrane proteins. TMalphaDB and TMbetaDB make it possible to search for specific sequence motifs in a non-redundant structure database of transmembrane segments and to quantify structural parameters such as ϕ and ψ backbone dihedral angles, χ1 side chain torsion angle, unit bend and unit twist. Conclusions: The structural information offered by TMalphaDB and TMbetaDB makes it possible to quantify structural distortions induced by specific sequence motifs, and to elucidate their role in the 3D structure. This specific structural information has direct implications for homology modeling of the growing number of membrane protein sequences lacking an experimental structure. TMalphaDB and TMbetaDB are freely available at http://lmc.uab.cat/TMalphaDB and http://lmc.uab.cat/TMbetaDB.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 12
    Publication Date: 2015-08-21
    Description: Background: Scoring DNA sequences against Position Weight Matrices (PWMs) is a widely adopted method to identify putative transcription factor binding sites. While common bioinformatics tools produce scores that can reflect the binding strength between a specific transcription factor and the DNA, these scores are not directly comparable between different transcription factors. Other methods, including p-value associated approaches (Touzet H, Varré J-S. Efficient and accurate p-value computation for position weight matrices. Algorithms Mol Biol. 2007;2:15), provide more rigorous ways to identify potential binding sites, but their results are difficult to interpret in terms of binding energy, which is essential for the modeling of transcription factor binding dynamics and enhancer activities. Results: Here, we provide two different ways to find the scaling parameter λ that allows us to infer binding energy from a PWM score. The first approach uses a PWM and background genomic sequence as input to estimate λ for a specific transcription factor, which we applied to show that λ distributions for different transcription factor families correspond with their DNA binding properties. Our second method can reliably convert λ between different PWMs of the same transcription factor, which allows us to directly compare PWMs that were generated by different approaches. Conclusion: These two approaches provide computationally efficient ways to scale PWM scores and estimate the strength of transcription factor binding sites in quantitative studies of binding dynamics. Their results are consistent with each other and with previous reports in most cases.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
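For readers unfamiliar with PWM scoring, the sketch below shows the generic log-odds score that such methods start from. The toy matrix, the uniform background and all function names are assumptions made for illustration. It does not estimate the scaling parameter λ described in the abstract, since the abstract does not give that procedure.

```python
import math

# Toy 3-column position probability matrix (hypothetical values).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Assumed uniform background base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def pwm_score(site, pwm=PWM, background=BACKGROUND):
    """Log-odds score of one candidate site against the matrix."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal the PWM width")
    return sum(math.log(col[base] / background[base])
               for col, base in zip(pwm, site))

def scan(sequence, pwm=PWM, background=BACKGROUND):
    """Score every window of the sequence; returns (offset, score) pairs."""
    width = len(pwm)
    return [(i, pwm_score(sequence[i:i + width], pwm, background))
            for i in range(len(sequence) - width + 1)]

def best_site(sequence, pwm=PWM, background=BACKGROUND):
    """Highest-scoring window in the sequence."""
    return max(scan(sequence, pwm, background), key=lambda pair: pair[1])

if __name__ == "__main__":
    print(best_site("TTAGCTT"))
```

Scores from this kind of scan depend on the matrix they come from, which is the comparability problem the abstract raises and the reason a per-factor scaling is needed before interpreting them as energies.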
  • 13
    Publication Date: 2015-08-21
    Description: by Paul M. Harrison, Laurent Badel, Mark J. Wall, Magnus J. E. Richardson. Models of neocortical networks are increasingly including the diversity of excitatory and inhibitory neuronal classes. Significant variability in cellular properties is also seen within a nominal neuronal class, and this heterogeneity can be expected to influence the population response and information processing in networks. Recent studies have examined the population and network effects of variability in a particular neuronal parameter with some plausibly chosen distribution. However, the empirical variability and covariance seen across multiple parameters are rarely included, partly due to the lack of data on parameter correlations in forms convenient for model construction. To address this we quantify the heterogeneity within and between the neocortical pyramidal-cell classes in layers 2/3, 4, and the slender-tufted and thick-tufted pyramidal cells of layer 5 using a combination of intracellular recordings, single-neuron modelling and statistical analyses. From the response to both square-pulse and naturalistic fluctuating stimuli, we examined the class-dependent variance and covariance of electrophysiological parameters and identified the role of the h current in generating parameter correlations. A byproduct of the dynamic I-V method we employed is the straightforward extraction of reduced neuron models from experiment. Empirically these models took the refractory exponential integrate-and-fire form and provide an accurate fit to the perisomatic voltage responses of the diverse pyramidal-cell populations when the class-dependent statistics of the model parameters were respected. By quantifying the parameter statistics we obtained an algorithm which generates populations of model neurons, for each of the four pyramidal-cell classes, that adhere to experimentally observed marginal distributions and parameter correlations. As well as providing this tool, which we hope will be of use for exploring the effects of heterogeneity in neocortical networks, we also provide the code for the dynamic I-V method and make the full electrophysiological data set available.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 14
    Publication Date: 2015-08-24
    Description: Background: Biological pathways are descriptive diagrams of biological processes widely used for functional analysis of differentially expressed genes or proteins. Primary data analysis, such as quality control, normalisation, and statistical analysis, is often performed in scripting languages like R, Perl, and Python. Subsequent pathway analysis is usually performed using dedicated external applications. Workflows involving manual use of multiple environments are time-consuming and error-prone. Therefore, tools are needed that enable pathway analysis directly within the same scripting languages used for primary data analyses. Existing tools have limited capability in terms of available pathway content, pathway editing and visualisation options, and export file formats. Consequently, making the full-fledged pathway analysis tool PathVisio available from various scripting languages will benefit researchers. Results: We developed PathVisioRPC, an XMLRPC interface for the pathway analysis software PathVisio. PathVisioRPC enables creating and editing biological pathways, visualising data on pathways, performing pathway statistics, and exporting results in several image formats in multiple programming environments. We demonstrate PathVisioRPC functionalities using examples in Python. Subsequently, we analyse a publicly available NCBI GEO gene expression dataset studying tumour-bearing mice treated with cyclophosphamide in R. The R scripts demonstrate how calls to existing R packages for data processing and calls to PathVisioRPC can directly work together. To further support R users, we have created RPathVisio, simplifying the use of PathVisioRPC in this environment. We have also created a pathway module for the microarray data analysis portal ArrayAnalysis.org that calls the PathVisioRPC interface to perform pathway analysis. This module allows users to use PathVisio functionality online without having to download and install the software and exemplifies how the PathVisioRPC interface can be used by data analysis pipelines for functional analysis of processed genomics data. Conclusions: PathVisioRPC enables data visualisation and pathway analysis directly from within various analytical environments used for preliminary analyses. It supports the use of existing pathways from WikiPathways or pathways created using the RPC itself. It also enables automation of tasks performed using PathVisio, making it useful to PathVisio users performing repeated visualisation and analysis tasks. PathVisioRPC is freely available for academic and commercial use at http://projects.bigcat.unimaas.nl/pathvisiorpc.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 15
    Publication Date: 2015-08-19
    Description: by Ariel Afek, Hila Cohen, Shiran Barber-Zucker, Raluca Gordân, David B. Lukatsky. Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF) binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs) have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with a wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C. elegans and D. melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq) results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large-scale analysis using in vitro TF-DNA preferences obtained from the universal protein binding microarrays (PBM) for ~90 eukaryotic TFs belonging to 22 different DNA-binding domain types. As a result of this new analysis, we conclude that nonconsensus protein-DNA binding is a widespread phenomenon that significantly affects protein-DNA binding preferences and need not require the presence of consensus (specific) TFBSs in order to achieve genome-wide TF-DNA binding specificity.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 16
    Publication Date: 2015-08-20
    Description: Background: Detecting and quantifying isoforms from RNA-seq data is an important but challenging task. The problem is often ill-posed, particularly at low coverage. One promising direction is to exploit several samples simultaneously. Results: We propose a new method for solving the isoform deconvolution problem jointly across several samples. We formulate a convex optimization problem that allows information to be shared between samples and that we solve efficiently. We demonstrate the benefits of combining several samples on simulated and real data, and show that our approach outperforms pooling strategies and methods based on integer programming. Conclusion: Our convex formulation to jointly detect and quantify isoforms from RNA-seq data of multiple related samples is a computationally efficient approach to leverage the hypothesis that some isoforms are likely to be present in several samples. The software and source code are available at http://cbio.ensmp.fr/flipflop.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 17
    Publication Date: 2015-08-20
    Description: Background: The cascade computer model (CCM) was designed as a machine-learning feature platform for prediction of drug diffusivity from mucoadhesive formulations. Three basic models (the statistical regression model, the K nearest neighbor model and the modified version of the back propagation neural network) in CCM operate sequentially in close collaboration with each other, employing the estimated value obtained from the afore-positioned base model as an input value to the next-positioned base model in the cascade. The effects of various parameters on the pharmacological efficacy of a female controlled drug delivery system (FcDDS) intended to protect women from HIV-1 infection were evaluated using an in vitro apparatus, the “Simulant Vaginal System” (SVS). We used computer simulations to explicitly examine the changes in drug diffusivity from FcDDS and determine the prognostic potency of each variable for in vivo prediction of formulation efficacy. The results obtained using the CCM approach were compared with those from an individual multiple regression model. Results: CCM significantly lowered the percentage mean error (PME) and enhanced r² values as compared with those from the multiple regression models. It was noted that CCM generated the PME value of 21.82 at 48169 epoch iterations, which is a significant improvement over the PME value of 29.91 % at 118344 epochs by the back propagation network model. The results of this study indicated that the sequential ensemble of the classifiers allowed for an accurate prediction of the domain with significantly lowered variance and considerably reduced the time required for the training phase. Conclusion: CCM is accurate, easy to operate, time- and cost-effective, and thus can serve as a valuable tool for prediction of drug diffusivity from mucoadhesive formulations. CCM may yield new insights into understanding how drugs are diffused from the carrier systems and exert their efficacies under various clinical conditions.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 18
    Publication Date: 2015-08-20
    Description: by The PLOS Computational Biology Staff
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 19
    Publication Date: 2015-08-21
    Description: Background: In many domains, scientists build complex simulators of natural phenomena that encode their hypotheses about the underlying processes. These simulators can be deterministic or stochastic, fast or slow, constrained or unconstrained, and so on. Optimizing the simulators with respect to a set of parameter values is common practice, resulting in a single parameter setting that minimizes an objective subject to constraints. Results: We propose algorithms for post optimization posterior evaluation (POPE) of simulators. The algorithms compute and visualize all simulations that can generate results of the same or better quality than the optimum, subject to constraints. These optimization posteriors are desirable for a number of reasons, among which are easy interpretability, automatic parameter sensitivity and correlation analysis, and posterior predictive analysis. Our algorithms are simple extensions to an existing simulation-based inference framework called approximate Bayesian computation. POPE is applied to two biological simulators: a fast and stochastic simulator of stem-cell cycling and a slow and deterministic simulator of tumor growth patterns. Conclusions: POPE allows the scientist to explore and understand the role that constraints, both on the input and the output, have on the optimization posterior. As a Bayesian inference procedure, POPE provides a rigorous framework for the analysis of the uncertainty of an optimal simulation parameter setting.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 20
    Publication Date: 2015-08-21
    Description: by James Tamerius, Cécile Viboud, Jeffrey Shaman, Gerardo Chowell While a relationship between environmental forcing and influenza transmission has been established in inter-pandemic seasons, the drivers of pandemic influenza remain debated. In particular, school effects may predominate in pandemic seasons marked by an atypical concentration of cases among children. For the 2009 A/H1N1 pandemic, Mexico is a particularly interesting case study due to its broad geographic extent encompassing temperate and tropical regions, well-documented regional variation in the occurrence of pandemic outbreaks, and coincidence of several school breaks during the pandemic period. Here we fit a series of transmission models to daily laboratory-confirmed influenza data in 32 Mexican states using MCMC approaches, considering a meta-population framework or the absence of spatial coupling between states. We use these models to explore the effect of environmental, school–related and travel factors on the generation of spatially-heterogeneous pandemic waves. We find that the spatial structure of the pandemic is best understood by the interplay between regional differences in specific humidity (explaining the occurrence of pandemic activity towards the end of the school term in late May-June 2009 in more humid southeastern states), school vacations (preventing influenza transmission during July-August in all states), and regional differences in residual susceptibility (resulting in large outbreaks in early fall 2009 in central and northern Mexico that had yet to experience fully-developed outbreaks). Our results are in line with the concept that very high levels of specific humidity, as present during summer in southeastern Mexico, favor influenza transmission, and that school cycles are a strong determinant of pandemic wave timing.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 21
    Publication Date: 2015-08-21
    Description: by Alireza Alemi, Carlo Baldassi, Nicolas Brunel, Riccardo Zecchina Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. 
Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
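    Note: the three-threshold plasticity rule summarized in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name, the fixed step size, the clipping of weights at zero (chosen here because the abstract describes excitatory neurons) and the handling of a field exactly equal to the intermediate threshold are all assumptions made for illustration.

```python
def three_threshold_update(weights, inputs, local_field,
                           theta_low, theta_mid, theta_high, step=0.01):
    """Return new weights for one neuron after one pattern presentation.

    weights      -- current synaptic weights (non-negative floats)
    inputs       -- binary presynaptic activities (0 or 1), same length
    local_field  -- total synaptic input to the neuron for this pattern
    theta_*      -- the three thresholds, theta_low < theta_mid < theta_high
    step         -- hypothetical fixed learning step (not from the abstract)
    """
    # Outside the window [theta_low, theta_high]: no plasticity.
    if local_field <= theta_low or local_field >= theta_high:
        return list(weights)

    # Inside the window: potentiate above the intermediate threshold,
    # depress otherwise (a field equal to theta_mid is treated as
    # depression here, which is an arbitrary choice).
    delta = step if local_field > theta_mid else -step

    # Only synapses with active inputs change; weights are clipped at 0
    # as an assumed way of keeping the connections excitatory.
    return [max(0.0, w + delta) if x else w
            for w, x in zip(weights, inputs)]
```

    For example, with thresholds 0.0, 1.0 and 2.0, a local field of 1.2 potentiates the synapses of active inputs, a field of 0.5 depresses them, and a field of 2.5 leaves all weights unchanged.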
  • 22
    Publication Date: 2015-08-12
    Description: by Sander Land, Steven A. Niederer Biophysical models of cardiac tension development provide a succinct representation of our understanding of force generation in the heart. The link between protein kinetics and interactions that gives rise to high cooperativity is not yet fully explained from experiments or previous biophysical models. We propose a biophysical ODE-based representation of cross-bridge (XB), tropomyosin and troponin within a contractile regulatory unit (RU) to investigate the mechanisms behind cooperative activation, as well as the role of cooperativity in dynamic tension generation across different species. The model includes cooperative interactions between regulatory units (RU-RU), between crossbridges (XB-XB), as well as more complex interactions between crossbridges and regulatory units (XB-RU interactions). For the steady-state force-calcium relationship, our framework predicts that: (1) XB-RU effects are key in shifting the half-activation value of the force-calcium relationship towards lower [Ca²⁺], but have only small effects on cooperativity. (2) XB-XB effects approximately double the duty ratio of myosin, but do not significantly affect cooperativity. (3) RU-RU effects derived from the long-range action of tropomyosin are a major factor in cooperative activation, with each additional unblocked RU increasing the rate at which further RUs unblock. (4) Myosin affinity for short (1–4 RU) unblocked stretches of actin is very low, and the resulting suppression of force at low [Ca²⁺] is a major contributor to the biphasic force-calcium relationship. We also reproduce isometric tension development across mouse, rat and human at physiological temperature and pacing rate, and conclude that species differences require only changes in myosin affinity and troponin I/troponin C affinity. Furthermore, we show that the calcium dependence of the rate of tension redevelopment k_tr is explained by transient blocking of RUs by a temporary decrease in XB-RU effects.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 23
    Public Library of Science (PLoS)
    Publication Date: 2015-08-12
    Description: by Jonas Paulsen, Odin Gramstad, Philippe Collas The three-dimensional (3D) structure of the genome is important for orchestration of gene expression and cell differentiation. While mapping genomes in 3D has for a long time been elusive, recent adaptations of high-throughput sequencing to chromosome conformation capture (3C) techniques allow for genome-wide structural characterization for the first time. However, reconstruction of "consensus" 3D genomes from 3C-based data is a challenging problem, since the data are aggregated over millions of cells. Recent single-cell adaptations to the 3C-technique, however, allow for non-aggregated structural assessment of genome structure, but data suffer from sparse and noisy interaction sampling. We present a manifold-based optimization (MBO) approach for the reconstruction of 3D genome structure from chromosomal contact data. We show that MBO is able to reconstruct 3D structures based on the chromosomal contacts, imposing fewer structural violations than comparable methods. Additionally, MBO is suitable for efficient high-throughput reconstruction of large systems, such as entire genomes, allowing for comparative studies of genomic structure across cell-lines and different species.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 24
    Publication Date: 2015-08-12
    Description: by Hiroo Kenzaki, Shoji Takada Nucleosomes, basic units of chromatin, are known to show spontaneous DNA unwrapping dynamics that are crucial for transcriptional activation, but the structural details of these dynamics are yet to be elucidated. Here, employing a coarse-grained molecular model that captures residue-level structural details up to histone tails, we simulated equilibrium fluctuations and forced unwrapping of single nucleosomes under various conditions. The equilibrium simulations showed spontaneous unwrapping from outer DNA and subsequent rewrapping dynamics, which are in good agreement with experiments. We found several distinct partially unwrapped states of nucleosomes, as well as reversible transitions among these states. At a low salt concentration, histone tails tend to sit in the concave cleft between the histone octamer and DNA, tightening the nucleosome. At a higher salt concentration, the tails tend to bind to the outer side of DNA or be expanded outwards, which led to a higher degree of unwrapping. Of the four types of histone tails, H3 and H2B tail dynamics are markedly correlated with partial unwrapping of DNA, and, moreover, their contributions were distinct. Acetylation in histone tails was simply mimicked by changing their charges, which enhanced the unwrapping, especially markedly for H3 and H2B tails.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 25
    Publication Date: 2015-08-13
    Description: Background: Estimating the phylogenetic position of bacterial and archaeal organisms by genetic sequence comparisons is considered the gold standard in taxonomy. This is also a way to identify the species of origin of the sequence. The quality of the reference database used in such analyses is crucial: the database must reflect the up-to-date bacterial nomenclature and accurately indicate the species of origin of its sequences. Description: leBIBI QBPP is a web tool taking as input a series of nucleotide sequences belonging to one of a set of reference markers (e.g., SSU rRNA, rpoB, groEL2) and automatically retrieving closely related sequences, aligning them, and performing phylogenetic reconstruction using an approximate maximum likelihood approach. The system returns a set of quality parameters and, if possible, a suggested taxonomic assignment for the input sequences. The reference databases are extracted from GenBank and present four degrees of stringency, from the “superstringent” degree (one type strain per species) to the loosely parsed degree (“lax” database). A set of one hundred to more than a thousand sequences may be analyzed at a time. The speed of the process has been optimized through careful hardware selection and database design. Conclusion: leBIBI QBPP is a powerful tool helping biologists to position bacterial or archaeal sequences of commonly used markers in a phylogeny. It is a diagnostic tool for clinical, industrial and environmental microbiology laboratories, as well as an exploratory tool for more specialized laboratories. Its main advantages, relative to comparable systems, are: i) the use of a broad set of databases covering diverse markers with various degrees of stringency; ii) the use of an approximate maximum likelihood approach for phylogenetic reconstruction; iii) a speed compatible with on-line usage; and iv) fully documented results to help the user in decision making.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 26
    Public Library of Science (PLoS)
    Publication Date: 2015-08-13
    Description: by Sebastian Bitzer, Jelle Bruineberg, Stefan J. Kiebel Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 27
    Publication Date: 2015-07-30
    Description: Background: Despite the tremendous drop in the cost of nucleotide sequencing in recent years, many research projects still utilize sequencing of pools containing multiple samples for the detection of sequence variants as a cost saving measure. Various software tools exist to analyze these pooled sequence data, yet little has been reported on the relative accuracy and ease of use of these different programs. Results: In this manuscript we evaluate five different variant detection programs—The Genome Analysis Toolkit (GATK), CRISP, LoFreq, VarScan, and SNVer—with regard to their ability to detect variants in synthetically pooled Illumina sequencing data, by creating simulated pooled binary alignment/map (BAM) files using single-sample sequencing data from varying numbers of previously characterized samples at varying depths of coverage per sample. We report the overall runtimes and memory usage of each program, as well as each program’s sensitivity and specificity to detect known true variants. Conclusions: GATK, CRISP, and LoFreq all gave balanced accuracy of 80 % or greater for datasets with varying per-sample depth of coverage and numbers of samples per pool. VarScan and SNVer generally had balanced accuracy lower than 80 %. CRISP and LoFreq required up to four times less computational time and up to ten times less physical memory than GATK did, and without filtering, gave results with the highest sensitivity. VarScan and SNVer had generally lower false positive rates, but also significantly lower sensitivity than the other three programs.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
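    Note: the abstract above compares programs by balanced accuracy. A minimal Python sketch of that metric follows, assuming the standard definition as the mean of sensitivity and specificity; the abstract itself does not spell out the formula, and the function name and the counts used below are illustrative only.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity from confusion-matrix counts.

    tp, fn -- known true variants that were called / missed
    tn, fp -- non-variant sites correctly left uncalled / wrongly called
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative site")
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return (sensitivity + specificity) / 2


# Illustrative counts only (not taken from the study).
print(balanced_accuracy(tp=3, fp=1, tn=1, fn=1))
```

    Because the metric averages the two rates, a caller with very few false positives can still score below 80 % if its sensitivity is low, which matches the pattern the abstract reports for the more conservative programs.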
  • 28
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Malachi Griffith, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 29
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Arjun Bharioke, Dmitri B. Chklovskii Neurons must faithfully encode signals that can vary over many orders of magnitude despite having only limited dynamic ranges. For a correlated signal, this dynamic range constraint can be relieved by subtracting away components of the signal that can be predicted from the past, a strategy known as predictive coding, which relies on learning the input statistics. However, the statistics of natural input signals can also vary over very short time scales, e.g., following saccades across a visual scene. To maintain a reduced transmission cost for signals with rapidly varying statistics, neuronal circuits implementing predictive coding must also rapidly adapt their properties. Experimentally, in different sensory modalities, sensory neurons have shown such adaptations within 100 ms of an input change. Here, we show first that linear neurons connected in a feedback inhibitory circuit can implement predictive coding. We then show that adding a rectification nonlinearity to such a feedback inhibitory circuit allows it to automatically adapt and approximate the performance of an optimal linear predictive coding network, over a wide range of inputs, while keeping its underlying temporal and synaptic properties unchanged. We demonstrate that the resulting changes to the linearized temporal filters of this nonlinear network match the fast adaptations observed experimentally in different sensory modalities, in different vertebrate species. Therefore, the nonlinear feedback inhibitory network can provide automatic adaptation to fast varying signals, maintaining the dynamic range necessary for accurate neuronal transmission of natural inputs.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 30
    Publication Date: 2015-08-08
    Description: by Murat Alp, Vipan K. Parihar, Charles L. Limoli, Francis A. Cucinotta In this work, a stochastic computational model of microscopic energy deposition events is used to study for the first time damage to irradiated neuronal cells of the mouse hippocampus. An extensive library of radiation tracks for different particle types is created to score energy deposition in small voxels and volume segments describing a neuron’s morphology that later are sampled for given particle fluence or dose. Methods included the construction of in silico mouse hippocampal granule cells from neuromorpho.org with spine and filopodia segments stochastically distributed along the dendritic branches. The model is tested with high-energy ⁵⁶Fe, ¹²C, and ¹H particles and electrons. Results indicate that the tree-like structure of the neuronal morphology and the microscopic dose deposition of distinct particles may lead to different outcomes when cellular injury is assessed, leading to differences in structural damage for the same absorbed dose. The significance of the microscopic dose in neuron components is to introduce specific local and global modes of cellular injury that likely contribute to spine, filopodia, and dendrite pruning, impacting cognition and possibly the collapse of the neuron. Results show that the heterogeneity of heavy particle tracks at low doses, compared to the more uniform dose distribution of electrons, juxtaposed with neuron morphology makes it necessary to model the spatial dose painting for specific neuronal components. Going forward, this work can directly support the development of biophysical models of the modifications of spine and dendritic morphology observed after low dose charged particle irradiation by providing accurate descriptions of the underlying physical insults to complex neuron structures at the nanometer scale.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 31
    Public Library of Science (PLoS)
    Publication Date: 2015-08-08
    Description: by Brinda Vallat, Carlos Madrid-Aliste, Andras Fiser Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable only to smaller proteins, leaving much room for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger than the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state-of-the-art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 32
    Publication Date: 2015-08-05
    Description: by Po-Wei Chen, Luis L. Fonseca, Yusuf A. Hannun, Eberhard O. Voit The article demonstrates that computational modeling has the capacity to convert metabolic snapshots, taken sequentially over time, into a description of cellular, dynamic strategies. The specific application is a detailed analysis of a set of actions with which Saccharomyces cerevisiae responds to heat stress. Using time dependent metabolic concentration data, we use a combination of mathematical modeling, reverse engineering, and optimization to infer dynamic changes in enzyme activities within the sphingolipid pathway. The details of the sphingolipid responses to heat stress are important, because they guide some of the longer-term alterations in gene expression, with which the cells adapt to the increased temperature. The analysis indicates that all enzyme activities in the system are affected and that the shapes of the time trends in activities depend on the fatty-acyl CoA chain lengths of the different ceramide species in the system.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 33
    Publication Date: 2015-07-30
    Description: Background: Spirulina (Arthrospira) platensis is the only cyanobacterium that, in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. Description: A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed and integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th. Conclusions: SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 34
    Publication Date: 2015-07-30
    Description: Background: The knowledge of the spatial organisation of the chromatin fibre in cell nuclei helps researchers to understand the nuclear machinery that regulates DNA activity. Recent experimental techniques of the type Chromosome Conformation Capture (3C, or similar) provide high-resolution, high-throughput data consisting of the number of times any possible pair of DNA fragments is found to be in contact in a certain population of cells. As these data carry information on the structure of the chromatin fibre, several attempts have been made to use them to obtain high-resolution 3D reconstructions of entire chromosomes, or even an entire genome. The techniques proposed treat the data in different ways, possibly exploiting physical-geometric chromatin models. One popular strategy is to transform contact data into Euclidean distances between pairs of fragments, and then solve a classical distance-to-geometry problem. Results: We developed and tested a reconstruction technique that does not require translating contacts into distances, thus avoiding a number of related drawbacks. Also, we introduce a geometrical chromatin chain model that allows us to include sound biochemical and biological constraints in the problem. This model can be scaled at different genomic resolutions, where the structures of the coarser models are influenced by the reconstructions at finer resolutions. The search in the solution space is then performed by a classical simulated annealing, where the model is evolved efficiently through quaternion operators. The presence of appropriate constraints permits the less reliable data to be overlooked, so the result is a set of plausible chromatin configurations compatible with both the data and the prior knowledge.
Conclusions: To test our method, we obtained a number of 3D chromatin configurations from Hi-C data available in the literature for the long arm of human chromosome 1, and validated their features against known properties of gene density and transcriptional activity. Our results are compatible with biological features not introduced a priori in the problem: structurally different regions in our reconstructions highly correlate with functionally different regions as known from literature and genomic repositories.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 35
    Publication Date: 2015-08-08
    Description: This paper discusses the use of social media-related geographic information in the context of the strategic environmental assessment (SEA) of Sardinian Municipal masterplans (MMPs). We show that this kind of information substantially improves the SEA process, since it provides planners, evaluators, and the local communities with information retrieved from social media that would not have been available otherwise. This information complements authoritative data collection, which comes from official sources, and sheds light on the tastes and preferences of the users of services and infrastructure, and their expectations concerning their spatial organization. A methodological approach related to the collection of social media-related geographic information is implemented and discussed with reference to the urban context of the city of Cagliari (Sardinia, Italy). The results are very effective in terms of provision of information, which may increase the spatial knowledge available for planning policy definition and implementation. In this perspective, this kind of information discloses opportunities for building analytical scenarios related to urban and regional planning and it offers useful suggestions for sustainable development based on tourism strategies.
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
  • 36
    Publication Date: 2015-08-08
    Description: by Pengyi Yang, Xiaofeng Zheng, Vivek Jayaswal, Guang Hu, Jean Yee Hwa Yang, Raja Jothi. Cell signaling underlies transcription/epigenetic control of a vast majority of cell-fate decisions. A key goal in cell signaling studies is to identify the set of kinases that underlie key signaling events. In a typical phosphoproteomics study, phosphorylation sites (substrates) of active kinases are quantified proteome-wide. By analyzing the activities of phosphorylation sites over a time-course, the temporal dynamics of signaling cascades can be elucidated. Since many substrates of a given kinase have similar temporal kinetics, clustering phosphorylation sites into distinctive clusters can facilitate identification of their respective kinases. Here we present a knowledge-based CLUster Evaluation (CLUE) approach for identifying the most informative partitioning of a given temporal phosphoproteomics dataset. Our approach utilizes prior knowledge, annotated kinase-substrate relationships mined from literature and curated databases, to first generate biologically meaningful partitioning of the phosphorylation sites and then determine key kinases associated with each cluster. We demonstrate the utility of the proposed approach on two time-series phosphoproteomics datasets and identify key kinases associated with human embryonic stem cell differentiation and insulin signaling pathway. The proposed approach will be a valuable resource in the identification and characterization of signaling networks from phosphoproteomics data.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 37
    Publication Date: 2015-08-08
    Description: Background: The traditional method used to estimate tree biomass is allometry. In this method, models are tested and equations fitted by regression, usually applying ordinary least squares, though other analogous methods are also used for this purpose. Due to the nature of tree biomass data, the assumptions of regression are not always met, bringing uncertainties to the inferences. This article demonstrates that the Data Mining (DM) technique can be used as an alternative to the traditional regression approach to estimate tree biomass in the Atlantic Forest, providing better results than allometry, and demonstrating simplicity, versatility and flexibility to apply to a wide range of conditions. Results: Various DM approaches were examined regarding distance, number of neighbors and weighting, by using 180 trees coming from environmental restoration plantations in the Atlantic Forest biome. The best results were attained using the Chebyshev distance, 1/d weighting and 5 neighbors. Increasing the number of neighbors did not improve estimates. We also analyze the effect of the size of the data set and the number of variables on the results. The complete data set and the maximum number of predicting variables provided the best fit. We compare DM to the Schumacher-Hall model and the results showed a gain of up to 16.5% in reduction of the standard error of estimate. Conclusion: It was concluded that Data Mining can provide accurate estimates of tree biomass and can be successfully used for this purpose in environmental restoration plantations in the Atlantic Forest. This technique provides a lower standard error of estimate than the Schumacher-Hall model and has the advantage of not requiring some of the statistical assumptions that regression models do. Flexibility, versatility and simplicity are attributes of DM that corroborate its great potential for similar applications.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
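The configuration reported as best in this abstract (Chebyshev distance, 1/d weighting, 5 neighbors) corresponds to a nearest-neighbor estimator. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the choice of predictors (diameter and height) and the toy numbers are all hypothetical, and a real analysis would probably scale the predictors before computing distances.

```python
def chebyshev(a, b):
    """Chebyshev distance: the largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k=5):
    """Estimate a value as the 1/d-weighted mean of the k nearest training points."""
    nearest = sorted(
        (chebyshev(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # An exact match has distance zero; return its value to avoid dividing by zero.
    for d, y in nearest:
        if d == 0:
            return y
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Hypothetical toy data: (diameter in cm, height in m) -> biomass in kg.
trees = [(5.0, 4.0), (8.0, 6.0), (10.0, 7.5), (12.0, 9.0), (15.0, 11.0), (20.0, 14.0)]
biomass = [3.1, 9.8, 17.5, 28.0, 49.0, 101.0]

estimate = knn_predict(trees, biomass, (11.0, 8.0), k=5)
print(f"Estimated biomass: {estimate:.1f} kg")
```

Because the prediction is a weighted mean of observed values, it always lies within the range of the neighbors' biomass, which is one reason such estimators need no distributional assumptions.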
  • 38
    Publication Date: 2015-08-08
    Description: Background: Motivated by the general need to identify and classify species based on molecular evidence, genome comparisons have been proposed that are based on measuring mostly Euclidean distances between Chaos Game Representation (CGR) patterns of genomic DNA sequences. Results: We provide, on an extensive dataset and using several different distances, confirmation of the hypothesis that CGR patterns are preserved along a genomic DNA sequence, and are different for DNA sequences originating from genomes of different species. This finding lends support to the theory that CGRs of genomic sequences can act as graphic genomic signatures. In particular, we compare the CGR patterns of over five hundred different 150,000 bp genomic sequences spanning one complete chromosome from each of six organisms, representing all kingdoms of life: H. sapiens (Animalia; chromosome 21), S. cerevisiae (Fungi; chromosome 4), A. thaliana (Plantae; chromosome 1), P. falciparum (Protista; chromosome 14), E. coli (Bacteria - full genome), and P. furiosus (Archaea - full genome). To maximize the diversity within each species, we also analyze the interrelationships within a set of over five hundred 150,000 bp genomic sequences sampled from the entire aforementioned genomes. Lastly, we provide some preliminary evidence of this method’s ability to classify genomic DNA sequences at lower taxonomic levels by comparing sequences sampled from the entire genome of H. sapiens (class Mammalia, order Primates) and of M. musculus (class Mammalia, order Rodentia), for a total length of approximately 174 million basepairs analyzed. We compute pairwise distances between CGRs of these genomic sequences using six different distances, and construct Molecular Distance Maps, which visualize all sequences as points in a two-dimensional or three-dimensional space, to simultaneously display their interrelationships. 
Conclusion: Our analysis confirms, for this dataset, that CGR patterns of DNA sequences from the same genome are in general quantitatively similar, while being different for DNA sequences from genomes of different species. Our assessment of the performance of the six distances analyzed uses three different quality measures and suggests that several distances outperform the Euclidean distance, which has so far been almost exclusively used for such studies.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
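To illustrate the kind of comparison this abstract describes, the sketch below builds a Chaos Game Representation (CGR) of a DNA sequence as a grid of point counts and compares two grids with the Euclidean distance. It is a simplified illustration under stated assumptions: the corner assignment (A, C, G, T) is one common convention, the grid resolution `k` and the function names are hypothetical, and the paper's other five distances are not reproduced.

```python
import math

# One common corner convention for CGR; other assignments are possible.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_grid(seq, k=2):
    """Return a 2^k x 2^k grid counting the CGR points of a DNA sequence."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start in the centre of the unit square
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current nucleotide.
        x, y = (x + cx) / 2, (y + cy) / 2
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


def normalise(grid):
    """Convert counts to frequencies so sequences of different length compare fairly."""
    total = sum(sum(row) for row in grid)
    return [[c / total for c in row] for row in grid]


def euclidean(g1, g2):
    """Euclidean distance between two grids of equal shape."""
    return math.sqrt(
        sum((a - b) ** 2 for r1, r2 in zip(g1, g2) for a, b in zip(r1, r2))
    )


a = normalise(cgr_grid("ACGTACGTTTGACA"))
b = normalise(cgr_grid("GGGCGCGGGCCGCG"))
print(euclidean(a, b))
```

A matrix of such pairwise distances is the kind of input from which a two- or three-dimensional map of sequences could then be derived.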
  • 39
    Publication Date: 2015-08-08
    Description: Background: Next-generation sequencing (NGS) has greatly facilitated metagenomic analysis but also raised new challenges for metagenomic DNA sequence assembly, owing to its high-throughput nature and extremely short reads generated by sequencers such as Illumina. To date, how to generate a high-quality draft assembly for metagenomic sequencing projects has not been fully addressed. Results: We conducted a comprehensive assessment of state-of-the-art de novo assemblers and revealed that the performance of each assembler depends critically on the sequencing depth. To address this problem, we developed a pipeline named InteMAP to integrate three assemblers, ABySS, IDBA-UD and CABOG, which were found to complement each other in assembling metagenomic sequences. By deciding which assembly approach to use according to a sequencing coverage estimate for each short read, the pipeline provides an automatic platform suitable for assembling real metagenomic NGS data with an uneven distribution of sequencing depth. By comparing the performance of InteMAP with current assemblers on both synthetic and real NGS metagenomic data, we demonstrated that InteMAP achieves better performance with a longer total contig length and higher contiguity, and contains more genes than others. Conclusions: We developed a de novo pipeline, named InteMAP, that integrates existing tools for metagenomics assembly. The pipeline outperforms previous assembly methods on metagenomic assembly by providing a longer total contig length, a higher contiguity and covering more genes. InteMAP, therefore, could potentially be a useful tool for the research community of metagenomics.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 40
    Publication Date: 2015-08-12
    Description: Background: Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community. Results: We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, and is comparable to or faster than the current state-of-the-art methods. Conclusions: We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility.
Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 41
    Publication Date: 2015-08-13
    Description: by Deborah A. Striegel, Manami Hara, Vipul Periwal. Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
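As a rough illustration of how graph theory can quantify local connectivity among cells, the sketch below links any two cells whose centres lie within a distance threshold and then reports cluster sizes and the mean degree. The threshold value, the toy coordinates and the function names are assumptions made for this example; the abstract does not state how the authors built their graphs or which graph measures they used.

```python
import math
from collections import deque


def build_graph(coords, max_dist):
    """Connect every pair of cells whose centres are at most max_dist apart."""
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= max_dist:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def components(adj):
    """Return the connected components (cell clusters) via breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        comps.append(sorted(comp))
    return comps


# Hypothetical cell centres (arbitrary units) and contact threshold.
cells = [(0, 0), (1, 0), (1, 1), (5, 5), (6, 5), (10, 10)]
graph = build_graph(cells, max_dist=1.5)
clusters = components(graph)
mean_degree = sum(len(nbs) for nbs in graph.values()) / len(graph)
print(clusters, mean_degree)
```

Summary statistics of this kind (cluster size distribution, mean degree) are examples of graph characteristics that could be compared between groups of islets.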
  • 42
    Publication Date: 2015-08-15
    Description: Background: Selective pressures at the DNA level shape genes into profiles consisting of patterns of rapidly evolving sites and sites withstanding change. These profiles remain detectable even when protein sequences become extensively diverged. A common task in molecular biology is to infer functional, structural or evolutionary relationships by querying a database using an algorithm. However, problems arise when sequence similarity is low. This study presents an algorithm that uses the evolutionary rate at codon sites, the dN/dS (ω) parameter, coupled to a substitution matrix as an alignment metric for detecting distantly related proteins. The algorithm, called BLOSUM-FIRE, couples a newer and improved version of the original FIRE (Functional Inference using Rates of Evolution) algorithm with an amino acid substitution matrix in a dynamic scoring function. The enigmatic hepatitis B virus X protein was used as a test case for BLOSUM-FIRE and its associated database EvoDB. Results: The evolutionary rate based approach was coupled with a conventional BLOSUM substitution matrix. The two approaches are combined in a dynamic scoring function, which uses the selective pressure to score aligned residues. The dynamic scoring function is based on a coupled additive approach that scores aligned sites based on the level of conservation inferred from the ω values. Evaluation of the accuracy of this new implementation, BLOSUM-FIRE, using MAFFT alignments as reference alignments has shown that it is more accurate than its predecessor FIRE. Comparison of the alignment quality with widely used algorithms (MUSCLE, T-COFFEE, and CLUSTAL Omega) revealed that the BLOSUM-FIRE algorithm performs as well as conventional algorithms. Its main strength is that it provides greater potential for aligning divergent sequences and addresses the problem of low specificity inherent in the original FIRE algorithm.
The utility of this algorithm is demonstrated using the Hepatitis B virus X (HBx) protein, a protein of unknown function, as a test case. Conclusion: This study describes the utility of an evolutionary rate based approach coupled to the BLOSUM62 amino acid substitution matrix in inferring protein domain function. We demonstrate that such an approach is robust and performs as well as an array of conventional algorithms.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 43
    Publication Date: 2015-08-15
    Description: by Ghanim Ullah, Yina Wei, Markus A Dahlem, Martin Wechselberger, Steven J Schiff. Cell volume changes are ubiquitous in normal and pathological activity of the brain. Nevertheless, we know little of how cell volume affects neuronal dynamics. We here performed the first detailed study of the effects of cell volume on neuronal dynamics. By incorporating cell swelling together with dynamic ion concentrations and oxygen supply into Hodgkin-Huxley type spiking dynamics, we demonstrate the spontaneous transition between epileptic seizure and spreading depression states as the cell swells and contracts in response to changes in osmotic pressure. Our use of volume as an order parameter further revealed a dynamical definition for the experimentally described physiological ceiling that separates seizure from spreading depression, as well as predicted a second ceiling that demarcates spreading depression from anoxic depolarization. Our model highlights the neuroprotective role of glial K+ buffering against seizures and spreading depression, and provides novel insights into anoxic depolarization and the relevant cell swelling during ischemia. We argue that the dynamics of seizures, spreading depression, and anoxic depolarization lie along a continuum of the repertoire of the neuron membrane that can be understood only when the dynamic ion concentrations, oxygen homeostasis, and cell swelling in response to osmotic pressure are taken into consideration. Our results demonstrate the feasibility of a unified framework for a wide range of neuronal behaviors that may be of substantial importance in the understanding of and potentially developing universal intervention strategies for these pathological states.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 44
    Publication Date: 2015-08-15
    Description: by John R. Houser, Craig Barnhart, Daniel R. Boutz, Sean M. Carroll, Aurko Dasgupta, Joshua K. Michener, Brittany D. Needham, Ophelia Papoulas, Viswanadham Sridhara, Dariya K. Sydykova, Christopher J. Marx, M. Stephen Trent, Jeffrey E. Barrick, Edward M. Marcotte, Claus O. Wilke. How do bacteria regulate their cellular physiology in response to starvation? Here, we present a detailed characterization of Escherichia coli growth and starvation over a time-course lasting two weeks. We have measured multiple cellular components, including RNA and proteins at deep genomic coverage, as well as lipid modifications and flux through central metabolism. Our study focuses on the physiological response of E. coli in stationary phase as a result of being starved for glucose, not on the genetic adaptation of E. coli to utilize alternative nutrients. In our analysis, we have taken advantage of the temporal correlations within and among RNA and protein abundances to identify systematic trends in gene regulation. Specifically, we have developed a general computational strategy for classifying expression-profile time courses into distinct categories in an unbiased manner. We have also developed, from dynamic models of gene expression, a framework to characterize protein degradation patterns based on the observed temporal relationships between mRNA and protein abundances. By comparing and contrasting our transcriptomic and proteomic data, we have identified several broad physiological trends in the E. coli starvation response. Strikingly, mRNAs are widely down-regulated in response to glucose starvation, presumably as a strategy for reducing new protein synthesis. By contrast, protein abundances display more varied responses. The abundances of many proteins involved in energy-intensive processes mirror the corresponding mRNA profiles while proteins involved in nutrient metabolism remain abundant even though their corresponding mRNAs are down-regulated.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 45
    Publication Date: 2015-08-15
    Description: by Shaun S. Sanders, Dale D. O. Martin, Stefanie L. Butland, Mathieu Lavallée-Adam, Diego Calzolari, Chris Kay, John R. Yates, Michael R. Hayden. Palmitoylation involves the reversible posttranslational addition of palmitate to cysteines and promotes membrane binding and subcellular localization. Recent advancements in the detection and identification of palmitoylated proteins have led to multiple palmitoylation proteomics studies, but these datasets are contained within large supplemental tables, making downstream analysis and data mining time-consuming and difficult. Consequently, we curated the data from 15 palmitoylation proteomics studies into one compendium containing 1,838 genes encoding palmitoylated proteins, representing approximately 10% of the genome. Enrichment analysis revealed highly significant enrichments for Gene Ontology biological processes, pathway maps, and process networks related to the nervous system. Strikingly, 41% of synaptic genes encode a palmitoylated protein in the compendium. The top disease associations included cancers and diseases and disorders of the nervous system, with Schizophrenia, HD, and pancreatic ductal carcinoma among the top five, suggesting that aberrant palmitoylation may play a pivotal role in the balance of cell death and survival. This compendium provides a much-needed resource for cell biologists and the palmitoylation field, providing new perspectives for cancer and neurodegeneration.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 46
    Publication Date: 2015-08-15
    Description: by Liat Rockah-Shmuel, Ágnes Tóth-Petróczy, Dan S. Tawfik. Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII’s natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 47
    Publication Date: 2015-08-13
    Description: by Shuai Yuan, H. Richard Johnston, Guosheng Zhang, Yun Li, Yi-Juan Hu, Zhaohui S. Qin. With the rapid decline of sequencing cost, researchers today rush to embrace the whole genome sequencing (WGS) or whole exome sequencing (WES) approach as the next powerful tool for relating genetic variants to human diseases and phenotypes. A fundamental step in analyzing WGS and WES data is mapping short sequencing reads back to the reference genome. This is an important issue because incorrectly mapped reads affect the downstream variant discovery, genotype calling and association analysis. Although many read mapping algorithms have been developed, the majority of them use the universal reference genome and do not take sequence variants into consideration. Given that genetic variants are ubiquitous, it is highly desirable to factor them into the read mapping procedure. In this work, we developed a novel strategy that utilizes genotypes obtained a priori to customize the universal haploid reference genome into a personalized diploid reference genome. The new strategy is implemented in a program named RefEditor. When applying RefEditor to real data, we achieved encouraging improvements in read mapping, variant discovery and genotype calling. Compared to standard approaches, RefEditor can significantly increase genotype calling consistency (from 43% to 61% at 4X coverage; from 82% to 92% at 20X coverage) and reduce Mendelian inconsistency across various sequencing depths. Because many WGS and WES studies are conducted on cohorts that have been genotyped using array-based genotyping platforms previously or concurrently, we believe the proposed strategy will be of high value in practice, and it can also be applied to the scenario where multiple NGS experiments are conducted on the same cohort. The RefEditor sources are available at https://github.com/superyuan/refeditor.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
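The central idea of this abstract, turning a haploid reference into a personalized diploid one from known genotypes, can be pictured with a very small sketch. This is not RefEditor's code or interface: the function name, the genotype format and the toy sequence are hypothetical, only single-nucleotide variants are handled, and because array genotypes are typically unphased, the assignment of alleles to the two copies is arbitrary here.

```python
def personalise(reference, genotypes):
    """Build two haplotype sequences from a haploid reference.

    genotypes maps a 0-based position to a pair of alleles, one per copy.
    Only single-nucleotide substitutions are supported in this sketch.
    """
    hap1, hap2 = list(reference), list(reference)
    for pos, (allele1, allele2) in genotypes.items():
        if not 0 <= pos < len(reference):
            raise ValueError(f"position {pos} is outside the reference")
        hap1[pos], hap2[pos] = allele1, allele2
    return "".join(hap1), "".join(hap2)


# Hypothetical reference and genotypes: heterozygous C/T at position 1,
# homozygous G/G (non-reference) at position 4.
reference = "ACGTACGT"
genotypes = {1: ("C", "T"), 4: ("G", "G")}

hap1, hap2 = personalise(reference, genotypes)
print(hap1)
print(hap2)
```

Reads carrying the individual's known alleles would then match one of the two sequences exactly at those positions, which is the intuition behind mapping against a personalized reference.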
  • 48
    Publication Date: 2015-08-15
    Description: Background: In structural bioinformatics, there is an increasing interest in identifying and understanding the evolution of local protein structures regarded as key structural or functional protein building blocks. A central need is then to compare these, possibly short, fragments by measuring efficiently and accurately their (dis)similarity. Progress towards this goal has given rise to scores that can assess the strong similarity of fragments. Yet, there is still a lack of more progressive scores, with meaningful intermediate values, for the comparison, retrieval or clustering of distantly related fragments. Results: We introduce here the Amplitude Spectrum Distance (ASD), a novel way of comparing protein fragments based on the discrete Fourier transform of their Cα distance matrix. Defined as the distance between their amplitude spectra, ASD can be computed efficiently and provides a parameter-free measure of the global shape dissimilarity of two fragments. ASD has useful theoretical properties, making it tolerant to shifts, insertions, deletions, circular permutations or sequence reversals while satisfying the triangle inequality. The practical interest of ASD with respect to RMSD, RMSD_d, BC and TM scores is illustrated through zinc finger retrieval experiments and concrete structure examples. The benefits of ASD are also illustrated by two additional clustering experiments: domain linker fragments and complementarity-determining regions of antibodies. Conclusions: Taking advantage of the Fourier transform to compare fragments at a global shape level, ASD is an objective and progressive measure that takes the whole fragments into account.
Its practical computation time and its properties make ASD particularly relevant for applications requiring meaningful measures on distantly related protein fragments, such as similar-fragment retrieval requiring high recall, as shown in the experiments, or any application that also takes advantage of the triangle inequality, such as fragment clustering. The ASD program and source code are freely available at: http://www.irisa.fr/dyliss/public/ASD/.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
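The idea behind ASD in the record above (comparing the amplitude spectra of Cα distance matrices) can be illustrated with a minimal sketch. This is not the authors' program: it assumes two fragments of equal length, uses a naive 2D discrete Fourier transform, and takes a plain Euclidean distance between the spectra. The function names and the toy coordinates are hypothetical.

```python
import cmath
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def amplitude_spectrum(matrix):
    """Magnitudes of the naive 2D discrete Fourier transform of a square matrix."""
    n = len(matrix)
    spectrum = []
    for u in range(n):
        row = []
        for v in range(n):
            total = 0j
            for j in range(n):
                for k in range(n):
                    angle = -2 * math.pi * (u * j + v * k) / n
                    total += matrix[j][k] * cmath.exp(1j * angle)
            row.append(abs(total))
        spectrum.append(row)
    return spectrum


def asd_sketch(frag_a, frag_b):
    """Euclidean distance between amplitude spectra (equal lengths only; an assumption)."""
    if len(frag_a) != len(frag_b):
        raise ValueError("this sketch only handles fragments of equal length")
    spec_a = amplitude_spectrum(distance_matrix(frag_a))
    spec_b = amplitude_spectrum(distance_matrix(frag_b))
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(spec_a, spec_b)
                         for x, y in zip(row_a, row_b)))


# Toy coordinates (made up for illustration, not taken from any structure).
fragment = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0),
            (8.1, 4.2, 1.5), (9.0, 7.5, 3.0), (12.1, 8.0, 4.4)]
permuted = fragment[2:] + fragment[:2]          # circular permutation
straight = [(3.8 * i, 0.0, 0.0) for i in range(6)]
```

Because a circular permutation or a reversal of the residue order only changes the phase of the transform, `asd_sketch(fragment, permuted)` and `asd_sketch(fragment, fragment[::-1])` are zero up to rounding, while `asd_sketch(fragment, straight)` is clearly positive.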
  • 49
    Publication Date: 2015-08-15
    Description: by Alexey A. Gritsenko, Marc Hulsman, Marcel J. T. Reinders, Dick de Ridder Translation of RNA to protein is a core process for any living organism. While for some steps of this process the effect on protein production is understood, a holistic understanding of translation still remains elusive. In silico modelling is a promising approach for elucidating the process of protein synthesis. Although a number of computational models of the process have been proposed, their application is limited by the assumptions they make. Ribosome profiling (RP), a relatively new sequencing-based technique capable of recording snapshots of the locations of actively translating ribosomes, is a promising source of information for deriving unbiased data-driven translation models. However, quantitative analysis of RP data is challenging due to high measurement variance and the inability to discriminate between the number of ribosomes measured on a gene and their speed of translation. We propose a solution in the form of a novel multi-scale interpretation of RP data that allows for deriving models with translation dynamics extracted from the snapshots. We demonstrate the usefulness of this approach by simultaneously determining for the first time per-codon translation elongation and per-gene translation initiation rates of Saccharomyces cerevisiae from RP data for two versions of the Totally Asymmetric Exclusion Process (TASEP) model of translation. We do this in an unbiased fashion, by fitting the models using only RP data with a novel optimization scheme based on Monte Carlo simulation to keep the problem tractable. The fitted models match the data significantly better than existing models and their predictions show better agreement with several independent protein abundance datasets than existing models. 
Results additionally indicate that the tRNA pool adaptation hypothesis is incomplete, with evidence suggesting that tRNA post-transcriptional modifications and codon context may play a role in determining codon elongation rates.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
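The TASEP picture of translation mentioned in the record above can be sketched as a small Monte Carlo simulation. This is an illustrative toy under stated assumptions, not the authors' fitting scheme: ribosomes occupy a single codon each, updates are random-sequential, and the rates, function name and parameters are hypothetical.

```python
import random


def simulate_tasep(init_rate, elong_rates, steps, seed=0):
    """Random-sequential TASEP with one initiation rate and per-codon elongation rates.

    Returns the time-averaged occupancy of every codon and the number of
    ribosomes that left the last codon (finished proteins).
    """
    rng = random.Random(seed)
    n = len(elong_rates)
    occupied = [False] * n
    occupancy = [0] * n
    finished = 0
    max_rate = max([init_rate] + list(elong_rates))  # used to turn rates into probabilities
    for _ in range(steps):
        i = rng.randrange(-1, n)  # -1 stands for an initiation attempt
        if i == -1:
            if not occupied[0] and rng.random() < init_rate / max_rate:
                occupied[0] = True
        elif occupied[i] and rng.random() < elong_rates[i] / max_rate:
            if i == n - 1:
                occupied[i] = False          # termination
                finished += 1
            elif not occupied[i + 1]:
                occupied[i] = False          # hop only if the next codon is free
                occupied[i + 1] = True
        for j in range(n):
            if occupied[j]:
                occupancy[j] += 1
    return [count / steps for count in occupancy], finished


# Hypothetical example: ten codons, one slow codon at position 5.
rates = [1.0] * 10
rates[5] = 0.1
density, finished = simulate_tasep(0.5, rates, steps=50000, seed=1)
```

In such a run the slow codon accumulates ribosomes, so its average occupancy is higher than that of the codon immediately downstream. This also illustrates why a density snapshot alone cannot separate ribosome number from ribosome speed.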
  • 50
    Publication Date: 2015-08-17
    Description: Background: Identifying periodically expressed genes across different processes (e.g. the cell and metabolic cycles, circadian rhythms, etc) is a central problem in computational biology. Biological time series may contain (multiple) unknown signal shapes of systemic relevance, imperfections like noise, damping, and trending, or limited sampling density. While there exist methods for detecting periodicity, their design biases (e.g. toward a specific signal shape) can limit their applicability in one or more of these situations. Methods: We present in this paper a novel method, SW1PerS, for quantifying periodicity in time series in a shape-agnostic manner and with resistance to damping. The measurement is performed directly, without presupposing a particular pattern, by evaluating the circularity of a high-dimensional representation of the signal. SW1PerS is compared to other algorithms using synthetic data and performance is quantified under varying noise models, noise levels, sampling densities, and signal shapes. Results on biological data are also analyzed and compared. Results: On the task of periodic/not-periodic classification, using synthetic data, SW1PerS outperforms all other algorithms in the low-noise regime. SW1PerS is shown to be the most shape-agnostic of the evaluated methods, and the only one to consistently classify damped signals as highly periodic. On biological data, and for several experiments, the lists of top 10% genes ranked with SW1PerS recover up to 67% of those generated with other popular algorithms. Moreover, the list of genes from data on the Yeast metabolic cycle which are highly-ranked only by SW1PerS, contains evidently non-cosine patterns (e.g. ECM33, CDC9, SAM1,2 and MSH6) with highly periodic expression profiles. 
In data from the Yeast cell cycle SW1PerS identifies genes not preferred by other algorithms, hence not previously reported as periodic, but found in other experiments such as the universal growth rate response of Slavov. These genes are BOP3, CDC10, YIL108W, YER034W, MLP1, PAC2 and RTT101. Conclusions: In biological systems with low noise, i.e. where periodic signals with interesting shapes are more likely to occur, SW1PerS can be used as a powerful tool in exploratory analyses. Indeed, by having an initial set of periodic genes with a rich variety of signal types, pattern/shape information can be included in the study of systems and the generation of hypotheses regarding the structure of gene regulatory networks.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
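The "high-dimensional representation of the signal" in the record above is, as the method's name suggests, a sliding-window embedding of the time series. A minimal sketch of that embedding step follows. The circularity measurement that SW1PerS then performs on the resulting point cloud is not reproduced here, and the function name and parameters are assumptions.

```python
import math


def sliding_window_embedding(signal, dim, tau):
    """Map a time series to the points (x[t], x[t+tau], ..., x[t+dim*tau])."""
    span = dim * tau
    return [tuple(signal[t + k * tau] for k in range(dim + 1))
            for t in range(len(signal) - span)]


# Hypothetical example: a sine wave sampled 12 times per period.
signal = [math.sin(2 * math.pi * t / 12) for t in range(48)]
cloud = sliding_window_embedding(signal, dim=2, tau=3)
```

For a periodic signal the points of `cloud` trace a closed loop: the window starting at sample 12 coincides with the window starting at sample 0, whatever the shape of the signal.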
  • 51
    Publication Date: 2015-09-11
    Description: To monitor multiple environmental factors of henhouses in modern chicken farms, a henhouse online monitoring system based on a wireless sensor network was developed using wireless sensor technology and computer network technology. Sensor data compensation and correction were implemented in software using data fitting methods, reliable data transmission was achieved using a data loss recovery strategy, and data missing during monitoring were handled using a self-decision and online filling method. Operational testing showed that the system was economical and reliable; that it enabled wireless monitoring and Web display of the environmental factors of a henhouse; and that the root mean square errors (RMSEs) between the values estimated by the self-decision and online filling method and the experimental values of the four environmental factors were 0.1698, 3.0859, 77 and 0.094, respectively, indicative of high estimation accuracy. The system can provide support for modern management of henhouses and can be transplanted to related monitoring scenarios in the agricultural field.
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
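The root mean square error quoted in the record above is the square root of the mean squared difference between estimated and measured values. A minimal sketch, with a hypothetical function name and made-up numbers in the usage note:

```python
import math


def rmse(estimated, measured):
    """Root mean square error between two equally long sequences of values."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `rmse([2.0, 4.0], [1.0, 5.0])` averages two squared errors of 1 and returns 1.0. The RMSE carries the unit of the measured quantity, so values for different environmental factors are not directly comparable.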
  • 52
    Public Library of Science (PLoS)
    Publication Date: 2015-09-11
    Description: by Santiago Schnell
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 53
    Public Library of Science (PLoS)
    Publication Date: 2015-09-15
    Description: by Alexander Ullrich, Mathias A. Böhme, Johannes Schöneberg, Harald Depner, Stephan J. Sigrist, Frank Noé Synaptic vesicle fusion is mediated by SNARE proteins forming in between synaptic vesicle (v-SNARE) and plasma membrane (t-SNARE), one of which is Syntaxin-1A. Although exocytosis mainly occurs at active zones, Syntaxin-1A appears to cover the entire neuronal membrane. By using STED super-resolution light microscopy and image analysis of Drosophila neuro-muscular junctions, we show that Syntaxin-1A clusters are more abundant and have an increased size at active zones. A computational particle-based model of syntaxin cluster formation and dynamics is developed. The model is parametrized to reproduce Syntaxin cluster-size distributions found by STED analysis, and successfully reproduces existing FRAP results. The model shows that the neuronal membrane is adjusted in a way to strike a balance between having most syntaxins stored in large clusters, while still keeping a mobile fraction of syntaxins free or in small clusters that can efficiently search the membrane or be traded between clusters. This balance is subtle and can be shifted toward almost no clustering or almost complete clustering by modifying the syntaxin interaction energy on the order of only 1 k_B T. This capability appears to be exploited at active zones. The larger active-zone syntaxin clusters are more stable and provide regions of high docking and fusion capability, whereas the smaller clusters outside may serve as a flexible reserve pool or as sites of spontaneous ectopic release.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 54
    Publication Date: 2015-09-15
    Description: by Stephan Köhler, Friederike Schmid, Giovanni Settanni Fibrinogen is a serum multi-chain protein which, when activated, aggregates to form fibrin, one of the main components of a blood clot. Fibrinolysis controls blood clot dissolution through the action of the enzyme plasmin, which cleaves fibrin at specific locations. Although the main biochemical factors involved in fibrin formation and lysis have been identified, a clear mechanistic picture of how these processes take place is not available yet. This picture would be instrumental, for example, for the design of improved thrombolytic or anti-haemorrhagic strategies, as well as, materials with improved biocompatibility. Here, we present extensive molecular dynamics simulations of fibrinogen which reveal large bending motions centered at a hinge point in the coiled-coil regions of the molecule. This feature, likely conserved across vertebrates according to our analysis, suggests an explanation for the mechanism of exposure to lysis of the plasmin cleavage sites on fibrinogen coiled-coil region. It also explains the conformational variability of fibrinogen observed during its adsorption on inorganic surfaces and it is supposed to play a major role in the determination of the hydrodynamic properties of fibrinogen. In addition the simulations suggest how the dynamics of the D region of fibrinogen may contribute to the allosteric regulation of the blood coagulation cascade through a dynamic coupling between the a- and b-holes, important for fibrin polymerization, and the integrin binding site P1.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 55
    Public Library of Science (PLoS)
    Publication Date: 2015-09-15
    Description: by Sergei Maslov, Kim Sneppen Populations of species in ecosystems are often constrained by the availability of resources within their environment. In effect this means that growth of one population must be balanced by a comparable reduction in the populations of others. In neutral models of biodiversity all populations are assumed to change incrementally due to stochastic births and deaths of individuals. Here we propose and model another redistribution mechanism, driven by an abrupt and severe reduction in the size of the population of a single species, which frees up resources for the remaining ones. This mechanism may be relevant, for example, for communities of bacteria, with strain-specific collapses caused by invading bacteriophages, or for other ecosystems where infectious diseases play an important role. The emergent dynamics of our system is characterized by cyclic “diversity waves” triggered by collapses of globally dominating populations. The population diversity peaks at the beginning of each wave and decreases exponentially afterwards. Species abundances have a bimodal time-aggregated distribution, with the lower peak formed by populations of recently collapsed or newly introduced species and the upper peak by species that have not yet collapsed in the current wave. In most waves both upper and lower peaks are composed of several smaller peaks. This self-organized hierarchical peak structure has a long-term memory transmitted across several waves. It gives rise to a scale-free tail of the time-aggregated population distribution with a universal exponent of 1.7. We show that diversity wave dynamics is robust with respect to variations in the rules of our model such as diffusion between multiple environments, species-specific growth and extinction rates, and bet-hedging strategies.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
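The redistribution mechanism described in the record above (one species collapses and the freed resources go to the others) can be illustrated with a toy update rule. This sketch is not the authors' model. It assumes a fixed number of species sharing a resource, a uniformly random choice of the collapsing species, a collapse to a fraction `gamma` of its size, and redistribution in proportion to the sizes of the other populations. All names and rules here are hypothetical.

```python
import random


def collapse_step(pops, gamma, rng):
    """Collapse one randomly chosen population and hand its resources to the others."""
    i = rng.randrange(len(pops))
    freed = pops[i] * (1.0 - gamma)
    rest = sum(pops) - pops[i]
    updated = [p + freed * p / rest for p in pops]  # proportional redistribution (assumption)
    updated[i] = pops[i] * gamma
    return updated


def inverse_simpson(pops):
    """Effective number of species, used here as a simple diversity measure."""
    total = sum(pops)
    return 1.0 / sum((p / total) ** 2 for p in pops)


# Hypothetical run: four equal populations, 100 collapse events.
rng = random.Random(0)
pops = [0.25] * 4
for _ in range(100):
    pops = collapse_step(pops, gamma=0.01, rng=rng)
```

The update conserves the total population, so growth of the survivors is exactly balanced by the collapse. Tracking `inverse_simpson(pops)` over time gives a rough view of how diversity changes between collapses.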
  • 56
    Public Library of Science (PLoS)
    Publication Date: 2015-09-16
    Description: by Hannah Edwards, Charlotte M. Deane Several protein structure classification schemes exist that partition the protein universe into structural units called folds. Yet these schemes do not discuss how these units sit relative to each other in a global structure space. In this paper we construct networks that describe such global relationships between folds in the form of structural bridges. We generate these networks using four different structural alignment methods across multiple score thresholds. The networks constructed using the different methods remain a similar distance apart regardless of the probability threshold defining a structural bridge. This suggests that at least some structural bridges are method specific and that any attempt to build a picture of structural space should not be reliant on a single structural superposition method. Despite these differences all representations agree on an organisation of fold space into five principal community structures: all-α, all-β sandwiches, all-β barrels, α/β and α+β. We project estimated fold ages onto the networks and find that not only are the pairings of unconnected folds associated with higher age differences than bridged folds, but this difference increases with the number of networks displaying an edge. We also examine different centrality measures for folds within the networks and how these relate to fold age. While these measures interpret the central core of fold space in varied ways they all identify the disposition of ancestral folds to fall within this core and that of the more recently evolved structures to provide the peripheral landscape. These findings suggest that evolutionary information is encoded along these structural bridges. Finally, we identify four highly central pivotal folds representing dominant topological features which act as key attractors within our landscapes.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 57
    Public Library of Science (PLoS)
    Publication Date: 2015-09-16
    Description: by Ayal Lavi, Omri Perez, Uri Ashery Neuronal microcircuits generate oscillatory activity, which has been linked to basic functions such as sleep, learning and sensorimotor gating. Although synaptic release processes are well known for their ability to shape the interaction between neurons in microcircuits, most computational models do not simulate the synaptic transmission process directly and hence cannot explain how changes in synaptic parameters alter neuronal network activity. In this paper, we present a novel neuronal network model that incorporates presynaptic release mechanisms, such as vesicle pool dynamics and calcium-dependent release probability, to model the spontaneous activity of neuronal networks. The model, which is based on modified leaky integrate-and-fire neurons, generates spontaneous network activity patterns, which are similar to experimental data and robust under changes in the model's primary gain parameters such as excitatory postsynaptic potential and connectivity ratio. Furthermore, it reliably recreates experimental findings and provides mechanistic explanations for data obtained from microelectrode array recordings, such as network burst termination and the effects of pharmacological and genetic manipulations. The model demonstrates how elevated asynchronous release, but not spontaneous release, synchronizes neuronal network activity and reveals that asynchronous release enhances utilization of the recycling vesicle pool to induce the network effect. The model further predicts a positive correlation between vesicle priming at the single-neuron level and burst frequency at the network level; this prediction is supported by experimental findings. Thus, the model is utilized to reveal how synaptic release processes at the neuronal level govern activity patterns and synchronization at the network level.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 58
    Public Library of Science (PLoS)
    Publication Date: 2015-09-18
    Description: by Michael A. Cerullo
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 59
    Public Library of Science (PLoS)
    Publication Date: 2015-09-18
    Description: by Jasmine Foo, Lin L Liu, Kevin Leder, Markus Riester, Yoh Iwasa, Christoph Lengauer, Franziska Michor The traditional view of cancer as a genetic disease that can successfully be treated with drugs targeting mutant onco-proteins has motivated whole-genome sequencing efforts in many human cancer types. However, only a subset of mutations found within the genomic landscape of cancer is likely to provide a fitness advantage to the cell. Distinguishing such “driver” mutations from innocuous “passenger” events is critical for prioritizing the validation of candidate mutations in disease-relevant models. We design a novel statistical index, called the Hitchhiking Index, which reflects the probability that any observed candidate gene is a passenger alteration, given the frequency of alterations in a cross-sectional cancer sample set, and apply it to a mutational data set in colorectal cancer. Our methodology is based upon a population dynamics model of mutation accumulation and selection in colorectal tissue prior to cancer initiation as well as during tumorigenesis. This methodology can be used to aid in the prioritization of candidate mutations for functional validation and contributes to the process of drug discovery.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 60
    Public Library of Science (PLoS)
    Publication Date: 2015-09-18
    Description: by Héctor García Martín, Vinay Satish Kumar, Daniel Weaver, Amit Ghosh, Victor Chubukov, Aindrila Mukhopadhyay, Adam Arkin, Jay D. Keasling Current limitations in quantitatively predicting biological behavior hinder our efforts to engineer biological systems to produce biofuels and other desired chemicals. Here, we present a new method for calculating metabolic fluxes, key targets in metabolic engineering, that incorporates data from 13C labeling experiments and genome-scale models. The data from 13C labeling experiments provide strong flux constraints that eliminate the need to assume an evolutionary optimization principle such as the growth rate optimization assumption used in Flux Balance Analysis (FBA). This effective constraining is achieved by making the simple but biologically relevant assumption that flux flows from core to peripheral metabolism and does not flow back. The new method is significantly more robust than FBA with respect to errors in genome-scale model reconstruction. Furthermore, it can provide a comprehensive picture of metabolite balancing and predictions for unmeasured extracellular fluxes as constrained by 13C labeling data. A comparison shows that the results of this new method are similar to those found through 13C Metabolic Flux Analysis (13C MFA) for central carbon metabolism but, additionally, it provides flux estimates for peripheral metabolism. The extra validation gained by matching 48 relative labeling measurements is used to identify where and why several existing COnstraint Based Reconstruction and Analysis (COBRA) flux prediction algorithms fail. We demonstrate how to use this knowledge to refine these methods and improve their predictive capabilities. This method provides a reliable base upon which to improve the design of biological systems.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 61
    Publication Date: 2015-09-19
    Description: Background: Two component systems (TCS) are signalling complexes manifested by a histidine kinase (receptor) and a response regulator (effector). They are the most abundant signalling pathways in prokaryotes and control a wide range of biological processes. The pairing of these two components is highly specific, often requiring costly and time-consuming experimental characterisation. Therefore, there is considerable interest in developing accurate prediction tools to lessen the burden of experimental work and cope with the ever-increasing amount of genomic information. Results: We present a novel meta-predictor, MetaPred2CS, which is based on a support vector machine. MetaPred2CS integrates six sequence-based prediction methods: in-silico two-hybrid, mirror-tree, gene fusion, phylogenetic profiling, gene neighbourhood, and gene operon. To benchmark MetaPred2CS, we also compiled a novel high-quality training dataset of experimentally deduced TCS protein pairs for k-fold cross validation, to act as a gold standard for TCS partnership predictions. Combining individual predictions using MetaPred2CS improved performance when compared to the individual methods and in comparison with a current state-of-the-art meta-predictor. Conclusion: We have developed MetaPred2CS, a support vector machine-based metapredictor for prokaryotic TCS protein pairings. Central to the success of MetaPred2CS is a strategy of integrating individual predictors that improves the overall prediction accuracy, with the in-silico two-hybrid method contributing most to performance. MetaPred2CS outperformed other available systems in our benchmark tests, and is available online at http://metapred2cs.ibers.aber.ac.uk, along with our gold standard dataset of TCS interaction pairs.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 62
    Publication Date: 2015-09-19
    Description: Background: Technological advances have enabled the analysis of very small amounts of DNA in forensic cases. However, the DNA profiles from such evidence are frequently incomplete and can contain contributions from multiple individuals. The complexity of such samples confounds the assessment of the statistical weight of such evidence. One approach to account for this uncertainty is to use a likelihood ratio framework to compare the probability of the evidence profile under different scenarios. While researchers favor the likelihood ratio framework, few open-source software solutions with a graphical user interface implementing these calculations are available for practicing forensic scientists. Results: To address this need, we developed Lab Retriever, an open-source, freely available program that forensic scientists can use to calculate likelihood ratios for complex DNA profiles. Lab Retriever adds a graphical user interface, written primarily in JavaScript, on top of a C++ implementation of the previously published R code of Balding. We redesigned parts of the original Balding algorithm to improve computational speed. In addition to incorporating a probability of allelic drop-out and other critical parameters, Lab Retriever computes likelihood ratios for hypotheses that can include up to four unknown contributors to a mixed sample. These computations are completed nearly instantaneously on a modern PC or Mac computer. Conclusions: Lab Retriever provides a practical software solution to forensic scientists who wish to assess the statistical weight of evidence for complex DNA profiles. Executable versions of the program are freely available for Mac OSX and Windows operating systems.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 63
    Publication Date: 2015-09-22
    Description: by Julia C. Quindlen, Victor K. Lai, Victor H. Barocas Cutaneous mechanoreceptors transduce different tactile stimuli into neural signals that produce distinct sensations of touch. The Pacinian corpuscle (PC), a cutaneous mechanoreceptor located deep within the dermis of the skin, detects high frequency vibrations that occur within its large receptive field. The PC is comprised of lamellae that surround the nerve fiber at its core. We hypothesized that a layered, anisotropic structure, embedded deep within the skin, would produce the nonlinear strain transmission and low spatial sensitivity characteristic of the PC. A multiscale finite-element model was used to model the equilibrium response of the PC to indentation. The first simulation considered an isolated PC with fiber networks aligned with the PC’s surface. The PC was subjected to a 10 μm indentation by a 250 μm diameter indenter. The multiscale model captured the nonlinear strain transmission through the PC, predicting decreased compressive strain with proximity to the receptor’s core, as seen experimentally by others. The second set of simulations considered a single PC embedded epidermally (shallow) or dermally (deep) to model the PC’s location within the skin. The embedded models were subjected to 10 μm indentations at a series of locations on the surface of the skin. Strain along the long axis of the PC was calculated after indentation to simulate stretch along the nerve fiber at the center of the PC. Receptive fields for the epidermis and dermis models were constructed by mapping the long-axis strain after indentation at each point on the surface of the skin mesh. The dermis model resulted in a larger receptive field, as the calculated strain showed less indenter location dependence than in the epidermis model.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 64
    Public Library of Science (PLoS)
    Publication Date: 2015-09-23
    Description: by Vipin Narang, Muhamad Azfar Ramli, Amit Singhal, Pavanish Kumar, Gennaro de Libero, Michael Poidinger, Christopher Monterola Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified the K-core decomposition, PageRank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network. The algorithms were evaluated based on the previously known roles of these genes in MCF-7 biology as well as on their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier-to-interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
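The K-core decomposition named in the record above assigns each node the largest k such that the node belongs to a subgraph in which every node has at least k neighbours. A minimal sketch of the standard peeling procedure follows. It treats the network as undirected, which is a simplification for a regulatory network, and the node names are hypothetical.

```python
def core_numbers(adjacency):
    """Core number of every node of an undirected graph.

    adjacency maps each node to the set of its neighbours. Nodes are peeled
    in order of their current degree; a node's core number is the largest
    minimum degree seen up to the moment it is removed.
    """
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical toy network: a triangle of regulators, one target, one isolated gene.
network = {
    "TF1": {"TF2", "TF3", "geneX"},
    "TF2": {"TF1", "TF3"},
    "TF3": {"TF1", "TF2"},
    "geneX": {"TF1"},
    "geneY": set(),
}
layers = core_numbers(network)
```

Here the three mutually connected regulators end up in the innermost layer (core number 2), the single target in layer 1 and the isolated gene in layer 0. This is the kind of layered hierarchy the abstract describes.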
  • 65
    Publication Date: 2015-09-23
    Description: by Jianfei Hu, Johnathan Neiswinger, Jin Zhang, Heng Zhu, Jiang Qian Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as we expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with the previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 66
    Publication Date: 2015-09-23
    Description: by Vassilios Christopoulos, Paul R. Schrater Decisions involve two fundamental problems, selecting goals and generating actions to pursue those goals. While simple decisions involve choosing a goal and pursuing it, humans evolved to survive in hostile dynamic environments where goal availability and value can change with time and previous actions, entangling goal decisions with action selection. Recent studies suggest the brain generates concurrent action-plans for competing goals, using online information to bias the competition until a single goal is pursued. This creates a challenging problem of integrating information across diverse types, including both the dynamic value of the goal and the costs of action. We model the computations underlying dynamic decision-making with disparate value types, using the probability of getting the highest pay-off with the least effort as a common currency that supports goal competition. This framework predicts many aspects of decision behavior that have eluded a common explanation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 67
    Publication Date: 2015-09-24
    Description: Background: One of the most important applications of transcriptomic data is cancer phenotype classification. Many characteristics of transcriptomic data, such as redundant features and technical artifacts, make over-fitting commonplace. Promising classification results often fail to generalize across datasets with different sources, platforms, or preprocessing. Recently, a novel differential network rank conservation (DIRAC) algorithm was proposed to characterize cancer phenotypes using transcriptomic data. DIRAC is a member of a family of algorithms that have proven useful for disease classification based on the relative expression of genes. Combining the robustness of this family’s simple decision rules with known biological relationships, this systems approach identifies interpretable, yet highly discriminative networks. While DIRAC was briefly employed for several classification problems in the original paper, its potential in cancer phenotype classification, and especially its robustness against artifacts in transcriptomic data, have not yet been fully characterized. Results: In this study we thoroughly investigate the potential of DIRAC by applying it to multiple datasets, and examine the variations in classification performance when datasets are (i) treated and untreated for batch effects; (ii) preprocessed with different techniques. We also propose the first DIRAC-based classifier to integrate multiple networks. We show that the DIRAC-based classifier is very robust in the examined scenarios. To our surprise, the trained DIRAC-based classifier even translated well to a dataset with different biological characteristics in the presence of substantial batch effects that, as shown here, plagued the standard expression value based classifier.
In addition, the DIRAC-based classifier, because of the integrated biological information, also suggests pathways to target in specific subtypes, which may enhance the establishment of personalized therapy in diseases such as pediatric AML. In order to better comprehend the prediction power of the DIRAC-based classifier in general, we also performed classifications using publicly available datasets from breast and lung cancer. Furthermore, multiple well-known classification algorithms were utilized to create an ideal test bed for comparing the DIRAC-based classifier with the standard gene expression value based classifier. We observed that the DIRAC-based classifier greatly outperforms its rival. Conclusions: Based on our experiments with multiple datasets, we propose that DIRAC is a promising solution to the lack of generalizability in classification efforts that use transcriptomic data. We believe that the superior performance presented in this study may motivate others to initiate a new line of research to explore the untapped power of DIRAC in a broad range of cancer types.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 68
    Public Library of Science (PLoS)
    Publication Date: 2015-09-23
    Description: by Chakravarthy Marella, Andrew E. Torda, Dominik Schwudke A lipidome is the set of lipids in a given organism, cell or cell compartment and this set reflects the organism’s synthetic pathways and interactions with its environment. Recently, lipidomes of biological model organisms and cell lines were published and the number of functional studies of lipids is increasing. In this study we propose a homology metric that can quantify systematic differences in the composition of a lipidome. Algorithms were developed to (1) consistently convert lipid structures into SMILES, (2) determine structural similarity between molecular species and (3) describe a lipidome in a chemical space model. We tested lipid structure conversion and structure similarity metrics, in detail, using sets of isomeric ceramide molecules and chemically related phosphatidylinositols. Template-based SMILES showed the best properties for representing lipid-specific structural diversity. We also show that sequence analysis algorithms are best suited to calculate distances between such template-based SMILES, and we judged the Levenshtein distance to be the best choice for quantifying structural changes. When all lipid molecules of the LIPIDMAPS structure database were mapped in chemical space, they automatically formed clusters corresponding to conventional chemical families. Accordingly, we mapped a pair of lipidomes into the same chemical space and determined the degree of overlap by calculating the Hausdorff distance. We named this metric the ‘Lipidome jUXtaposition (LUX) score’. First, we tested this approach for estimating the lipidome similarity on four yeast strains with known genetic alteration in fatty acid synthesis. We show that the LUX score reflects the genetic relationship and growth temperature better than conventional methods although the score is based solely on lipid structures. Next, we applied this metric to high-throughput data of larval tissue lipidomes of Drosophila. 
This showed that the LUX score is sufficient to cluster tissues and determine the impact of nutritional changes in an unbiased manner, despite the limited information on the underlying structural diversity of each lipidome. This study is the first effort to define a lipidome homology metric based on structures that will enrich functional association of lipids in a similar manner to measures used in genetics. Finally, we discuss the significance of the LUX score to perform comparative lipidome studies across species borders.
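The two distance measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' LUX implementation: the SMILES strings are ordinary (not template-based) SMILES for two fatty acids, the 2D points are made-up coordinates standing in for a chemical space, and the function names `levenshtein` and `hausdorff` are chosen here for illustration only.

```python
from math import dist


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def hausdorff(xs, ys) -> float:
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(ps, qs):
        # Largest distance from a point in ps to its nearest point in qs.
        return max(min(dist(p, q) for q in qs) for p in ps)
    return max(directed(xs, ys), directed(ys, xs))


# Toy inputs (assumptions, not data from the study).
hexanoic = "CCCCCC(=O)O"
octanoic = "CCCCCCCC(=O)O"
print(levenshtein(hexanoic, octanoic))  # 2: two extra carbon atoms

lipidome_a = [(0.0, 0.0), (1.0, 0.0)]
lipidome_b = [(0.0, 0.0), (4.0, 0.0)]
print(hausdorff(lipidome_a, lipidome_b))  # 3.0
```

The Hausdorff distance is driven by the single worst-matched point, so a lipid present in one lipidome with no close structural counterpart in the other dominates the value.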
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 69
    Publication Date: 2015-09-24
    Description: The self-organizing nature of Mobile Ad hoc Networks (MANETs) provides a communication channel anywhere, anytime, without any pre-existing network infrastructure. However, such networks are exposed to various vulnerabilities that may be exploited by malicious nodes. One such malicious behavior is introduced by blackhole nodes, which can easily be introduced into the network and which try to disrupt its operation by dropping as much of the data under transmission as possible. In this paper, a new protocol, Enhanced Secure Trusted AODV (ESTA), is proposed. It is based on the widely used Ad hoc On-Demand Distance Vector (AODV) protocol and uses multiple paths together with trust and asymmetric cryptography to ensure data security. The results, based on NS-3 simulation, reveal that the proposed protocol is able to counter the blackhole nodes effectively in three different scenarios.
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
  • 70
    Publication Date: 2015-09-24
    Description: by Noah Ollikainen, René M. de Jong, Tanja Kortemme Interactions between small molecules and proteins play critical roles in regulating and facilitating diverse biological functions, yet our ability to accurately re-engineer the specificity of these interactions using computational approaches has been limited. One main difficulty, in addition to inaccuracies in energy functions, is the exquisite sensitivity of protein–ligand interactions to subtle conformational changes, coupled with the computational problem of sampling the large conformational search space of degrees of freedom of ligands, amino acid side chains, and the protein backbone. Here, we describe two benchmarks for evaluating the accuracy of computational approaches for re-engineering protein–ligand interactions: (i) prediction of enzyme specificity altering mutations and (ii) prediction of sequence tolerance in ligand binding sites. After finding that current state-of-the-art “fixed backbone” design methods perform poorly on these tests, we develop a new “coupled moves” design method in the program Rosetta that couples changes to protein sequence with alterations in both protein side-chain and protein backbone conformations, and allows for changes in ligand rigid-body and torsion degrees of freedom. We show significantly increased accuracy in both predicting ligand specificity altering mutations and binding site sequences. These methodological improvements should be useful for many applications of protein–ligand design. The approach also provides insights into the role of subtle conformational adjustments that enable functional changes not only in engineering applications but also in natural protein evolution.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 71
    Publication Date: 2015-09-25
    Description: Background: Searching for two-dimensional (2D) structural similarities is a useful tool to identify new active compounds in drug-discovery programs. However, as 2D similarity measures neglect important structural and functional features, similarity by 2D might be underestimated. In the present study, we used combined 2D and three-dimensional (3D) similarity comparisons to reveal possible new functions and/or side-effects of known bioactive compounds. Results: We utilised more than 10,000 compounds from the SuperTarget database with known inhibition values for twelve different anti-cancer targets. We performed all-against-all comparisons resulting in 2D similarity landscapes. Among the regions with low 2D similarity scores are inhibitors of vascular endothelial growth factor receptor (VEGFR) and inhibitors of poly ADP-ribose polymerase (PARP). To demonstrate that 3D landscape comparison can identify similarities, which are untraceable in 2D similarity comparisons, we analysed this region in more detail. This 3D analysis showed the unexpected structural similarity between inhibitors of VEGFR and inhibitors of PARP. Among the VEGFR inhibitors that show similarities to PARP inhibitors was Vatalanib, an oral “multi-targeted” small molecule protein kinase inhibitor being studied in phase-III clinical trials in cancer therapy. An in silico docking simulation and an in vitro HT universal colorimetric PARP assay confirmed that the VEGFR inhibitor Vatalanib exhibits off-target activity as a PARP inhibitor, broadening its mode of action. Conclusion: In contrast to the 2D-similarity search, the 3D-similarity landscape comparison identifies new functions and side effects of the known VEGFR inhibitor Vatalanib.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 72
    Publication Date: 2015-09-25
    Description: Background: Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. Results: We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for ‘enrichment’ or conditional differences using one of a number of commonly available packages. Conclusion: The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 73
    Publication Date: 2015-09-25
    Description: Background: In the past decade, the identification of gene co-expression has become a routine part of the analysis of high-dimensional microarray data. Gene co-expression, which is mostly detected via the Pearson correlation coefficient, has played an important role in the discovery of molecular pathways and networks. Unfortunately, the presence of systematic noise in high-dimensional microarray datasets corrupts estimates of gene co-expression. Removing systematic noise from microarray data is therefore crucial. Many cleaning approaches for microarray data exist; however, these methods are aimed at improving differential expression analysis, and their performance has been tested primarily for this application. To our knowledge, the performance of these approaches has never been systematically compared in the context of gene co-expression estimation. Results: Using simulations we demonstrate that standard cleaning procedures, such as background correction and quantile normalization, fail to adequately remove systematic noise that affects gene co-expression and at times further degrade true gene co-expression. Instead we show that a global version of removal of unwanted variation (RUV), a data-driven approach, not only removes systematic noise but also allows the estimation of the true underlying gene-gene correlations. We compare the performance of all noise removal methods when applied to five large published datasets on gene expression in the human brain. RUV retrieves the highest gene co-expression values for sets of genes known to interact and also provides the greatest consistency across all five datasets. We apply the method to prioritize epileptic encephalopathy candidate genes. Conclusions: Our work raises serious concerns about the quality of many published gene co-expression analyses. 
RUV provides an efficient and flexible way to remove systematic noise from high-dimensional microarray datasets when the objective is gene co-expression analysis. The RUV method as applied in the context of gene-gene correlation estimation is available as a Bioconductor package: RUVcorr.
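To make the concern about systematic noise concrete, the sketch below computes the Pearson correlation coefficient for two toy genes, before and after a shared batch offset is added to both. It does not implement RUV or RUVcorr; the expression values, the offset and the function name `pearson` are assumptions made purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Two toy genes that are perfectly anti-correlated across four samples.
gene_a = [1.0, 2.0, 1.0, 2.0]
gene_b = [2.0, 1.0, 2.0, 1.0]

# Hypothetical batch effect: samples 3 and 4 carry a shared offset of +10.
batch = [0.0, 0.0, 10.0, 10.0]
noisy_a = [v + s for v, s in zip(gene_a, batch)]
noisy_b = [v + s for v, s in zip(gene_b, batch)]

print(pearson(gene_a, gene_b))    # -1.0
print(pearson(noisy_a, noisy_b))  # about 0.98: the shared offset dominates
```

In this toy case the shared offset reverses the sign of the estimated co-expression, which is the kind of distortion a noise-removal step is meant to undo.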
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 74
    Publication Date: 2015-09-25
    Description: by Pete Riley, Michal Ben-Nun, Jon A. Linker, Angelia A. Cost, Jose L. Sanchez, Dylan George, David P. Bacon, Steven Riley The potential rapid availability of large-scale clinical episode data during the next influenza pandemic suggests an opportunity for increasing the speed with which novel respiratory pathogens can be characterized. Key intervention decisions will be determined by both the transmissibility of the novel strain (measured by the basic reproductive number R_0) and its individual-level severity. The 2009 pandemic illustrated that estimating individual-level severity, as described by the proportion p_C of infections that result in clinical cases, can remain uncertain for a prolonged period of time. Here, we use 50 distinct US military populations during 2009 as a retrospective cohort to test the hypothesis that real-time encounter data combined with disease dynamic models can be used to bridge this uncertainty gap. Effectively, we estimated the total number of infections in multiple early-affected communities using the model and divided that number by the known number of clinical cases. Joint estimates of severity and transmissibility clustered within a relatively small region of parameter space, with 40 of the 50 populations bounded by p_C, 0.0133–0.150 and R_0, 1.09–2.16. These fits were obtained despite widely varying incidence profiles: some with spring waves, some with fall waves and some with both. To illustrate the benefit of specific pairing of rapidly available data and infectious disease models, we simulated a future moderate pandemic strain with p_C approximately 10 times that of 2009; the results demonstrate that even before the peak had passed in the first affected population, R_0 and p_C could be well estimated. This study provides a clear reference in this two-dimensional space against which future novel respiratory pathogens can be rapidly assessed and compared with previous pandemics.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 75
    Public Library of Science (PLoS)
    Publication Date: 2015-09-26
    Description: by Gang Li, Karen E. Ross, Cecilia N. Arighi, Yifan Peng, Cathy H. Wu, K. Vijay-Shanker MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA gene targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good precision and recall when evaluated on a literature corpus of 150 abstracts with F-scores close to 0.90 on the three different types of relations. We conducted full-scale text mining using miRTex to process all the Medline abstracts and all the full-length articles in the PubMed Central Open Access Subset. The results for all the Medline abstracts are stored in a database for interactive query and file download via the website at http://proteininformationresource.org/mirtex. Using miRTex, we identified genes potentially regulated by miRNAs in Triple Negative Breast Cancer, as well as miRNA-gene relations that, in conjunction with kinase-substrate relations, regulate the response to abiotic stress in Arabidopsis thaliana. These two use cases demonstrate the usefulness of miRTex text mining in the analysis of miRNA-regulated biological processes.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 76
    Publication Date: 2015-09-26
    Description: by Greg Jensen, Fabian Muñoz, Yelda Alkan, Vincent P. Ferrera, Herbert S. Terrace Transitive inference (the ability to infer that B > D given that B > C and C > D) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models reliant on reward prediction error or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus during every trial, whether that stimulus was visible or not. Performance was compared for rhesus macaques, humans, and the betasort algorithm, as well as Q-learning, an established reward-prediction error (RPE) model. Of these, only Q-learning failed to respond above chance during critical test trials. Betasort’s success (when compared to RPE models) and its computational efficiency (when compared to full Markov decision process implementations) suggest that the study of reinforcement learning in organisms will be best served by a feature-driven approach to comparing formal models.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 77
    Publication Date: 2015-09-29
    Description: This paper is based on the experience of introducing wireless sensor networks (WSNs) into the building industry in Denmark and in a rural area of Greenland. There are very real advantages in the application of the technology and its consequences for the life cycle operation of the building sector. Sensor networks can be seen as an important part of the Internet of Things and may even constitute an Internet of Sensors, since the communication layers can differ from the Internet standards. The current paper describes the case for application, followed by a discussion of the observed adaptive advantages and consequences of the technology. Essentially, WSNs constitute a highly sophisticated technology that is more robust in a rural context due to its extremely simple installation procedures (plug and play) allowing the use of local less-skilled labour, and the possibility of reconfiguring and repurposing its use remotely.
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
  • 78
    Publication Date: 2015-10-01
    Description: Background: The detection of the glomeruli is a key step in the histopathological evaluation of microscopic images of the kidneys. However, the task of automatic detection of the glomeruli poses challenges owing to the differences in their sizes and shapes in renal sections as well as the extensive variations in their intensities due to heterogeneity in immunohistochemistry staining. Although the rectangular histogram of oriented gradients (Rectangular HOG) is a widely recognized powerful descriptor for general object detection, it shows many false positives owing to the aforementioned difficulties in the context of glomeruli detection. Results: A new descriptor referred to as Segmental HOG was developed to perform a comprehensive detection of hundreds of glomeruli in images of whole kidney sections. The new descriptor possesses flexible blocks that can be adaptively fitted to input images in order to acquire robustness for the detection of the glomeruli. Moreover, the novel segmentation technique employed here generates high-quality segmentation outputs, and the algorithm is assured to converge to an optimal solution. Consequently, experiments using real-world image data revealed that Segmental HOG achieved significant improvements in detection performance compared to Rectangular HOG. Conclusion: The proposed descriptor for glomeruli detection presents promising results, and it is expected to be useful in pathological evaluation.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 79
    Publication Date: 2015-09-30
    Description: Background: Numerous methods are available to profile several epigenetic marks, providing data with different genome coverage and resolution. Large epigenomic datasets are then generated, and often combined with other high-throughput data, including RNA-seq, ChIP-seq for transcription factors (TFs) binding and DNase-seq experiments. Despite the numerous computational tools covering specific steps in the analysis of large-scale epigenomics data, comprehensive software solutions for their integrative analysis are still missing. Multiple tools must be identified and combined to jointly analyze histone marks, TFs binding and other -omics data together with DNA methylation data, complicating the analysis of these data and their integration with publicly available datasets. Results: To overcome the burden of integrating various data types with multiple tools, we developed two companion R/Bioconductor packages. The former, methylPipe, is tailored to the analysis of high- or low-resolution DNA methylomes in several species, accommodating (hydroxy-)methyl-cytosines in both CpG and non-CpG sequence context. The analysis of multiple whole-genome bisulfite sequencing experiments is supported, while maintaining the ability of integrating targeted genomic data. The latter, compEpiTools, seamlessly incorporates the results obtained with methylPipe and supports their integration with other epigenomics data. It provides a number of methods to score these data in regions of interest, leading to the identification of enhancers, lncRNAs, and RNAPII stalling/elongation dynamics. Moreover, it allows a fast and comprehensive annotation of the resulting genomic regions, and the association of the corresponding genes with non-redundant GeneOntology terms. Finally, the package includes a flexible method based on heatmaps for the integration of various data types, combining annotation tracks with continuous or categorical data tracks. 
Conclusions: methylPipe and compEpiTools provide a comprehensive Bioconductor-compliant solution for the integrative analysis of heterogeneous epigenomics data. These packages are instrumental in providing biologists with minimal R skills a complete toolkit facilitating the analysis of their own data, or in accelerating the analyses performed by more experienced bioinformaticians.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 80
    Publication Date: 2015-09-30
    Description: Background: The characterization of proteins in families and subfamilies, at different levels, entails the definition and use of class labels. When the assignment of a protein to a family is uncertain, or even wrong, this becomes an instance of what has come to be known as a label noise problem. Label noise has a potentially negative effect on any quantitative analysis of proteins that depends on label information. This study investigates class C of G protein-coupled receptors, which are cell membrane proteins of relevance both to biology in general and pharmacology in particular. Their supervised classification into different known subtypes, based on primary sequence data, is hampered by label noise. The latter may stem from a combination of expert knowledge limitations and the lack of a clear correspondence between labels that mostly reflect GPCR functionality and the different representations of the protein primary sequences. Results: In this study, we describe a systematic approach, using Support Vector Machine classifiers, to the analysis of G protein-coupled receptor misclassifications. As a proof of concept, this approach is used to assist the discovery of labeling quality problems in a curated, publicly accessible database of this type of protein. We also investigate the extent to which physico-chemical transformations of the protein sequences reflect G protein-coupled receptor subtype labeling. The candidate mislabeled cases detected with this approach are externally validated with phylogenetic trees and against further trusted sources such as the National Center for Biotechnology Information, Universal Protein Resource, European Bioinformatics Institute and Ensembl Genome Browser information repositories. Conclusions: In quantitative classification problems, class labels are often by default assumed to be correct. 
Label noise, though, is bound to be a pervasive problem in bioinformatics, where labels may be obtained indirectly through complex, many-step similarity modelling processes. In the case of G protein-coupled receptors, methods capable of singling out and characterizing those sequences with consistent misclassification behaviour are required to minimize this problem. A systematic, Support Vector Machine-based method has been proposed in this study for this purpose. The proposed method enables a filtering approach to the label noise problem and might become a support tool for database curators in proteomics.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 81
    Publication Date: 2015-09-30
    Description: Background: Copy number variations are important in the detection and progression of significant tumors and diseases. Recently, whole exome sequencing (WES) has been gaining popularity for copy number variation detection owing to its low cost and better efficiency. In this work, we developed VEGAWES for accurate and robust detection of copy number variations on WES data. VEGAWES is an extension of a variational-based segmentation algorithm, VEGA (Variational Estimator for Genomic Aberrations), which has previously outperformed several algorithms on segmenting array comparative genomic hybridization data. Results: We tested this algorithm on synthetic data and 100 Glioblastoma Multiforme primary tumor samples. The results on the real data were analyzed using segmentation obtained from single-nucleotide polymorphism data as ground truth. We compared our results with two other segmentation algorithms and assessed the performance based on accuracy and time. Conclusions: In terms of both accuracy and time, VEGAWES provided better results on the synthetic data and tumor samples, demonstrating its potential in robust detection of aberrant regions in the genome.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 82
    Publication Date: 2015-09-30
    Description: Background: In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool that performs a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Results: Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. Conclusions: The benchmarking framework, which uses various datasets, highlights the specialization of some methods toward particular network types and data. As a result, it is possible to identify the techniques that have broad overall performance.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 83
    Publication Date: 2015-09-30
    Description: Background: We are creating software for agent-based simulation and visualization of bio-molecular processes in bacterial and eukaryotic cells. As a first example, we have built a 3-dimensional, interactive computer model of an Escherichia coli bacterium and its associated biomolecular processes. Our illustrative model focuses on the gene regulatory processes that control the expression of genes involved in the lactose operon. Prokaryo, our agent-based cell simulator, incorporates cellular structures, such as plasma membranes and cytoplasm, as well as elements of the molecular machinery, including RNA polymerase, messenger RNA, lactose permease, and ribosomes. Results: The dynamics of cellular ‘agents’ are defined by their rules of interaction, implemented as finite state machines. The agents are embedded within a 3-dimensional virtual environment with simulated physical and electrochemical properties. The hybrid model is driven by a combination of (1) mathematical equations (DEQs) to capture higher-scale phenomena and (2) agent-based rules to implement localized interactions among a small number of molecular elements. Consequently, our model is able to capture phenomena across multiple spatial scales, from changing concentration gradients to one-on-one molecular interactions. We use the classic gene regulatory mechanism of the lactose operon to demonstrate our model’s resolution, visual presentation, and real-time interactivity. Our agent-based model expands on a sophisticated mathematical E. coli metabolism model, through which we highlight our model’s scientific validity. Conclusion: We believe that through illustration and interactive exploratory learning a model system like Prokaryo can enhance the general understanding and perception of biomolecular processes. Our agent-DEQ hybrid modeling approach can also be of value to conceptualize, illustrate, and, eventually, validate cell experiments in the wet lab.
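The idea of defining an agent's behaviour as a finite state machine can be sketched as follows. This is not Prokaryo's code: the states, the event names and the `RepressorAgent` class are hypothetical, chosen to mirror the textbook behaviour of the lac repressor (it blocks transcription while bound to the operator and is released when an inducer binds it).

```python
from enum import Enum, auto


class RepressorState(Enum):
    FREE = auto()
    BOUND_TO_OPERATOR = auto()
    BOUND_TO_INDUCER = auto()


# Transition table: (current state, event) -> next state.
# Event names are illustrative assumptions, not taken from the paper.
TRANSITIONS = {
    (RepressorState.FREE, "meets_operator"): RepressorState.BOUND_TO_OPERATOR,
    (RepressorState.FREE, "meets_inducer"): RepressorState.BOUND_TO_INDUCER,
    (RepressorState.BOUND_TO_OPERATOR, "meets_inducer"): RepressorState.BOUND_TO_INDUCER,
    (RepressorState.BOUND_TO_INDUCER, "inducer_released"): RepressorState.FREE,
}


class RepressorAgent:
    """A single repressor molecule whose behaviour is a finite state machine."""

    def __init__(self):
        self.state = RepressorState.FREE

    def handle(self, event: str) -> RepressorState:
        # Events with no entry in the table leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state

    @property
    def blocks_transcription(self) -> bool:
        return self.state is RepressorState.BOUND_TO_OPERATOR


agent = RepressorAgent()
agent.handle("meets_operator")
print(agent.blocks_transcription)  # True: operon repressed
agent.handle("meets_inducer")
print(agent.blocks_transcription)  # False: inducer releases the repressor
```

Keeping the rules in a lookup table separates an agent's local logic from the simulation loop, so the environment only has to decide which events each agent encounters at each step.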
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 84
    Public Library of Science (PLoS)
    Publication Date: 2015-11-22
    Description: by Alberto Romagnoni, Jérôme Ribot, Daniel Bennequin, Jonathan Touboul The layout of sensory brain areas is thought to subtend perception. The principles shaping these architectures and their role in information processing are still poorly understood. We investigate mathematically and computationally the representation of orientation and spatial frequency in cat primary visual cortex. We prove that two natural principles, local exhaustivity and parsimony of representation, would constrain the orientation and spatial frequency maps to display a very specific pinwheel-dipole singularity. This is particularly interesting since recent experimental evidence shows a dipolar structure of the spatial frequency map co-localized with pinwheels in cat. These structures have important properties on information processing capabilities. In particular, we show using a computational model of visual information processing that this architecture allows a trade-off in the local detection of orientation and spatial frequency, but this property occurs for spatial frequency selectivity sharper than reported in the literature. We validated this sharpening on high-resolution optical imaging experimental data. These results shed new light on the principles at play in the emergence of functional architecture of cortical maps, as well as their potential role in processing information.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 85
    Publication Date: 2015-11-22
    Description: by Kimberly Glass, Michelle Girvan The Gene Ontology (GO) provides biologists with a controlled terminology that describes how genes are associated with functions and how functional terms are related to one another. These term-term relationships encode how scientists conceive the organization of biological functions, and they take the form of a directed acyclic graph (DAG). Here, we propose that the network structure of gene-term annotations made using GO can be employed to establish an alternative approach for grouping functional terms that captures intrinsic functional relationships that are not evident in the hierarchical structure established in the GO DAG. Instead of relying on an externally defined organization for biological functions, our approach connects biological functions together if they are performed by the same genes, as indicated in a compendium of gene annotation data from numerous different sources. We show that grouping terms by this alternate scheme provides a new framework with which to describe and predict the functions of experimentally identified sets of genes.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
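The grouping idea described in the abstract above, connecting functional terms when they are performed by the same genes, can be pictured as projecting a gene-term annotation table onto term-term links. The following is a minimal sketch under stated assumptions and not the authors' method: the toy annotations, the function name `shared_gene_links`, and the `min_shared` threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def shared_gene_links(annotations, min_shared=1):
    """Count genes shared by each pair of terms.

    annotations: dict mapping gene -> set of term identifiers (toy data).
    Returns a dict mapping a sorted (term_a, term_b) tuple to the number
    of genes annotated to both terms, keeping pairs with >= min_shared.
    """
    counts = defaultdict(int)
    for terms in annotations.values():
        # Every pair of terms on the same gene gains one shared gene.
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_shared}

# Hypothetical annotations; the term names are placeholders, not real GO IDs.
toy = {
    "geneA": {"term1", "term2"},
    "geneB": {"term1", "term2", "term3"},
    "geneC": {"term3"},
}
links = shared_gene_links(toy, min_shared=2)
```

Raising `min_shared` keeps only term pairs supported by several genes, which is one simple way to thin the resulting term network before grouping terms.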
  • 86
    Publication Date: 2015-11-22
    Description: by Robert P. Jenkins, Anja Hanisch, Cristian Soza-Ried, Erik Sahai, Julian Lewis The somite segmentation clock is a robust oscillator used to generate regularly-sized segments during early vertebrate embryogenesis. It has been proposed that the clocks of neighbouring cells are synchronised via inter-cellular Notch signalling, in order to overcome the effects of noisy gene expression. When Notch-dependent communication between cells fails, the clocks of individual cells operate erratically and lose synchrony over a period of about 5 to 8 segmentation clock cycles (2–3 hours in the zebrafish). Here, we quantitatively investigate the effects of stochasticity on cell synchrony, using mathematical modelling, to investigate the likely source of such noise. We find that variations in the transcription, translation and degradation rate of key Notch signalling regulators do not explain the in vivo kinetics of desynchronisation. Rather, the analysis predicts that clock desynchronisation, in the absence of Notch signalling, is due to the stochastic dissociation of Her1/7 repressor proteins from the oscillating her1/7 autorepressed target genes. Using in situ hybridisation to visualise sites of active her1 transcription, we measure an average delay of approximately three minutes between the times of activation of the two her1 alleles in a cell. Our model shows that such a delay is sufficient to explain the in vivo rate of clock desynchronisation in Notch pathway mutant embryos and also that Notch-mediated synchronisation is sufficient to overcome this stochastic variation. This suggests that the stochastic nature of repressor/DNA dissociation is the major source of noise in the segmentation clock.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 87
    Publication Date: 2015-11-24
    Description: Background: The alignment of multiple protein sequences is one of the most commonly performed tasks in bioinformatics. In spite of considerable research and efforts that have been recently deployed for improving the performance of multiple sequence alignment (MSA) algorithms, finding a highly accurate alignment between multiple protein sequences is still a challenging problem. Results: We propose a novel and efficient algorithm, called MSAIndelFR, for multiple sequence alignment using the information on the predicted locations of IndelFRs and the computed average log-loss values obtained from IndelFR predictors, each of which is designed for a different protein fold. We demonstrate that the introduction of a new variable gap penalty function based on the predicted locations of the IndelFRs and the computed average log-loss values into the proposed algorithm substantially improves the protein alignment accuracy. This is illustrated by evaluating the performance of the algorithm in aligning sequences belonging to the protein folds for which the IndelFR predictors already exist and by using the reference alignments of the four popular benchmarks, BAliBASE 3.0, OXBENCH, PREFAB 4.0, and SABRE (SABmark 1.65). Conclusions: We have proposed a novel and efficient algorithm, the MSAIndelFR algorithm, for multiple protein sequence alignment incorporating a new variable gap penalty function. It is shown that the performance of the proposed algorithm is superior to that of the most widely used alignment algorithms, Clustal W2, Clustal Omega, Kalign2, MSAProbs, MAFFT, MUSCLE, ProbCons and Probalign, in terms of both the sum-of-pairs and total column metrics.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 88
    Publication Date: 2015-11-25
    Description: by Maciej Jan Ejsmond, Jacek Radwan Major Histocompatibility Complex (MHC) genes code for proteins involved in the incitation of the adaptive immune response in vertebrates, which is achieved through binding oligopeptides (antigens) of pathogenic origin. Across vertebrate species, substitutions of amino acids at sites responsible for the specificity of antigen binding (ABS) are positively selected. This is attributed to pathogen-driven balancing selection, which is also thought to maintain the high polymorphism of MHC genes, and to cause the sharing of allelic lineages between species. However, the nature of this selection remains controversial. We used individual-based computer simulations to investigate the roles of two phenomena capable of maintaining MHC polymorphism: heterozygote advantage and host-pathogen arms race (Red Queen process). Our simulations revealed that levels of MHC polymorphism were high and driven mostly by the Red Queen process at a high pathogen mutation rate, but were low and driven mostly by heterozygote advantage when the pathogen mutation rate was low. We found that novel mutations at ABSs are strongly favored by the Red Queen process, but not by heterozygote advantage, regardless of the pathogen mutation rate. However, while the strong advantage of novel alleles increased the allele turnover rate, under a high pathogen mutation rate, allelic lineages persisted for a comparable length of time under Red Queen and under heterozygote advantage. Thus, when pathogens evolve quickly, the Red Queen is capable of explaining both positive selection and long coalescence times, but the tension between the novel allele advantage and persistence of alleles deserves further investigation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 89
    Publication Date: 2015-11-19
    Description: by Manuel Schottdorf, Wolfgang Keil, David Coppola, Leonard E. White, Fred Wolf The architecture of iso-orientation domains in the primary visual cortex (V1) of placental carnivores and primates apparently follows species-invariant quantitative laws. Dynamical optimization models assuming that neurons coordinate their stimulus preferences throughout cortical circuits linking millions of cells specifically predict these invariants. This might indicate that V1’s intrinsic connectome and its functional architecture adhere to a single optimization principle with high precision and robustness. To validate this hypothesis, it is critical to closely examine the quantitative predictions of alternative candidate theories. Random feedforward wiring within the retino-cortical pathway represents a conceptually appealing alternative to dynamical circuit optimization because random dimension-expanding projections are believed to generically exhibit computationally favorable properties for stimulus representations. Here, we ask whether the quantitative invariants of V1 architecture can be explained as a generic emergent property of random wiring. We generalize and examine the stochastic wiring model proposed by Ringach and coworkers, in which iso-orientation domains in the visual cortex arise through random feedforward connections between semi-regular mosaics of retinal ganglion cells (RGCs) and visual cortical neurons. We derive closed-form expressions for cortical receptive fields and domain layouts predicted by the model for perfectly hexagonal RGC mosaics. Including spatial disorder in the RGC positions considerably changes the domain layout properties as a function of disorder parameters such as position scatter and its correlations across the retina. However, independent of parameter choice, we find that the model predictions substantially deviate from the layout laws of iso-orientation domains observed experimentally. Considering random wiring with the currently most realistic model of RGC mosaic layouts, a pairwise interacting point process, the predicted layouts remain distinct from experimental observations and resemble Gaussian random fields. We conclude that V1 layout invariants are specific quantitative signatures of visual cortical optimization, which cannot be explained by generic random feedforward-wiring models.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 90
    Publication Date: 2015-11-20
    Description: Background: Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. Description: We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Conclusions: This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
    Electronic ISSN: 1471-2105
    Topics: Biology , Computer Science
    Published by BioMed Central
  • 91
    Public Library of Science (PLoS)
    Publication Date: 2015-11-21
    Description: by Jordi Fonollosa, Emre Neftci, Mikhail Rabinovich We often learn and recall long sequences in smaller segments, such as a phone number 858 534 22 30 memorized as four segments. Behavioral experiments suggest that humans and some animals employ this strategy of breaking down cognitive or behavioral sequences into chunks in a wide variety of tasks, but the dynamical principles of how this is achieved remain unknown. Here, we study the temporal dynamics of chunking for learning cognitive sequences in a chunking representation using a dynamical model of competing modes arranged to evoke hierarchical Winnerless Competition (WLC) dynamics. Sequential memory is represented as trajectories along a chain of metastable fixed points at each level of the hierarchy, and bistable Hebbian dynamics enables the learning of such trajectories in an unsupervised fashion. Using computer simulations, we demonstrate the learning of a chunking representation of sequences and their robust recall. During learning, the dynamics associates a set of modes to each information-carrying item in the sequence and encodes their relative order. During recall, hierarchical WLC guarantees the robustness of the sequence order when the sequence is not too long. The resulting patterns of activities share several features observed in behavioral experiments, such as the pauses between boundaries of chunks, their size and their duration. Failures in learning chunking sequences provide new insights into the dynamical causes of neurological disorders such as Parkinson’s disease and schizophrenia.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 92
    Publication Date: 2015-11-21
    Description: by Florian Raudies, Michael E. Hasselmo Firing fields of grid cells in medial entorhinal cortex show compression or expansion after manipulations of the location of environmental barriers. This compression or expansion could be selective for individual grid cell modules with particular properties of spatial scaling. We present a model for differences in the response of modules to barrier location that arise from different mechanisms for the influence of visual features on the computation of location that drives grid cell firing patterns. These differences could arise from differences in the position of visual features within the visual field. When location was computed from the movement of visual features on the ground plane (optic flow) in the ventral visual field, this resulted in grid cell spatial firing that was not sensitive to barrier location in modules modeled with small spacing between grid cell firing fields. In contrast, when location was computed from static visual features on walls of barriers, i.e. in the more dorsal visual field, this resulted in grid cell spatial firing that compressed or expanded based on the barrier locations in modules modeled with large spacing between grid cell firing fields. This indicates that different grid cell modules might have differential properties for computing location based on visual cues, or the spatial radius of sensitivity to visual cues might differ between modules.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 93
    Publication Date: 2015-11-21
    Description: by Emile R. Chimusa, Mamana Mbiyavanga, Velaphi Masilela, Judit Kumuthini A shortage of practical skills and relevant expertise is possibly the primary obstacle to social upliftment and sustainable development in Africa. The “omics” fields, especially genomics, are increasingly dependent on the effective interpretation of large and complex sets of data. Despite abundant natural resources and population sizes comparable with many first-world countries from which talent could be drawn, countries in Africa still lag far behind the rest of the world in terms of specialized skills development. Moreover, there are serious concerns about disparities between countries within the continent. The multidisciplinary nature of the bioinformatics field, coupled with rare and depleting expertise, is a critical problem for the advancement of bioinformatics in Africa. We propose a formalized matchmaking system, which is aimed at reversing this trend, by introducing the Knowledge Transfer Programme (KTP). Instead of individual researchers travelling to other labs to learn, researchers with desirable skills are invited to join African research groups for six weeks to six months. Visiting researchers or trainers will pass on their expertise to multiple people simultaneously in their local environments, thus increasing the efficiency of knowledge transference. In return, visiting researchers have the opportunity to develop professional contacts, gain industry work experience, work with novel datasets, and strengthen and support their ongoing research. The KTP develops a network with a centralized hub through which groups and individuals are put into contact with one another and exchanges are facilitated by connecting both parties with potential funding sources. This is part of the PLOS Computational Biology Education collection.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 94
    Publication Date: 2015-11-21
    Description: by Olivier J. N. Bertrand, Jens P. Lindemann, Martin Egelhaaf Avoiding collisions is one of the most basic needs of any mobile agent, both biological and technical, when searching around or aiming toward a goal. We propose a model of collision avoidance inspired by behavioral experiments on insects and by properties of optic flow on a spherical eye experienced during translation, and test the interaction of this model with goal-driven behavior. Insects, such as flies and bees, actively separate the rotational and translational optic flow components via behavior, i.e. by employing a saccadic strategy of flight and gaze control. Optic flow experienced during translation, i.e. during intersaccadic phases, contains information on the depth-structure of the environment, but this information is entangled with that on self-motion. Here, we propose a simple model to extract the depth structure from translational optic flow by using local properties of a spherical eye. On this basis, a motion direction of the agent is computed that ensures collision avoidance. Flying insects are thought to measure optic flow by correlation-type elementary motion detectors. Their responses depend, in addition to velocity, on the texture and contrast of objects and, thus, do not measure the velocity of objects veridically. Therefore, we initially used geometrically determined optic flow as input to a collision avoidance algorithm to show that depth information inferred from optic flow is sufficient to account for collision avoidance under closed-loop conditions. Then, the collision avoidance algorithm was tested with bio-inspired correlation-type elementary motion detectors in its input. Even then, the algorithm led successfully to collision avoidance and, in addition, replicated the characteristics of collision avoidance behavior of insects. Finally, the collision avoidance algorithm was combined with a goal direction and tested in cluttered environments. The simulated agent then showed goal-directed behavior reminiscent of components of the navigation behavior of insects.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 95
    Public Library of Science (PLoS)
    Publication Date: 2015-11-21
    Description: by Satohiro Tajima, Toru Yanagawa, Naotaka Fujii, Taro Toyoizumi Brain-wide interactions generating complex neural dynamics are considered crucial for emergent cognitive functions. However, the irreducible nature of nonlinear and high-dimensional dynamical interactions challenges conventional reductionist approaches. We introduce a model-free method, based on embedding theorems in nonlinear state-space reconstruction, that permits a simultaneous characterization of complexity in local dynamics, directed interactions between brain areas, and how the complexity is produced by the interactions. We demonstrate this method in large-scale electrophysiological recordings from awake and anesthetized monkeys. The cross-embedding method captures structured interaction underlying cortex-wide dynamics that may be missed by conventional correlation-based analysis, demonstrating a critical role of time-series analysis in characterizing brain state. The method reveals a consciousness-related hierarchy of cortical areas, where dynamical complexity increases along with cross-area information flow. These findings demonstrate the advantages of the cross-embedding method in deciphering large-scale and heterogeneous neuronal systems, suggesting a crucial contribution by sensory-frontoparietal interactions to the emergence of complex brain dynamics during consciousness.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 96
    Publication Date: 2015-11-21
    Description: by Joanne L. Dunster, Francoise Mazet, Michael J. Fry, Jonathan M. Gibbins, Marcus J. Tindall We present a data-driven mathematical model of a key initiating step in platelet activation, a central process in the prevention of bleeding following injury. In vascular disease, this process is activated inappropriately and causes thrombosis, heart attacks and stroke. The collagen receptor GPVI is the primary trigger for platelet activation at sites of injury. Understanding the complex molecular mechanisms initiated by this receptor is important for development of more effective antithrombotic medicines. In this work we developed a series of nonlinear ordinary differential equation models that are direct representations of biological hypotheses surrounding the initial steps in GPVI-stimulated signal transduction. At each stage model simulations were compared to our own quantitative, high-temporal experimental data that guides further experimental design, data collection and model refinement. Much is known about the linear forward reactions within platelet signalling pathways, but the roles of putative reverse reactions are poorly understood. An initial model that includes a simple constitutively active phosphatase was unable to explain experimental data. Model revisions, incorporating a complex pathway of interactions (and specifically the phosphatase TULA-2), provided a good description of the experimental data both based on observations of phosphorylation in samples from one donor and in those of a wider population. Our model was used to investigate the levels of proteins involved in regulating the pathway and the effect of low GPVI levels that have been associated with disease. Results indicate a clear separation between healthy and GPVI-deficient states with respect to the signalling cascade dynamics associated with Syk tyrosine phosphorylation and activation. Our approach reveals the central importance of this negative feedback pathway that results in the temporal regulation of a specific class of protein tyrosine phosphatases in controlling the rate, and therefore extent, of GPVI-stimulated platelet activation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 97
    Publication Date: 2015-11-21
    Description: by Zhongyang Zhang, Ke Hao Cancer genomes exhibit profound somatic copy number alterations (SCNAs). Studying tumor SCNAs using massively parallel sequencing provides unprecedented resolution and meanwhile gives rise to new challenges in data analysis, complicated by tumor aneuploidy and heterogeneity as well as normal cell contamination. While the majority of read depth based methods utilize total sequencing depth alone for SCNA inference, the allele specific signals are undervalued. We proposed a joint segmentation and inference approach using both signals to meet some of the challenges. Our method consists of four major steps: 1) extracting read depth supporting reference and alternative alleles at each SNP/Indel locus and comparing the total read depth and alternative allele proportion between tumor and matched normal sample; 2) performing joint segmentation on the two signal dimensions; 3) correcting the copy number baseline from which the SCNA state is determined; 4) calling SCNA state for each segment based on both signal dimensions. The method is applicable to whole exome/genome sequencing (WES/WGS) as well as SNP array data in a tumor-control study. We applied the method to a dataset containing no SCNAs to test the specificity, created by pairing sequencing replicates of a single HapMap sample as normal/tumor pairs, as well as a large-scale WGS dataset consisting of 88 liver tumors along with adjacent normal tissues. Compared with representative methods, our method demonstrated improved accuracy, scalability to large cancer studies, capability in handling both sequencing and SNP array data, and the potential to improve the estimation of tumor ploidy and purity.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
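Step 1 of the method summarized above compares total read depth and alternative allele proportion between tumor and matched normal at each locus. A minimal sketch of that comparison follows. The abstract does not give the exact formulas, so the log2 depth ratio, the plain difference of allele proportions, and the function name `locus_signals` are assumptions made for illustration only; a real pipeline would also need to normalize for overall sequencing depth.

```python
import math

def locus_signals(tumor_ref, tumor_alt, normal_ref, normal_alt):
    """Return (log2 depth ratio, alt-allele proportion shift) for one locus.

    Inputs are read counts supporting the reference and alternative
    alleles in the tumor and matched normal samples.
    """
    tumor_depth = tumor_ref + tumor_alt
    normal_depth = normal_ref + normal_alt
    if tumor_depth == 0 or normal_depth == 0:
        raise ValueError("locus has no coverage in one sample")
    # Assumed signal 1: total depth in tumor relative to normal.
    log_ratio = math.log2(tumor_depth / normal_depth)
    # Assumed signal 2: change in the alternative allele proportion.
    baf_shift = tumor_alt / tumor_depth - normal_alt / normal_depth
    return log_ratio, baf_shift

# Toy counts: tumor depth is doubled and the alternative allele is depleted.
signals = locus_signals(tumor_ref=30, tumor_alt=10, normal_ref=10, normal_alt=10)
```

Keeping the two signals as a pair per locus mirrors the abstract's description of segmenting jointly on two signal dimensions in the later steps.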
  • 98
    Public Library of Science (PLoS)
    Publication Date: 2015-11-22
    Description: by Peter C. St. John, Francis J. Doyle Stochastic noise at the cellular level has been shown to play a fundamental role in circadian oscillations, influencing how groups of cells entrain to external cues and likely serving as the mechanism by which cell-autonomous rhythms are generated. Despite this importance, few studies have investigated how clock perturbations affect stochastic noise, even as increasing numbers of high-throughput screens categorize how gene knockdowns or small molecules can change clock period and amplitude. This absence is likely due to the difficulty associated with measuring cell-autonomous stochastic noise directly, which currently requires the careful collection and processing of single-cell data. In this study, we show that the damping rate of population-level bioluminescence recordings can serve as an accurate measure of overall stochastic noise, and one that can be applied to future and existing high-throughput circadian screens. Using cell-autonomous fibroblast data, we first show directly that higher noise at the single-cell level results in faster damping at the population level. Next, we show that the damping rate of cultured cells can be changed in a dose-dependent fashion by small molecule modulators, and confirm that such a change can be explained by single-cell noise using a mathematical model. We further demonstrate the insights that can be gained by applying our method to a genome-wide siRNA screen, revealing that stochastic noise is altered independently from period, amplitude, and phase. Finally, we hypothesize that the unperturbed clock is highly optimized for robust rhythms, as very few gene perturbations are capable of simultaneously increasing amplitude and lowering stochastic noise. Ultimately, this study demonstrates the importance of considering the effect of circadian perturbations on stochastic noise, particularly with regard to the development of small-molecule circadian therapeutics.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
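The link stated above between single-cell noise and population-level damping can be illustrated with a generic phase-diffusion model. This is a sketch under stated assumptions, not the authors' mathematical model: each cell is reduced to a phase that performs a random walk with a hypothetical diffusion constant, and the function name `population_coherence` and all parameter values are invented for the example. A frequency shared by all cells would not change the result, so it is omitted.

```python
import cmath
import math
import random

def population_coherence(diffusion, n_cells=2000, t_end=10.0, dt=0.1, seed=0):
    """Simulate phase-diffusing oscillators and return the coherence R(t_end).

    Each cell's phase does a random walk with variance 2 * diffusion * t.
    R = |mean(exp(i * phase))| is the envelope of the population-average
    rhythm, expected to be close to exp(-diffusion * t_end) for large n_cells.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    step_sd = math.sqrt(2.0 * diffusion * dt)
    phases = [0.0] * n_cells  # all cells start synchronized
    for _ in range(n_steps):
        phases = [p + rng.gauss(0.0, step_sd) for p in phases]
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / n_cells
    return abs(mean_vector)

low_noise = population_coherence(diffusion=0.1)
high_noise = population_coherence(diffusion=0.3)
```

In this toy model the envelope of the averaged signal decays roughly as exp(-D t), so a larger diffusion constant D (more single-cell noise) shows up directly as a faster population-level damping rate.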
  • 99
    Publication Date: 2015-11-22
    Description: by Chen Zhao, Aleksander S. Popel HRMs (hypoxia-responsive miRNAs) are a specific group of microRNAs that are regulated by hypoxia. Recent studies revealed that several HRMs including let-7 family miRNAs were highly induced in response to HIF (hypoxia-inducible factor) stabilization in hypoxia, and they potently participated in angiogenesis by targeting AGO1 (argonaute 1) and upregulating VEGF (vascular endothelial growth factor). Here we constructed a novel computational model of microRNA control of HIF-VEGF pathway in endothelial cells to quantitatively investigate the role of HRMs in modulating the cellular adaptation to hypoxia. The model parameters were optimized and the simulations based on these parameters were validated against several published in vitro experimental data. To advance the mechanistic understanding of oxygen sensing in hypoxia, we demonstrated that the rate of HIF-1α nuclear import substantially influences its stabilization and the formation of HIF-1 transcription factor complex. We described the biological feedback loops involving let-7 and AGO1 in which the impact of external perturbations was minimized; as a pair of master regulators when low oxygen tension was sensed, they coordinated the critical process of VEGF desuppression in a controlled manner. Prompted by the model-motivated discoveries, we proposed and assessed novel pathway-specific therapeutics that modulate angiogenesis by adjusting VEGF synthesis in tumor and ischemic cardiovascular disease. Through simulations that capture the complex interactions between miRNAs and miRNA-processing molecules, this model explores an innovative perspective about the distinctive yet integrated roles of different miRNAs in angiogenesis, and it will help future research to elucidate the dysregulated miRNA profiles found in cancer and various cardiovascular diseases.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 100
    Publication Date: 2015-11-22
    Description: by Jia Li, Marie-Anne Poursat, Damien Drubay, Arnaud Motz, Zohra Saci, Antonin Morillon, Stefan Michiels, Daniel Gautheret We address here the issue of prioritizing non-coding mutations in the tumoral genome. To this aim, we created two independent computational models. The first (germline) model estimates purifying selection based on population SNP data. The second (somatic) model estimates tumor mutation density based on whole genome tumor sequencing. We show that each model reflects a different set of constraints acting either on the normal or tumor genome, and we identify the specific genome features that most contribute to these constraints. Importantly, we show that the somatic mutation model carries independent functional information that can be used to narrow down the non-coding regions that may be relevant to cancer progression. On this basis, we identify positions in non-coding RNAs and the non-coding parts of mRNAs that are both under purifying selection in the germline and protected from mutation in tumors, thus introducing a new strategy for future detection of cancer driver elements in the expressed non-coding genome.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science