ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
Filter
  • Articles  (17,027)
  • 2020-2022  (3,032)
  • 2015-2019  (13,995)
  • 1975-1979
  • 1945-1949
  • PLoS Computational Biology  (2,535)
  • Bioinformatics  (1,636)
  • Algorithms  (852)
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics  (504)
  • 110151
  • 2184
  • 40961
  • 56466
  • Computer Science  (17,027)
Collection
  • Articles  (17,027)
Years
Year
Topic
  • 1
    Publication Date: 2020-08-29
    Description: Healthcare facilities are constantly deteriorating due to tight budgets allocated to the upkeep of building assets. This entails the need for improved deterioration modeling of such buildings in order to enforce a predictive maintenance approach that decreases the unexpected occurrence of failures and the corresponding downtime elapsed to repair or replace the faulty asset components. Currently, hospitals utilize subjective deterioration prediction methodologies that mostly rely on age as the sole indicator of degradation to forecast the useful lives of the building components. Thus, this paper aims at formulating a more efficient stochastic deterioration prediction model that integrates the latest observed condition into the forecasting procedure to overcome the subjectivity and uncertainties associated with the currently employed methods. This is achieved by means of developing a hybrid genetic algorithm-based fuzzy Markovian model that simulates the deterioration process given the scarcity of available data demonstrating the condition assessment and evaluation for such critical facilities. A nonhomogeneous transition probability matrix (TPM) based on fuzzy membership functions representing the condition, age and relative deterioration rate of the hospital systems is utilized to address the inherited uncertainties. The TPM is further calibrated by means of a genetic algorithm to circumvent the drawbacks of the expert-based models. A sensitivity analysis was carried out to analyze the possible changes in the output resulting from predefined modifications to the input parameters in order to ensure the robustness of the model. The performance of the deterioration prediction model developed is then validated through a comparison with a state-of-art stochastic model in contrast to real hospital datasets, and the results obtained from the developed model significantly outperformed the long-established Weibull distribution-based deterioration prediction methodology with mean absolute errors of 1.405 and 9.852, respectively. Therefore, the developed model is expected to assist decision-makers in creating more efficient maintenance programs as well as more data-driven capital renewal plans.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Publication Date: 2020-08-29
    Description: The harmonic closeness centrality measure associates, to each node of a graph, the average of the inverse of its distances from all the other nodes (by assuming that unreachable nodes are at infinite distance). This notion has been adapted to temporal graphs (that is, graphs in which edges can appear and disappear during time) and in this paper we address the question of finding the top-k nodes for this metric. Computing the temporal closeness for one node can be done in O(m) time, where m is the number of temporal edges. Therefore computing exactly the closeness for all nodes, in order to find the ones with top closeness, would require O(nm) time, where n is the number of nodes. This time complexity is intractable for large temporal graphs. Instead, we show how this measure can be efficiently approximated by using a “backward” temporal breadth-first search algorithm and a classical sampling technique. Our experimental results show that the approximation is excellent for nodes with high closeness, allowing us to detect them in practice in a fraction of the time needed for computing the exact closeness of all nodes. We validate our approach with an extensive set of experiments.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Publication Date: 2020-07-16
    Description: High order convective Cahn-Hilliard type equations describe the faceting of a growing surface, or the dynamics of phase transitions in ternary oil-water-surfactant systems. In this paper, we prove the well-posedness of the classical solutions for the Cauchy problem, associated with this equation.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Publication Date: 2020-07-08
    Description: We consider a rather general problem of nonparametric estimation of an uncountable set of probability density functions (p.d.f.’s) of the form: f ( x ; r ) , where r is a non-random real variable and ranges from R 1 to R 2 . We put emphasis on the algorithmic aspects of this problem, since they are crucial for exploratory analysis of big data that are needed for the estimation. A specialized learning algorithm, based on the 2D FFT, is proposed and tested on observations that allow for estimate p.d.f.’s of a jet engine temperatures as a function of its rotation speed. We also derive theoretical results concerning the convergence of the estimation procedure that contains hints on selecting parameters of the estimation algorithm.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Publication Date: 2020-07-09
    Description: We report the design of a Spiking Neural Network (SNN) edge detector with biologically inspired neurons that has a conceptual similarity with both Hodgkin-Huxley (HH) model neurons and Leaky Integrate-and-Fire (LIF) neurons. The computation of the membrane potential, which is used to determine the occurrence or absence of spike events, at each time step, is carried out by using the analytical solution to a simplified version of the HH neuron model. We find that the SNN based edge detector detects more edge pixels in images than those obtained by a Sobel edge detector. We designed a pipeline for image classification with a low-exposure frame simulation layer, SNN edge detection layers as pre-processing layers and a Convolutional Neural Network (CNN) as a classification module. We tested this pipeline for the task of classification with the Digits dataset, which is available in MATLAB. We find that the SNN based edge detection layer increases the image classification accuracy at lower exposure times, that is, for 1 〈 t 〈 T /4, where t is the number of milliseconds in a simulated exposure frame and T is the total exposure time, with reference to a Sobel edge or Canny edge detection layer in the pipeline. These results pave the way for developing novel cognitive neuromorphic computing architectures for millisecond timescale detection and object classification applications using event or spike cameras.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 6
    Publication Date: 2020-07-05
    Description: Microscopic crowd simulation can help to enhance the safety of pedestrians in situations that range from museum visits to music festivals. To obtain a useful prediction, the input parameters must be chosen carefully. In many cases, a lack of knowledge or limited measurement accuracy add uncertainty to the input. In addition, for meaningful parameter studies, we first need to identify the most influential parameters of our parametric computer models. The field of uncertainty quantification offers standardized and fully automatized methods that we believe to be beneficial for pedestrian dynamics. In addition, many methods come at a comparatively low cost, even for computationally expensive problems. This allows for their application to larger scenarios. We aim to identify and adapt fitting methods to microscopic crowd simulation in order to explore their potential in pedestrian dynamics. In this work, we first perform a variance-based sensitivity analysis using Sobol’ indices and then crosscheck the results by a derivative-based measure, the activity scores. We apply both methods to a typical scenario in crowd simulation, a bottleneck. Because constrictions can lead to high crowd densities and delays in evacuations, several experiments and simulation studies have been conducted for this setting. We show qualitative agreement between the results of both methods. Additionally, we identify a one-dimensional subspace in the input parameter space and discuss its impact on the simulation. Moreover, we analyze and interpret the sensitivity indices with respect to the bottleneck scenario.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 7
    Publication Date: 2020-06-30
    Description: Standard (Lomb-Scargle, likelihood, etc.) procedures for power-spectrum analysis provide convenient estimates of the significance of any peak in a power spectrum, based—typically—on the assumption that the measurements being analyzed have a normal (i.e., Gaussian) distribution. However, the measurement sequence provided by a real experiment or a real observational program may not meet this requirement. The RONO (rank-order normalization) procedure generates a proxy distribution that retains the rank-order of the original measurements but has a strictly normal distribution. The proxy distribution may then be analyzed by standard power-spectrum analysis. We show by an example that the resulting power spectrum may prove to be quite close to the power spectrum obtained from the original data by a standard procedure, even if the distribution of the original measurements is far from normal. Such a comparison would tend to validate the original analysis.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 8
    Publication Date: 2020-06-30
    Description: Toward strong demand for very high-speed I/O for processors, physical performance growth of hardware I/O speed was drastically increased in this decade. However, the recent Big Data applications still demand the larger I/O bandwidth and the lower latency for the speed. Because the current I/O performance does not improve so drastically, it is the time to consider another way to increase it. To overcome this challenge, we focus on lossless data compression technology to decrease the amount of data itself in the data communication path. The recent Big Data applications treat data stream that flows continuously and never allow stalling processing due to the high speed. Therefore, an elegant hardware-based data compression technology is demanded. This paper proposes a novel lossless data compression, called ASE coding. It encodes streaming data by applying the entropy coding approach. ASE coding instantly assigns the fewest bits to the corresponding compressed data according to the number of occupied entries in a look-up table. This paper describes the detailed mechanism of ASE coding. Furthermore, the paper demonstrates performance evaluations to promise that ASE coding adaptively shrinks streaming data and also works on a small amount of hardware resources without stalling or buffering any part of data stream.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 9
    Publication Date: 2020-07-01
    Description: Text annotation is the process of identifying the sense of a textual segment within a given context to a corresponding entity on a concept ontology. As the bag of words paradigm’s limitations become increasingly discernible in modern applications, several information retrieval and artificial intelligence tasks are shifting to semantic representations for addressing the inherent natural language polysemy and homonymy challenges. With extensive application in a broad range of scientific fields, such as digital marketing, bioinformatics, chemical engineering, neuroscience, and social sciences, community detection has attracted great scientific interest. Focusing on linguistics, by aiming to identify groups of densely interconnected subgroups of semantic ontologies, community detection application has proven beneficial in terms of disambiguation improvement and ontology enhancement. In this paper we introduce a novel distributed supervised knowledge-based methodology employing community detection algorithms for text annotation with Wikipedia Entities, establishing the unprecedented concept of community Coherence as a metric for local contextual coherence compatibility. Our experimental evaluation revealed that deeper inference of relatedness and local entity community coherence in the Wikipedia graph bears substantial improvements overall via a focus on accuracy amelioration of less common annotations. The proposed methodology is propitious for wider adoption, attaining robust disambiguation performance.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 10
    Publication Date: 2020-06-30
    Description: Geomechanical modelling of the processes associated to the exploitation of subsurface resources, such as land subsidence or triggered/induced seismicity, is a common practice of major interest. The prediction reliability depends on different sources of uncertainty, such as the parameterization of the constitutive model characterizing the deep rock behaviour. In this study, we focus on a Sobol’-based sensitivity analysis and uncertainty reduction via assimilation of land deformations. A synthetic test case application on a deep hydrocarbon reservoir is considered, where land settlements are predicted with the aid of a 3-D Finite Element (FE) model. Data assimilation is performed via the Ensemble Smoother (ES) technique and its variation in the form of Multiple Data Assimilation (ES-MDA). However, the ES convergence is guaranteed with a large number of Monte Carlo (MC) simulations, that may be computationally infeasible in large scale and complex systems. For this reason, a surrogate model based on the generalized Polynomial Chaos Expansion (gPCE) is proposed as an approximation of the forward problem. This approach allows to efficiently compute the Sobol’ indices for the sensitivity analysis and greatly reduce the computational cost of the original ES and MDA formulations, also enhancing the accuracy of the overall prediction process.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 11
    Publication Date: 2020-06-30
    Description: Clustering is an unsupervised machine learning technique with many practical applications that has gathered extensive research interest. Aside from deterministic or probabilistic techniques, fuzzy C-means clustering (FCM) is also a common clustering technique. Since the advent of the FCM method, many improvements have been made to increase clustering efficiency. These improvements focus on adjusting the membership representation of elements in the clusters, or on fuzzifying and defuzzifying techniques, as well as the distance function between elements. This study proposes a novel fuzzy clustering algorithm using multiple different fuzzification coefficients depending on the characteristics of each data sample. The proposed fuzzy clustering method has similar calculation steps to FCM with some modifications. The formulas are derived to ensure convergence. The main contribution of this approach is the utilization of multiple fuzzification coefficients as opposed to only one coefficient in the original FCM algorithm. The new algorithm is then evaluated with experiments on several common datasets and the results show that the proposed algorithm is more efficient compared to the original FCM as well as other clustering methods.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 12
    Publication Date: 2020-07-03
    Description: Business processes evolve over time to adapt to changing business environments. This requires continuous monitoring of business processes to gain insights into whether they conform to the intended design or deviate from it. The situation when a business process changes while being analysed is denoted as Concept Drift. Its analysis is concerned with studying how a business process changes, in terms of detecting and localising changes and studying the effects of the latter. Concept drift analysis is crucial to enable early detection and management of changes, that is, whether to promote a change to become part of an improved process, or to reject the change and make decisions to mitigate its effects. Despite its importance, there exists no comprehensive framework for analysing concept drift types, affected process perspectives, and granularity levels of a business process. This article proposes the CONcept Drift Analysis in Process Mining (CONDA-PM) framework describing phases and requirements of a concept drift analysis approach. CONDA-PM was derived from a Systematic Literature Review (SLR) of current approaches analysing concept drift. We apply the CONDA-PM framework on current approaches to concept drift analysis and evaluate their maturity. Applying CONDA-PM framework highlights areas where research is needed to complement existing efforts.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 13
    Publication Date: 2020-04-14
    Description: Let P be a set of n points in R d , k ≥ 1 be an integer and ε ∈ ( 0 , 1 ) be a constant. An ε-coreset is a subset C ⊆ P with appropriate non-negative weights (scalars), that approximates any given set Q ⊆ R d of k centers. That is, the sum of squared distances over every point in P to its closest point in Q is the same, up to a factor of 1 ± ε to the weighted sum of C to the same k centers. If the coreset is small, we can solve problems such as k-means clustering or its variants (e.g., discrete k-means, where the centers are restricted to be in P, or other restricted zones) on the small coreset to get faster provable approximations. Moreover, it is known that such coreset support streaming, dynamic and distributed data using the classic merge-reduce trees. The fact that the coreset is a subset implies that it preserves the sparsity of the data. However, existing such coresets are randomized and their size has at least linear dependency on the dimension d. We suggest the first such coreset of size independent of d. This is also the first deterministic coreset construction whose resulting size is not exponential in d. Extensive experimental results and benchmarks are provided on public datasets, including the first coreset of the English Wikipedia using Amazon’s cloud.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 14
    Publication Date: 2015-08-08
    Description: : As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA’s improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants. Availability and implementation : AVIA is freely available through the web at http://avia.abcc.ncifcrf.gov . Contact : Hue.Vuong@fnlcr.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 15
    Publication Date: 2015-08-08
    Description: : As new methods for multivariate analysis of genome wide association studies become available, it is important to be able to combine results from different cohorts in a meta-analysis. The R package MultiMeta provides an implementation of the inverse-variance-based method for meta-analysis, generalized to an n -dimensional setting. Availability and implementation: The R package MultiMeta can be downloaded from CRAN. Contact: dragana.vuckovic@burlo.trieste.it ; vi1@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 16
    Publication Date: 2015-08-12
    Description: We examine a distributed detection problem in a wireless sensor network, where sensor nodes collaborate to detect a Gaussian signal with an unknown change of power, i.e., a scale parameter. Due to power/bandwidth constraints, we consider the case where each sensor quantizes its observation into a binary digit. The binary data are then transmitted through error-prone wireless links to a fusion center, where a generalized likelihood ratio test (GLRT) detector is employed to perform a global decision. We study the design of a binary quantizer based on an asymptotic analysis of the GLRT. Interestingly, the quantization threshold of the quantizer is independent of the unknown scale parameter. Numerical results are included to illustrate the performance of the proposed quantizer and GLRT in binary symmetric channels (BSCs).
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 17
    Publication Date: 2015-08-13
    Description: More and more hybrid electric vehicles are driven since they offer such advantages as energy savings and better active safety performance. Hybrid vehicles have two or more power driving systems and frequently switch working condition, so controlling stability is very important. In this work, a two-stage Kalman algorithm method is used to fuse data in hybrid vehicle stability testing. First, the RT3102 navigation system and Dewetron system are introduced. Second, a modeling of data fusion is proposed based on the Kalman filter. Then, this modeling is simulated and tested on a sample vehicle, using Carsim and Simulink software to test the results. The results showed the merits of this modeling.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 18
    Publication Date: 2015-08-05
    Description: Currently deep learning has made great breakthroughs in visual and speech processing, mainly because it draws lessons from the hierarchical mode that brain deals with images and speech. In the field of NLP, a topic model is one of the important ways for modeling documents. Topic models are built on a generative model that clearly does not match the way humans write. In this paper, we propose Event Model, which is unsupervised and based on the language processing mechanism of neurolinguistics, to model documents. In Event Model, documents are descriptions of concrete or abstract events seen, heard, or sensed by people and words are objects in the events. Event Model has two stages: word learning and dimensionality reduction. Word learning is to learn semantics of words based on deep learning. Dimensionality reduction is the process that representing a document as a low dimensional vector by a linear mode that is completely different from topic models. Event Model achieves state-of-the-art results on document retrieval tasks.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 19
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: This study proposes a quantitative measurement of split of the second heart sound (S2) based on nonstationary signal decomposition to deal with overlaps and energy modeling of the subcomponents of S2. The second heart sound includes aortic (A2) and pulmonic (P2) closure sounds. However, the split detection is obscured due to A2-P2 overlap and low energy of P2. To identify such split, HVD method is used to decompose the S2 into a number of components while preserving the phase information. Further, A2s and P2s are localized using smoothed pseudo Wigner-Ville distribution followed by reassignment method. Finally, the split iscalculated by taking the differences between the means of time indices of A2s and P2s. Experiments on total 33 clips of S2 signals are performed for evaluation of the method. The mean ± standard deviation of the split is 34.7 ± 4.6 ms. The method measures the splitefficiently, even when A2-P2 overlap is ≤ 20 ms and the normalized peak temporal ratio of P2 to A2 is low (≥ 0.22). This proposed method thus, demonstrates its robustness by defining split detectability (SDT), the split detection aptness through detecting P2s, by measuring upto 96 percent. Such findings reveal the effectiveness of the method as competent against the other baselines, especially for A2-P2 overlaps and low energy P2.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 20
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Post-acquisition denoising of magnetic resonance (MR) images is an important step to improve any quantitative measurement of the acquired data. In this paper, assuming a Rician noise model, a new filtering method based on the linear minimum mean square error (LMMSE) estimation is introduced, which employs the self-similarity property of the MR data to restore the noise-less signal. This method takes into account the structural characteristics of images and the Bayesian mean square error (Bmse) of the estimator to address the denoising problem. In general, a twofold data processing approach is developed; first, the noisy MR data is processed using a patch-based L 2 -norm similarity measure to provide the primary set of samples required for the estimation process. Afterwards, the Bmse of the estimator is derived as the optimization function to analyze the pre-selected samples and minimize the error between the estimated and the underlying signal. Compared to the LMMSE method and also its recently proposed SNR-adapted realization (SNLMMSE), the optimized way of choosing the samples together with the automatic adjustment of the filtering parameters lead to a more robust estimation performance with our approach. Experimental results show the competitive performance of the proposed method in comparison with related state-of-the-art methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 21
    Publication Date: 2015-08-07
    Description: Large-scale ad hoc analytics of genomic data is popular using the R-programming language supported by over 700 software packages provided by Bioconductor. More recently, analytical jobs are benefitting from on-demand computing and storage, their scalability and their low maintenance cost, all of which are offered by the cloud. While biologists and bioinformaticists can take an analytical job and execute it on their personal workstations, it remains challenging to seamlessly execute the job on the cloud infrastructure without extensive knowledge of the cloud dashboard. How analytical jobs can not only with minimum effort be executed on the cloud, but also how both the resources and data required by the job can be managed is explored in this paper. An open-source light-weight framework for executing R-scripts using Bioconductor packages, referred to as ‘RBioCloud’, is designed and developed. RBioCloud offers a set of simple command-line tools for managing the cloud resources, the data and the execution of the job. Three biological test cases validate the feasibility of RBioCloud. The framework is available from http://www.rbiocloud.com .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 22
    Publication Date: 2015-08-07
    Description: Of major interest to translational genomics is the intervention in gene regulatory networks (GRNs) to affect cell behavior; in particular, to alter pathological phenotypes. Owing to the complexity of GRNs, accurate network inference is practically challenging and GRN models often contain considerable amounts of uncertainty. Considering the cost and time required for conducting biological experiments, it is desirable to have a systematic method for prioritizing potential experiments so that an experiment can be chosen to optimally reduce network uncertainty. Moreover, from a translational perspective it is crucial that GRN uncertainty be quantified and reduced in a manner that pertains to the operational cost that it induces, such as the cost of network intervention. In this work, we utilize the concept of mean objective cost of uncertainty (MOCU) to propose a novel framework for optimal experimental design. In the proposed framework, potential experiments are prioritized based on the MOCU expected to remain after conducting the experiment. Based on this prioritization, one can select an optimal experiment with the largest potential to reduce the pertinent uncertainty present in the current network model. We demonstrate the effectiveness of the proposed method via extensive simulations based on synthetic and real gene regulatory networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 23
    Publication Date: 2015-08-07
    Description: A novel approach to Contact Map Overlap (CMO) problem is proposed using the two dimensional clusters present in the contact maps. Each protein is represented as a set of the non-trivial clusters of contacts extracted from its contact map. The approach involves finding matching regions between the two contact maps using approximate 2D-pattern matching algorithm and dynamic programming technique. These matched pairs of small contact maps are submitted in parallel to a fast heuristic CMO algorithm. The approach facilitates parallelization at this level since all the pairs of contact maps can be submitted to the algorithm in parallel. Then, a merge algorithm is used in order to obtain the overall alignment. As a proof of concept, MSVNS, a heuristic CMO algorithm is used for global as well as local alignment. The divide and conquer approach is evaluated for two benchmark data sets that of Skolnick and Ding et al. It is interesting to note that along with achieving saving of time, better overlap is also obtained for certain protein folds.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 24
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Canalizing genes possess broad regulatory power over a wide swath of regulatory processes. On the other hand, it has been hypothesized that the phenomenon of intrinsically multivariate prediction (IMP) is associated with canalization. However, applications have relied on user-selectable thresholds on the IMP score to decide on the presence of IMP. A methodology is developed here that avoids arbitrary thresholds, by providing a statistical test for the IMP score. In addition, the proposed procedure allows the incorporation of prior knowledge if available, which can alleviate the problem of loss of power due to small sample sizes. The issue of multiplicity of tests is addressed by family-wise error rate (FWER) and false discovery rate (FDR) controlling approaches. The proposed methodology is demonstrated by experiments using synthetic and real gene-expression data from studies on melanoma and ionizing radiation (IR) responsive genes. The results with the real data identified DUSP1 and p53, two well-known canalizing genes associated with melanoma and IR response, respectively, as the genes with a clear majority of IMP predictor pairs. This validates the potential of the proposed methodology as a tool for discovery of canalizing genes from binary gene-expression data. The procedure is made available through an R package.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 25
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Eliseo Ferrante, Ali Emre Turgut, Edgar Duéñez-Guzmán, Marco Dorigo, Tom Wenseleers Division of labor is ubiquitous in biological systems, as evidenced by various forms of complex task specialization observed in both animal societies and multicellular organisms. Although clearly adaptive, the way in which division of labor first evolved remains enigmatic, as it requires the simultaneous co-occurrence of several complex traits to achieve the required degree of coordination. Recently, evolutionary swarm robotics has emerged as an excellent test bed to study the evolution of coordinated group-level behavior. Here we use this framework for the first time to study the evolutionary origin of behavioral task specialization among groups of identical robots. The scenario we study involves an advanced form of division of labor, common in insect societies and known as “task partitioning”, whereby two sets of tasks have to be carried out in sequence by different individuals. Our results show that task partitioning is favored whenever the environment has features that, when exploited, reduce switching costs and increase the net efficiency of the group, and that an optimal mix of task specialists is achieved most readily when the behavioral repertoires aimed at carrying out the different subtasks are available as pre-adapted building blocks. Nevertheless, we also show for the first time that self-organized task specialization could be evolved entirely from scratch, starting only from basic, low-level behavioral primitives, using a nature-inspired evolutionary method known as Grammatical Evolution. Remarkably, division of labor was achieved merely by selecting on overall group performance, and without providing any prior information on how the global object retrieval task was best divided into smaller subtasks. We discuss the potential of our method for engineering adaptively behaving robot swarms and interpret our results in relation to the likely path that nature took to evolve complex sociality and task specialization.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 26
    Publication Date: 2015-08-07
    Description: by Patrícia Santos-Oliveira, António Correia, Tiago Rodrigues, Teresa M Ribeiro-Rodrigues, Paulo Matafome, Juan Carlos Rodríguez-Manzaneque, Raquel Seiça, Henrique Girão, Rui D. M. Travasso Sprouting angiogenesis, where new blood vessels grow from pre-existing ones, is a complex process where biochemical and mechanical signals regulate endothelial cell proliferation and movement. Therefore, a mathematical description of sprouting angiogenesis has to take into consideration biological signals as well as relevant physical processes, in particular the mechanical interplay between adjacent endothelial cells and the extracellular microenvironment. In this work, we introduce the first phase-field continuous model of sprouting angiogenesis capable of predicting sprout morphology as a function of the elastic properties of the tissues and the traction forces exerted by the cells. The model is very compact, only consisting of three coupled partial differential equations, and has the clear advantage of a reduced number of parameters. This model allows us to describe sprout growth as a function of the cell-cell adhesion forces and the traction force exerted by the sprout tip cell. In the absence of proliferation, we observe that the sprout either achieves a maximum length or, when the traction and adhesion are very large, it breaks. Endothelial cell proliferation alters significantly sprout morphology, and we explore how different types of endothelial cell proliferation regulation are able to determine the shape of the growing sprout. The largest region in parameter space with well formed long and straight sprouts is obtained always when the proliferation is triggered by endothelial cell strain and its rate grows with angiogenic factor concentration. We conclude that in this scenario the tip cell has the role of creating a tension in the cells that follow its lead. On those first stalk cells, this tension produces strain and/or empty spaces, inevitably triggering cell proliferation. The new cells occupy the space behind the tip, the tension decreases, and the process restarts. Our results highlight the ability of mathematical models to suggest relevant hypotheses with respect to the role of forces in sprouting, hence underlining the necessary collaboration between modelling and molecular biology techniques to improve the current state-of-the-art.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 27
    Publication Date: 2015-08-08
    Description: by Sayed-Rzgar Hosseini, Aditya Barve, Andreas Wagner All biological evolution takes place in a space of possible genotypes and their phenotypes. The structure of this space defines the evolutionary potential and limitations of an evolving system. Metabolism is one of the most ancient and fundamental evolving systems, sustaining life by extracting energy from extracellular nutrients. Here we study metabolism’s potential for innovation by analyzing an exhaustive genotype-phenotype map for a space of 10 15 metabolisms that encodes all possible subsets of 51 reactions in central carbon metabolism. Using flux balance analysis, we predict the viability of these metabolisms on 10 different carbon sources which give rise to 1024 potential metabolic phenotypes. Although viable metabolisms with any one phenotype comprise a tiny fraction of genotype space, their absolute numbers exceed 10 9 for some phenotypes. Metabolisms with any one phenotype typically form a single network of genotypes that extends far or all the way through metabolic genotype space, where any two genotypes can be reached from each other through a series of single reaction changes. The minimal distance of genotype networks associated with different phenotypes is small, such that one can reach metabolisms with novel phenotypes – viable on new carbon sources – through one or few genotypic changes. Exceptions to these principles exist for those metabolisms whose complexity (number of reactions) is close to the minimum needed for viability. Increasing metabolic complexity enhances the potential for both evolutionary conservation and evolutionary innovation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 28
    Publication Date: 2015-08-08
    Description: Motivation: Stem cell differentiation is largely guided by master transcriptional regulators, but it also depends on the expression of other types of genes, such as cell cycle genes, signaling genes, metabolic genes, trafficking genes, etc. Traditional approaches to understanding gene expression patterns across multiple conditions, such as principal components analysis or K-means clustering, can group cell types based on gene expression, but they do so without knowledge of the differentiation hierarchy. Hierarchical clustering can organize cell types into a tree, but in general this tree is different from the differentiation hierarchy itself. Methods: Given the differentiation hierarchy and gene expression data at each node, we construct a weighted Euclidean distance metric such that the minimum spanning tree with respect to that metric is precisely the given differentiation hierarchy. We provide a set of linear constraints that are provably sufficient for the desired construction and a linear programming approach to identify sparse sets of weights, effectively identifying genes that are most relevant for discriminating different parts of the tree. Results: We apply our method to microarray gene expression data describing 38 cell types in the hematopoiesis hierarchy, constructing a weighted Euclidean metric that uses just 175 genes. However, we find that there are many alternative sets of weights that satisfy the linear constraints. Thus, in the style of random-forest training, we also construct metrics based on random subsets of the genes and compare them to the metric of 175 genes. We then report on the selected genes and their biological functions. Our approach offers a new way to identify genes that may have important roles in stem cell differentiation. Contact: tperkins@ohri.ca Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 29
    Publication Date: 2015-08-08
    Description: Motivation: Principal component analysis (PCA) is a basic tool often used in bioinformatics for visualization and dimension reduction. However, it is known that PCA may not consistently estimate the true direction of maximal variability in high-dimensional, low sample size settings, which are typical for molecular data. Assuming that the underlying signal is sparse, i.e. that only a fraction of features contribute to a principal component (PC), this estimation consistency can be retained. Most existing sparse PCA methods use L1-penalization, i.e. the lasso , to perform feature selection. But, the lasso is known to lack variable selection consistency in high dimensions and therefore a subsequent interpretation of selected features can give misleading results. Results: We present S4VDPCA, a sparse PCA method that incorporates a subsampling approach, namely stability selection. S4VDPCA can consistently select the truly relevant variables contributing to a sparse PC while also consistently estimate the direction of maximal variability. The performance of the S4VDPCA is assessed in a simulation study and compared to other PCA approaches, as well as to a hypothetical oracle PCA that ‘knows’ the truly relevant features in advance and thus finds optimal, unbiased sparse PCs. S4VDPCA is computationally efficient and performs best in simulations regarding parameter estimation consistency and feature selection consistency. Furthermore, S4VDPCA is applied to a publicly available gene expression data set of medulloblastoma brain tumors. Features contributing to the first two estimated sparse PCs represent genes significantly over-represented in pathways typically deregulated between molecular subgroups of medulloblastoma. Availability and implementation: Software is available at https://github.com/mwsill/s4vdpca . Contact: m.sill@dkfz.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 30
    Publication Date: 2015-08-08
    Description: Motivation: Glycans play critical roles in many biological processes, and their structural diversity is key for specific protein-glycan recognition. Comparative structural studies of biological molecules provide useful insight into their biological relationships. However, most computational tools are designed for protein structure, and despite their importance, there is no currently available tool for comparing glycan structures in a sequence order- and size-independent manner. Results: A novel method, GS-align, is developed for glycan structure alignment and similarity measurement. GS-align generates possible alignments between two glycan structures through iterative maximum clique search and fragment superposition. The optimal alignment is then determined by the maximum structural similarity score, GS-score, which is size-independent. Benchmark tests against the Protein Data Bank (PDB) N -linked glycan library and PDB homologous/non-homologous N -glycoprotein sets indicate that GS-align is a robust computational tool to align glycan structures and quantify their structural similarity. GS-align is also applied to template-based glycan structure prediction and monosaccharide substitution matrix generation to illustrate its utility. Availability and implementation: http://www.glycanstructure.org/gsalign . Contact: wonpil@ku.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 31
    Publication Date: 2015-08-08
    Description: Motivation: Impedance-based technologies are advancing methods for measuring proliferation of adherent cell cultures non-invasively and in real time. The analysis of the resulting data has so far been hampered by inappropriate computational methods and the lack of systematic data to evaluate the characteristics of the assay. Results: We used a commercially available system for impedance-based growth measurement (xCELLigence) and compared the reported cell index with data from microscopy. We found that the measured signal correlates linearly with the cell number throughout the time of an experiment with sufficient accuracy in subconfluent cell cultures. The resulting growth curves for various colon cancer cells could be well described with the empirical Richards growth model, which allows for extracting quantitative parameters (such as characteristic cycle times). We found that frequently used readouts like the cell index at a specific time or the area under the growth curve cannot be used to faithfully characterize growth inhibition. We propose to calculate the average growth rate of selected time intervals to accurately estimate time-dependent IC50 values of drugs from growth curves. Contact: nils.bluethgen@charite.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 32
    Publication Date: 2015-08-19
    Description: by Pengxing Cao, Ada W. C. Yan, Jane M. Heffernan, Stephen Petrie, Robert G. Moss, Louise A. Carolan, Teagan A. Guarnaccia, Anne Kelso, Ian G. Barr, Jodie McVernon, Karen L. Laurie, James M. McCaw Influenza is an infectious disease that primarily attacks the respiratory system. Innate immunity provides both a very early defense to influenza virus invasion and an effective control of viral growth. Previous modelling studies of virus–innate immune response interactions have focused on infection with a single virus and, while improving our understanding of viral and immune dynamics, have been unable to effectively evaluate the relative feasibility of different hypothesised mechanisms of antiviral immunity. In recent experiments, we have applied consecutive exposures to different virus strains in a ferret model, and demonstrated that viruses differed in their ability to induce a state of temporary immunity or viral interference capable of modifying the infection kinetics of the subsequent exposure. These results imply that virus-induced early immune responses may be responsible for the observed viral hierarchy. Here we introduce and analyse a family of within-host models of re-infection viral kinetics which allow for different viruses to stimulate the innate immune response to different degrees. The proposed models differ in their hypothesised mechanisms of action of the non-specific innate immune response. We compare these alternative models in terms of their abilities to reproduce the re-exposure data. Our results show that 1) a model with viral control mediated solely by a virus-resistant state, as commonly considered in the literature, is not able to reproduce the observed viral hierarchy; 2) the synchronised and desynchronised behaviour of consecutive virus infections is highly dependent upon the interval between primary virus and challenge virus exposures and is consistent with virus-dependent stimulation of the innate immune response. Our study provides the first mechanistic explanation for the recently observed influenza viral hierarchies and demonstrates the importance of understanding the host response to multi-strain viral infections. Re-exposure experiments provide a new paradigm in which to study the immune response to influenza and its role in viral control.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 33
    Publication Date: 2015-08-21
    Description: by Paul M. Harrison, Laurent Badel, Mark J. Wall, Magnus J. E. Richardson Models of neocortical networks are increasingly including the diversity of excitatory and inhibitory neuronal classes. Significant variability in cellular properties are also seen within a nominal neuronal class and this heterogeneity can be expected to influence the population response and information processing in networks. Recent studies have examined the population and network effects of variability in a particular neuronal parameter with some plausibly chosen distribution. However, the empirical variability and covariance seen across multiple parameters are rarely included, partly due to the lack of data on parameter correlations in forms convenient for model construction. To addess this we quantify the heterogeneity within and between the neocortical pyramidal-cell classes in layers 2/3, 4, and the slender-tufted and thick-tufted pyramidal cells of layer 5 using a combination of intracellular recordings, single-neuron modelling and statistical analyses. From the response to both square-pulse and naturalistic fluctuating stimuli, we examined the class-dependent variance and covariance of electrophysiological parameters and identify the role of the h current in generating parameter correlations. A byproduct of the dynamic I-V method we employed is the straightforward extraction of reduced neuron models from experiment. Empirically these models took the refractory exponential integrate-and-fire form and provide an accurate fit to the perisomatic voltage responses of the diverse pyramidal-cell populations when the class-dependent statistics of the model parameters were respected. By quantifying the parameter statistics we obtained an algorithm which generates populations of model neurons, for each of the four pyramidal-cell classes, that adhere to experimentally observed marginal distributions and parameter correlations. As well as providing this tool, which we hope will be of use for exploring the effects of heterogeneity in neocortical networks, we also provide the code for the dynamic I-V method and make the full electrophysiological data set available.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 34
    Publication Date: 2015-08-22
    Description: Community detection in a complex network is an important problem of much interest in recent years. In general, a community detection algorithm chooses an objective function and captures the communities of the network by optimizing the objective function, and then, one uses various heuristics to solve the optimization problem to extract the interesting communities for the user. In this article, we demonstrate the procedure to transform a graph into points of a metric space and develop the methods of community detection with the help of a metric defined for a pair of points. We have also studied and analyzed the community structure of the network therein. The results obtained with our approach are very competitive with most of the well-known algorithms in the literature, and this is justified over the large collection of datasets. On the other hand, it can be observed that time taken by our algorithm is quite less compared to other methods and justifies the theoretical findings.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 35
    Publication Date: 2015-08-25
    Description: Motivation: The storage and transmission of high-throughput sequencing data consumes significant resources. As our capacity to produce such data continues to increase, this burden will only grow. One approach to reduce storage and transmission requirements is to compress this sequencing data. Results: We present a novel technique to boost the compression of sequencing that is based on the concept of bucketing similar reads so that they appear nearby in the file. We demonstrate that, by adopting a data-dependent bucketing scheme and employing a number of encoding ideas, we can achieve substantially better compression ratios than existing de novo sequence compression tools, including other bucketing and reordering schemes. Our method, Mince, achieves up to a 45% reduction in file sizes (28% on average) compared with existing state-of-the-art de novo compression schemes. Availability and implementation : Mince is written in C++11, is open source and has been made available under the GPLv3 license. It is available at http://www.cs.cmu.edu/~ckingsf/software/mince . Contact: carlk@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 36
    Publication Date: 2015-08-25
    Description: : Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys 2 His 2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail. Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/ . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: t.hughes@utoronto.ca
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 37
    Publication Date: 2015-08-25
    Description: Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al. , 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. Results : Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub ( http://github.com/ ) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. Availability and implementation : Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api . The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem . A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator . Code for that tool is available from https://github.com/OpenTreeOfLife/opentree . Contact : mtholder@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 38
    Publication Date: 2015-08-25
    Description: Motivation: Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) detects genome-wide DNA–protein interactions and chromatin modifications, returning enriched regions (ERs), usually associated with a significance score. Moderately significant interactions can correspond to true, weak interactions, or to false positives; replicates of a ChIP-seq experiment can provide co-localised evidence to decide between the two cases. We designed a general methodological framework to rigorously combine the evidence of ERs in ChIP-seq replicates, with the option to set a significance threshold on the repeated evidence and a minimum number of samples bearing this evidence. Results : We applied our method to Myc transcription factor ChIP-seq datasets in K562 cells available in the ENCODE project. Using replicates, we could extend up to 3 times the ER number with respect to single-sample analysis with equivalent significance threshold. We validated the ‘rescued’ ERs by checking for the overlap with open chromatin regions and for the enrichment of the motif that Myc binds with strongest affinity; we compared our results with alternative methods (IDR and jMOSAiCS), obtaining more validated peaks than the former and less peaks than latter, but with a better validation. Availability and implementation : An implementation of the proposed method and its source code under GPLv3 license are freely available at http://www.bioinformatics.deib.polimi.it/MSPC/ and http://mspc.codeplex.com/ , respectively. Contact : marco.morelli@iit.it Supplementary information: Supplementary Material are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 39
    Publication Date: 2015-08-25
    Description: : We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2. Availability and implementation : kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/ Contact: barryghall@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 40
    Publication Date: 2015-08-25
    Description: Motivation: We have created an R package named phylogeo that provides a set of geographic utilities for sequencing-based microbial ecology studies. Although the geographic location of samples is an important aspect of environmental microbiology, none of the major software packages used in processing microbiome data include utilities that allow users to map and explore the spatial dimension of their data. phylogeo solves this problem by providing a set of plotting and mapping functions that can be used to visualize the geographic distribution of samples, to look at the relatedness of microbiomes using ecological distance, and to map the geographic distribution of particular sequences. By extending the popular phyloseq package and using the same data structures and command formats, phylogeo allows users to easily map and explore the geographic dimensions of their data from the R programming language. Availability and Implementation: phylogeo is documented and freely available http://zachcp.github.io/phylogeo Contact : zcharlop@rockefeller.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 41
    Publication Date: 2015-08-25
    Description: : Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX. Availability and implementation : Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a windows executable that can be run on Mac OS X and Linux by using Mono. Contact : ozan@cosbi.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 42
    Publication Date: 2015-08-25
    Description: Motivation: Molecular dynamics simulations provide atomic insight into the physicochemical characteristics of lipid membranes and hence, a wide range of force field families capable of modelling various lipid types have been developed in recent years. To model membranes in a biologically realistic lipid composition, simulation systems containing multiple different lipids must be assembled. Results: We present a new web service called MemGen that is capable of setting up simulation systems of heterogenous lipid membranes. MemGen is not restricted to certain lipid force fields or lipid types, but instead builds membranes from uploaded structure files which may contain any kind of amphiphilic molecule. MemGen works with any all-atom or united-atom lipid representation. Availability and implementation: MemGen is freely available without registration at http://memgen.uni-goettingen.de . Contact: jhub@gwdg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 43
    Publication Date: 2015-08-19
    Description: by Ariel Afek, Hila Cohen, Shiran Barber-Zucker, Raluca Gordân, David B. Lukatsky Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF) binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs) have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C . elegans and D . melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq) results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large-scale analysis using in vitro TF-DNA preferences obtained from the universal protein binding microarrays (PBM) for ~90 eukaryotic TFs belonging to 22 different DNA-binding domain types. As a result of this new analysis, we conclude that nonconsensus protein-DNA binding is a widespread phenomenon that significantly affects protein-DNA binding preferences and need not require the presence of consensus (specific) TFBSs in order to achieve genome-wide TF-DNA binding specificity.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 44
    Publication Date: 2015-08-21
    Description: A three-step iterative method with fifth-order convergence as a new modification of Newton’s method was presented. This method is for finding multiple roots of nonlinear equation with unknown multiplicity m whose multiplicity m is the highest multiplicity. Its order of convergence is analyzed and proved. Results for some numerical examples show the efficiency of the new method.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 45
    Publication Date: 2015-08-20
    Description: by The PLOS Computational Biology Staff
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 46
    Publication Date: 2015-08-21
    Description: by James Tamerius, Cécile Viboud, Jeffrey Shaman, Gerardo Chowell While a relationship between environmental forcing and influenza transmission has been established in inter-pandemic seasons, the drivers of pandemic influenza remain debated. In particular, school effects may predominate in pandemic seasons marked by an atypical concentration of cases among children. For the 2009 A/H1N1 pandemic, Mexico is a particularly interesting case study due to its broad geographic extent encompassing temperate and tropical regions, well-documented regional variation in the occurrence of pandemic outbreaks, and coincidence of several school breaks during the pandemic period. Here we fit a series of transmission models to daily laboratory-confirmed influenza data in 32 Mexican states using MCMC approaches, considering a meta-population framework or the absence of spatial coupling between states. We use these models to explore the effect of environmental, school–related and travel factors on the generation of spatially-heterogeneous pandemic waves. We find that the spatial structure of the pandemic is best understood by the interplay between regional differences in specific humidity (explaining the occurrence of pandemic activity towards the end of the school term in late May-June 2009 in more humid southeastern states), school vacations (preventing influenza transmission during July-August in all states), and regional differences in residual susceptibility (resulting in large outbreaks in early fall 2009 in central and northern Mexico that had yet to experience fully-developed outbreaks). Our results are in line with the concept that very high levels of specific humidity, as present during summer in southeastern Mexico, favor influenza transmission, and that school cycles are a strong determinant of pandemic wave timing.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 47
    Publication Date: 2015-08-21
    Description: by Alireza Alemi, Carlo Baldassi, Nicolas Brunel, Riccardo Zecchina Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 48
    Publication Date: 2015-08-25
    Description: Motivation: In practice, identifying and interpreting the functional impacts of the regulatory relationships between micro-RNA and messenger-RNA is non-trivial. The sheer scale of possible micro-RNA and messenger-RNA interactions can make the interpretation of results difficult. Results: We propose a supervised framework, pMim, built upon concepts of significance combination, for jointly ranking regulatory micro-RNA and their potential functional impacts with respect to a condition of interest. Here, pMim directly tests if a micro-RNA is differentially expressed and if its predicted targets, which lie in a common biological pathway, have changed in the opposite direction. We leverage the information within existing micro-RNA target and pathway databases to stabilize the estimation and annotation of micro-RNA regulation making our approach suitable for datasets with small sample sizes. In addition to outputting meaningful and interpretable results, we demonstrate in a variety of datasets that the micro-RNA identified by pMim, in comparison to simpler existing approaches, are also more concordant with what is described in the literature. Availability and implementation: This framework is implemented as an R function, pMim , in the package sydSeq available from http://www.ellispatrick.com/r-packages . Contact: jean.yang@sydney.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 49
    Publication Date: 2015-08-25
    Description: Motivation: Cellular mRNA levels originate from the combined action of multiple regulatory processes, which can be recapitulated by the rates of pre-mRNA synthesis, pre-mRNA processing and mRNA degradation. Recent experimental and computational advances set the basis to study these intertwined levels of regulation. Nevertheless, software for the comprehensive quantification of RNA dynamics is still lacking. Results: INSPEcT is an R package for the integrative analysis of RNA- and 4sU-seq data to study the dynamics of transcriptional regulation. INSPEcT provides gene-level quantification of these rates, and a modeling framework to identify which of these regulatory processes are most likely to explain the observed mRNA and pre-mRNA concentrations. Software performance is tested on a synthetic dataset, instrumental to guide the choice of the modeling parameters and the experimental design. Availability and implementation: INSPEcT is submitted to Bioconductor and is currently available as Supplementary Additional File S1 . Contact: mattia.pelizzola@iit.it Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 50
    Publication Date: 2015-08-25
    Description: Motivation: Experimentally determined gene regulatory networks can be enriched by computational inference from high-throughput expression profiles. However, the prediction of regulatory interactions is severely impaired by indirect and spurious effects, particularly for eukaryotes. Recently, published methods report improved predictions by exploiting the a priori known targets of a regulator (its local topology) in addition to expression profiles. Results: We find that methods exploiting known targets show an unexpectedly high rate of false discoveries. This leads to inflated performance estimates and the prediction of an excessive number of new interactions for regulators with many known targets. These issues are hidden from common evaluation and cross-validation setups, which is due to Simpson’s paradox. We suggest a confidence score recalibration method (CoRe) that reduces the false discovery rate and enables a reliable performance estimation. Conclusions: CoRe considerably improves the results of network inference methods that exploit known targets. Predictions then display the biological process specificity of regulators more correctly and enable the inference of accurate genome-wide regulatory networks in eukaryotes. For yeast, we propose a network with more than 22 000 confident interactions. We point out that machine learning approaches outside of the area of network inference may be affected as well. Availability and implementation: Results, executable code and networks are available via our website http://www.bio.ifi.lmu.de/forschung/CoRe . Contact: robert.kueffner@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 51
    Publication Date: 2015-08-25
    Description: Motivation: Stoichiometric and constraint-based methods of computational strain design have become an important tool for rational metabolic engineering. One of those relies on the concept of constrained minimal cut sets (cMCSs). However, as most other techniques, cMCSs may consider only reaction (or gene) knockouts to achieve a desired phenotype. Results : We generalize the cMCSs approach to constrained regulatory MCSs (cRegMCSs), where up/downregulation of reaction rates can be combined along with reaction deletions. We show that flux up/downregulations can virtually be treated as cuts allowing their direct integration into the algorithmic framework of cMCSs. Because of vastly enlarged search spaces in genome-scale networks, we developed strategies to (optionally) preselect suitable candidates for flux regulation and novel algorithmic techniques to further enhance efficiency and speed of cMCSs calculation. We illustrate the cRegMCSs approach by a simple example network and apply it then by identifying strain designs for ethanol production in a genome-scale metabolic model of Escherichia coli. The results clearly show that cRegMCSs combining reaction deletions and flux regulations provide a much larger number of suitable strain designs, many of which are significantly smaller relative to cMCSs involving only knockouts. Furthermore, with cRegMCSs, one may also enable the fine tuning of desired behaviours in a narrower range. The new cRegMCSs approach may thus accelerate the implementation of model-based strain designs for the bio-based production of fuels and chemicals. Availability and implementation: MATLAB code and the examples can be downloaded at http://www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html . Contact : krishna.mahadevan@utoronto.ca or klamt@mpi-magdeburg.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 52
    Publication Date: 2015-08-25
    Description: Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it. Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology—SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge. Availability: SwissLipids is freely available at http://www.swisslipids.org/ . Contact: alan.bridge@isb-sib.ch Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 53
    Publication Date: 2015-08-25
    Description: Motivation: Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments. Results: chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R ’s S 4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R , such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation. Availability and implementation: http://cran.r-project.org/web/packages/chipPCR . Source code: https://github.com/michbur/chipPCR . Contact : stefan.roediger@b-tu.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 54
    Publication Date: 2015-08-25
    Description: : A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (〈50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods. Availability and implementation: http://ifoldrna.dokhlab.org Contact: dokh@unc.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 55
    Publication Date: 2015-08-25
    Description: : The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library. Availability and implementation: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api . Supplementary information: Supplementary data are available at Bioinformatics online Contact: juan@ebi.ac.uk
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 56
    Publication Date: 2015-08-25
    Description: : Scanning probe microscopy (SPM) is already a relevant tool in biological research at the nanoscale. We present ‘Flatten plus’, a recent and helpful implementation in the well-known WSxM free software package. ‘Flatten plus’ allows reducing low-frequency noise in SPM images in a semi-automated way preventing the appearance of typical artifacts associated with such filters. Availability and implementation: WSxM is a free software implemented in C++ supported on MS Windows, but it can also be run under Mac or Linux using emulators such as Wine or Parallels. WSxM can be downloaded from http://www.wsxmsolutions.com/ . Contact: ignacio.horcas@wsxmsolutions.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 57
    Publication Date: 2015-08-25
    Description: : Despite the plethora of methods available for the functional analysis of omics data, obtaining comprehensive-yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings. Availability and Implementation: The R package GOplot is available via CRAN-The Comprehensive R Archive Network: http://cran.r-project.org/web/packages/GOplot . The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/ . A detailed manual of the package with sample figures can be found at https://wencke.github.io/ Contact: fscabo@cnic.es or mricote@cnic.es
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 58
    Publication Date: 2015-08-25
    Description: Motivation: Horizontal transfer of transposable (HTT) elements among eukaryotes was discovered in the mid-1980s. As then, 〉300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up to date, interactive and expandable database for such events, we developed the HTT-DB database. Results: HTT-DB allows easy access to most of HTT cases reported along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using Transposable elements and/or host species classification and export them in several formats. Availability and implementation: This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase . HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects. Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 59
    Publication Date: 2015-08-12
    Description: by Sander Land, Steven A. Niederer Biophysical models of cardiac tension development provide a succinct representation of our understanding of force generation in the heart. The link between protein kinetics and interactions that gives rise to high cooperativity is not yet fully explained from experiments or previous biophysical models. We propose a biophysical ODE-based representation of cross-bridge (XB), tropomyosin and troponin within a contractile regulatory unit (RU) to investigate the mechanisms behind cooperative activation, as well as the role of cooperativity in dynamic tension generation across different species. The model includes cooperative interactions between regulatory units (RU-RU), between crossbridges (XB-XB), as well more complex interactions between crossbridges and regulatory units (XB-RU interactions). For the steady-state force-calcium relationship, our framework predicts that: (1) XB-RU effects are key in shifting the half-activation value of the force-calcium relationship towards lower [Ca 2+ ], but have only small effects on cooperativity. (2) XB-XB effects approximately double the duty ratio of myosin, but do not significantly affect cooperativity. (3) RU-RU effects derived from the long-range action of tropomyosin are a major factor in cooperative activation, with each additional unblocked RU increasing the rate of additional RU’s unblocking. (4) Myosin affinity for short (1–4 RU) unblocked stretches of actin of is very low, and the resulting suppression of force at low [Ca 2+ ] is a major contributor in the biphasic force-calcium relationship. We also reproduce isometric tension development across mouse, rat and human at physiological temperature and pacing rate, and conclude that species differences require only changes in myosin affinity and troponin I/troponin C affinity. Furthermore, we show that the calcium dependence of the rate of tension redevelopment k tr is explained by transient blocking of RU’s by a temporary decrease in XB-RU effects.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 60
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-12
    Description: by Jonas Paulsen, Odin Gramstad, Philippe Collas The three-dimensional (3D) structure of the genome is important for orchestration of gene expression and cell differentiation. While mapping genomes in 3D has for a long time been elusive, recent adaptations of high-throughput sequencing to chromosome conformation capture (3C) techniques, allows for genome-wide structural characterization for the first time. However, reconstruction of "consensus" 3D genomes from 3C-based data is a challenging problem, since the data are aggregated over millions of cells. Recent single-cell adaptations to the 3C-technique, however, allow for non-aggregated structural assessment of genome structure, but data suffer from sparse and noisy interaction sampling. We present a manifold based optimization (MBO) approach for the reconstruction of 3D genome structure from chromosomal contact data. We show that MBO is able to reconstruct 3D structures based on the chromosomal contacts, imposing fewer structural violations than comparable methods. Additionally, MBO is suitable for efficient high-throughput reconstruction of large systems, such as entire genomes, allowing for comparative studies of genomic structure across cell-lines and different species.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 61
    Publication Date: 2015-08-12
    Description: by Hiroo Kenzaki, Shoji Takada Nucleosomes, basic units of chromatin, are known to show spontaneous DNA unwrapping dynamics that are crucial for transcriptional activation, but its structural details are yet to be elucidated. Here, employing a coarse-grained molecular model that captures residue-level structural details up to histone tails, we simulated equilibrium fluctuations and forced unwrapping of single nucleosomes at various conditions. The equilibrium simulations showed spontaneous unwrapping from outer DNA and subsequent rewrapping dynamics, which are in good agreement with experiments. We found several distinct partially unwrapped states of nucleosomes, as well as reversible transitions among these states. At a low salt concentration, histone tails tend to sit in the concave cleft between the histone octamer and DNA, tightening the nucleosome. At a higher salt concentration, the tails tend to bound to the outer side of DNA or be expanded outwards, which led to higher degree of unwrapping. Of the four types of histone tails, H3 and H2B tail dynamics are markedly correlated with partial unwrapping of DNA, and, moreover, their contributions were distinct. Acetylation in histone tails was simply mimicked by changing their charges, which enhanced the unwrapping, especially markedly for H3 and H2B tails.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 62
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-13
    Description: by Sebastian Bitzer, Jelle Bruineberg, Stefan J. Kiebel Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel approach of how perceptual decisions are made by combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines for the first time three important features: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence which we relate to recent experimental evidence for confidence computations in perceptual decision tasks.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 63
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: The Regression Network plugin for Cytoscape ( RegNetC ) implements the RegNet algorithm for the inference of transcriptional association network from gene expression profiles. This algorithm is a model tree-based method to detect the relationship between each gene and the remaining genes simultaneously instead of analyzing individually each pair of genes as correlation-based methods do. Model trees are a very useful technique to estimate the gene expression value by regression models and favours localized similarities over more global similarity, which is one of the major drawbacks of correlation-based methods. Here, we present an integrated software suite, named RegNetC , as a Cytoscape plugin that can operate on its own as well. RegNetC facilitates, according to user-defined parameters, the resulted transcriptional gene association network in .sif format for visualization, analysis and interoperates with other Cytoscape plugins, which can be exported for publication figures. In addition to the network, the RegNetC plugin also provides the quantitative relationships between genes expression values of those genes involved in the inferred network, i.e., those defined by the regression models.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 64
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are not caused by biological differences but experimental bias. There are different sources of bias including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC) with parameters considered as a linear mixed effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 65
    Publication Date: 2015-08-07
    Description: Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While there are quite a number of research works which perform tumor clustering, few of them considers how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we also introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and improve the performance of RDCFCE further by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE works well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 66
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Named-entity recognition (NER) plays an important role in the development of biomedical databases. However, the existing NER tools produce multifarious named-entities which may result in both curatable and non-curatable markers. To facilitate biocuration with a straightforward approach, classifying curatable named-entities is helpful with regard to accelerating the biocuration workflow. Co-occurrence Interaction Nexus with Named-entity Recognition (CoINNER) is a web-based tool that allows users to identify genes, chemicals, diseases, and action term mentions in the Comparative Toxicogenomic Database (CTD). To further discover interactions, CoINNER uses multiple advanced algorithms to recognize the mentions in the BioCreative IV CTD Track. CoINNER is developed based on a prototype system that annotated gene, chemical, and disease mentions in PubMed abstracts at BioCreative 2012 Track I (literature triage). We extended our previous system in developing CoINNER. The pre-tagging results of CoINNER were developed based on the state-of-the-art named entity recognition tools in BioCreative III. Next, a method based on conditional random fields (CRFs) is proposed to predict chemical and disease mentions in the articles. Finally, action term mentions were collected by latent Dirichlet allocation (LDA). At the BioCreative IV CTD Track, the best F-measures reached for gene/protein, chemical/drug and disease NER were 54 percent while CoINNER achieved a 61.5 percent F-measure. System URL: http://ikmbio.csie.ncku.edu.tw/coinner/introduction.htm.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 67
    Publication Date: 2015-08-07
    Description: Next-generation short-read sequencing is widely utilized in genomic studies. Biological applications require an alignment step to map sequencing reads to the reference genome, before acquiring expected genomic information. This requirement makes alignment accuracy a key factor for effective biological interpretation. Normally, when accounting for measurement errors and single nucleotide polymorphisms, short read mappings with a few mismatches are generally considered acceptable. However, to further improve the efficiency of short-read sequencing alignment, we propose a method to retrieve additional reliably aligned reads (reads with more than a pre-defined number of mismatches), using a Bayesian-based approach. In this method, we first retrieve the sequence context around the mismatched nucleotides within the already aligned reads; these loci contain the genomic features where sequencing errors occur. Then, using the derived pattern, we evaluate the remaining (typically discarded) reads with more than the allowed number of mismatches, and calculate a score that represents the probability that a specific alignment is correct. This strategy allows the extraction of more reliably aligned reads, therefore improving alignment sensitivity. Implementation: The source code of our tool, ResSeq, can be downloaded from: https://github.com/hrbeubiocenter/Resseq.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 68
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: In genome assembly graphs, motifs such as tips, bubbles, and cross links are studied in order to find sequencing errors and to understand the nature of the genome. Superbubble, a complex generalization of bubbles, was recently proposed as an important subgraph class for analyzing assembly graphs. At present, a quadratic time algorithm is known. This paper gives an -time algorithm to solve this problem for a graph with $m$ edges.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 69
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: The papers in this special section focus on software and databases that are central in bioinformatics and computational biology.. These programs are playing more and more important roles in biology and medical research. These papers cover a broad range of topics, including computational genomics and transcriptomics, analysis of biological networks and interactions, drug design, biomedical signal/image analysis, biomedical text mining and ontologies, biological data mining, visualization and integration, and high performance computing application in bioinformatics.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 70
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: In the computational biology community, machine learning algorithms are key instruments for many applications, including the prediction of gene-functions based upon the available biomolecular annotations. Additionally, they may also be employed to compute similarity between genes or proteins. Here, we describe and discuss a software suite we developed to implement and make publicly available some of such prediction methods and a computational technique based upon Latent Semantic Indexing (LSI), which leverages both inferred and available annotations to search for semantically similar genes. The suite consists of three components. BioAnnotationPredictor is a computational software module to predict new gene-functions based upon Singular Value Decomposition of available annotations. SimilBio is a Web module that leverages annotations available or predicted by BioAnnotationPredictor to discover similarities between genes via LSI. The suite includes also SemSim , a new Web service built upon these modules to allow accessing them programmatically. We integrated SemSim in the Bio Search Computing framework (http://www.bioinformatics.deib.polimi.it/bio-seco/seco/), where users can exploit the Search Computing technology to run multi-topic complex queries on multiple integrated Web services. Accordingly, researchers may obtain ranked answers involving the computation of the functional similarity between genes in support of biomedical knowledge discovery.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 71
    Publication Date: 2015-08-07
    Description: Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for personali- ed cancer therapy.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 72
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Malachi Griffith, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 73
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Arjun Bharioke, Dmitri B. Chklovskii Neurons must faithfully encode signals that can vary over many orders of magnitude despite having only limited dynamic ranges. For a correlated signal, this dynamic range constraint can be relieved by subtracting away components of the signal that can be predicted from the past, a strategy known as predictive coding, that relies on learning the input statistics. However, the statistics of input natural signals can also vary over very short time scales e.g., following saccades across a visual scene. To maintain a reduced transmission cost to signals with rapidly varying statistics, neuronal circuits implementing predictive coding must also rapidly adapt their properties. Experimentally, in different sensory modalities, sensory neurons have shown such adaptations within 100 ms of an input change. Here, we show first that linear neurons connected in a feedback inhibitory circuit can implement predictive coding. We then show that adding a rectification nonlinearity to such a feedback inhibitory circuit allows it to automatically adapt and approximate the performance of an optimal linear predictive coding network, over a wide range of inputs, while keeping its underlying temporal and synaptic properties unchanged. We demonstrate that the resulting changes to the linearized temporal filters of this nonlinear network match the fast adaptations observed experimentally in different sensory modalities, in different vertebrate species. Therefore, the nonlinear feedback inhibitory network can provide automatic adaptation to fast varying signals, maintaining the dynamic range necessary for accurate neuronal transmission of natural inputs.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 74
    Publication Date: 2015-08-08
    Description: by Murat Alp, Vipan K. Parihar, Charles L. Limoli, Francis A. Cucinotta In this work, a stochastic computational model of microscopic energy deposition events is used to study for the first time damage to irradiated neuronal cells of the mouse hippocampus. An extensive library of radiation tracks for different particle types is created to score energy deposition in small voxels and volume segments describing a neuron’s morphology that later are sampled for given particle fluence or dose. Methods included the construction of in silico mouse hippocampal granule cells from neuromorpho.org with spine and filopodia segments stochastically distributed along the dendritic branches. The model is tested with high-energy 56 Fe, 12 C, and 1 H particles and electrons. Results indicate that the tree-like structure of the neuronal morphology and the microscopic dose deposition of distinct particles may lead to different outcomes when cellular injury is assessed, leading to differences in structural damage for the same absorbed dose. The significance of the microscopic dose in neuron components is to introduce specific local and global modes of cellular injury that likely contribute to spine, filopodia, and dendrite pruning, impacting cognition and possibly the collapse of the neuron. Results show that the heterogeneity of heavy particle tracks at low doses, compared to the more uniform dose distribution of electrons, juxtaposed with neuron morphology make it necessary to model the spatial dose painting for specific neuronal components. Going forward, this work can directly support the development of biophysical models of the modifications of spine and dendritic morphology observed after low dose charged particle irradiation by providing accurate descriptions of the underlying physical insults to complex neuron structures at the nano-meter scale.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 75
    facet.materialart.
    Unknown
    Public Library of Science (PLoS)
    Publication Date: 2015-08-08
    Description: by Brinda Vallat, Carlos Madrid-Aliste, Andras Fiser Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly require template-free methods. However, template-free modeling methods are much less reliable and are usually applicable for smaller proteins, leaving much space for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger in size compared to the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ an exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state of the art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 76
    Publication Date: 2015-08-08
    Description: Motivation : The majority of variation identified by genome wide association studies falls in non-coding genomic regions and is hypothesized to impact regulatory elements that modulate gene expression. Here we present a statistically rigorous software tool GREGOR (Genomic Regulatory Elements and Gwas Overlap algoRithm) for evaluating enrichment of any set of genetic variants with any set of regulatory features. Using variants from five phenotypes, we describe a data-driven approach to determine the tissue and cell types most relevant to a trait of interest and to identify the subset of regulatory features likely impacted by these variants. Last, we experimentally evaluate six predicted functional variants at six lipid-associated loci and demonstrate significant evidence for allele-specific impact on expression levels. GREGOR systematically evaluates enrichment of genetic variation with the vast collection of regulatory data available to explore novel biological mechanisms of disease and guide us toward the functional variant at trait-associated loci. Availability and implementation : GREGOR, including source code, documentation, examples, and executables, is available at http://genome.sph.umich.edu/wiki/GREGOR . Contact : cristen@umich.edu Supplementary information : Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 77
    Publication Date: 2015-08-08
    Description: Motivation: Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. Results: An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores ( P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. Availability and implementation: The documented pipeline including annotated sample scripts can be found in http://fafner.meb.ki.se/biostatwiki/driver-genes/ . Contact: yudi.pawitan@ki.se Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 78
    Publication Date: 2015-08-08
    Description: Motivation: With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that are changing over time or space, for example, and then characterizing the specific expression changes. A number of robust statistical methods are available to identify genes showing differential expression among multiple conditions, but most assume conditions are exchangeable and thereby sacrifice power and precision when applied to ordered data. Results: We propose an empirical Bayes mixture modeling approach called EBSeq-HMM. In EBSeq-HMM, an auto-regressive hidden Markov model is implemented to accommodate dependence in gene expression across ordered conditions. As demonstrated in simulation and case studies, the output proves useful in identifying differentially expressed genes and in specifying gene-specific expression paths. EBSeq-HMM may also be used for inference regarding isoform expression. Availability and implementation: An R package containing examples and sample datasets is available at Bioconductor. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 79
    facet.materialart.
    Unknown
    Oxford University Press
    Publication Date: 2015-08-08
    Description: Motivation: Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies. Results: We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm’s robustness and discuss its sensitivity with respect to the free parameters. Availability and implementation: The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics . Contact: vlassis@adobe.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 80
    Publication Date: 2015-08-08
    Description: Motivation: Next-generation high-throughput sequencing has become a state-of-the-art technique in genome assembly. Scaffolding is one of the main stages of the assembly pipeline. During this stage, contigs assembled from the paired-end reads are merged into bigger chains called scaffolds. Because of a high level of statistical noise, chimeric reads, and genome repeats the problem of scaffolding is a challenging task. Current scaffolding software packages widely vary in their quality and are highly dependent on the read data quality and genome complexity. There are no clear winners and multiple opportunities for further improvements of the tools still exist. Results: This article presents an efficient scaffolding algorithm ScaffMatch that is able to handle reads with both short (〈600 bp) and long (〉35 000 bp) insert sizes producing high-quality scaffolds. We evaluate our scaffolding tool with the F score and other metrics (N50, corrected N50) on eight datasets comparing it with the most available packages. Our experiments show that ScaffMatch is the tool of preference for the most datasets. Availability and implementation: The source code is available at http://alan.cs.gsu.edu/NGS/?q=content/scaffmatch . Contact: mandric@cs.gsu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 81
    Publication Date: 2015-08-08
    Description: Motivation: Identifying protein subchloroplast localization in chloroplast organelle is very helpful for understanding the function of chloroplast proteins. There have existed a few computational prediction methods for protein subchloroplast localization. However, these existing works have ignored proteins with multiple subchloroplast locations when constructing prediction models, so that they can predict only one of all subchloroplast locations of this kind of multilabel proteins. Results: To address this problem, through utilizing label-specific features and label correlations simultaneously, a novel multilabel classifier was developed for predicting protein subchloroplast location(s) with both single and multiple location sites. As an initial study, the overall accuracy of our proposed algorithm reaches 55.52%, which is quite high to be able to become a promising tool for further studies. Availability and implementation: An online web server for our proposed algorithm named MultiP-SChlo was developed, which are freely accessible at http://biomed.zzuli.edu.cn/bioinfo/multip-schlo/ . Contact: pandaxiaoxi@gmail.com or gzli@tongji.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 82
    Publication Date: 2015-08-08
    Description: Motivation: Loops in proteins are often involved in biochemical functions. Their irregularity and flexibility make experimental structure determination and computational modeling challenging. Most current loop modeling methods focus on modeling single loops. In protein structure prediction, multiple loops often need to be modeled simultaneously. As interactions among loops in spatial proximity can be rather complex, sampling the conformations of multiple interacting loops is a challenging task. Results: In this study, we report a new method called m ulti-loop Di stance-guided S equential chain- Gro wth Monte Carlo ( M -D i SG ro ) for prediction of the conformations of multiple interacting loops in proteins. Our method achieves an average RMSD of 1.93 Å for lowest energy conformations of 36 pairs of interacting protein loops with the total length ranging from 12 to 24 residues. We further constructed a data set containing proteins with 2, 3 and 4 interacting loops. For the most challenging target proteins with four loops, the average RMSD of the lowest energy conformations is 2.35 Å. Our method is also tested for predicting multiple loops in β-barrel membrane proteins. For outer-membrane protein G, the lowest energy conformation has a RMSD of 2.62 Å for the three extracellular interacting loops with a total length of 34 residues (12, 12 and 10 residues in each loop). Availability and implementation : The software is freely available at: tanto.bioe.uic.edu/m-DiSGro. Contact: jinfeng@stat.fsu.edu or jliang@uic.edu Supplementary information : Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 83
    Publication Date: 2015-08-08
    Description: Motivation: Network comparison is a computationally intractable problem with important applications in systems biology and other domains. A key challenge is to properly quantify similarity between wiring patterns of two networks in an alignment-free fashion. Also, alignment-based methods exist that aim to identify an actual node mapping between networks and as such serve a different purpose. Various alignment-free methods that use different global network properties (e.g. degree distribution) have been proposed. Methods based on small local subgraphs called graphlets perform the best in the alignment-free network comparison task, due to high level of topological detail that graphlets can capture. Among different graphlet-based methods, Graphlet Correlation Distance (GCD) was shown to be the most accurate for comparing networks. Recently, a new graphlet-based method called NetDis was proposed, which was claimed to be superior. We argue against this, as the performance of NetDis was not properly evaluated to position it correctly among the other alignment-free methods. Results : We evaluate the performance of available alignment-free network comparison methods, including GCD and NetDis. We do this by measuring accuracy of each method (in a systematic precision-recall framework) in terms of how well the method can group (cluster) topologically similar networks. By testing this on both synthetic and real-world networks from different domains, we show that GCD remains the most accurate, noise-tolerant and computationally efficient alignment-free method. That is, we show that NetDis does not outperform the other methods, as originally claimed, while it is also computationally more expensive. Furthermore, since NetDis is dependent on the choice of a network null model (unlike the other graphlet-based methods), we show that its performance is highly sensitive to the choice of this parameter. Finally, we find that its performance is not independent on network sizes and densities, as originally claimed. Contact : natasha@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 84
    Publication Date: 2015-08-08
    Description: Motivation: The functional impact of small molecules is increasingly being assessed in different eukaryotic species through large-scale phenotypic screening initiatives. Identifying the targets of these molecules is crucial to mechanistically understand their function and uncover new therapeutically relevant modes of action. However, despite extensive work carried out in model organisms and human, it is still unclear to what extent one can use information obtained in one species to make predictions in other species. Results: Here, for the first time, we explore and validate at a large scale the use of protein homology relationships to predict the targets of small molecules across different species. Our results show that exploiting target homology can significantly improve the predictions, especially for molecules experimentally tested in other species. Interestingly, when considering separately orthology and paralogy relationships, we observe that mapping small molecule interactions among orthologs improves prediction accuracy, while including paralogs does not improve and even sometimes worsens the prediction accuracy. Overall, our results provide a novel approach to integrate chemical screening results across multiple species and highlight the promises and remaining challenges of using protein homology for small molecule target identification. Availability and implementation: Homology-based predictions can be tested on our website http://www.swisstargetprediction.ch . Contact: david.gfeller@unil.ch or vincent.zoete@isb-sib.ch . Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 85
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: The papers in this special issue contain extended versions of works that were originally presented at the Brazilian Symposium on Bioinformatics 2013 (BSB 2013), held in Recife, Brazil, November 3-6, 2013.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 86
    Publication Date: 2015-06-06
    Description: Computational methods for predicting protein-protein interactions are important tools that can complement high-throughput technologies and guide biologists in designing new laboratory experiments. The proteins and the interactions between them can be described by a network which is characterized by several topological properties. Information about proteins and interactions between them, in combination with knowledge about topological properties of the network, can be used for developing computational methods that can accurately predict unknown protein-protein interactions. This paper presents a supervised learning framework based on Bayesian inference for combining two types of information: i) network topology information, and ii) information related to proteins and the interactions between them. The motivation of our model is that by combining these two types of information one can achieve a better accuracy in predicting protein-protein interactions, than by using models constructed from these two types of information independently.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 87
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: We develop a theory of algebraic operations over linear and context-free grammars that makes it possible to combine simple “atomic” grammars operating on single sequences into complex, multi-dimensional grammars. We demonstrate the utility of this framework by constructing the search spaces of complex alignment problems on multiple input sequences explicitly as algebraic expressions of very simple one-dimensional grammars. In particular, we provide a fully worked frameshift-aware, semiglobal DNA-protein alignment algorithm whose grammar is composed of products of small, atomic grammars. The compiler accompanying our theory makes it easy to experiment with the combination of multiple grammars and different operations. Composite grammars can be written out in $ {rm L}^AT_{E}X$ for documentation and as a guide to implementation of dynamic programming algorithms. An embedding in Haskell as a domain-specific language makes the theory directly accessible to writing and using grammar products without the detour of an external compiler. Software and supplemental files available here: http://www.bioinf.uni-leipzig.de/Software/gramprod/
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 88
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: Recent advancements in genomics and proteomics provide a solid foundation for understanding the pathogenesis of diabetes. Proteomics of diabetes associated pathways help to identify the most potent target for the management of diabetes. The relevant datasets are scattered in various prominent sources which takes much time to select the therapeutic target for the clinical management of diabetes. However, additional information about target proteins is needed for validation. This lacuna may be resolved by linking diabetes associated genes, pathways and proteins and it will provide a strong base for the treatment and planning management strategies of diabetes. Thus, a web source “Diabetes Associated Proteins Database (DAPD)” has been developed to link the diabetes associated genes, pathways and proteins using PHP, MySQL. The current version of DAPD has been built with proteins associated with different types of diabetes. In addition, DAPD has been linked to external sources to gain the access to more participatory proteins and their pathway network. DAPD will reduce the time and it is expected to pave the way for the discovery of novel anti-diabetic leads using computational drug designing for diabetes management. DAPD is open accessed via following url www.mkarthikeyan.bioinfoau.org/dapd.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 89
    Publication Date: 2015-06-06
    Description: Determining the glycan topology automatically from mass spectra represents a great challenge. Existing methods fall into approximate and exact ones. The former including greedy and heuristic ones can reduce the computational complexity, but suffer from information lost in the procedure of glycan interpretation. The latter including dynamic programming and exhaustive enumeration are much slower than the former. In the past years, nearly all emerging methods adopted a tree structure to represent a glycan. They share such problems as repetitive peak counting in reconstructing a candidate structure. Besides, tree-based glycan representation methods often have to give different computational formulas for binary and ternary glycans. We propose a new directed acyclic graph structure for glycan representation. Based on it, this work develops a de novo algorithm to accurately reconstruct the tree structure iteratively from mass spectra with logical constraints and some known biosynthesis rules, by a single computational formula. The experiments on multiple complex glycans extracted from human serum show that the proposed algorithm can achieve higher accuracy to determine a glycan topology than prior methods without increasing computational burden.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 90
    Publication Date: 2015-06-06
    Description: A crucial step in understanding the architecture of cells and tissues from microscopy images, and consequently explain important biological events such as wound healing and cancer metastases, is the complete extraction and enumeration of individual filaments from the cellular cytoskeletal network. Current efforts at quantitative estimation of filament length distribution, architecture and orientation from microscopy images are predominantly limited to visual estimation and indirect experimental inference. Here we demonstrate the application of a new algorithm to reliably estimate centerlines of biological filament bundles and extract individual filaments from the centerlines by systematically disambiguating filament intersections. We utilize a filament enhancement step followed by reverse diffusion based filament localization and an integer programming based set combination to systematically extract accurate filaments automatically from microscopy images. Experiments on simulated and real confocal microscope images of flat cells (2D images) show efficacy of the new method.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 91
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: The Local/Global Alignment (Zemla, 2003), or LGA, is a popular method for the comparison of protein structures. One of the two components of LGA requires us to compute the longest common contiguous segments between two protein structures. That is, given two structures $A=(a_1, ldots , a_n)$ and $B=(b_1, ldots , b_n)$ where $a_k$ , $b_kin mathbb {R}^3$ , we are to find, among all the segments $f=(a_i,ldots ,a_j)$ and $g=(b_i,ldots ,b_j)$ that fulfill a certain criterion regarding their similarity, those of the maximum length. We consider the following criteria: (1) the root mean squared deviation (RMSD) between $f$ and $g$ is to be within a given $tin mathbb {R}$ ; (2) $f$ and $g$ can be superposed such that for each $k$ , $ile kle j$ , $Vert a_k-b_kVert le t$ for a given $tin mathbb {R}$ . We give an algorithm of $O(n;log; n+n{{boldsymbol l}})$ time complexity when the first requirement applies, where ${{boldsymbol l}}$ is the maximum length of the segments fulfilling the criterion. We show an FPTAS which, for any
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 92
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: Noise can induce various dynamical behaviors in nonlinear systems. White noise perturbed systems have been extensively investigated during the last decades. In gene networks, experimentally observed extrinsic noise is colored. As an attempt, we investigate the genetic toggle switch systems perturbed by colored extrinsic noise and with kinetic parameters. Compared with white noise perturbed systems, we show there also exists optimal colored noise strength to induce the best stochastic switch behaviors in the single toggle switch, and the best synchronized switching in the networked systems, which demonstrate that noise-induced optimal switch behaviors are widely in existence. Moreover, under a wide range of system parameter regions, we find there exist wider ranges of white and colored noises strengths to induce good switch and synchronization behaviors, respectively; therefore, white noise is beneficial for switch and colored noise is beneficial for population synchronization. Our observations are very robust to extrinsic stimulus strength, cell density, and diffusion rate. Finally, based on the Waddington’s epigenetic landscape and the Wiener-Khintchine theorem, physical mechanisms underlying the observations are interpreted. Our investigations can provide guidelines for experimental design, and have potential clinical implications in gene therapy and synthetic biology.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 93
    Publication Date: 2015-06-06
    Description: Revealing the underlying evolutionary mechanism plays an important role in understanding protein interaction networks in the cell. While many evolutionary models have been proposed, the problem about applying these models to real network data, especially for differentiating which model can better describe evolutionary process for the observed network remains a challenge. The traditional way is to use a model with presumed parameters to generate a network, and then evaluate the fitness by summary statistics, which however cannot capture the complete network structures information and estimate parameter distribution. In this work, we developed a novel method based on Approximate Bayesian Computation and modified Differential Evolution algorithm (ABC-DEP) that is capable of conducting model selection and parameter estimation simultaneously and detecting the underlying evolutionary mechanisms for PPI networks more accurately. We tested our method for its power in differentiating models and estimating parameters on simulated data and found significant improvement in performance benchmark, as compared with a previous method. We further applied our method to real data of protein interaction networks in human and yeast. Our results show duplication attachment model as the predominant evolutionary mechanism for human PPI networks and Scale-Free model as the predominant mechanism for yeast PPI networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 94
    Publication Date: 2015-06-06
    Description: Single nucleotide polymorphisms, a dominant type of genetic variants, have been used successfully to identify defective genes causing human single gene diseases. However, most common human diseases are complex diseases and caused by gene-gene and gene-environment interactions. Many SNP-SNP interaction analysis methods have been introduced but they are not powerful enough to discover interactions more than three SNPs. The paper proposes a novel method that analyzes all SNPs simultaneously. Different from existing methods, the method regards an individual’s genotype data on a list of SNPs as a point with a unit of energy in a multi-dimensional space, and tries to find a new coordinate system where the energy distribution difference between cases and controls reaches the maximum. The method will find different multiple SNPs combinatorial patterns between cases and controls based on the new coordinate system. The experiment on simulated data shows that the method is efficient. The tests on the real data of age-related macular degeneration (AMD) disease show that it can find out more significant multi-SNP combinatorial patterns than existing methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 95
    Publication Date: 2015-06-06
    Description: Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the intrinsic understanding of protein structure and function, especially in the post-genome era where large volume of sequenced proteins without being functional annotated is quickly accumulated. In this study, a new feature extracted from the predicted protein 3D structural information is proposed and integrated with traditional features to form discriminative features. Based on the extracted features, a random forest regression model is performed to predict protein disulfide connectivity. We compare the proposed method with popular existing predictors by performing both cross-validation and independent validation tests on benchmark datasets. The experimental results demonstrate the superiority of the proposed method over existing predictors. We believe the superiority of the proposed method benefits from both the good discriminative capability of the newly developed features and the powerful modelling capability of the random forest. The web server implementation, called TargetDisulfide, and the benchmark datasets are freely available at: http://csbio.njust.edu.cn/bioinf/TargetDisulfide for academic use.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 96
    Publication Date: 2015-06-06
    Description: Proteins are molecules that form the mass of living beings. These proteins exist in dissociated forms like amino-acids and carry out various biological functions, in fact, almost all body reactions occur with the participation of proteins. This is one of the reasons why the analysis of proteins has become a major issue in biology. In a more concrete way, the identification of conserved patterns in a set of related protein sequences can provide relevant biological information about these protein functions. In this paper, we present a novel algorithm based on teaching learning based optimization (TLBO) combined with a local search function specialized to predict common patterns in sets of protein sequences. This population-based evolutionary algorithm defines a group of individuals (solutions) that enhance their knowledge (quality) by means of different learning stages. Thus, if we correctly adapt it to the biological context of the mentioned problem, we can get an acceptable set of quality solutions. To evaluate the performance of the proposed technique, we have used six instances composed of different related protein sequences obtained from the PROSITE database. As we will see, the designed approach makes good predictions and improves the quality of the solutions found by other well-known biological tools.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 97
    Publication Date: 2015-06-06
    Description: We introduce Supervised Variational Relevance Learning (Suvrel), a variational method to determine metric tensors to define distance based similarity in pattern classification, inspired in relevance learning. The variational method is applied to a cost function that penalizes large intraclass distances and favors small interclass distances. We find analytically the metric tensor that minimizes the cost function. Preprocessing the patterns by doing linear transformations using the metric tensor yields a dataset which can be more efficiently classified. We test our methods using publicly available datasets, for some standard classifiers. Among these datasets, two were tested by the MAQC-II project and, even without the use of further preprocessing, our results improve on their performance.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 98
    Publication Date: 2015-07-30
    Description: In this paper, we present three improvements to a three-point third order variant of Newton’s method derived from the Simpson rule. The first one is a fifth order method using the same number of functional evaluations as the third order method, the second one is a four-point 10th order method and the last one is a five-point 20th order method. In terms of computational point of view, our methods require four evaluations (one function and three first derivatives) to get fifth order, five evaluations (two functions and three derivatives) to get 10th order and six evaluations (three functions and three derivatives) to get 20th order. Hence, these methods have efficiency indexes of 1.495, 1.585 and 1.648, respectively which are better than the efficiency index of 1.316 of the third order method. We test the methods through some numerical experiments which show that the 20th order method is very efficient.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 99
    Publication Date: 2015-08-05
    Description: by Po-Wei Chen, Luis L. Fonseca, Yusuf A. Hannun, Eberhard O. Voit The article demonstrates that computational modeling has the capacity to convert metabolic snapshots, taken sequentially over time, into a description of cellular, dynamic strategies. The specific application is a detailed analysis of a set of actions with which Saccharomyces cerevisiae responds to heat stress. Using time dependent metabolic concentration data, we use a combination of mathematical modeling, reverse engineering, and optimization to infer dynamic changes in enzyme activities within the sphingolipid pathway. The details of the sphingolipid responses to heat stress are important, because they guide some of the longer-term alterations in gene expression, with which the cells adapt to the increased temperature. The analysis indicates that all enzyme activities in the system are affected and that the shapes of the time trends in activities depend on the fatty-acyl CoA chain lengths of the different ceramide species in the system.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 100
    Publication Date: 2015-07-30
    Description: Robust small target detection of low signal-to-noise ratio (SNR) is very important in infrared search and track applications for self-defense or attacks. Due to the complex background, current algorithms have some unsolved issues with false alarm rate. In order to reduce the false alarm rate, an infrared small target detection algorithm based on saliency detection and support vector machine was proposed. Firstly, we detect salient regions that may contain targets with phase spectrum Fourier transform (PFT) approach. Then, target recognition was performed in the salient regions. Experimental results show the proposed algorithm has ideal robustness and efficiency for real infrared small target detection applications.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...