ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Collection
  • Articles  (111)
Publisher
  • Institute of Electrical and Electronics Engineers (IEEE)  (111)
Years
  • 2015-2019  (111)
  • 1990-1994
  • 1945-1949
Year
  • 2016  (111)
Journal
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics  (111)
Topic
  • Computer Science  (111)
  • Geography
  • Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
  • Biology  (111)
  • 1
    Publication Date: 2016-04-01
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 2
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet searching for relevant datasets for knowledge discovery is poorly supported: the metadata describing ENCODE datasets are quite simple and incomplete, and are not described by a coherent underlying ontology. Here, we show how to overcome this limitation by adopting an ENCODE metadata searching approach that uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists’ queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists’ queries.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 3
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: The behaviour of a high-dimensional stochastic system described by a chemical master equation (CME) depends on many parameters, rendering explicit simulation an inefficient method for exploring the properties of such models. Capturing their behaviour by low-dimensional models makes analysis of system behaviour tractable. In this paper, we present low-dimensional models for the noise-induced excitable dynamics in Bacillus subtilis, whereby a key protein ComK, which drives a complex chain of reactions leading to bacterial competence, gets expressed rapidly in large quantities (competent state) before subsiding to low levels of expression (vegetative state). These rapid reactions suggest the application of an adiabatic approximation of the dynamics of the regulatory model that, however, leads to competence durations that are incorrect by a factor of 2. We apply a modified version of an iterative functional procedure that faithfully approximates the time-course of the trajectories in terms of a two-dimensional model involving proteins ComK and ComS. Furthermore, in order to describe the bimodal bivariate marginal probability distribution obtained from the Gillespie simulations of the CME, we introduce a tunable multiplicative noise term in a two-dimensional Langevin model whose stationary state is described by the time-independent solution of the corresponding Fokker-Planck equation.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
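The Gillespie simulations mentioned in this abstract can be illustrated on a far simpler system. The sketch below is not the ComK/ComS competence model: it assumes a hypothetical one-species birth-death process, and the function name and the rate constants `k_prod` and `k_deg` are made up for illustration. It only shows the general shape of the stochastic simulation algorithm.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Simulate production (X -> X+1, rate k_prod) and
    degradation (X -> X-1, rate k_deg * X) until time t_end.

    Hypothetical toy model, not the regulatory network from the abstract.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * x        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:       # no reaction can fire any more
            break
        t += rng.expovariate(a_total)   # waiting time to the next reaction
        if t > t_end:
            break
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k_prod=10.0, k_deg=0.1, x0=0, t_end=50.0)
    print("final copy number:", counts[-1])
```

With these assumed rates the copy number fluctuates around `k_prod / k_deg`; repeating the run with different seeds gives an empirical distribution of the kind the abstract compares against a Langevin model.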
  • 4
    Publication Date: 2016-04-01
    Description: The inference of the demographic history of populations is an important undertaking in population genetics. A few recent studies have developed identity-by-descent (IBD) based methods to reveal the signature of relatively recent historical events. Notably, Pe'er and his colleagues have introduced a novel method (named PIBD here) that employs IBD sharing to infer effective population size and migration rate. However, under the island model, PIBD neglects the coalescent information before the time to the most recent common ancestor (tMRCA), which leads to apparent deviations in certain situations. In this paper, we propose a new method, MIBD, by adopting a Markov process to describe the island model, and develop a new formula for estimating IBD sharing. The new formula considers the coalescent information before tMRCA and the joint effect of the coalescent and migration events. We apply both MIBD and PIBD to the genome-wide data of two human populations (Palestinian and Bedouin) obtained from the HGDP-CEPH database, and demonstrate that MIBD is competitive with PIBD. Our simulation analyses also show that the results of MIBD are more accurate than those of PIBD, especially in the case of small effective population size.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 5
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: Genes and their protein products are essential molecular units of a living organism. The knowledge of their functions is key for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. The association of a gene or protein with its functions, described by controlled terms of biomolecular terminologies or ontologies, is named gene functional annotation. Many valuable gene annotations expressed through terminologies and ontologies are available. Nevertheless, they might include some erroneous information, since only a subset of annotations are reviewed by curators. Furthermore, they are incomplete by definition, given the rapidly evolving pace of biomolecular knowledge. In this scenario, computational methods that are able to quicken the annotation curation process and reliably suggest new annotations are very important. Here, we first propose a computational pipeline that uses different semantic and machine learning methods to predict novel ontology-based gene functional annotations; then, we introduce a new semantic prioritization rule to categorize the predicted annotations by their likelihood of being correct. Our tests and validations proved the effectiveness of our pipeline and prioritization of predicted annotations: many of the predicted annotations selected as most likely were later confirmed.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 6
    Publication Date: 2016-04-01
    Description: This paper presents an FPGA-based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set, which gradually grows during sequencing, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA, with successful DNA comparison when using data sets from real genomes.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
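The overlap search at the heart of de novo assembly can be shown in software with a deliberately naive sketch. The code below is not the dynamic suffix comparison algorithm or the FPGA design from the abstract: it is a plain quadratic suffix-prefix comparison, and the function names, the reads, and the `min_len` threshold are hypothetical.

```python
def suffix_prefix_overlap(a, b, min_len=1):
    """Return the length of the longest suffix of `a` that equals a
    prefix of `b`, or 0 if no overlap of at least `min_len` exists."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):   # try longest overlap first
        if a[-k:] == b[:k]:
            return k
    return 0


def all_overlaps(reads, min_len):
    """All-against-all comparison: map (i, j) to the overlap length of
    read i followed by read j, keeping only non-zero overlaps."""
    overlaps = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = suffix_prefix_overlap(a, b, min_len)
                if k:
                    overlaps[(i, j)] = k
    return overlaps


if __name__ == "__main__":
    reads = ["ACGTTG", "TTGCA", "GCAAC"]   # toy reads, not real genome data
    print(all_overlaps(reads, min_len=3))
```

Splitting `reads` into windows and calling `all_overlaps` on pairs of windows as new reads arrive would mimic, very loosely, the window-against-window phases described above.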
  • 7
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: Determining the biological functions of proteins is one of the key challenges in the post-genomic era. The rapidly accumulating large volumes of proteomic and genomic data drive the development of computational models for automatically predicting protein function at large scale. Recent approaches focus on integrating multiple heterogeneous data sources, and they often get better results than methods that use a single data source alone. In this paper, we investigate how to integrate multiple biological data sources with biological knowledge, i.e., the Gene Ontology (GO), for protein function prediction. We propose a method, called SimNet, to Semantically integrate multiple functional association Networks derived from heterogeneous data sources. SimNet first utilizes GO annotations of proteins to capture the semantic similarity between proteins and introduces a semantic kernel based on the similarity. Next, SimNet constructs a composite network, obtained as a weighted summation of individual networks, and aligns the network with the kernel to get the weights assigned to individual networks. Then, it applies a network-based classifier on the composite network to predict protein function. Experimental results on heterogeneous proteomic data sources of Yeast, Human, Mouse, and Fly show that SimNet not only achieves better (or comparable) results than other related competitive approaches, but also takes much less time. The MATLAB code of SimNet is available at https://sites.google.com/site/guoxian85/simnet.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 8
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: Inferring gene regulatory networks (GRNs) from high-throughput gene-expression data is an important and challenging problem in systems biology. Several existing algorithms formulate GRN inference as a regression problem. The available regression-based algorithms are based on the assumption that all regulatory interactions are linear. However, nonlinear transcription regulation mechanisms are common in biology. In this work, we propose a new regression-based method named bLARS that permits a variety of regulatory interactions from a predefined but otherwise arbitrary family of functions. On three DREAM benchmark datasets, namely gene expression data from E. coli, Yeast, and a synthetic data set, bLARS outperforms state-of-the-art algorithms in terms of the overall score. On the individual networks, bLARS offers the best performance among currently available similar algorithms, namely algorithms that do not use perturbation information and are not meta-algorithms. Moreover, the presented approach can also be utilized for general feature selection problems in domains other than biology, provided they are of a similar structure.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 9
    Publication Date: 2016-04-01
    Description: Understanding complex biological phenomena involves answering complex biomedical questions on multiple pieces of biomolecular information simultaneously, which are expressed through multiple genomic and proteomic semantic annotations scattered across many distributed and heterogeneous data sources; such heterogeneity and dispersion hamper the biologists’ ability to ask global queries and perform global evaluations. To overcome this problem, we developed a software architecture to create and maintain a Genomic and Proteomic Knowledge Base (GPKB), which integrates several of the most relevant sources of such dispersed information (including Entrez Gene, UniProt, IntAct, Expasy Enzyme, GO, GOA, BioCyc, KEGG, Reactome, and OMIM). Our solution is general, as it uses a flexible, modular, and multilevel global data schema based on abstraction and generalization of integrated data features, and a set of automatic procedures for easing data integration and maintenance, even when the integrated data sources evolve in data content, structure, and number. These procedures also assure consistency, quality, and provenance tracking of all integrated data, and perform the semantic closure of the hierarchical relationships of the integrated biomedical ontologies. At http://www.bioinformatics.deib.polimi.it/GPKB/, a Web interface allows easy graphical composition of queries, even complex ones, on the knowledge base, and also supports semantic query expansion and comprehensive explorative search of the integrated data to better sustain biomedical knowledge extraction.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
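The "semantic closure of the hierarchical relationships" and the "semantic query expansion" mentioned in this abstract can be pictured with a toy is-a hierarchy. The sketch below makes no claim about how GPKB implements either step: the term names, the `parents` dictionary layout, and both function names are hypothetical.

```python
def ancestor_closure(parents):
    """Given a mapping term -> list of direct parent terms, return a
    mapping term -> set of all direct and indirect ancestors."""
    closure = {}
    for term in parents:
        seen = set()
        stack = list(parents[term])
        while stack:                      # depth-first walk up the hierarchy
            p = stack.pop()
            if p not in seen:
                seen.add(p)
                stack.extend(parents.get(p, ()))
        closure[term] = seen
    return closure


def expand_query(term, closure):
    """Expand a query term to itself plus every term that has it as an
    ancestor, so a search for a general concept also finds specific ones."""
    return {term} | {t for t, ancestors in closure.items() if term in ancestors}


if __name__ == "__main__":
    # Toy hierarchy for illustration only, not taken from any real ontology file.
    parents = {
        "hexokinase": ["kinase"],
        "kinase": ["transferase"],
        "transferase": ["enzyme"],
        "enzyme": [],
    }
    closure = ancestor_closure(parents)
    print(sorted(expand_query("transferase", closure)))
```

Precomputing the closure once trades storage for query speed: each expansion becomes a lookup instead of a graph traversal.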
  • 10
    Publication Date: 2016-04-01
    Description: Analysis of metagenomic sequence data requires a multi-stage workflow. The results of each intermediate step possess an inherent uncertainty and potentially impact the as-yet-unmeasured statistical significance of downstream analyses. Here, we describe our phylogenetic analysis pipeline, which uses the pplacer program to place many shotgun sequences corresponding to a single functional gene onto a fixed phylogenetic tree. We then use the squash clustering method to compare multiple samples with respect to that gene. We approximate the statistical significance of each gene's clustering result by measuring its cluster stability, that is, the consistency of that clustering result when the probabilistic placements made by pplacer are systematically reassigned and then clustered again, as measured by the adjusted Rand Index. We find that among the genes investigated, the majority of analyses are stable, based on the average adjusted Rand Index. We investigated properties of each gene that may explain less stable results. These genes tended to have less convex reference trees, fewer total reads recruited to the gene, and a greater Expected Distance between Placement Locations as given by pplacer when examined in aggregate. However, for an individual functional gene, these measures alone do not predict cluster stability.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
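The adjusted Rand Index used here as the stability measure has a standard closed form based on pair counts. The following is a minimal sketch of that formula, assuming two flat label lists over the same samples; the function name and example labels are illustrative and are not taken from the pipeline described in the abstract.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0                      # fewer than two items: trivially identical

    # Contingency counts: items per (cluster in A, cluster in B) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0                      # degenerate case, e.g. one cluster in both
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    original = [0, 0, 1, 1, 2, 2]
    reclustered = [1, 1, 0, 0, 2, 2]    # same partition, different label names
    print(adjusted_rand_index(original, reclustered))
```

Averaging this value over many reassign-and-recluster rounds would give a stability score in the spirit of the abstract; a value near 1 means the clustering barely changes.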
  • 11
    Publication Date: 2016-04-01
    Description: Biological systems encompass complexity that far surpasses many artificial systems. Modeling and simulation of large and complex biochemical pathways is a computationally intensive challenge. Traditional tools, such as ordinary differential equations, partial differential equations, stochastic master equations, and Gillespie-type methods, are all limited either by their modeling fidelity or computational efficiency or both. In this work, we present a scalable computational framework, based on modeling biochemical reactions in explicit 3D space, that is suitable for studying the behavior of large and complex biological pathways. The framework is designed to exploit the parallelism and scalability offered by commodity massively parallel processors such as graphics processing units (GPUs) and other parallel computing platforms. The reaction modeling in 3D space is aimed at enhancing the realism of the model compared to traditional modeling tools and frameworks. We introduce the Parallel Select algorithm, which is key to breaking the sequential bottleneck limiting the performance of most other tools designed to study biochemical interactions. The algorithm is designed to be computationally tractable and to handle hundreds of interacting chemical species and millions of independent agents by considering all-particle interactions within the system. We also present an implementation of the framework on popular graphics processing units and apply it to the simulation study of the JAK-STAT signal transduction pathway. The computational framework will offer a deeper insight into various biological processes within the cell and help us observe key events as they unfold in space and time. This will advance the current state of the art in the simulation study of large-scale biological systems and also enable the realistic simulation study of macro-biological cultures, where inter-cellular interactions are prevalent.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 12
    Publication Date: 2016-04-01
    Description: Recently, many biological studies have reported that two groups of genes tend to show negatively correlated or opposite expression tendencies in many biological processes or pathways. The negative correlation between genes may imply an important biological mechanism. In this study, we propose an FCA-based negative correlation algorithm (NCFCA) that can effectively identify opposite expression tendencies between two gene groups in gene expression data. After applying it to expression data of cell cycle-regulated genes in yeast, we found that six minichromosome maintenance (MCM) family genes showed an expression tendency opposite to that of eight core histone family genes. Furthermore, we confirmed that the negative correlation expression pattern between these two families may be conserved in the cell cycle. Finally, we discuss the reasons underlying the negative correlation of the six MCM family genes with the eight core histone family genes. Our results reveal that negative correlation is an important potential mechanism that maintains the balance of biological systems by repressing some genes while inducing others. It can thus provide new understanding of gene expression and regulation, the causes of diseases, etc.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
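What "negatively correlated expression" means for two gene groups can be made concrete with an ordinary Pearson correlation. This is explicitly not the NCFCA algorithm from the abstract, whose details are not given here; it is a plain-statistics stand-in, and the function names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / sqrt(var_x * var_y)


def mean_cross_correlation(group_a, group_b):
    """Average Pearson correlation over all gene pairs drawn one from
    each group; strongly negative values suggest opposite tendencies."""
    values = [pearson(a, b) for a in group_a for b in group_b]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up time courses, one list per gene.
    group_a = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 5.0, 6.0]]
    group_b = [[8.0, 6.0, 4.0, 2.0], [9.0, 7.0, 4.0, 3.0]]
    print(mean_cross_correlation(group_a, group_b))
```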
  • 13
    Publication Date: 2016-04-01
    Description: With the development of deep sequencing technologies, a large amount of RNA-Seq data has been generated. Researchers have proposed many methods based on sparse theory to identify the differentially expressed genes from these data. In order to improve the performance of sparse principal component analysis, in this paper, we propose a novel class-information-based sparse component analysis (CISCA) method which introduces the class information via a total scatter matrix. First, CISCA normalizes the RNA-Seq data by using a Poisson model to obtain their differential sections. Second, the total scatter matrix is obtained by combining the between-class and within-class scatter matrices. Third, we decompose the total scatter matrix by using singular value decomposition and construct a new data matrix by using singular values and left singular vectors. Then, aiming at obtaining sparse components, CISCA decomposes the constructed data matrix by solving an optimization problem with sparse constraints on loading vectors. Finally, the differentially expressed genes are identified by using the sparse loading vectors. The results on simulation and real RNA-Seq data demonstrate that our method is effective and suitable for analyzing these data.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
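The second step, combining between-class and within-class scatter into a total scatter matrix, can be sketched in a few lines. The abstract does not say how CISCA combines them, so the code assumes the textbook identity S_t = S_w + S_b with samples stored as rows; all function names and data are hypothetical, and pure Python lists are used to keep the example dependency-free.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equally long rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def add_outer(matrix, vec, weight=1.0):
    """Add weight * vec * vec^T to `matrix` in place."""
    for i, a in enumerate(vec):
        for j, b in enumerate(vec):
            matrix[i][j] += weight * a * b


def scatter_matrices(samples, labels):
    """Return (S_w, S_b, S_t) with S_t = S_w + S_b (assumed combination)."""
    dim = len(samples[0])
    s_w = [[0.0] * dim for _ in range(dim)]
    s_b = [[0.0] * dim for _ in range(dim)]
    global_mean = mean_vector(samples)

    groups = {}
    for x, y in zip(samples, labels):
        groups.setdefault(y, []).append(x)

    for rows in groups.values():
        class_mean = mean_vector(rows)
        for x in rows:   # spread of samples around their own class mean
            add_outer(s_w, [a - m for a, m in zip(x, class_mean)])
        # spread of the class mean around the global mean, weighted by class size
        add_outer(s_b, [m - g for m, g in zip(class_mean, global_mean)],
                  weight=len(rows))

    s_t = [[w + b for w, b in zip(rw, rb)] for rw, rb in zip(s_w, s_b)]
    return s_w, s_b, s_t


def total_scatter_direct(samples):
    """Total scatter computed directly around the global mean."""
    dim = len(samples[0])
    s = [[0.0] * dim for _ in range(dim)]
    g = mean_vector(samples)
    for x in samples:
        add_outer(s, [a - m for a, m in zip(x, g)])
    return s


if __name__ == "__main__":
    samples = [[1.0, 2.0], [2.0, 1.0], [5.0, 6.0], [6.0, 7.0]]
    labels = [0, 0, 1, 1]
    print(scatter_matrices(samples, labels)[2])
```

Comparing the result against `total_scatter_direct` is a quick sanity check that the two scatter terms were accumulated correctly.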
  • 14
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: Scaling linkage disequilibrium (LD) based haplotype block recognition to the entire human genome has always been a challenge. The best-known algorithm has quadratic runtime complexity and, even when sophisticated search space pruning is applied, still requires several days of computation. Here, we propose a novel sampling-based algorithm, called S-MIG$^{++}$, where the main idea is to estimate the area that most likely contains all haplotype blocks by sampling a very small number of SNP pairs. A subsequent refinement step computes the exact blocks by considering only the SNP pairs within the estimated area. This approach significantly reduces the number of computed LD statistics, making the recognition of haplotype blocks very fast. We theoretically and empirically prove that the area containing all haplotype blocks can be estimated with a very high degree of certainty. Through experiments on the 243,080 SNPs on chromosome 20 from the 1000 Genomes Project, we compared our previous algorithm MIG$^{++}$ with the new S-MIG$^{++}$ and observed a runtime reduction from 2.8 weeks to 34.8 hours. In a parallelized version of the S-MIG$^{++}$ algorithm using 32 parallel processes, the runtime was further reduced to 5.1 hours.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
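The cost being reduced here is the number of pairwise LD statistics. As a rough illustration of what one such statistic involves, the sketch below computes D, |D'| and r² for a single SNP pair from phased haplotypes. The abstract does not state which statistic the algorithm relies on, so the choice of these three is an assumption, and the function name and the 0/1 allele coding are hypothetical.

```python
def ld_stats(haplotypes):
    """Compute (D, |D'|, r^2) for one SNP pair.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    with alleles coded 0 or 1 at the first and second SNP.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq. of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n            # freq. of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    if p_a in (0.0, 1.0) or p_b in (0.0, 1.0):
        raise ValueError("LD is undefined for a monomorphic SNP")

    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))

    # Normalise D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    return d, d_prime, r2


if __name__ == "__main__":
    linked = [(1, 1), (1, 1), (0, 0), (0, 0)]          # toy data: perfect LD
    print(ld_stats(linked))
```

Evaluating this for every pair among hundreds of thousands of SNPs is what makes the exhaustive approach quadratic; sampling only a small subset of pairs, as the abstract describes, avoids most of those evaluations.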
  • 15
    Publication Date: 2016-02-09
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 16
    Publication Date: 2016-02-09
    Description: A hybrid framework composed of two stages for gene selection and classification of DNA microarray data is proposed. In the first stage, five traditional statistical methods are combined for preliminary gene selection (Multiple Fusion Filter). Then, different relevant gene subsets are selected by using an embedded Genetic Algorithm (GA), Tabu Search (TS), and Support Vector Machine (SVM). A gene subset consisting of the most relevant genes is obtained from this process by analyzing the frequency of each gene in the different gene subsets. Finally, the most frequent genes are evaluated by the embedded approach to obtain a final, small, relevant gene subset with high performance. The proposed method is tested on four DNA microarray datasets. From the simulation study, it is observed that the proposed approach works better than other methods reported in the literature.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
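The frequency analysis step, picking the genes that recur most often across the selected subsets, is simple enough to sketch directly. The GA/TS/SVM stage that produces the subsets is not reproduced here; the subsets, gene names, function name, and `top_k` cut-off below are all hypothetical.

```python
from collections import Counter


def most_frequent_genes(gene_subsets, top_k):
    """Rank genes by the number of subsets they appear in and return the
    `top_k` most frequent ones (ties broken alphabetically)."""
    counts = Counter()
    for subset in gene_subsets:
        counts.update(set(subset))      # count each gene once per subset
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Made-up subsets standing in for the output of repeated selection runs.
    subsets = [
        ["g1", "g2", "g3"],
        ["g2", "g3"],
        ["g3", "g4"],
    ]
    print(most_frequent_genes(subsets, top_k=2))
```

The alphabetical tie-break is only there to make the ranking deterministic; any stable rule would serve.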
  • 17
    Publication Date: 2016-02-09
    Description: A gene involved in complex regulatory interactions may have multiple regulators since gene expression in such interactions is often controlled by more than one gene. Another thing that makes gene regulatory interactions complicated is that regulatory interactions are not static, but change over time during the cell cycle. Most research so far has focused on identifying gene regulatory relations between individual genes in a particular stage of the cell cycle. In this study we developed a method for identifying dynamic gene regulations of several types from the time-series gene expression data. The method can find gene regulations with multiple regulators that work in combination or individually as well as those with single regulators. The method has been implemented as the second version of GeneNetFinder (hereafter called GeneNetFinder2) and tested on several gene expression datasets. Experimental results with gene expression data revealed the existence of genes that are not regulated by individual genes but rather by a combination of several genes. Such gene regulatory relations cannot be found by conventional methods. Our method finds such regulatory relations as well as those with multiple, independent regulators or single regulators, and represents gene regulatory relations as a dynamic network in which different gene regulatory relations are shown in different stages of the cell cycle. GeneNetFinder2 is available at http://bclab.inha.ac.kr/GeneNetFinder and will be useful for modeling dynamic gene regulations with multiple regulators.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 18
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-02-09
    Description: Profiling cancer molecules has several advantages; however, using microarray technology in routine clinical diagnostics is challenging for physicians. The classification of microarray data has two main limitations: 1) the data set is unreliable for building classifiers; and 2) the classifiers exhibit poor performance. Current microarray classification algorithms typically yield a high rate of false-positive cases, which is unacceptable in diagnostic applications. Numerous algorithms have been developed to detect false-positive cases; however, they require considerable computation time. To address this problem, this study enhanced a previously proposed gene expression graph (GEG)-based classifier to shorten the computation time. The modified classifier filters genes by using an edge weight to determine their significance, thereby facilitating accurate comparison and classification. This study experimentally compared the proposed classifier with a GEG-based classifier by using real data and benchmark tests. The results show that the proposed classifier is faster at detecting false positives.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 19
    Publication Date: 2016-02-09
    Description: Recent developments in high-throughput technologies for measuring protein-protein interaction (PPI) have profoundly advanced our ability to systematically infer protein function and regulation. However, inherently high false positive and false negative rates in measurement have posed great challenges for computational approaches to PPI prediction. A good PPI predictor should be 1) resistant to a high rate of missing and spurious PPIs, and 2) robust against incompleteness of observed PPI networks. To predict PPIs in a network, we developed an intrinsic geometry structure (IGS) for the network, which exploits the intrinsic and hidden relationships among proteins in the network through a heat diffusion process. In this process, all explicit PPIs participate simultaneously to glue local infinitesimal and noisy experimental interaction data together and generate a global macroscopic description of the relationships among proteins. The revealed implicit relationship can be interpreted as the probability of two proteins interacting with each other. The revealed relationship is intrinsic and robust against individual, local and explicit protein interactions in the original network. We apply our approach to publicly available PPI network data to evaluate the performance of PPI prediction. Experimental results indicate that, under different levels of missing and spurious PPIs, IGS is able to robustly exploit the intrinsic and hidden relationships for PPI prediction with a higher sensitivity and specificity than recently proposed methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 20
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-02-09
    Description: Finding approximately conserved sequences, called motifs, across multiple DNA or protein sequences is an important problem in computational biology. In this paper, we consider the $(l,d)$ motif search problem of identifying one or more motifs of length $l$ present in at least $q$ of the $n$ given sequences, with each occurrence differing from the motif in at most $d$ substitutions. The problem is known to be NP-complete, and the largest solved instance reported to date is $(26,11)$. We propose a novel algorithm for the $(l,d)$ motif search problem using streaming execution over a large set of non-deterministic finite automata (NFA). This solution is designed to take advantage of the Micron Automata Processor, a new technology close to deployment that can execute multiple NFA in parallel. We demonstrate the capability for solving much larger instances of the $(l,d)$ motif search problem using the resources available within a single automata processor board, by estimating run-times for problem instances $(39,18)$ and $(40,17)$. The paper serves as a useful guide to solving problems using this new accelerator technology.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
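The $(l,d)$ motif condition stated in this abstract can be illustrated with a small brute-force check. This is only a sketch of the problem definition, not the NFA-based algorithm of the paper; the function names (`hamming`, `occurs_within`, `is_ld_motif`) and the toy sequences are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, sequence, d):
    """True if some window of the sequence differs from the motif
    in at most d substitutions."""
    l = len(motif)
    return any(
        hamming(motif, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def is_ld_motif(motif, sequences, d, q):
    """True if the motif occurs (within d substitutions) in at least q
    of the given sequences."""
    hits = sum(occurs_within(motif, s, d) for s in sequences)
    return hits >= q


# Hypothetical toy input: "ACGT" is exact in the first sequence and one
# substitution away ("ACCT") in the second; the third has no close window.
sequences = ["TTACGTAA", "GGACCTCC", "AAAAAAAA"]
found = is_ld_motif("ACGT", sequences, d=1, q=2)
```

Checking one candidate is cheap; the difficulty of the full problem comes from the number of candidate motifs of length $l$ that must be considered.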
  • 21
    Publication Date: 2016-02-09
    Description: This paper deals with the problem of global asymptotic stability for nonnegative equilibrium points of genetic regulatory networks (GRNs) with mixed delays (i.e., time-varying discrete delays and constant distributed delays). Up to now, all existing stability criteria for equilibrium points of this kind of GRN have been in the form of linear matrix inequalities (LMIs). In this paper, Brouwer's fixed point theorem is employed to obtain sufficient conditions under which the GRNs under consideration have at least one nonnegative equilibrium point. Then, by using nonsingular M-matrix theory and functional differential equation theory, M-matrix-based sufficient conditions are proposed to guarantee that the GRNs under consideration have a unique nonnegative equilibrium point which is globally asymptotically stable. The M-matrix-based sufficient conditions derived here amount to checking whether a constant matrix is a nonsingular M-matrix, which can be easily verified, as there are many equivalent characterizations of nonsingular M-matrices. So, in terms of computational complexity, the M-matrix-based stability criteria established in this paper are superior to the LMI-based ones in the literature. To illustrate the effectiveness of the approach proposed in this paper, several numerical examples and their simulations are given.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
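The abstract notes that testing whether a constant matrix is a nonsingular M-matrix is easy. A minimal sketch of one standard equivalent test follows: a matrix with nonpositive off-diagonal entries (a Z-matrix) is a nonsingular M-matrix exactly when all its leading principal minors are positive, which holds exactly when Gaussian elimination without pivoting produces only positive pivots. The function name and example matrices are assumptions, and the specific matrix the paper builds from the GRN parameters is not reproduced here.

```python
from fractions import Fraction


def is_nonsingular_m_matrix(matrix):
    """Check the Z-matrix sign pattern, then test that every pivot of
    Gaussian elimination without row exchanges is positive (equivalent to
    all leading principal minors being positive)."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]

    # Z-matrix condition: off-diagonal entries must be nonpositive.
    for i in range(n):
        for j in range(n):
            if i != j and a[i][j] > 0:
                return False

    # Pivot k equals (k-th leading minor) / ((k-1)-th leading minor).
    for k in range(n):
        if a[k][k] <= 0:
            return False
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return True


# Hypothetical examples for illustration only.
stable_like = [[2, -1], [-1, 2]]      # leading minors 2 and 3
not_m_matrix = [[1, -2], [-2, 1]]     # determinant is -3
```

Exact rational arithmetic via `Fraction` avoids sign errors from rounding; for large floating-point matrices a tolerance would be needed instead.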
  • 22
    Publication Date: 2016-02-09
    Description: In recent years, thanks to the efforts of individual scientists and research consortia, a huge amount of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experimental data has been accumulated. Several recent studies have convincingly demonstrated that a wealth of scientific insights can be gained by analyzing these ChIP-seq data integratively rather than independently. However, for the purpose of integrative analysis, a serious drawback of the current ChIP-seq technique is that generating high-quality ChIP-seq datasets is still expensive and time-consuming. Most researchers are therefore unable to obtain complete ChIP-seq data for several TFs in a wide variety of cell lines, which considerably limits the understanding of transcriptional regulation patterns. In this paper, we propose a novel method called ChIP-PIT to overcome this limitation. In ChIP-PIT, ChIP-seq data corresponding to a diverse collection of cell types, TFs and genes are fused together using the three-mode pair-wise interaction tensor (PIT) model, and the prediction of unperformed ChIP-seq experimental results is formulated as a tensor completion problem. Computationally, we propose an efficient first-order method based on extensions of the coordinate descent method to learn the optimal solution of ChIP-PIT, which makes it particularly suitable for the analysis of massive-scale ChIP-seq data. Experimental evaluation on the ENCODE data illustrates the usefulness of the proposed model.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 23
    Publication Date: 2016-02-09
    Description: Cervical cancer is the third most common malignancy in women worldwide. It remains a leading cause of cancer-related death for women in developing countries. To contribute to the treatment of cervical cancer, we try to find a few key genes that give rise to it. Using several bioinformatics tools, we selected 143 differentially expressed genes (DEGs) associated with cervical cancer. The results of bioinformatics analysis show that these DEGs play important roles in the development of cervical cancer. By comparing two differential co-expression networks (DCNs) at two different states, we found a common sub-network and two differential sub-networks, as well as some hub genes in the three sub-networks. Moreover, some of the hub genes have been reported to be related to cervical cancer. Those hub genes were analyzed from three aspects: Gene Ontology function enrichment, pathway enrichment and protein binding. The results can help us understand the development of cervical cancer and guide further experiments on it.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 24
    Publication Date: 2016-02-09
    Description: Protein sub-cellular localization prediction has attracted much attention in recent years because of its importance for protein function studies and targeted drug discovery, which makes it an important research field in bioinformatics. Traditional experimental methods for ascertaining protein sub-cellular locations are costly and time-consuming. In the last two decades, machine learning methods have developed rapidly, and a large number of machine learning-based protein sub-cellular location predictors have been developed. However, most such predictors can only handle proteins with a single sub-cellular location. With the development of biological techniques, more and more proteins with two or more sub-cellular locations have been found. Studying such proteins is especially significant because they have extremely useful implications for both basic biology and bioinformatics research. In order to improve the accuracy of prediction, more feature information that can represent the protein sequence should be extracted. In this paper, several feature extraction methods were fused together to extract the feature information, and the multi-label k nearest neighbors (ML-KNN) algorithm was then used to predict protein sub-cellular locations. The best overall accuracies we obtained were 66.7304 percent for dataset s1 in constructing Gpos-mPLoc and 59.9206 percent for dataset s2 in constructing Virus-mPLoc.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 25
    Publication Date: 2016-02-09
    Description: Human Serum Albumin (HSA) has been suggested to be an alternate biomarker to the existing Hemoglobin-A1c (HbA1c) marker for glycemic monitoring. Development and usage of HSA as an alternate biomarker requires the identification of glycation sites, or equivalently, glucose-binding pockets. In this work, we combine molecular dynamics simulations of HSA and the state-of-the-art machine learning method Support Vector Machine (SVM) to predict glucose-binding pockets in HSA. The SVM uses the three-dimensional arrangement of atoms and their chemical properties to predict the glucose-binding ability of a pocket. Feature selection reveals that the arrangement of atoms and their chemical properties within the first 4 Å from the centroid of the pocket play an important role in the binding of glucose. With a 10-fold cross-validation accuracy of 84 percent, our SVM model reveals seven new potential glucose-binding sites in HSA, of which two are exposed only during the dynamics of HSA. The predictions are further corroborated using docking studies. These findings can complement studies directed towards the development of HSA as an alternate biomarker for glycemic monitoring.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 26
    Publication Date: 2016-02-09
    Description: Modeling gene regulatory networks (GRNs) is essential for conceptualizing how genes are expressed and how they influence each other. Typically, a reverse engineering approach is employed; this strategy is effective in reproducing possible fitting models of GRNs. To use this strategy, however, two daunting tasks must be undertaken: one task is to optimize the accuracy of inferred network behaviors; and the other task is to designate valid biological topologies for target networks. Although existing studies have addressed these two tasks for years, few of the studies can satisfy both of the requirements simultaneously. To address these difficulties, we propose an integrative modeling framework that combines knowledge-based and data-driven input sources to construct biological topologies with their corresponding network behaviors. To validate the proposed approach, a real dataset collected from the cell cycle of the yeast S. cerevisiae is used. The results show that the proposed framework can successfully infer solutions that meet the requirements of both the network behaviors and biological structures. Therefore, the outcomes are exploitable for future in vivo experimental design.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 27
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-02-09
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 28
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-02-09
    Description: Computational methods to engineer cellular metabolism promise to play a critical role in producing pharmaceuticals, repairing defective genes, destroying cancer cells, and generating biofuels. Elementary Flux Mode (EFM) analysis is one such powerful technique that has elucidated cell growth and regulation, predicted product yield, and analyzed network robustness. EFM analysis, however, is a computationally daunting task because it requires the enumeration of all independent and stoichiometrically balanced pathways within a cellular network. We present in this paper an EFM enumeration algorithm, termed graphical EFM or gEFM. The algorithm is based on graph traversal, an approach previously assumed unsuitable for enumerating EFMs. The approach is derived from a pathway synthesis method proposed by Mavrovouniotis et al. The algorithm is described and proved correct. We apply gEFM to several networks and report runtimes in comparison with other EFM computation tools. We show how gEFM benefits from network compression. Like other EFM computational techniques, gEFM is sensitive to constraint ordering; however, we are able to demonstrate that knowledge of the underlying network structure leads to better constraint ordering. gEFM is shown to be competitive with state-of-the-art EFM computational techniques for several networks, but less so for networks with a larger number of EFMs.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 29
    Publication Date: 2016-02-09
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 30
    Publication Date: 2016-02-09
    Description: Research on similarity searching of cheminformatic data sets has been focused on similarity measures using fingerprints. However, nominal scales are the least informative of all metric scales, which increases the number of tied similarity scores and decreases the effectiveness of retrieval engines. Tanimoto's coefficient has been claimed to be the most prominent measure for this task. Nevertheless, this field is far from being exhausted, since the no-free-lunch theorem from computer science predicts that “no similarity measure has overall superiority over the population of data sets”. We introduce 12 relational agreement (RA) coefficients for seven metric scales, which are integrated within a group fusion-based similarity searching algorithm. These similarity measures are compared to a reference panel of 21 proximity quantifiers over 17 benchmark data sets (MUV), by using informative descriptors, a feature selection stage, a suitable performance metric, and powerful comparison tests. In this stage, RA coefficients perform favourably with respect to the state-of-the-art proximity measures. Afterward, the RA-based method outperforms four other nearest neighbor searching algorithms over the same data domains. In a third validation stage, RA measures are successfully applied to the virtual screening of the NCI data set. Finally, we discuss a possible molecular interpretation for these similarity variants.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
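For readers unfamiliar with the baseline this abstract mentions, Tanimoto's coefficient on binary fingerprints is the size of the intersection of the "on" bits divided by the size of their union. The sketch below shows only that baseline, not the RA coefficients proposed in the paper; representing a fingerprint as a set of bit indices and returning 1.0 for two empty fingerprints are assumptions made here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints,
    each given as a set of the indices of bits that are set."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        # Convention assumed here: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical fingerprints: two shared bits out of four distinct bits.
similarity = tanimoto({1, 2, 3}, {2, 3, 4})
```

Because the score depends only on counts of shared and distinct bits, many compound pairs receive the same value, which is the tied-score problem the abstract describes for nominal scales.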
  • 31
    Publication Date: 2016-02-09
    Description: Many single nucleotide polymorphisms (SNPs) for complex genetic diseases are genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) in small-scale basic research studies. Designing a feasible PCR-RFLP primer pair and finding available restriction enzymes that recognize the target SNP are essential steps for PCR experiments. However, many SNPs cannot be genotyped by PCR-RFLP, which makes SNP genotyping impractical. A genetic algorithm (GA) has been proposed for designing mutagenic primers and finding available restriction enzymes, but it gives unrefined solutions for mutagenic primers. To improve mutagenic primer design, we propose TLBOMPD (TLBO-based Mutagenic Primer Design), a novel computational intelligence-based method that uses the notion of “teaching and learning” to search for more feasible mutagenic primers and provide the latest available restriction enzymes. The original Wallace's formula for the calculation of melting temperature is maintained, and more accurate calculation formulas of GC-based melting temperature and thermodynamic melting temperature are introduced into the proposed method. A mutagenic matrix is also retained to increase the efficiency of judging whether a hypothetical mutagenic primer involves available restriction enzymes for recognizing the target SNP. Furthermore, the core of SNP-RFLPing version 2 is used to enhance the mining of restriction enzymes based on the latest REBASE. Twenty-five SNPs with mismatch PCR-RFLP screened from 288 SNPs in the human SLC6A4 gene are used to appraise TLBOMPD. The computational results are also compared with those of GAMPD. In the future, the usage of the mutagenic primers in the wet lab needs to be validated carefully to increase the reliability of the method. TLBOMPD is implemented in Java and is freely available at http://tlbompd.googlecode.com/.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
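Wallace's formula, which this abstract says is retained in TLBOMPD, estimates the melting temperature of a short primer as 2 °C per A or T plus 4 °C per G or C. The sketch below shows only that rule; the function name is an assumption, and the GC-based and thermodynamic formulas mentioned in the abstract are not reproduced.

```python
def wallace_tm(primer):
    """Wallace rule melting temperature in degrees Celsius:
    Tm = 2 * (A + T) + 4 * (G + C), counting bases in the primer."""
    seq = primer.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in primer: {sorted(unknown)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 4-mer: two A/T bases and two G/C bases.
tm = wallace_tm("ATGC")
```

The rule is a rough approximation intended for short oligonucleotides, which is consistent with the abstract's note that more accurate formulas are added alongside it.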
  • 32
    Publication Date: 2016-04-01
    Description: Gene Ontology (GO) is a structured repository of concepts (GO Terms) that are associated with one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. Among the different approaches to such analysis, the use of association rules (AR) provides useful knowledge by discovering biologically relevant, previously unknown associations between GO terms. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. Here we adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state-of-the-art approaches.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 33
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-04-01
    Description: Transcription factor binding sites (TFBSs) are relatively short (5-15 bp) and degenerate. Identifying them is a computationally challenging task. In particular, protein binding microarray (PBM) is a high-throughput platform that can measure the DNA binding preference of a protein in a comprehensive and unbiased manner; for instance, a typical PBM experiment can measure binding signal intensities of a protein to all possible DNA k-mers ($k = 8 \sim 10$). Since proteins can often bind to DNA with different binding intensities, one of the major challenges is to build TFBS (also known as DNA motif) models which can fully capture the quantitative binding affinity data. To learn DNA motif models from the non-convex objective function landscape, several optimization methods are compared and applied to the PBM motif model building problem. In particular, representative methods from different optimization paradigms have been chosen for modeling performance comparison on hundreds of PBM datasets. The results suggest that the multimodal optimization methods are very effective for capturing the binding preference information from PBM data. In particular, we observe a general performance improvement if choosing di-nucleotide modeling over mono-nucleotide modeling. In addition, the models learned by the best-performing method are applied to two independent applications: PBM probe rotation testing and ChIP-Seq peak sequence prediction, demonstrating its biological applicability.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 34
    Publication Date: 2016-04-05
    Description: Automated analysis of microscopic images, such as protein crystallization images and cellular images, is an important research area. If objects in a scene appear at different depths with respect to the camera's focal point, objects outside the depth of field usually appear blurred. Therefore, scientists capture a collection of images with different depths of field. Focal stacking is a technique for creating a single focused image from a stack of images collected with different depths of field. In this paper, we introduce a novel focal stacking technique, FocusALL, which is based on our modified Harris Corner Response Measure. We also propose enhanced FocusALL for application to images collected under high resolution and varying illumination. FocusALL resolves problems related to the assumption that focus regions have high contrast and high intensity. In particular, FocusALL generates sharper boundaries around protein crystal regions and good in-focus images for high-resolution images in reasonable time. FocusALL outperforms other methods on protein crystallization images and performs comparably well on other datasets such as retinal epithelial images and simulated datasets.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 35
    Publication Date: 2016-08-09
    Description: The problem of finding all reversals that take a permutation one step closer to a target permutation is called the all sorting reversals problem (the ASR problem). For this problem, Siepel had an $O(n^3)$-time algorithm. Most complications of his algorithm stem from some peculiar structures called bad components. Since bad components are very rare in both real and simulated data, it is practical to study the ASR problem with no bad components. For the ASR problem with no bad components, Swenson et al. gave an $O(n^2)$-time algorithm. Very recently, Swenson found that their algorithm does not always work. In this paper, a new algorithm is presented for the ASR problem with no bad components. The time complexity is $O(n^2)$ in the worst case and is linear in the size of input and output in practice.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
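The ASR problem defined in this abstract can be made concrete with a brute-force sketch for very small signed permutations. It computes exact reversal distances to the identity by breadth-first search over all signed permutations and then lists every reversal that lowers the distance by one. This is exponential and is not the algorithm of the paper, of Siepel, or of Swenson et al.; the signed-permutation model, the identity as target, and all names are assumptions made for illustration.

```python
from collections import deque
from itertools import combinations_with_replacement


def apply_reversal(perm, i, j):
    """Reverse perm[i..j] (0-based, inclusive) and flip each element's sign."""
    segment = tuple(-x for x in reversed(perm[i:j + 1]))
    return perm[:i] + segment + perm[j + 1:]


def reversal_distances(n):
    """Breadth-first search from the identity; since a reversal is its own
    inverse, the result is the reversal distance of every signed
    permutation of size n to the identity."""
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        current = queue.popleft()
        for i, j in combinations_with_replacement(range(n), 2):
            nxt = apply_reversal(current, i, j)
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def all_sorting_reversals(perm):
    """All (i, j) whose reversal brings perm one step closer to the identity."""
    perm = tuple(perm)
    n = len(perm)
    dist = reversal_distances(n)
    return [
        (i, j)
        for i, j in combinations_with_replacement(range(n), 2)
        if dist[apply_reversal(perm, i, j)] == dist[perm] - 1
    ]


# Hypothetical example: only flipping the middle element sorts (1, -2, 3).
moves = all_sorting_reversals((1, -2, 3))
```

The search space has $2^n n!$ states, so this is usable only for tiny $n$; it serves as a reference against which a polynomial-time method could be checked.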
  • 36
    Publication Date: 2016-08-09
    Description: Detecting functional modules from a Protein-Protein Interaction (PPI) network is a fundamental and active topic in proteomics research, where many computational approaches have played an important role in recent years. However, how to effectively and efficiently detect functional modules in large-scale PPI networks is still a challenging problem. We present a new framework, based on a multiple-grain model of PPI networks, to detect functional modules in PPI networks. First, we give a multiple-grain representation model of a PPI network, which has a smaller scale with super nodes. Next, we design the protein grain partitioning method, which employs a functional similarity or a structural similarity to merge some proteins layer by layer. Third, a refining mechanism with border node tests is proposed to address the protein overlapping of different modules during the grain eliminating process. Finally, systematic experiments are conducted on five large-scale yeast and human networks. The results show that the framework not only significantly reduces the running time of functional module detection, but also effectively identifies overlapping modules while maintaining competitive performance; it is thus well suited to detecting functional modules in large-scale PPI networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 37
    Publication Date: 2016-08-09
    Description: Read mapping is a key task in next-generation sequencing (NGS) data analysis. To achieve an optimal combination of accuracy, speed, and low memory footprint, popular mapping tools often focus on identifying one or a few best mapping locations for each read. However, for many downstream analyses, such as prediction of genomic variants or protein binding motifs located in repeat regions, isoform expression quantification, and metagenomics analysis, it is more desirable to have a comprehensive set of all possible mapping locations of NGS reads. In this paper, we introduce AMAS, a read mapping tool that exhaustively searches for possible mapping locations of NGS reads in a reference sequence within a given edit distance. AMAS features improvements in the mapping, partition, and filtration of adaptive seeds to speed up read mapping. Performance results on simulated and real data sets show that AMAS runs several times faster than other state-of-the-art read mappers while achieving similar sensitivity and accuracy. AMAS is implemented in C++ and is freely available at https://sourceforge.net/projects/ngsamas/ .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
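The task described in this abstract, reporting every location where a read matches the reference within a given edit distance, can be illustrated with a minimal Python sketch. It uses the textbook semi-global dynamic programming recurrence (often attributed to Sellers) and is not the seed-and-filter method of AMAS; the function name `approx_match_ends` and its parameters are hypothetical and chosen only for illustration.

```python
def approx_match_ends(read, reference, k):
    """Return (end_position, distance) pairs for every 1-based end position in
    `reference` where some substring ending there is within edit distance k
    of `read`. Runs in O(len(read) * len(reference)) time."""
    m = len(read)
    # Column for the empty reference prefix: aligning read[:i] costs i deletions.
    prev = list(range(m + 1))
    ends = []
    for j, base in enumerate(reference, start=1):
        # cur[0] = 0 lets a match start at any reference position.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if read[i - 1] == base else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # skip a reference base
                         cur[i - 1] + 1)      # skip a read base
        if cur[m] <= k:
            ends.append((j, cur[m]))
        prev = cur
    return ends


if __name__ == "__main__":
    print(approx_match_ends("ACGT", "TTACGTTT", 0))
```

An exhaustive scan like this visits every reference position for every read, which is why practical all-mappers rely on seeding and filtration to discard most positions before any alignment is computed.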
  • 38
    Publication Date: 2016-08-09
    Description: In biomedical text mining tasks, distributed word representations have succeeded in capturing semantic regularities, but most of them are shallow-window-based models, which are not sufficient for expressing the meaning of words. To represent words using deeper information, we make explicit the semantic regularity that emerges in word relations, including dependency relations and context relations, and propose a novel architecture for computing continuous vector representations by leveraging those relations. The performance of our model is measured on a word analogy task and a Protein-Protein Interaction Extraction (PPIE) task. Experimental results show that our method performs better overall than other word representation models on the word analogy task and has many advantages in biomedical text mining.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 39
    Publication Date: 2016-08-09
    Description: Ductal Carcinoma In Situ (DCIS) is a precursor lesion of Invasive Ductal Carcinoma (IDC) of the breast. Investigating its temporal progression could provide fundamental new insights for the development of better diagnostic tools to predict which cases of DCIS will progress to IDC. We investigate the problem of reconstructing a plausible progression from single-cell sampled data of an individual with synchronous DCIS and IDC. Specifically, by using a number of assumptions derived from the observation of cellular atypia occurring in IDC, we design a possible predictive model using integer linear programming (ILP). Computational experiments carried out on a preexisting data set of 13 patients with simultaneous DCIS and IDC show that the corresponding predicted progression models are classifiable into categories having specific evolutionary characteristics. The approach provides new insights into mechanisms of clonal progression in breast cancers and helps illustrate the power of the ILP approach for similar problems in reconstructing tumor evolution scenarios under complex sets of constraints.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 40
    Publication Date: 2016-08-09
    Description: Recurrent copy number aberrations (RCNAs) in multiple cancer samples are strongly associated with tumorigenesis, and RCNA discovery is helpful to cancer research and treatment. Despite the emergence of numerous RCNA discovery methods, most of them are unable to detect RCNAs in complex patterns that are influenced by complicating factors, including aberrations in only part of the samples, the co-existence of gains and losses, and normal-like tumor samples. Here, we propose a novel computational method, called non-negative sparse singular value decomposition (NN-SSVD), to address the RCNA discovery problem in complex patterns. In NN-SSVD, the measurement of RCNA is based on the aberration frequency in a part of the samples rather than in all samples, which can circumvent the complexity of different RCNA patterns. We evaluate NN-SSVD on a synthetic dataset by comparing detection scores and Receiver Operating Characteristic curves, and the results show that NN-SSVD outperforms existing methods in RCNA discovery and demonstrates more robustness to RCNA complicating factors. Applying our approach to a breast cancer dataset, we successfully identify a number of genomic regions that are strongly correlated with those reported in previous studies and that harbor a number of known breast cancer associated genes.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 41
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-08-09
    Description: In this paper, we survey algorithms that perform global alignment of networks or graphs. Global network alignment aligns two or more given networks to find the best mapping from nodes in one network to nodes in other networks. Since graphs are a common method of data representation, graph alignment has become important with many significant applications. Protein-protein interactions can be modeled as networks and aligning these networks of protein interactions has many applications in biological research. In this survey, we review algorithms for global pairwise alignment highlighting various proposed approaches, and classify them based on their methodology. Evaluation metrics that are used to measure the quality of the resulting alignments are also surveyed. We discuss and present a comparison between selected aligners on the same datasets and evaluate using the same evaluation metrics. Finally, a quick overview of the most popular databases of protein interaction networks is presented focusing on datasets that have been used recently.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 42
    Publication Date: 2016-08-09
    Description: Membrane proteins play important roles in various biological processes within organisms. Predicting the functional types of membrane proteins is indispensable to the characterization of membrane proteins. Recent studies have extended to predicting single- and multi-type membrane proteins. However, existing predictors perform poorly and, more importantly, they often lack interpretability. To address these problems, this paper proposes an efficient predictor, namely Mem-mEN, which can produce sparse and interpretable solutions for predicting membrane proteins with single- and multi-label functional types. Given a query membrane protein, its associated gene ontology (GO) information is retrieved by searching a compact GO-term database with its homologous accession number, which is subsequently classified by a multi-label elastic net (EN) classifier. Experimental results show that Mem-mEN significantly outperforms existing state-of-the-art membrane-protein predictors. Moreover, by using Mem-mEN, 338 out of more than 7,900 GO terms are found to play more essential roles in determining the functional types. Based on these 338 essential GO terms, Mem-mEN can not only predict the functional type of a membrane protein, but also explain why it belongs to that type. For the reader's convenience, the Mem-mEN server is available online at http://bioinfo.eie.polyu.edu.hk/MemmENServer/ .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 43
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-08-09
    Description: Prediction of protein coding regions is an important topic in the field of genomic sequence analysis. Several spectrum-based techniques for the prediction of protein coding regions have been proposed. However, the outstanding issue in most of the proposed techniques is that these techniques depend on an experimentally-selected, predefined value of the window length. In this paper, we propose a new Wide-Range Wavelet Window (WRWW) method for the prediction of protein coding regions. The analysis of the proposed wavelet window shows that its frequency response can adapt its width to accommodate the change in the window length so that it can allow or prevent frequencies other than the basic frequency in the analysis of DNA sequences. This feature makes the proposed window capable of analyzing DNA sequences with a wide range of the window lengths without degradation in the performance. The experimental analysis of applying the WRWW method and other spectrum-based methods to five benchmark datasets has shown that the proposed method outperforms other methods along a wide range of the window lengths. In addition, the experimental analysis has shown that the proposed method is dominant in the prediction of both short and long exons.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 44
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-08-09
    Description: Blebbing is an important biological indicator in determining the health of human embryonic stem cells (hESC). In particular, the areas of a bleb sequence in a video are often used to distinguish two cell blebbing behaviors in hESC: dynamic and apoptotic blebbing. This paper analyzes various segmentation methods for bleb extraction in hESC videos and introduces a bio-inspired score function to improve the performance of bleb extraction. Full bleb formation consists of bleb expansion and retraction. Blebs change their size and image properties dynamically in both processes and between frames. Therefore, adaptive parameters are needed for each segmentation method. A score function derived from the change of bleb area and orientation between consecutive frames is proposed, which provides adaptive parameters for bleb extraction in videos. In comparison to manual analysis, the proposed method provides an automated, fast, and accurate approach for bleb sequence extraction.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 45
    Publication Date: 2016-08-09
    Description: Chemotaxis is the biological phenomenon in which organisms move to a more favorable location in an environment with a chemical attractant or repellent. Since chemotaxis is a typical example of the environmental response of organisms, it is a fundamental topic in biology and related fields. We discuss the performance of the internal controllers that generate chemotaxis. We first propose performance indices to evaluate the controllers. Based on these indices, we evaluate the performance of two controller models of Escherichia coli and Paramecium caudatum. The evaluation shows that the E. coli-type controller achieves chemotaxis quickly but roughly, whereas the P. caudatum-type controller achieves it slowly but precisely. This result is a contribution to biology from a control-theoretic point of view.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 46
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-08-09
    Description: This paper addresses robust multiobjective identification of driver nodes in the neuronal network of a cat's brain, in which uncertainties in determination of driver nodes and control gains are considered. A framework for robust multiobjective controllability is proposed by introducing interval uncertainties and optimization algorithms. By appropriate definitions of robust multiobjective controllability, a robust nondominated sorting adaptive differential evolution (NSJaDE) is presented by means of the nondominated sorting mechanism and the adaptive differential evolution (JaDE). The simulation experimental results illustrate the satisfactory performance of NSJaDE for robust multiobjective controllability, in comparison with six statistical methods and two multiobjective evolutionary algorithms (MOEAs): nondominated sorting genetic algorithms II (NSGA-II) and nondominated sorting composite differential evolution. It is revealed that the existence of uncertainties in choosing driver nodes and designing control gains heavily affects the controllability of neuronal networks. We also unveil that driver nodes play a more drastic role than control gains in robust controllability. The developed NSJaDE and obtained results will shed light on the understanding of robustness in controlling realistic complex networks such as transportation networks, power grid networks, biological networks, etc.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 47
    Publication Date: 2016-08-09
    Description: With the growth of high-throughput proteomic data, in particular time series gene expression data from various perturbations, a general question that has arisen is how to organize inherently heterogeneous data into meaningful structures. Since biological systems such as breast cancer tumors respond differently to various treatments, little is known about exactly how these gene regulatory networks (GRNs) operate under different stimuli. Challenges due to the lack of knowledge not only occur in modeling the dynamics of a GRN but also cause bias or uncertainties in identifying parameters or inferring the GRN structure. This paper describes a new algorithm which enables us to estimate the bias error due to the effect of perturbations and correctly identify the common graph structure among biased inferred graph structures. To do this, we retrieve the common dynamics of the GRN subject to various perturbations. We refer to the task as “repairing”, inspired by “image repairing” in computer vision. The method can automatically and correctly repair the common graph structure across perturbed GRNs, even without precise information about the effect of the perturbations. We evaluate the method on synthetic data sets, demonstrate an application to the DREAM data sets, and discuss its implications for experiment design.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 48
    Publication Date: 2016-08-09
    Description: Accurately reconstructing a gene regulatory network (GRN) from gene expression data is a challenging task in systems biology. Although some progress has been made, the performance of GRN reconstruction still has much room for improvement. Because many regulatory events are asynchronous, learning gene interactions with multiple time delays is an effective way to improve the accuracy of GRN reconstruction. Here, we propose a new approach, called Max-Min high-order dynamic Bayesian network (MMHO-DBN), by extending the Max-Min hill-climbing Bayesian network technique originally devised for learning a Bayesian network's structure from static data. Our MMHO-DBN can explicitly model the time lags between regulators and targets in an efficient manner. It first uses constraint-based ideas to limit the space of potential structures, and then applies search-and-score ideas to search for an optimal HO-DBN structure. The performance of MMHO-DBN in GRN reconstruction was evaluated using both synthetic and real gene expression time-series data. Results show that MMHO-DBN is more accurate than current time-delayed GRN learning methods, and has an intermediate computing performance. Furthermore, it is able to learn long time-delayed relationships between genes. We applied sensitivity analysis to our model to study the performance variation across different parameter settings. The result provides hints on the setting of the parameters of MMHO-DBN.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 49
    Publication Date: 2016-08-09
    Description: SEQUEST is a database-searching engine, which calculates the correlation score between an observed spectrum and a theoretical spectrum deduced from protein sequences stored in a flat text file, even though it is not a relational or object-oriented repository. Nevertheless, the SEQUEST score functions fail to discriminate between true and false PSMs accurately. Some approaches, such as PeptideProphet and Percolator, have been proposed to address the task of distinguishing true and false PSMs. However, most of these methods employ time-consuming learning algorithms to validate peptide assignments [1]. In this paper, we propose a fast algorithm for validating peptide identification by incorporating heterogeneous information from SEQUEST scores and peptide digested knowledge. To automate the peptide identification process and incorporate additional information, we employ $\ell_2$ multiple kernel learning (MKL) to implement the current peptide identification task. Results on experimental datasets indicate that compared with state-of-the-art methods, i.e., PeptideProphet and Percolator, our data fusing strategy has comparable performance but reduces the running time significantly.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 50
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-08-09
    Description: Extracting biomedical events from the literature has attracted much attention recently. To date, most state-of-the-art systems have been based on pipelines, which suffer from cascading errors, and words encoded as one-hot vectors are unable to represent semantic information. In this work, joint inference with dual decomposition and novel word embeddings are adopted to address these two problems, respectively. Word embeddings are learnt from large-scale unlabeled texts and integrated as an unsupervised feature with other rich features based on dependency parse graphs to detect triggers and arguments. The proposed system consists of four components: trigger detector, argument detector, joint inference with dual decomposition, and rule-based semantic post-processing, and it outperforms the state-of-the-art systems. On the development set of BioNLP'09, the F-score is 59.77 percent on the primary task, which is 0.96 percent higher than the best system. On the test set of BioNLP'11, the F-score is 56.09 percent, which is 0.89 percent higher than the best published result that does not adopt additional techniques. On the test set of BioNLP'13, the F-score reaches 53.19 percent, which is 2.22 percent higher than the best result.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 51
    Publication Date: 2016-08-09
    Description: Popular tools to evaluate classifier performance are the false positive rate (FPR), true positive rate (TPR), receiver operating characteristic (ROC) curve, and area under the curve (AUC). Typically, these quantities are estimated from training data using simple resampling and counting methods, which have been shown to perform poorly when the sample size is small, as is typical in many applications. This work takes a model-based approach to classifier training and performance analysis, where we assume the true population densities are members of an uncertainty class of distributions. Given a prior over the uncertainty class and data, we form a posterior and derive optimal mean-squared-error (MSE) FPR and TPR estimators, as well as the sample-conditioned MSE performance of these estimators. The theory also naturally leads to optimal ROC and AUC estimators. Finally, we develop a Neyman-Pearson-based approach to optimal classifier design, which maximizes the estimated TPR for a given estimated FPR. These tools are optimal over the uncertainty class of distributions given the sample, and are available in closed form or can be easily approximated for many models. Applications are demonstrated on both synthetic and real genomic data. MATLAB code and simulation results are available in the online supplementary material.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
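For readers unfamiliar with the quantities named in this abstract, the following minimal Python sketch shows the simple counting estimates of FPR, TPR, the ROC curve, and AUC that the abstract treats as the conventional baseline. It does not implement the model-based Bayesian estimators proposed in the paper; the function names `roc_points` and `auc` are hypothetical and used only for illustration.

```python
def roc_points(scores, labels):
    """Counting-based ROC curve: sweep a threshold over the observed scores and
    return (FPR, TPR) points, starting from (0, 0). Labels are 1 (positive)
    or 0 (negative); a sample is predicted positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    pts = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
    print(pts, auc(pts))
```

With only a handful of samples, each point on this curve moves in coarse steps of 1/pos or 1/neg, which gives some intuition for why counting estimates can be unreliable at small sample sizes.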
  • 52
    Publication Date: 2016-08-09
    Description: mRNA translation is a complex process involving the progression of ribosomes on the mRNA, resulting in the synthesis of proteins, and is subject to multiple layers of regulation. This process has been modelled using different formalisms, both stochastic and deterministic. Recently, we introduced a Probabilistic Boolean modelling framework for mRNA translation, which possesses the advantage of tools for numerically exact computation of steady state probability distribution, without requiring simulation. Here, we extend this model to incorporate both random sequential and parallel update rules, and demonstrate its effectiveness in various settings, including its flexibility in accommodating additional static and dynamic biological complexities and its role in parameter sensitivity analysis. In these applications, the results from the model analysis match those of TASEP model simulations. Importantly, the proposed modelling framework maintains the stochastic aspects of mRNA translation and provides a way to exactly calculate probability distributions, providing additional tools of analysis in this context. Finally, the proposed modelling methodology provides an alternative approach to the understanding of the mRNA translation process, by bridging the gap between existing approaches, providing new analysis tools, and contributing to a more robust platform for modelling and understanding translation.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 53
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: A digital microfluidic biochip (DMFB) is an emerging technology that enables miniaturized analysis systems for point-of-care clinical diagnostics, DNA sequencing, and environmental monitoring. A DMFB reduces the rate of sample and reagent consumption, and automates the analysis of assays. In this paper, we provide the first assessment of the security vulnerabilities of DMFBs. We identify result-manipulation attacks on a DMFB that maliciously alter the assay outcomes. Two practical result-manipulation attacks are shown on a DMFB platform performing enzymatic glucose assay on serum. In the first attack, the attacker adjusts the concentration of the glucose sample and thereby modifies the final result. In the second attack, the attacker tampers with the calibration curve of the assay operation. We then identify denial-of-service attacks, where the attacker can disrupt the assay operation by tampering either with the droplet-routing algorithm or with the actuation sequence. We demonstrate these attacks using a digital microfluidic synthesis simulator. The results show that the attacks are easy to implement and hard to detect. Therefore, this work highlights the need for effective protections against malicious modifications in DMFBs.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 54
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: It is challenging to obtain a reliable and optimal mapping between networks for alignment algorithms when both nodal and topological structures are taken into consideration, due to the underlying NP-hard problem. Here, we introduce an adaptive hybrid algorithm that combines the classical Hungarian algorithm and the Greedy algorithm (HGA) for the global alignment of biomolecular networks. With this hybrid algorithm, every pair of nodes with one in each network is first aligned based on node information (e.g., their sequence attributes), followed by an adaptive and convergent iteration procedure for aligning the topological connections in the networks. For four well-studied protein interaction networks, i.e., C. elegans, yeast, D. melanogaster, and human, applications of HGA lead to improved alignments in acceptable running time. The mapping between yeast and human PINs obtained by the new algorithm has the largest value of common gene ontology (GO) terms compared to those obtained by other existing algorithms, while it still has lower mean normalized entropy (MNE) and good performance on several other measures. Overall, the adaptive HGA is effective and capable of providing good mappings between aligned networks in which the biological properties of both the nodes and the connections are important.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
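The greedy ingredient mentioned in this abstract can be illustrated with a minimal Python sketch of one-to-one node matching from pairwise similarity scores. This is a generic greedy matching under assumed inputs, not the HGA procedure itself: it omits the Hungarian step and the iterative topological refinement, and the function name `greedy_alignment` and the score dictionary layout are hypothetical.

```python
def greedy_alignment(similarity):
    """Greedy one-to-one node matching.

    `similarity` maps (node_in_network_a, node_in_network_b) -> score.
    Pairs are taken in order of decreasing score (ties broken by node names),
    and a pair is accepted only if neither node has been matched already."""
    mapping = {}
    used_b = set()
    ranked = sorted(similarity.items(), key=lambda kv: (-kv[1], kv[0]))
    for (a, b), _score in ranked:
        if a not in mapping and b not in used_b:
            mapping[a] = b
            used_b.add(b)
    return mapping


if __name__ == "__main__":
    scores = {("a1", "b1"): 0.9, ("a1", "b2"): 0.8,
              ("a2", "b1"): 0.85, ("a2", "b2"): 0.1}
    print(greedy_alignment(scores))
```

In this toy input the greedy choice pairs a1 with b1 and leaves a2 with b2 for a total score of 1.0, while the assignment a1-b2, a2-b1 totals 1.65. An exact assignment solver such as the Hungarian algorithm finds the latter, which suggests one reason for combining the two strategies.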
  • 55
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: The following decade will witness a surge in remote health-monitoring systems that are based on body-worn monitoring devices. These Medical Cyber Physical Systems (MCPS) will be capable of transmitting the acquired data to a private or public cloud for storage and processing. Machine learning algorithms running in the cloud and processing this data can provide decision support to healthcare professionals. There is no doubt that the security and privacy of the medical data is one of the most important concerns in designing an MCPS. In this paper, we depict the general architecture of an MCPS consisting of four layers: data acquisition, data aggregation, cloud processing, and action. Due to the differences in hardware and communication capabilities of each layer, different encryption schemes must be used to guarantee data privacy within that layer. We survey conventional and emerging encryption schemes based on their ability to provide secure storage, data sharing, and secure computation. Our detailed experimental evaluation of each scheme shows that while the emerging encryption schemes enable exciting new features such as secure sharing and secure computation, they introduce several orders-of-magnitude computational and storage overhead. We conclude our paper by outlining future research directions to improve the usability of the emerging encryption schemes in an MCPS.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 56
    Publication Date: 2016-06-03
    Description: The flagellum is a lash-like cellular appendage found in many single-celled living organisms. The flagellin protofilaments contain an 11-helix dual-turn structure in a single flagellum. Each flagellin consists of four sub-domains: two inner domains (D0, D1) and two outer domains (D2, D3). While the inner domains predominantly consist of α-helices, the outer domains are primarily beta sheets. In the flagellum, the outermost sub-domain, D3, is the only one that is exposed to the native environment. This study focuses on the interactions of the residues of D3 of an R-type flagellin with 5 nm long chiral (5,15) and armchair (12,12) single-walled carbon nanotubes (SWNT) using molecular dynamics simulation. It presents the interactive forces between the SWNT and the residues of D3 from the perspectives of size and chirality of the SWNT. It is found that the metallic (armchair) SWNT interacts the most with glycine and threonine residues through van der Waals and hydrophobic interactions, whereas the semiconducting (chiral) SWNT interacts largely with the area of the protein devoid of glycine through van der Waals interactions, hydrophobic interactions, and hydrogen bonding. This indicates that glycine plays a crucial role in distinguishing metallic from semiconducting SWNTs.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 57
    Publication Date: 2016-06-03
    Description: MicroRNAs (miRNAs) are post-transcriptional regulators that repress the expression of their targets. They are known to work cooperatively with genes and play important roles in numerous cellular processes. Identification of miRNA regulatory modules (MRMs) would aid in deciphering the combinatorial effects derived from the many-to-many regulatory relationships in complex cellular systems. Here, we develop an effective method called BiCliques Merging (BCM) to predict MRMs based on bicliques merging. By integrating the miRNA/mRNA expression profiles from The Cancer Genome Atlas (TCGA) with the computational target predictions, we construct a weighted miRNA regulatory network for module discovery. The maximal bicliques detected in the network are statistically evaluated and filtered accordingly. We then employ a greedy strategy to iteratively merge the remaining bicliques according to their overlaps together with edge weights and the gene-gene interactions. Compared with existing methods on two cancer datasets from TCGA, the modules identified by our method are more densely connected and functionally enriched. Moreover, our predicted modules are more enriched for miRNA families, and the miRNA-mRNA pairs within the modules are more negatively correlated. Finally, several potential prognostic modules are revealed by Kaplan-Meier survival analysis and breast cancer subtype analysis. Availability: BCM is implemented in Java and available for download in the supplementary materials, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TCBB.2015.2462370 .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 58
    Publication Date: 2016-06-03
    Description: Genome-wide association studies (GWASs), which assay more than a million single nucleotide polymorphisms (SNPs) in thousands of individuals, have been widely used to identify genetic risk variants for complex diseases. However, most of the variants that have been identified contribute relatively small increments of risk and only explain a small portion of the genetic variation in complex diseases. This is the so-called missing heritability problem. Evidence has indicated that many complex diseases are genetically related, meaning these diseases share common genetic risk variants. Therefore, exploring the genetic correlations across multiple related studies could be a promising strategy for removing spurious associations and identifying underlying genetic risk variants, and thereby uncovering the mystery of missing heritability in complex diseases. We present a general and robust method to identify genetic patterns from multiple large-scale genomic datasets. We treat the summary statistics as a matrix and demonstrate that genetic patterns will form a low-rank matrix plus a sparse component. Hence, we formulate the problem as a matrix recovery problem, where we aim to discover risk variants shared by multiple diseases/traits and those specific to each individual disease/trait. We propose a convex formulation for matrix recovery and an efficient algorithm to solve the problem. We demonstrate the advantages of our method using both synthesized datasets and real datasets. The experimental results show that our method can successfully reconstruct both the shared and the individual genetic patterns from summary statistics and achieves performance comparable to that of alternative methods under a wide range of scenarios. The MATLAB code is available at: http://www.comp.hkbu.edu.hk/~xwan/iga.zip .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
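    The low-rank plus sparse model named in the abstract above can be written out as a convex program. The form below is the standard convex relaxation for this kind of decomposition (often called principal component pursuit) and is given here as an assumption; the abstract does not state the authors' exact objective. The symbols are illustrative: Z is the matrix of summary statistics, L the low-rank part (patterns shared across diseases/traits), S the sparse part (patterns specific to one disease/trait), and \lambda a hypothetical trade-off parameter.

```latex
\min_{L,\,S}\; \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{subject to} \quad Z = L + S
```

    Here \|L\|_{*} is the nuclear norm (sum of singular values), which acts as a convex surrogate for rank, and \|S\|_{1} is the entrywise l1 norm, which acts as a convex surrogate for the number of nonzero entries.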
  • 59
    Publication Date: 2016-06-03
    Description: Hybridization networks represent plausible evolutionary histories of species that are affected by reticulate evolutionary processes. An established computational problem on hybridization networks is constructing the most parsimonious hybridization network such that each of the given phylogenetic trees (called gene trees) is “displayed” in the network. There have been several previous approaches, including an exact method and several heuristics, for this NP-hard problem. However, the exact method is only applicable to a limited range of data, and heuristic methods can be less accurate and also slow sometimes. In this paper, we develop a new algorithm for constructing near parsimonious networks for multiple binary gene trees. This method is more efficient for large numbers of gene trees than previous heuristics. This new method also produces more parsimonious results on many simulated datasets as well as a real biological dataset than a previous method. We also show that our method produces topologically more accurate networks for many datasets.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 60
    Publication Date: 2016-06-03
    Description: Robust adaptation plays a key role in gene regulatory networks, and it is thought to be an important attribute that allows organisms or cells to survive in fluctuating conditions. In this paper, a simplified three-node enzyme network is modeled by the Michaelis-Menten rate equations for all possible topologies, and a family of topologies and the corresponding parameter sets of the network with satisfactory adaptation are obtained using a multi-objective genetic algorithm. The proposed approach improves the computational efficiency significantly compared with the time-consuming exhaustive search method. This approach provides a systematic way of searching for the feasible topologies and the corresponding parameter sets that give gene regulatory networks robust adaptation. The proposed methodology, owing to its universality and simplicity, can be used to address more complex issues in biological networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
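    The Michaelis-Menten rate law that the abstract above builds on can be sketched in a few lines. This is a minimal illustration, not the authors' three-node model: the single node, its parameter values, and the function names are all hypothetical, and plain Euler integration is used only for brevity.

```python
def michaelis_menten_rate(substrate, enzyme, k_cat, k_m):
    """Michaelis-Menten rate: k_cat * E * S / (K_m + S)."""
    return k_cat * enzyme * substrate / (k_m + substrate)


def simulate_node(input_level, steps=5000, dt=0.01):
    """Euler-integrate one hypothetical node A with active fraction a in [0, 1].

    The input converts inactive A (1 - a) to active A; a constant
    background enzyme (level 0.5, assumed) converts it back.
    """
    a = 0.5  # assumed initial active fraction
    for _ in range(steps):
        activation = michaelis_menten_rate(1.0 - a, input_level, k_cat=1.0, k_m=0.1)
        deactivation = michaelis_menten_rate(a, 0.5, k_cat=1.0, k_m=0.1)
        a += dt * (activation - deactivation)
    return a


if __name__ == "__main__":
    for level in (0.5, 1.0, 2.0):
        print(level, round(simulate_node(level), 4))
```

    A search over topologies, such as the one the abstract describes, would evaluate many networks of such rate terms; the sketch only shows how a single term is computed and integrated.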
  • 61
    Publication Date: 2016-06-03
    Description: Whole genome prediction of complex phenotypic traits using high-density genotyping arrays has attracted a lot of attention, as it is relevant to the fields of plant and animal breeding and genetic epidemiology. Since the number of genotypes is generally much larger than the number of samples, predictive models suffer from the curse of dimensionality. The curse of dimensionality not only affects the computational efficiency of a particular genomic selection method, but can also lead to poor performance, mainly due to possible overfitting or uninformative features. In this work, we propose a novel transductive feature selection method, called MINT, which is based on the MRMR (Max-Relevance and Min-Redundancy) criterion. We apply MINT to genetic trait prediction problems and show that, in general, MINT is a better feature selection method than the state-of-the-art inductive method MRMR.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 62
    Publication Date: 2016-06-03
    Description: We aim to improve the performance of the previously proposed signal decomposition matched filtering (SDMF) method [26] for the detection of copy-number variations (CNV) in the human genome. Through simulations, we show that the modified SDMF is robust even at high noise levels and outperforms the original SDMF method, which indirectly depends on CNV frequency. Simulations are also used to develop a systematic approach for selecting relevant parameter thresholds in order to optimize sensitivity, specificity, and computational efficiency. We apply the modified method to array CGH data from normal samples in The Cancer Genome Atlas (TCGA) and compare detected CNVs to those estimated using circular binary segmentation (CBS) [19], a hidden Markov model (HMM)-based approach [11], and a subset of CNVs in the Database of Genomic Variants. We show that a substantial number of previously identified CNVs are detected by the optimized SDMF, which also outperforms the other two methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 63
    Publication Date: 2016-06-03
    Description: Next-generation sequencing technologies have led to the sequencing of more and more genomes, propelling related research into the era of big data. In this paper, we present ParaBWT, a parallelized Burrows-Wheeler transform (BWT) and suffix array construction algorithm for big genome data. In ParaBWT, we have investigated a progressive construction approach to constructing the BWT of single genome sequences in linear space complexity, but with a small constant factor. This approach has been further parallelized using multi-threading based on a master-slave coprocessing model. After the BWT has been obtained, the suffix array is constructed in a memory-efficient manner. The performance of ParaBWT has been evaluated using two sequences generated from two human genome assemblies: the Ensembl Homo sapiens assembly and the human reference genome. Our performance comparison to FMD-index and Bwt-disk reveals that on 12 CPU cores, ParaBWT runs up to $2.2\times$ faster than FMD-index and up to $99.0\times$ faster than Bwt-disk. BWT construction algorithms for very long genomic sequences are time consuming and (due to their incremental nature) inherently difficult to parallelize. Thus, their parallelization is challenging, and even relatively small speedups like those of our method over FMD-index are of high importance to research. ParaBWT is written in C++, and is freely available at http://parabwt.sourceforge.net .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
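    For readers unfamiliar with the two structures named in the abstract above, the sketch below shows how a suffix array and a BWT relate on a toy string. It is a naive textbook construction for illustration only and is not ParaBWT's progressive, parallel algorithm (which, per the abstract, builds the BWT first and the suffix array afterwards). The function names and the `$` sentinel are assumptions of this sketch.

```python
def suffix_array(text):
    """Start indices of all suffixes of `text`, in lexicographic order (naive sort)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform of `text` with a unique, smallest sentinel appended."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    text += sentinel
    # The BWT character for each sorted suffix is the character that precedes it
    # (index -1 wraps around to the sentinel for the suffix starting at 0).
    return "".join(text[i - 1] for i in suffix_array(text))


def inverse_bwt(last_column, sentinel="$"):
    """Invert the BWT by repeatedly prepending and sorting (quadratic, for clarity)."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(c + row for c, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    print(suffix_array("banana$"))  # [6, 5, 3, 1, 0, 4, 2]
    print(bwt("banana"))            # annb$aa
    print(inverse_bwt("annb$aa"))   # banana
```

    The naive sort compares whole suffixes, so it is only suitable for short strings; genome-scale inputs need dedicated construction algorithms such as the one the abstract describes.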
  • 64
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: Advances in biomedical sensors and mobile communication technologies have fostered the rapid growth of mobile health (mHealth) applications in the past years. Users generate a high volume of biomedical data during health monitoring, which can be used by the mHealth server for training predictive models for disease diagnosis and treatment. However, the biomedical sensing data raise serious privacy concerns because they reveal sensitive information such as health status and lifestyles of the sensed subjects. This paper proposes and experimentally studies a scheme that keeps the training samples private while enabling accurate construction of predictive models. We specifically consider logistic regression models which are widely used for predicting dichotomous outcomes in healthcare, and decompose the logistic regression problem into small subproblems over two types of distributed sensing data, i.e., horizontally partitioned data and vertically partitioned data. The subproblems are solved using individual private data, and thus mHealth users can keep their private data locally and only upload (encrypted) intermediate results to the mHealth server for model training. Experimental results based on real datasets show that our scheme is highly efficient and scalable to a large number of mHealth users.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
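    The idea of training on horizontally partitioned data (each user holds complete records for some samples) can be illustrated with plain gradient descent, because the gradient of the logistic loss is a sum over samples and therefore over users. This is a simplified sketch under stated assumptions, not the scheme from the abstract above: it omits encryption entirely, does not cover vertically partitioned data, and all names and hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(weights, samples):
    """Gradient of the logistic loss over one user's private (features, label) pairs."""
    grad = [0.0] * len(weights)
    for features, label in samples:
        error = sigmoid(sum(w * x for w, x in zip(weights, features))) - label
        for j, x in enumerate(features):
            grad[j] += error * x
    return grad


def train(partitions, n_features, lr=0.5, epochs=200):
    """Server loop: users send only local gradients, never their raw samples."""
    weights = [0.0] * n_features
    total = sum(len(p) for p in partitions)
    for _ in range(epochs):
        grads = [local_gradient(weights, p) for p in partitions]  # computed user-side
        summed = [sum(g[j] for g in grads) for j in range(n_features)]
        weights = [w - lr * s / total for w, s in zip(weights, summed)]
    return weights


if __name__ == "__main__":
    # Two users, each holding two samples; the first feature is a constant bias term.
    users = [
        [([1.0, 2.0], 1), ([1.0, -1.5], 0)],
        [([1.0, 1.0], 1), ([1.0, -2.0], 0)],
    ]
    print(train(users, n_features=2))
```

    In a privacy-preserving deployment the intermediate results would additionally be protected before upload, as the abstract notes; the sketch only shows why the computation splits cleanly across users.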
  • 65
    Publication Date: 2016-06-03
    Description: Biologists often need to know the set of genes associated with a given set of genes or a given disease. We propose in this paper a classifier system called Monte Carlo for Genetic Network (MCforGN) that can construct genetic networks, identify functionally related genes, and predict gene-disease associations. MCforGN identifies functionally related genes based on their co-occurrences in the abstracts of biomedical literature. For a given gene g, the system first extracts the set of genes found within the abstracts of biomedical literature associated with g. It then ranks these genes to determine the ones with high co-occurrences with g. It overcomes the limitations of current approaches that employ analytical deterministic algorithms by applying Monte Carlo Simulation to approximate genetic networks. It does so by conducting repeated random sampling to obtain numerical results and to optimize these results. Moreover, it analyzes results to obtain the probabilities of different genes’ co-occurrences using a series of statistical tests. MCforGN can detect gene-disease associations by employing a combination of centrality measures (to identify the central genes in disease-specific genetic networks) and Monte Carlo Simulation. MCforGN aims at enhancing state-of-the-art biological text mining by applying novel extraction techniques. We evaluated MCforGN by comparing it experimentally with nine approaches. Results showed marked improvement.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 66
    Publication Date: 2016-06-03
    Description: Parameter estimation is a key concern for reliable and predictive models of biological systems. In this paper, we propose a multi-objective, multi-state optimization framework that allows multiple data sources to be incorporated into the parameter estimation process. This enables the model to better represent a diverse range of data from both within and outwith the training set, and to determine more biologically relevant values for the model parameters. The framework is based on a multi-objective PSwarm implementation (MoPSwarm) and is validated via a case study on the ERK signalling pathway, in which significant advantages over the conventional single-state approach are demonstrated. Several variants of the framework are analyzed to determine the optimal configuration for convergence and solution quality.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 67
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: The production and sale of counterfeit and substandard pharmaceutical products, such as essential medicines, is an important global public health problem. We describe a chemometric passport-based approach to improve the security of the pharmaceutical supply chain. Our method is based on applying nuclear quadrupole resonance (NQR) spectroscopy to authenticate the contents of medicine packets. NQR is a non-invasive, non-destructive, and quantitative radio frequency (RF) spectroscopic technique. It is sensitive to subtle features of the solid-state chemical environment and thus generates unique chemical fingerprints that are intrinsically difficult to replicate. We describe several advanced NQR techniques, including two-dimensional measurements, polarization enhancement, and spin density imaging, that further improve the security of our authentication approach. We also present experimental results that confirm the specificity and sensitivity of NQR and its ability to detect counterfeit medicines.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 68
    Publication Date: 2016-06-03
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 69
    Publication Date: 2016-06-03
    Description: We present a modeling framework aimed at capturing both the positional and temporal behavior of transcriptional regulatory proteins in eukaryotic cells. There is growing evidence that transcriptional regulation is the complex behavior that emerges not solely from the individual components, but rather from their collective behavior, including competition and cooperation. Our framework describes individual regulatory components using generic action oriented descriptions of their biochemical interactions with a DNA sequence. All the possible actions are based on the current state of factors bound to the DNA. We developed a rule builder to automatically generate the complete set of biochemical interaction rules for any given DNA sequence. Off-the-shelf stochastic simulation engines can model the behavior of a system of rules and the resulting changes in the configuration of bound factors can be visualized. We compared our model to experimental data at well-studied loci in yeast, confirming that our model captures both the positional and temporal behavior of transcriptional regulation.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 70
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: High-throughput DNA sequencing technologies allow fast and affordable sequencing of individual genomes and thus enable unprecedented studies of genetic variations. Information about variations in the genome of an individual is provided by haplotypes, ordered collections of single nucleotide polymorphisms. Knowledge of haplotypes is instrumental in finding genes associated with diseases, drug development, and evolutionary studies. Haplotype assembly from high-throughput sequencing data is challenging due to errors and limited lengths of sequencing reads. The key observation made in this paper is that the minimum error-correction formulation of the haplotype assembly problem is identical to the task of deciphering a coded message received over a noisy channel—a classical problem in the mature field of communication theory. Exploiting this connection, we develop novel haplotype assembly schemes that rely on the bit-flipping and belief propagation algorithms often used in communication systems. The latter algorithm is then adapted to the haplotype assembly of polyploids. We demonstrate on both simulated and experimental data that the proposed algorithms compare favorably with state-of-the-art haplotype assembly methods in terms of accuracy, while being scalable and computationally efficient.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
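    The minimum error-correction (MEC) criterion mentioned in the abstract above can be made concrete with a small example. The sketch assumes a diploid, biallelic setting in which a haplotype is a list of 0/1 alleles and each read is a dictionary from SNP index to observed allele. The greedy single-position flipping shown here is a simple local search in the spirit of bit-flipping, not the authors' decoder, and the function names are hypothetical.

```python
def mec_score(reads, haplotype):
    """Number of allele corrections needed so every read matches h or its complement."""
    complement = [1 - allele for allele in haplotype]
    total = 0
    for read in reads:
        d_h = sum(1 for pos, allele in read.items() if haplotype[pos] != allele)
        d_c = sum(1 for pos, allele in read.items() if complement[pos] != allele)
        total += min(d_h, d_c)  # assign the read to the closer haplotype
    return total


def greedy_bit_flip(reads, haplotype):
    """Flip single SNP positions for as long as a flip strictly lowers the MEC score."""
    hap = list(haplotype)
    best = mec_score(reads, hap)
    improved = True
    while improved:
        improved = False
        for pos in range(len(hap)):
            hap[pos] ^= 1
            score = mec_score(reads, hap)
            if score < best:
                best, improved = score, True
            else:
                hap[pos] ^= 1  # undo a flip that did not help
    return hap, best


if __name__ == "__main__":
    reads = [{0: 0, 1: 0, 2: 1}, {1: 1, 2: 0, 3: 0}, {0: 0, 2: 1, 3: 1}]
    print(mec_score(reads, [0, 0, 0, 1]))        # 3
    print(greedy_bit_flip(reads, [0, 0, 0, 1]))  # ([0, 0, 1, 1], 0)
```

    A haplotype and its complement always receive the same score, which reflects the fact that reads alone cannot tell the two parental copies apart.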
  • 71
    Publication Date: 2016-06-03
    Description: Traditional drug discovery practice usually follows the “one drug - one target” approach, seeking to identify drug molecules that act on individual targets, which ignores the systemic nature of human diseases. Pathway-based drug discovery recently emerged as an appealing approach to overcome this limitation. An important first step of such pathway-based drug discovery is to identify associations between drug molecules and biological pathways. This task has been made feasible by the accumulating data from high-throughput transcription and drug sensitivity profiling. In this paper, we developed “iPaD”, an integrative Penalized Matrix Decomposition method to identify drug-pathway associations through joint modeling of such high-throughput transcription and drug sensitivity data. A scalable bi-convex optimization algorithm was implemented and gave iPaD a tremendous advantage in computational efficiency over the current state-of-the-art method, which allows it to handle the ever-growing large-scale data sets that the current method cannot handle. On two widely used real data sets, iPaD also significantly outperformed the current method in terms of the number of validated drug-pathway associations that were identified. The Matlab code of our algorithm is publicly available at http://licong-jason.github.io/iPaD/
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 72
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-06-03
    Description: The biological function of a macromolecule often requires that a small molecule or ion is transported through its structure. The transport pathway often leads through void spaces in the structure. The properties of transport pathways change significantly in time; therefore, the analysis of a trajectory from molecular dynamics rather than of a single static structure is needed for understanding the function of pathways. The identification and analysis of transport pathways are challenging because of the high complexity and diversity of macromolecular shapes, the thermal motion of their atoms, and the large number of conformations needed to properly describe the conformational space of a protein structure. In this paper, we describe the principles of the CAVER 3.0 algorithms for the identification and analysis of properties of transport pathways both in static and dynamic structures. Moreover, we introduce the improved clustering solution for finding tunnels in macromolecules, which is included in the latest CAVER 3.02 version. Voronoi diagrams are used to identify potential pathways in each snapshot of a molecular dynamics trajectory and clustering is then used to find the correspondence between tunnels from different snapshots. Furthermore, the geometrical properties of pathways and their evolution in time are computed and visualized.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 73
    Publication Date: 2016-04-08
    Description: This study considers the problem of describing and predicting cleft formation during the early stages of branching morphogenesis in mouse submandibular salivary glands (SMG) under the influence of varied concentrations of epidermal growth factors (EGF). Given a time-lapse video of a growing SMG, first we build a descriptive model that captures the underlying biological process and quantifies the ground truth. Tissue-scale (global) and morphological features related to regions of interest (local features) are used to characterize the biological ground truth. Second, we devise a predictive growth model that simulates EGF-modulated branching morphogenesis using a dynamic graph algorithm, which is driven by biological parameters such as EGF concentration, mitosis rate, and cleft progression rate. Given the initial configuration of the SMG, the evolution of the dynamic graph predicts the cleft formation, while maintaining the local structural characteristics of the SMG. We determined that higher EGF concentrations cause the formation of a higher number of buds and comparatively shallow cleft depths. Third, we compared the prediction accuracy of our model to the Glazier-Graner-Hogeweg (GGH) model, an on-lattice Monte-Carlo simulation model, under a specific energy function parameter set that allows new rounds of de novo cleft formation. The results demonstrate that the dynamic graph model yields simulations of gland growth comparable to those of the GGH model with a significantly lower computational complexity. Fourth, we enhanced this model to predict the SMG morphology for an EGF concentration without the assistance of ground truth time-lapse biological video data; this is a substantial benefit of our model over other similar models that are guided and terminated by information regarding the final SMG morphology. Hence, our model is suitable for testing the impact of different biological parameters involved in the process of branching morphogenesis in silico, while reducing the need for in vivo experiments.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 74
    Publication Date: 2016-02-12
    Description: A molecular system built with true chemical bonds or strong molecular interactions can be described using conceptual mathematical tools. The modeling of naturally generated ionic currents in human pancreatic β-cell activity has already been studied using complicated analytical models. In the present contribution, we demonstrate the same using our simple electrical model. The ionic currents are associated with different membrane protein channels (K-Ca, Kv, KATP, Cav-L) and the Na/Ca exchanger (NCX). The proteins are Ohmic conductors and are modeled by randomly distributed conductances. Switches are placed in series with the conductances in order to highlight the channel activity. However, the KATP channel activity is stimulated by glucose, and the NCX's conductance changes according to the intracellular calcium concentration. The percolation threshold of the system is calculated from the fractal nature of the infinite cluster using Tarjan's depth-first-search algorithm. It is shown that the behavior of the internal concentration of Ca2+ and the membrane potential are modulated by glucose. The results confirm that the inhibition of KATP channels depolarizes the membrane and increases the influx of [Ca2+]i through the NCX and the Cav-L channel for high glucose concentrations.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 75
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-10-08
    Description: Mechanical ventilation is an important method to help people breathe. Respiratory parameters of ventilated patients are usually tracked for pulmonary diagnostics and respiratory treatment assessment. In this paper, to improve the estimation accuracy of respiratory parameters, a pneumatic model for mechanical ventilation was proposed. Furthermore, based on the mathematical model, a recursive least-squares algorithm was adopted to estimate the respiratory parameters. Finally, through experimental and numerical study, it was demonstrated that the proposed estimation method was effective and the method can be used in pulmonary diagnostics and treatment.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 76
    Publication Date: 2016-10-08
    Description: In this paper, a framework for shape-based similarity search of 3D molecular structures is presented. The proposed framework exploits simultaneously the discriminative capabilities of a global, a local, and a hybrid local-global shape feature to produce a geometric descriptor that achieves higher retrieval accuracy than each feature does separately. Global and hybrid features are extracted using pairwise computations of diffusion distances between the points of the molecular surface, while the local feature is based on accumulating pairwise relations among oriented surface points into local histograms. The local features are integrated into a global descriptor vector using the bag-of-features approach. Due to the intrinsic property of its constituting shape features to be invariant to articulations of the 3D objects, the framework is appropriate for similarity search of flexible 3D molecules, while at the same time it is also accurate in retrieving rigid 3D molecules. The proposed framework is evaluated in flexible and rigid shape matching of 3D protein structures as well as in shape-based virtual screening of large ligand databases with quite promising results.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 77
    Publication Date: 2016-12-10
    Description: Mutual information (MI) is a powerful concept for correlation-centric applications. It has been used for feature selection from microarray gene expression data in many works. One of the merits of MI is that, unlike many other heuristic methods, it is based on a mature theoretic foundation. When applied to microarray data, however, it faces some challenges. First, due to the large number of features (i.e., genes) present in microarray data, the true distributions for the expression values of some genes may be distorted by noise. Second, evaluating inter-group mutual information requires estimating multi-variate distributions, which is quite difficult if not impossible. To address these problems, in this paper, we propose a new MI-based feature selection approach for microarray data. Our approach relies on two strategies: one is relevance boosting, which requires a desirable feature to show substantial additional relevance to the class labeling beyond the already selected features; the other is feature interaction enhancing, which probabilistically compensates for feature interaction missing from simple aggregation-based evaluation. We justify our approach with both theoretical analysis and experimental results. We use a synthetic dataset to show the statistical significance of the proposed strategies, and real-life datasets to show the improved performance of our approach over the existing methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 78
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: MicroRNAs (miRNAs) are a class of small endogenous non-coding genes, acting as regulators in post-transcriptional processes. Recently, miRNAs have been found to be widely involved in different types of diseases. Therefore, the identification of disease-associated miRNAs can help understand the mechanisms that underlie a disease and identify new biomarkers. However, it is not easy to identify the miRNAs related to diseases due to their extensive involvement in various biological processes. In this work, we present a new approach to identify disease-associated miRNAs based on domains, the functional and structural blocks of proteins. The results on real datasets demonstrate that our method can effectively identify disease-related miRNAs with high precision.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 79
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: MicroRNAs (miRNAs) are crucial regulators of gene expression at the post-transcriptional level. Understanding the origin and evolution of miRNAs will improve the currently available algorithms for the prediction of novel miRNAs and their functions. Transposable elements (TEs) provide a natural mechanism for the origin of new miRNAs. In this paper, 2,583 miRNAs derived from TEs (MDTEs) were collected to construct a database named MDTE database (MDTE DB) for storing, searching, and analyzing MDTEs. The database provides a convenient source for studying the origin and evolution of miRNAs. Database URL: http://bioinf.njnu.edu.cn/MDTE/MDTE.php
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 80
    Publication Date: 2016-12-14
    Description: Many methods have been considered for gene selection and analysis of gene expression data. Nonetheless, there is still considerable room for improving the explicitness and reliability of gene selection. To this end, this paper proposes a novel method named robust graph regularized non-negative matrix factorization for characteristic gene selection using gene expression data, which mainly contains two aspects. First, it enforces $L_{2,1}$-norm minimization on the error function, which is robust to outliers and noise in the data points. Second, it considers that the samples lie on a low-dimensional manifold embedded in a high-dimensional ambient space, and reveals the geometric structure embedded in the original data. To demonstrate the validity of the proposed method, we apply it to gene expression data sets involving various human normal and tumor tissue samples, and the results demonstrate that the method is effective and feasible.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 81
    Publication Date: 2016-12-10
    Description: Transmembrane β-barrels (TMBs) are an important class of membrane proteins that perform crucial functions in the cell. Membrane proteins are difficult wet-lab targets of structural biology, which calls for accurate computational prediction approaches. Here, we developed a novel method named MemBrain-TMB to predict the spanning segments of transmembrane β-barrels from the amino acid sequence. MemBrain-TMB is a statistical machine learning-based model, which is constructed using a new chain learning algorithm with input features encoded by the image sparse representation approach. We considered the relative status information between neighboring residues for enhancing the performance, and the matrix of features was translated into a feature image by a sparse coding algorithm for noise and dimension reduction. To deal with the diverse loop length problem, we applied a dynamic threshold method, which is particularly useful for enhancing the recognition of short loops and tight turns. Our experiments demonstrate that the new protocol designed in MemBrain-TMB effectively helps improve prediction performance.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 82
    Publication Date: 2016-12-10
    Description: This study comprehensively evaluates the performance of five types of probabilistic hierarchical classification methods used for predicting Gene Ontology (GO) terms related to ageing. Of those tested, a new hybrid of a Local Hierarchical Classifier (LHC) and the Predictive Clustering Tree algorithm (LHC-PCT) had the best predictive accuracy results. We also tested the impact of two types of variations in most hierarchical classification algorithms, namely (a) changing the base algorithm (we tested Naive Bayes and Support Vector Machines) and (b) using or omitting the Correlation based Feature Selection (CFS) algorithm in a pre-processing step. In total, we evaluated the predictive performance of 17 variations of hierarchical classifiers across 15 datasets of ageing and longevity-related genes. We conclude that the LHC-PCT algorithm ranks better across several tests (seven out of 12). In addition, we interpreted the models generated by the PCT algorithm to show how hierarchical classification algorithms can be used to extract biological insights out of the ageing-related datasets that we compiled.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 83
    Publication Date: 2016-12-10
    Description: The natural progression of HIV-1 begins with a short acute retroviral syndrome, which typically transitions to chronic and clinical latency stages and subsequently progresses to a symptomatic, life-threatening immunodeficiency disease known as AIDS. Microarray analysis based on gene coexpression is widely used to investigate the coregulation pattern of a group (or cluster) of genes in a specific phenotype. Moreover, an investigation of the topological patterns across multiple phenotypes can facilitate the understanding of the stage-specific infection pattern of the HIV-1 virus. Here, we develop a novel framework to identify topological patterns of gene co-expression networks and detect changes of modular structure across different stages of HIV progression. This is achieved by comparing the topological and intramodular properties of HIV infection modules. To capture the diversity in modular structure, some topological, correlation based, and eigengene based measures are utilized here. We have applied a rank aggregation scheme to rank all the modules to provide a good agreement between these measures. Some novel transcription factors like ‘FOXO1’, ‘GATA3’, ‘GFI1’, ‘IRF1’, ‘IRF7’, ‘MAX’, ‘STAT1’, ‘STAT3’, ‘XBP1’, and ‘YY1’ that emerge from the modules show significant change in expression pattern over HIV progression stages. Moreover, we have performed an eigengene based analysis to reveal the perturbation in modular structure across three stages of HIV-1 progression.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 84
    Publication Date: 2016-12-10
    Description: We study the number of samples required to uniquely determine the structure of a probabilistic Boolean network (PBN), where PBNs are probabilistic extensions of Boolean networks. We show via theoretical analysis and computational analysis that the structure of a PBN can be exactly identified with high probability from a relatively small number of samples for interesting classes of PBNs of bounded indegree. On the other hand, we also show that there exist classes of PBNs for which it is impossible to uniquely determine the structure of a PBN from samples.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 85
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: High-throughput experimental screening techniques have resulted in a large amount of biological network data such as protein-protein interaction (PPI) data. The analysis of these data can enhance our understanding of cellular processes. PPI network alignment is one of the comparative analysis methods for analyzing biological networks. Research on PPI networks can identify conserved subgraphs and help us to understand evolutionary trajectories across species. Some evolutionary algorithms have been proposed for coping with PPI network alignment, but most of them are limited by low search efficiency due to the lack of prior knowledge. In this paper, we propose a memetic algorithm, denoted as MeAlign, to solve the biological network alignment problem by optimizing an objective function which introduces topological structure and sequence similarities. MeAlign combines a genetic algorithm with a local search refinement. The genetic algorithm finds interesting alignment solution regions, and the local search finds optimal solutions around those regions. The proposed algorithm first develops a coarse similarity score matrix for initialization and then uses a specific neighborhood heuristic local search strategy to find an optimal alignment. In MeAlign, the information of topological structure and sequence similarities is used to guide our mapping. Experimental results demonstrate that our algorithm can achieve a better mapping than some state-of-the-art algorithms and that it strikes a better balance between the network topology and node sequence similarities.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 86
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: We developed a Bayesian clustering method to identify significant regions of brain activation. Coordinate-based meta data originating from functional magnetic resonance imaging (fMRI) were of primary interest. Individual fMRI has the ability to measure the intensity of blood flow and oxygen to a location within the brain that was activated by a given thought or emotion. The proposed method performed clustering on two levels, latent foci center and study activation center, with a spatial Cox point process utilizing the Dirichlet process to describe the distribution of foci. Intensity was modeled as a function of distance between the focus and the center of the cluster of foci using a Gaussian kernel. Simulation studies were conducted to evaluate the sensitivity and robustness of the method with respect to cluster identification and underlying data distributions. We applied the method to a meta data set to identify emotion foci centers.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 87
    Publication Date: 2016-12-10
    Description: The metabolic network model allows for an in-depth insight into the molecular mechanism of a particular organism. Because most parameters of the metabolic network cannot be directly measured, they must be estimated by using optimization algorithms. However, three characteristics of the metabolic network model, i.e., high nonlinearity, a large number of parameters, and wide variation ranges of the parameters, restrict the application of many traditional optimization algorithms. As a result, there is a growing demand to develop efficient optimization approaches to address this complex problem. In this paper, a Kriging-based algorithm aiming at parameter estimation is presented for constructing metabolic networks. In the algorithm, a new infill sampling criterion, named expected improvement and mutual information (EI&MI), is adopted to improve the modeling accuracy by selecting multiple new sample points at each cycle, and a domain decomposition strategy based on principal component analysis is introduced to save computing time. Meanwhile, the convergence speed is accelerated by combining a single-dimensional optimization method with the dynamic coordinate perturbation strategy when determining the new sample points. Finally, the algorithm is applied to the arachidonic acid metabolic network to estimate its parameters. The obtained results demonstrate the effectiveness of the proposed algorithm in getting precise parameter values under a limited number of iterations.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 88
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: MicroRNAs (miRNAs) regulate genes that are associated with various diseases. To better understand miRNAs, the miRNA regulatory mechanism needs to be investigated and the real targets identified. Here, we present miRTDL, a new miRNA target prediction algorithm based on convolutional neural network (CNN). The CNN automatically extracts essential information from the input data rather than completely relying on the input dataset generated artificially when the precise miRNA target mechanisms are poorly known. In this work, the constraint relaxing method is first used to construct a balanced training dataset to avoid inaccurate predictions caused by the existing unbalanced dataset. The miRTDL is then applied to 1,606 experimentally validated miRNA target pairs. Finally, the results show that our miRTDL outperforms the existing target prediction algorithms and achieves significantly higher sensitivity, specificity and accuracy of 88.43, 96.44, and 89.98 percent, respectively. We also investigate the miRNA target mechanism, and the results show that the complementation features are more important than the others.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 89
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: Predicting essential proteins computationally from high-throughput data is more efficient than time-consuming and expensive experimental approaches. Many computational approaches have been reported; however, they are usually sensitive to the network structure, so their robustness is generally poor. In this paper, a novel topological centrality measure for predicting essential proteins based on local interaction density, named LID, is proposed. LID differs from previous measures in that it derives the essentiality of a node from the interaction densities among its neighbors, based on topological analyses of real proteins in a protein complex set, for the first time from the viewpoint of biological modules. LID is applied to four different yeast protein interaction networks, which are obtained from the DIP database and the BioGRID database. The experimental results show that the number of essential proteins detected by LID exceeds or approximates the best performance of 10 other topological centrality measures (DC, BC, ClusterC, CloseC, MNC, SoECC(NC), LAC, SC, EigC, and InfoC) in all 24 comparisons on the four networks. The better robustness of LID across multiple data sets should make it a new core topological centrality measure for improving prediction performance on the protein interaction networks of more species.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 90
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: This paper investigates controller design for the disturbance decoupling problem (DDP) of singular Boolean control networks (SBCNs). Using the semi-tensor product (STP) of matrices and the Implicit Function Theorem, an SBCN is converted into a standard BCN. Based on the redundant variable separation technique, both state feedback and output feedback controllers are designed to solve the DDP of the SBCN. Sufficient conditions are also given to analyze the invariance of controllers concerning the DDP of the SBCN with function perturbation. Two illustrative examples are presented to support the effectiveness of the obtained results.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 91
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2016-12-10
    Description: DNA encodes the genetic information of most living beings, except viruses that use RNA. Unlike other types of molecules, DNA is usually described not by its atomic structure but by its base-pair sequence, i.e., the textual sequence of its subsidiary molecules known as nucleotides (adenine (A), cytosine (C), guanine (G), and thymine (T)). The three-dimensional assembling of DNA molecules based on their base-pair sequences has been, for decades, a topic of interest for many research groups all over the world. In this paper, we survey the major methods found in the literature to assemble and visualize DNA molecules from their base-pair sequences. We divided these methods into three categories: predictive methods, adaptive methods, and thermodynamic methods. Predictive methods aim to predict a conformation of the DNA from its base-pair sequence, while the goal of adaptive methods is to assemble DNA base-pair sequences along previously known conformations, as needed in scenarios such as DNA Monte Carlo simulations. Unlike these two geometric methods, thermodynamic methods are energy-based and aim to predict secondary structural motifs of DNA in cases where hydrogen bonds between base pairs might be broken because of temperature changes. We also present the major software tools that implement predictive, adaptive, and thermodynamic methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 92
    Publication Date: 2016-12-10
    Description: Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several samples with a temporal order from one patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a Network-based method (NetInf) to Infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with the addition of different levels of noise. To verify the performance of NetInf, we apply it to analyze somatic mutation data from three real cancer studies with large numbers of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without assigning the number of pathways, and it also provides new insights on the temporal order of somatic mutations at the pathway level rather than at the gene level.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 93
    Publication Date: 2016-10-08
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 94
    Publication Date: 2016-10-08
    Description: Currently, binary biclustering algorithms are too slow and non-specific to handle biological datasets that have a large number of attributes, which is essential for the computational biology problem of microarray analysis. Specialized computers may be needed to execute an algorithm, and the algorithm may still fail to produce a solution because of its large resource needs. The biclusters also include too many false positives (type I errors), which hinders biological discovery. We propose an algorithm that can analyze datasets with a large attribute set at different densities and can operate on a laptop, which makes it accessible to practitioners. EMFP produces biclusters that have a very low Root Mean Squared Error and false positive rate, with very few type II errors. Our binary biclustering algorithm is a hybrid, axis-parallel, pattern-based algorithm that finds multiple, non-overlapping, near-constant, deterministic, binary submatrices, with a variable confidence threshold and the novel use of local density comparisons instead of the standard global threshold. EMFP introduces a new and intuitive way to calculate internal measures for binary biclustering methods. We also introduce a framework to ease comparison with other algorithms, and compare EMFP to both binary and general biclustering algorithms using two real and 80 synthetic databases.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 95
    Publication Date: 2016-10-08
    Description: As protein phosphorylation plays an important role in numerous cellular processes, many studies have been undertaken to analyze phosphorylation-related activities for drug design and disease treatment. However, although progress has been made in illustrating the relationship between phosphorylation and diseases, no existing method focuses on predicting disease-associated phosphorylation sites. In this work, we propose a multi-layer heterogeneous network model that makes use of kinase information to infer disease-phosphorylation site relationships, and we implement a random walk on the heterogeneous network. Experimental results reveal that the multi-layer heterogeneous network model with a kinase layer is superior in inferring disease-phosphorylation site relationships when compared with the existing random walk model and commonly used classification methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 96
    Publication Date: 2016-10-08
    Description: Glioblastoma multiforme (GBM) is a highly aggressive type of brain cancer with very low median survival. In order to predict a patient's prognosis, researchers have proposed rules to classify different glioma cancer cell subtypes. However, the survival time within different subtypes of GBM often varies because of individual differences. Recent developments in gene testing have evolved classic subtype rules into more specific classification rules based on single biomolecular features. These classification methods have been shown to perform better than traditional simple rules in GBM prognosis prediction. However, the real power behind the massive data is still to be uncovered. We believe a combined prediction model based on more than one data type could perform better, which would contribute further to the clinical treatment of GBM. The Cancer Genome Atlas (TCGA) database provides a huge dataset with various data types for many cancers, which enables us to inspect this aggressive cancer in a new way. In this research, we have improved GBM prognosis prediction accuracy further by taking advantage of the minimum redundancy maximum relevance (mRMR) feature selection method and the Multiple Kernel Learning (MKL) method. Our goal is to establish an integrated model that can predict GBM prognosis with high accuracy.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 97
    Publication Date: 2016-10-08
    Description: The paper presents a neutral Codons Probability Mutations (CPM) model of molecular evolution and genetic decay of an organism. The CPM model uses a Markov process with a 20-dimensional state space of probability distributions over amino acids. The transition matrix of the Markov process includes the mutation rate and those single point mutations compatible with the genetic code. This is an alternative to the standard Point Accepted Mutation (PAM) and BLOcks of amino acid SUbstitution Matrix (BLOSUM). Genetic decay is quantified as a similarity between the amino acid distribution of proteins from a (group of) species on one hand, and the equilibrium distribution of the Markov chain on the other. Amino acid data for the eukaryote, bacterium, and archaea families are used to illustrate how both the CPM and PAM models predict their genetic decay towards the equilibrium value of 1. A family of bacteria is studied in more detail. It is found that warm environment organisms on average have a higher degree of genetic decay compared to those species that live in cold environments. The paper addresses a new codon-based approach to quantify genetic decay due to single point mutations compatible with the genetic code. The present work may be seen as a first approach to use codon-based Markov models to study how genetic entropy increases with time in an effectively neutral biological regime. Various extensions of the model are also discussed.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 98
    Publication Date: 2016-10-08
    Description: MrBayes is a widespread phylogenetic inference tool harnessing empirical evolutionary models and Bayesian statistics. However, likelihood estimation is computationally very expensive, resulting in undesirably long execution times. Although a number of multi-threaded optimizations have been proposed to speed up MrBayes, there are bottlenecks that severely limit the GPU thread-level parallelism of likelihood estimations. This study proposes a high-performance and resource-efficient method for GPU-oriented parallelization of likelihood estimations. Instead of relying on empirical programming, the proposed novel decomposition storage model implements high-performance data transfers implicitly. In terms of performance improvement, a speedup factor of up to 178 can be achieved on the analysis of simulated datasets with four Tesla K40 cards. In comparison with the other publicly available GPU-oriented MrBayes implementations, the tgMC³++ method (proposed herein) outperforms the tgMC³ (v1.0), nMC³ (v2.1.1) and oMC³ (v1.00) methods by speedup factors of up to 1.6, 1.9 and 2.9, respectively. Moreover, tgMC³++ supports more evolutionary models and gamma categories, which previous GPU-oriented methods cannot analyze.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 99
    Publication Date: 2016-10-08
    Description: To address the problem of searching the protein conformational space in ab initio protein structure prediction, a novel method using abstract convex underestimation (ACUE) within the framework of an evolutionary algorithm is proposed. Computing such conformations, which is essential for associating structural and functional information with gene sequences, is challenging because of the high dimensionality and rugged energy surface of the protein conformational space. As a consequence, the dimension of the protein conformational space should be reduced to a proper level. In this paper, the high-dimensional original conformational space is converted by a feature extraction technique into a feature space whose dimension is considerably reduced, and the underestimate space is constructed according to abstract convex theory. Thus, the entropy effect caused by searching in the high-dimensional conformational space can be avoided through this conversion. The tight lower-bound estimate is used to guide the search direction, and invalid search areas that do not contain the global optimal solution can be eliminated in advance. Moreover, instead of expensively calculating the energy of conformations in the original conformational space, the estimate value is used to judge whether a conformation is worth exploring, which reduces the evaluation time, lowers the computational cost, and makes the search process more efficient. Additionally, fragment assembly and the Monte Carlo method are combined to generate a series of metastable conformations by sampling in the conformational space. The proposed method provides a novel technique for solving the search problem in protein conformational space. Twenty small-to-medium structurally diverse proteins were tested, and the proposed ACUE method was compared with ItFix, HEA, Rosetta, and LEDE, the developed method without underestimate information. Test results show that the ACUE method can obtain the near-native protein structure more rapidly and more efficiently.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 100
    Publication Date: 2016-10-08
    Description: Similarity of sequences is a key mathematical notion for classification and phylogenetic studies in biology. The distance and similarity between two sequences are very important and widely studied. During the last decades, similarity (distance) metric learning has been one of the hottest topics in machine learning and data mining, as well as in their applications in the bioinformatics field. It is feasible to introduce machine learning technology to learn a similarity metric from biological data. In this paper, we propose a novel framework of guaranteed similarity metric learning (GMSL) to perform alignment of biological sequences in any feature vector space. It introduces the $(\epsilon, \gamma, \tau)$-goodness similarity theory to Mahalanobis metric learning. As a theoretically guaranteed similarity metric learning approach, GMSL guarantees that the learned similarity function performs well in classification and clustering. Our experiments on the most used datasets demonstrate that our approach outperforms the state-of-the-art biological sequence alignment methods and other similarity metric learning algorithms in both accuracy and stability.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.