ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Books
  • Articles  (13,402)
  • Oxford University Press  (8,538)
  • Institute of Electrical and Electronics Engineers (IEEE)  (3,963)
  • MDPI Publishing  (584)
  • White Horse Press  (317)
  • 2010-2014  (13,402)
  • Computer Science  (12,732)
  • Philosophy  (670)
  • 1
    Publication Date: 2014-12-13
    Description: Replication in herpesvirus genomes is a major public health concern because the viruses multiply rapidly during the lytic phase of infection, which causes maximum damage to the host cells. Earlier research has established that sites of replication origin are dominated by a high concentration of rare palindrome sequences of DNA. Computational methods are devised based on scoring to determine the concentration of palindromes. In this paper, we propose both extraction and localization of rare palindromes in an automated manner. Discrete Cosine Transform (DCT-II), a widely recognized image compression algorithm, is utilized here to extract palindromic sequences based on their reverse complementary symmetry property. We formulate a novel approach to localize the rare palindrome clusters by devising a Minimum Quadratic Entropy (MQE) measure based on Rényi’s Quadratic Entropy (RQE) function. Experimental results over a large number of herpesvirus genomes show that the RQE-based scoring of rare palindromes has a higher order of sensitivity and fewer false alarms in detecting concentrations of rare palindromes and thereby sites of replication origin.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 2
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 3
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: The Tikhonov regularized nonnegative matrix factorization (TNMF) is an NMF objective function that enforces smoothness on the computed solutions, and has been successfully applied to many problem domains including text mining, spectral data analysis, and cancer clustering. There is, however, an issue that is still insufficiently addressed in the development of TNMF algorithms, i.e., how to develop mechanisms that can learn the regularization parameters directly from the data sets. The common approach is to use fixed values based on a priori knowledge about the problem domains. However, from the study of linear inverse problems it is known that the quality of the solutions of Tikhonov regularized least squares problems depends heavily on the choice of appropriate regularization parameters. Since least squares problems are the building blocks of the NMF, it can be expected that a similar situation also applies to the NMF. In this paper, we propose two formulas to automatically learn the regularization parameters from the data set based on the L-curve approach. We also develop a convergent algorithm for the TNMF based on the additive update rules. Finally, we demonstrate the use of the proposed algorithm in cancer clustering tasks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 4
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Analysis of probability distributions conditional on species trees has demonstrated the existence of anomalous ranked gene trees (ARGTs), ranked gene trees that are more probable than the ranked gene tree that accords with the ranked species tree. Here, to improve the characterization of ARGTs, we study enumerative and probabilistic properties of two classes of ranked labeled species trees, focusing on the presence or avoidance of certain subtree patterns associated with the production of ARGTs. We provide exact enumerations and asymptotic estimates for cardinalities of these sets of trees, showing that as the number of species increases without bound, the fraction of all ranked labeled species trees that are ARGT-producing approaches $1$. This result extends beyond earlier existence results to provide a probabilistic claim about the frequency of ARGTs.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 5
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: The existence of various types of correlations among the expressions of a group of biologically significant genes poses challenges in developing effective methods of gene expression data analysis. The initial focus of computational biologists was to work with only absolute and shifting correlations. However, researchers have found that the ability to handle shifting-and-scaling correlation enables them to extract more biologically relevant and interesting patterns from gene microarray data. In this paper, we introduce an effective shifting-and-scaling correlation measure named Shifting and Scaling Similarity (SSSim), which can detect highly correlated gene pairs in any gene expression data. We also introduce a technique named Intensive Correlation Search (ICS) biclustering algorithm, which uses SSSim to extract biologically significant biclusters from a gene expression data set. The technique performs satisfactorily with a number of benchmarked gene expression data sets when evaluated in terms of functional categories in Gene Ontology database.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 6
    Publication Date: 2014-12-13
    Description: Attractors in gene regulatory networks represent cell types or states of cells. In systems biology and synthetic biology, it is important to generate gene regulatory networks with desired attractors. In this paper, we focus on a singleton attractor, which is also called a fixed point. Using a Boolean network (BN) model, we consider the problem of finding Boolean functions such that the system has desired singleton attractors and has no undesired singleton attractors. To solve this problem, we propose a matrix-based representation of BNs. Using this representation, the problem of finding Boolean functions can be rewritten as an Integer Linear Programming (ILP) problem and a Satisfiability Modulo Theories (SMT) problem. Furthermore, the effectiveness of the proposed method is shown by a numerical example on a WNT5A network, which is related to melanoma. The proposed method provides a basic method for the design of gene regulatory networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 7
    Publication Date: 2014-12-13
    Description: In this paper, we study Copy Number Variation (CNV) data. The underlying process generating CNV segments is generally assumed to be memory-less, giving rise to an exponential distribution of segment lengths. In this paper, we provide evidence from cancer patient data, which suggests that this generative model is too simplistic, and that segment lengths follow a power-law distribution instead. We conjecture a simple preferential attachment generative model that provides the basis for the observed power-law distribution. We then show how an existing statistical method for detecting cancer driver genes can be improved by incorporating the power-law distribution in the null model.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 8
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Proteins fold into complex three-dimensional shapes. Simplified representations of their shapes are central to rationalise, compare, classify, and interpret protein structures. Traditional methods to abstract protein folding patterns rely on representing their standard secondary structural elements (helices and strands of sheet) using line segments. This results in ignoring a significant proportion of structural information. The motivation of this research is to derive mathematically rigorous and biologically meaningful abstractions of protein folding patterns that maximize the economy of structural description and minimize the loss of structural information. We report on a novel method to describe a protein as a non-overlapping set of parametric three dimensional curves of varying length and complexity. Our approach to this problem is supported by information theory and uses the statistical framework of minimum message length (MML) inference. We demonstrate the effectiveness of our non-linear abstraction to support efficient and effective comparison of protein folding patterns on a large scale.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 9
    Publication Date: 2014-12-13
    Description: The organization of global protein interaction networks (PINs) has been extensively studied and heatedly debated. We revisited this issue in the context of the analysis of dynamic organization of a PIN in the yeast cell cycle. Statistically significant bimodality was observed when analyzing the distribution of the differences in expression peak between periodically expressed partners. A close look at their behavior revealed that date and party hubs derived from this analysis have some distinct features. There are no significant differences between them in terms of protein essentiality, expression correlation and semantic similarity derived from gene ontology (GO) biological process hierarchy. However, date hubs exhibit significantly greater values than party hubs in terms of semantic similarity derived from both GO molecular function and cellular component hierarchies. Relating to three-dimensional structures, we found that both single- and multi-interface proteins could become date hubs coordinating multiple functions performed at different times while party hubs are mainly multi-interface proteins. Furthermore, we constructed and analyzed a PPI network specific to the human cell cycle and highlighted that the dynamic organization in human interactome is far more complex than the dichotomy of hubs observed in the yeast cell cycle.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 10
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 11
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 12
    Publication Date: 2014-12-13
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 13
    Publication Date: 2014-12-13
    Description: The articles in this special section were presented at the 2012 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS 2012) that was held in Washington DC from December 2nd to 4th.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 14
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Disk additions to a RAID-6 storage system can increase the I/O parallelism and expand the storage capacity simultaneously. To regain load balance among all disks including old and new, RAID-6 scaling requires moving certain data blocks onto newly added disks. Existing approaches to RAID-6 scaling, restricted by preserving a round-robin data distribution, require migrating all the data, which makes RAID-6 scaling expensive. In this paper, we propose RS6, a new approach to accelerating RDP RAID-6 scaling by reducing disk I/Os and XOR operations. First, RS6 minimizes the number of data blocks to be moved while maintaining a uniform data distribution across all data disks. Second, RS6 piggybacks parity updates during data migration to reduce the cost of maintaining consistent parities. Third, RS6 selects parameters of data migration so as to reduce disk I/Os for parity updates. Our mathematical analysis indicates that RS6 provides uniform data distribution, minimal data migration, and fast data addressing. We also conducted extensive simulation experiments to quantitatively characterize the properties of RS6. The results show that, compared with existing “moving-everything” Round-Robin approaches, RS6 reduces the number of blocks to be moved by 60.0%–88.9%, and reduces the migration time by 40.27%–69.88%.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 15
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: This paper focuses on designing a distributed medium access control algorithm for fairly sharing network resources among contending stations in an 802.11 wireless network. Because the notion of fairness is not universal and a rigorous analysis of the relationships among the four most popular types of fairness criteria is lacking, we first mathematically prove that there exist certain connections between these types of fairness criteria. We then propose an efficient medium access algorithm that aims at achieving time fairness and throughput enhancement in a fully distributed manner. The core idea of our proposed algorithm is that each station needs to select an appropriate contention window size so as to fairly share the channel occupancy time and maximize the throughput under the time fairness constraint. The derivation of the proper contention window size is addressed rigorously. We evaluate the performance of our proposed algorithm through an extensive simulation study, and the evaluation results demonstrate that our proposed algorithm leads to nearly perfect time fairness, high throughput, and low collision overhead.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 16
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: This paper investigates the limits of adaptive voltage scaling (AVS) applied to commercial FPGAs that do not specifically support voltage adaptation. An adaptive power architecture based on a modified design flow is created with in-situ detectors and dynamic reconfiguration of clock management resources. AVS is a power-saving technique that enables a device to regulate its own voltage and frequency based on workload, process and operating conditions in a closed-loop configuration. It results in significantly improved energy profiles compared with dynamic voltage frequency scaling (DVFS), in which the device uses a number of pre-calculated valid working points. The results of deploying AVS in FPGAs with in-situ detectors show power and energy savings exceeding 85 percent compared with nominal voltage operation at the same frequency. The in-situ detector approach compares favorably with critical path replication based on delay lines since it avoids the need for cumbersome and error-prone delay line calibration.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 17
    Publication Date: 2014-11-07
    Description: Motivation: Mapping of high-throughput sequencing data and other bulk sequence comparison applications have motivated a search for high-efficiency sequence alignment algorithms. The bit-parallel approach represents individual cells in an alignment scoring matrix as bits in computer words and emulates the calculation of scores by a series of logic operations composed of AND, OR, XOR, complement, shift and addition. Bit-parallelism has been successfully applied to the longest common subsequence (LCS) and edit-distance problems, producing fast algorithms in practice. Results: We have developed BitPAl, a bit-parallel algorithm for general, integer-scoring global alignment. Integer-scoring schemes assign integer weights for match, mismatch and insertion/deletion. The BitPAl method uses structural properties in the relationship between adjacent scores in the scoring matrix to construct classes of efficient algorithms, each designed for a particular set of weights. In timed tests, we show that BitPAl runs 7–25 times faster than a standard iterative algorithm. Availability and implementation: Source code is freely available for download at http://lobstah.bu.edu/BitPAl/BitPAl.html . BitPAl is implemented in C and runs on all major operating systems. Contact: jloving@bu.edu or yhernand@bu.edu or gbenson@bu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 18
    Publication Date: 2014-11-07
    Description: Next-generation sequencing (NGS) has a large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, albeit being highly accurate, these computational models lack the computational efficiency to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model, named gCUP, parallelized and optimized for GPU, is highly accurate and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step to enable NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction, but can also be easily adapted to other settings, e.g. drug resistance prediction. Availability and implementation: The source code can be downloaded at http://www.heiderlab.de Contact: d.heider@wz-straubing.de
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 19
    Oxford University Press
    Publication Date: 2014-11-07
    Description: We present a new method to incrementally construct the FM-index for both short and long sequence reads, up to the size of a genome. It is the first algorithm that can build the index while implicitly sorting the sequences in the reverse (complement) lexicographical order without a separate sorting step. The implementation is among the fastest for indexing short reads and the only one that practically works for reads averaging kilobases in length. Availability and implementation: https://github.com/lh3/ropebwt2 Contact: hengli@broadinstitute.org
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 20
    Publication Date: 2014-11-07
    Description: AliView is an alignment viewer and editor designed to meet the requirements of next-generation sequencing era phylogenetic datasets. AliView handles alignments of unlimited size in the formats most commonly used, i.e. FASTA, Phylip, Nexus, Clustal and MSF. The intuitive graphical interface makes it easy to inspect, sort, delete, merge and realign sequences as part of the manual filtering process of large datasets. AliView also works as an easy-to-use alignment editor for small as well as large datasets. Availability and implementation: AliView is released as open-source software under the GNU General Public License, version 3.0 (GPLv3), and is available at GitHub (www.github.com/AliView). The program is cross-platform and extensively tested on Linux, Mac OS X and Windows systems. Downloads and help are available at http://ormbunkar.se/aliview Contact: anders.larsson@ebc.uu.se Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 21
    Publication Date: 2014-11-07
    Description: Motivation: The ability to accurately read the order of nucleotides in DNA and RNA is fundamental for modern biology. Errors in next-generation sequencing can lead to many artifacts, from erroneous genome assemblies to mistaken inferences about RNA editing. Uneven coverage in datasets also contributes to false corrections. Result: We introduce Trowel, a massively parallelized and highly efficient error correction module for Illumina read data. Trowel both corrects erroneous base calls and boosts base qualities based on the k-mer spectrum. With high-quality k-mers and relevant base information, Trowel achieves high accuracy for different short read sequencing applications. The latency in the data path has been significantly reduced because of efficient data access and data structures. In performance evaluations, Trowel was highly competitive with other tools regardless of coverage, genome size, read length and fragment size. Availability and implementation: Trowel is written in C++ and is provided under the General Public License v3.0 (GPLv3). It is available at http://trowel-ec.sourceforge.net . Contact: euncheon.lim@tue.mpg.de or weigel@tue.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 22
    Publication Date: 2014-11-07
    Description: The application of protein–protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of >97% strong scaling. Availability and Implementation: MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: http://www.bi.cs.titech.ac.jp/megadock . Contact: akiyama@cs.titech.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 23
    Publication Date: 2014-11-07
    Description: Motivation: The identification of active transcriptional regulatory elements is crucial to understand regulatory networks driving cellular processes such as cell development and the onset of diseases. It has recently been shown that chromatin structure information, such as DNase I hypersensitivity (DHS) or histone modifications, significantly improves cell-specific predictions of transcription factor binding sites. However, no method has so far successfully combined both DHS and histone modification data to perform active binding site prediction. Results: We propose here a method based on hidden Markov models to integrate DHS and histone modifications occupancy for the detection of open chromatin regions and active binding sites. We have created a framework that includes treatment of genomic signals, model training and genome-wide application. In a comparative analysis, our method obtained a good trade-off between sensitivity versus specificity and superior area under the curve statistics than competing methods. Moreover, our technique does not require further training or sequence information to generate binding location predictions. Therefore, the method can be easily applied on new cell types and allow flexible downstream analysis such as de novo motif finding. Availability and implementation: Our framework is available as part of the Regulatory Genomics Toolbox. The software information and all benchmarking data are available at http://costalab.org/wp/dh-hmm . Contact: ivan.costa@rwth-aachen.de or eduardo.gusmao@rwth-aachen.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 24
    Publication Date: 2014-11-07
    Description: Motivation: A proper target or marker is essential in any diagnosis (e.g. an infection or cancer). An ideal diagnostic target should be both conserved in and unique to the pathogen. Currently, these targets can only be identified manually, which is time-consuming and usually error-prone. Because of the increasingly frequent occurrences of emerging epidemics and multidrug-resistant ‘superbugs’, a rapid diagnostic target identification process is needed. Results: A new method that can identify uniquely conserved regions (UCRs) as candidate diagnostic targets for a selected group of organisms solely from their genomic sequences has been developed and successfully tested. Using a sequence-indexing algorithm to identify UCRs and a k -mer integer-mapping model for computational efficiency, this method has successfully identified UCRs within the bacteria domain for 15 test groups, including pathogenic, probiotic, commensal and extremophilic bacterial species or strains. Based on the identified UCRs, new diagnostic primer sets were designed, and their specificity and efficiency were tested by polymerase chain reaction amplifications from both pure isolates and samples containing mixed cultures. Availability and implementation: The UCRs identified for the 15 bacterial species are now freely available at http://ucr.synblex.com . The source code of the programs used in this study is accessible at http://ucr.synblex.com/bacterialIdSourceCode.d.zip Contact: yazhousun@synblex.com Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
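    The k-mer integer-mapping idea mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the actual mapping or the sequence-indexing algorithm, so the 2-bit encoding and the function names `kmer_to_int`, `kmer_set` and `unique_conserved_kmers` below are assumptions made for illustration only.

```python
# Hypothetical 2-bit encoding; the paper's real mapping is not stated in the abstract.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_to_int(kmer):
    """Pack a k-mer into a single integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | BASE_CODE[base]
    return value


def kmer_set(sequence, k):
    """Return the integer codes of all k-mers occurring in a sequence."""
    return {kmer_to_int(sequence[i:i + k]) for i in range(len(sequence) - k + 1)}


def unique_conserved_kmers(targets, others, k):
    """k-mers present in every target sequence and absent from all other sequences."""
    conserved = set.intersection(*(kmer_set(s, k) for s in targets))
    for s in others:
        conserved -= kmer_set(s, k)
    return conserved
```

    Storing k-mers as integers lets set intersection and difference work on small fixed-size keys rather than strings, which is one plausible reading of the "computational efficiency" the abstract refers to.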
  • 25
    Publication Date: 2014-11-07
    Description: Motivation: A popular method for classification of protein domain movements apportions them into two main types: those with a ‘hinge’ mechanism and those with a ‘shear’ mechanism. The intuitive assignment of domain movements to these classes has limited the number of domain movements that can be classified in this way. Furthermore, whether intended or not, the term ‘shear’ is often interpreted to mean a relative translation of the domains. Results: Numbers of occurrences of four different types of residue contact changes between domains were optimally combined by logistic regression using the training set of domain movements intuitively classified as hinge and shear to produce a predictor for hinge and shear. This predictor was applied to give a 10-fold increase in the number of examples over the number previously available with a high degree of precision. It is shown that overall a relative translation of domains is rare, and that there is no difference between hinge and shear mechanisms in this respect. However, the shear set contains significantly more examples of domains having a relative twisting movement than the hinge set. The angle of rotation is also shown to be a good discriminator between the two mechanisms. Availability and implementation: Results are free to browse at http://www.cmp.uea.ac.uk/dyndom/interface/ . Contact: sjh@cmp.uea.ac.uk . Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 26
    Publication Date: 2014-11-07
    Description: Motivation: Recent studies on human disease have revealed that aberrant interaction between proteins probably underlies a substantial number of human genetic diseases. This suggests a need to investigate disease inheritance mode using protein interaction and, on that basis, to refresh our conceptual understanding of a series of properties of the inheritance mode of human disease. Results: We observed a strong correlation between the number of protein interactions and the likelihood of a gene causing any dominant diseases or multiple dominant diseases, whereas no correlation was observed between protein interaction and the likelihood of a gene causing recessive diseases. We found that dominant diseases are more likely to be associated with disruption of important interactions. These findings suggest that inheritance mode should be understood using protein interaction. We therefore reviewed the previous studies and refined an interaction model of inheritance mode, and then confirmed that this model is largely reasonable using new evidence. With these findings, we found that the inheritance mode of human genetic diseases can be predicted using protein interaction. By integrating the systems biology perspectives with the classical disease genetics paradigm, our study provides some new insights into genotype–phenotype correlations. Contact: haodapeng@ems.hrbmu.edu.cn or biofomeng@hotmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 27
    Publication Date: 2014-11-07
    Description: Recently, several high profile studies collected cell viability data from panels of cancer cell lines treated with many drugs applied at different concentrations. Such drug sensitivity data for cancer cell lines provide suggestive treatments for different types and subtypes of cancer. Visualization of these datasets can reveal patterns that may not be obvious by examining the data without such efforts. Here we introduce Drug/Cell-line Browser (DCB), an online interactive HTML5 data visualization tool for interacting with three of the recently published datasets of cancer cell lines/drug-viability studies. DCB uses clustering and canvas visualization of the drugs and the cell lines, as well as a bar graph that summarizes drug effectiveness for the tissue of origin or the cancer subtypes for single or multiple drugs. DCB can help in understanding drug response patterns and prioritizing drug/cancer cell line interactions by tissue of origin or cancer subtype. Availability and implementation: DCB is an open source Web-based tool that is freely available at: http://www.maayanlab.net/LINCS/DCB Contact: avi.maayan@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 28
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: In microprocessor-based systems, such as the cloud computing infrastructure, high reliability is essential. As multiprocessor systems become more widespread and increasingly complex, system-level diagnosis will increasingly be adopted to determine their robustness. In this paper, we consider a pessimistic diagnostic strategy for hypermesh multiprocessor systems under the PMC model. The pessimistic strategy is a diagnostic process whereby all faulty processors are correctly identified and at most one fault-free processor may be misjudged to be a faulty processor. We first determine the pessimistic diagnosability of a hypermesh to be $2n(k-1)-k$. We then propose an efficient pessimistic diagnostic algorithm to identify at most $2n(k-1)-k$ faults in $O(N)$ time, where $k$ is the radix, $n$ is the number of dimensions, and $N = k^n$ is the total number of processors. This result is superior to the best precise diagnostic algorithm, which runs in $O(N \log_k N)$ time. Furthermore, because the Cartesian product network is a subgraph of the hypermesh, the proposed algorithm can also be employed to determine faults in the product network.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 29
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: In a top-$k$ Geometric Intersection Query (top-$k$ GIQ) problem, a set of $n$ weighted, geometric objects in $\mathbb{R}^d$ is to be pre-processed into a compact data structure so that for any query geometric object, $q$, and integer $k>0$, the $k$ largest-weight objects intersected by $q$ can be reported efficiently. While the top-$k$ problem has been studied extensively for non-geometric problems (e.g., recommender systems), the geometric version has received little attention. This paper gives a general technique to solve any top-
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 30
    Publication Date: 2014-11-08
    Description: Rule induction based on rough set theory (RST) has received much attention recently since it may generate a minimal set of rules from the decision system for real-life applications by using attribute reduction and approximations. The decision system may vary with time, e.g., the variation of objects, attributes and attribute values. The reduction and approximations of the decision system may change under Attribute Values’ Coarsening and Refining (AVCR), a kind of variation of attribute values, which simultaneously alters the decision rules. This paper aims for dynamic maintenance of decision rules w.r.t. AVCR. The definition of the minimal discernibility attribute set is first proposed, which aims to improve the efficiency of attribute reduction in RST. Then, principles of updating decision rules in case of AVCR are discussed. Furthermore, the rough set-based methods for updating decision rules in the inconsistent decision system are proposed. The complexity analysis and extensive experiments on UCI data sets have verified the effectiveness and efficiency of the proposed methods.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 31
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: A major mining task for binary matrices is the extraction of approximate top-$k$ patterns that are able to concisely describe the input data. The top-$k$ pattern discovery problem is commonly stated as an optimization one, where the goal is to minimize a given cost function, e.g., the accuracy of the data description. In this work, we review several greedy algorithms, and discuss PaNDa+, an algorithmic framework able to optimize different cost functions generalized into a unifying formulation. We evaluated the goodness of the algorithm by measuring the quality of the extracted patterns. We adapted standard quality measures to assess the capability of the algorithm to discover both the items and transactions of the patterns embedded in the data. The evaluation was conducted on synthetic data, where patterns were artificially embedded, and on a real-world text collection, where each document is labeled with a topic. Finally, in order to qualitatively evaluate the usefulness of the discovered patterns, we exploited PaNDa+ to detect overlapping communities in a bipartite network. The results show that PaNDa+ is able to discover high-quality patterns in both synthetic and real-world datasets.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 32
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: Mining network evolution has emerged as an intriguing research topic in many domains such as data mining, social networks, and machine learning. While the bulk of research has focused on mining evolutionary patterns of homogeneous networks (e.g., networks of friends), most real-world networks are heterogeneous, containing objects of different types, such as authors, papers, venues, and terms in a bibliographic network. Modeling co-evolution of multityped objects can capture richer information than that on single-typed objects alone. For example, studying co-evolution of authors, venues, and terms in a bibliographic network can reveal the evolution of research areas better than examining the co-author network or the term network alone. In this paper, we study mining co-evolution of multityped objects in a special type of heterogeneous network, called star networks, and examine how the multityped objects influence each other in the network evolution. A hierarchical Dirichlet process mixture model-based evolution model is proposed, which detects the co-evolution of multityped objects in the form of multityped cluster evolution in dynamic star networks. An efficient inference algorithm is provided to learn the proposed model. Experiments on several real networks (DBLP, Twitter, and Delicious) validate the effectiveness of the model and the scalability of the algorithm.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 33
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: In this work, we define cost-free learning (CFL) formally in comparison with cost-sensitive learning (CSL). The main difference between them is that a CFL approach seeks optimal classification results without requiring any cost information, even in the class imbalance problem. In fact, several CFL approaches exist in the related studies, such as sampling and some criteria-based approaches. However, to the best of our knowledge, none of the existing CFL and CSL approaches are able to process the abstaining classifications properly when no information is given about errors and rejects. Based on information theory, we propose a novel CFL which seeks to maximize normalized mutual information of the targets and the decision outputs of classifiers. Using the strategy, we can handle binary/multi-class classifications with/without abstaining. Significant features are observed from the new strategy. As the degree of class imbalance changes, the proposed strategy is able to balance the errors and rejects accordingly and automatically. Another advantage of the strategy is its ability to derive optimal rejection thresholds for abstaining classifications and the “equivalent” costs in binary classifications. The connection between rejection thresholds and ROC curve is explored. Empirical investigation is made on several benchmark data sets in comparison with other existing approaches. The classification results demonstrate a promising perspective of the strategy in machine learning.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
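    To make the normalized mutual information criterion in the abstract above concrete, here is a minimal sketch that scores a classifier from its confusion matrix. The abstract does not state the normalization, so dividing the mutual information I(T; Y) by the target entropy H(T) is an assumption, as is the function name `normalized_mutual_information`. An extra column can hold rejected (abstained) samples.

```python
import math


def normalized_mutual_information(confusion):
    """Return I(T; Y) / H(T) for a confusion matrix (rows: targets, columns: outputs).

    Assumption: normalization by the target entropy H(T). The matrix needs at
    least two non-empty target classes, otherwise H(T) is zero.
    """
    total = sum(sum(row) for row in confusion)
    n_cols = len(confusion[0])
    row_p = [sum(row) / total for row in confusion]
    col_p = [sum(row[j] for row in confusion) / total for j in range(n_cols)]

    mutual_info = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count:  # empty cells contribute nothing
                p = count / total
                mutual_info += p * math.log2(p / (row_p[i] * col_p[j]))

    target_entropy = -sum(p * math.log2(p) for p in row_p if p > 0)
    return mutual_info / target_entropy
```

    Under this definition a perfect classifier scores 1 and an output that is independent of the target scores 0, with no cost terms required.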
  • 34
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: Multivariate time series are common in many application domains, particularly in industrial processes with a large number of sensors installed for process monitoring and control. Often, such data encapsulate complex relations among individual series. This paper presents a new type of pattern in multivariate time series, referred to as temporal associations, to capture a wide range of local relations along and across individual series. A scalable algorithm is developed to discover frequent associations by incorporating (1) redundancy pruning of patterns in single time series and (2) two conditions to avoid over-counting the occurrences of associations, thus greatly reducing the space and runtime complexity of the discovery process. A statistical significance measure is also introduced for ranking and post-pruning discovered associations. To evaluate the proposed method, synthetic data sets and a real-world data set taken from the time series mining repository, as well as a large data set obtained from a delayed coking plant, are used. The experiments demonstrated that the discovered associations capture the local relations in multiple time series and that the proposed method is scalable to large data sets.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 35
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: Short texts are popular on today’s web, especially with the emergence of social media. Inferring topics from large scale short texts becomes a critical but challenging task for many content analysis tasks. Conventional topic models such as latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA) learn topics from document-level word co-occurrences by modeling each document as a mixture of topics, whose inference suffers from the sparsity of word co-occurrence patterns in short texts. In this paper, we propose a novel way for short text topic modeling, referred to as the biterm topic model (BTM). BTM learns topics by directly modeling the generation of word co-occurrence patterns (i.e., biterms) in the corpus, making the inference effective with the rich corpus-level information. To cope with large scale short text data, we further introduce two online algorithms for BTM for efficient topic learning. Experiments on real-world short text collections show that BTM can discover more prominent and coherent topics, and significantly outperforms the state-of-the-art baselines. We also demonstrate the appealing performance of the two online BTM algorithms on both time efficiency and topic learning.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
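    The biterms described in the abstract above are word co-occurrence pairs pooled over the whole corpus. The sketch below shows only that extraction step, not the topic inference. It assumes each short text is treated as a single context window and that a biterm is an unordered pair; `extract_biterms` and `corpus_biterm_counts` are illustrative names, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair that co-occurs in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs):
    """Pool biterm counts over all documents (corpus-level co-occurrence)."""
    counts = Counter()
    for doc in docs:
        counts.update(extract_biterms(doc.lower().split()))
    return counts


if __name__ == "__main__":
    docs = ["apple store opens", "apple store"]
    print(corpus_biterm_counts(docs).most_common(1))
```

    Pooling the pairs across documents is what gives the model corpus-level statistics even when each individual text contains only a handful of words.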
  • 36
    Publication Date: 2014-11-08
    Description: In the literature about association analysis, many interestingness measures have been proposed to assess the quality of obtained association rules in order to select a small set of the most interesting among them. In the particular case of hierarchically organized items and generalized association rules connecting them, a measure that dealt appropriately with the hierarchy would be advantageous. Here we present the further developments of a new class of such hierarchical interestingness measures and compare them with a large set of conventional measures and with three hierarchical pruning methods from the literature. The aim is to find interesting pairwise generalized association rules connecting the concepts of multiple ontologies. Interested in the broad empirical evaluation of interestingness measures, we compared the rules obtained by 37 methods on four real world data sets against predefined ground truth sets of associations. To this end, we adopted a framework of instance-based ontology matching and extended the set of performance measures by two novel measures: relation learning recall and precision which take into account hierarchical relationships.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 37
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: A highly comparative, feature-based approach to time series classification is introduced that uses an extensive database of algorithms to extract thousands of interpretable features from time series. These features are derived from across the scientific time-series analysis literature, and include summaries of time series in terms of their correlation structure, distribution, entropy, stationarity, scaling properties, and fits to a range of time-series models. After computing thousands of features for each time series in a training set, those that are most informative of the class structure are selected using greedy forward feature selection with a linear classifier. The resulting feature-based classifiers automatically learn the differences between classes using a reduced number of time-series properties, and circumvent the need to calculate distances between time series. Representing time series in this way results in orders of magnitude of dimensionality reduction, allowing the method to perform well on very large data sets containing long time series or time series of different lengths. For many of the data sets studied, classification performance exceeded that of conventional instance-based classifiers, including one-nearest-neighbor classifiers using Euclidean distances and dynamic time warping. Most importantly, the features selected provide an understanding of the properties of the data set, insight that can guide further scientific investigation.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 38
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: The explosive usage of social media produces massive amounts of unlabeled and high-dimensional data. Feature selection has been proven to be effective in dealing with high-dimensional data for efficient learning and data mining. Unsupervised feature selection remains a challenging task due to the absence of label information based on which feature relevance is often assessed. The unique characteristics of social media data further complicate the already challenging problem of unsupervised feature selection, e.g., social media data is inherently linked, which invalidates the independent and identically distributed assumption and brings new challenges to unsupervised feature selection algorithms. In this paper, we investigate a novel problem of feature selection for social media data in an unsupervised scenario. In particular, we analyze the differences between social media data and traditional attribute-value data, investigate how the relations extracted from linked data can be exploited to help select relevant features, and propose a novel unsupervised feature selection framework, LUFS, for linked social media data. We design and conduct systematic experiments to evaluate the proposed framework on data sets from real-world social media websites. The empirical study demonstrates the effectiveness and potential of our proposed framework.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 39
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: The discovery of process models from event logs has emerged as one of the crucial problems for enabling the continuous support in the life-cycle of an information system. However, in a decade of process discovery research, the algorithms and tools that have appeared are known to have strong limitations in several dimensions. The size of the logs and the formal properties of the model discovered are the two main challenges nowadays. In this paper we propose the use of numerical abstract domains for tackling these two problems, for the particular case of the discovery of Petri nets. First, numerical abstract domains enable the discovery of general process models, requiring no knowledge (e.g., the bound of the Petri net to derive) for the discovery algorithm. Second, by using divide and conquer techniques we are able to control the size of the process discovery problems. The methods proposed in this paper have been implemented in a prototype tool and experiments are reported illustrating the significance of this fresh view of the process discovery problem.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 40
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: Given a real world graph, how should we lay out its edges? How can we compress it? These questions are closely related, and the typical approach so far is to find clique-like communities, like the ‘cavemen graph’, and compress them. We show that the block-diagonal mental image of the ‘cavemen graph’ is the wrong paradigm, in full agreement with earlier results that real world graphs have no good cuts. Instead, we propose to envision graphs as a collection of hubs connecting spokes, with super-hubs connecting the hubs, and so on, recursively. Based on the idea, we propose the SlashBurn method to recursively split a graph into hubs and spokes connected only by the hubs. We also propose techniques to select the hubs and give an ordering to the spokes, in addition to the basic SlashBurn. We give theoretical analysis of the proposed hub selection methods. Our viewpoint has several advantages: (a) it avoids the ‘no good cuts’ problem, (b) it gives better compression, and (c) it leads to faster execution times for matrix-vector operations, which are the backbone of most graph processing tools. Through experiments, we show that SlashBurn consistently outperforms other methods for all data sets, resulting in better compression and faster running time. Moreover, we show that SlashBurn with the appropriate spokes ordering can further improve compression while hardly sacrificing the running time.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 41
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: In this paper, we introduce “task trail” to understand user search behaviors. We define a task to be an atomic user information need, whereas a task trail represents all user activities within that particular task, such as query reformulations and URL clicks. Previously, web search logs have been studied mainly at session or query level where users may submit several queries within one task and handle several tasks within one session. Although previous studies have addressed the problem of task identification, little is known about the advantage of using task over session or query for search applications. In this paper, we conduct extensive analyses and comparisons to evaluate the effectiveness of task trails in several search applications: determining user satisfaction, predicting user search interests, and suggesting related queries. Experiments on large scale data sets of a commercial search engine show that: (1) Task trail performs better than session and query trails in determining user satisfaction; (2) Task trail increases webpage utilities of end users compared to session and query trails; (3) Task trails are comparable to query trails but more sensitive than session trails in measuring different ranking functions; (4) Query terms from the same task are more topically consistent with each other than query terms from different tasks; (5) Query suggestion based on task trail is a good complement to query suggestions based on session trail and click-through bipartite. The findings in this paper verify the need to extract task trails from web search logs and enhance applications in search and recommendation systems.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 42
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: This paper studies the problem of mining named entity translations by aligning comparable corpora. Current state-of-the-art approaches mine a translation pair by aligning an entity graph in one language to another based on node similarity or propagated similarity of related entities. However, they, building on the assumption of “symmetry”, quickly deteriorate on “weakly” comparable corpora with some asymmetry. In this paper, we pursue two directions for overcoming relation and entity asymmetry respectively. The first approach starts from weakly comparable corpora (for high recall) then ensures precision by selective propagation only to entities of symmetric relations. The second approach starts from parallel corpora (for high precision) then enhances recall by extending the translation matrix based on node similarity and contextual similarity. Our experimental results on English-Chinese corpora show that both approaches are effective and complementary. Our combined approach outperforms the best-performing baseline in terms of F1-score by up to 0.28.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 43
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: The knowledge remembered by the human body and reflected by the dexterity of body motion is called embodied knowledge. In this paper, we propose a new method using singular value decomposition for extracting embodied knowledge from the time-series data of the motion. We compose a matrix from the time-series data and use the left singular vectors of the matrix as the patterns of the motion and the singular values as a scalar, by which each corresponding left singular vector affects the matrix. Two experiments were conducted to validate the method. One is a gesture recognition experiment in which we categorize gesture motions by two kinds of models with indexes of similarity and estimation that use left singular vectors. The proposed method obtained a higher correct categorization ratio than principal component analysis (PCA) and correlation efficiency (CE). The other is an ambulation evaluation experiment in which we distinguished the levels of walking disability. The first singular values derived from the walking acceleration were suggested to be a reliable criterion to evaluate walking disability. Finally we discuss the characteristic and significance of the embodied knowledge extraction using the singular value decomposition proposed in this paper.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 44
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: Edit distance is widely used for measuring the similarity between two strings. As a primitive operation, edit distance based string similarity search is to find strings in a collection that are similar to a given query string using edit distance. Existing approaches for answering such string similarity queries follow the filter-and-verify framework by using various indexes. Typically, most approaches assume that indexes and data sets are maintained in main memory. To overcome this limitation, in this paper, we propose B+-tree based approaches to answer edit distance based string similarity queries, and hence, our approaches can be easily integrated into existing RDBMSs. In general, we answer string similarity search using pruning techniques employed in the metric space in that edit distance is a metric. First, we split the string collection into partitions according to a set of reference strings. Then, we index strings in all partitions using a single B+-tree based on the distances of these strings to their corresponding reference strings. Finally, we propose two approaches to efficiently answer range and KNN queries, respectively, based on the B+-tree. We prove that the optimal partitioning of the data set is an NP-hard problem, and therefore propose a heuristic approach for selecting the reference strings greedily and present an optimal partition assignment strategy to minimize the expected number of strings that need to be verified during the query evaluation. Through extensive experiments over a variety of real data sets, we demonstrate that our B+-tree based approaches provide superior performance over state-of-the-art techniques on both range and KNN queries in most cases.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
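The metric-space pruning this abstract relies on can be sketched briefly. Because edit distance satisfies the triangle inequality, a string s assigned to reference r can match a query q within threshold tau only if |d(q, r) - d(s, r)| <= tau. The sketch below is a minimal illustration under stated assumptions: a sorted in-memory list stands in for the B+-tree, every string is simply assigned to its nearest reference, and the function names are hypothetical. It does not reproduce the paper's partitioning heuristic or its KNN algorithm.

```python
import bisect


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def build_index(strings, references):
    """Assign each string to its nearest reference, keyed by that distance."""
    partitions = {r: [] for r in references}
    for s in strings:
        dist, ref = min((edit_distance(s, r), r) for r in references)
        partitions[ref].append((dist, s))
    for entries in partitions.values():
        entries.sort()  # sorted keys mimic the ordered leaves of a B+-tree
    return partitions


def range_query(partitions, query, tau):
    """Return (sorted matches, number of strings that had to be verified)."""
    results, verified = [], 0
    for ref, entries in partitions.items():
        d_q = edit_distance(query, ref)
        keys = [d for d, _ in entries]
        # Triangle inequality: only keys in [d_q - tau, d_q + tau] can match.
        lo = bisect.bisect_left(keys, d_q - tau)
        hi = bisect.bisect_right(keys, d_q + tau)
        for _, s in entries[lo:hi]:
            verified += 1
            if edit_distance(query, s) <= tau:
                results.append(s)
    return sorted(results), verified
```

The key range scan is the filter step and the final `edit_distance` call is the verify step of the filter-and-verify framework.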
  • 45
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-11-08
    Description: A top-k query retrieves the best k tuples by assigning scores for each tuple in a target relation with respect to a user-specific scoring function. This paper studies the problem of constructing an indexing structure for supporting top-k queries over varying scoring functions and retrieval sizes. The existing research efforts can be categorized into three approaches: list-, layer-, and view-based approaches. In this paper, we mainly focus on the layer-based approach that pre-materializes tuples into consecutive multiple layers. We first propose a dual-resolution layer that consists of coarse-level and fine-level layers. Specifically, we build coarse-level layers using skylines, and divide each coarse-level layer into fine-level sublayers using convex skylines. To make our proposed dual-resolution layer scalable, we then address the following optimization directions: 1) index construction; 2) disk-based storage scheme; 3) the design of the virtual layer; and 4) index maintenance for tuple updates. Our evaluation results show that our proposed method is more scalable than the state-of-the-art methods.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
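The layer-based idea in this abstract can be illustrated with a small sketch of the coarse-level layers only. It assumes that smaller attribute values are better and that the scoring function is a weighted sum with non-negative weights. Under those assumptions every tuple in layer j is dominated by one tuple from each earlier layer, so the k best scores always come from the first k skyline layers. The function names are hypothetical, and the fine-level convex-skyline sublayers, disk layout, and maintenance schemes of the paper are not modelled.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_layers(points):
    """Peel off successive skylines: layer 1, then the skyline of the rest, ..."""
    remaining = list(points)
    layers = []
    while remaining:
        layer = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers


def top_k(layers, weights, k):
    """Answer a top-k query by scoring only tuples in the first k layers."""
    def score(p):
        return sum(w * x for w, x in zip(weights, p))

    candidates = [p for layer in layers[:k] for p in layer]
    return sorted(candidates, key=score)[:k]
```

The layers are built once and reused for any monotone weighted sum, which is the point of pre-materializing them.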
  • 46
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 47
    White Horse Press
    Publication Date: 2014-11-05
    Description: This article is currently available as a free download on ingentaconnect
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 48
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 49
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 50
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 51
    White Horse Press
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 52
    Publication Date: 2014-11-05
    Print ISSN: 0963-2719
    Electronic ISSN: 1752-7015
    Topics: Energy, Environment Protection, Nuclear Power Engineering , Philosophy
    Published by White Horse Press
  • 53
    Publication Date: 2014-11-05
    Description: The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, e.g., celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method to build Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation that uses semantic information, such as entities, topics and events, complemented with information from the Social Web to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease the access and allow retrieval based on conditions that involve high-level concepts.
    Electronic ISSN: 1999-5903
    Topics: Computer Science
    Published by MDPI Publishing
  • 54
    Publication Date: 2014-12-13
    Description: Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and epigenetic characteristics illuminate the roles their complex relationships play in disease progression and outcomes. However, integrative methods for diverse genomics data are faced with the challenges of ultra-high dimensionality and the existence of complex interactions both within and between platforms. We propose a novel modeling framework for integrative analysis based on decompositions of the large number of platform-specific features into a smaller number of latent features. Subsequently we build a predictive model for clinical outcomes accounting for both within- and between-platform interactions based on Bayesian model averaging procedures. Principal components, partial least squares and non-negative matrix factorization as well as sparse counterparts of each are used to define the latent features, and the performance of these decompositions is compared both on real and simulated data. The latent feature interactions are shown to preserve interactions between the original features and not only aid prediction but also allow explicit selection of outcome-related features. The methods are motivated by and applied to a glioblastoma multiforme data set from The Cancer Genome Atlas to predict patient survival times integrating gene expression, microRNA, copy number and methylation data. For the glioblastoma data, we find a high concordance between our selected prognostic genes and genes with known associations with glioblastoma. In addition, our model discovers several relevant cross-platform interactions such as copy number variation associated gene dosing and epigenetic regulation through promoter methylation. 
On simulated data, we show that our proposed method successfully incorporates interactions within and between genomic platforms to aid accurate prediction and variable selection. Our methods perform best when principal components are used to define the latent features.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
  • 55
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: In Cyber-Physical Networked Systems (CPNS), the adversary can inject false measurements into the controller through compromised sensor nodes, which not only threaten the security of the system, but also consume network resources. To deal with this issue, a number of en-route filtering schemes have been designed for wireless sensor networks. However, these schemes either lack resilience to the number of compromised nodes or depend on the statically configured routes and node localization, which are not suitable for CPNS. In this paper, we propose a Polynomial-based Compromise-Resilient En-route Filtering scheme (PCREF), which can filter false injected data effectively and achieve a high resilience to the number of compromised nodes without relying on static routes and node localization. PCREF adopts polynomials instead of Message Authentication Codes (MACs) for endorsing measurement reports to achieve resilience to attacks. Each node stores two types of polynomials: authentication polynomial and check polynomial, derived from the primitive polynomial, and used for endorsing and verifying the measurement reports. Through extensive theoretical analysis and experiments, our data shows that PCREF achieves better filtering capacity and resilience to the large number of compromised nodes in comparison to the existing schemes.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 56
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: The Resistive Random Access Memory (RRAM) is a new type of non-volatile memory based on the resistive memory device. Researchers are currently moving from resistive device development to memory circuit design and implementation, hoping to fabricate memory chips that can be deployed in the market in the near future. However, so far the low manufacturing yield is still a major issue. In this paper, we propose defect and fault models specific to RRAM, i.e., the Over-Forming (OF) defect and the Read-One-Disturb (R1D) fault. We then propose a March algorithm to cover these defects and faults in addition to the conventional RAM faults, which is called March C*. We also develop a novel squeeze-search scheme to identify the OF defect, which leads to the Stuck-At Fault (SAF). The proposed test algorithm is applied to a first-cut 4-Mb HfO2-based RRAM test chip. Results show that OF defects and R1D faults do exist in the RRAM chip. We also identify specific failure patterns from the test results, which are shown to be induced by multiple short defects between bit-lines. By identifying the defects and faults, designers and process engineers can improve the RRAM yield in a more cost-effective way.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 57
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: In response to the increasing ubiquity of multicore processors, applications are usually designed or deployed to make each core busy. Unfortunately, lock contention within operating systems can limit the scalability of multicore systems so severely that an increase in the number of cores can actually lead to reduced performance (i.e., scalability collapse). Existing lock implementations have disadvantages in scalability, power consumption, and energy efficiency. In this paper, we observe that the number of tasks requesting a lock has a significant correlation with the occurrence of scalability collapse. Based on this observation, a lock implementation that allows tasks waiting for a lock to either spin or enter a power-saving state based on the number of requesters is proposed. Our lock protocol is called requester-based lock and is implemented in the Linux kernel to replace its default spin lock. Based on the results of a sensitivity analysis, we find that the best policy, in practice, for a task waiting for a lock to be granted is to enter the power-saving state immediately after noticing the lock cannot be acquired. Our requester-based lock scheme is evaluated using intensive benchmarking on AMD 32-core and Intel 40-core systems. Experimental results suggest that our lock avoids scalability collapse completely for most applications and shows better scalability, power consumption, and energy efficiency than previous work. In addition, the requester-based lock is extensible: using it together with other kinds of spin locks can provide better scalability and energy efficiency.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 58
    Publication Date: 2014-12-13
    Description: Functionally equivalent web services can be composed to form more reliable service-oriented systems. However, the choice of fault tolerance strategy can have a significant effect on the quality-of-service (QoS) of the resulting service-oriented systems. In this paper, we investigate the problem of selecting an optimal fault tolerance strategy for building reliable service-oriented systems. We formulate the user requirements as local and global constraints and model the selection of fault tolerance strategy as an optimization problem. A heuristic algorithm is proposed to efficiently solve the optimization problem. Fault tolerance strategy selection for semantically related tasks is also investigated in this paper. Large-scale real-world experiments are conducted to illustrate the benefits of the proposed approach. The experimental results show that our problem modeling approach and the proposed selection algorithm make it feasible to manage the fault tolerance of complex service-oriented systems both efficiently and effectively.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 59
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: This paper describes an end-to-end system implementation of a transactional memory (TM) programming model on top of the hardware transactional memory (HTM) of the Blue Gene/Q machine. The TM programming model supports most C/C++ programming constructs using a best-effort HTM and the help of a complete software stack including the compiler, the kernel, and the TM runtime. An extensive evaluation of the STAMP and the RMS-TM benchmark suites on BG/Q is the first of its kind in understanding characteristics of running TM workloads on real hardware TM. The study reveals several interesting insights on the overhead and the scalability of BG/Q HTM with respect to sequential execution, coarse-grain locking, and software TM.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 60
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Radio Frequency Identification (RFID) technology has been widely used in inventory management in many scenarios, e.g., warehouses, retail stores, hospitals, etc. This paper investigates a challenging problem of complete identification of missing tags in large-scale RFID systems. Although this problem has attracted extensive attention from academia and industry, the existing work can hardly satisfy the stringent real-time requirements. In this paper, a Slot Filter-based Missing Tag Identification (SFMTI) protocol is proposed to reconcile some expected collision slots into singleton slots and filter out the expected empty slots as well as the unreconcilable collision slots, thereby achieving improved time efficiency. A theoretical analysis is conducted to minimize the execution time of the proposed SFMTI. We then propose a cost-effective method to extend SFMTI to multi-reader scenarios. Extensive simulation experiments and performance results demonstrate that the proposed SFMTI protocol outperforms the most promising Iterative ID-free Protocol (IIP) by reducing nearly 45% of the required execution time, and is just within a factor of 1.18 from the lower bound of the minimum execution time.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
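The notions of expected singleton, collision, and empty slots used in this abstract can be illustrated with a simplified simulation. This is a sketch of generic slot-based missing-tag identification, not of SFMTI itself: colliding tags are simply retried in a later round rather than reconciled, the hash-based slot choice is an assumption, and the names `slot_of` and `find_missing` are hypothetical. The reader knows every tag ID, so it can predict each slot's state; a predicted singleton slot that stays silent reveals a missing tag.

```python
import hashlib


def slot_of(tag_id: str, seed: int, frame_size: int) -> int:
    """Slot a tag would answer in, derived from its ID and the round seed."""
    digest = hashlib.sha256(f"{seed}:{tag_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % frame_size


def find_missing(known_tags, present_tags, max_rounds=64):
    """Return (sorted missing tags, tags still unresolved after max_rounds)."""
    present = set(present_tags)
    pending = list(known_tags)
    missing = []
    for seed in range(max_rounds):
        if not pending:
            break
        frame_size = 2 * len(pending)
        # The reader predicts which pending tags map to which slot.
        expected = {}
        for tag in pending:
            expected.setdefault(slot_of(tag, seed, frame_size), []).append(tag)
        # Slots in which at least one physically present pending tag replies.
        busy = {slot_of(tag, seed, frame_size)
                for tag in pending if tag in present}
        still_pending = []
        for slot, tags in expected.items():
            if len(tags) == 1:
                # Expected singleton: silence means the tag is missing.
                if slot not in busy:
                    missing.append(tags[0])
            else:
                # Expected collision: cannot decide yet, retry next round.
                still_pending.extend(tags)
        pending = still_pending
    return sorted(missing), pending
```

Expected empty slots never appear in `expected`, which corresponds to filtering them out of the frame.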
  • 61
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Memristor-based memory technology, also referred to as resistive RAM (RRAM), is one of the emerging memory technologies that could potentially replace conventional semiconductor memories such as SRAM, DRAM, and flash. Existing research on such novel circuits focuses mainly on the integration between CMOS and non-CMOS, fabrication techniques, and reliability improvement. However, research on (manufacturing) test for yield and quality improvement is still in its infancy. This paper presents fault analysis and modeling for open defects based on electrical simulation, introduces fault models, and proposes test approaches for RRAMs. The fault analysis reveals that unique faults occur in addition to some conventional memory faults, and the detection of such unique faults cannot be guaranteed with just the application of traditional march tests. The paper also presents a new Design-for-Testability (DfT) concept to facilitate the detection of the unique faults. Two DfT schemes are developed by exploiting the access time duration and supply voltage level of the RRAM cells, and their simulation results show that the fault coverage can be increased with minor circuit modification. As the fault behavior may vary due to process variations, the DfT schemes are extended to be programmable to track the changes and further improve the fault/defect coverage.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 62
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Runtime power management using dynamic voltage and frequency scaling (DVFS) has been extensively studied for video processing applications. But there is only a little work on game power management although gaming applications are now widely run on battery-operated portable devices like mobile phones. Taking a cue from video power management, where PID controllers have been successfully used, they were recently applied to game workload prediction and DVFS. However, the use of hand-tuned PID controller gains on relatively short game plays left open questions on the robustness of the controller and the sensitivity of prediction quality on the choice of the gain values. In this paper, we try to systematically answer these questions. We first show that from the space of PID controller gain values, only a small subset leads to good game quality and power savings. Further, the choice of this set highly depends on the scene and the game application. For most gain values the controller becomes unstable, which can lead to large oscillations in the processor’s frequency setting and thereby poor results. We then study a number of time series models, such as a Least Mean Squares (LMS) Linear Predictor and its generalizations in the form of Autoregressive Moving Average (ARMA) models. These models learn most of the relevant model parameters iteratively as the game progresses, thereby dramatically reducing the complexity of manual parameter estimation. This makes them deployable in real setups, where all game plays and even game applications are not a priori known. We have evaluated each of these models (PID, LMS, and ARMA) for a variety of games—ranging from Quake II to more recent closed-source games such as Crysis, Need for Speed—Shift and World in Conflict—with very encouraging results. 
To the best of our knowledge, this is the first work that systematically explores (a) the feasibility of manually tuning PID controller parameters for power management, (b) time series models for workload prediction for gaming applications, and (c) power management for closed-source games.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
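The Least Mean Squares (LMS) linear predictor mentioned in this abstract can be sketched in a few lines. The sketch assumes that per-frame workloads have been scaled to roughly the range 0 to 1, and the filter order, step size `mu`, and function name are illustrative choices rather than values from the paper. The weights are learned online from the prediction error, which is what removes the need for hand-tuned gains.

```python
def lms_one_step(series, order=4, mu=0.1):
    """One-step-ahead LMS prediction over a workload series.

    Returns (predictions, final_weights); predictions[i] is the forecast
    for series[order + i], made before that sample is observed.
    """
    weights = [0.0] * order
    predictions = []
    for t in range(order, len(series)):
        window = series[t - order:t][::-1]  # most recent sample first
        y_hat = sum(w * x for w, x in zip(weights, window))
        predictions.append(y_hat)
        error = series[t] - y_hat
        # Gradient-descent step on the squared prediction error.
        weights = [w + mu * error * x for w, x in zip(weights, window)]
    return predictions, weights
```

For inputs in this range a small `mu` keeps the update stable; too large a step size makes the weights diverge, much like a poorly chosen controller gain.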
  • 63
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Ciphertext Policy Attribute-Based Encryption (CP-ABE) enforces expressive data access policies and each policy consists of a number of attributes. Most existing CP-ABE schemes incur a very large ciphertext size, which increases linearly with respect to the number of attributes in the access policy. Recently, Herranz et al. proposed a construction of CP-ABE with constant ciphertext. However, Herranz et al. do not consider the recipients’ anonymity and the access policies are exposed to potential malicious attackers. On the other hand, existing privacy preserving schemes protect the anonymity but require bulky, linearly increasing ciphertext size. In this paper, we proposed a new construction of CP-ABE, named Privacy Preserving Constant CP-ABE (denoted as PP-CP-ABE) that significantly reduces the ciphertext to a constant size with any given number of attributes. Furthermore, PP-CP-ABE leverages a hidden policy construction such that the recipients’ privacy is preserved efficiently. As far as we know, PP-CP-ABE is the first construction with such properties. Furthermore, we developed a Privacy Preserving Attribute-Based Broadcast Encryption (PP-AB-BE) scheme. Compared to existing Broadcast Encryption (BE) schemes, PP-AB-BE is more flexible because a broadcasted message can be encrypted by an expressive hidden access policy, either with or without explicitly specifying the receivers. Moreover, PP-AB-BE significantly reduces the storage and communication overhead to the order of $O(\log N)$, where $N$ is the system size. Also, we proved, using information theoretical approaches, that PP-AB-BE attains the minimal bound on storage overhead for each user to cover all possible subgroups in the communication system.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 64
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Recent mobile devices adopt high-performance processors to support various functions. As a side effect, higher performance inevitably leads to power density increase, eventually resulting in thermal problems. In order to alleviate the thermal problems, off-the-shelf mobile devices rely on dynamic voltage-frequency scaling (DVFS)-based dynamic thermal management (DTM) schemes. Unfortunately, in the DVFS-based DTM schemes, an excessive number of DTM operations worsen not only performance but also power efficiency. In this paper, we propose a temperature-aware DVFS scheme for Android-based mobile devices to optimize power or performance depending on the option. We evaluate our scheme in the off-the-shelf mobile device. Our evaluation results show that our scheme saves energy consumption by 12.7%, on average, when we use the power optimizing option. Our scheme also enhances the performance by 6.3%, on average, by using the performance optimizing scheme, still reducing the energy consumption by 6.7%.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 65
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 66
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-13
    Description: Han et al. propose a new method for parallel decimal multiplication with redundant partial products. They compare the performance of their multiplier with some previous relevant works, based on analytical and synthesis results. We have noted that the claimed critical delay path in (IEEE Trans. Computers, vol. 62, no. 5, pp. 956–968, May 2013) is faster than the actual critical delay path. Therefore, the comparison results seem to be deceptive. For example, our accurate analytical evaluation devalued the claimed speed advantage over the multiplier of (Microelectronics J., vol. 40, no. 10, pp. 1471–1481, Oct. 2009). Furthermore, we synthesized both multipliers to show that the synthesis results confirm those of the analytical evaluation.
    Print ISSN: 0018-9340
    Electronic ISSN: 1557-9956
    Topics: Computer Science
  • 67
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Learn about metasystems, their characteristics, and the challenges IT professionals and systems engineers face in designing and managing such systems.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 68
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Before writing a single line of code, software engineers can increase application assurance by instituting the practice recommendations articulated in their enterprise architecture. Many Common Weakness Enumerations (CWEs) can be addressed in the architecture and design phases of the development life cycle. Architectural and design flaws found late in the SDLC can be costly to repair; often, these flaws are so baked into the application that they're resistant to code patches. The only viable response might be to catalogue their existence for a later redesign of the application. Moreover, patches to flaws can inject additional defects as well as alert adversaries to the existence of these flaws.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 69
    Publication Date: 2014-12-19
    Description: Due to the instability and poor identification ability of a single pyroelectric infrared (PIR) detector for human target identification, this paper proposes a new approach to fuse the information collected from multiple PIR sensors for human identification. Firstly, Fast Fourier Transform (FFT), Short Time Fourier Transform (STFT), Wavelet Transform (WT) and Wavelet Packet Transform (WPT) are adopted to extract features of the human body, which can be achieved by a single PIR sensor. Then, we apply Principal Component Analysis (PCA) and Support Vector Machine (SVM) to reduce the characteristic dimensions and to classify the human targets, respectively. Finally, Fuzzy Comprehensive Evaluation (FCE) is utilized to fuse recognition results from multiple PIR sensors to finalize human identification. The pyroelectric characteristics under scenarios with different people and/or different paths are analyzed by various experiments, and the recognition results with/without the fusion procedure are also shown and compared. The experimental results demonstrate that our scheme has improved efficiency for human identification.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
  • 70
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Cloud computing provides flexibility and agility to meet growing business needs in a dynamic and competitive landscape. Banking, financial services, and insurance sector organizations are interested in exploring cloud services as a technology, provided that security and privacy are ensured. One solution is a community cloud, in which cloud services are targeted for organizations with common objectives and security controls. The Indian Banking Community Cloud (IBCC) initiative of the Institute for Development and Research in Banking Technology in Hyderabad, India, provides cloud-based services exclusively for Indian banks. In this article, the authors describe the IBCC architecture, along with its implementation details, cloud services offered, security and disaster-recovery aspects, deployment challenges, and future work. This article is part of a special issue on advancing cloud computing.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 71
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Motivated by a continually increasing demand for applications that depend on machine comprehension of text-based content, researchers in both academia and industry have developed innovative solutions for automated information extraction from text. In this article, the authors focus on a subset of such tools--semantic taggers--that not only extract and disambiguate entities mentioned in the text but also identify topics that unambiguously describe the text's main themes. The authors offer insight into the process of semantic tagging, the capabilities and specificities of today's semantic taggers, and also indicate some of the criteria to be considered when choosing a tagger.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 72
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: The objectives of datacenter consolidation are cost reduction and superior services. A datacenter consolidation plan includes minimizing investments in IT infrastructure and buildings and reducing power consumption related to cooling. Such a process requires scalable planning and implementation. Virtualization is the most popular and cost-effective technology for datacenter consolidation. In this article, the author runs a cost-benefit analysis of virtualization and datacenter consolidation using the Global Virtual datacenter online calculator and VMware's ROI TCO (return on investment/total cost of ownership) calculator version 3.0.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 73
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: The adoption of various converging trends in IT, such as cloud computing, the Internet of Things (IoT), crypto-currency, autonomous systems, and big data, challenges traditional notions of program management and highlights the importance of computational networks.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 74
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Provides a listing of current staff, committee members and society officers.
    Print ISSN: 1520-9202
    Electronic ISSN: 1941-045X
    Topics: Computer Science
  • 75
    Publication Date: 2014-12-04
    Description: Motivation: Structural variation is common in human and cancer genomes. High-throughput DNA sequencing has enabled genome-scale surveys of structural variation. However, the short reads produced by these technologies limit the study of complex variants, particularly those involving repetitive regions. Recent ‘third-generation’ sequencing technologies provide single-molecule templates and longer sequencing reads, but at the cost of higher per-nucleotide error rates. Results: We present MultiBreak-SV, an algorithm to detect structural variants (SVs) from single molecule sequencing data, paired read sequencing data, or a combination of sequencing data from different platforms. We demonstrate that combining low-coverage third-generation data from Pacific Biosciences (PacBio) with high-coverage paired read data is advantageous on simulated chromosomes. We apply MultiBreak-SV to PacBio data from four human fosmids and show that it detects known SVs with high sensitivity and specificity. Finally, we perform a whole-genome analysis on PacBio data from a complete hydatidiform mole cell line and predict 1002 high-probability SVs, over half of which are confirmed by an Illumina-based assembly. Availability and implementation: MultiBreak-SV is available at http://compbio.cs.brown.edu/software/. Contact: annaritz@vt.edu or braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 76
    Publication Date: 2014-12-04
    Description: Motivation: Insertions play an important role in genome evolution. However, such variants are difficult to detect from short-read sequencing data, especially when they exceed the paired-end insert size. Many approaches have been proposed to call short insertion variants based on paired-end mapping. However, there remains a lack of practical methods to detect and assemble long variants. Results: We propose here an original method, called MindTheGap, for the integrated detection and assembly of insertion variants from re-sequencing data. Importantly, it is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. MindTheGap uses an efficient k-mer-based method to detect insertion sites in a reference genome, and subsequently assemble them from the donor reads. MindTheGap showed high recall and precision on simulated datasets of various genome complexities. When applied to real Caenorhabditis elegans and human NA12878 datasets, MindTheGap detected and correctly assembled insertions >1 kb, using at most 14 GB of memory. Availability and implementation: http://mindthegap.genouest.org Contact: guillaume.rizk@inria.fr or claire.lemaitre@inria.fr
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 77
    Publication Date: 2014-12-04
    Description: Motivation: Most tumor samples are a heterogeneous mixture of cells, including admixture by normal (non-cancerous) cells and subpopulations of cancerous cells with different complements of somatic aberrations. This intra-tumor heterogeneity complicates the analysis of somatic aberrations in DNA sequencing data from tumor samples. Results: We describe an algorithm called THetA2 that infers the composition of a tumor sample—including not only tumor purity but also the number and content of tumor subpopulations—directly from both whole-genome (WGS) and whole-exome (WXS) high-throughput DNA sequencing data. This algorithm builds on our earlier Tumor Heterogeneity Analysis (THetA) algorithm in several important directions. These include improved ability to analyze highly rearranged genomes using a variety of data types: both WGS sequencing (including low ~7x coverage) and WXS sequencing. We apply our improved THetA2 algorithm to WGS (including low-pass) and WXS sequence data from 18 samples from The Cancer Genome Atlas (TCGA). We find that the improved algorithm is substantially faster and identifies numerous tumor samples containing subclonal populations in the TCGA data, including in one highly rearranged sample for which other tumor purity estimation algorithms were unable to estimate tumor purity. Availability and implementation: An implementation of THetA2 is available at http://compbio.cs.brown.edu/software Contact: layla@cs.brown.edu or braphael@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 78
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: In creating the Open Networking Foundation's conformance testing program for the OpenFlow networking specification, economic, technological, and market drivers must be harmonized, allowing for the simultaneous development of consumer confidence, industry competition, and trustworthy product validation.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 79
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Recently, Apple admitted that revealing photos of celebrities had been released on the Internet due to security breaches associated with its iCloud and Find My iPhone systems.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 80
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: An integrity level defines a required level of confidence that a system satisfies critical properties related to relevant risk criteria. However, integrity level terms and definitions differ across industry sectors, and this hampers a common understanding and application of integrity levels.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 81
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Describes the above-named upcoming conference event. May include topics to be covered or calls for papers.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 82
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Software-defined networking opens up new possibilities for architectures based on open source components, promising improved orchestration and agility, lower operational costs, and--most important--a wave of innovation. The Web extra at http://youtu.be/pdG2btcyyK8 is a video in which authors Christian Esteve Rothenberg, Roy Chua, and Thomas Nadeau present a slideshow and discuss how software-defined networking opens up new possibilities for architectures based on open-source components, promising improved orchestration and agility, lower operational costs, and a new wave of innovation.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 83
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Why do dynamic power-management technologies that dramatically improve datacenter server energy efficiency continue to go unleveraged?
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 84
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: To adequately address climate change, we need novel data-science methods that account for the spatiotemporal and physical nature of climate phenomena. Only then will we be able to move from statistical analysis to scientific insights.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 85
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Amateur software developers might lack precise technical skills, but they bring detailed knowledge of their environments to the table. The first Web extra at http://youtu.be/r-kIJQu4iDQ is an audio recording of author David Alan Grier reading his Errant Hashtag column in which he discusses how amateur software developers might lack precise technical skills but bring detailed knowledge of their environments to the table. The second Web extra at http://youtu.be/EDKeN9mVfwk is an audio recording of author David Alan Grier discussing a recent report on electronic voting by the Atlantic Council, a Washington DC think tank, that shows that e-voting is still a risk that citizens of democracies and engineers should take into account.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 86
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: A summary of articles recently published in IEEE Computer Society magazines.
    Print ISSN: 0018-9162
    Electronic ISSN: 1558-0814
    Topics: Computer Science
  • 87
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 88
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: The past decade has seen a dramatic increase in the amount of data captured and made available to scientists for research. This increase amplifies the difficulty scientists face in finding the data most relevant to their information needs. In prior work, we hypothesized that Information Retrieval-style ranked search can be applied to data sets to help a scientist discover the most relevant data amongst the thousands of data sets in many formats, much like text-based ranked search helps users make sense of the vast number of Internet documents. To test this hypothesis, we explored the use of ranked search for scientific data using an existing multi-terabyte observational archive as our test-bed. In this paper, we investigate whether the concept of varying relevance, and therefore ranked search, applies to numeric data—that is, are data sets enough like documents for Information Retrieval techniques and evaluation measures to apply? We present a user study that demonstrates that data set similarity resonates with users as a basis for relevance and, therefore, for ranked search. We evaluate a prototype implementation of ranked search over data sets with a second user study and demonstrate that ranked search improves a scientist's ability to find needed data.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 89
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: In recent years, probabilistic data management has received a lot of attention due to several applications that deal with uncertain data: RFID systems, sensor networks, data cleaning, scientific and biomedical data management, and approximate schema mappings. Query evaluation is a challenging problem in probabilistic databases, proved to be #P-hard. A general method for query evaluation is based on the lineage of the query and reduces the query evaluation problem to computing the probability of a propositional formula. The main approaches proposed in the literature to approximate probabilistic query confidence computation are based on Monte Carlo simulation, or formula compilation into decision diagrams (e.g., d-trees). The former executes a polynomial, but too large, number of iterations, while the latter is polynomial for easy queries, but may be exponential in the worst case. We designed a new optimized Monte Carlo algorithm that drastically reduces the number of iterations and proposed an efficient parallel version that we implemented on GPU. Thanks to the elevated degree of parallelism provided by the GPU, combined with the linear speedup of our algorithm, we managed to reduce significantly the long running time required by a sequential Monte Carlo algorithm. Experimental results show that our algorithm is so efficient as to be comparable with the formula compilation approach, but with the significant advantage of avoiding exponential behavior.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 90
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Graph-based ranking models have been widely applied in the information retrieval area. In this paper, we focus on a well-known graph-based model, the Ranking on Data Manifold model, or Manifold Ranking (MR). In particular, it has been successfully applied to content-based image retrieval because of its outstanding ability to discover the underlying geometrical structure of the given image database. However, manifold ranking is computationally very expensive, which significantly limits its applicability to large databases, especially in cases where the queries are out of the database (new samples). We propose a novel scalable graph-based ranking model called Efficient Manifold Ranking (EMR), trying to address the shortcomings of MR from two main perspectives: scalable graph construction and efficient ranking computation. Specifically, we build an anchor graph on the database instead of a traditional $k$-nearest neighbor graph, and design a new form of adjacency matrix utilized to speed up the ranking. An approximate method is adopted for efficient out-of-sample retrieval. Experimental results on some large scale image databases demonstrate that EMR is a promising method for real world retrieval applications.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 91
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Product quantization-based approaches are effective for encoding high-dimensional data points for approximate nearest neighbor search. The space is decomposed into a Cartesian product of low-dimensional subspaces, each of which generates a sub codebook. Data points are encoded as compact binary codes using these sub codebooks, and the distance between two data points can be approximated efficiently from their codes by the precomputed lookup tables. Traditionally, to encode a subvector of a data point in a subspace, only one sub codeword in the corresponding sub codebook is selected, which may impose strict restrictions on the search accuracy. In this paper, we propose a novel approach, named optimized Cartesian K-means (ock-means), to better encode the data points for more accurate approximate nearest neighbor search. In ock-means, multiple sub codewords are used to encode the subvector of a data point in a subspace. Each sub codeword stems from different sub codebooks in each subspace, which are optimally generated with regard to the minimization of the distortion errors. The high-dimensional data point is then encoded as the concatenation of the indices of multiple sub codewords from all the subspaces. This can provide more flexibility and lower distortion errors than traditional methods. Experimental results on the standard real-life data sets demonstrate the superiority over state-of-the-art approaches for approximate nearest neighbor search.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 92
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: This paper considers the problem of determinizing probabilistic data to enable such data to be stored in legacy systems that accept only deterministic input. Probabilistic data may be generated by automated data analysis/enrichment techniques such as entity resolution, information extraction, and speech processing. The legacy system may correspond to pre-existing web applications such as Flickr, Picasa, etc. The goal is to generate a deterministic representation of probabilistic data that optimizes the quality of the end-application built on deterministic data. We explore such a determinization problem in the context of two different data processing tasks—triggers and selection queries. We show that approaches such as thresholding or top-1 selection traditionally used for determinization lead to suboptimal performance for such applications. Instead, we develop a query-aware strategy and show its advantages over existing solutions through a comprehensive empirical evaluation over real and synthetic datasets.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 93
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Description: Identifying which text corpus leads in the context of a topic presents a great challenge of considerable interest to researchers. Recent research into lead-lag analysis has mainly focused on estimating the overall leads and lags between two corpora. However, real-world applications have a dire need to understand lead-lag patterns both globally and locally. In this paper, we introduce TextPioneer, an interactive visual analytics tool for investigating lead-lag across corpora from the global level to the local level. In particular, we extend an existing lead-lag analysis approach to derive two-level results. To convey multiple perspectives of the results, we have designed two visualizations, a novel hybrid tree visualization that couples a radial space-filling tree with a node-link diagram and a twisted-ladder-like visualization. We have applied our method to several corpora and the evaluation shows promise, especially in support of text comparison at different levels of detail.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 94
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2014-12-06
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 95
    Publication Date: 2014-01-16
    Description: The matrix method, due to Bibel and Andrews, is a proof procedure designed for automated theorem-proving. We show that underlying this method is a fully structured combinatorial model of conventional classical proof theory.
    Print ISSN: 0955-792X
    Electronic ISSN: 1465-363X
    Topics: Computer Science , Mathematics
  • 96
    Publication Date: 2014-01-16
    Description: Proof search in inference systems such as the sequent calculus is a process of discovery. Once a proof is found, there is often information in the proof which is redundant. In this article we show how to detect and eliminate certain kinds of redundant formulae from a given proof, and in particular in a way which does not require further proof search or any rearrangement of the proof found. Our technique involves adding constraints to the inference rules, which are used once the proof is complete to determine redundant formulae and how they may be eliminated. We show how this technique can be applied to propositional linear logic, and prove its correctness for this logic. We also discuss how our approach can be extended to other logics without much change.
    Print ISSN: 0955-792X
    Electronic ISSN: 1465-363X
    Topics: Computer Science , Mathematics
  • 97
    Publication Date: 2014-01-22
    Description: Good accessibility of publicly funded research data is essential to secure an open scientific system and eventually becomes mandatory [Wellcome Trust will Penalise Scientists Who Don’t Embrace Open Access. The Guardian 2012]. By the use of high-throughput methods in many research areas from physics to systems biology, large data collections are increasingly important as raw material for research. Here, we present strategies worked out by international and national institutions targeting open access to publicly funded research data via incentives or obligations to share data. Funding organizations such as the British Wellcome Trust therefore have developed data sharing policies and request commitment to data management and sharing in grant applications. Increased citation rates are a profound argument for sharing publication data. Pre-publication sharing might be rewarded by a data citation credit system via digital object identifiers (DOIs) which have initially been in use for data objects. Besides policies and incentives, good practice in data management is indispensable. However, appropriate systems for data management of large-scale projects for example in systems biology are hard to find. Here, we give an overview of a selection of open-source data management systems proved to be employed successfully in large-scale projects.
    Print ISSN: 1467-5463
    Electronic ISSN: 1477-4054
    Topics: Biology , Computer Science
  • 98
    Publication Date: 2014-01-22
    Description: Genome-scale metabolic network reconstructions are now routinely used in the study of metabolic pathways, their evolution and design. The development of such reconstructions involves the integration of information on reactions and metabolites from the scientific literature as well as public databases and existing genome-scale metabolic models. The reconciliation of discrepancies between data from these sources generally requires significant manual curation, which constitutes a major obstacle in efforts to develop and apply genome-scale metabolic network reconstructions. In this work, we discuss some of the major difficulties encountered in the mapping and reconciliation of metabolic resources and review three recent initiatives that aim to accelerate this process, namely BKM-react, MetRxn and MNXref (presented in this article). Each of these resources provides a pre-compiled reconciliation of many of the most commonly used metabolic resources. By reducing the time required for manual curation of metabolite and reaction discrepancies, these resources aim to accelerate the development and application of high-quality genome-scale metabolic network reconstructions and models.
    Print ISSN: 1467-5463
    Electronic ISSN: 1477-4054
    Topics: Biology , Computer Science
  • 99
    Publication Date: 2014-01-22
    Description: microRNAs (miRNAs) are small endogenous non-coding RNAs that function as the universal specificity factors in post-transcriptional gene silencing. Discovering miRNAs, identifying their targets and further inferring miRNA functions have been a critical strategy for understanding normal biological processes of miRNAs and their roles in the development of disease. In this review, we focus on computational methods of inferring miRNA functions, including miRNA functional annotation and inferring miRNA regulatory modules, by integrating heterogeneous data sources. We also briefly introduce the research in miRNA discovery and miRNA-target identification with an emphasis on the challenges to computational biology.
    Print ISSN: 1467-5463
    Electronic ISSN: 1477-4054
    Topics: Biology , Computer Science
  • 100
    Publication Date: 2014-01-22
    Description: Supermatrix and supertree analyses are frequently used to more accurately recover vertical evolutionary history but debate still exists over which method provides greater reliability. Traditional methods that resolve relationships among organisms from single genes are often unreliable because of the frequent lack of strong phylogenetic signal and the presence of systematic artifacts. Methods developed to reconstruct organismal history from multiple genes can be divided into supermatrix and supertree approaches. A supermatrix analysis consists of the concatenation of multiple genes into a single, possibly partitioned alignment, from which phylogenies are reconstructed using a variety of approaches. Supertrees build consensus trees from the topological information contained within individual gene trees. Both methods are now widely used and have been demonstrated to solve previously ambiguous or unresolved phylogenies with high statistical support. However, the amount of misleading signal needed to induce erroneous phylogenies for both strategies is still unknown. Using genome simulations, we test the accuracy of supertree and supermatrix approaches in recovering the true organismal phylogeny under increased amounts of horizontally transferred genes and changes in substitution rates. Our results show that overall, supermatrix approaches are preferable when a low amount of gene transfer is suspected to be present in the dataset, while supertrees have greater reliability in the presence of a moderate amount of misleading gene transfers. In the face of very high or very low substitution rates without horizontal gene transfers, supermatrix approaches outperform supertrees as individual gene trees remain unresolved and additional sequences contribute to a congruent phylogenetic signal.
    Print ISSN: 1467-5463
    Electronic ISSN: 1477-4054
    Topics: Biology , Computer Science