ALBERT

All Library Books, journals and Electronic Records Telegrafenberg


Filter
  Collection
    • Articles  (13,326)
  Years
    • 2020-2022  (2,676)
    • 2015-2019  (10,457)
    • 1975-1979  (193)
    • 1945-1949
  Journal
    • PLoS Computational Biology  (2,535)
    • Algorithms  (852)
    • IEEE Transactions on Knowledge and Data Engineering  (728)
    • Pattern Recognition  (596)
  Topic
    • Computer Science  (13,326)
    • Biology  (6,935)
  • 1
    Publication Date: 2020-08-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 2
    Publication Date: 2020-08-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 3
    Publication Date: 2021-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 4
    Publication Date: 2021-01-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 5
    Publication Date: 2020-08-29
Description: Healthcare facilities are constantly deteriorating due to tight budgets allocated to the upkeep of building assets. This entails the need for improved deterioration modeling of such buildings in order to enforce a predictive maintenance approach that decreases the unexpected occurrence of failures and the corresponding downtime needed to repair or replace the faulty asset components. Currently, hospitals utilize subjective deterioration prediction methodologies that mostly rely on age as the sole indicator of degradation to forecast the useful lives of the building components. Thus, this paper aims at formulating a more efficient stochastic deterioration prediction model that integrates the latest observed condition into the forecasting procedure to overcome the subjectivity and uncertainties associated with the currently employed methods. This is achieved by developing a hybrid genetic algorithm-based fuzzy Markovian model that simulates the deterioration process given the scarcity of available data demonstrating the condition assessment and evaluation for such critical facilities. A nonhomogeneous transition probability matrix (TPM) based on fuzzy membership functions representing the condition, age and relative deterioration rate of the hospital systems is utilized to address the inherent uncertainties. The TPM is further calibrated by means of a genetic algorithm to circumvent the drawbacks of expert-based models. A sensitivity analysis was carried out to analyze the possible changes in the output resulting from predefined modifications to the input parameters in order to ensure the robustness of the model. The performance of the developed deterioration prediction model is then validated through a comparison with a state-of-the-art stochastic model on real hospital datasets; the developed model significantly outperformed the long-established Weibull distribution-based deterioration prediction methodology, with mean absolute errors of 1.405 and 9.852, respectively. Therefore, the developed model is expected to assist decision-makers in creating more efficient maintenance programs as well as more data-driven capital renewal plans.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
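A minimal sketch of the Markov-chain core of the deterioration model described in entry 5 above: a condition probability vector is propagated through an age-dependent transition probability matrix (TPM). The three states, decay-rate numbers and function names are hypothetical illustrations; the paper's fuzzy membership functions and genetic-algorithm calibration are not reproduced.

```python
# Toy nonhomogeneous Markov deterioration step (illustration only, not the paper's model).
import numpy as np

def transition_matrix(age, base_decay=0.05, age_factor=0.002):
    """Hypothetical 3-state (Good/Fair/Poor) TPM whose decay probability grows with age."""
    p = min(0.9, base_decay + age_factor * age)   # chance of dropping one condition state
    return np.array([
        [1 - p, p,     0.0],
        [0.0,   1 - p, p  ],
        [0.0,   0.0,   1.0],                      # Poor is absorbing until a repair action
    ])

def predict_condition(initial, years):
    """Propagate a condition probability vector over `years` one-year steps."""
    state = np.asarray(initial, dtype=float)
    for age in range(years):
        state = state @ transition_matrix(age)
    return state

print(predict_condition([1.0, 0.0, 0.0], years=20))   # distribution over Good/Fair/Poor
```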
  • 6
    Publication Date: 2021-01-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 7
    Publication Date: 2020-08-29
    Description: The harmonic closeness centrality measure associates, to each node of a graph, the average of the inverse of its distances from all the other nodes (by assuming that unreachable nodes are at infinite distance). This notion has been adapted to temporal graphs (that is, graphs in which edges can appear and disappear during time) and in this paper we address the question of finding the top-k nodes for this metric. Computing the temporal closeness for one node can be done in O(m) time, where m is the number of temporal edges. Therefore computing exactly the closeness for all nodes, in order to find the ones with top closeness, would require O(nm) time, where n is the number of nodes. This time complexity is intractable for large temporal graphs. Instead, we show how this measure can be efficiently approximated by using a “backward” temporal breadth-first search algorithm and a classical sampling technique. Our experimental results show that the approximation is excellent for nodes with high closeness, allowing us to detect them in practice in a fraction of the time needed for computing the exact closeness of all nodes. We validate our approach with an extensive set of experiments.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
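The sampling idea in entry 7 above can be illustrated on an ordinary static, undirected graph: estimate each node's harmonic closeness from breadth-first searches out of a few random source nodes instead of all of them. This is only a sketch under that simplification; the paper's "backward" temporal BFS for temporal graphs is not reproduced.

```python
# Sampling-based estimate of normalized harmonic closeness on a static undirected graph.
import random
from collections import deque

def bfs_distances(adj, src):
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def approx_harmonic_closeness(adj, samples=50, seed=0):
    """adj: {node: iterable of neighbours}. Unreachable pairs contribute 0 (1/infinity)."""
    rng = random.Random(seed)
    nodes = list(adj)
    est = {v: 0.0 for v in nodes}
    for _ in range(samples):
        dist = bfs_distances(adj, rng.choice(nodes))
        for v, d in dist.items():
            if d > 0:
                est[v] += 1.0 / d
    return {v: x / samples for v, x in est.items()}   # estimate of mean inverse distance
```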
  • 8
    Publication Date: 2021-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 9
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 10
    Publication Date: 2020-07-16
Description: High-order convective Cahn-Hilliard-type equations describe the faceting of a growing surface or the dynamics of phase transitions in ternary oil-water-surfactant systems. In this paper, we prove the well-posedness of classical solutions for the Cauchy problem associated with this equation.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
  • 11
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 12
    Publication Date: 2020-07-08
Description: We consider a rather general problem of nonparametric estimation of an uncountable set of probability density functions (p.d.f.'s) of the form f(x; r), where r is a non-random real variable ranging from R_1 to R_2. We put emphasis on the algorithmic aspects of this problem, since they are crucial for the exploratory analysis of the big data needed for the estimation. A specialized learning algorithm, based on the 2D FFT, is proposed and tested on observations that allow estimating the p.d.f.'s of jet engine temperatures as a function of rotation speed. We also derive theoretical results concerning the convergence of the estimation procedure, which contain hints on selecting the parameters of the estimation algorithm.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
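One crude way to picture the goal of entry 12 above, estimating a whole family of densities f(x; r), is to smooth a two-dimensional histogram of (r, x) with an FFT-based convolution and read off each r-slice as a conditional density. The sketch below assumes numpy and scipy and is purely illustrative; the paper's specialized learning algorithm and its convergence analysis are not reproduced.

```python
# FFT-smoothed 2-D histogram as a rough estimate of a family of conditional densities f(x; r).
import numpy as np
from scipy.signal import fftconvolve

def conditional_density_grid(r, x, bins=64, bandwidth=2.0):
    """Return (r_edges, x_edges, f) where f[i, :] estimates the density of x within r-bin i."""
    hist, r_edges, x_edges = np.histogram2d(r, x, bins=bins)
    g = np.arange(bins) - bins // 2
    kernel = np.exp(-0.5 * (g[:, None] ** 2 + g[None, :] ** 2) / bandwidth ** 2)
    kernel /= kernel.sum()
    smooth = fftconvolve(hist, kernel, mode="same")    # Gaussian smoothing via FFT
    dx = x_edges[1] - x_edges[0]
    f = smooth / (smooth.sum(axis=1, keepdims=True) * dx + 1e-12)  # normalize each r-slice
    return r_edges, x_edges, f
```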
  • 13
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 14
    Publication Date: 2020-07-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 15
    Publication Date: 2020-07-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 16
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 17
    Publication Date: 2020-07-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 18
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 19
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 20
    Publication Date: 2020-07-09
Description: We report the design of a Spiking Neural Network (SNN) edge detector with biologically inspired neurons that has a conceptual similarity with both Hodgkin-Huxley (HH) model neurons and Leaky Integrate-and-Fire (LIF) neurons. The computation of the membrane potential, which is used to determine the occurrence or absence of spike events, at each time step, is carried out by using the analytical solution to a simplified version of the HH neuron model. We find that the SNN based edge detector detects more edge pixels in images than those obtained by a Sobel edge detector. We designed a pipeline for image classification with a low-exposure frame simulation layer, SNN edge detection layers as pre-processing layers and a Convolutional Neural Network (CNN) as a classification module. We tested this pipeline for the task of classification with the Digits dataset, which is available in MATLAB. We find that the SNN based edge detection layer increases the image classification accuracy at lower exposure times, that is, for 1 < t < T/4, where t is the number of milliseconds in a simulated exposure frame and T is the total exposure time, with reference to a Sobel edge or Canny edge detection layer in the pipeline. These results pave the way for developing novel cognitive neuromorphic computing architectures for millisecond timescale detection and object classification applications using event or spike cameras.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
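The leaky integrate-and-fire idea behind entry 20 above can be sketched as a toy edge detector: one LIF unit per pixel is driven by a gradient-magnitude input current, and the per-pixel spike rate serves as the edge map. All parameter values below are assumptions; the paper's HH-derived membrane update, low-exposure frame simulation and CNN classifier are not included.

```python
# Toy LIF-based edge map (illustration only).
import numpy as np

def gradient_magnitude(img):
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    return np.hypot(gx, gy)

def lif_edge_map(img, steps=50, dt=1.0, tau=10.0, threshold=1.0, gain=0.05):
    """Integrate the gradient as input current; pixels that spike often are edge pixels."""
    current = gain * gradient_magnitude(img.astype(float))
    v = np.zeros_like(current)
    spikes = np.zeros_like(current)
    for _ in range(steps):
        v += dt * (-v / tau + current)   # leaky integration of the membrane potential
        fired = v >= threshold
        spikes += fired
        v[fired] = 0.0                   # reset after a spike
    return spikes / steps                # spike rate per pixel
```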
  • 21
    Publication Date: 2020-07-05
    Description: Microscopic crowd simulation can help to enhance the safety of pedestrians in situations that range from museum visits to music festivals. To obtain a useful prediction, the input parameters must be chosen carefully. In many cases, a lack of knowledge or limited measurement accuracy add uncertainty to the input. In addition, for meaningful parameter studies, we first need to identify the most influential parameters of our parametric computer models. The field of uncertainty quantification offers standardized and fully automatized methods that we believe to be beneficial for pedestrian dynamics. In addition, many methods come at a comparatively low cost, even for computationally expensive problems. This allows for their application to larger scenarios. We aim to identify and adapt fitting methods to microscopic crowd simulation in order to explore their potential in pedestrian dynamics. In this work, we first perform a variance-based sensitivity analysis using Sobol’ indices and then crosscheck the results by a derivative-based measure, the activity scores. We apply both methods to a typical scenario in crowd simulation, a bottleneck. Because constrictions can lead to high crowd densities and delays in evacuations, several experiments and simulation studies have been conducted for this setting. We show qualitative agreement between the results of both methods. Additionally, we identify a one-dimensional subspace in the input parameter space and discuss its impact on the simulation. Moreover, we analyze and interpret the sensitivity indices with respect to the bottleneck scenario.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
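The variance-based sensitivity analysis in entry 21 above rests on Sobol' indices, which can be estimated with a Monte Carlo "pick-and-freeze" scheme. The sketch below, assuming numpy, applies the estimator to the standard Ishigami test function rather than to a crowd-simulation model; it is not the study's actual pipeline.

```python
# First-order Sobol' indices via a Saltelli-style pick-and-freeze Monte Carlo estimator.
import numpy as np

def first_order_sobol(model, bounds, n=4096, seed=0):
    rng = np.random.default_rng(seed)
    d = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    A = lo + (hi - lo) * rng.random((n, d))
    B = lo + (hi - lo) * rng.random((n, d))
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]))
    S = np.empty(d)
    for i in range(d):
        AB = A.copy()
        AB[:, i] = B[:, i]                       # freeze every input except x_i
        S[i] = np.mean(fB * (model(AB) - fA)) / var
    return S

def ishigami(X, a=7.0, b=0.1):
    """Classic analytic test function often used to validate sensitivity-analysis code."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.sin(x1) + a * np.sin(x2) ** 2 + b * x3 ** 4 * np.sin(x1)

print(first_order_sobol(ishigami, bounds=[(-np.pi, np.pi)] * 3))
```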
  • 22
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 23
    Publication Date: 2020-06-30
    Description: Standard (Lomb-Scargle, likelihood, etc.) procedures for power-spectrum analysis provide convenient estimates of the significance of any peak in a power spectrum, based—typically—on the assumption that the measurements being analyzed have a normal (i.e., Gaussian) distribution. However, the measurement sequence provided by a real experiment or a real observational program may not meet this requirement. The RONO (rank-order normalization) procedure generates a proxy distribution that retains the rank-order of the original measurements but has a strictly normal distribution. The proxy distribution may then be analyzed by standard power-spectrum analysis. We show by an example that the resulting power spectrum may prove to be quite close to the power spectrum obtained from the original data by a standard procedure, even if the distribution of the original measurements is far from normal. Such a comparison would tend to validate the original analysis.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
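The RONO procedure summarized in entry 23 above is short enough to state directly: replace each measurement by the standard-normal quantile of its rank and run any standard power-spectrum analysis on the resulting proxy series. A minimal sketch, assuming numpy and scipy:

```python
# Rank-order normalization: same rank order as the input, strictly Gaussian distribution.
import numpy as np
from scipy.stats import norm, rankdata

def rank_order_normalize(x):
    ranks = rankdata(x)                        # ranks 1..n, ties receive average ranks
    return norm.ppf(ranks / (len(x) + 1.0))    # map ranks to standard-normal quantiles
```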
  • 24
    Publication Date: 2020-06-30
Description: Driven by the strong demand for very high-speed processor I/O, the raw performance of hardware I/O has increased drastically over the past decade. However, recent Big Data applications still demand larger I/O bandwidth and lower latency. Because raw I/O performance is no longer improving as quickly, it is time to consider other ways to increase it. To overcome this challenge, we focus on lossless data compression technology that reduces the amount of data in the communication path itself. Recent Big Data applications process data streams that flow continuously and cannot tolerate stalls, so an elegant hardware-based compression technology is needed. This paper proposes a novel lossless data compression scheme called ASE coding. It encodes streaming data using an entropy coding approach: ASE coding instantly assigns the fewest bits to each compressed datum according to the number of occupied entries in a look-up table. The paper describes the detailed mechanism of ASE coding and presents performance evaluations showing that ASE coding adaptively shrinks streaming data and works with a small amount of hardware resources, without stalling or buffering any part of the data stream.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
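Entry 24 above describes ASE coding only at a high level (a look-up table whose occupancy determines how many bits a symbol needs), so the toy encoder below is merely an illustration of that idea under assumed conventions; it is not the actual ASE algorithm or its hardware realization.

```python
# Toy look-up-table stream encoder: indices get shorter codes while the table is small.
from math import ceil, log2

def encode_stream(data):
    table, out = [], []
    for symbol in data:                     # e.g. data = b"abracadabra"
        if symbol in table:
            width = max(1, ceil(log2(len(table))))            # bits needed for an index
            out.append(("hit", format(table.index(symbol), f"0{width}b")))
        else:
            out.append(("miss", format(symbol, "08b")))       # escape + raw 8-bit symbol
            table.append(symbol)
    return out
```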
  • 25
    Publication Date: 2020-11-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 26
    Publication Date: 2020-11-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 27
    Publication Date: 2020-07-01
    Description: Text annotation is the process of identifying the sense of a textual segment within a given context to a corresponding entity on a concept ontology. As the bag of words paradigm’s limitations become increasingly discernible in modern applications, several information retrieval and artificial intelligence tasks are shifting to semantic representations for addressing the inherent natural language polysemy and homonymy challenges. With extensive application in a broad range of scientific fields, such as digital marketing, bioinformatics, chemical engineering, neuroscience, and social sciences, community detection has attracted great scientific interest. Focusing on linguistics, by aiming to identify groups of densely interconnected subgroups of semantic ontologies, community detection application has proven beneficial in terms of disambiguation improvement and ontology enhancement. In this paper we introduce a novel distributed supervised knowledge-based methodology employing community detection algorithms for text annotation with Wikipedia Entities, establishing the unprecedented concept of community Coherence as a metric for local contextual coherence compatibility. Our experimental evaluation revealed that deeper inference of relatedness and local entity community coherence in the Wikipedia graph bears substantial improvements overall via a focus on accuracy amelioration of less common annotations. The proposed methodology is propitious for wider adoption, attaining robust disambiguation performance.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
  • 28
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 29
    Publication Date: 2020-06-30
Description: Geomechanical modelling of the processes associated with the exploitation of subsurface resources, such as land subsidence or triggered/induced seismicity, is a common practice of major interest. The prediction reliability depends on different sources of uncertainty, such as the parameterization of the constitutive model characterizing the deep rock behaviour. In this study, we focus on a Sobol'-based sensitivity analysis and uncertainty reduction via assimilation of land deformations. A synthetic test case application on a deep hydrocarbon reservoir is considered, where land settlements are predicted with the aid of a 3-D Finite Element (FE) model. Data assimilation is performed via the Ensemble Smoother (ES) technique and its variant in the form of Multiple Data Assimilation (ES-MDA). However, ES convergence is guaranteed only with a large number of Monte Carlo (MC) simulations, which may be computationally infeasible for large-scale and complex systems. For this reason, a surrogate model based on the generalized Polynomial Chaos Expansion (gPCE) is proposed as an approximation of the forward problem. This approach allows us to efficiently compute the Sobol' indices for the sensitivity analysis and greatly reduce the computational cost of the original ES and MDA formulations, while also enhancing the accuracy of the overall prediction process.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
  • 30
    Publication Date: 2020-07-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 31
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 32
    Publication Date: 2020-06-30
    Description: Clustering is an unsupervised machine learning technique with many practical applications that has gathered extensive research interest. Aside from deterministic or probabilistic techniques, fuzzy C-means clustering (FCM) is also a common clustering technique. Since the advent of the FCM method, many improvements have been made to increase clustering efficiency. These improvements focus on adjusting the membership representation of elements in the clusters, or on fuzzifying and defuzzifying techniques, as well as the distance function between elements. This study proposes a novel fuzzy clustering algorithm using multiple different fuzzification coefficients depending on the characteristics of each data sample. The proposed fuzzy clustering method has similar calculation steps to FCM with some modifications. The formulas are derived to ensure convergence. The main contribution of this approach is the utilization of multiple fuzzification coefficients as opposed to only one coefficient in the original FCM algorithm. The new algorithm is then evaluated with experiments on several common datasets and the results show that the proposed algorithm is more efficient compared to the original FCM as well as other clustering methods.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
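The core proposal of entry 32 above, a per-sample fuzzification coefficient, can be illustrated by plugging sample-specific exponents m_k into the standard fuzzy C-means updates. The sketch assumes numpy and is a simplified illustration rather than the paper's derivation; setting every m_k = 2 recovers ordinary FCM.

```python
# Fuzzy C-means with one fuzzification coefficient per sample (illustrative adaptation).
import numpy as np

def fcm_multi_m(X, c, m_per_sample, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    m = np.asarray(m_per_sample, dtype=float)        # each entry must be > 1
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        expo = 2.0 / (m - 1.0)
        ratio = d[:, :, None] / d[:, None, :]        # ratio[k, i, j] = d_ki / d_kj
        u = 1.0 / np.sum(ratio ** expo[:, None, None], axis=2)   # memberships u[k, i]
        w = u ** m[:, None]
        centers = (w.T @ X) / w.sum(axis=0)[:, None] # weighted means with per-sample exponents
    return centers, u
```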
  • 33
    Publication Date: 2020-12-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 34
    Publication Date: 2020-07-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 35
  • 36
    Publication Date: 2020-11-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 37
    Publication Date: 2020-07-03
    Description: Business processes evolve over time to adapt to changing business environments. This requires continuous monitoring of business processes to gain insights into whether they conform to the intended design or deviate from it. The situation when a business process changes while being analysed is denoted as Concept Drift. Its analysis is concerned with studying how a business process changes, in terms of detecting and localising changes and studying the effects of the latter. Concept drift analysis is crucial to enable early detection and management of changes, that is, whether to promote a change to become part of an improved process, or to reject the change and make decisions to mitigate its effects. Despite its importance, there exists no comprehensive framework for analysing concept drift types, affected process perspectives, and granularity levels of a business process. This article proposes the CONcept Drift Analysis in Process Mining (CONDA-PM) framework describing phases and requirements of a concept drift analysis approach. CONDA-PM was derived from a Systematic Literature Review (SLR) of current approaches analysing concept drift. We apply the CONDA-PM framework on current approaches to concept drift analysis and evaluate their maturity. Applying CONDA-PM framework highlights areas where research is needed to complement existing efforts.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
  • 38
    Publication Date: 2020-11-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 39
    Publication Date: 2018-04-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 40
    Publication Date: 2018-01-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 41
    Publication Date: 2020-04-14
Description: Let P be a set of n points in R^d, let k ≥ 1 be an integer, and let ε ∈ (0, 1) be a constant. An ε-coreset is a subset C ⊆ P with appropriate non-negative weights (scalars) that approximates any given set Q ⊆ R^d of k centers: the sum of squared distances from every point in P to its closest point in Q is the same, up to a factor of 1 ± ε, as the corresponding weighted sum over C for the same k centers. If the coreset is small, we can solve problems such as k-means clustering or its variants (e.g., discrete k-means, where the centers are restricted to be in P or in other restricted zones) on the small coreset to get faster provable approximations. Moreover, it is known that such coresets support streaming, dynamic and distributed data using the classic merge-and-reduce trees. The fact that the coreset is a subset implies that it preserves the sparsity of the data. However, existing coresets of this kind are randomized and their size has at least a linear dependency on the dimension d. We suggest the first such coreset of size independent of d. This is also the first deterministic coreset construction whose resulting size is not exponential in d. Extensive experimental results and benchmarks are provided on public datasets, including the first coreset of the English Wikipedia computed using Amazon's cloud.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
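The object of study in entry 41 above, a weighted coreset for k-means, can be pictured with a basic importance-sampling construction: sample points with probability proportional to a mix of uniform mass and squared distance to the data mean, and weight each sampled point by 1/(m·p_i). Note that the paper's contribution is a deterministic, dimension-independent construction, which this randomized sketch does not reproduce.

```python
# Simple randomized weighted coreset for k-means-style costs (illustration only).
import numpy as np

def simple_coreset(P, m, seed=0):
    rng = np.random.default_rng(seed)
    n = len(P)
    sq = np.sum((P - P.mean(axis=0)) ** 2, axis=1)
    total = sq.sum()
    p = np.full(n, 1.0 / n) if total == 0 else 0.5 / n + 0.5 * sq / total
    idx = rng.choice(n, size=m, replace=True, p=p)
    weights = 1.0 / (m * p[idx])      # makes weighted costs unbiased estimates of full costs
    return P[idx], weights
```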
  • 42
    Publication Date: 2015-08-12
    Description: We examine a distributed detection problem in a wireless sensor network, where sensor nodes collaborate to detect a Gaussian signal with an unknown change of power, i.e., a scale parameter. Due to power/bandwidth constraints, we consider the case where each sensor quantizes its observation into a binary digit. The binary data are then transmitted through error-prone wireless links to a fusion center, where a generalized likelihood ratio test (GLRT) detector is employed to perform a global decision. We study the design of a binary quantizer based on an asymptotic analysis of the GLRT. Interestingly, the quantization threshold of the quantizer is independent of the unknown scale parameter. Numerical results are included to illustrate the performance of the proposed quantizer and GLRT in binary symmetric channels (BSCs).
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
  • 43
    Publication Date: 2015-08-13
Description: More and more hybrid electric vehicles are being driven because they offer advantages such as energy savings and better active safety performance. Hybrid vehicles have two or more power drive systems and frequently switch operating conditions, so stability control is very important. In this work, a two-stage Kalman algorithm is used to fuse data in hybrid vehicle stability testing. First, the RT3102 navigation system and the Dewetron system are introduced. Second, a data fusion model based on the Kalman filter is proposed. This model is then simulated and tested on a sample vehicle using Carsim and Simulink software. The results show the merits of this model.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
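The data-fusion building block behind entry 43 above is the linear Kalman filter. Below is a minimal, self-contained filter for a one-dimensional constant-velocity state observed through noisy position measurements; the paper's two-stage formulation and the RT3102/Dewetron instrumentation are not modeled, and the noise parameters are assumptions.

```python
# Plain linear Kalman filter, 1-D constant-velocity model with position measurements.
import numpy as np

def kalman_1d(measurements, dt=0.01, q=1e-3, r=0.1):
    F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition for (position, velocity)
    H = np.array([[1.0, 0.0]])               # we observe position only
    Q, R = q * np.eye(2), np.array([[r]])
    x, P = np.zeros((2, 1)), np.eye(2)
    filtered = []
    for z in measurements:
        x = F @ x                             # predict
        P = F @ P @ F.T + Q
        y = np.array([[z]]) - H @ x           # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
        x = x + K @ y                         # update
        P = (np.eye(2) - K @ H) @ P
        filtered.append(x[0, 0])
    return np.array(filtered)
```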
  • 44
    Publication Date: 2015-08-05
Description: Deep learning has recently made great breakthroughs in visual and speech processing, mainly because it draws on the hierarchical way the brain processes images and speech. In the field of NLP, topic models are one of the important ways of modeling documents, but they are built on a generative model that clearly does not match the way humans write. In this paper, we propose the Event Model, an unsupervised approach to modeling documents based on the language processing mechanisms of neurolinguistics. In the Event Model, documents are descriptions of concrete or abstract events seen, heard, or sensed by people, and words are objects in those events. The Event Model has two stages: word learning and dimensionality reduction. Word learning learns the semantics of words using deep learning; dimensionality reduction represents a document as a low-dimensional vector through a linear mapping that is completely different from topic models. The Event Model achieves state-of-the-art results on document retrieval tasks.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
  • 45
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Given a database table with records that can be ranked, an interesting problem is to identify selection conditions for the table, which are qualified by an input record and render its ranking as high as possible among the qualifying tuples. In this paper, we study this standing maximization problem, which finds application in object promotion and characterization. After showing the hardness of the problem, we propose greedy methods, which are experimentally shown to achieve high accuracy compared to exhaustive enumeration, while scaling very well to the problem input size. Our contributions include a linear-time algorithm for determining the optimal selection range for an ordinal attribute and techniques for choosing and prioritizing the most promising selection predicates to apply. Experiments on real datasets confirm the effectiveness and efficiency of our techniques.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 46
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Some fairly recent research has focused on providing XACML-based solutions for dynamic privacy policy management. A number of works have enhanced the performance of the XACML policy enforcement point (PEP) component, but very few have focused on enhancing its accuracy. This paper improves the accuracy of an XACML PEP by filling some gaps in existing work, in particular by dynamically incorporating the user's access context into the privacy policy decision and its enforcement. We provide an XACML-based implementation of a dynamic privacy policy management framework and an evaluation of the applicability of our system in comparison to some existing approaches.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 47
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: This paper first introduces pattern aided regression (PXR) models, a new type of regression models designed to represent accurate and interpretable prediction models. This was motivated by two observations: (1) Regression modeling applications often involve complex diverse predictor-response relationships , which occur when the optimal regression models (of given regression model type) fitting two or more distinct logical groups of data are highly different. (2) State-of-the-art regression methods are often unable to adequately model such relationships. This paper defines PXR models using several patterns and local regression models, which respectively serve as logical and behavioral characterizations of distinct predictor-response relationships. The paper also introduces a contrast pattern aided regression (CPXR) method, to build accurate PXR models. In experiments, the PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by big margins. Usually using (a) around seven simple patterns and (b) linear local regression models, those PXR models are easy to interpret; in fact, their complexity is just a bit higher than that of (piecewise) linear regression models and is significantly lower than that of traditional ensemble based regression models. CPXR is especially effective for high-dimensional data. The paper also discusses how to use CPXR methodology for analyzing prediction models and correcting their prediction errors.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
  • 48
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: We analyze models for predicting the probability of a strikeout for a batter/pitcher matchup in baseball using player descriptors that can be estimated accurately from small samples. We start with the log5 model which has been used extensively for describing matchups in sports. Log5 is a special case of a logit model and we use constrained logistic regression over nearly one million matchup observations to assess the use of the log5 explanatory variables for this application. We also show that a batter/pitcher ground ball rate interaction variable is significant for the prediction of strikeout probability and we provide physical justification for the inclusion of this variable in the model. We quantify the differences among the models and show that batters control the majority of the variance in predicted strikeout rate.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
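Entry 48 above starts from the log5 matchup formula, which is compact enough to state directly. The sketch below implements only that baseline; the constrained logistic regression and the ground-ball interaction term studied in the paper are not included.

```python
# log5 estimate of a batter/pitcher strikeout probability given the league-average rate.
def log5(p_batter, p_pitcher, p_league):
    num = (p_batter * p_pitcher) / p_league
    den = num + ((1 - p_batter) * (1 - p_pitcher)) / (1 - p_league)
    return num / den

print(log5(0.25, 0.30, 0.20))   # e.g. batter 25%, pitcher 30%, league 20% -> about 0.36
```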
  • 49
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Eliseo Ferrante, Ali Emre Turgut, Edgar Duéñez-Guzmán, Marco Dorigo, Tom Wenseleers Division of labor is ubiquitous in biological systems, as evidenced by various forms of complex task specialization observed in both animal societies and multicellular organisms. Although clearly adaptive, the way in which division of labor first evolved remains enigmatic, as it requires the simultaneous co-occurrence of several complex traits to achieve the required degree of coordination. Recently, evolutionary swarm robotics has emerged as an excellent test bed to study the evolution of coordinated group-level behavior. Here we use this framework for the first time to study the evolutionary origin of behavioral task specialization among groups of identical robots. The scenario we study involves an advanced form of division of labor, common in insect societies and known as “task partitioning”, whereby two sets of tasks have to be carried out in sequence by different individuals. Our results show that task partitioning is favored whenever the environment has features that, when exploited, reduce switching costs and increase the net efficiency of the group, and that an optimal mix of task specialists is achieved most readily when the behavioral repertoires aimed at carrying out the different subtasks are available as pre-adapted building blocks. Nevertheless, we also show for the first time that self-organized task specialization could be evolved entirely from scratch, starting only from basic, low-level behavioral primitives, using a nature-inspired evolutionary method known as Grammatical Evolution. Remarkably, division of labor was achieved merely by selecting on overall group performance, and without providing any prior information on how the global object retrieval task was best divided into smaller subtasks. We discuss the potential of our method for engineering adaptively behaving robot swarms and interpret our results in relation to the likely path that nature took to evolve complex sociality and task specialization.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 50
    Publication Date: 2015-08-07
    Description: by Patrícia Santos-Oliveira, António Correia, Tiago Rodrigues, Teresa M Ribeiro-Rodrigues, Paulo Matafome, Juan Carlos Rodríguez-Manzaneque, Raquel Seiça, Henrique Girão, Rui D. M. Travasso Sprouting angiogenesis, where new blood vessels grow from pre-existing ones, is a complex process where biochemical and mechanical signals regulate endothelial cell proliferation and movement. Therefore, a mathematical description of sprouting angiogenesis has to take into consideration biological signals as well as relevant physical processes, in particular the mechanical interplay between adjacent endothelial cells and the extracellular microenvironment. In this work, we introduce the first phase-field continuous model of sprouting angiogenesis capable of predicting sprout morphology as a function of the elastic properties of the tissues and the traction forces exerted by the cells. The model is very compact, only consisting of three coupled partial differential equations, and has the clear advantage of a reduced number of parameters. This model allows us to describe sprout growth as a function of the cell-cell adhesion forces and the traction force exerted by the sprout tip cell. In the absence of proliferation, we observe that the sprout either achieves a maximum length or, when the traction and adhesion are very large, it breaks. Endothelial cell proliferation alters significantly sprout morphology, and we explore how different types of endothelial cell proliferation regulation are able to determine the shape of the growing sprout. The largest region in parameter space with well formed long and straight sprouts is obtained always when the proliferation is triggered by endothelial cell strain and its rate grows with angiogenic factor concentration. We conclude that in this scenario the tip cell has the role of creating a tension in the cells that follow its lead. On those first stalk cells, this tension produces strain and/or empty spaces, inevitably triggering cell proliferation. The new cells occupy the space behind the tip, the tension decreases, and the process restarts. Our results highlight the ability of mathematical models to suggest relevant hypotheses with respect to the role of forces in sprouting, hence underlining the necessary collaboration between modelling and molecular biology techniques to improve the current state-of-the-art.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 51
    Publication Date: 2015-08-08
Description: by Sayed-Rzgar Hosseini, Aditya Barve, Andreas Wagner All biological evolution takes place in a space of possible genotypes and their phenotypes. The structure of this space defines the evolutionary potential and limitations of an evolving system. Metabolism is one of the most ancient and fundamental evolving systems, sustaining life by extracting energy from extracellular nutrients. Here we study metabolism’s potential for innovation by analyzing an exhaustive genotype-phenotype map for a space of 10^15 metabolisms that encodes all possible subsets of 51 reactions in central carbon metabolism. Using flux balance analysis, we predict the viability of these metabolisms on 10 different carbon sources which give rise to 1024 potential metabolic phenotypes. Although viable metabolisms with any one phenotype comprise a tiny fraction of genotype space, their absolute numbers exceed 10^9 for some phenotypes. Metabolisms with any one phenotype typically form a single network of genotypes that extends far or all the way through metabolic genotype space, where any two genotypes can be reached from each other through a series of single reaction changes. The minimal distance of genotype networks associated with different phenotypes is small, such that one can reach metabolisms with novel phenotypes – viable on new carbon sources – through one or few genotypic changes. Exceptions to these principles exist for those metabolisms whose complexity (number of reactions) is close to the minimum needed for viability. Increasing metabolic complexity enhances the potential for both evolutionary conservation and evolutionary innovation.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 52
    Publication Date: 2015-08-19
    Description: by Pengxing Cao, Ada W. C. Yan, Jane M. Heffernan, Stephen Petrie, Robert G. Moss, Louise A. Carolan, Teagan A. Guarnaccia, Anne Kelso, Ian G. Barr, Jodie McVernon, Karen L. Laurie, James M. McCaw Influenza is an infectious disease that primarily attacks the respiratory system. Innate immunity provides both a very early defense to influenza virus invasion and an effective control of viral growth. Previous modelling studies of virus–innate immune response interactions have focused on infection with a single virus and, while improving our understanding of viral and immune dynamics, have been unable to effectively evaluate the relative feasibility of different hypothesised mechanisms of antiviral immunity. In recent experiments, we have applied consecutive exposures to different virus strains in a ferret model, and demonstrated that viruses differed in their ability to induce a state of temporary immunity or viral interference capable of modifying the infection kinetics of the subsequent exposure. These results imply that virus-induced early immune responses may be responsible for the observed viral hierarchy. Here we introduce and analyse a family of within-host models of re-infection viral kinetics which allow for different viruses to stimulate the innate immune response to different degrees. The proposed models differ in their hypothesised mechanisms of action of the non-specific innate immune response. We compare these alternative models in terms of their abilities to reproduce the re-exposure data. Our results show that 1) a model with viral control mediated solely by a virus-resistant state, as commonly considered in the literature, is not able to reproduce the observed viral hierarchy; 2) the synchronised and desynchronised behaviour of consecutive virus infections is highly dependent upon the interval between primary virus and challenge virus exposures and is consistent with virus-dependent stimulation of the innate immune response. Our study provides the first mechanistic explanation for the recently observed influenza viral hierarchies and demonstrates the importance of understanding the host response to multi-strain viral infections. Re-exposure experiments provide a new paradigm in which to study the immune response to influenza and its role in viral control.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 53
    Publication Date: 2015-08-21
Description: by Paul M. Harrison, Laurent Badel, Mark J. Wall, Magnus J. E. Richardson Models of neocortical networks are increasingly including the diversity of excitatory and inhibitory neuronal classes. Significant variability in cellular properties is also seen within a nominal neuronal class, and this heterogeneity can be expected to influence the population response and information processing in networks. Recent studies have examined the population and network effects of variability in a particular neuronal parameter with some plausibly chosen distribution. However, the empirical variability and covariance seen across multiple parameters are rarely included, partly due to the lack of data on parameter correlations in forms convenient for model construction. To address this we quantify the heterogeneity within and between the neocortical pyramidal-cell classes in layers 2/3, 4, and the slender-tufted and thick-tufted pyramidal cells of layer 5 using a combination of intracellular recordings, single-neuron modelling and statistical analyses. From the response to both square-pulse and naturalistic fluctuating stimuli, we examined the class-dependent variance and covariance of electrophysiological parameters and identified the role of the h current in generating parameter correlations. A byproduct of the dynamic I-V method we employed is the straightforward extraction of reduced neuron models from experiment. Empirically these models took the refractory exponential integrate-and-fire form and provide an accurate fit to the perisomatic voltage responses of the diverse pyramidal-cell populations when the class-dependent statistics of the model parameters were respected. By quantifying the parameter statistics we obtained an algorithm which generates populations of model neurons, for each of the four pyramidal-cell classes, that adhere to experimentally observed marginal distributions and parameter correlations. As well as providing this tool, which we hope will be of use for exploring the effects of heterogeneity in neocortical networks, we also provide the code for the dynamic I-V method and make the full electrophysiological data set available.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 54
    Publication Date: 2015-08-22
Description: Community detection in complex networks is an important problem that has attracted much interest in recent years. In general, a community detection algorithm chooses an objective function, captures the communities of the network by optimizing that objective, and then uses various heuristics to solve the optimization problem and extract the communities of interest to the user. In this article, we demonstrate a procedure for transforming a graph into points of a metric space and develop community detection methods based on a metric defined for pairs of points. We also study and analyze the resulting community structure of the network. The results obtained with our approach are very competitive with most of the well-known algorithms in the literature, as demonstrated over a large collection of datasets. Moreover, the time taken by our algorithm is considerably less than that of other methods, which supports the theoretical findings.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
  • 55
    Publication Date: 2015-08-19
    Description: by Ariel Afek, Hila Cohen, Shiran Barber-Zucker, Raluca Gordân, David B. Lukatsky Recent genome-wide experiments in different eukaryotic genomes provide an unprecedented view of transcription factor (TF) binding locations and of nucleosome occupancy. These experiments revealed that a large fraction of TF binding events occur in regions where only a small number of specific TF binding sites (TFBSs) have been detected. Furthermore, in vitro protein-DNA binding measurements performed for hundreds of TFs indicate that TFs are bound with wide range of affinities to different DNA sequences that lack known consensus motifs. These observations have thus challenged the classical picture of specific protein-DNA binding and strongly suggest the existence of additional recognition mechanisms that affect protein-DNA binding preferences. We have previously demonstrated that repetitive DNA sequence elements characterized by certain symmetries statistically affect protein-DNA binding preferences. We call this binding mechanism nonconsensus protein-DNA binding in order to emphasize the point that specific consensus TFBSs do not contribute to this effect. In this paper, using the simple statistical mechanics model developed previously, we calculate the nonconsensus protein-DNA binding free energy for the entire C . elegans and D . melanogaster genomes. Using the available chromatin immunoprecipitation followed by sequencing (ChIP-seq) results on TF-DNA binding preferences for ~100 TFs, we show that DNA sequences characterized by low predicted free energy of nonconsensus binding have statistically higher experimental TF occupancy and lower nucleosome occupancy than sequences characterized by high free energy of nonconsensus binding. This is in agreement with our previous analysis performed for the yeast genome. We suggest therefore that nonconsensus protein-DNA binding assists the formation of nucleosome-free regions, as TFs outcompete nucleosomes at genomic locations with enhanced nonconsensus binding. In addition, here we perform a new, large-scale analysis using in vitro TF-DNA preferences obtained from the universal protein binding microarrays (PBM) for ~90 eukaryotic TFs belonging to 22 different DNA-binding domain types. As a result of this new analysis, we conclude that nonconsensus protein-DNA binding is a widespread phenomenon that significantly affects protein-DNA binding preferences and need not require the presence of consensus (specific) TFBSs in order to achieve genome-wide TF-DNA binding specificity.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 56
    Publication Date: 2015-08-21
Description: A three-step iterative method with fifth-order convergence is presented as a new modification of Newton’s method. The method finds multiple roots of a nonlinear equation with unknown multiplicity m. Its order of convergence is analyzed and proved, and results for several numerical examples show the efficiency of the new method.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
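The abstract in entry 56 above does not spell out the three-step scheme itself, so the sketch below shows only the standard device such methods build on: applying Newton's iteration to u(x) = f(x)/f'(x), which turns a multiple root of f (of unknown multiplicity) into a simple root of u. It is an illustration, not the paper's fifth-order method.

```python
# Newton's method applied to u(x) = f(x)/f'(x) to handle roots of unknown multiplicity.
def newton_multiple_root(f, df, d2f, x0, tol=1e-12, max_iter=100):
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), df(x)
        if fx == 0.0 or dfx == 0.0:                  # landed exactly on the root
            break
        step = (fx / dfx) / (1.0 - fx * d2f(x) / dfx ** 2)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: f(x) = (x - 1)^3 (x + 2) has a root of multiplicity 3 at x = 1.
f   = lambda x: (x - 1) ** 3 * (x + 2)
df  = lambda x: 3 * (x - 1) ** 2 * (x + 2) + (x - 1) ** 3
d2f = lambda x: 6 * (x - 1) * (x + 2) + 6 * (x - 1) ** 2
print(newton_multiple_root(f, df, d2f, x0=2.0))      # converges to approximately 1.0
```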
  • 57
    Publication Date: 2015-08-20
    Description: by The PLOS Computational Biology Staff
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 58
    Publication Date: 2015-08-21
    Description: by James Tamerius, Cécile Viboud, Jeffrey Shaman, Gerardo Chowell While a relationship between environmental forcing and influenza transmission has been established in inter-pandemic seasons, the drivers of pandemic influenza remain debated. In particular, school effects may predominate in pandemic seasons marked by an atypical concentration of cases among children. For the 2009 A/H1N1 pandemic, Mexico is a particularly interesting case study due to its broad geographic extent encompassing temperate and tropical regions, well-documented regional variation in the occurrence of pandemic outbreaks, and coincidence of several school breaks during the pandemic period. Here we fit a series of transmission models to daily laboratory-confirmed influenza data in 32 Mexican states using MCMC approaches, considering a meta-population framework or the absence of spatial coupling between states. We use these models to explore the effect of environmental, school–related and travel factors on the generation of spatially-heterogeneous pandemic waves. We find that the spatial structure of the pandemic is best understood by the interplay between regional differences in specific humidity (explaining the occurrence of pandemic activity towards the end of the school term in late May-June 2009 in more humid southeastern states), school vacations (preventing influenza transmission during July-August in all states), and regional differences in residual susceptibility (resulting in large outbreaks in early fall 2009 in central and northern Mexico that had yet to experience fully-developed outbreaks). Our results are in line with the concept that very high levels of specific humidity, as present during summer in southeastern Mexico, favor influenza transmission, and that school cycles are a strong determinant of pandemic wave timing.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 59
    Publication Date: 2015-08-21
    Description: by Alireza Alemi, Carlo Baldassi, Nicolas Brunel, Riccardo Zecchina Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model simplicity and the locality of the synaptic update rules come at the cost of a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns to be memorized are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
  • 60
    Publication Date: 2015-08-12
Description: by Sander Land, Steven A. Niederer Biophysical models of cardiac tension development provide a succinct representation of our understanding of force generation in the heart. The link between protein kinetics and interactions that gives rise to high cooperativity is not yet fully explained from experiments or previous biophysical models. We propose a biophysical ODE-based representation of cross-bridge (XB), tropomyosin and troponin within a contractile regulatory unit (RU) to investigate the mechanisms behind cooperative activation, as well as the role of cooperativity in dynamic tension generation across different species. The model includes cooperative interactions between regulatory units (RU-RU), between crossbridges (XB-XB), as well as more complex interactions between crossbridges and regulatory units (XB-RU interactions). For the steady-state force-calcium relationship, our framework predicts that: (1) XB-RU effects are key in shifting the half-activation value of the force-calcium relationship towards lower [Ca2+], but have only small effects on cooperativity. (2) XB-XB effects approximately double the duty ratio of myosin, but do not significantly affect cooperativity. (3) RU-RU effects derived from the long-range action of tropomyosin are a major factor in cooperative activation, with each additional unblocked RU increasing the rate of further RU unblocking. (4) Myosin affinity for short (1–4 RU) unblocked stretches of actin is very low, and the resulting suppression of force at low [Ca2+] is a major contributor to the biphasic force-calcium relationship. We also reproduce isometric tension development across mouse, rat and human at physiological temperature and pacing rate, and conclude that species differences require only changes in myosin affinity and troponin I/troponin C affinity. Furthermore, we show that the calcium dependence of the rate of tension redevelopment k_tr is explained by transient blocking of RUs by a temporary decrease in XB-RU effects.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 61
    Public Library of Science (PLoS)
    Publication Date: 2015-08-12
Description: by Jonas Paulsen, Odin Gramstad, Philippe Collas The three-dimensional (3D) structure of the genome is important for orchestration of gene expression and cell differentiation. While mapping genomes in 3D has for a long time been elusive, recent adaptations of high-throughput sequencing to chromosome conformation capture (3C) techniques allow for genome-wide structural characterization for the first time. However, reconstruction of "consensus" 3D genomes from 3C-based data is a challenging problem, since the data are aggregated over millions of cells. Recent single-cell adaptations of the 3C technique, however, allow for non-aggregated assessment of genome structure, but the data suffer from sparse and noisy interaction sampling. We present a manifold-based optimization (MBO) approach for the reconstruction of 3D genome structure from chromosomal contact data. We show that MBO is able to reconstruct 3D structures based on the chromosomal contacts, imposing fewer structural violations than comparable methods. Additionally, MBO is suitable for efficient high-throughput reconstruction of large systems, such as entire genomes, allowing for comparative studies of genomic structure across cell lines and different species. (A simple classical embedding of contact data, for comparison, is sketched after this entry.)
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
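As a point of reference for the reconstruction problem in the entry above, the sketch below shows a classical alternative rather than the authors' manifold-based optimization: contact frequencies are turned into pseudo-distances by an assumed power law and embedded in 3D with metric multidimensional scaling (scikit-learn's MDS). The exponent and the toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import MDS

def reconstruct_3d(contacts, alpha=1.0, eps=1e-6):
    """Embed a symmetric contact-frequency matrix into 3D coordinates.

    High contact frequency is mapped to a short pseudo-distance via
    d ~ 1 / c**alpha, then metric MDS finds coordinates realizing those
    distances as well as possible (a stand-in for dedicated methods)."""
    c = np.asarray(contacts, dtype=float)
    d = 1.0 / np.power(c + eps, alpha)
    np.fill_diagonal(d, 0.0)
    mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(d)

# toy usage: a small random symmetric "contact matrix" for 20 loci
rng = np.random.default_rng(1)
raw = rng.poisson(5, size=(20, 20)).astype(float)
contacts = (raw + raw.T) / 2.0
coords = reconstruct_3d(contacts)   # (20, 3) array of 3D positions
```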
  • 62
    Publication Date: 2015-08-12
Description: by Hiroo Kenzaki, Shoji Takada Nucleosomes, basic units of chromatin, are known to show spontaneous DNA unwrapping dynamics that are crucial for transcriptional activation, but their structural details are yet to be elucidated. Here, employing a coarse-grained molecular model that captures residue-level structural details up to histone tails, we simulated equilibrium fluctuations and forced unwrapping of single nucleosomes under various conditions. The equilibrium simulations showed spontaneous unwrapping from the outer DNA and subsequent rewrapping dynamics, which are in good agreement with experiments. We found several distinct partially unwrapped states of nucleosomes, as well as reversible transitions among these states. At a low salt concentration, histone tails tend to sit in the concave cleft between the histone octamer and DNA, tightening the nucleosome. At a higher salt concentration, the tails tend to bind to the outer side of the DNA or to expand outwards, which leads to a higher degree of unwrapping. Of the four types of histone tails, H3 and H2B tail dynamics are markedly correlated with partial unwrapping of DNA, and, moreover, their contributions were distinct. Acetylation in histone tails was simply mimicked by changing their charges, which enhanced the unwrapping, especially markedly for H3 and H2B tails.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 63
    Public Library of Science (PLoS)
    Publication Date: 2015-08-13
Description: by Sebastian Bitzer, Jelle Bruineberg, Stefan J. Kiebel Even for simple perceptual decisions, the mechanisms that the brain employs are still under debate. Although current consensus states that the brain accumulates evidence extracted from noisy sensory information, open questions remain about how this simple model relates to other perceptual phenomena such as flexibility in decisions, decision-dependent modulation of sensory gain, or confidence about a decision. We propose a novel account of how perceptual decisions are made, combining two influential formalisms into a new model. Specifically, we embed an attractor model of decision making into a probabilistic framework that models decision making as Bayesian inference. We show that the new model can explain decision making behaviour by fitting it to experimental data. In addition, the new model combines three important features for the first time: First, the model can update decisions in response to switches in the underlying stimulus. Second, the probabilistic formulation accounts for top-down effects that may explain recent experimental findings of decision-related gain modulation of sensory neurons. Finally, the model computes an explicit measure of confidence, which we relate to recent experimental evidence for confidence computations in perceptual decision tasks.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 64
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Over the past decade or so, several research groups have addressed the problem of multi-label classification where each example can belong to more than one class at the same time. A common approach, called Binary Relevance (BR), addresses this problem by inducing a separate classifier for each class. Research has shown that this framework can be improved if mutual class dependence is exploited: an example that belongs to class $X$ is likely to belong also to class $Y$; conversely, belonging to $X$ can make an example less likely to belong to $Z$. Several works sought to model this information by using the vector of class labels as additional example attributes. To fill the unknown values of these attributes during prediction, existing methods resort to using outputs of other classifiers, and this makes them prone to errors. This is where our paper wants to contribute. We identified two potential ways to prune unnecessary dependencies and to reduce error-propagation in our new classifier-stacking technique, which is named PruDent. Experimental results indicate that the classification performance of PruDent compares favorably with that of other state-of-the-art approaches over a broad range of testbeds. Moreover, its computational costs grow only linearly in the number of classes. (A generic sketch of binary relevance with label stacking follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
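The Binary Relevance baseline and the label-stacking idea discussed in the preceding entry can be sketched in scikit-learn: a first layer of per-label classifiers whose predicted labels are appended to the features of a second layer. This is a generic stacking illustration (logistic regression, in-sample meta-features), not the PruDent pruning scheme itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_br(X, Y):
    """Binary Relevance: one independent binary classifier per label column."""
    return [LogisticRegression(max_iter=1000).fit(X, Y[:, j]) for j in range(Y.shape[1])]

def predict_br(models, X):
    return np.column_stack([m.predict(X) for m in models])

def fit_stacked(X, Y):
    """Second layer: each label is re-learned from the original features plus
    the first-layer label predictions, so that label dependence is modelled.
    (A real system would generate the meta-features by cross-validation.)"""
    base = fit_br(X, Y)
    meta = fit_br(np.hstack([X, predict_br(base, X)]), Y)
    return base, meta

def predict_stacked(base, meta, X):
    return predict_br(meta, np.hstack([X, predict_br(base, X)]))

# toy usage with random data: 200 examples, 10 features, 3 labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = (rng.random((200, 3)) < 0.4).astype(int)
base, meta = fit_stacked(X, Y)
print(predict_stacked(base, meta, X[:5]))
```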
  • 65
    Publication Date: 2015-08-07
Description: This work deals with the problem of producing fast and accurate data classification, learning it from a possibly small set of records that are already classified. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), but enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared to that of the standard LAD algorithm, of support vector machines, and of a label propagation algorithm on publicly available datasets from the UCI repository. Encouraging results are obtained and discussed.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 66
    Publication Date: 2015-08-07
Description: A new graph-based constrained semi-supervised learning (G-CSSL) framework is proposed. Pairwise constraints (PC) are used to specify the types (intra- or inter-class) of points with labels. Since the number of labeled data is typically small in the SSL setting, the core idea of this framework is to create and enrich the PC sets using the propagated soft labels from both labeled and unlabeled data by special label propagation (SLP), and hence obtain more supervised information for delivering enhanced performance. We also propose a Two-stage Sparse Coding method, termed TSC, for achieving an adaptive neighborhood for SLP. The first stage aims at correcting the possible corruptions in the data and training an informative dictionary, and the second stage focuses on sparse coding. To deliver enhanced inter-class separation and intra-class compactness, we also present a mixed soft-similarity measure to evaluate the similarity/dissimilarity of constrained pairs using the sparse codes and the probabilistic values output by SLP. Simulations on synthetic and real datasets demonstrate the validity of our algorithms for data representation and image recognition, compared with other related state-of-the-art graph-based semi-supervised techniques. (A minimal sketch of plain graph-based label propagation follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
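The special label propagation step at the core of the framework above builds on standard graph-based propagation of soft labels from the few labelled points to the unlabelled ones. The sketch below shows plain label spreading with scikit-learn; it includes neither the pairwise-constraint enrichment nor the two-stage sparse coding, and the kernel settings are assumed values.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# two Gaussian clusters; only a handful of points carry labels (-1 = unlabeled)
X = np.vstack([rng.normal(-2, 1, size=(100, 2)), rng.normal(2, 1, size=(100, 2))])
y = np.full(200, -1)
y[:3] = 0          # a few labelled points from cluster 0
y[100:103] = 1     # a few labelled points from cluster 1

# propagate soft labels over a kNN graph built from the data
model = LabelSpreading(kernel="knn", n_neighbors=7, alpha=0.2)
model.fit(X, y)

soft = model.label_distributions_   # per-point class probabilities (soft labels)
hard = model.transduction_          # propagated hard labels for every point
print(hard[:5], hard[100:105])
```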
  • 67
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: In large databases, the amount and the complexity of the data call for data summarization techniques. Such summaries are used to assist fast approximate query answering or query optimization. Histograms are a prominent class of model-free data summaries and are widely used in database systems. So-called self-tuning histograms look at query-execution results to refine themselves. An assumption with such histograms, which has not been questioned so far, is that they can learn the dataset from scratch, that is, starting with an empty bucket configuration. We show that this is not the case. Self-tuning methods are very sensitive to the initial configuration. Three major problems stem from this. Traditional self-tuning is unable to learn projections of multi-dimensional data, is sensitive to the order of queries, and reaches only local optima with high estimation errors. We show how to improve a self-tuning method significantly by starting with a carefully chosen initial configuration. We propose initialization by dense subspace clusters in projections of the data, which improves both the accuracy and the robustness of self-tuning. Our experiments on different datasets show that the error rate is typically halved compared to the uninitialized version.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 68
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Recently, two ideas have been explored that lead to more accurate algorithms for time-series classification (TSC). First, it has been shown that the simplest way to gain improvement on TSC problems is to transform into an alternative data space where discriminatory features are more easily detected. Second, it was demonstrated that with a single data representation, improved accuracy can be achieved through simple ensemble schemes. We combine these two principles to test the hypothesis that forming a collective of ensembles of classifiers on different data transformations improves the accuracy of time-series classification. The collective contains classifiers constructed in the time, frequency, change, and shapelet transformation domains. For the time domain, we use a set of elastic distance measures. For the other domains, we use a range of standard classifiers. Through extensive experimentation on 72 datasets, including all of the 46 UCR datasets, we demonstrate that the simple collective formed by including all classifiers in one ensemble is significantly more accurate than any of its components and any other previously published TSC algorithm. We investigate alternative hierarchical collective structures and demonstrate the utility of the approach on a new problem involving classifying Caenorhabditis elegans mutant types.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
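The core idea in the preceding entry, ensembling classifiers built on different representations of the same series, can be shown compactly: one classifier on the raw time-domain series, one on its amplitude spectrum, with their probabilities averaged. The choice of 1-NN and random forest members and of the FFT magnitude as the frequency transform are illustrative assumptions, not the published collective.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

def fit_collective(X, y):
    """Fit one classifier per representation (time domain, frequency domain)."""
    time_clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
    freq = np.abs(np.fft.rfft(X, axis=1))                      # amplitude spectrum
    freq_clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(freq, y)
    return time_clf, freq_clf

def predict_collective(time_clf, freq_clf, X):
    """Average the class-probability estimates of the two ensemble members."""
    p_time = time_clf.predict_proba(X)
    p_freq = freq_clf.predict_proba(np.abs(np.fft.rfft(X, axis=1)))
    return np.argmax((p_time + p_freq) / 2.0, axis=1)

# toy usage: two classes of noisy sinusoids with different frequencies
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 128)
X0 = np.sin(2 * np.pi * 3 * t) + rng.normal(0, 0.5, size=(50, 128))
X1 = np.sin(2 * np.pi * 7 * t) + rng.normal(0, 0.5, size=(50, 128))
X, y = np.vstack([X0, X1]), np.array([0] * 50 + [1] * 50)
time_clf, freq_clf = fit_collective(X, y)
print(predict_collective(time_clf, freq_clf, X[:5]))
```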
  • 69
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: In real-world graphs such as social networks, the Semantic Web and biological networks, each vertex usually contains rich information, which can be modeled by a set of tokens or elements. In this paper, we study a subgraph matching with set similarity (SMS$^2$) query over a large graph database, which retrieves subgraphs that are structurally isomorphic to the query graph, and meanwhile satisfy the condition of vertex pair matching with the (dynamic) weighted set similarity. To efficiently process the SMS$^2$ query, this paper designs a novel lattice-based index for the data graph, and lightweight signatures for both query vertices and data vertices. Based on the index and signatures, we propose an efficient two-phase pruning strategy including set similarity pruning and structure-based pruning, which exploits the unique features of both (dynamic) weighted set similarity and graph topology. We also propose an efficient dominating-set-based subgraph matching algorithm guided by a dominating set selection algorithm to achieve better query performance. Extensive experiments on both real and synthetic datasets demonstrate that our method outperforms state-of-the-art methods by an order of magnitude.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 70
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Data imputation aims at filling in missing attribute values in databases. Most existing imputation methods for string attribute values are inferring-based approaches, which usually fail to reach a high imputation recall by just inferring missing values from the complete part of the data set. Recently, some retrieving-based methods have been proposed to retrieve missing values from external resources such as the World Wide Web, which tend to reach a much higher imputation recall, but inevitably bring a large overhead by issuing a large number of search queries. In this paper, we investigate the interaction between the inferring-based methods and the retrieving-based methods. We show that retrieving a small number of selected missing values can greatly improve the imputation recall of the inferring-based methods. With this intuition, we propose an inTeractive Retrieving-Inferring data imPutation approach (TRIP), which performs retrieving and inferring alternately when filling in missing attribute values in a data set. To ensure high recall at the minimum cost, TRIP faces the challenge of selecting the least number of missing values for retrieving so as to maximize the number of inferable values. Our proposed solution is able to identify an optimal retrieving-inferring scheduling scheme in deterministic data imputation, and the optimality of the generated scheme is theoretically analyzed with proofs. We also show with an example that the optimal scheme is not feasible in $\tau$-constrained stochastic data imputation ($\tau$-SDI), but still, our proposed solution identifies an expected-optimal scheme in $\tau$-SDI. Extensive experiments on four data collections show that TRIP retrieves on average 20 percent of the missing values and achieves the same high recall that was reached by the retrieving-based approach.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 71
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Visual classification has attracted considerable research interest in the past decades. In this paper, a novel $\ell_1$-hypergraph model for visual classification is proposed. Hypergraph learning, as a natural extension of the graph model, has been widely used in many machine learning tasks. In previous work, the hypergraph is usually constructed by attribute-based or neighborhood-based methods. That is, a hyperedge is generated by connecting a set of samples that share the same feature attribute or lie in a common neighborhood. However, these methods are unable to explore the feature space globally or are sensitive to noise. To address these problems, we propose a novel hypergraph construction approach that leverages sparse representation to generate hyperedges and learns the relationship among hyperedges and their vertices. First, for each sample, a hyperedge is generated by regarding it as the centroid and linking it as well as its nearest neighbors. Then, the sparse representation method is applied to represent the centroid vertex by the other vertices within the same hyperedge. The vertices with zero coefficients are removed from the hyperedge. Finally, the representation coefficients are used to define the incidence relation between the hyperedge and the vertices. In our approach, we also optimize the hyperedge weights to modulate the effects of different hyperedges. We leverage the prior knowledge on the hyperedges so that hyperedges sharing more vertices can have closer weights, where a graph Laplacian is used to regularize the optimization of the weights. Our approach is named $\ell_1$-hypergraph since the $\ell_1$ sparse representation is employed in the hypergraph construction process. The method is evaluated on various visual classification tasks, and it demonstrates promising performance. (A minimal sparse-coding sketch of this hyperedge construction follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
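The hyperedge construction in the entry above (represent a centroid sample sparsely in terms of its nearest neighbours and keep only the vertices with nonzero coefficients as incidence weights) can be sketched with an ℓ1-regularised regression. Using scikit-learn's Lasso, and the neighbourhood size and regularisation strength below, are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neighbors import NearestNeighbors

def build_hyperedge(X, centroid_idx, k=10, alpha=0.01):
    """Return (vertex indices, incidence weights) of the hyperedge centred at
    sample `centroid_idx`: the centroid is regressed on its k nearest
    neighbours with an l1 penalty, and neighbours whose coefficient is zero
    are dropped from the hyperedge."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X[centroid_idx:centroid_idx + 1])
    neighbors = idx[0][1:]                         # drop the centroid itself
    coder = Lasso(alpha=alpha, max_iter=10000)
    coder.fit(X[neighbors].T, X[centroid_idx])     # columns = neighbour samples
    keep = np.flatnonzero(coder.coef_)
    return neighbors[keep], coder.coef_[keep]

# toy usage: data lying near a 5-dimensional linear manifold in 30 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 30)) + 0.05 * rng.normal(size=(200, 30))
vertices, weights = build_hyperedge(X, centroid_idx=0)
print(vertices, np.round(weights, 3))
```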
  • 72
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
    Description: by Malachi Griffith, Jason R. Walker, Nicholas C. Spies, Benjamin J. Ainscough, Obi L. Griffith Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 73
    Public Library of Science (PLoS)
    Publication Date: 2015-08-07
Description: by Arjun Bharioke, Dmitri B. Chklovskii Neurons must faithfully encode signals that can vary over many orders of magnitude despite having only limited dynamic ranges. For a correlated signal, this dynamic range constraint can be relieved by subtracting away components of the signal that can be predicted from the past, a strategy known as predictive coding, which relies on learning the input statistics. However, the statistics of natural input signals can also vary over very short time scales, e.g., following saccades across a visual scene. To maintain a reduced transmission cost for signals with rapidly varying statistics, neuronal circuits implementing predictive coding must also rapidly adapt their properties. Experimentally, in different sensory modalities, sensory neurons have shown such adaptations within 100 ms of an input change. Here, we show first that linear neurons connected in a feedback inhibitory circuit can implement predictive coding. We then show that adding a rectification nonlinearity to such a feedback inhibitory circuit allows it to automatically adapt and approximate the performance of an optimal linear predictive coding network, over a wide range of inputs, while keeping its underlying temporal and synaptic properties unchanged. We demonstrate that the resulting changes to the linearized temporal filters of this nonlinear network match the fast adaptations observed experimentally in different sensory modalities, in different vertebrate species. Therefore, the nonlinear feedback inhibitory network can provide automatic adaptation to rapidly varying signals, maintaining the dynamic range necessary for accurate neuronal transmission of natural inputs. (A minimal sketch of linear predictive coding follows this entry.)
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
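Predictive coding as used in the preceding entry means transmitting only the part of a signal that cannot be predicted from its own past. The sketch below shows the linear version: a least-squares predictor is fit to lagged samples and the transmitted residual has far smaller variance than the raw correlated signal. The AR(2) test signal and the fixed (non-adaptive) predictor are illustrative assumptions; the paper's feedback-inhibition circuit is not reproduced here.

```python
import numpy as np

def linear_predictive_code(x, order=3):
    """Fit a linear predictor of each sample from its `order` previous samples
    (ordinary least squares) and return the weights and prediction residual,
    i.e., the part of the signal that would actually need to be transmitted."""
    # lagged design matrix: row t holds x[t-1], ..., x[t-order]
    A = np.column_stack([x[order - k - 1: len(x) - k - 1] for k in range(order)])
    y = x[order:]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w, y - A @ w

# toy usage: a strongly correlated AR(2) signal
rng = np.random.default_rng(0)
n = 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 1.5 * x[t - 1] - 0.7 * x[t - 2] + rng.normal(0, 1.0)

w, residual = linear_predictive_code(x, order=3)
print("signal variance  :", round(float(np.var(x[3:])), 2))
print("residual variance:", round(float(np.var(residual)), 2))   # much smaller
```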
  • 74
    Publication Date: 2015-08-08
Description: by Murat Alp, Vipan K. Parihar, Charles L. Limoli, Francis A. Cucinotta In this work, a stochastic computational model of microscopic energy deposition events is used to study for the first time damage to irradiated neuronal cells of the mouse hippocampus. An extensive library of radiation tracks for different particle types is created to score energy deposition in small voxels and volume segments describing a neuron’s morphology, which are later sampled for a given particle fluence or dose. Methods included the construction of in silico mouse hippocampal granule cells from neuromorpho.org with spine and filopodia segments stochastically distributed along the dendritic branches. The model is tested with high-energy 56Fe, 12C, and 1H particles and electrons. Results indicate that the tree-like structure of the neuronal morphology and the microscopic dose deposition of distinct particles may lead to different outcomes when cellular injury is assessed, leading to differences in structural damage for the same absorbed dose. The significance of the microscopic dose in neuron components is to introduce specific local and global modes of cellular injury that likely contribute to spine, filopodia, and dendrite pruning, impacting cognition and possibly the collapse of the neuron. Results show that the heterogeneity of heavy particle tracks at low doses, compared to the more uniform dose distribution of electrons, juxtaposed with neuron morphology makes it necessary to model the spatial dose painting for specific neuronal components. Going forward, this work can directly support the development of biophysical models of the modifications of spine and dendritic morphology observed after low dose charged particle irradiation by providing accurate descriptions of the underlying physical insults to complex neuron structures at the nanometer scale.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 75
    Public Library of Science (PLoS)
    Publication Date: 2015-08-08
    Description: by Brinda Vallat, Carlos Madrid-Aliste, Andras Fiser Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly require template-free methods. However, template-free modeling methods are much less reliable and are usually applicable for smaller proteins, leaving much space for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger in size compared to the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ an exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state of the art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 76
    Publication Date: 2015-07-30
Description: In this paper, we present three improvements to a three-point third order variant of Newton’s method derived from the Simpson rule. The first one is a fifth order method using the same number of functional evaluations as the third order method, the second one is a four-point 10th order method and the last one is a five-point 20th order method. From a computational point of view, our methods require four evaluations (one function and three first derivatives) to get fifth order, five evaluations (two functions and three derivatives) to get 10th order and six evaluations (three functions and three derivatives) to get 20th order. Hence, these methods have efficiency indexes of 1.495, 1.585 and 1.648, respectively, which are better than the efficiency index of 1.316 of the third order method. We test the methods through some numerical experiments which show that the 20th order method is very efficient. (A quick check of these efficiency indexes follows this entry.)
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
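The efficiency indexes quoted above follow the standard definition E = p^(1/d), with p the order of convergence and d the number of function or derivative evaluations per iteration. A quick check under that assumption:

```python
# efficiency index E = p**(1/d): order p, d evaluations per iteration
methods = {
    "third order (4 evals)": (3, 4),
    "fifth order (4 evals)": (5, 4),
    "10th order  (5 evals)": (10, 5),
    "20th order  (6 evals)": (20, 6),
}
for name, (p, d) in methods.items():
    print(f"{name}: {p ** (1.0 / d):.3f}")
# prints 1.316, 1.495, 1.585 and 1.648, matching the values quoted in the abstract
```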
  • 77
    Publication Date: 2015-08-05
    Description: by Po-Wei Chen, Luis L. Fonseca, Yusuf A. Hannun, Eberhard O. Voit The article demonstrates that computational modeling has the capacity to convert metabolic snapshots, taken sequentially over time, into a description of cellular, dynamic strategies. The specific application is a detailed analysis of a set of actions with which Saccharomyces cerevisiae responds to heat stress. Using time dependent metabolic concentration data, we use a combination of mathematical modeling, reverse engineering, and optimization to infer dynamic changes in enzyme activities within the sphingolipid pathway. The details of the sphingolipid responses to heat stress are important, because they guide some of the longer-term alterations in gene expression, with which the cells adapt to the increased temperature. The analysis indicates that all enzyme activities in the system are affected and that the shapes of the time trends in activities depend on the fatty-acyl CoA chain lengths of the different ceramide species in the system.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 78
    Publication Date: 2015-07-30
Description: Robust detection of small targets with a low signal-to-noise ratio (SNR) is very important in infrared search and track applications for self-defense or attack. Due to the complex background, current algorithms have some unsolved issues with the false alarm rate. In order to reduce the false alarm rate, an infrared small target detection algorithm based on saliency detection and a support vector machine is proposed. Firstly, we detect salient regions that may contain targets with the phase spectrum Fourier transform (PFT) approach. Then, target recognition is performed in the salient regions. Experimental results show that the proposed algorithm offers good robustness and efficiency for real infrared small target detection applications. (A minimal sketch of the PFT saliency step follows this entry.)
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
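The phase spectrum Fourier transform (PFT) saliency step described in the preceding entry has a very short formulation: keep only the phase of the image's Fourier transform, invert, square and smooth. A minimal NumPy/SciPy sketch, with the smoothing width and the toy image as assumed parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def pft_saliency(image, sigma=3.0):
    """Phase-spectrum-of-Fourier-transform saliency map: the amplitude spectrum
    is discarded (set to 1), so the reconstruction emphasises small,
    'unexpected' structures such as dim point targets."""
    f = np.fft.fft2(image.astype(float))
    phase_only = np.exp(1j * np.angle(f))            # unit amplitude, same phase
    recon = np.abs(np.fft.ifft2(phase_only)) ** 2    # squared reconstruction
    sal = gaussian_filter(recon, sigma=sigma)        # smooth to form the map
    return sal / sal.max()

# toy usage: a small bright target on smooth clutter
rng = np.random.default_rng(0)
img = gaussian_filter(rng.normal(0, 1, (128, 128)), 8)   # low-frequency background
img[60:63, 60:63] += 5.0                                  # 3x3 "target"
sal = pft_saliency(img)
print(np.unravel_index(np.argmax(sal), sal.shape))        # should peak near (61, 61)
```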
  • 79
    Publication Date: 2015-08-06
    Description: In dynamic propagation environments, beamforming algorithms may suffer from strong interference, steering vector mismatches, a low convergence speed and a high computational complexity. Reduced-rank signal processing techniques provide a way to address the problems mentioned above. This paper presents a low-complexity robust data-dependent dimensionality reduction based on an iterative optimization with steering vector perturbation (IOVP) algorithm for reduced-rank beamforming and steering vector estimation. The proposed robust optimization procedure jointly adjusts the parameters of a rank reduction matrix and an adaptive beamformer. The optimized rank reduction matrix projects the received signal vector onto a subspace with lower dimension. The beamformer/steering vector optimization is then performed in a reduced dimension subspace. We devise efficient stochastic gradient and recursive least-squares algorithms for implementing the proposed robust IOVP design. The proposed robust IOVP beamforming algorithms result in a faster convergence speed and an improved performance. Simulation results show that the proposed IOVP algorithms outperform some existing full-rank and reduced-rank algorithms with a comparable complexity.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 80
    Publication Date: 2015-08-07
    Description: Recently, wireless sensor networks (WSNs) have drawn great interest due to their outstanding monitoring and management potential in medical, environmental and industrial applications. Most of the applications that employ WSNs demand all of the sensor nodes to run on a common time scale, a requirement that highlights the importance of clock synchronization. The clock synchronization problem in WSNs is inherently related to parameter estimation. The accuracy of clock synchronization algorithms depends essentially on the statistical properties of the parameter estimation algorithms. Recently, studies dedicated to the estimation of synchronization parameters, such as clock offset and skew, have begun to emerge in the literature. The aim of this article is to provide an overview of the state-of-the-art clock synchronization algorithms for WSNs from a statistical signal processing point of view. This article focuses on describing the key features of the class of clock synchronization algorithms that exploit the traditional two-way message (signal) exchange mechanism. Upon introducing the two-way message exchange mechanism, the main clock offset estimation algorithms for pairwise synchronization of sensor nodes are first reviewed, and their performance is compared. The class of fully-distributed clock offset estimation algorithms for network-wide synchronization is then surveyed. The paper concludes with a list of open research problems pertaining to clock synchronization of WSNs.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
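For the two-way message exchange mechanism reviewed in the preceding entry, the classical clock offset and delay estimators have a closed form in the four timestamps T1..T4, assuming symmetric link delays. The sketch below is that textbook estimator averaged over several exchanges; the jitter model and the numbers are illustrative.

```python
import numpy as np

def estimate_offset_delay(T1, T2, T3, T4):
    """Two-way exchange estimator: T1 = send time at node A, T2 = receive time
    at node B, T3 = reply time at B, T4 = receive time back at A (arrays,
    one entry per exchange). Assumes symmetric one-way delays."""
    offset = ((T2 - T1) - (T4 - T3)) / 2.0    # clock offset of B relative to A
    delay = ((T2 - T1) + (T4 - T3)) / 2.0     # one-way propagation delay
    return offset.mean(), delay.mean()

# toy simulation: true offset 5 ms, true one-way delay 2 ms, Gaussian jitter
rng = np.random.default_rng(0)
n = 50
true_offset, true_delay = 5.0, 2.0
T1 = np.sort(rng.uniform(0, 1000, n))
T2 = T1 + true_delay + true_offset + rng.normal(0, 0.1, n)
T3 = T2 + rng.uniform(0.5, 1.5, n)            # processing time at node B
T4 = T3 + true_delay - true_offset + rng.normal(0, 0.1, n)
print(estimate_offset_delay(T1, T2, T3, T4))  # close to (5.0, 2.0)
```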
  • 81
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: We consider the problem of adaptively routing a fleet of cooperative vehicles within a road network in the presence of uncertain and dynamic congestion conditions. To tackle this problem, we first propose a Gaussian process dynamic congestion model that can effectively characterize both the dynamics and the uncertainty of congestion conditions. Our model is efficient and thus facilitates real-time adaptive routing in the face of uncertainty. Using this congestion model, we develop efficient algorithms for non-myopic adaptive routing to minimize the collective travel time of all vehicles in the system. A key property of our approach is the ability to efficiently reason about the long-term value of exploration, which enables collectively balancing the exploration/exploitation trade-off for entire fleets of vehicles. Our approach is validated by traffic data from two large Asian cities. Our congestion model is shown to be effective in modeling dynamic congestion conditions. Our routing algorithms also generate significantly faster routes compared to standard baselines, and achieve near-optimal performance compared to an omniscient routing algorithm. We also present the results from a preliminary field study, which showcases the efficacy of our approach.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 82
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Betweenness centrality is a classic measure that quantifies the importance of a graph element (vertex or edge) according to the fraction of shortest paths passing through it. This measure is notoriously expensive to compute, and the best known algorithm runs in $\mathcal{O}(nm)$ time. The problems of efficiency and scalability are exacerbated in a dynamic setting, where the input is an evolving graph seen edge by edge, and the goal is to keep the betweenness centrality up to date. In this paper, we propose the first truly scalable algorithm for online computation of betweenness centrality of both vertices and edges in an evolving graph where new edges are added and existing edges are removed. Our algorithm is carefully engineered with out-of-core techniques and tailored for modern parallel stream processing engines that run on clusters of shared-nothing commodity hardware. Hence, it is amenable to real-world deployment. We experiment on graphs that are two orders of magnitude larger than previous studies. Our method is able to keep the betweenness centrality measures up to date online, i.e., the time to update the measures is smaller than the inter-arrival time between two consecutive updates.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 83
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
Description: Phase change memory (PCM) is non-volatile memory that is byte-addressable. It is two to four times denser than DRAM, orders of magnitude better than NAND Flash memory in read latency, and 10 times better than NAND Flash memory in write endurance. However, it still limits the number of write operations to at most $10^6$ times per PCM cell. To extend its lifetime, it is necessary to evenly distribute write operations over all the memory cells. Up to now, the $\mathrm{B^{+}}$-Tree index structure has been used to quickly locate a search key in a relational database management system (RDBMS). All the record keys in each node are sorted and packed upon insertion in, and deletion from, the $\mathrm{B^{+}}$-Tree. In addition, a counter keeps track of the number of valid keys in the $\mathrm{B^{+}}$-Tree. Consequently, a $\mathrm{B^{+}}$-Tree algorithm results in a large number of write operations, which deteriorates the endurance of PCM. This restricts the usage of PCM on a database server and deteriorates performance of database servers. In this paper, we propose a novel PCM-aware $\mathrm{B^{+}}$-Tree index structure, called $\mathrm{PB^{+}}$-Tree, to provide wear-leveling in PCM. According to our experiment results, $\mathrm{PB^{+}}$-Tree is much faster than the existing $\mathrm{B^{+}}$-Tree algorithms for PCM and NAND Flash memory with versatile workloads. More importantly, our scheme also greatly reduces the number of write operations compared to other $\mathrm{B^{+}}$-Tree algorithms. All of these results suggest that $\mathrm{PB^{+}}$-Tree is the $\mathrm{B^{+}}$-Tree algorithm best fitted to PCM.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 84
    Publication Date: 2015-08-08
Description: by Pengyi Yang, Xiaofeng Zheng, Vivek Jayaswal, Guang Hu, Jean Yee Hwa Yang, Raja Jothi Cell signaling underlies transcription/epigenetic control of a vast majority of cell-fate decisions. A key goal in cell signaling studies is to identify the set of kinases that underlie key signaling events. In a typical phosphoproteomics study, phosphorylation sites (substrates) of active kinases are quantified proteome-wide. By analyzing the activities of phosphorylation sites over a time-course, the temporal dynamics of signaling cascades can be elucidated. Since many substrates of a given kinase have similar temporal kinetics, clustering phosphorylation sites into distinctive clusters can facilitate identification of their respective kinases. Here we present a knowledge-based CLUster Evaluation (CLUE) approach for identifying the most informative partitioning of a given temporal phosphoproteomics dataset. Our approach utilizes prior knowledge, annotated kinase-substrate relationships mined from the literature and curated databases, to first generate biologically meaningful partitioning of the phosphorylation sites and then determine key kinases associated with each cluster. We demonstrate the utility of the proposed approach on two time-series phosphoproteomics datasets and identify key kinases associated with human embryonic stem cell differentiation and the insulin signaling pathway. The proposed approach will be a valuable resource in the identification and characterization of signaling networks from phosphoproteomics data.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 85
    Publication Date: 2015-08-13
    Description: by Deborah A. Striegel, Manami Hara, Vipul Periwal Pancreatic islets of Langerhans consist of endocrine cells, primarily α, β and δ cells, which secrete glucagon, insulin, and somatostatin, respectively, to regulate plasma glucose. β cells form irregular locally connected clusters within islets that act in concert to secrete insulin upon glucose stimulation. Due to the central functional significance of this local connectivity in the placement of β cells in an islet, it is important to characterize it quantitatively. However, quantification of the seemingly stochastic cytoarchitecture of β cells in an islet requires mathematical methods that can capture topological connectivity in the entire β-cell population in an islet. Graph theory provides such a framework. Using large-scale imaging data for thousands of islets containing hundreds of thousands of cells in human organ donor pancreata, we show that quantitative graph characteristics differ between control and type 2 diabetic islets. Further insight into the processes that shape and maintain this architecture is obtained by formulating a stochastic theory of β-cell rearrangement in whole islets, just as the normal equilibrium distribution of the Ornstein-Uhlenbeck process can be viewed as the result of the interplay between a random walk and a linear restoring force. Requiring that rearrangements maintain the observed quantitative topological graph characteristics strongly constrained possible processes. Our results suggest that β-cell rearrangement is dependent on its connectivity in order to maintain an optimal cluster size in both normal and T2D islets.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 86
    Publication Date: 2015-08-15
Description: by Ghanim Ullah, Yina Wei, Markus A Dahlem, Martin Wechselberger, Steven J Schiff Cell volume changes are ubiquitous in normal and pathological activity of the brain. Nevertheless, we know little of how cell volume affects neuronal dynamics. We here performed the first detailed study of the effects of cell volume on neuronal dynamics. By incorporating cell swelling together with dynamic ion concentrations and oxygen supply into Hodgkin-Huxley type spiking dynamics, we demonstrate the spontaneous transition between epileptic seizure and spreading depression states as the cell swells and contracts in response to changes in osmotic pressure. Our use of volume as an order parameter further revealed a dynamical definition for the experimentally described physiological ceiling that separates seizure from spreading depression, as well as predicted a second ceiling that demarcates spreading depression from anoxic depolarization. Our model highlights the neuroprotective role of glial K buffering against seizures and spreading depression, and provides novel insights into anoxic depolarization and the relevant cell swelling during ischemia. We argue that the dynamics of seizures, spreading depression, and anoxic depolarization lie along a continuum of the repertoire of the neuron membrane that can be understood only when the dynamic ion concentrations, oxygen homeostasis, and cell swelling in response to osmotic pressure are taken into consideration. Our results demonstrate the feasibility of a unified framework for a wide range of neuronal behaviors that may be of substantial importance in the understanding of and potentially developing universal intervention strategies for these pathological states.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 87
    Publication Date: 2015-08-15
Description: by John R. Houser, Craig Barnhart, Daniel R. Boutz, Sean M. Carroll, Aurko Dasgupta, Joshua K. Michener, Brittany D. Needham, Ophelia Papoulas, Viswanadham Sridhara, Dariya K. Sydykova, Christopher J. Marx, M. Stephen Trent, Jeffrey E. Barrick, Edward M. Marcotte, Claus O. Wilke How do bacteria regulate their cellular physiology in response to starvation? Here, we present a detailed characterization of Escherichia coli growth and starvation over a time-course lasting two weeks. We have measured multiple cellular components, including RNA and proteins at deep genomic coverage, as well as lipid modifications and flux through central metabolism. Our study focuses on the physiological response of E. coli in stationary phase as a result of being starved for glucose, not on the genetic adaptation of E. coli to utilize alternative nutrients. In our analysis, we have taken advantage of the temporal correlations within and among RNA and protein abundances to identify systematic trends in gene regulation. Specifically, we have developed a general computational strategy for classifying expression-profile time courses into distinct categories in an unbiased manner. We have also developed, from dynamic models of gene expression, a framework to characterize protein degradation patterns based on the observed temporal relationships between mRNA and protein abundances. By comparing and contrasting our transcriptomic and proteomic data, we have identified several broad physiological trends in the E. coli starvation response. Strikingly, mRNAs are widely down-regulated in response to glucose starvation, presumably as a strategy for reducing new protein synthesis. By contrast, protein abundances display more varied responses. The abundances of many proteins involved in energy-intensive processes mirror the corresponding mRNA profiles while proteins involved in nutrient metabolism remain abundant even though their corresponding mRNAs are down-regulated.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 88
    Publication Date: 2015-08-15
    Description: by Shaun S. Sanders, Dale D. O. Martin, Stefanie L. Butland, Mathieu Lavallée-Adam, Diego Calzolari, Chris Kay, John R. Yates, Michael R. Hayden Palmitoylation involves the reversible posttranslational addition of palmitate to cysteines and promotes membrane binding and subcellular localization. Recent advancements in the detection and identification of palmitoylated proteins have led to multiple palmitoylation proteomics studies but these datasets are contained within large supplemental tables, making downstream analysis and data mining time-consuming and difficult. Consequently, we curated the data from 15 palmitoylation proteomics studies into one compendium containing 1,838 genes encoding palmitoylated proteins; representing approximately 10% of the genome. Enrichment analysis revealed highly significant enrichments for Gene Ontology biological processes, pathway maps, and process networks related to the nervous system. Strikingly, 41% of synaptic genes encode a palmitoylated protein in the compendium. The top disease associations included cancers and diseases and disorders of the nervous system, with Schizophrenia, HD, and pancreatic ductal carcinoma among the top five, suggesting that aberrant palmitoylation may play a pivotal role in the balance of cell death and survival. This compendium provides a much-needed resource for cell biologists and the palmitoylation field, providing new perspectives for cancer and neurodegeneration.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 89
    Publication Date: 2015-08-15
    Description: by Liat Rockah-Shmuel, Ágnes Tóth-Petróczy, Dan S. Tawfik Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII’s natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 90
    Publication Date: 2015-08-13
Description: by Shuai Yuan, H. Richard Johnston, Guosheng Zhang, Yun Li, Yi-Juan Hu, Zhaohui S. Qin With the rapid decline of sequencing cost, researchers today rush to embrace the whole genome sequencing (WGS) or whole exome sequencing (WES) approach as the next powerful tool for relating genetic variants to human diseases and phenotypes. A fundamental step in analyzing WGS and WES data is mapping short sequencing reads back to the reference genome. This is an important issue because incorrectly mapped reads affect the downstream variant discovery, genotype calling and association analysis. Although many read mapping algorithms have been developed, the majority of them use the universal reference genome and do not take sequence variants into consideration. Given that genetic variants are ubiquitous, it is highly desirable if they can be factored into the read mapping procedure. In this work, we developed a novel strategy that utilizes genotypes obtained a priori to customize the universal haploid reference genome into a personalized diploid reference genome. The new strategy is implemented in a program named RefEditor. When applying RefEditor to real data, we achieved encouraging improvements in read mapping, variant discovery and genotype calling. Compared to standard approaches, RefEditor can significantly increase genotype calling consistency (from 43% to 61% at 4X coverage; from 82% to 92% at 20X coverage) and reduce Mendelian inconsistency across various sequencing depths. Because many WGS and WES studies are conducted on cohorts that have been genotyped using array-based genotyping platforms previously or concurrently, we believe the proposed strategy will be of high value in practice; it can also be applied to the scenario where multiple NGS experiments are conducted on the same cohort. The RefEditor sources are available at https://github.com/superyuan/refeditor.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 91
    Publication Date: 2015-08-15
    Description: by Alexey A. Gritsenko, Marc Hulsman, Marcel J. T. Reinders, Dick de Ridder Translation of RNA to protein is a core process for any living organism. While for some steps of this process the effect on protein production is understood, a holistic understanding of translation still remains elusive. In silico modelling is a promising approach for elucidating the process of protein synthesis. Although a number of computational models of the process have been proposed, their application is limited by the assumptions they make. Ribosome profiling (RP), a relatively new sequencing-based technique capable of recording snapshots of the locations of actively translating ribosomes, is a promising source of information for deriving unbiased data-driven translation models. However, quantitative analysis of RP data is challenging due to high measurement variance and the inability to discriminate between the number of ribosomes measured on a gene and their speed of translation. We propose a solution in the form of a novel multi-scale interpretation of RP data that allows for deriving models with translation dynamics extracted from the snapshots. We demonstrate the usefulness of this approach by simultaneously determining for the first time per-codon translation elongation and per-gene translation initiation rates of Saccharomyces cerevisiae from RP data for two versions of the Totally Asymmetric Exclusion Process (TASEP) model of translation. We do this in an unbiased fashion, by fitting the models using only RP data with a novel optimization scheme based on Monte Carlo simulation to keep the problem tractable. The fitted models match the data significantly better than existing models and their predictions show better agreement with several independent protein abundance datasets than existing models. Results additionally indicate that the tRNA pool adaptation hypothesis is incomplete, with evidence suggesting that tRNA post-transcriptional modifications and codon context may play a role in determining codon elongation rates.
    Print ISSN: 1553-734X
    Electronic ISSN: 1553-7358
    Topics: Biology , Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
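The TASEP model of translation fitted in the preceding entry can be simulated with a short kinetic Monte Carlo loop: ribosomes initiate at the first codon, hop to the next codon at a codon-specific elongation rate when it is free, and terminate at the last codon. The point-sized particles (no ribosome footprint), the random rates and the initiation/termination rates below are assumptions for illustration, not the fitted parameters.

```python
import numpy as np

def simulate_tasep(rates, alpha, beta, t_max, rng):
    """Kinetic Monte Carlo simulation of TASEP on a 1D lattice (one site per
    codon). `rates[i]` is the hop rate out of site i, `alpha` the initiation
    rate at site 0, `beta` the termination rate at the last site."""
    L = len(rates)
    occ = np.zeros(L, dtype=bool)
    t, completed = 0.0, 0
    while t < t_max:
        events, props = [], []
        if not occ[0]:
            events.append(("init", 0))
            props.append(alpha)
        for i in np.flatnonzero(occ[:-1]):
            if not occ[i + 1]:
                events.append(("hop", int(i)))
                props.append(float(rates[i]))
        if occ[-1]:
            events.append(("term", L - 1))
            props.append(beta)
        total = float(sum(props))
        t += rng.exponential(1.0 / total)
        kind, i = events[rng.choice(len(events), p=np.array(props) / total)]
        if kind == "init":
            occ[0] = True
        elif kind == "hop":
            occ[i], occ[i + 1] = False, True
        else:
            occ[-1] = False
            completed += 1
    return completed / t_max, occ.mean()   # protein production rate, ribosome density

rng = np.random.default_rng(0)
rates = rng.uniform(5.0, 15.0, size=100)   # codon-specific elongation rates (1/s)
rate, density = simulate_tasep(rates, alpha=1.0, beta=10.0, t_max=200.0, rng=rng)
print(round(rate, 3), round(density, 3))
```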
  • 92
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
Description: The $k$-nearest neighbor ($k$NN) search on road networks is an important function in web mapping services. These services are now dealing with rapidly arriving queries issued by a massive number of users. While overlay graph-based indices can answer shortest path queries efficiently, there have been no studies on utilizing such indices to answer $k$NN queries efficiently. In this paper, we fill this research gap and present two efficient $k$NN search solutions on overlay graph-based indices. Experimental results show that our solutions offer very low query latency (0.1 ms) and require only small index sizes, even for 10-million-node networks.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 93
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: Measuring semantic similarity between two terms is essential for a variety of text analytics and understanding applications. Currently, there are two main approaches for this task, namely the knowledge based and the corpus based approaches. However, existing approaches are more suitable for semantic similarity between words rather than the more general multi-word expressions (MWEs), and they do not scale very well. Contrary to these existing techniques, we propose an efficient and effective approach for semantic similarity using a large scale semantic network. This semantic network is automatically acquired from billions of web documents. It consists of millions of concepts, which explicitly model the context of semantic relationships. In this paper, we first show how to map two terms into the concept space, and compare their similarity there. Then, we introduce a clustering approach to orthogonalize the concept space in order to improve the accuracy of the similarity measure. Finally, we conduct extensive studies to demonstrate that our approach can accurately compute the semantic similarity between terms of MWEs and with ambiguity, and significantly outperforms 12 competing methods under Pearson Correlation Coefficient. Meanwhile, our approach is much more efficient than all competing algorithms, and can be used to compute semantic similarity in a large scale.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
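The first step described in the preceding entry, mapping terms into a concept space and comparing them there, reduces to a weighted cosine similarity once each term is represented as a sparse vector of concept weights. The toy concept vectors below are invented for illustration; the actual semantic network in the paper is mined from billions of web documents.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dict: concept -> weight)."""
    dot = sum(u[c] * v[c] for c in set(u) & set(v))
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# toy concept vectors for two multi-word expressions and one unrelated term
apple_pie = {"dessert": 0.8, "baked food": 0.6, "fruit dish": 0.3}
cherry_tart = {"dessert": 0.7, "baked food": 0.5, "pastry": 0.4}
microsoft = {"software company": 0.9, "technology": 0.6}

print(cosine(apple_pie, cherry_tart))   # high: shared dessert / baked-food concepts
print(cosine(apple_pie, microsoft))     # 0.0: no shared concepts
```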
  • 94
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: Given a spatio-temporal network, a source, a destination, and a desired departure time interval, the All-departure-time Lagrangian Shortest Paths (ALSP) problem determines a set which includes the shortest path for every departure time in the given interval. ALSP is important for critical societal applications such as eco-routing. However, ALSP is computationally challenging due to the non-stationary ranking of the candidate paths across distinct departure-times. Current related work for reducing the redundant work, across consecutive departure-times sharing a common solution, exploits only partial information e.g., the earliest feasible arrival time of a path. In contrast, our approach uses all available information, e.g., the entire time series of arrival times for all departure-times. This allows elimination of all knowable redundant computation based on complete information available at hand. We operationalize this idea through the concept of critical-time-points (CTP), i.e., departure-times before which ranking among candidate paths cannot change. In our preliminary work, we proposed a CTP based forward search strategy. In this paper, we propose a CTP based temporal bi-directional search for the ALSP problem via a novel impromptu rendezvous termination condition. Theoretical and experimental analysis show that the proposed approach outperforms the related work approaches particularly when there are few critical-time-points.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
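    The following toy sketch illustrates only the critical-time-point notion: given the full arrival-time series of each candidate path, it reports the departure times at which the best-ranked path changes. The data and function name are assumptions; the paper's bi-directional search is not reproduced here.

        def critical_time_points(arrival_series):
            # arrival_series[path][t] = arrival time when departing at time t.
            # Return (departure time, best path) pairs where the top-ranked path changes.
            horizon = len(next(iter(arrival_series.values())))
            best_prev, ctps = None, []
            for t in range(horizon):
                best = min(arrival_series, key=lambda p: arrival_series[p][t])
                if best != best_prev:
                    ctps.append((t, best))
                    best_prev = best
            return ctps

        # Two candidate paths over five departure times.
        series = {"P1": [10, 11, 14, 16, 18], "P2": [12, 12, 13, 14, 15]}
        print(critical_time_points(series))  # -> [(0, 'P1'), (2, 'P2')]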
  • 95
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: Computing connected components is a core operation on graph data. Since billion-scale graphs cannot be resident in the memory of a single server, several approaches based on distributed machines have recently been proposed. The representative methods are Hash-To-Min and PowerGraph. Hash-To-Min is the state-of-the-art disk-based distributed method, which minimizes the number of MapReduce rounds. PowerGraph is the state-of-the-art in-memory distributed system, which is typically faster than the disk-based distributed one but requires many machines to handle billion-scale graphs. In this paper, we propose an I/O-efficient parallel algorithm for billion-scale graphs on a single PC. We first propose the Disk-based Sequential access-oriented Parallel processing (DSP) model, which exploits sequential disk access in terms of disk I/Os and parallel processing in terms of computation. We then propose an ultra-fast disk-based parallel algorithm for computing connected components, DSP-CC, which largely improves performance through sequential disk scans and page-level cache-conscious parallel processing. Extensive experimental results show that DSP-CC 1) computes connected components in billion-scale graphs within a limited memory size, whereas in-memory algorithms can only support medium-sized graphs with the same memory size, and 2) significantly outperforms all distributed competitors as well as a representative disk-based parallel method. (An in-memory baseline sketch follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
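    For contrast with the disk-based setting targeted above, here is a standard in-memory union-find sketch for connected components; it only illustrates the problem itself, not the DSP model or DSP-CC.

        def connected_components(num_nodes, edges):
            # Union-find with path halving; returns a component label per node.
            parent = list(range(num_nodes))

            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]  # path halving
                    x = parent[x]
                return x

            for u, v in edges:
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv  # merge the two components

            return [find(x) for x in range(num_nodes)]

        # Six nodes: components {0, 1, 2}, {3, 4}, and the isolated node 5.
        print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))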
  • 96
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: Answering why-not questions in databases promises wide application in many areas and has therefore attracted recent attention in the database research community. This paper addresses the problem of answering these so-called why-not questions in similar graph matching for graph databases. Given a set of answer graphs of an initial query graph $q$ and a set of missing (why-not) graphs, we aim to modify $q$ into a new query graph $q^*$ such that the missing graphs are included in the new answer set of $q^*$. We present an approximate solution to this problem, as the optimal solution is NP-hard to compute. In our approach, we first compute the bounded search space and the distance to be minimized for $q^*$. Then, we present a two-phase algorithm to find the new query $q^*$. In the first phase, we generate a set of candidate edges to be added to or deleted from the initial query $q$ within the bounded search space, and in the second phase, we select a subset of the candidate edges generated in the first phase to minimize the distance for $q^*$. We also demonstrate the effectiveness and efficiency of our approach by conducting extensive experiments on two real datasets. (A toy sketch of the selection phase follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
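    The second phase described above selects a subset of candidate edits; the toy sketch below uses a greedy, set-cover-style heuristic over hypothetical candidates purely for illustration, and is not the paper's distance-minimizing selection algorithm.

        def select_edits(candidates, missing):
            # Greedily pick candidate edits until every missing graph is covered.
            # Each candidate: {"edge": ..., "cost": float, "covers": set of missing graphs}.
            missing = set(missing)
            chosen = []
            while missing:
                edit = min(
                    (c for c in candidates if c["covers"] & missing),
                    key=lambda c: c["cost"] / len(c["covers"] & missing),
                    default=None,
                )
                if edit is None:
                    break  # remaining missing graphs cannot be covered
                chosen.append(edit["edge"])
                missing -= edit["covers"]
            return chosen

        candidates = [
            {"edge": ("add", "a-b"), "cost": 1.0, "covers": {"g1", "g2"}},
            {"edge": ("del", "c-d"), "cost": 2.0, "covers": {"g2", "g3"}},
            {"edge": ("add", "a-c"), "cost": 1.0, "covers": {"g3"}},
        ]
        print(select_edits(candidates, {"g1", "g2", "g3"}))  # -> [('add', 'a-b'), ('add', 'a-c')]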
  • 97
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: How has the interdisciplinary field of data mining been practiced in Network and Systems Management (NSM)? In science and technology, data mining is widely used in areas such as bioinformatics, genetics, the Web, and, more recently, astroinformatics. However, its application in NSM has been limited and modest. In this article, we provide an account of how data mining has been applied to managing networks and systems over the past four decades, presumably since its birth. We look into the field's applications in the key NSM activities: discovery, monitoring, analysis, reporting, and domain knowledge acquisition. In the end, we discuss our perspective on the issues that we consider critical for the effective application of data mining in modern systems, which are characterized by heterogeneity and high dynamism.
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 98
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: With the rapid development of location-aware mobile devices, ubiquitous Internet access, and social computing technologies, a great deal of users' personal information, such as location data and social data, has become readily accessible from various mobile platforms and online social networks. The convergence of these two types of data, known as geo-social data, has enabled collaborative spatial computing that explicitly combines location and social factors to answer useful geo-social queries for either business or social good. In this paper, we study a new type of Geo-Social K-Cover Group (GSKCG) query that, given a set of query points and a social network, retrieves a minimum user group in which each user is socially related to at least $k$ other users and the users' associated regions (e.g., familiar regions or service regions) jointly cover all the query points. Despite its practical usefulness, the GSKCG query problem is NP-complete. We consequently explore a set of effective pruning strategies to derive an efficient algorithm for finding the optimal solution. Moreover, we design a novel index structure tailored to our problem to further accelerate query processing. Extensive experiments demonstrate that our algorithm achieves desirable performance on real-life datasets. (A small feasibility-check sketch follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
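    Since finding the minimum group is NP-complete, the sketch below only checks feasibility of a candidate answer: every member needs at least $k$ friends inside the group, and the members' regions (modeled here simply as sets of covered query points) must jointly cover all query points. The names and data are illustrative assumptions, not the paper's pruning or indexing machinery.

        def is_valid_gskcg(group, friends, regions, query_points, k):
            # Feasibility check for a GSKCG answer; does not search for the minimum group.
            group = set(group)
            socially_ok = all(len(friends[u] & group) >= k for u in group)
            covered = set().union(*(regions[u] for u in group)) if group else set()
            return socially_ok and set(query_points) <= covered

        friends = {"u1": {"u2", "u3"}, "u2": {"u1", "u3"}, "u3": {"u1", "u2"}}
        regions = {"u1": {"p1"}, "u2": {"p2"}, "u3": {"p1", "p3"}}
        print(is_valid_gskcg({"u1", "u2", "u3"}, friends, regions, {"p1", "p2", "p3"}, k=2))  # -> True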
  • 99
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: High utility sequential pattern mining has been considered an important research problem, and a number of relevant algorithms have been proposed for it. The main challenge of high utility sequential pattern mining is that the search space is large, and the efficiency of a solution is directly affected by the degree to which it can eliminate candidate patterns. Therefore, the efficiency of any high utility sequential pattern mining solution depends on its ability to reduce this large search space and, as a result, lower the computational cost of calculating the utilities of the candidate patterns. In this paper, we propose efficient data structures and a pruning technique based on a Cumulated Rest of Match (CRoM) upper bound. By defining a tighter upper bound on the utility of the candidates, CRoM allows more conservative pruning before candidate pattern generation than existing techniques. In addition, we have developed an efficient algorithm, High Utility Sequential Pattern Extraction (HuspExt), which calculates the utilities of the child patterns based on those of their parents. Substantial experiments on both synthetic and real datasets from different domains show that the proposed solution efficiently discovers high utility sequential patterns from large-scale datasets with different data characteristics, under low utility thresholds. (A generic upper-bound pruning sketch follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
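    The sketch below shows the generic role an upper bound plays in this kind of mining: a candidate is extended only if its optimistic bound can still reach the utility threshold. The toy utility model, bound, and item set are assumptions; they merely stand in for, and are much looser than, the CRoM bound and sequence utilities used by HuspExt.

        ITEM_UTILITY = {"a": 3, "b": 1, "c": 4}
        MAX_LEN = 3  # toy cap on pattern length

        def utility(pattern):
            # Toy utility: sum of the item utilities in the pattern.
            return sum(ITEM_UTILITY[i] for i in pattern)

        def upper_bound(pattern):
            # Optimistic bound: current utility plus the best items that could still be appended.
            return utility(pattern) + (MAX_LEN - len(pattern)) * max(ITEM_UTILITY.values())

        def extend(pattern):
            return [pattern + (i,) for i in ITEM_UTILITY] if len(pattern) < MAX_LEN else []

        def mine(min_utility):
            # Depth-first enumeration with upper-bound pruning of whole subtrees.
            results, stack = [], [()]
            while stack:
                pattern = stack.pop()
                if upper_bound(pattern) < min_utility:
                    continue  # the entire extension subtree is pruned
                if pattern and utility(pattern) >= min_utility:
                    results.append(pattern)
                stack.extend(extend(pattern))
            return results

        print(mine(min_utility=10))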
  • 100
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-11
    Description: Ordinal classification with a monotonicity constraint is a kind of classification task in which objects with better attribute values should not be assigned to a worse decision class. Several learning algorithms have been proposed to handle this kind of task in recent years. The rank entropy-based monotonic decision tree is very representative thanks to its better robustness and generalization. Ensemble learning is an effective strategy for significantly improving the generalization ability of machine learning systems. The objective of this work is to develop a method for fusing monotonic decision trees. To achieve this goal, we take two factors into account: attribute reduction and the fusing principle. By introducing variable dominance rough sets, we first propose a rank-preserving attribute reduction approach for learning base classifiers, which can effectively avoid overfitting and improve classification performance. Then, we establish a fusing principle based on maximal probability for combining the base classifiers, which is used to further improve the generalization ability of the learning system. The experimental analysis shows that the proposed fusing method can significantly improve the classification performance of a learning system constructed from monotonic decision trees. (A minimal fusion sketch follows this entry.)
    Print ISSN: 1041-4347
    Electronic ISSN: 1558-2191
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
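    As a rough illustration of a maximal-probability fusing step, the sketch below averages the class-probability vectors produced by the base trees and returns the class with the highest combined probability. The probability values and class labels are made up; the paper's exact fusing principle and the monotonic trees themselves are not implemented here.

        from collections import defaultdict

        def fuse_max_probability(tree_outputs):
            # tree_outputs: one {class label -> probability} dict per base tree.
            combined = defaultdict(float)
            for probs in tree_outputs:
                for label, p in probs.items():
                    combined[label] += p / len(tree_outputs)
            return max(combined, key=combined.get)

        # Three base trees scoring one object over three ordered classes.
        outputs = [
            {"poor": 0.1, "fair": 0.6, "good": 0.3},
            {"poor": 0.2, "fair": 0.5, "good": 0.3},
            {"poor": 0.1, "fair": 0.3, "good": 0.6},
        ]
        print(fuse_max_probability(outputs))  # -> 'fair'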