ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Articles  (35)
  • classification  (33)
  • Triticum aestivum
  • fish
  • taxonomy
  • Springer  (35)
  • 1995-1999  (35)
  • Computer Science  (35)
  • 1
    Electronic Resource
    Springer
    Queueing systems 27 (1997), S. 227-250 
    ISSN: 1572-9443
    Keywords: polling systems ; heavy traffic ; expected delay ; exhaustiveness ; monotonicity ; service disciplines ; classification
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We study the expected delay in cyclic polling models with general ‘branching-type’ service disciplines. For this class of models, which contains models with exhaustive and gated service as special cases, we obtain closed-form expressions for the expected delay under standard heavy-traffic scalings. We identify a single parameter associated with the service discipline at each queue, which we call the ‘exhaustiveness’. We show that the scaled expected delay figures depend on the service policies at the queues only through the exhaustiveness of each of the service disciplines. This implies that the influence of different service disciplines, but with the same exhaustiveness, on the expected delays at the queues becomes the same when the system reaches saturation. This observation leads to a new classification of the service disciplines. In addition, we show monotonicity of the scaled expected delays with respect to the exhaustiveness of the service disciplines. This induces a complete ordering in terms of efficiency of the service disciplines. The results also lead to new rules for optimization of the system performance with respect to the service disciplines at the queues. Further, the exact asymptotic results suggest simple expected waiting-time approximations for polling models in heavy traffic. Numerical experiments show that the accuracy of the approximations is excellent for practical heavy-traffic scenarios.
    Type of Medium: Electronic Resource
  • 2
    Electronic Resource
    Springer
    Grammars 1 (1998), S. 103-153 
    ISSN: 1572-848X
    Keywords: classification ; constraint grammars ; knowledge management
    Source: Springer Online Journal Archives 1860-2000
    Topics: Linguistics and Literary Studies , Computer Science
    Notes: Abstract Classifying linguistic objects is a widespread and important linguistic task, but hand deducing a classificatory system from a general linguistic theory can consume much effort and introduce pernicious errors. We present an abstract prototype device that effectively deduces an accurate classificatory system from a finite linguistic theory.
    Type of Medium: Electronic Resource
  • 3
    Electronic Resource
    Springer
    Computers and the humanities 29 (1995), S. 449-461 
    ISSN: 1572-8412
    Keywords: neural networks ; stylometric analysis ; Shakespeare ; Fletcher ; discrimination ; classification
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Media Resources and Communication Sciences, Journalism
    Notes: Abstract In this paper we show, for the first time, how Radial Basis Function (RBF) network techniques can be used to explore questions surrounding authorship of historic documents. The paper illustrates the technical and practical aspects of RBF's, using data extracted from works written in the early 17th century by William Shakespeare and his contemporary John Fletcher. We also present benchmark comparisons with other standard techniques for contrast and comparison.
    Type of Medium: Electronic Resource
  • 4
    Electronic Resource
    Springer
    Minds and machines 5 (1995), S. 69-87 
    ISSN: 1572-8641
    Keywords: Meaning ; reference ; disjunction problem ; situation theory ; synonymy ; classification ; causal theory of reference ; co-ordination
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Philosophy
    Notes: Abstract A basic theme of Winograd and Flores (1986) is that the principal function of language is to co-ordinate social activity. It is, they claim, from this function that meaning itself arises. They criticise approaches that try to understand meaning through the mechanisms of reference, the Rationalist Tradition as they call it. To seek to ground meaning in social practice is not new, but the approach is presently attractive because of difficulties encountered with the notion of reference. Without taking a view on whether these are insuperable, the present paper accepts Winograd and Flores' challenge and attempts to lay aside reference and to base a conception of meaning directly in terms of co-ordination and consensus within a linguistic community.
    Type of Medium: Electronic Resource
  • 5
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 21 (1998), S. 117-129 
    ISSN: 1573-0409
    Keywords: modular neural networks ; classification ; cooperative decision making ; performance comparison
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract There is a wide variety of Modular Neural Network (MNN) classifiers in the literature. They differ according to the design of their architecture, task-decomposition scheme, learning procedure, and multi-module decision-making strategy. Meanwhile, there is a lack of comparative studies in the MNN literature. This paper compares ten MNN classifiers which give a good representation of design varieties, viz., Decoupled; Other-output; ART-BP; Hierarchical; Multiple-experts; Ensemble (majority vote); Ensemble (average vote); Merge-glue; Hierarchical Competitive Neural Net; and Cooperative Modular Neural Net. Two benchmark applications of different degree and nature of complexity are used for performance comparison, and the strength-points and drawbacks of the different networks are outlined. The aim is to help a potential user to choose an appropriate model according to the application in hand and the available computational resources.
    Type of Medium: Electronic Resource
  • 6
    Electronic Resource
    Springer
    Machine learning 30 (1998), S. 195-215 
    ISSN: 0885-6125
    Keywords: Inductive learning ; classification ; radar images ; methodology
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract During a project examining the use of machine learning techniques for oil spill detection, we encountered several essential questions that we believe deserve the attention of the research community. We use our particular case study to illustrate such issues as problem formulation, selection of evaluation measures, and data preparation. We relate these issues to properties of the oil spill application, such as its imbalanced class distribution, that are shown to be common to many applications. Our solutions to these issues are implemented in the Canadian Environmental Hazards Detection System (CEHDS), which is about to undergo field testing.
    Type of Medium: Electronic Resource
  • 7
    Electronic Resource
    Springer
    Machine learning 29 (1997), S. 131-163 
    ISSN: 0885-6125
    Keywords: Bayesian networks ; classification
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
    Type of Medium: Electronic Resource
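The Tree Augmented Naive Bayes (TAN) result above builds on the plain naive Bayes classifier. As a point of reference, a minimal sketch of that baseline (categorical features, Laplace smoothing) might look like the following; it is not the TAN construction itself, and the data at the bottom is a made-up toy example.

```python
# Minimal naive Bayes baseline (illustrative only; not the paper's TAN method).
from collections import Counter
import math

def train_naive_bayes(X, y, alpha=1.0):
    """X: list of tuples of discrete feature values, y: list of class labels."""
    classes = Counter(y)
    n_features = len(X[0])
    # counts[c][j][v] = number of training rows of class c with feature j equal to v
    counts = {c: [Counter() for _ in range(n_features)] for c in classes}
    values = [set() for _ in range(n_features)]
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            counts[c][j][v] += 1
            values[j].add(v)
    return classes, counts, values, alpha, len(y)

def predict(model, row):
    classes, counts, values, alpha, n = model
    best, best_score = None, -math.inf
    for c, nc in classes.items():
        score = math.log(nc / n)                      # log prior
        for j, v in enumerate(row):                   # log likelihood per feature
            num = counts[c][j][v] + alpha
            den = nc + alpha * len(values[j])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best

X = [("sunny", "hot"), ("rainy", "mild"), ("sunny", "mild"), ("rainy", "hot")]
y = ["no", "yes", "yes", "no"]
model = train_naive_bayes(X, y)
print(predict(model, ("sunny", "mild")))
```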
  • 8
    Electronic Resource
    Springer
    Machine learning 36 (1999), S. 105-139 
    ISSN: 0885-6125
    Keywords: classification ; boosting ; Bagging ; decision trees ; Naive-Bayes ; mean-squared error
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Methods for voting classification algorithms, such as Bagging and AdaBoost, have been shown to be very successful in improving the accuracy of certain classifiers for artificial and real-world datasets. We review these algorithms and describe a large empirical study comparing several variants in conjunction with a decision tree inducer (three variants) and a Naive-Bayes inducer. The purpose of the study is to improve our understanding of why and when these algorithms, which use perturbation, reweighting, and combination techniques, affect classification error. We provide a bias and variance decomposition of the error to show how different methods and variants influence these two terms. This allowed us to determine that Bagging reduced variance of unstable methods, while boosting methods (AdaBoost and Arc-x4) reduced both the bias and variance of unstable methods but increased the variance for Naive-Bayes, which was very stable. We observed that Arc-x4 behaves differently than AdaBoost if reweighting is used instead of resampling, indicating a fundamental difference. Voting variants, some of which are introduced in this paper, include: pruning versus no pruning, use of probabilistic estimates, weight perturbations (Wagging), and backfitting of data. We found that Bagging improves when probabilistic estimates in conjunction with no-pruning are used, as well as when the data was backfit. We measure tree sizes and show an interesting positive correlation between the increase in the average tree size in AdaBoost trials and its success in reducing the error. We compare the mean-squared error of voting methods to non-voting methods and show that the voting methods lead to large and significant reductions in the mean-squared errors. Practical problems that arise in implementing boosting algorithms are explored, including numerical instabilities and underflows. We use scatterplots that graphically show how AdaBoost reweights instances, emphasizing not only “hard” areas but also outliers and noise.
    Type of Medium: Electronic Resource
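As an illustration of the voting methods compared above, a minimal bagging sketch follows: bootstrap replicates of the training set, one base learner per replicate, and a majority vote at prediction time. The decision stump used here is a stand-in of our own choosing, not the paper's decision tree or Naive-Bayes inducers.

```python
# Minimal sketch of bagging with majority voting (illustrative base learner).
import random
from collections import Counter

def fit_stump(X, y):
    """One-feature threshold classifier chosen by training accuracy (toy base learner)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [1 if sign * (row[j] - t) > 0 else 0 for row in X]
                acc = sum(p == yy for p, yy in zip(pred, y))
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0

def bagging(X, y, n_models=25, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]   # bootstrap sample
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: Counter(m(row) for m in models).most_common(1)[0][0]

X = [[0.1, 1.2], [0.4, 0.9], [0.8, 0.3], [0.9, 0.1], [0.2, 1.0], [0.7, 0.2]]
y = [0, 0, 1, 1, 0, 1]
vote = bagging(X, y)
print([vote(row) for row in X])
```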
  • 9
    Electronic Resource
    Springer
    Machine learning 36 (1999), S. 33-58 
    ISSN: 0885-6125
    Keywords: classification ; correspondence analysis ; multiple models ; combining estimates
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Several effective methods have been developed recently for improving predictive performance by generating and combining multiple learned models. The general approach is to create a set of learned models either by applying an algorithm repeatedly to different versions of the training data, or by applying different learning algorithms to the same data. The predictions of the models are then combined according to a voting scheme. This paper focuses on the task of combining the predictions of a set of learned models. The method described uses the strategies of stacking and Correspondence Analysis to model the relationship between the learning examples and their classification by a collection of learned models. A nearest neighbor method is then applied within the resulting representation to classify previously unseen examples. The new algorithm does not perform worse than, and frequently performs significantly better than other combining techniques on a suite of data sets.
    Type of Medium: Electronic Resource
  • 10
    Electronic Resource
    Springer
    Applied intelligence 11 (1999), S. 15-30 
    ISSN: 1573-7497
    Keywords: neural networks ; structured objects ; machine learning ; classification ; similarity ; nearest neighbor
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Labeled graphs are an appropriate and popular representation of structured objects in many domains. If the labels describe the properties of real world objects and their relations, finding the best match between two graphs turns out to be the weakly defined, NP-complete task of establishing a mapping between them that maps similar parts onto each other preserving as much as possible of their overall structural correspondence. In this paper, former approaches of structural matching and constraint relaxation by spreading activation in neural networks and the method of solving optimization tasks using Hopfield-style nets are combined. The approximate matching task is reformulated as the minimization of a quadratic energy function. The design of the approach enables the user to change the parameters and the dynamics of the net so that knowledge about matching preferences is included easily and transparently. In the last section, some examples demonstrate the successful application of this approach in classification and learning in the domain of organic chemistry.
    Type of Medium: Electronic Resource
  • 11
    Electronic Resource
    Springer
    Applied intelligence 11 (1999), S. 277-284 
    ISSN: 1573-7497
    Keywords: genetic algorithms ; classification ; data mining
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A common approach to evaluating competing models in a classification context is via accuracy on a test set or on cross-validation sets. However, this can be computationally costly when using genetic algorithms with large datasets, and the benefits of performing a wide search are compromised by the fact that estimates of the generalization abilities of competing models are subject to noise. This paper shows that clear advantages can be gained by using samples of the test set when evaluating competing models, and that applying statistical tests in combination with Occam's razor produces parsimonious models, matches the level of evaluation to the state of the search and retains the speed advantages of test set sampling.
    Type of Medium: Electronic Resource
  • 12
    Electronic Resource
    Springer
    Neural processing letters 9 (1999), S. 293-300 
    ISSN: 1573-773X
    Keywords: classification ; support vector machines ; linear least squares ; radial basis function kernel
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this letter we discuss a least squares version for support vector machine (SVM) classifiers. Due to equality type constraints in the formulation, the solution follows from solving a set of linear equations, instead of quadratic programming for classical SVM's. The approach is illustrated on a two-spiral benchmark classification problem.
    Type of Medium: Electronic Resource
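The key point of the abstract above, that equality constraints turn SVM training into a single linear system, can be sketched as follows. This follows the usual least-squares SVM formulation with an RBF kernel; the gamma and sigma values and the toy XOR-style data are illustrative choices, not the paper's two-spiral benchmark.

```python
# Minimal least-squares SVM sketch: one linear system instead of a QP (illustrative).
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    Omega = np.outer(y, y) * K
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)            # linear system, no quadratic programming
    b, alpha = sol[0], sol[1:]
    return lambda Xnew: np.sign(rbf_kernel(Xnew, X, sigma) @ (alpha * y) + b)

# Toy XOR-style check, labels in {-1, +1}
X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])
y = np.array([-1., -1., 1., 1.])
predict = lssvm_train(X, y)
print(predict(X))        # expected: [-1. -1.  1.  1.]
```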
  • 13
    Electronic Resource
    Springer
    Journal of intelligent information systems 11 (1998), S. 99-138 
    ISSN: 1573-7675
    Keywords: anomaly detection ; Bayesian methods ; classification ; computational complexity ; knowledge discovery
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We consider the problem of prioritizing a collection of discrete pieces of information, or transactions. The goal is to rank the transactions in such a way that the user can best pursue a subset of the transactions in hopes of discovering those which were generated by an interesting source. The problem is shown to differ from traditional classification in several fundamental ways. Ranking algorithms are divided into classes, depending on the amount of information they may utilize. We demonstrate that while ranking by the least constrained algorithm class is consistent with classification, such is not the case for a more constrained class of algorithms. We demonstrate also that while optimal ranking by the former class is “easy”, optimal ranking by the latter class is NP-hard. Finally, we present detectors which solve optimally restricted versions of the ranking problem, including symmetric anomaly detection.
    Type of Medium: Electronic Resource
  • 14
    Electronic Resource
    Springer
    Journal of intelligent information systems 10 (1998), S. 31-48 
    ISSN: 1573-7675
    Keywords: Semantic information preserving reduction ; relational databases ; selection ; projection ; classification ; reduced information systems
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Databases store large amounts of information about consumer transactions and other kinds of transactions. This information can be used to deduce rules about consumer behavior, and the rules can in turn be used to determine company policies, for instance with regards to production, marketing and in several other areas. Since databases typically store millions of records, and each record could have up to 100 or more attributes, as an initial step it is necessary to reduce the size of the database by eliminating attributes that do not influence the decision at all or do so very minimally. In this paper we present techniques that can be employed effectively for exact and approximate reduction in a database system. These techniques can be implemented efficiently in a database system using SQL (structured query language) commands. We tested their performance on a real data set and validated them. The results showed that the classification performance actually improved with a reduced set of attributes as compared to the case when all the attributes were present. We also discuss how our techniques differ from statistical methods and other data reduction methods such as rough sets.
    Type of Medium: Electronic Resource
  • 15
    Electronic Resource
    Springer
    Neural processing letters 10 (1999), S. 57-72 
    ISSN: 1573-773X
    Keywords: Self Organizing Feature Map ; neural modeling ; local minima avoidance ; classification ; radial basis function networks
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper proposes an escape methodology for the local minima problem of self organizing feature maps, which arises in the overlapping regions that are equidistant from the corresponding winners. Two new versions of the Self Organizing Feature Map are derived, equipped with such a methodology. The first approach introduces an excitation term, which increases the convergence speed and efficiency of the algorithm while increasing the probability of escaping from local minima. In the second approach, we associate a learning set that specifies the attractive and repulsive fields of the output neurons. Results indicate that the accuracy percentiles of the new methods are higher than that of the original algorithm, while they retain the ability to escape from local minima.
    Type of Medium: Electronic Resource
  • 16
    Electronic Resource
    Springer
    Applied intelligence 6 (1996), S. 75-86 
    ISSN: 1573-7497
    Keywords: reasoning under uncertainty ; pattern recognition ; classification ; evidential reasoning ; handwritten digit recognition ; multiple classifiers
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper introduces a knowledge integration framework based on Dempster-Shafer's mathematical theory of evidence for integrating classification results derived from multiple classifiers. This framework enables us to understand in which situations the classifiers give uncertain responses, to interpret classification evidence, and allows the classifiers to compensate for their individual deficiencies. Under this framework, we developed algorithms to model classification evidence and to combine classification evidence from different classifiers, and we derived inference rules from evidential intervals for reasoning about classification results. The algorithms have been implemented and tested. Implementation issues, performance analysis and experimental results are presented.
    Type of Medium: Electronic Resource
  • 17
    ISSN: 1573-7497
    Keywords: neural networks ; counter-propagation ; singular value decomposition ; molecular sequence ; classification ; human genome project
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A modified counter-propagation (CP) algorithm with supervised learning vector quantizer (LVQ) and dynamic node allocation has been developed for rapid classification of molecular sequences. The molecular sequences were encoded into neural input vectors using an n–gram hashing method for word extraction and a singular value decomposition (SVD) method for vector compression. The neural networks used were three-layered, forward-only CP networks that performed nearest neighbor classification. Several factors affecting the CP performance were evaluated, including weight initialization, Kohonen layer dimensioning, winner selection and weight update mechanisms. The performance of the modified CP network was compared with the back-propagation (BP) neural network and the k–nearest neighbor method. The major advantages of the CP network are its training and classification speed and its capability to extract statistical properties of the input data. The combined BP and CP networks can classify nucleic acid or protein sequences with a close to 100% accuracy at a rate of about one order of magnitude faster than other currently available methods.
    Type of Medium: Electronic Resource
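The encoding step described above, n-gram word extraction followed by SVD compression, can be sketched roughly as below. This is an illustration of the general idea, not the authors' exact pipeline; the n-gram size, the alphabet, and the number of retained components are arbitrary choices.

```python
# Minimal sketch: n-gram counting of sequences, then truncated-SVD compression.
from itertools import product
import numpy as np

def ngram_counts(seq, n=2, alphabet="ACGT"):
    grams = ["".join(p) for p in product(alphabet, repeat=n)]
    index = {g: i for i, g in enumerate(grams)}
    v = np.zeros(len(grams))
    for i in range(len(seq) - n + 1):
        g = seq[i:i + n]
        if g in index:
            v[index[g]] += 1
    return v

def svd_compress(count_matrix, k=4):
    """Project n-gram count vectors onto the top-k right singular directions."""
    U, s, Vt = np.linalg.svd(count_matrix, full_matrices=False)
    return count_matrix @ Vt[:k].T           # (n_sequences, k) compressed vectors

seqs = ["ACGTACGT", "ACGTTTTT", "GGGGACGT", "TTTTACGA"]
M = np.vstack([ngram_counts(s) for s in seqs])
print(svd_compress(M, k=2).shape)            # (4, 2)
```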
  • 18
    Electronic Resource
    Springer
    Neural processing letters 10 (1999), S. 201-210 
    ISSN: 1573-773X
    Keywords: neural networks ; learning ; minimal distance methods ; similarity-based methods ; machine learning ; interpretation of neural functions ; classification
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Multilayer Perceptrons (MLPs) use scalar products to compute the weighted activation of neurons, providing decision borders using combinations of soft hyperplanes. The weighted fan-in activation function may be replaced by a distance function between the inputs and the weights, offering a natural generalization of the standard MLP model. Non-Euclidean distance functions may also be introduced by normalization of the input vectors into an extended feature space. Both approaches influence the shapes of decision borders dramatically. An illustrative example showing these changes is provided.
    Type of Medium: Electronic Resource
  • 19
    Electronic Resource
    Springer
    Neural processing letters 3 (1996), S. 151-162 
    ISSN: 1573-773X
    Keywords: classification ; convergence ; gradient algorithm ; RBF
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this paper, we present the architecture of an RBF neural classifier. We show that a global learning algorithm concentrating only on the centres, the Gaussian widths and the weights of the connections is inadequate for this architecture. Then, we propose to use a hybrid learning algorithm in which the Gaussian centres are first fixed; this approach gives satisfactory results.
    Type of Medium: Electronic Resource
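A rough sketch of the hybrid idea above, fixing the Gaussian centres first and then fitting only the output layer, reduces the second step to an ordinary linear least-squares problem. The centre selection below (a random subset of the training points) and the common width are simplifying assumptions of this sketch, not details taken from the paper.

```python
# Minimal RBF classifier with hybrid training: fixed centres, least-squares weights.
import numpy as np

def rbf_design(X, centres, width):
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

def train_rbf_classifier(X, y, n_centres=3, width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_centres, replace=False)]  # step 1: fix centres
    Phi = rbf_design(X, centres, width)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # step 2: output weights by least squares
    return lambda Xnew: np.sign(rbf_design(Xnew, centres, width) @ w)

X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [0.1, 0.9], [1.1, 0.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0, -1.0, 1.0])
predict = train_rbf_classifier(X, y)
print(predict(X))
```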
  • 20
    Electronic Resource
    Springer
    Neural processing letters 7 (1998), S. 101-106 
    ISSN: 1573-773X
    Keywords: classification ; hybrid network ; modelling ; multilayer perceptron
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this study we investigate a hybrid neural network architecture for modelling purposes. The proposed network is based on the multilayer perceptron (MLP) network. However, in addition to the usual hidden layers, the first hidden layer is selected to be a centroid layer. Each unit in this new layer incorporates a centroid that is located somewhere in the input space. The output of these units is the Euclidean distance between the centroid and the input. The centroid layer clearly resembles the hidden layer of radial basis function (RBF) networks. Therefore the centroid-based multilayer perceptron (CMLP) networks can be regarded as a hybrid of MLP and RBF networks. The presented benchmark experiments show that the proposed hybrid architecture is able to combine the good properties of MLP and RBF networks, resulting in fast and efficient learning and a compact network structure.
    Type of Medium: Electronic Resource
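The centroid layer described above can be illustrated with a bare forward pass: distances to centroids feed an otherwise ordinary MLP. Only the forward computation is shown; the centroids and weights below are random placeholders rather than a trained network.

```python
# Minimal forward pass of a centroid-based MLP (CMLP-style architecture, illustrative).
import numpy as np

def cmlp_forward(x, centroids, W1, b1, W2, b2):
    d = np.sqrt(((centroids - x) ** 2).sum(axis=1))   # centroid (distance) layer
    h = np.tanh(W1 @ d + b1)                          # ordinary hidden layer
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))       # sigmoid output

rng = np.random.default_rng(0)
centroids = rng.normal(size=(4, 2))                   # 4 centroids in a 2-d input space
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=3)
W2, b2 = rng.normal(size=(1, 3)), rng.normal(size=1)
print(cmlp_forward(np.array([0.5, -0.2]), centroids, W1, b1, W2, b2))
```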
  • 21
    Electronic Resource
    Springer
    Journal of systems integration 9 (1999), S. 167-185 
    ISSN: 1573-8787
    Keywords: flexibility ; complexity ; systems approach ; taxonomy
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this paper we present a taxonomy of manufacturing problems, labeled in a general sense as Design, Production, or Distribution problems. One or more basic systems concepts, such as complexity and adaptation, attach themselves to each such problem. By combining the hierarchical Design-Production-Distribution idea with systems concepts, we establish that there is, indeed, a significant systems component to most problems of modern manufacturing.
    Type of Medium: Electronic Resource
  • 22
    Electronic Resource
    Springer
    Journal of computational neuroscience 6 (1999), S. 121-144 
    ISSN: 1573-6873
    Keywords: Mauthner ; escape ; artificial neural networks ; connectionism ; acoustic ; localization ; auditory ; fish ; goldfish ; XNOR model ; phase model
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Medicine , Physics
    Notes: Abstract Artificial neural networks were used to explore the auditory function of the Mauthner system, the brainstem circuit in teleost fishes that initiates fast-start escape responses. The artificial neural networks were trained with backpropagation to assign connectivity and receptive fields in an architecture consistent with the known anatomy of the Mauthner system. Our first goal was to develop neurally specific hypotheses for how the Mauthner system discriminates right from left in the onset of a sound. Our model was consistent with the phase model for directional hearing underwater, the prevalent theory for sound source localization by fishes. Our second goal was to demonstrate how the neural mechanisms that permit sound localization according to the phase model can coexist with the mechanisms that permit the Mauthner system to discriminate between stimuli based on amplitude. Our results indicate possible computational roles for elements of the Mauthner system, which has provided us a theoretical context within which to consider past and future experiments on the cellular physiology. Thus, these findings demonstrate the potential significance of this approach in generating experimentally testable hypotheses for small systems of identified cells.
    Type of Medium: Electronic Resource
  • 23
    Electronic Resource
    Springer
    Computational optimization and applications 12 (1999), S. 53-79 
    ISSN: 1573-2894
    Keywords: support vector machines ; linear programming ; classification ; data mining ; machine learning.
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We examine the problem of how to discriminate between objects of three or more classes. Specifically, we investigate how two-class discrimination methods can be extended to the multiclass case. We show how the linear programming (LP) approaches based on the work of Mangasarian and quadratic programming (QP) approaches based on Vapnik's Support Vector Machine (SVM) can be combined to yield two new approaches to the multiclass problem. In LP multiclass discrimination, a single linear program is used to construct a piecewise-linear classification function. In our proposed multiclass SVM method, a single quadratic program is used to construct a piecewise-nonlinear classification function. Each piece of this function can take the form of a polynomial, a radial basis function, or even a neural network. For problems with k > 2 classes, the SVM method as originally proposed required the construction of a two-class SVM to separate each class from the remaining classes. Similarly, k two-class linear programs can be used for the multiclass problem. We performed an empirical study of the original LP method, the proposed k LP method, the proposed single QP method and the original k QP methods. We discuss the advantages and disadvantages of each approach.
    Type of Medium: Electronic Resource
  • 24
    Electronic Resource
    Springer
    Statistics and computing 8 (1998), S. 25-33 
    ISSN: 1573-1375
    Keywords: regression ; classification ; missing data ; mixtures of experts
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract In a regression or classification setting where we wish to predict Y from x_1, x_2, ..., x_p, we suppose that an additional set of ‘coaching' variables z_1, z_2, ..., z_m are available in our training sample. These might be variables that are difficult to measure, and they will not be available when we predict Y from x_1, x_2, ..., x_p in the future. We consider two methods of making use of the coaching variables in order to improve the prediction of Y from x_1, x_2, ..., x_p. The relative merits of these approaches are discussed and compared in a number of examples.
    Type of Medium: Electronic Resource
  • 25
    Electronic Resource
    Springer
    Statistics and computing 9 (1999), S. 111-121 
    ISSN: 1573-1375
    Keywords: classification ; conditional Gaussian model ; EM algorithm ; shrinkage ; underlying variable mixture model
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract For clustering mixed categorical and continuous data, Lawrence and Krzanowski (1996) proposed a finite mixture model in which component densities conform to the location model. In the graphical models literature the location model is known as the homogeneous Conditional Gaussian model. In this paper it is shown that their model is not identifiable without imposing additional restrictions. Specifically, for g groups and m locations, (g!)^(m-1) distinct sets of parameter values (not including permutations of the group mixing parameters) produce the same likelihood function. Excessive shrinkage of parameter estimates in a simulation experiment reported by Lawrence and Krzanowski (1996) is shown to be an artifact of the model's non-identifiability. Identifiable finite mixture models can be obtained by imposing restrictions on the conditional means of the continuous variables. These new identified models are assessed in simulation experiments. The conditional mean structure of the continuous variables in the restricted location mixture models is similar to that in the underlying variable mixture models proposed by Everitt (1988), but the restricted location mixture models are more computationally tractable.
    Type of Medium: Electronic Resource
  • 26
    Electronic Resource
    Springer
    Software quality journal 5 (1996), S. 255-272 
    ISSN: 1573-1367
    Keywords: classification ; criticality prediction ; data analysis ; fuzzy classification ; quality models ; software metrics
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Managing software development and maintenance projects requires predictions about components of the software system that are likely to have a high error rate or that need high development effort. The value of any classification is determined by the accuracy and cost of such predictions. The paper investigates the hypothesis whether fuzzy classification applied to criticality prediction provides better results than other classification techniques that have been introduced in this area. Five techniques for identifying error-prone software components are compared, namely Pareto classification, crisp classification trees, factor-based discriminant analysis, neural networks, and fuzzy classification. The comparison is illustrated with experimental results from the development of industrial real-time projects. A module quality model — with respect to changes — provides both quality of fit (according to past data) and predictive accuracy (according to ongoing projects). Fuzzy classification showed best results in terms of overall predictive accuracy.
    Type of Medium: Electronic Resource
  • 27
    Electronic Resource
    Springer
    Statistics and computing 9 (1999), S. 123-143 
    ISSN: 1573-1375
    Keywords: Data Mining ; noisy function optimization ; classification ; association ; rule induction
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract Many data analytic questions can be formulated as (noisy) optimization problems. They explicitly or implicitly involve finding simultaneous combinations of values for a set of (“input”) variables that imply unusually large (or small) values of another designated (“output”) variable. Specifically, one seeks a set of subregions of the input variable space within which the value of the output variable is considerably larger (or smaller) than its average value over the entire input domain. In addition it is usually desired that these regions be describable in an interpretable form involving simple statements (“rules”) concerning the input values. This paper presents a procedure directed towards this goal based on the notion of “patient” rule induction. This patient strategy is contrasted with the greedy ones used by most rule induction methods, and semi-greedy ones used by some partitioning tree techniques such as CART. Applications involving scientific and commercial data bases are presented.
    Type of Medium: Electronic Resource
  • 28
    ISSN: 1573-7497
    Keywords: classification ; concept formation ; knowledge-based systems ; wastewater treatment ; environmental engineering
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Although the activated sludge process is a very widely used biological process in wastewater treatment plants (WWTP), and there are properly functioning control loops such as that of dissolved oxygen, in practice this type of plant requires a major time investment on the part of the operator, involving many manual operations. Treatment plants work well most of the time, as long as there are no unforeseen occurrences. Normal operating situations (generally similar to design conditions) can be treated mathematically by using efficient control algorithms. However, there are situations in which the control system cannot properly manage the plant, and in which the process can only be efficiently managed thanks to the operator's experience. This is a case in which a knowledge-based system may be useful. One of the difficulties inherent in the development of a knowledge-based system is obtaining the knowledge base (i.e., knowledge acquisition), especially when dealing with a wide, complicated and ill-structured field. Among the aims of this work are to show how semi-automatic knowledge acquisition tools could help human experts to organize their knowledge about their domain and also to compare the power of different approaches to knowledge acquisition on the same database. In this paper we present the results obtained from applying two different classification techniques to the development of knowledge bases for the management of an activated sludge process.
    Type of Medium: Electronic Resource
  • 29
    Electronic Resource
    Springer
    Applied intelligence 8 (1998), S. 173-187 
    ISSN: 1573-7497
    Keywords: neural networks ; fuzzy logic ; classification ; remote sensing
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this paper, we consider neural-fuzzy models for multispectral image analysis. We consider both supervised and unsupervised classification. The model for supervised classification consists of six layers. The first three layers map the input variables to fuzzy set membership functions. The last three layers implement the decision rules. The model learns decision rules using a supervised gradient descent procedure. The model for unsupervised classification consists of two layers. The algorithm is similar to competitive learning. However, here, for each input sample, membership functions of output categories are used to update weights. Input vectors are normalized, and Euclidean distance is used as the similarity measure. In this model if the input vector does not satisfy the “similarity criterion,” a new cluster is created; otherwise, the weights corresponding to the winner unit are updated using the fuzzy membership values of the output categories. We have developed software for these models. As an illustration, the models are used to analyze multispectral images.
    Type of Medium: Electronic Resource
  • 30
    Electronic Resource
    Springer
    Data mining and knowledge discovery 1 (1997), S. 79-119 
    ISSN: 1573-756X
    Keywords: Bayesian networks ; Bayesian statistics ; learning ; missing data ; classification ; regression ; clustering ; causal discovery
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data modeling. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study.
    Type of Medium: Electronic Resource
  • 31
    Electronic Resource
    Springer
    Data mining and knowledge discovery 1 (1997), S. 55-77 
    ISSN: 1573-756X
    Keywords: classification ; bias ; variance ; curse-of-dimensionality ; bagging ; naive Bayes ; nearest-neighbors
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The classification problem is considered in which an output variable y assumes discrete values with respective probabilities that depend upon the simultaneous values of a set of input variables x = {x_1, ..., x_n}. At issue is how error in the estimates of these probabilities affects classification error when the estimates are used in a classification rule. These effects are seen to be somewhat counterintuitive in both their strength and nature. In particular, the bias and variance components of the estimation error combine to influence classification in a very different way than with squared error on the probabilities themselves. Certain types of (very high) bias can be canceled by low variance to produce accurate classification. This can dramatically mitigate the effect of the bias associated with some simple estimators like “naive” Bayes, and the bias induced by the curse-of-dimensionality on nearest-neighbor procedures. This helps explain why such simple methods are often competitive with and sometimes superior to more sophisticated ones for classification, and why “bagging/aggregating” classifiers can often improve accuracy. These results also suggest simple modifications to these procedures that can (sometimes dramatically) further improve their classification performance.
    Type of Medium: Electronic Resource
  • 32
    Electronic Resource
    Springer
    Data mining and knowledge discovery 1 (1997), S. 317-328 
    ISSN: 1573-756X
    Keywords: classification ; comparative studies ; statistical methods
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully, comparative studies of classification and other types of algorithms can easily result in statistically invalid conclusions. This is especially true when one is using data mining techniques to analyze very large databases, which inevitably contain some statistically unlikely data. This paper describes several phenomena that can, if ignored, invalidate an experimental comparison. These phenomena and the conclusions that follow apply not only to classification, but to computational experiments in almost any aspect of data mining. The paper also discusses why comparative analysis is more important in evaluating some types of algorithms than for others, and provides some suggestions about how to avoid the pitfalls suffered by many experimental studies.
    Type of Medium: Electronic Resource
  • 33
    Electronic Resource
    Springer
    Data mining and knowledge discovery 2 (1998), S. 345-389 
    ISSN: 1573-756X
    Keywords: classification ; tree-structured classifiers ; data compaction
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Decision trees have proved to be valuable tools for the description, classification and generalization of data. Work on constructing decision trees from data exists in multiple disciplines such as statistics, pattern recognition, decision theory, signal processing, machine learning and artificial neural networks. Researchers in these disciplines, sometimes working on quite different problems, identified similar issues and heuristics for decision tree construction. This paper surveys existing work on decision tree construction, attempting to identify the important issues involved, directions the work has taken and the current state of the art.
    Type of Medium: Electronic Resource
  • 34
    Electronic Resource
    Springer
    Data mining and knowledge discovery 3 (1999), S. 197-217 
    ISSN: 1573-756X
    Keywords: binary decision tree ; classification ; data mining ; entropy ; Gini index ; impurity ; optimal splitting
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is often forced to search over all possible L-ary partitions for the one that yields the minimum impurity measure. For binary trees (L = 2) when there are just two classes a short-cut search is possible that is linear in n, the number of distinct values of the attribute. For the general case in which the number of classes, k, may be greater than two, Burshtein et al. have shown that the optimal partition satisfies a condition that involves the existence of 2 L hyperplanes in the class probability space. We derive a property of the optimal partition for concave impurity measures (including in particular the Gini and entropy impurity measures) in terms of the existence of L vectors in the dual of the class probability space, which implies the earlier condition. Unfortunately, these insights still do not offer a practical search method when n and k are large, even for binary trees. We therefore present a new heuristic search algorithm to find a good partition. It is based on ordering the attribute's values according to their principal component scores in the class probability space, and is linear in n. We demonstrate the effectiveness of the new method through Monte Carlo simulation experiments and compare its performance against other heuristic methods.
    Type of Medium: Electronic Resource
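The heuristic described above can be sketched as follows: build the class-probability vector of each attribute value, order the values by their first principal component score, and scan only the prefix splits of that ordering. This is an illustrative reconstruction from the abstract, not the authors' code; the Gini impurity and the toy data are choices made here.

```python
# Minimal sketch: binary split of a nominal attribute via PCA ordering + Gini scan.
import numpy as np

def gini(counts):
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_binary_split(values, labels, classes):
    values = np.asarray(values)
    labels = np.asarray(labels)
    uniq = np.unique(values)
    # class-count and class-probability vector per attribute value
    C = np.array([[np.sum((values == v) & (labels == c)) for c in classes] for v in uniq])
    P = C / C.sum(axis=1, keepdims=True)
    # order the values along the first principal component of the probability vectors
    P0 = P - P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P0, full_matrices=False)
    order = np.argsort(P0 @ Vt[0])
    best = None
    for cut in range(1, len(uniq)):                   # linear scan over prefix splits
        left = C[order[:cut]].sum(axis=0)
        right = C[order[cut:]].sum(axis=0)
        n = left.sum() + right.sum()
        imp = (left.sum() / n) * gini(left) + (right.sum() / n) * gini(right)
        if best is None or imp < best[0]:
            best = (imp, set(uniq[order[:cut]]))
    return best   # (weighted Gini, set of values sent to the left child)

values = ["red", "red", "blue", "blue", "green", "green", "red", "blue"]
labels = [0, 0, 1, 1, 1, 0, 0, 1]
print(best_binary_split(values, labels, classes=[0, 1]))
```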
  • 35
    Electronic Resource
    Springer
    Data mining and knowledge discovery 3 (1999), S. 237-261 
    ISSN: 1573-756X
    Keywords: data mining ; parallel processing ; classification ; scalability ; decision trees
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Classification decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, fraud detection, etc. Highly parallel algorithms for constructing classification decision trees are desirable for dealing with large data sets in a reasonable amount of time. Algorithms for building classification decision trees have a natural concurrency, but are difficult to parallelize due to the inherently dynamic nature of the computation. In this paper, we present parallel formulations of a classification decision tree learning algorithm based on induction. We describe two basic parallel formulations: one based on a Synchronous Tree Construction Approach and the other based on a Partitioned Tree Construction Approach. We discuss the advantages and disadvantages of these methods and propose a hybrid method that employs the good features of both. We also provide an analysis of the computation and communication costs of the proposed hybrid method. Moreover, experimental results on an IBM SP-2 demonstrate excellent speedups and scalability.
    Type of Medium: Electronic Resource