ISSN:
1573-7462
Keywords:
clustering
;
divisive partitioning
;
graph partitioning
;
principal component analysis
;
web documents
Source:
Springer Online Journal Archives 1860-2000
Topics:
Computer Science
Notes:
Abstract We present WebACE, an agent for exploring and categorizing documents onthe World Wide Web based on a user profile. The heart of the agent is anunsupervised categorization of a set of documents, combined with a processfor generating new queries that is used to search for new relateddocuments and for filtering the resulting documents to extract the onesmost closely related to the starting set. The document categories are notgiven a priori. We present the overall architecture and describe twonovel algorithms which provide significant improvement over HierarchicalAgglomeration Clustering and AutoClass algorithms and form the basis forthe query generation and search component of the agent. We report on theresults of our experiments comparing these new algorithms with moretraditional clustering algorithms and we show that our algorithms are fastand sacalable.
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1023/A:1006592405320
Permalink