ISSN:
1572-8412
Keywords:
conflation algorithm; Hartlib Papers Collection; Latin; Patrologia Latina; stemming; text databases
Source:
Springer Online Journal Archives 1860-2000
Topics:
Computer Science; Media Resources and Communication Sciences, Journalism
Notes:
Abstract: This paper reports a detailed evaluation of the effectiveness of a system developed for identifying and retrieving morphological variants in searches of Latin text databases. A user of the retrieval system enters the principal parts of the search term (two parts for a noun or adjective, three parts for a deponent verb, and four parts for other verbs); this identifies the type of word to be processed and the rules to be followed in determining which morphological variants should be retrieved. Two different search algorithms are described. The algorithms are applied to the Latin portion of the Hartlib Papers Collection and to a range of classical, vulgar and medieval Latin texts drawn from the Patrologia Latina and from the PHI Disk 5.3 datasets. The results of these searches demonstrate the effectiveness of our procedures in providing access to the full range of classical and post-classical Latin text databases.
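The input convention described in the abstract (the number of principal parts signals the word type) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `classify_search_term` and `first_declension_variants` are hypothetical, and the variant rule shown covers only regular first-declension nouns as a toy example of a rule set that such a system might select.

```python
# Illustrative sketch only; names and the toy rule set are assumptions,
# not the algorithms evaluated in the paper.

# Word type implied by the number of principal parts entered by the user,
# following the convention stated in the abstract.
WORD_TYPE_BY_PART_COUNT = {
    2: "noun or adjective",
    3: "deponent verb",
    4: "other verb",
}

# Distinct endings of a regular first-declension noun (macrons ignored).
FIRST_DECLENSION_ENDINGS = ("a", "ae", "am", "arum", "is", "as")


def classify_search_term(principal_parts):
    """Return the word type implied by how many principal parts were given."""
    parts = [p.strip().lower() for p in principal_parts if p.strip()]
    if len(parts) not in WORD_TYPE_BY_PART_COUNT:
        raise ValueError(f"expected 2, 3 or 4 principal parts, got {len(parts)}")
    return WORD_TYPE_BY_PART_COUNT[len(parts)]


def first_declension_variants(nominative, genitive):
    """Generate the inflected forms of a regular first-declension noun."""
    if not (nominative.endswith("a") and genitive.endswith("ae")):
        raise ValueError("not a first-declension nominative/genitive pair")
    stem = genitive[:-2]  # strip the genitive singular ending "-ae"
    return sorted(stem + ending for ending in FIRST_DECLENSION_ENDINGS)


if __name__ == "__main__":
    print(classify_search_term(["puella", "puellae"]))
    print(first_declension_variants("puella", "puellae"))
```

A search for the generated forms would then retrieve all inflections of the term from the text database. A full system would need rule sets for every declension and conjugation, plus handling of irregular and post-classical spellings.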
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1023/A:1000996413558