ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Electronic Resource

Linguistically based functions in information retrieval: PADOK and the German Patent Information System (1991)

Krause, Jürgen ; Womser-Hacker, Christa

Springer

Computers and the humanities 25 (1991), S. 103-114

ISSN: 1572-8412

Keywords: information retrieval ; intelligent information retrieval ; evaluation ; mass data ; patent information system ; statistical measurement ; indexing system ; protocol analysis

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract This paper reports on methodological considerations and the results of the Information Retrieval (IR) project PADOK I and II. PADOK has been carried out by the Linguistic Information Science Group of the University of Regensburg (LIR) since November 1984 and has been sponsored by the German Ministry for Research and Technology. The long-term objective is to integrate artificial intelligence topics and the methods of information retrieval research without neglecting traditional IR methodology. In PADOK we consider a type of mass data IR system which indexes its documents rather shallowly (freetext or morphological components) and adds an intelligent information retrieval component to this kernel system. So far we have obtained, on the basis of two large-scale retrieval tests of the German Patent Information System, results which show how the linguistically based functions of an indexing system contribute to its performance, and indicate what is the most reasonable basic content analysis program for a German Patent Information System. This paper focusses on the general principles and aims of PADOK I and PADOK II and on the statistical evaluation of the retrieval tests. Christa Womser-Hacker has a Ph.D. in Linguistic Information Science. From 1985 until 1990 she was involved in several LIR projects concerning text processing, evaluation of the German Patent Information System, man-machine interaction, and intelligent interfaces for databases. Since May 1990 she has been an LIR staff member. She is interested in information retrieval, (statistical) evaluation methods of man-machine interaction, and intelligent interfaces. She has published Der PADOK-Retrieval-test (1989) and “Die statistische Auswertung des Retrievaltests” (1990).

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00124147

Paper (German National Licenses)

2

Electronic Resource

Tagger Evaluation Given Hierarchical Tag Sets (2000)

Melamed, I. Dan ; Resnik, Philip

Springer

Computers and the humanities 34 (2000), S. 79-84

ISSN: 1572-8412

Keywords: evaluation ; ambiguity resolution ; WSD ; inter-annotator agreement

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract We present methods for evaluating human and automatic taggers that extend current practice in three ways. First, we show how to evaluate taggers that assign multiple tags to each test instance, even if they do not assign probabilities. Second, we show how to accommodate a common property of manually constructed "gold standards" that are typically used for objective evaluation, namely that there is often more than one correct answer. Third, we show how to measure performance when the set of possible tags is tree-structured in an IS-A hierarchy. To illustrate how our methods can be used to measure inter-annotator agreement, we show how to compute the kappa coefficient over hierarchical tag sets.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1002402902356

Paper (German National Licenses)

3

Electronic Resource

Peeling an Onion: The Lexicographer's Experience of Manual Sense-Tagging (2000)

Krishnamurthy, Ramesh ; Nicholls, Diane

Springer

Computers and the humanities 34 (2000), S. 85-97

ISSN: 1572-8412

Keywords: context ; corpus ; evaluation ; lexicography ; part-of-speech tagging ; word sense disambiguation ; sense-tagging

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract SENSEVAL set itself the task of evaluating automatic word sense disambiguation programs (see Kilgarriff and Rosenzweig, this volume, for an overview of the framework and results). In order to do this, it was necessary to provide a 'gold standard' dataset of 'correct' answers. This paper will describe the lexicographic part of the process involved in creating that dataset. The primary objective was for a group of lexicographers to manually examine keywords in a large number of corpus contexts, and assign to each context a sense-tag for the keyword, taken from the Hector dictionary. Corpus contexts also had to be manually part-of-speech (POS) tagged. Various observations made and insights gained by the lexicographers during this process will be presented, including a critique of the resources and the methodology.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1002407003264

Paper (German National Licenses)

4

Electronic Resource

Framework and Results for English SENSEVAL (2000)

Kilgarriff, A. ; Rosenzweig, J.

Springer

Computers and the humanities 34 (2000), S. 15-48

ISSN: 1572-8412

Keywords: evaluation ; SENSEVAL ; word sense disambiguation

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract Senseval was the first open, community-based evaluation exercise for Word Sense Disambiguation programs. It adopted the quantitative approach to evaluation developed in MUC and other ARPA evaluation exercises. It took place in 1998. In this paper we describe the structure, organisation and results of the SENSEVAL exercise for English. We present and defend various design choices for the exercise, describe the data and gold-standard preparation, consider issues of scoring strategies and baselines, and present the results for the 18 participating systems. The exercise identifies the state-of-the-art for fine-grained word sense disambiguation, where training data is available, as 74–78% correct, with a number of algorithms approaching this level of performance. For systems that did not assume the availability of training data, performance was markedly lower and also more variable. Human inter-tagger agreement was high, with the gold standard taggings being around 95% replicable.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1002693207386

Paper (German National Licenses)

5

Electronic Resource

Introduction to the Special Issue on SENSEVAL (2000)

Kilgarriff, A. ; Palmer, M.

Springer

Computers and the humanities 34 (2000), S. 1-13

ISSN: 1572-8412

Keywords: word sense disambiguation ; evaluation ; SENSEVAL

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract Senseval was the first open, community-based evaluation exercise for Word Sense Disambiguation programs. It took place in the summer of 1998, with tasks for English, French and Italian. There were participating systems from 23 research groups. This special issue is an account of the exercise. In addition to describing the contents of the volume, this introduction considers how the exercise has shed light on some general questions about word senses and evaluation.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1002619001915

Paper (German National Licenses)

6

Electronic Resource

Quantification of rewriting by the Brothers Grimm: A comparison of successive versions of three tales (1989)

Anderson, C. W. ; McMaster, G. E.

Springer

Computers and the humanities 23 (1989), S. 341-346

ISSN: 1572-8412

Keywords: evaluation ; activity ; potency ; emotional tone

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract A comparison was made of the levels and patterns of emotional tone scores in four successive versions of three stories that have been translated from German by Ellis to illustrate his argument that the Grimm Brothers made extensive revisions from the purported manuscript of the stories to their celebrated first edition versions. This objective analysis was based upon the evaluation, activity, and potency of the emotions connoted by those of the 1000 most frequent English words detected by the computer as occurring in the narratives. The stories were: The King's Daughter and The Enchanted Prince: Frog King, Sleeping Beauty, and The Little Brother and Little Sister (Hansel and Gretel). Changes in story length, in mean levels of emotional tone, and in patterns of emotional tone across story versions support Ellis's judgement that subsequent revisions were less drastic than the first one, from the manuscript. It was also shown that the stories are quite different from each other in level and pattern of emotional tone.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF02176639

Paper (German National Licenses)

7

Electronic Resource

Developing and evaluating language courseware (1992)

Ephratt, Michal

Springer

Computers and the humanities 26 (1992), S. 249-259

ISSN: 1572-8412

Keywords: linguistics ; language instruction ; CALL ; evaluation

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science, Media Resources and Communication Sciences, Journalism

Notes: Abstract The paper sets out twenty proposals for the development and evaluation of Computer Assisted Language Learning (CALL) programs. These proposals emerge from special characteristics of language instruction and of the use of computers to assist in language instruction. We combine theoretically-based assumptions with empirical findings drawn from investigation of language courseware for Hebrew speakers in Israel. We first list four unique features of language instruction: (1) the object-language-meta-language distinction; (2) computer as written medium vs. language as primary spoken medium; (3) teaching of second language skills vs. linguistics; (4) the computer as an electronic tool vs. the computer as a cognitive entity simulating the speaker. We then show how these unique characteristics of language instruction (mother-tongue and foreign language) impose special proposals on language courseware. These proposals should be observed in the development of language courseware and in the evaluation of such programs. Clearly, these proposals integrate with general courseware proposals.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00054270

Paper (German National Licenses)
