ISSN:
1572-8412
Keywords:
Bible; computational linguistics; parallel corpora; Corpus Encoding Standard; translation lexicons
Source:
Springer Online Journal Archives 1860-2000
Topics:
Computer Science; Media Resources and Communication Sciences, Journalism
Notes:
Abstract: We report on a project to annotate biblical texts in order to create an aligned multilingual Bible corpus for linguistic research, particularly computational linguistics, including the automatic creation and evaluation of translation lexicons and semantically tagged texts. The output of this project will enable researchers to take advantage of parallel translations across a larger number of languages than previously available, providing, with relatively little effort, a corpus that contains careful translations and reliable alignment at the near-sentence level. We discuss the nature of the text, our annotation process, preliminary and planned uses for the corpus, and relevant aspects of the Corpus Encoding Standard (CES) with respect to this corpus. We also present a quantitative comparison with dictionary and corpus resources for modern-day English, confirming the relevance of this corpus for research on present-day language.
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1023/A:1001798929185