Evaluation of TnT Tagger for Spanish
- Authors
- Carrasco, R.M.; Gelbukh, A.
- Issue Date
- Sep-2003
- Publisher
- IEEE Computer Society
- Keywords
- Character recognition; Error analysis; Mood; Natural languages; Speech processing; Speech recognition; Tagging; Testing; Text processing; Text recognition
- Citation
- Proceedings of the Mexican International Conference on Computer Science, v.2003-January, pp 18 - 25
- Pages
- 8
- Journal Title
- Proceedings of the Mexican International Conference on Computer Science
- Volume
- 2003-January
- Start Page
- 18
- End Page
- 25
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/65617
- DOI
- 10.1109/ENC.2003.1232869
- ISSN
- 1550-4069
- Abstract
- A part-of-speech (POS) tagger is a necessary module in many natural language text processing tasks. A POS tagger is a program that accepts unprepared raw text as input and adds to each word a tag specifying its grammatical properties, such as part of speech, number, and person. One popular POS tagger, TnT, has been extensively tested for English and some other languages. This paper reports on its evaluation for Spanish. An error analysis is included, explaining how some specific features of Spanish affect tagger performance. On Spanish texts, TnT shows an overall tagging accuracy between 92.5% and 95.84%: between 95.47% and 98.56% on known words and between 75.57% and 83.49% on unknown words. The results show that TnT has reached a good level of maturity and is useful enough for NLP tasks. © 2003 IEEE.
- Appears in Collections
- College of Software > School of Computer Science and Engineering > 1. Journal Articles