Evaluation of TnT Tagger for Spanish

Carrasco, R.M.; Gelbukh, A.

doi:10.1109/ENC.2003.1232869

Authors: Carrasco, R.M.; Gelbukh, A.

Issue Date: Sep-2003

Publisher: IEEE Computer Society

Keywords: Character recognition; Error analysis; Mood; Natural languages; Speech processing; Speech recognition; Tagging; Testing; Text processing; Text recognition

Citation: Proceedings of the Mexican International Conference on Computer Science, v.2003-January, pp 18 - 25

Pages: 8

Journal Title: Proceedings of the Mexican International Conference on Computer Science

Volume: 2003-January

Start Page: 18

End Page: 25

URI: https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/65617

DOI: 10.1109/ENC.2003.1232869

ISSN: 1550-4069

Abstract: A part-of-speech (POS) tagger is a necessary module in many natural language text processing tasks. A POS tagger is a program that accepts unprepared raw text as input and adds to each word a tag specifying its grammatical properties, such as part of speech, number, and person. One popular POS tagger, TnT, has been extensively tested for English and some other languages. This paper reports on its evaluation for Spanish. An error analysis is reported, explaining how some specific features of Spanish affect tagger performance. On Spanish texts, TnT shows overall tagging accuracy between 92.5% and 95.84%: between 95.47% and 98.56% on known words and between 75.57% and 83.49% on unknown words. The results show that TnT has reached a good level of maturity and is helpful enough for NLP tasks. © 2003 IEEE.
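The abstract reports accuracy separately for known and unknown words. As a rough illustration of how such a breakdown is commonly computed, the sketch below assumes that a "known" word is one that occurred in the tagger's training data; the paper's actual evaluation procedure and tag set are not given in this record. The function name `accuracy_breakdown`, the toy sentence, and the tag labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def accuracy_breakdown(
    train_vocab: Iterable[str],
    gold: Sequence[Tuple[str, str]],
    predicted: Sequence[str],
) -> Dict[str, float]:
    """Return overall, known-word, and unknown-word tagging accuracy.

    Assumption: a word counts as "known" if it appears in train_vocab.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    vocab = set(train_vocab)
    # Each entry holds [correct, total] for its bucket.
    counts: Dict[str, List[int]] = {
        "overall": [0, 0],
        "known": [0, 0],
        "unknown": [0, 0],
    }
    for (word, gold_tag), pred_tag in zip(gold, predicted):
        bucket = "known" if word in vocab else "unknown"
        hit = int(gold_tag == pred_tag)
        for key in ("overall", bucket):
            counts[key][0] += hit
            counts[key][1] += 1

    # An empty bucket yields NaN rather than a misleading 0.0.
    return {
        key: (correct / total if total else float("nan"))
        for key, (correct, total) in counts.items()
    }


# Toy data (hypothetical): three words seen in training, one unseen.
vocab = {"el", "perro", "come"}
gold = [("el", "DET"), ("perro", "NOUN"), ("come", "VERB"), ("croquetas", "NOUN")]
pred = ["DET", "NOUN", "NOUN", "NOUN"]
result = accuracy_breakdown(vocab, gold, pred)
```

With this toy input, three of four tags are correct overall, two of three on known words, and the single unknown word is tagged correctly. Real evaluations would aggregate over a held-out corpus instead of one sentence.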
