TwiSNER: Semi-supervised Method for Named Entity Recognition from Text Streams on Twitter
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Van Cuong Tran | - |
dc.contributor.author | Hwang, Dosam | - |
dc.contributor.author | Jung, Jason J. | - |
dc.date.available | 2019-03-08T15:58:44Z | - |
dc.date.issued | 2016 | - |
dc.identifier.issn | 0948-695X | - |
dc.identifier.issn | 0948-6968 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/8740 | - |
dc.description.abstract | Data from Social Network Services (SNSs) have recently become an interesting source for researchers conducting Natural Language Processing (NLP) experiments such as sentiment analysis, information extraction, and Named Entity Recognition (NER). SNS data are usually characterized as short and noisy, with insufficient supplemental information. They often contain grammatical errors, misspellings, and unreliable capitalization. Thus, standard NLP tools (e.g., NER systems) have difficulty obtaining good results when applied to these data, even if they perform well on well-formatted texts. Most traditional NER methods are based on supervised learning techniques that often require a large amount of standard training data to train a classifier. In this paper, we propose a method called TwiSNER to classify named entities in Twitter data (called tweets) by using a semi-supervised learning approach combined with the conditional random field model, hand-made rules, and the co-occurrence coefficient of the featured words surrounding entities. In the experiments, TwiSNER is applied to a dataset collected from Twitter, which includes 11,425 tweets for training with 4,716 labeled tweets and 1,450 tweets for testing. TwiSNER produces promising results, where the best F-measure is better than the baselines. | - |
dc.format.extent | 20 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | GRAZ UNIV TECHNOLOGY, INST INFORMATION SYSTEMS COMPUTER MEDIA-IICM | - |
dc.title | TwiSNER: Semi-supervised Method for Named Entity Recognition from Text Streams on Twitter | - |
dc.type | Article | - |
dc.identifier.doi | 10.3217/jucs-022-06-0782 | - |
dc.identifier.bibliographicCitation | JOURNAL OF UNIVERSAL COMPUTER SCIENCE, v.22, no.6, pp 782 - 801 | - |
dc.description.isOpenAccess | N | - |
dc.identifier.wosid | 000384891200004 | - |
dc.identifier.scopusid | 2-s2.0-84983389589 | - |
dc.citation.endPage | 801 | - |
dc.citation.number | 6 | - |
dc.citation.startPage | 782 | - |
dc.citation.title | JOURNAL OF UNIVERSAL COMPUTER SCIENCE | - |
dc.citation.volume | 22 | - |
dc.type.docType | Article | - |
dc.publisher.location | Austria | - |
dc.subject.keywordAuthor | Named Entity Recognition | - |
dc.subject.keywordAuthor | SNS Analysis | - |
dc.subject.keywordAuthor | Semi-supervised Learning | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.