Improved Machine Reading Comprehension Using Data Validation for Weakly Labeled Data

Yang Y.; Kang S.; Seo J.

Cited 5 times in Web of Science

Cited 5 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yang Y.	-
dc.contributor.author	Kang S.	-
dc.contributor.author	Seo J.	-
dc.date.available	2020-03-03T06:45:29Z	-
dc.date.created	2020-02-24	-
dc.date.issued	2020-01	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/17740	-
dc.description.abstract	Machine reading comprehension (MRC) is a natural language processing task wherein a given question is answered according to a holistic understanding of a given context. Recently, many researchers have shown interest in MRC, for which a considerable number of datasets are being released. Datasets for MRC, which are composed of the context-query-answer triple, are designed to answer a given query by referencing and understanding a readily-available, relevant context text. The TriviaQA dataset is a weakly labeled dataset, because it contains irrelevant context that forms no basis for answering the query. The existing syntactic data cleaning method struggles to deal with the contextual noise this irrelevancy creates. Therefore, a semantic data cleaning method using reasoning processes is necessary. To address this, we propose a new MRC model in which the TriviaQA dataset is validated and trained using a high-quality dataset. The data validation method in our MRC model improves the quality of the training dataset, and the answer extraction model learns with the validated training data, because of our validation method. Our proposed method showed a 4.33% improvement in performance for the TriviaQA Wiki, compared to the existing baseline model. Accordingly, our proposed method can address the limitation of irrelevant context in MRC better than the human supervision. © 2013 IEEE.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.relation.isPartOf	IEEE Access	-
dc.title	Improved Machine Reading Comprehension Using Data Validation for Weakly Labeled Data	-
dc.type	Article	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.identifier.wosid	000524677500039	-
dc.identifier.doi	10.1109/ACCESS.2019.2963569	-
dc.identifier.bibliographicCitation	IEEE Access, v.8, pp.5667 - 5677	-
dc.description.isOpenAccess	N	-
dc.identifier.scopusid	2-s2.0-85078309202	-
dc.citation.endPage	5677	-
dc.citation.startPage	5667	-
dc.citation.title	IEEE Access	-
dc.citation.volume	8	-
dc.contributor.affiliatedAuthor	Kang S.	-
dc.type.docType	Article	-
dc.subject.keywordAuthor	Computational and artificial intelligence	-
dc.subject.keywordAuthor	data validation	-
dc.subject.keywordAuthor	machine reading comprehension	-
dc.subject.keywordAuthor	natural language processing	-
dc.subject.keywordAuthor	neural networks	-
dc.subject.keywordAuthor	weak label	-
dc.subject.keywordPlus	Neural networks	-
dc.subject.keywordPlus	Query processing	-
dc.subject.keywordPlus	Semantics	-
dc.subject.keywordPlus	Computational and artificial intelligences	-
dc.subject.keywordPlus	Data validation	-
dc.subject.keywordPlus	Natural language processing	-
dc.subject.keywordPlus	Reading comprehension	-
dc.subject.keywordPlus	Weak labels	-
dc.subject.keywordPlus	Natural language processing systems	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-

Files in This Item: There are no files associated with this item.

Appears in Collections: College of IT Convergence > Department of Software > 1. Journal Articles


Related Researcher

Kang, Sang Woo: College of IT Convergence (Department of Software)
