Rational Text Augmentation Method with Korean Misspellings
- Authors
- Yu, Seunguk; Song, Sangmin; Kim, Youngbin
- Issue Date
- Jan-2024
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Data Augmentation; Deep Learning; Natural Language Processing
- Citation
- Digest of Technical Papers - IEEE International Conference on Consumer Electronics, v.2024 IEEE
- Journal Title
- Digest of Technical Papers - IEEE International Conference on Consumer Electronics
- Volume
- 2024 IEEE
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/73046
- DOI
- 10.1109/ICCE59016.2024.10444190
- ISSN
- 0747-668X
- Abstract
- With the development of deep learning, many natural language processing models have been proposed, but they may not perform well on sentences containing misspellings. To address this, we propose a text augmentation method that uses Korean misspellings collected from online comments. Unlike previous text augmentation methods, we focus on the fact that the original meaning of Korean words can still be inferred when misspellings occur. Furthermore, Korean allows a variety of misspellings even within a single character, so we analyze their frequencies to create augmented sentences. The experimental results show that our method achieves an average performance improvement of 4.5 percentage points on the NSMC and Korean Hate Speech datasets and is more rational and efficient than previously used text augmentation methods. © 2024 IEEE.
- Appears in Collections
- Graduate School of Advanced Imaging Sciences, Multimedia and Film > Department of Imaging Science and Arts > 1. Journal Articles
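
The abstract describes creating augmented sentences from the frequencies of misspellings that occur within a single Korean character. The record does not give the algorithm itself, so the following is only a minimal sketch of one way such frequency-weighted substitution could look, not the authors' implementation. The Unicode arithmetic for splitting a Hangul syllable into initial, medial, and final parts is standard; the table `MISSPELLING_COUNTS`, its counts, the `rate` parameter, and the function names are hypothetical and chosen for illustration only.

```python
import random

HANGUL_BASE = 0xAC00          # code point of the first precomposed syllable
NUM_SYLLABLES = 11172         # number of precomposed Hangul syllables
NUM_JUNG = 21                 # medial vowels
NUM_JONG = 28                 # final consonants, including "none"

# Medial vowels in Unicode composition order.
JUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")

# HYPOTHETICAL counts: how often a vowel is mistyped as another vowel.
# A real table would be estimated from a corpus of online comments.
MISSPELLING_COUNTS = {
    "ㅐ": {"ㅔ": 70, "ㅒ": 5},
    "ㅔ": {"ㅐ": 60},
    "ㅚ": {"ㅙ": 40, "ㅞ": 10},
    "ㅙ": {"ㅚ": 30},
}


def decompose(ch):
    """Return (initial, medial, final) indices, or None if ch is not a Hangul syllable."""
    code = ord(ch) - HANGUL_BASE
    if not 0 <= code < NUM_SYLLABLES:
        return None
    cho, rest = divmod(code, NUM_JUNG * NUM_JONG)
    jung, jong = divmod(rest, NUM_JONG)
    return cho, jung, jong


def compose(cho, jung, jong):
    """Rebuild a precomposed syllable from its three indices."""
    return chr(HANGUL_BASE + (cho * NUM_JUNG + jung) * NUM_JONG + jong)


def augment(sentence, rate=0.3, rng=None):
    """Swap medial vowels for confusable ones, sampled in proportion to their counts."""
    rng = rng or random.Random()
    out = []
    for ch in sentence:
        parts = decompose(ch)
        if parts is not None:
            cho, jung, jong = parts
            candidates = MISSPELLING_COUNTS.get(JUNG[jung])
            if candidates and rng.random() < rate:
                target = rng.choices(
                    list(candidates), weights=list(candidates.values()), k=1
                )[0]
                ch = compose(cho, JUNG.index(target), jong)
        out.append(ch)
    return "".join(out)
```

For example, `augment("개", rate=1.0)` yields either "게" or "걔" under the hypothetical table above, with "게" far more likely because its count is higher. Only the medial vowel is perturbed in this sketch; initial and final consonants could be handled the same way with their own tables.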