Dataset retrieval system based on automation of data preparation with dataset description model
- Authors
- Mun, J.; Lee, S.; Choi, J.; Choi, J.; Bae, K.
- Issue Date
- 25-Jan-2021
- Publisher
- John Wiley and Sons Ltd
- Keywords
- data preparation; dataset description; dataset retrieval
- Citation
- Concurrency Computation, v.33, no.2
- Journal Title
- Concurrency Computation
- Volume
- 33
- Number
- 2
- URI
- http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/34739
- DOI
- 10.1002/cpe.5288
- ISSN
- 1532-0626
- Abstract
- Data preparation is the most labor-intensive task in statistical learning. Many data mining studies skip it, assuming that qualified datasets are already available. Skipping preparation can hide useful patterns in the data, which can lead to poor performance and incorrect learning. Automating data preparation can address these problems, but a few issues must be considered: flexible expression of requirements according to the purpose of the learning model, accessibility of data sources, and performance degradation due to automation. In this paper, we propose a dataset description model that can express the requirements for data processing, and a dataset retrieval system based on automated data preparation. The proposed system provides good-quality datasets for statistical learning applications using data preparation methods such as data acquisition, refinement, and organization. In the experiment, we demonstrate that the proposed system incurs no performance loss compared with existing manual systems. Moreover, the quality of the datasets is also improved by using the proposed system. © 2019 John Wiley & Sons, Ltd.
- Appears in Collections
- College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles