오픈소스 OCR을 이용한 군수품 시험성적서 전산화 연구 (A Study on the Computerization of Military Supplies Test Reports with Open Source OCR)
- Other Titles
- A Study on the Computerization of Military Supplies Test Reports with Open Source OCR
- Authors
- 백승현
- Issue Date
- Sep-2025
- Publisher
- 한국품질경영학회 (Korean Society for Quality Management)
- Keywords
- DMADOV; Open Source; OCR; Military Supplies; Test Report
- Citation
- 품질경영학회지 (Journal of Korean Society for Quality Management), v.53, no.3, pp. 435-460
- Pages
- 26
- Indexed
- KCI
- Journal Title
- 품질경영학회지 (Journal of Korean Society for Quality Management)
- Volume
- 53
- Number
- 3
- Start Page
- 435
- End Page
- 460
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/126563
- DOI
- 10.7469/JKSQM.2025.53.3.435
- ISSN
- 1229-1889
- 2287-9005
- Abstract
- Purpose: This study aims to transform test reports in the defense industry into a structured database (DB) by leveraging open-source Optical Character Recognition (OCR) and following the DMADOV methodology for quality improvement.
Methods: The research was conducted in two phases following the DMADOV procedure. First, a baseline system was developed using the open-source OCR engine Tesseract to create a text extraction program, with data structuring attempted via rule-based post-processing. Subsequently, to overcome the baseline system's limitations, a multi-model pipeline, specifically PaddleOCR's PP-Structure, was applied to enhance structural recognition performance, including layout analysis and table recognition. The performance of both systems was compared using quantitative metrics and qualitative analysis.
Results: The initial Tesseract-based model relied heavily on strict, rule-based post-processing to ultimately achieve a 100% data match rate, and this dependence revealed the system's lack of scalability and flexibility. In contrast, the optimized system using the multi-model pipeline (PP-Structure) accurately recognized the document's structure and content without requiring separate, complex post-processing, demonstrating superior performance in both qualitative and quantitative aspects.
Conclusion: This study clearly identified the limitations of a simple OCR engine and demonstrated that a multi-model pipeline is an effective alternative for the automated structuring of defense quality data. The findings provide a practical roadmap for system integration companies and their partners to build a big data-based quality information system. Furthermore, the study is significant in its proposal of data utilization strategies for the implementation of Defense Quality 4.0.
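The rule-based post-processing and data match rate described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sample OCR text, the field names, the regular expressions, and the definition of match rate as field-level exact match are all assumptions made for illustration, and the OCR step itself is replaced by a hard-coded string so the example runs without Tesseract installed.

```python
import re

# Hypothetical OCR output for one test-report page (not taken from the paper).
SAMPLE_OCR_TEXT = """\
Test Report No : TR-2025-0017
Item Name: Field Jacket
Tensile Strength (N) : 412.5
Result : PASS
"""

# Hypothetical rule set: one regular expression per target DB column.
FIELD_RULES = {
    "report_no": re.compile(r"Test Report No\s*:\s*(\S+)"),
    "item_name": re.compile(r"Item Name\s*:\s*(.+)"),
    "tensile_strength_n": re.compile(r"Tensile Strength \(N\)\s*:\s*([\d.]+)"),
    "result": re.compile(r"Result\s*:\s*(PASS|FAIL)"),
}


def extract_fields(ocr_text):
    """Apply each rule to the OCR text; fields with no match become None."""
    record = {}
    for field, pattern in FIELD_RULES.items():
        match = pattern.search(ocr_text)
        record[field] = match.group(1).strip() if match else None
    return record


def match_rate(extracted, ground_truth):
    """Fraction of ground-truth fields whose extracted value is identical.

    Field-level exact match is an assumed definition of the metric.
    """
    hits = sum(
        1 for key, value in ground_truth.items() if extracted.get(key) == value
    )
    return hits / len(ground_truth)
```

Because every rule is tied to an exact label string, a report that prints "Report Number" instead of "Test Report No" would leave `report_no` empty until a new rule is written. That brittleness is one plausible reading of the scalability and flexibility limits the abstract attributes to the Tesseract-based baseline.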
- Appears in Collections
- COLLEGE OF BUSINESS AND ECONOMICS > DIVISION OF BUSINESS ADMINISTRATION > 1. Journal Articles
