ACR: Adaptive Confidence Re-Scoring for Reliable Answer Selection Among Multiple Candidates (open access)
- Authors
- Jeong, Eunhye; Choi, Yong Suk
- Issue Date
- Aug-2025
- Publisher
- MDPI
- Keywords
- natural language processing; question answering; large language models; prompt engineering; verification
- Citation
- Applied Sciences-basel, v.15, no.17, pp 1 - 16
- Pages
- 16
- Indexed
- SCIE; SCOPUS
- Journal Title
- Applied Sciences-basel
- Volume
- 15
- Number
- 17
- Start Page
- 1
- End Page
- 16
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208862
- DOI
- 10.3390/app15179587
- ISSN
- 2076-3417
- Abstract
- With the improved reasoning capabilities of large language models (LLMs), their applications have rapidly expanded across a wide range of tasks. In recent question answering tasks, performance gains have been achieved through Self-Consistency, where LLMs generate multiple reasoning paths and determine the final answer via majority voting. However, this approach can fail when the correct answer is generated but does not appear frequently enough to be selected, highlighting its vulnerability to inconsistent generations. To address this, we propose Adaptive Confidence Re-scoring (ACR), a method that adaptively evaluates and re-scores candidate answers to select the most trustworthy one when LLMs fail to generate consistent reasoning. Experiments on arithmetic and logical reasoning benchmarks show that ACR maintains or improves answer accuracy while significantly reducing inference cost. Compared to existing verification methods such as FOBAR, ACR reduces the number of inference calls by up to 95%, while improving inference efficiency (measured as accuracy gain per inference call) by a factor of 2x to 17x, depending on the dataset and model.
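The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how ACR decides that generations are inconsistent or how candidates are re-scored, so the agreement `threshold`, the `rescore` callback, and the function names below are all hypothetical. The sketch only shows majority voting as used in Self-Consistency, a fallback that re-scores candidates when agreement is low, and the efficiency measure named in the abstract (accuracy gain per inference call).

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def majority_vote(answers: Sequence[str]) -> Tuple[str, float]:
    """Self-Consistency style selection: most frequent answer and its vote share."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def select_answer(
    answers: Sequence[str],
    rescore: Callable[[str], float],  # hypothetical confidence scorer
    threshold: float = 0.6,           # hypothetical agreement cutoff
) -> str:
    """Keep the majority answer when agreement is high; otherwise re-score.

    Assumption: "inconsistent" is modelled here as a vote share below
    `threshold`; the abstract does not specify the actual criterion.
    """
    answer, share = majority_vote(answers)
    if share >= threshold:
        return answer  # no extra verification calls needed
    # Low agreement: score each distinct candidate once and keep the best.
    candidates = list(dict.fromkeys(answers))
    return max(candidates, key=rescore)


def efficiency(accuracy_gain: float, inference_calls: int) -> float:
    """Accuracy gain per inference call, as described in the abstract."""
    return accuracy_gain / inference_calls
```

With sampled answers `["18", "20", "18", "24"]`, majority voting alone returns `"18"` at a vote share of 0.5. Under the assumed threshold of 0.6 the sketch falls back to re-scoring, so a scorer that prefers `"20"` can recover an answer that was generated but not frequent enough to win the vote.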
- Appears in Collections
- Seoul College of Engineering > Seoul Department of Computer Software > 1. Journal Articles
