AlphaRouter: Bridging the Gap Between Reinforcement Learning and Optimization for Vehicle Routing with Monte Carlo Tree Searches

Kim, Won-Jun; Jeong, Junho; Kim, Taeyeong; Lee, Kichun

doi:10.3390/e27030251

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Won-Jun	-
dc.contributor.author	Jeong, Junho	-
dc.contributor.author	Kim, Taeyeong	-
dc.contributor.author	Lee, Kichun	-
dc.date.accessioned	2025-04-28T08:30:20Z	-
dc.date.available	2025-04-28T08:30:20Z	-
dc.date.issued	2025-02	-
dc.identifier.issn	1099-4300	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/207257	-
dc.description.abstract	Deep reinforcement learning (DRL) as a routing problem solver has shown promising results in recent studies. However, an inherent gap exists between computationally driven DRL and optimization-based heuristics. While a DRL algorithm for a certain problem is able to solve several similar problem instances, traditional optimization algorithms focus on optimizing solutions to one specific problem instance. In this paper, we propose an approach, AlphaRouter, which solves routing problems while bridging the gap between reinforcement learning and optimization. Tailored to routing problems, our approach first introduces attention-enabled policy and value networks: a policy network that produces a probability distribution over all possible nodes and a value network that estimates the expected distance from any given state. We then modify Monte Carlo tree search (MCTS) for routing problems and selectively combine it with the RL model. Our experiments demonstrate that the combined approach is promising and yields better solutions than the original reinforcement learning (RL) approaches without MCTS, with performance comparable to classical heuristics.	-
dc.format.extent	26	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Multidisciplinary Digital Publishing Institute (MDPI)	-
dc.title	AlphaRouter: Bridging the Gap Between Reinforcement Learning and Optimization for Vehicle Routing with Monte Carlo Tree Searches	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.3390/e27030251	-
dc.identifier.scopusid	2-s2.0-105001337085	-
dc.identifier.wosid	001453942700001	-
dc.identifier.bibliographicCitation	Entropy, v.27, no.3, pp 1 - 26	-
dc.citation.title	Entropy	-
dc.citation.volume	27	-
dc.citation.number	3	-
dc.citation.startPage	1	-
dc.citation.endPage	26	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Physics	-
dc.relation.journalWebOfScienceCategory	Physics, Multidisciplinary	-
dc.subject.keywordPlus	GO	-
dc.subject.keywordPlus	ALGORITHMS	-
dc.subject.keywordPlus	GAME	-
dc.subject.keywordAuthor	deep reinforcement learning	-
dc.subject.keywordAuthor	reinforcement learning	-
dc.subject.keywordAuthor	MCTS	-
dc.subject.keywordAuthor	vehicle routing problem	-
dc.identifier.url	https://www.mdpi.com/1099-4300/27/3/251	-

Appears in Collections: Seoul College of Engineering > Seoul Department of Industrial Engineering > 1. Journal Articles

Related Researcher

Lee, Kichun: College of Engineering (Department of Industrial Engineering)
