ADuLTS: Appearance Descriptions under Long-Tailed Scenarios with diverse synthesized images
- Authors
- Cha, Seungju; Choi, Seunghee; Lee, Kwanyoung; Kim, Dong-Jin
- Issue Date
- Mar-2026
- Publisher
- ACADEMIC PRESS INC ELSEVIER SCIENCE
- Keywords
- Data synthesizing; Generation; Long-tail classification; LLM; Supervised contrastive learning
- Citation
- COMPUTER VISION AND IMAGE UNDERSTANDING, v.265, pp 1 - 8
- Pages
- 8
- Indexed
- SCIE; SCOPUS
- Journal Title
- COMPUTER VISION AND IMAGE UNDERSTANDING
- Volume
- 265
- Start Page
- 1
- End Page
- 8
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211383
- DOI
- 10.1016/j.cviu.2026.104684
- ISSN
- 1077-3142; 1090-235X
- Abstract
- Unlike class-balanced datasets such as ImageNet and CIFAR, real-world data often follow an imbalanced distribution in which most image samples are concentrated in a few head classes while tail classes account for only a small portion of the data. When existing image classification models are trained on such imbalanced data, they can become biased toward the head classes. Recently, supervised contrastive learning (SCL) methods have shown effective results in image classification on balanced datasets. However, when SCL is applied to image classification models with imbalanced datasets, classification performance decreases because of the lack of tail-class representations. Motivated by the recent success of generative models in synthesizing realistic images, we propose Appearance Descriptions under Long-Tailed Scenarios (ADuLTS), which utilizes pre-trained large language models (LLMs) for tail-class sample generation to address this representation shortage. By asking an LLM to describe the semantic appearance of each class, we synthesize images that alleviate the sample shortage in the tail classes. Furthermore, we propose a simple but effective joint training network to reduce the feature distribution difference when training on our balanced dataset of real and generated images. Our method noticeably improves classification performance in long-tailed scenarios, especially on tail classes with fewer samples, on CIFAR10-LT, CIFAR100-LT, and miniImageNet-LT.
- Appears in Collections
- Seoul College of Engineering > ETC > 1. Journal Articles
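
The balancing idea summarized in the abstract (generate extra tail-class images until the classes are even) can be illustrated with a minimal sketch. This is not the authors' implementation: the exponential class-count profile is an assumption borrowed from how CIFAR-LT style benchmarks are commonly built, the function names and the prompt template are hypothetical, and the actual LLM and image-generator calls are left out.

```python
def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class sample counts under an exponential long-tailed profile.

    Assumption: class i keeps n_max * (1 / imbalance_ratio) ** (i / (C - 1))
    samples, so the head class has n_max and the tail class has roughly
    n_max / imbalance_ratio.
    """
    if num_classes == 1:
        return [n_max]
    return [
        int(n_max * (1.0 / imbalance_ratio) ** (i / (num_classes - 1)))
        for i in range(num_classes)
    ]


def synthesis_budget(counts):
    """Number of images to synthesize per class to match the largest class."""
    target = max(counts)
    return [target - c for c in counts]


def build_prompts(class_name, descriptions, n):
    """Cycle through appearance descriptions to produce n text-to-image prompts.

    `descriptions` stands in for LLM output; the template is hypothetical.
    """
    if not descriptions:
        raise ValueError("at least one appearance description is required")
    return [
        f"a photo of a {class_name}, {descriptions[i % len(descriptions)]}"
        for i in range(n)
    ]


if __name__ == "__main__":
    counts = long_tailed_counts(n_max=5000, num_classes=10, imbalance_ratio=100)
    budget = synthesis_budget(counts)
    for cls, (real, synth) in enumerate(zip(counts, budget)):
        print(f"class {cls}: {real} real + {synth} synthesized")
```

Cycling through several descriptions per class, rather than repeating one fixed prompt, is one simple way to approximate the "diverse synthesized images" named in the title.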
