Detailed Information

AI-driven synthetic data generation for accelerating hepatology research: A study of the United Network for Organ Sharing (UNOS) database

Authors
Ahn, Joseph C.; Noh, Yung-Kyun; Hu, Mingzhao; Shen, Xiaotong; Simonetto, Douglas A.; Kamath, Patrick S.; Loomba, Rohit; Shah, Vijay H.
Issue Date
Feb-2026
Publisher
Lippincott Williams and Wilkins
Keywords
artificial intelligence; diffusion models; liver transplantation; privacy-preserving healthcare data; synthetic data
Citation
Hepatology, v.83, no.2, pp 304 - 316
Pages
13
Indexed
SCIE
SCOPUS
Journal Title
Hepatology
Volume
83
Number
2
Start Page
304
End Page
316
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/210465
DOI
10.1097/HEP.0000000000001299
ISSN
0270-9139
1527-3350
Abstract
Background and Aims: Clinical hepatology research often faces limited data availability, underrepresentation of minority groups, and complex data-sharing regulations. Synthetic data, meaning artificially generated patient records designed to mirror real-world distributions, offers a potential solution. We hypothesized that diffusion models, a state-of-the-art generative technique, could produce synthetic liver transplant waitlist data from the United Network for Organ Sharing (UNOS) database that maintains statistical fidelity, replicates clinical correlations and survival patterns, and ensures robust privacy protection.
Methods: Diffusion models were used to generate synthetic patient cohorts mirroring the UNOS liver transplant waitlist database between 2019 and 2023. Statistical fidelity was assessed using Maximum Mean Discrepancy (MMD), Wasserstein distance, correlation analysis, and variable-level metrics. Clinical utility was evaluated by comparing transplant-free survival via Kaplan-Meier curves and by comparing MELD score performance. Privacy was quantified using the Distance to Closest Record (DCR) and attribute disclosure risk assessments.
Results: The synthetic dataset was nearly indistinguishable from the original dataset (MMD=0.002, standardized Wasserstein distance<1.0), preserving clinically relevant correlations and survival patterns as evidenced by similar median survival times (110 vs. 101 days) and 5-year survival rates (22.2% vs. 22.8%). MELD-based 90-day mortality prediction was maintained (original AUC=0.839 vs. synthetic AUC=0.844). Privacy metrics indicated no identifiable patient matches, and mean DCR values ensured that synthetic individuals were not direct replicas of real patients.
Conclusion: AI-generated synthetic data derived from diffusion models can faithfully replicate complex hepatology datasets, maintain key clinical signals, and ensure strong privacy safeguards. This approach can help address data scarcity, enhance model generalizability, foster multi-institutional collaboration, and accelerate progress in hepatology research.
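To make two of the metrics named in the abstract concrete, the following is a minimal Python sketch of a Maximum Mean Discrepancy estimate and a Distance to Closest Record computation. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not state the kernel, bandwidth, feature preprocessing, or distance metric used, so the RBF kernel, the `gamma` value, the Euclidean distance, and the function names `rbf_mmd2` and `distance_to_closest_record` are all hypothetical choices. The toy arrays are random numbers, not UNOS data.

```python
import numpy as np


def rbf_mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y.

    Assumes an RBF kernel k(a, b) = exp(-gamma * ||a - b||^2);
    the kernel and gamma are illustrative choices, not taken from the paper.
    """
    def kernel(a, b):
        # Pairwise squared Euclidean distances via the expansion
        # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
        sq = (
            np.sum(a ** 2, axis=1)[:, None]
            + np.sum(b ** 2, axis=1)[None, :]
            - 2.0 * a @ b.T
        )
        # Clip tiny negative values caused by floating-point round-off.
        return np.exp(-gamma * np.maximum(sq, 0.0))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def distance_to_closest_record(synthetic, real):
    """Euclidean distance from each synthetic row to its nearest real row.

    A value of 0 means the synthetic row exactly copies a real row.
    Euclidean distance on numeric features is an assumption here.
    """
    diffs = synthetic[:, None, :] - real[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=2))
    return dists.min(axis=1)


# Toy usage with random stand-in data (hypothetical, not patient records).
rng = np.random.default_rng(0)
real_toy = rng.normal(size=(200, 3))
synthetic_toy = rng.normal(size=(200, 3))

print("MMD^2 estimate:", rbf_mmd2(real_toy, synthetic_toy))
print("mean DCR:", distance_to_closest_record(synthetic_toy, real_toy).mean())
```

In this sketch a small MMD estimate suggests the two samples have similar distributions, while a DCR that stays above zero for every row suggests no synthetic row duplicates a real one. Real tabular data would also need encoding of categorical variables and feature scaling before either metric is meaningful.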
Appears in Collections
Seoul College of Engineering > Seoul School of Computer Science > 1. Journal Articles

Related Researcher

Noh, Yung Kyun
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)