Detailed Information

Cited 1 time in Web of Science; cited 3 times in Scopus

An efficient framework for semantically-correlated term detection and sanitization in clinical documents

Full metadata record
DC Field | Value | Language
dc.contributor.author | Moqurrab, Syed Atif | -
dc.contributor.author | Anjum, Adeel | -
dc.contributor.author | Tariq, Noshina | -
dc.contributor.author | Srivastava, Gautam | -
dc.date.accessioned | 2023-07-12T08:40:16Z | -
dc.date.available | 2023-07-12T08:40:16Z | -
dc.date.created | 2023-07-12 | -
dc.date.issued | 2022-05-01 | -
dc.identifier.issn | 0045-7906 | -
dc.identifier.uri | https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/88491 | -
dc.description.abstract | In clinical documents, privacy and confidentiality protection are the two main challenges before sharing or publishing data. According to the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR), even a few terms can cause privacy threats. Confidentiality threats, however, have not been fully explored, owing to the complex nature and massive number of clinical terms and phrases. Current approaches use information-theoretic techniques to detect and sanitize risky semantically-correlated terms. However, they suffer from language ambiguity and non-monotonic behavior, and they require pre-trained classifiers and human tagging to construct classifiers. This paper offers a generic and adaptable method for protecting risky terms in clinical data using word embedding (Word2Vec and BERT) for risky term detection and comparative analysis. Our methodology uses the WordNet taxonomy to minimize a document's semantic and utility loss by substituting privacy-preserving generalizations for disclosive words and by eliminating manual data tagging. The results show significant protection and utility preservation compared to information-theoretic approaches. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | -
dc.relation.isPartOf | COMPUTERS & ELECTRICAL ENGINEERING | -
dc.title | An efficient framework for semantically-correlated term detection and sanitization in clinical documents | -
dc.type | Article | -
dc.type.rims | ART | -
dc.description.journalClass | 1 | -
dc.identifier.wosid | 000798029100003 | -
dc.identifier.doi | 10.1016/j.compeleceng.2022.107985 | -
dc.identifier.bibliographicCitation | COMPUTERS & ELECTRICAL ENGINEERING, v.100 | -
dc.description.isOpenAccess | N | -
dc.identifier.scopusid | 2-s2.0-85129306146 | -
dc.citation.title | COMPUTERS & ELECTRICAL ENGINEERING | -
dc.citation.volume | 100 | -
dc.contributor.affiliatedAuthor | Moqurrab, Syed Atif | -
dc.type.docType | Article | -
dc.subject.keywordAuthor | Machine learning | -
dc.subject.keywordAuthor | Data privacy | -
dc.subject.keywordAuthor | Unsupervised learning | -
dc.subject.keywordAuthor | Semantically-correlated terms | -
dc.subject.keywordAuthor | Detection | -
dc.subject.keywordAuthor | Sanitization | -
dc.subject.keywordAuthor | Utility-preservation | -
dc.subject.keywordAuthor | Clinical documents | -
dc.subject.keywordAuthor | Clinical data privacy | -
dc.subject.keywordAuthor | Word embedding | -
dc.subject.keywordPlus | PRIVACY | -
dc.subject.keywordPlus | PROTECTION | -
dc.subject.keywordPlus | MODEL | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture | -
dc.relation.journalWebOfScienceCategory | Computer Science, Interdisciplinary Applications | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
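
Illustrative sketch (not from the paper)

The abstract names two steps: detecting terms that are semantically correlated with a sensitive term through word embeddings, then replacing the flagged terms with more general ones drawn from a taxonomy. The toy Python sketch below only illustrates that general idea. The three-dimensional vectors, the one-level hypernym table, the 0.8 threshold, and the function names are all hypothetical stand-ins; the record does not describe the authors' actual models, thresholds, or generalization rules, and a real system would load Word2Vec or BERT vectors and query WordNet instead.

```python
import math

# Hypothetical toy "embeddings"; stand-ins for Word2Vec or BERT vectors.
TOY_VECTORS = {
    "hiv": [0.9, 0.1, 0.0],
    "antiretroviral": [0.8, 0.3, 0.1],
    "fracture": [0.1, 0.9, 0.2],
    "appointment": [0.0, 0.2, 0.9],
}

# Hypothetical one-level taxonomy; a stand-in for WordNet hypernyms.
TOY_HYPERNYMS = {
    "hiv": "infection",
    "antiretroviral": "medication",
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def correlated_terms(sensitive, tokens, threshold=0.8):
    """Return the tokens whose vector is close to the sensitive term's vector."""
    target = TOY_VECTORS[sensitive]
    return {
        t for t in tokens
        if t in TOY_VECTORS and cosine(TOY_VECTORS[t], target) >= threshold
    }


def sanitize(text, sensitive, threshold=0.8):
    """Replace flagged tokens with a more general term, or redact if none is known."""
    tokens = text.lower().split()
    risky = correlated_terms(sensitive, tokens, threshold)
    return " ".join(
        TOY_HYPERNYMS.get(t, "[REDACTED]") if t in risky else t
        for t in tokens
    )


print(sanitize("Patient takes antiretroviral after fracture", "hiv"))
```

Here "antiretroviral" is flagged because its toy vector lies close to that of "hiv", and it is generalized rather than deleted, which is the utility-preserving behavior the abstract describes; "fracture" is left untouched because its vector is far from the sensitive term.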
Files in This Item
There are no files associated with this item.
Appears in Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Moqurrab, Syed Atif
College of IT Convergence (Department of Software)