Incremental Semi-Supervised Clustering Ensemble for High Dimensional Data Clustering

Yu, Zhiwen; Luo, Peinan; You, Jane; Wong, Hau-San; Leung, Hareton; Wu, Si; Zhang, Jun; Han, Guoqiang

doi:10.1109/TKDE.2015.2499200

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yu, Zhiwen	-
dc.contributor.author	Luo, Peinan	-
dc.contributor.author	You, Jane	-
dc.contributor.author	Wong, Hau-San	-
dc.contributor.author	Leung, Hareton	-
dc.contributor.author	Wu, Si	-
dc.contributor.author	Zhang, Jun	-
dc.contributor.author	Han, Guoqiang	-
dc.date.accessioned	2024-04-09T03:01:46Z	-
dc.date.available	2024-04-09T03:01:46Z	-
dc.date.issued	2016-03	-
dc.identifier.issn	1041-4347	-
dc.identifier.issn	1558-2191	-
dc.identifier.uri	https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/118558	-
dc.description.abstract	Traditional cluster ensemble approaches have three limitations: (1) They do not make use of prior knowledge of the datasets given by experts. (2) Most of the conventional cluster ensemble methods cannot obtain satisfactory results when handling high dimensional data. (3) All the ensemble members are considered, even the ones without positive contributions. In order to address the limitations of conventional cluster ensemble approaches, we first propose an incremental semi-supervised clustering ensemble framework (ISSCE) which makes use of the advantage of the random subspace technique, the constraint propagation approach, the proposed incremental ensemble member selection process, and the normalized cut algorithm to perform high dimensional data clustering. The random subspace technique is effective for handling high dimensional data, while the constraint propagation approach is useful for incorporating prior knowledge. The incremental ensemble member selection process is newly designed to judiciously remove redundant ensemble members based on a newly proposed local cost function and a global cost function, and the normalized cut algorithm is adopted to serve as the consensus function for providing more stable, robust, and accurate results. Then, a measure is proposed to quantify the similarity between two sets of attributes, and is used for computing the local cost function in ISSCE. Next, we analyze the time complexity of ISSCE theoretically. Finally, a set of nonparametric tests are adopted to compare multiple semi-supervised clustering ensemble approaches over different datasets. The experiments on 18 real-world datasets, which include six UCI datasets and 12 cancer gene expression profiles, confirm that ISSCE works well on datasets with very high dimensionality, and outperforms the state-of-the-art semi-supervised clustering ensemble approaches.	-
dc.format.extent	14	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Institute of Electrical and Electronics Engineers	-
dc.title	Incremental Semi-Supervised Clustering Ensemble for High Dimensional Data Clustering	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1109/TKDE.2015.2499200	-
dc.identifier.scopusid	2-s2.0-84962429330	-
dc.identifier.wosid	000370755300008	-
dc.identifier.bibliographicCitation	IEEE Transactions on Knowledge and Data Engineering, v.28, no.3, pp. 701-714	-
dc.citation.title	IEEE Transactions on Knowledge and Data Engineering	-
dc.citation.volume	28	-
dc.citation.number	3	-
dc.citation.startPage	701	-
dc.citation.endPage	714	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	sci	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	CLASS DISCOVERY	-
dc.subject.keywordPlus	CONSENSUS	-
dc.subject.keywordPlus	FRAMEWORK	-
dc.subject.keywordAuthor	Cluster ensemble	-
dc.subject.keywordAuthor	semi-supervised clustering	-
dc.subject.keywordAuthor	random subspace	-
dc.subject.keywordAuthor	cancer gene expression profile	-
dc.subject.keywordAuthor	clustering analysis	-
dc.identifier.url	https://ieeexplore.ieee.org/document/7323847	-
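
The abstract in the record above mentions two ideas that a small example can make concrete: drawing random attribute subspaces for the ensemble members, and measuring how similar two attribute sets are so that redundant members can be dropped. The sketch below is only an illustration under stated assumptions, not the paper's method. It uses the Jaccard index as a stand-in similarity measure and a simple greedy threshold rule in place of the local and global cost functions, which the abstract does not define. All function names and parameters (`random_subspaces`, `jaccard`, `select_members`, `max_similarity`) are hypothetical.

```python
import random


def random_subspaces(n_attributes, n_members, subspace_size, seed=0):
    """Draw one random attribute subset per ensemble member."""
    rng = random.Random(seed)
    return [
        frozenset(rng.sample(range(n_attributes), subspace_size))
        for _ in range(n_members)
    ]


def jaccard(a, b):
    """Jaccard index of two attribute sets (assumed stand-in measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_members(subspaces, max_similarity):
    """Greedy incremental selection (assumed rule, not the paper's cost functions).

    A candidate is kept only if its attribute set is not too similar
    to any member that has already been kept.
    """
    selected = []
    for candidate in subspaces:
        if all(jaccard(candidate, kept) <= max_similarity for kept in selected):
            selected.append(candidate)
    return selected


subspaces = random_subspaces(n_attributes=1000, n_members=20, subspace_size=100)
selected = select_members(subspaces, max_similarity=0.1)
print(f"kept {len(selected)} of {len(subspaces)} candidate subspaces")
```

In a full pipeline each kept subspace would be clustered separately and the resulting partitions combined by a consensus function; the abstract names the normalized cut algorithm for that step, which this sketch does not implement.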

Appears in Collections: COLLEGE OF ENGINEERING SCIENCES > SCHOOL OF ELECTRICAL ENGINEERING > 1. Journal Articles

Related Researcher

ZHANG, Jun: ERICA College of Engineering (SCHOOL OF ELECTRICAL ENGINEERING)
