Spectral clustering based on the local similarity measure of shared neighbors

Cao, Zongqi; Chen, Hongjia; Wang, Xiang

doi:10.4218/etrij.2021-0230

ETRI Journal

Volume 44, Issue 5 / Pages 769-779 / 2022 / pISSN 1225-6463 / eISSN 2233-7326

Electronics and Telecommunications Research Institute (한국전자통신연구원)

Cao, Zongqi (Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University);
Chen, Hongjia (Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University);
Wang, Xiang (Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University)

Received : 2021.07.23
Accepted : 2022.02.05
Published : 2022.10.10

https://doi.org/10.4218/etrij.2021-0230

Abstract

Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
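
The pipeline outlined in the abstract (directed k-nearest neighbor graph, shared-neighbor similarity, spectral clustering) can be illustrated with a minimal NumPy sketch. The abstract does not state the paper's weighting of shared neighbors, so the similarity below is an assumed stand-in: the fraction of the k neighbors that two points have in common. The function names, the normalized Laplacian variant, and the farthest-first k-means initialization are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def shared_neighbor_similarity(X, k):
    """ASSUMED weighting (not the paper's formula): W[i, j] = |kNN(i) & kNN(j)| / k."""
    n = X.shape[0]
    A = np.zeros((n, n))
    # A[i, m] = 1 if m is one of the k nearest neighbors of i (directed k-NN graph).
    A[np.repeat(np.arange(n), k), knn_indices(X, k).ravel()] = 1.0
    W = (A @ A.T) / k  # (A @ A.T)[i, j] counts neighbors shared by i and j
    np.fill_diagonal(W, 0.0)
    return W


def spectral_embedding(W, n_clusters):
    """Row-normalized eigenvectors of the symmetric normalized Laplacian."""
    d = np.maximum(W.sum(axis=1), 1e-12)  # guard against isolated points
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, :n_clusters]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)


def kmeans(Y, n_clusters, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-first initialization."""
    centers = [Y[0]]
    for _ in range(1, n_clusters):
        dist = np.min([((Y - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(d2, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # keep the old center if a cluster is empty
                centers[c] = Y[labels == c].mean(axis=0)
    return labels


def snn_spectral_clustering(X, n_clusters, k):
    """Cluster the rows of X; k (number of nearest neighbors) is the only graph parameter."""
    W = shared_neighbor_similarity(X, k)
    return kmeans(spectral_embedding(W, n_clusters), n_clusters)
```

Calling `snn_spectral_clustering(X, n_clusters=2, k=8)` on an array `X` of shape `(n_samples, n_features)` returns one integer label per row. Because the similarity depends only on neighbor-set overlap, no kernel bandwidth has to be tuned, which reflects the single-parameter design the abstract emphasizes.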

Acknowledgement

The authors thank the anonymous reviewers for their very helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China under Grant Nos. 11961048, 12001262, and 11801258, and by the Jiangxi Provincial Natural Science Foundation under Grant No. 20181ACB20001.
