Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods
Year of publication
2022
Authors
Niemelä, Marko; Kärkkäinen, Tommi
Abstract
Missing data poses a challenge in unsupervised learning. In clustering, when the form and the number of clusters are to be determined, the missing values must be handled both in the clustering process and in cluster validation. In previous research, the clustering step has been handled with robust clustering methods and the available data strategy, and the cluster validation indices have been computed with the partial distance approximation. Recently, however, dedicated methods for distance estimation with missing values have been proposed, and this work is the first in which these methods are systematically applied and tested in clustering and cluster validation. More precisely, we propose, implement, and analyze the use of distance estimation methods to improve the discrimination power of clustering and cluster validation indices. A novel, robust, prototype-based clustering process in two stages is suggested. Our results confirm the usefulness of the distance estimation methods in clustering but, surprisingly, not in cluster validation.
Organizations and authors
Publication type
Publication format
Article
Parent publication type
Compilation
Article type
Other article
Audience
Scientific
Peer-reviewed
Peer-Reviewed
MINEDU's publication type classification code
A3 Book section, Chapters in research books
Publication channel information
Parent publication name
Parent publication editors
Tuovinen, Tero T.; Periaux, Jacques; Neittaanmäki, Pekka
Publisher
Pages
123-133
ISSN
ISBN
Publication forum
Publication forum level
2
Open access
Open access in the publisher’s service
No
Self-archived
Yes
Other information
Fields of science
Mathematics; Computer and information sciences
Publication country
Switzerland
Internationality of the publisher
International
Language
English
International co-publication
No
Co-publication with a company
No
DOI
10.1007/978-3-030-70787-3_9
The publication is included in the Ministry of Education and Culture’s Publication data collection
Yes