Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods

Year of publication

2022

Authors

Niemelä, Marko; Kärkkäinen, Tommi

Abstract

Missing data poses a challenge in unsupervised learning. In clustering, when the form and the number of clusters are to be determined, one needs to deal with missing values both in the clustering process and in cluster validation. In previous research, clustering has been handled using robust clustering methods and the available data strategy, and the cluster validation indices have been computed with the partial distance approximation. However, special methods for distance estimation with missing values have lately been proposed, and this work is the first in which these methods are systematically applied and tested in clustering and cluster validation. More precisely, we propose, implement, and analyze the use of distance estimation methods to improve the discrimination power of clustering and cluster validation indices. A novel, robust, prototype-based clustering process in two stages is suggested. Our results and conclusions confirm the usefulness of the distance estimation methods in clustering but, surprisingly, not in cluster validation.
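
The abstract names the partial distance approximation without defining it. As a rough illustration only, the sketch below uses the commonly cited formulation: the squared Euclidean distance is summed over the components observed in both vectors and rescaled by the ratio of the full dimension to the number of jointly observed components. The function name `partial_distance` and the use of NaN to mark missing values are assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np


def partial_distance(x, y):
    """Partial-distance estimate of the Euclidean distance between x and y.

    Assumption for this sketch: missing values are encoded as NaN.
    Only components observed in BOTH vectors contribute, and the squared
    sum is rescaled by (full dimension / number of jointly observed components).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = ~np.isnan(x) & ~np.isnan(y)   # jointly observed components
    n_obs = int(observed.sum())
    if n_obs == 0:
        return float("nan")                  # no shared information at all
    sq_sum = np.sum((x[observed] - y[observed]) ** 2)
    return float(np.sqrt(x.size / n_obs * sq_sum))


a = [1.0, np.nan, 3.0]
b = [1.0, 2.0, 5.0]
d = partial_distance(a, b)  # uses components 0 and 2, rescaled by 3/2
```

With complete vectors the scaling factor is 1, so the function reduces to the ordinary Euclidean distance.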

Organizations and authors

University of Jyväskylä

Niemelä Marko

Kärkkäinen Tommi

Publication type

Publication format

Article

Parent publication type

Compendium

Article type

Other article

Audience

Scientific

Peer-reviewed

Peer-Reviewed

MINEDU's publication type classification code

A3 Book section, Chapters in research books

Publication channel information

Parent publication editors

Tuovinen, Tero T.; Periaux, Jacques; Neittaanmäki, Pekka

Publisher

Springer

Pages

123-133

Publication forum

5952

Publication forum level

2

Open access

Open access in the publisher’s service

No

Self-archived

Yes

Other information

Fields of science

Mathematics; Computer and information sciences

Publication country

Switzerland

Internationality of the publisher

International

Language

English

International co-publication

No

Co-publication with a company

No

DOI

10.1007/978-3-030-70787-3_9

The publication is included in the Ministry of Education and Culture’s Publication data collection

Yes