QA275 : Clustering with the classification method of Random Forests
Thesis > Central Library of Shahrood University > Mathematical Sciences > MSc > 2014
Authors:
Zohreh Farhadi [Author], Davood Shahsavani [Supervisor]
Abstract: Clustering, which is widely used in various fields of science, is one of the important tools in data mining. Many clustering methods are based on the dissimilarity between observations, which is calculated by a distance function. Advances in science and technology have produced large data sets with a very large number of underlying variables. Because the performance of distance functions is adversely affected as the dimension increases, identifying clusters in high-dimensional space remains a challenge for researchers. In this thesis, a new kind of dissimilarity measure based on the classification method of Random Forests is investigated to provide a new basis for any clustering method. The multidimensional scaling method is then combined with the partitioning around medoids (PAM) clustering algorithm and applied to simulated and real examples. The obtained results confirm the superiority of this approach in comparison with common methods.
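The pipeline outlined in the abstract (Random Forest dissimilarity, then multidimensional scaling, then medoid-based clustering) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis's implementation: the abstract does not say how the dissimilarity is built, so the code assumes the common construction in which a forest is trained to separate the observed data from a column-permuted synthetic copy, and the dissimilarity is sqrt(1 - proximity). Classical (Torgerson) MDS and a simplified alternating k-medoids stand in for whatever MDS variant and PAM implementation the thesis used. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_dissimilarity(X, n_trees=100, seed=0):
    """Random-forest dissimilarity (assumed real-vs-synthetic construction)."""
    rng = np.random.default_rng(seed)
    # Synthetic copy: permute each column independently, which keeps the
    # marginals but breaks the dependence between variables.
    X_synth = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
    data = np.vstack([X, X_synth])
    labels = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(data, labels)
    leaves = forest.apply(X)  # shape (n_samples, n_trees): leaf index per tree
    # Proximity: fraction of trees in which two observations share a leaf.
    prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
    return np.sqrt(np.clip(1.0 - prox, 0.0, None))


def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS from a dissimilarity matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]   # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def k_medoids(D, k, n_iter=100, seed=0):
    """Simplified alternating k-medoids (not the full PAM swap search)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        assign = np.argmin(D[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(assign == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new[c] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return np.argmin(D[:, medoids], axis=1), medoids


# Toy data (hypothetical): two Gaussian groups in five dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (30, 5)), rng.normal(4.0, 1.0, (30, 5))])

D = rf_dissimilarity(X)                   # forest-based dissimilarity
coords = classical_mds(D, 2)              # low-dimensional embedding
E = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
labels, medoids = k_medoids(E, k=2)       # cluster the embedded points
```

Because the clustering step only needs a dissimilarity matrix, `k_medoids` could also be applied to `D` directly; running it on the MDS coordinates follows the combination described in the abstract.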
Keywords:
#Clustering #Random forest #Variables importance #Partition around medoid #Multidimensional scaling
Keeping place: Central Library of Shahrood University