QA275 : Clustering with the classification method of Random Forests
Thesis > Central Library of Shahrood University > Mathematical Sciences > MSc > 2014
Authors:
Abstract: Clustering, which is widely used in various fields of science, is one of the most
important tools in data mining. Many clustering methods are based on the dissimilarity
between observations, which is calculated by a distance function. Advances in science and
technology have produced large data sets with a very large number of underlying
variables. Because the performance of distance functions degrades as the dimension increases,
identifying clusters in a high-dimensional space is a challenge
for researchers. In this thesis, a new kind of dissimilarity measure based on the classification
method of Random Forests is investigated to provide a new basis for any clustering
method. Thereafter, multidimensional scaling is combined with the Partitioning
Around Medoids (PAM) clustering algorithm to handle several simulated and real examples. The
results obtained confirm the superiority of this approach over common
methods.
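The abstract does not spell out how the Random Forest dissimilarity is computed, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used forest proximity (the fraction of trees in which two observations fall in the same terminal node) converted to a dissimilarity as sqrt(1 - proximity), followed by a simple swap-based medoid search in the spirit of PAM. The function names `rf_dissimilarity` and `pam`, the naive choice of starting medoids, and the toy `leaves` table are hypothetical and are not taken from the thesis. The multidimensional scaling step is omitted, and the leaf assignments are hard-coded rather than produced by a fitted forest.

```python
import math
from itertools import combinations


def rf_dissimilarity(leaves):
    """Turn leaf assignments into a dissimilarity matrix.

    leaves[i][t] is the id of the terminal node that observation i
    reaches in tree t (assumed to come from an already fitted forest).
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Proximity: share of trees where i and j land in the same leaf.
        same = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
        proximity = same / n_trees
        d[i][j] = d[j][i] = math.sqrt(1.0 - proximity)
    return d


def total_cost(d, medoids):
    """Sum of each observation's dissimilarity to its nearest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in range(len(d)))


def pam(d, k):
    """Swap-based medoid search on a precomputed dissimilarity matrix.

    Assumption: starts from the first k observations instead of the
    BUILD phase of the original PAM algorithm.
    """
    n = len(d)
    medoids = list(range(k))
    best = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid observation.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(d, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Assign every observation to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
    return medoids, labels


# Hypothetical leaf ids for 6 observations in a 4-tree forest:
# observations 0-2 tend to share leaves, as do observations 3-5.
leaves = [
    [1, 1, 2, 1],
    [1, 1, 2, 3],
    [1, 2, 2, 1],
    [4, 5, 6, 7],
    [4, 5, 6, 8],
    [4, 5, 9, 7],
]

d = rf_dissimilarity(leaves)
medoids, labels = pam(d, 2)
print("medoids:", medoids)
print("labels:", labels)
```

Because the clustering step only needs the dissimilarity matrix, the same `d` could instead be passed through multidimensional scaling first, as the abstract describes, and the resulting coordinates clustered in the low-dimensional space.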
Keywords:
#Clustering #Random forest #Variable importance #Partitioning around medoids #Multidimensional scaling
Keeping place: Central Library of Shahrood University