Q168: Distributed Classification with Stochastic Gradient Descent
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2019
Authors:
Behnaz Sadat Mirhadi Tafreshi [Author], Hoda Mashayekhi [Supervisor], Mohsen Biglari [Advisor]
Abstract: Classification algorithms today must process huge datasets. Because classifying big data involves a very large amount of computation, a high-speed system is important for this kind of algorithm. In addition, various optimization methods are used to optimize the objective function of classification algorithms. Stochastic Gradient Descent (SGD) is one of the most popular optimization algorithms and by far the most common way to optimize classification problems. It is an iterative method for optimizing an objective function with suitable smoothness properties and can be regarded as a stochastic approximation of gradient descent optimization. It is also one of the most important algorithms used to optimize parameters and reduce classification error in discrimination-based methods. However, the extra computation, the challenge of choosing an appropriate learning rate, and the large fluctuations when approaching the minimum of the objective function can cause high variance, resulting in slow convergence.

Therefore, to increase the convergence rate and reduce the variance of the algorithm, the Stochastic Recursive Gradient Algorithm (SARAH) was proposed as a novel approach to finite-sum minimization problems. In this method, a recursive update rule computes the gradient estimate of each iteration with reduced variance while the parameters of the classification model are updated. However, because of its intrinsic design, SARAH does not support distributed systems trivially and can run on only a single computational node; therefore, when dealing with big data, this optimization method slows the classification algorithm down.

In this study, to speed up big-data classification, a method called Distributed Classification with Stochastic Gradient Descent is proposed. In this method, a logistic regression classification problem is solved with a Stochastic Gradient Descent optimization algorithm in a distributed system. The algorithm is implemented in a parameter server architecture, where the parameters of the classification problem are updated in a cluster by a server node and worker nodes: each worker pulls a copy of the global parameter by sending a request to the server, updates the parameters locally, and eventually pushes the updated parameters back to the server. The server node receives and aggregates these parameters and stores the result as the global parameter. The proposed algorithm is implemented in the distributed Spark environment; Spark is a general-purpose distributed cluster-computing framework that can be used for a variety of big-data applications, especially where speed of operation is of particular importance. Comparing the distributed method with the centralized method shows that, for all four training data sets, as worker nodes are added per run the algorithm executes almost twice as fast as the centralized method while keeping the same convergence rate. (Minimal illustrative sketches of the SGD update, the SARAH estimator, and the parameter server round follow this abstract.)
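For concreteness, the following is a minimal Python/NumPy sketch of logistic regression trained with plain SGD, as described above. It is illustrative only, not the thesis code; the function name sgd_logistic, the learning rate, and the epoch count are assumptions.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sgd_logistic(X, y, lr=0.1, epochs=10, seed=0):
        # Plain SGD: one training example per parameter update.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            for i in rng.permutation(n):
                # Gradient of the log-loss on the single example (X[i], y[i]).
                grad = (sigmoid(X[i] @ w) - y[i]) * X[i]
                w -= lr * grad
        return w

Each step uses a single-example gradient, which is cheap but noisy; that noise is the variance the abstract refers to.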
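The SARAH estimator mentioned above replaces the raw stochastic gradient with a recursively updated one, v_t = g_i(w_t) - g_i(w_{t-1}) + v_{t-1}, anchored by a full gradient at the start of each outer pass. Below is a hedged sketch under the same log-loss objective; all names and hyperparameters are illustrative assumptions, not the thesis implementation.

    import numpy as np

    def sarah_logistic(X, y, lr=0.05, outer=5, inner=100, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)

        def grad(wv, idx):
            # Average log-loss gradient over the examples indexed by idx.
            p = 1.0 / (1.0 + np.exp(-X[idx] @ wv))
            return X[idx].T @ (p - y[idx]) / len(idx)

        for _ in range(outer):
            v = grad(w, np.arange(n))          # full gradient anchors the pass
            w_prev, w = w, w - lr * v
            for _ in range(inner):
                i = np.array([rng.integers(n)])
                # Recursive update: v_t = g_i(w_t) - g_i(w_{t-1}) + v_{t-1}
                v = grad(w, i) - grad(w_prev, i) + v
                w_prev, w = w, w - lr * v
        return w

Because every v_t is built from v_{t-1}, the estimates are correlated and their variance shrinks, which speeds up convergence; but the same sequential dependence is the intrinsic design that makes naive distribution across nodes hard, as the abstract notes.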
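The pull/update/push round of the parameter server design described above can be approximated in PySpark with a broadcast variable (the pull) and a driver-side aggregation (the server's role). This is a simplified synchronous sketch under assumed synthetic data, dimensions, and round counts, not the author's implementation.

    import numpy as np
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("DistributedSGD").getOrCreate()
    sc = spark.sparkContext

    # Synthetic stand-in data; the thesis uses four real training sets.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = (X[:, 0] > 0).astype(float)
    points = sc.parallelize(list(zip(X, y)), numSlices=4)  # one partition per "worker"

    def local_sgd(partition, w, lr):
        # Worker role: start from the pulled copy of w, run SGD locally.
        w = w.copy()
        for x, yi in partition:
            p = 1.0 / (1.0 + np.exp(-x @ w))
            w -= lr * (p - yi) * x
        yield w

    w_global = np.zeros(10)              # server role: the global parameter
    for _ in range(20):                  # synchronous pull/update/push rounds
        wb = sc.broadcast(w_global)      # each worker "pulls" a copy
        pushed = points.mapPartitions(
            lambda part: local_sgd(part, wb.value, 0.1)).collect()
        # Server role: aggregate the pushed copies into the global parameter.
        w_global = np.mean(pushed, axis=0)

Averaging the worker copies is one common aggregation choice; the abstract says only that the server aggregates the pushed parameters, so the exact rule used in the thesis may differ.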
Keywords:
#Big Data #Classification #Logistic Regression #Gradient #Gradient Descent #Variance Reduction #Distributed Systems #Apache Spark
Keeping place: Central Library of Shahrood University