Q254 : Multi-label classification considering label correlations
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2023
Authors:
Mohaddese Sadeghi [Author], Hoda Mashayekhi [Supervisor]
Abstract: Multi-label classification (MLLC) is a field of machine learning in which each instance is assigned a set of labels instead of a single one. Many real-world applications, such as medical diagnosis and text classification, involve multi-label classification, where accounting for label correlation is essential for good classification performance. How to learn and use the dependencies between labels has therefore become one of the key issues in multi-label classification. To exploit the correlation between labels, this research proposes two features, temporal correlations and emotion-label correlations, which capture the correlation of multiple emotions from the user-level view in order to build an effective learning model. To this end, we address the detection of multiple emotions in online social networks from the user-level view and formulate it as a multi-label classification problem. First, we extract the two features, temporal correlations and correlations between user-level emotion labels, from the GoEmotions dataset, which includes 27 emotions and one "neutral" label, and from two further versions of the same dataset: one for the Ekman space, which includes 7 labels and one "neutral" label, and one for the sentiment-grouped space, which contains 3 labels and one "neutral" label. We then add these correlation-related features to the previous feature set. Finally, we investigate learning from the data with the three label sets, of 28, 8, and 4 emotions, using methods that transform the multi-label problem into several single-label classification problems. We use the Macro-F1 evaluation criterion to compare the performance of all models. Among all models, the best result was obtained by the four-label version, with F1-score = 55.08.
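The abstract names two ideas that a small example can make concrete: transforming a multi-label problem into several single-label problems, and scoring with Macro-F1. The sketch below is illustrative only and is not the thesis implementation. It assumes the simplest transformation (one binary target per label, often called binary relevance), uses hypothetical helper names, takes the four sentiment-grouped label names and the toy data as made-up examples, and adopts the convention that a label with no true and no predicted positives scores F1 = 0.

```python
from typing import Dict, List, Sequence, Set


def binary_relevance_targets(
    label_sets: Sequence[Set[str]], labels: Sequence[str]
) -> Dict[str, List[int]]:
    """Split multi-label data into one 0/1 target vector per label."""
    return {
        label: [1 if label in label_set else 0 for label_set in label_sets]
        for label in labels
    }


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one binary problem: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # Assumed convention: no positives at all for this label gives 0.0.
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(
    true_sets: Sequence[Set[str]],
    pred_sets: Sequence[Set[str]],
    labels: Sequence[str],
) -> float:
    """Unweighted mean of the per-label F1 scores."""
    true_cols = binary_relevance_targets(true_sets, labels)
    pred_cols = binary_relevance_targets(pred_sets, labels)
    scores = [f1_binary(true_cols[label], pred_cols[label]) for label in labels]
    return sum(scores) / len(scores)


# Toy data (made up for illustration): four instances, four labels.
LABELS = ["positive", "negative", "ambiguous", "neutral"]
TRUE_SETS = [{"positive"}, {"negative", "ambiguous"}, {"neutral"}, {"positive", "ambiguous"}]
PRED_SETS = [{"positive"}, {"negative"}, {"neutral"}, {"positive", "negative"}]

if __name__ == "__main__":
    print(f"Macro-F1 = {macro_f1(TRUE_SETS, PRED_SETS, LABELS):.4f}")
```

Because Macro-F1 averages over labels without weighting by frequency, a rare label that is never predicted (here "ambiguous") pulls the score down as much as a frequent one would.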
Keywords:
#Data Mining #Multi-label classification #Label correlations #Problem transformation #GoEmotions
Keeping place: Central Library of Shahrood University