Q127 : A Statistical Model for Evaluation of Interactive Question Answering Systems
Thesis > Central Library of Shahrood University > Computer Engineering > PhD > 2018
Authors:
Mohammad Mehdi Hosseini [Author], Morteza Zahedi [Supervisor], Prof. Hamid Hassanpour [Advisor]
Abstract: A question answering (QA) system is an automated system that returns correct answers to questions posed by humans in natural language. In such systems, when an answer is returned but is not what the user expected, or the user needs more information, there is no way for the system and the user to exchange information so that follow-up questions can be asked and answered. Interactive question answering (IQA) systems were created to solve this problem. Evaluation plays an important role in designing IQA systems. However, there is not yet a specific, general method for evaluating them; the only options are to borrow evaluation methods from QA systems and dialog-based systems, or to rely on human assessors. Providing a proper mechanism for evaluating IQA systems can therefore contribute significantly to improving them. Replacing the human assessor with a model is challenging, because the model's output must predict the score the assessor would give. In this thesis, an attempt was made to determine the appropriate parameters of a statistical model for evaluating IQA systems, so that the model can be used in the assessment process. The model was intended to be independent of the language of the interactive system. To obtain the most appropriate model, several statistical features were extracted, and then regression and gene expression programming were used to build the model. First, a database of conversations with four IQA systems was collected. Then features were extracted from each conversation, and finally regression was used to derive the model. Because of the large number of candidate features, and to prevent over-fitting, the best features were selected using the REF method, and the proposed model was built on the remaining features.
The best model was determined by Lasso and power series regression according to the root mean square error. Next, gene expression programming was used to obtain a more appropriate model. In the first step, a regression equation was formed to predict the assessors' scores based on all conversations. Then the conversations were grouped into three classes with good, moderate, and poor scores, and a regression equation was obtained for each class. For a new conversation, after feature extraction and allocation to a class, the score was therefore calculated with that class's regression model. According to the evaluation criteria, the average error of the model's output relative to the actual output was 0.09, which indicated that the proposed model is appropriate for evaluating these systems. The study showed that, given a standard set of conversations between users and the system, the model can predict the score of an IQA system when a human assessor is absent.
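The class-wise regression step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a single numeric feature per conversation, uses ordinary least squares as a stand-in for the Lasso, power series, and gene expression programming models, and invents the score thresholds and the nearest-centroid rule for allocating a new conversation to a class (the abstract does not say how allocation is done). All function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def rmse(actual, predicted):
    """Root mean square error between actual and predicted scores."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def score_class(score, low=0.4, high=0.7):
    """Map an assessor score to a class. Thresholds are assumptions."""
    if score < low:
        return "poor"
    if score < high:
        return "moderate"
    return "good"


def fit_per_class(features, scores):
    """Group conversations by score class and fit one line per class."""
    groups = {}
    for x, y in zip(features, scores):
        groups.setdefault(score_class(y), []).append((x, y))
    models = {}
    for label, pairs in groups.items():
        xs = [p[0] for p in pairs]
        ys = [p[1] for p in pairs]
        intercept, slope = fit_line(xs, ys)
        models[label] = {
            "intercept": intercept,
            "slope": slope,
            # Mean feature value of the class, used only for allocation.
            "centroid": sum(xs) / len(xs),
        }
    return models


def predict(models, x):
    """Allocate x to the class with the nearest centroid (an assumed rule),
    then score it with that class's regression line."""
    label = min(models, key=lambda k: abs(models[k]["centroid"] - x))
    model = models[label]
    return label, model["intercept"] + model["slope"] * x
```

A call such as `predict(fit_per_class(features, scores), x_new)` returns the allocated class and the predicted score for a new conversation; `rmse` can then compare predictions against held-out assessor scores, which is the criterion the abstract names for choosing between models.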
Keywords:
#Interactive Question Answering #Evaluation of Systems #Question Answering System #Regression #Gene expression programming
Keeping place: Central Library of Shahrood University