Word Sense Disambiguation of English Modal Verb Will: a Comparative Study between ANFIS and FCM
|Course||Foreign Linguistics and Applied Linguistics|
|Keywords||Adaptive Network-based Fuzzy Inference System; Fuzzy C-means Clustering; linguistic features; English modal verb will; word sense disambiguation; corpus|
Fuzziness, which is one of the inherent and defining features of natural language, can be found in all aspects of human life. In 1965, L. A. Zadeh first put forward the concept of fuzzy sets. The formalization and mathematization of the concept brought great changes to the study of linguistic fuzziness, leading to a new branch of linguistics: fuzzy linguistics. Since then, linguistic fuzziness has been approached with mathematical methods. Studies abroad have mainly focused on mathematical-logic calculation and experimental statistical measurement, while Chinese researchers have mainly adopted qualitative methods. However, few studies so far have integrated advanced computational techniques with Fuzzy Sets Theory, either to simulate how the human brain handles fuzzy sense inference or to compute the distribution of different fuzzy senses in order to determine them; far fewer have compared the two computational methods with each other.
English modal verbs, which play an essential role in human communication, form a quite complicated system. The primary focus of this paper is to establish an adaptive neuro-fuzzy inference system (ANFIS) to infer the fuzzy root meanings of the English modal verb will. A secondary purpose is to use a fuzzy c-means (FCM) clustering algorithm to determine the fuzzy root meanings of will. The two research methods are then compared and the comparative results are analyzed.
First, a corpus of 1.2 million words is established, and the root meanings of the target word will are tagged manually. The software Wconcord is then used to compile statistics by quantifying selected linguistic features. The Mutual Information of the subject and will and the Mutual Information of will and the main verb are calculated from these statistics.
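The Mutual Information step can be illustrated with a minimal Python sketch. It assumes the pointwise form common in corpus linguistics, MI(x, y) = log2(P(x, y) / (P(x) P(y))), estimated from raw frequency counts; the abstract does not state which formula was used, and the function name and the counts below are hypothetical, not figures from the study.

```python
import math


def pointwise_mutual_information(pair_count, x_count, y_count, corpus_size):
    """Return log2(P(x, y) / (P(x) * P(y))) estimated from frequency counts.

    pair_count  -- how often x and y co-occur (e.g. a given subject with "will")
    x_count     -- how often x occurs in the corpus
    y_count     -- how often y occurs in the corpus
    corpus_size -- total number of tokens in the corpus
    """
    if min(pair_count, x_count, y_count, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = pair_count / corpus_size
    p_x = x_count / corpus_size
    p_y = y_count / corpus_size
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical counts, for illustration only (not taken from the thesis corpus).
example_score = pointwise_mutual_information(
    pair_count=8, x_count=16, y_count=32, corpus_size=1024
)
print(f"MI = {example_score:.2f} bits")
```

A positive score means the two items co-occur more often than chance would predict, a score near zero suggests independence, and a negative score means they co-occur less often than expected.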
Second, a group of syntactic, semantic and contextual features that potentially influence the sense of will are selected and transformed into the binary logical values “0” or “1”. Optimal parameters of the ANFIS model are obtained through a number of experiments in the Fuzzy Logic Toolbox of Matlab, and the average testing accuracy of the ANFIS model reaches 90%. Moreover, the contribution of the different syntactic, semantic and contextual features to the fuzzy root meanings of will is revealed and ranked after several experiments. The results are of practical value for the study of the semantic gradience of the modal verb will and help English learners gain a deeper understanding of the relevance of these features.
An FCM model of the English modal verb will is built with a fuzzy c-means clustering algorithm in the Matlab environment, displaying the distribution of the different fuzzy senses of will; the FCM model reaches a correct clustering rate of 79%.
Finally, the two research methods are compared and the comparative results are analyzed. The experimental results reveal that the ANFIS model is the more appropriate method for disambiguating the fuzzy root meanings of the English modal verb will.
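The fuzzy c-means step can be sketched in plain Python as follows. This is not the Matlab implementation used in the study; it is a minimal version of the standard FCM update rules under stated assumptions: a fuzzifier of m = 2, Euclidean distance, and random initial memberships. The function name, the parameters, and the six toy binary feature vectors are hypothetical and serve only to show how binary-coded features yield graded cluster memberships.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Return (centers, memberships) using the standard fuzzy c-means updates."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of all points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        new_u = []
        for p in points:
            # Distances are floored to avoid division by zero.
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_u.append(
                [1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                 for dj in dists]
            )
        shift = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Hypothetical binary feature vectors (1 = feature present, 0 = absent).
toy_points = [
    [1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 0], [0, 0, 0, 1, 0, 1],
]
centers, memberships = fuzzy_c_means(toy_points, n_clusters=2)
# Hard label = index of the cluster with the highest membership degree.
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Each row of `memberships` gives one occurrence's degree of belonging to every cluster, which is what lets the method display a distribution of senses instead of a single hard label. A correct clustering rate like the one reported above would be obtained by comparing such hard labels with the manually tagged senses.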