Text Sentiment Analysis of Chinese Comments for Online Public Opinion

Author LuoYaPing
Tutor MaGang
School Dongbei University of Finance
Course E-commerce
Keywords subjective sentence extraction; emotion lexicon; text sentiment classification; SVM
CLC TP391.1
Type Master's thesis
Year 2010

With the rapid development of the Internet, more and more people express their feelings, opinions and attitudes online, which promotes the development of online public opinion. Especially with the development of Web 2.0 technology, blogs, BBS and news message boards have become the main vehicles for online public opinion. In recent years, more and more local governments have paid attention to supervising and controlling online public opinion. Online public opinion has characteristics such as suddenness, directness and anonymity. If negative online public opinion is improperly guided, it can have a harmful effect on the development of society. The government and related managers therefore need to mine and analyze online public opinion. This is very important for grasping the development trends of public opinion and for improving the government's ability to monitor it.

Sentiment analysis of online public opinion relies on text sentiment analysis technology. The main task of text sentiment analysis is text sentiment classification, that is, judging whether a text is positive or negative. Text sentiment analysis is a hot research subject in the field of natural language processing and is widely used in fields such as text filtering and product comment mining. This paper mainly studies text sentiment analysis for online public opinion.

Firstly, we summarize the current methods of text sentiment analysis based on a review of the literature. Then we study the key technologies of text classification, including word segmentation, stop word processing, text feature selection and text representation, which helps in understanding the whole process of text classification clearly. Text sentiment classification is different from traditional text classification. Traditional text classification is based on subjects; for example, texts can be classified into economics, politics, military, entertainment and so on, and the words related to the subjects are the most useful for the classification. For text sentiment classification, sentiment words are more useful. Based on the work of our predecessors, we establish a model of text sentiment classification that consists of two parts. One is the establishment of an emotion lexicon. The other is the implementation of text sentiment classification by SVM.

The establishment of the emotion lexicon involves three main tasks. The first is to extract subjective text. The second is to acquire emotion words automatically. The third is to compute the polarity of the emotion words. Building on the research of others, we use the N-POS model to extract subjective sentences. We improve the method by considering the length of sentences and demonstrate that the improved method is more effective. Then we acquire emotion words automatically based on a CRF model, which reduces the human workload. Finally, we compute the polarity of the emotion words with a method based on HowNet to establish the emotion lexicon.

Based on the emotion lexicon, we select emotion words as text features, compute the feature weights by TF-IDF, and implement text sentiment classification by SVM. As traditional TF-IDF does not consider the polarity of words, we combine TF-IDF with the polarity of emotion words to improve the method. We then compare the results obtained with traditional TF-IDF and with the improved TF-IDF and verify that the improved method is more effective. The polarity of emotion words should therefore be taken into consideration when implementing text sentiment classification.

As natural language processing is very complex and our ability is limited, some deficiencies remain to be addressed. More effective methods for acquiring emotion words automatically need to be sought. The emotion lexicon needs to be improved and expanded. In addition, the modified polarity of emotion words needs further study.
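
The feature-weighting step described above can be illustrated with a minimal sketch. The abstract does not give the formula used to combine TF-IDF with polarity, so the rule below (multiplying the TF-IDF weight by the word's polarity) is only an assumption. The lexicon values, the function name `tfidf_vectors` and the toy documents are likewise hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: emotion word -> polarity in [-1, 1].
# The thesis derives polarities from HowNet; these values are invented.
LEXICON = {"good": 0.8, "great": 0.9, "bad": -0.7, "terrible": -0.9}


def tfidf_vectors(docs, lexicon, use_polarity=False):
    """Return one {word: weight} dict per tokenized document.

    Only lexicon words are kept as features. With use_polarity=True the
    TF-IDF weight is multiplied by the word's polarity (an assumed
    combination rule, not necessarily the one used in the thesis).
    """
    n_docs = len(docs)
    # Document frequency of each emotion word.
    df = Counter(w for doc in docs for w in set(doc) if w in lexicon)
    vectors = []
    for doc in docs:
        counts = Counter(w for w in doc if w in lexicon)
        vec = {}
        for word, count in counts.items():
            tf = count / len(doc)
            # Smoothed inverse document frequency.
            idf = math.log((1 + n_docs) / (1 + df[word])) + 1
            weight = tf * idf
            if use_polarity:
                weight *= lexicon[word]
            vec[word] = weight
        vectors.append(vec)
    return vectors


# Toy, already-segmented documents (word segmentation is assumed done).
docs = [
    ["the", "service", "is", "good"],
    ["terrible", "policy", "bad", "result"],
]
plain = tfidf_vectors(docs, LEXICON)
signed = tfidf_vectors(docs, LEXICON, use_polarity=True)
```

In this sketch, `plain` corresponds to traditional TF-IDF over emotion-word features and `signed` to a polarity-aware variant. Either set of vectors could then be passed to an SVM classifier for the positive/negative decision.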
