
Research on Approaches of the Subjective Automated Assessment

Author GuoZuoZuo
Supervisor YinWenSheng
School Huazhong University of Science and Technology
Course Industrial Engineering
Keywords Natural Language Understanding; Chinese Word Segmentation; Latent Semantic Analysis; Singular Value Decomposition; Single Similar Degree; Dynamic Programming
CLC TP391.1
Type Master's thesis
Year 2011

As Charles Dickens wrote in A Tale of Two Cities, “It was the best of times, it was the worst of times.” Our lives are flooded with information, so selecting what is valuable becomes more and more important whenever we use the Internet. As a result, research on artificial intelligence, data mining and natural language understanding is attracting growing attention. Natural language processing, which combines computer science and artificial intelligence, was first widely applied in information retrieval. Building on information retrieval technology, this thesis studies several key issues in natural language understanding and applies them to subjective automated assessment.

The thesis first introduces the development and application of knowledge retrieval technology, and then describes the current state of research on subjective automated assessment. An analysis and summary of the development trends in natural language processing lays the foundation for the subsequent research. The work covers three aspects.

The first aspect is Chinese word segmentation. A dictionary based on an all-character index structure is proposed. A hash table is built over all the commonly used characters in each phrase, so the dictionary supports non-first-character queries and fuzzy queries. The dictionary is simple to construct and maintain. With an all-character index dictionary built from Word Sense Code as the data source, tests indicate that the Chinese word segmentation algorithm based on this dictionary separates phrases correctly.

The second aspect is Latent Semantic Analysis (LSA) and its application.
Tests show that LSA can extract and represent the implicit semantics of a high-order term-document matrix by means of a low-order matrix. LSA relies on Singular Value Decomposition (SVD), a linear algebra technique, applied to statistics computed from a large text corpus. Unfortunately, the performance data show that the results are not satisfactory when LSA is applied to subjective automated assessment. This is attributed to the lack of a large body of training data, an explanation that needs further experimental verification.

The third aspect is an improved subjective automated assessment algorithm based on Single Similar Degree. It builds on the closeness theory of fuzzy mathematics, takes sentences as the semantic units, and incorporates a dynamic programming algorithm. Unlike LSA, this method focuses on semantic representation and matching. The performance data show that it achieves better results than LSA.
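
The all-character index dictionary from the first aspect can be illustrated with a minimal sketch. The abstract does not give the thesis's data layout or segmentation algorithm, so everything here is an assumption made for illustration: the class name `AllCharIndexDictionary`, the `?` wildcard for fuzzy queries, the use of forward maximum matching for segmentation, and the five sample words.

```python
from collections import defaultdict


class AllCharIndexDictionary:
    """Toy dictionary that indexes each word under every character it contains."""

    def __init__(self, words=()):
        self.words = set()
        # Hash table: character -> set of words containing that character.
        self.index = defaultdict(set)
        self.max_len = 0
        for word in words:
            self.add(word)

    def add(self, word):
        self.words.add(word)
        self.max_len = max(self.max_len, len(word))
        for ch in word:
            self.index[ch].add(word)

    def contains_char(self, ch):
        """Non-first-character query: words with `ch` at any position."""
        return sorted(self.index.get(ch, ()))

    def fuzzy(self, pattern, wildcard="?"):
        """Fuzzy query: `wildcard` stands for exactly one unknown character."""
        known = [ch for ch in pattern if ch != wildcard]
        if known:
            # Intersect the buckets of the known characters to narrow candidates.
            candidates = set.intersection(
                *(self.index.get(ch, set()) for ch in known)
            )
        else:
            candidates = self.words
        return sorted(
            w for w in candidates
            if len(w) == len(pattern)
            and all(p == wildcard or p == c for p, c in zip(pattern, w))
        )

    def segment(self, text):
        """Forward maximum matching; unknown characters become single tokens."""
        tokens, i = [], 0
        while i < len(text):
            longest = max(1, min(self.max_len, len(text) - i))
            for size in range(longest, 0, -1):
                piece = text[i:i + size]
                if size == 1 or piece in self.words:
                    tokens.append(piece)
                    i += size
                    break
        return tokens


# Hypothetical sample vocabulary, not taken from the thesis.
demo = AllCharIndexDictionary(["自然", "语言", "自然语言", "理解", "语义"])
print(demo.segment("自然语言理解"))  # ['自然语言', '理解']
print(demo.contains_char("言"))      # ['自然语言', '语言']
print(demo.fuzzy("?言"))             # ['语言']
```

Indexing every character costs more memory than a first-character index, but it lets a query start from any known character, which is what makes the non-first-character and fuzzy lookups cheap.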
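
For the second aspect, the role of SVD in LSA can be written out in the standard textbook form. The symbols are assumptions introduced here, not notation from the thesis: $A$ is the term-document matrix, $\sigma_i$ are its singular values in decreasing order, and $k$ is the number of retained dimensions.

```latex
A = U \Sigma V^{\mathsf T},
\qquad
A_k = U_k \Sigma_k V_k^{\mathsf T}
    = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\mathsf T},
\qquad k < \operatorname{rank}(A)
```

Keeping only the $k$ largest singular values gives the low-order matrix $A_k$ that approximates the high-order matrix $A$; documents are then usually compared in this reduced space, for example by cosine similarity.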
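
The third aspect combines a fuzzy closeness measure between sentences with dynamic programming. The abstract gives neither the Single Similar Degree formula nor the alignment scheme, so the sketch below is only one plausible reading, not the thesis's algorithm. It assumes the common min/max closeness of two term-count fuzzy sets, an order-preserving one-to-one sentence alignment, and normalisation by the number of reference sentences; the function names `closeness` and `score_answer` are hypothetical.

```python
from collections import Counter


def closeness(a_tokens, b_tokens):
    """Min/max closeness of two term-count fuzzy sets, in [0, 1] (assumed measure)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    terms = set(a) | set(b)
    overlap = sum(min(a[t], b[t]) for t in terms)
    union = sum(max(a[t], b[t]) for t in terms)
    return overlap / union if union else 0.0


def score_answer(reference, answer):
    """Score an answer against a reference, both given as lists of token lists.

    Dynamic programming finds the order-preserving sentence alignment with the
    highest total closeness; the total is divided by the reference length.
    """
    n, m = len(reference), len(answer)
    if n == 0:
        return 0.0
    # best[i][j]: best total using the first i reference and first j answer sentences.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = best[i - 1][j - 1] + closeness(reference[i - 1], answer[j - 1])
            best[i][j] = max(match, best[i - 1][j], best[i][j - 1])
    return best[n][m] / n


# Hypothetical data: two reference sentences, three answer sentences.
reference = [["a", "b"], ["c", "d"]]
answer = [["a", "b"], ["x"], ["c"]]
print(score_answer(reference, answer))  # 0.75
```

In this example the first sentences match exactly (closeness 1.0), the unrelated middle sentence is skipped, and the last pair shares one of two terms (closeness 0.5), giving (1.0 + 0.5) / 2 = 0.75.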
