A Study of an SMS-Based Q&A System for Tourism Domain Entities
|School||Kunming University of Science and Technology|
|Course||Applied Computer Technology|
|Keywords||SMS; Domain entities; Database Q&A; Query analysis; Entity recognition; Topic classification; Lucene|
Natural language database Q&A is an effective question-answering mode: it lets users query a database in natural language and returns the answers to them. SMS is one of the most convenient means of information service and exchange, so combining SMS with the business databases of specific domains to provide SMS-based natural language database Q&A is of considerable value. This thesis focuses on an SMS-based natural language database Q&A model for domain entities and studies business database construction, domain entity recognition, SMS topic classification, and answer retrieval. The main results are as follows.
(1) Based on the characteristics of business queries against domain databases, we define and describe a general method for constructing a business database whose structure consists of domain entities, entity attributes, attribute categories, and attribute values. The method defines the business database of a specific entity class as the structure (entity, entity attribute, entity attribute category, entity attribute value), so that a query on an attribute column of that entity class's business database becomes a query for the value of the corresponding attribute of an entity. Because the defined structure is independent of the domain, business data from different domains can be stored in the same general way.
(2) Based on the characteristics of business queries sent by SMS, we propose an SMS query analysis method that fuses entity recognition with topic classification and ties both to the entities and attribute categories of the entity business database. Drawing on the characteristics of SMS and the structure of the domain entity business database, the method reduces SMS query analysis to two problems: identifying the domain entity in the query and identifying the query's topic category. Domain entities are identified with a named entity recognition method based on conditional random fields. The SMS topic is classified with a dynamic combination short-text classification algorithm, from which the entity attribute category is extracted. Answer extraction experiments show that identifying the business database entity and attribute category improves the accuracy of business data queries.
(3) Based on the characteristics of the domain entity business database and of SMS, we build a Lucene-based answer retrieval system for the business databases of domain-specific entities. Using the Lucene framework, the system converts the structured business database into a text database: each database record is treated as a document, Lucene builds an inverted index over all the text, and the database columns corresponding to domain entities and entity attribute categories serve as index terms. A word similarity method based on a domain HowNet computes the similarity between the query's entity name and topic category and the database index terms, and the best-matching index terms are used to retrieve and extract the answer to the business data query. Answer retrieval experiments show that this method improves the accuracy of query answers.
(4) Taking queries over Yunnan tourism entity data as an example, we design and implement a prototype SMS-based Q&A system for Yunnan attractions and hotels.
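As a rough illustration of points (1) and (3), the sketch below stores business data as (entity, attribute, attribute category, value) records and answers a query by matching an already recognized entity name and topic category against index terms. It is only a sketch under stated assumptions: the thesis uses Lucene and HowNet-based word similarity, whereas this example substitutes a plain dictionary index and the string similarity from Python's difflib. The Record class, the function names, the threshold, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    entity: str              # domain entity, e.g. an attraction or hotel name
    attribute: str           # attribute name as stored in the database
    attribute_category: str  # category matched against the query topic
    value: str               # attribute value returned as the answer


# Made-up illustrative records; not data from the thesis.
RECORDS = [
    Record("Example Lake Park", "ticket price", "price", "placeholder price"),
    Record("Example Lake Park", "opening hours", "time", "placeholder hours"),
    Record("Example Hill Hotel", "room rate", "price", "placeholder rate"),
]


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for HowNet word similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def build_index(records):
    """Map each (entity, attribute category) index term to its records."""
    index = {}
    for rec in records:
        index.setdefault((rec.entity, rec.attribute_category), []).append(rec)
    return index


def answer(index, query_entity, query_topic, threshold=0.6):
    """Return values under the best-matching index term, or [] if none is close."""
    best_key, best_score = None, 0.0
    for entity, category in index:
        # Take the weaker of the two scores so both parts must match.
        score = min(similarity(query_entity, entity),
                    similarity(query_topic, category))
        if score > best_score:
            best_key, best_score = (entity, category), score
    if best_key is None or best_score < threshold:
        return []
    return [rec.value for rec in index[best_key]]


INDEX = build_index(RECORDS)
```

Here `answer(INDEX, "example lake park", "price")` looks up the value stored under that entity and attribute category, which mirrors how a query on an attribute column becomes a query for one attribute value of one entity. The record structure itself does not depend on the tourism domain, so the same code could hold records from another domain.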