Research and Application of a Site Search System Based on Index Server
|Course||Computer Software and Theory|
|Keywords||Search; Data Acquisition; Socket; Index Server; Windows Service|
In today's emerging e-commerce, doing business on the Internet is not only a way to present a corporate image and raise visibility; it also brings great opportunities and wealth. Intranets, in turn, bring new ways of communicating and a new management philosophy. Building a corporate Web site has therefore been put on the agenda of many companies. The advantage of the Web is that it can easily present a large amount of information, but the resulting flood of information makes it very difficult to find what is actually useful. For this reason, a good corporate website needs a powerful search engine that makes the site more user-friendly and convenient. For Internet business websites that hold large numbers of documents, such as policies and regulations, contracts, and orders, an information search service is essential.
Index Server is a professional search engine designed specifically for corporate websites; with it, a powerful information search function can be added to a site very easily. The files Index Server can search are not limited to HTML: it also supports TXT, DOC, XLS, RTF, GIF, JPEG, and other file formats, and third-party plug-ins can be installed to support more. The search scope can cover content stored on the local server as well as shared resources on other machines in the network, including resources on the Internet. Besides keyword search over document content, it can search by file attributes such as size, modification date, and author. Index Server also supports English, Simplified Chinese, German, French, Japanese, and other languages, so a multilingual search engine can be set up on a site without programming. Index Server is designed for zero maintenance: once the Index Server service is started, the search engine runs automatically. On the Web server side, all that is needed is to add pages that connect to Index Server.
Index Server works as follows. The browser submits a request to the Web server through a FORM in an HTML document. The Web server passes the request through an IDQ file, a special file that acts much like a database interface, which connects to Index Server and translates the customer's request into statements that Index Server understands. Index Server then organizes the query results into an HTML document in the format defined by an HTX template file, and the Web server returns that document to the browser. This approach, known as HTML/IDQ/HTX, requires three files working together to complete a query. Its drawbacks are that the query results cannot be processed further and that the HTX template offers only a single fixed format. Microsoft Index Server 2.0 added support for ASP.NET, so that one ASPX file replaces the previous three files. With ASP.NET's flexible and powerful scripting, Web developers can design complex query conditions and obtain more accurate results.
This thesis focuses on site search. It presents a thorough and careful study of the two core supporting technologies of site search, Index Server indexing and data acquisition, surveys in detail the existing site search products of Google, Baidu, and Sogou and their markets, and on that basis builds a real-time site search system. The main work, technical difficulties, and innovations of the thesis are as follows:
(1) Consulted a large body of material on site search, reviewed the origins of site search and its development, and carefully studied site search architecture, workflow, and implementation, in order to gain an in-depth understanding of the site search concept and its core supporting technologies.
(2) Studied in depth the concept and characteristics of Index Server indexes, the Index Server system architecture, and the data indexing process (filtering, word breaking, and normalization); analyzed in detail the index query data flow, the handling of the result set after a query, and the connection to the Index Server database as a library for secondary development; and, from an application standpoint, studied in depth how Index Server sorts results precisely by time.
(3) Carefully studied SQL databases and database architecture, focusing on how to monitor a database and access it securely.
(4) Studied the Visual Studio 2003 development tools in depth, and in detail the development model for Windows services. The server side of the socket communication is implemented as a Windows service, making full use of the strengths of Windows services while avoiding their weaknesses. This greatly improves efficiency, gives the system a clear structure, and helps ensure that the system runs stably and securely.
(5) Studied network communication programming in depth, together with the TCP/IP protocol and packet structure, and applied this knowledge in the C/S data acquisition system.
(6) Based on the principles above and on existing research results, designed and implemented a site search system. Its innovative features are as follows:
1) The data acquisition subsystem uses a C/S structure, while the query system runs in a web-based mode, so that all users can use it over the Internet without restrictions of time or place. Given China's huge number of Internet users, this makes the system more competitive in the Chinese market.
2) Real-time information query. Well-known search engine giants such as Baidu and Google also offer free site search, and they can search a site quickly. However, their site search may fail to find a website's latest news or content, or may return pages that no longer exist, because their web spiders do not collect and index each site's content in real time, so much of the data is out of date. The site search system described here monitors the website database in real time to capture the latest changes to the site, and so achieves real-time querying.
3) Generation and management of information data. So that site customers can search the most up-to-date information, a Windows background service monitors the SQL database for updates in real time, and promptly writes changed information to TXT files that are added to Index Server.
4) Data stored as TXT files. Index Server supports indexing of many file formats, such as HTML and XML, but TXT is the most stable and efficient storage format.
At present, the system has undergone internal testing at Securities Star, with good results. The papers published by the author while at school are listed in the appendix.
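The monitoring step in innovation 3) can be illustrated with a minimal sketch. It is not the thesis implementation, which is a Windows service working against a SQL database; here Python's built-in sqlite3 stands in for the database, and the table name `articles`, its columns, and the function `export_changes` are hypothetical. The abstract does not say how changes are detected, so polling on an `updated_at` watermark is an assumption.

```python
import sqlite3
from pathlib import Path


def export_changes(conn: sqlite3.Connection, out_dir: Path, last_seen: int) -> int:
    """Write rows changed since `last_seen` to TXT files and return the new watermark.

    Assumes a hypothetical table articles(id, title, body, updated_at), where
    updated_at is an integer that grows on every change to the row.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = conn.execute(
        "SELECT id, title, body, updated_at FROM articles "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, title, body, updated_at in rows:
        # One file per record, so an updated row overwrites its stale copy.
        target = out_dir / f"{row_id}.txt"
        target.write_text(f"{title}\n\n{body}\n", encoding="utf-8")
        last_seen = max(last_seen, updated_at)
    return last_seen
```

A background service would call `export_changes` on a timer, keeping the returned watermark between calls, with `out_dir` pointing at a directory that the indexer is configured to catalog. Writing one file per record keeps each update local to a single small file.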