
Research and Design of a Topical Crawler for Agricultural Information

Author ZhangNing
Advisor GuoWenMing
School Beijing University of Posts and Telecommunications
Course Software Engineering
Keywords Topical crawler; Information collection; Nutch; Chinese word segmentation
CLC TP391.3
Type Master's thesis
Year 2010

With the development of Internet technology, network information resources and the number of Internet users have grown rapidly, and the network plays an increasingly important role in daily life and work. People are therefore increasingly concerned with how to extract potentially valuable information from massive network data quickly and effectively. Effectively acquiring topic-specific Web information for a given professional field is the basis for each industry to make effective use of network information resources. A topical crawler for agricultural information is a system that identifies agriculture-related Web resources within massive network information, collects them, and keeps them up to date. It downloads crawled Web pages, unifies their encoding, and filters them to identify agricultural resource pages whose content meets requirements. This thesis first gives a preliminary description of an intelligent agricultural information service platform, with emphasis on the characteristics of the topical crawler built on that platform. It then introduces topical crawlers, describing their architecture, principles, components, and workflow. In particular, to meet the platform's special requirements for resources, the crawler applies additional processing when collecting information. The thesis focuses on the development of the topical crawler for agricultural information. Starting from the open-source search engine Nutch, it carries out secondary development by adding modules to the Nutch workflow, and describes the development process and implementation methods in detail. The results demonstrate that the design and implementation method of the topical crawler for agricultural information is feasible and practical.
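
The encoding-unification and topic-filtering steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's Nutch module: the term list, weights, threshold, and function names below are all hypothetical, and plain English tokenization stands in for the Chinese word segmentation the thesis relies on.

```python
import re
from typing import Optional

# Hypothetical topic vocabulary with weights; the thesis does not list its terms.
TOPIC_TERMS = {
    "agriculture": 3.0,
    "crop": 2.0,
    "farm": 2.0,
    "fertilizer": 2.0,
    "harvest": 1.5,
    "irrigation": 1.5,
}


def decode_page(raw: bytes, declared: Optional[str] = None) -> str:
    """Unify encoding: turn raw page bytes into a Python str.

    Tries the declared charset first (if any), then UTF-8 and GB18030,
    and finally falls back to UTF-8 with replacement characters.
    """
    candidates = [declared] if declared else []
    candidates += ["utf-8", "gb18030"]
    for enc in candidates:
        try:
            return raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            continue
    return raw.decode("utf-8", errors="replace")


def relevance(text: str) -> float:
    """Score a page as the summed weight of topic terms per word."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(TOPIC_TERMS.get(w, 0.0) for w in words)
    return hits / len(words)


def is_on_topic(text: str, threshold: float = 0.1) -> bool:
    """Keep a page only if its relevance reaches the (assumed) threshold."""
    return relevance(text) >= threshold


if __name__ == "__main__":
    page = "Crop irrigation and fertilizer use on a small farm".encode("utf-8")
    text = decode_page(page)
    print(is_on_topic(text))
    print(is_on_topic("Stock markets rallied on Tuesday"))
```

A real topical crawler would typically apply such a filter to each fetched page before indexing it, and might also use the score to prioritize which outgoing links to follow.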
