
Research on Blog Information Collecting System Based on Consortium Mining

Author ZhenTao
Advisor GuoShaoZhong
School PLA Information Engineering University
Course Computer Software and Theory
Keywords Blog; Complex Network; Bayesian Classifier; SNA (Social Network Analysis); Betweenness
CLC TP393.092
Type Master's thesis
Year 2009

Blog sites, which publish a wide variety of personalized information and are updated frequently, are gradually becoming an important source of information. As their intelligence value increases, they are also drawing attention from the relevant authorities. How to effectively target and comprehensively analyze the information released on Blogs, and how to mine useful information from it rapidly, has become an urgent problem for security departments.

In this thesis, the structure and characteristics of Blogs are analyzed in depth, and current technologies and achievements are studied and validated. A prototype system for monitoring Blog sites is designed. It first collects information from the pages, then mines the text, and finally analyzes the social network. The system consists mainly of a data collection module, a content filtering module and a network analysis module. Each module can operate independently or in cooperation with the others. The main work includes the following. A crawler dedicated to monitoring Blog sites has been designed and implemented. A Blog database has been set up, and a Blog text classification tool has been built by means of database language design. With social networks constructed from the topic and comment relationships in Blogs, a method is put forward for highlighting the central figures, the core pages and the core content by finding the community structure of the Blog network. The analytical results are also presented graphically. Finally, the performance of the system is verified with real data acquired from the Internet. The results show that the system is stable overall: it can collect Blog data, classify the text and build the network rapidly, and its analysis results are accurate. The system can provide considerable help and guidance for subsequent manual analysis.
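
The abstract lists betweenness among its keywords and describes building a social network from comment relationships in order to highlight central figures. The following is a minimal sketch of that idea, not the thesis's actual implementation: it assumes comments are available as (commenter, blog author) pairs, treats the network as undirected and unweighted, and scores each person with Brandes' betweenness algorithm. The function names and the sample data are hypothetical.

```python
from collections import defaultdict, deque


def build_comment_graph(comments):
    """Build an undirected graph from (commenter, blog_author) pairs.

    The pair format is an assumption made for illustration only.
    """
    graph = defaultdict(set)
    for commenter, author in comments:
        if commenter != author:  # ignore self-comments
            graph[commenter].add(author)
            graph[author].add(commenter)
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                        # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}     # number of shortest paths from s
        dist = {v: -1 for v in graph}     # -1 marks "not yet visited"
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}   # dependency of s on each node
        while stack:                      # accumulate from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical sample: three readers comment on the same author's Blog.
sample = [("alice", "hub"), ("bob", "hub"), ("carol", "hub")]
scores = betweenness(build_comment_graph(sample))
central = max(scores, key=scores.get)
```

In this sample every shortest path between two readers passes through the shared author, so that author receives the highest score. On real data the same ranking could be one input for picking out central figures, alongside the community structure analysis the thesis describes.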
