Design and Implementation of Network Security Hotspots Analysis System
Keywords: security hotspots; topic model; trend analysis; topic chain
The rapid development of Internet technology has brought Internet applications into every aspect of life and made people's lives far more convenient. However, the virtual nature of the Internet also facilitates cybercrime: attackers can easily exploit its vulnerabilities to attack others. Current mainstream network security hotspot analysis systems are based on natural language processing (NLP) techniques, whose key analysis method is to retrieve key information from massive amounts of data using topic models, the LDA model, and the N-gram model. Analysis based on NLP techniques yields higher accuracy. Unsupervised learning methods are now more popular because, once trained on a set of training data, they classify automatically rather than relying on manual categorization, which improves both the accuracy and the efficiency of classification.
One of the main tasks of this paper is to present a topic-model-based method for identifying network security hotspots. First, massive data is crawled from websites. Second, the themes, that is, the core topics behind the data, are generated with the LDA and N-gram models. Third, two methods are applied: topic chain construction and trend analysis. Both aim to find hotspots, but their focuses differ, since trend analysis introduces an additional dimension, time or author, into the analysis. Finally, each module is visualized using the Wigis project.
The other major contribution of this paper is a topic-model-based network security hotspot analysis system. The system combines a B/S architecture, whose framework follows the MVC pattern commonly employed in J2EE, with a C/S architecture based on the open-source SSH framework (Spring, Struts, Hibernate). Multi-threading is adopted to improve the execution efficiency of the whole system, and the multi-module design provides good scalability.
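The use of multi-threading to speed up the system can be illustrated with a minimal sketch. The abstract does not say which stages are multi-threaded or how, so this example assumes the crawling stage is parallelized over a list of URLs. It is written in Python for brevity, although the system described is J2EE-based. The names `crawl_all` and `fake_fetch` are hypothetical, and `fake_fetch` is a stand-in for a real HTTP download.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def crawl_all(urls: Iterable[str],
              fetch: Callable[[str], str],
              max_workers: int = 4) -> Dict[str, str]:
    """Fetch every URL on a thread pool and return {url: page_text}."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its page.
        pages = list(pool.map(fetch, url_list))
    return dict(zip(url_list, pages))


def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for a real HTTP download (assumption)."""
    return f"content of {url}"


if __name__ == "__main__":
    # Example URLs are placeholders, not sources used by the original system.
    print(crawl_all(["http://example.com/a", "http://example.com/b"], fake_fetch))
```

Passing the fetch function as a parameter keeps the threading logic independent of the download code, so a real downloader could replace the stub without changing `crawl_all`.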
The crawler, topic model extraction, topic chain, security hotspot trend, and the visualization of topic chains and trend analysis can each be tailored through configuration.
Functional tests on a large amount of network security data show that the system extracts fairly accurate topics and topic chains, and that the trend analysis module gives a more intuitive view of how topics change over time. Accurate extraction of data from a large number of network security hotspots is achieved through the coordination of all modules. Future work will add more features to improve the system.
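The trend analysis idea, which adds a time dimension to the extracted topics, might be sketched as follows. This is not the system's actual implementation. It assumes each document has already been assigned a single topic label by the topic model, and it assumes monthly time buckets. The function name `topic_trend` and the sample records are made up purely for illustration.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def topic_trend(docs: Iterable[Tuple[date, str]]) -> Dict[str, Dict[str, float]]:
    """Return {month: {topic: share of that month's documents}}."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for published, topic in docs:
        month = published.strftime("%Y-%m")  # time bucket, e.g. "2014-03"
        counts[month][topic] += 1

    trend: Dict[str, Dict[str, float]] = {}
    for month in sorted(counts):  # chronological order
        total = sum(counts[month].values())
        trend[month] = {t: n / total for t, n in counts[month].items()}
    return trend


# Illustrative records only: (publication date, topic label from the model).
SAMPLE = [
    (date(2014, 3, 2), "phishing"),
    (date(2014, 3, 9), "phishing"),
    (date(2014, 3, 15), "phishing"),
    (date(2014, 3, 20), "botnet"),
    (date(2014, 4, 1), "botnet"),
]

if __name__ == "__main__":
    for month, shares in topic_trend(SAMPLE).items():
        print(month, shares)
```

Grouping by author instead of by month would only require changing the bucket key, which matches the statement that trend analysis can use either time or author as its extra dimension.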