Regular Expression Matching with Multi-step Speculation
|Keywords||low latency parallel pattern matching regular expressions speculativepattern matching|
Intrusion Detection and Prevention System(IPS) is an effective way to maintainnetwork security. Deep Packet Inspection(DPI) is the core of IPS. Now, networkdevelopment speed is very fast. IPS needs the help of multi-core or hardwaretechnology to keep up with the network development.Because of the stronger expression ability and the flexibility, regular expressionis used to expresent rules. Regular expression matching algorithm uses finiteautomata to express signatures. There are two types of finite automata:nondeterministic finite automata(NFA) and deterministic finite automata(DFA)．DFAcan match faster so used commonly. However, the processing rate of DFA can notmeet the requirements of network. In particular, one byte at a time make DFA becomethe major bottleneck of IPS. Therefore, the study of efficient DFA-based regularexpression matching is important and necessary. This article describes severalhardware-based regular expression matching algorithms, such as Speculative ParallelPattern Matching, SPPM, and analyzes their theoretical basis and the key ofalgorithms, pointed out the advantages and disadvantages of each at last.Through analysising the characteristics of Snort rule set, this article applied the"speculative" concept in the regular expression matching algorithm supported bymulti-core technology. The packet is broken into several chunks, opportunisticallyscan them in parallel and if the speculation is wrong, correct it later. Focusing onimproving the success rate of speculation, we present Regular Expression Matchingwith Muti-step Speculation(MSPPM) which effectively improve the success ratespeculation, reduced the number of visits to the DFA to reduce the latency.Experiments show that compared SPPM, MSPPM reduces parallel iterations and theaccesses to DFA about15%. Thus its latency is lower. The rate of MSPPM increasesby about15%so algorithm matching rate higher. In a word, MSPPM is a kind of lowlatency and high efficient regular expression matching algorithm based on themulti-core. So MSPPM can improve the system throughput and matching efficiency tomake IPS adapt to the development of the network much better.