Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397
Issue N9 2022 year
In many modern applications, such as intrusion detection and prevention systems, expert knowledge can be formalized in the form of regular expressions. After this formalization, a finite automaton checks whether a word belongs to a regular language. Deterministic finite automata have an optimal time complexity, but the number of automaton states (space complexity) can grow exponentially with the length of the regular expression. At the same time, the time complexity is the main disadvantage of non-deterministic finite automata. Therefore, reducing spatial complexity while maintaining low time complexity is highly relevant. In applied problems of regular language recognition, a significant problem is the problem of the exponentially growing number of states of the recognizing deterministic finite automaton depending on the length of regular expressions of the recognized language — the exponential explosion problem. There are the following three main approaches to solving this problem using finite automata: 1) restriction on signatures given by experts; 2) regular language modification — this approach assumes the appearance in the solution of a practical problem of recognizing errors of the first and second kind; 3) finite automata modification without recognizing regular language changing. The third approach can be implemented as a finite automata modification through compression algorithms and particular structural elements. The paper presents a review of modern solutions, the main idea of which is the transition from an abstract finite automaton, represented by a table-specified function, to a structural automaton that combines the abstract part stored in the memory and various structural elements such as bit arrays and counters. Some ideas and algorithms have become especially successful when solving the exponential explosion problem by adding special structural elements. First, such successful solutions include the use of counters. Second, the idea of storing additional information about the state machine. Third, the combination of non-deterministic and deterministic finite automata. Fourth, the economic attitude to the state machine regarding its location to fast and slow memory allows us to consider the empirical experience of accumulated malicious traffic in intrusion detection systems.