Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue N5 2018 year

DOI: 10.17587/prin.9.235-240
On Expressive Possibilities of Schemes for Strings Structure Description
S. D. Makhortov, msd_exp@outlook.com, Voronezh State University, Voronezh, 394018, Russian Federation
Corresponding author: Makhortov Sergey D., Head of Department of Applied and System Software, Voronezh State University, Voronezh, 394018, Russian Federation, E-mail: msd_exp@outlook.com
Received on March 22, 2018
Accepted on April 05, 2018

The algorithms for computation on strings find increasing use in various applied fields, such as word processing, data compression, cryptography, speech recognition, computational geometry, molecular biology. Their relevance has significantly increased in recent decades with the appearance of new subject areas that require efficient processing of giant character sequences. These areas, in particular, include computer vision, information search in global networks, computational genomics and some others. Many effective algorithms for texts comparing and analyzing are based on preprocessing — a preliminary study of the structure of a string. As a result of this processing, some auxiliary construction is built, which allows one to significantly reduce the number of redundant character comparisons. At this stage, only one of the processed strings (for example, a sample or text) is usually examined, and the other string compared can remain unknown. It is important that preprocessing takes an acceptable amount of time. This is followed by the phase of direct search or analysis of substrings, on which the earlier obtained information about the string is used to reduce the work. This article considers two of the most well-known and effective schemes for preliminary analysis and description of the strings structure — borders arrays (prefix-function) and Z-blocks (Z-function). The author carries out a comparative analysis of their expressive possibilities. He also proposes and justifies a new linear algorithm for transforming a borders array into an array of Z-blocks, discusses some of its modifications that can improve the efficiency of work. The results obtained can be applied to practical study of large character sequences.

Keywords: string, string matching, preprocessing, prefix-function, Z-function, complexity analysis
pp. 235–240
For citation:
Makhortov S. D. On Expressive Possibilities of Schemes for Strings Structure Description, Programmnaya Ingeneria, 2018, vol. 9, no. 5, pp. 235—240.