Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 6, 2018

DOI: 10.17587/prin.9.272-280
A Predicative Estimation of Supremum of the Model's Forecast Error Resulting from a Conceptual Dataset Shift. On the Example of the Meme-gram-model
A. A. Artemov, artemsince2@ya.ru, Center of Strategic Nuclear Forces Research at Academy of Military Sciences, Moscow Region, Jubileiny, 141090, Russian Federation
Corresponding author: Artemov Artem A., Research Associate, Center of Strategic Nuclear Forces Research at Academy of Military Sciences, Moscow Region, Jubileiny, 141090, Russian Federation, E-mail: artemsince2@ya.ru
Received on February 26, 2018
Accepted on April 09, 2018

The paper addresses the problem of estimating the forecast error of the statistical probability that a feature sequence is present in the objects that verify a model of the studied domain. The author treats an object's feature sequence as an n-gram, so each object is an n-gram of fixed length. A population of "mutating" objects, grouped by a given criterion (e.g., time), forms evolving multisets of variable cardinality. Under this formalization, an element common to two multisets, called a meme-gram, indicates the presence of a feature sequence. The probability of this event is defined as a functional of the number of copies of the meme-gram. The model in which these axioms are defined is called the meme-gram model. The presented solution focuses on estimating the relative frequency of repeating multiset elements when their number can be forecast for only some of those elements. The proposed solution is particularly relevant to building knowledge representation models for self-learning systems when only a limited number of training examples is available from the total volume of constantly changing Big Data.
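The formalization in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the helper names (`ngrams`, `meme_grams`, `relative_frequency`), the sliding-window extraction of n-grams from a string, and the toy data are all assumptions made for illustration. In particular, plain relative frequency stands in for the paper's probability functional, which the abstract does not specify.

```python
from collections import Counter


def ngrams(sequence, n):
    """Hypothetical helper: multiset of all length-n windows of a sequence."""
    return Counter(tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))


def meme_grams(earlier, later):
    """Elements common to two multisets, kept with their minimum multiplicity.

    Assumption: the abstract's "common element of two multisets" is read
    here as ordinary multiset intersection.
    """
    return earlier & later


def relative_frequency(multiset, gram):
    """Share of copies of `gram` among all elements of the multiset.

    Assumption: a simple stand-in for the probability functional of the
    number of copies; the paper's actual functional may differ.
    """
    total = sum(multiset.values())
    return multiset[gram] / total if total else 0.0


# Two toy populations of 2-grams, e.g. observed at two points in time.
old = ngrams("abcabd", 2)
new = ngrams("abdabe", 2)

shared = meme_grams(old, new)
print(shared)
print(relative_frequency(new, ("a", "b")))
```

In this toy case the 2-grams `('a', 'b')` and `('b', 'd')` survive from the first population into the second, so they play the role of meme-grams, while `('b', 'c')` and `('c', 'a')` disappear.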

Keywords: Meme-gram model, m-gram, data shift, elements of heredity, memetic algorithm, evolving system of knowledge, similar multisets, embedded multisets, Rastorguev's theorem
pp. 272–280
For citation:
Artemov A. A. A Predicative Estimation of Supremum of the Model's Forecast Error Resulting from a Conceptual Dataset Shift. On the Example of the Meme-gram-model, Programmnaya Ingeneria, 2018, vol. 9, no. 6, pp. 272–280.