Articles
Browsing Articles by Subject "simulation modeling methods"
COMPARISON OF BIG DATA CLASSIFICATION ALGORITHMS BY SIMULATION MODELING METHODS (Open International University of Human Development "Ukraine", 2023)
Odehov M.; Hadzhyiev M.; Bukata L.; Hlazunova M.; Kochetkova M.
The article addresses the comparative analysis of fast classification algorithms that can be applied to problems with extremely large volumes of data (Big Data). The problem is solved by simulation modeling using the Adaptive Metrics program. The nearest-neighbor, class-center, and adaptive-rule algorithms are compared by the criteria of reliability and performance. The results lead to the conclusion that algorithms based on M-means principles can be used effectively in classification problems under certain conditions, since they have a significant advantage in performance.
With the development of information transmission and storage technologies, the volumes of data that require processing and analysis are growing rapidly. Developing algorithms that solve artificial intelligence problems at Big Data volumes is therefore an urgent task. In our work, the informal term "Big Data" refers to situations in which known processing algorithms cannot solve a problem in a practically acceptable time. For classification tasks, such conditions arise when the priority is not high reliability (that is, the minimum number of errors) but performance (classification speed). The well-known nearest-neighbor method is among the fastest. However, its order of growth (the number of typical operations) is K × M × N, where K is the number of nearest neighbors, M is the number of classes, and N is the typical number of elements in a class.
Alongside this method, we propose to consider algorithms based on the principles of M-means, in which each class is replaced by a small number of its characteristics. Among such algorithms, the article considers the class-center algorithm and the adaptive-rule algorithm. The order of growth for these algorithms is only M, the number of classes. The comparative analysis of the algorithms is performed by simulation modeling. The simulation models are implemented in the Adaptive Metrics program, developed at the Department of Software Engineering at DUITZ. In this program, the classification problem is solved using the example of a dichotomy problem for two classes, A and B. The program allows models to be configured very flexibly. Problems can be solved in spaces of one to six dimensions. The distributions of factor values for classes A and B can have quite different statistical characteristics, ranging from uniform and triangular distribution functions to functions approaching a normal distribution. The graphical interface of the program makes it possible to observe the solution of the classification problem dynamically in one-, two-, and six-dimensional projections. Multiple runs of the program established that the nearest-neighbor algorithms slightly outperform the class-center and adaptive-rule algorithms by the criterion of reliability, and also comply with the principle of compactness (concentration of the largest number of erroneous solutions in the hypercube of errors). The algorithms based on M-means principles significantly outperform the nearest-neighbor algorithm in terms of performance. In addition, the adaptive-rule algorithm best corresponds to the principle of equality of classes and is the fastest of the algorithms considered.
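The contrast between the two families of algorithms can be illustrated with a minimal sketch. It is not the Adaptive Metrics program and does not reproduce the article's experiments. The class positions, the spread, the sample sizes, and every function name below are assumptions chosen for illustration. The sketch builds a two-class (A and B) dichotomy problem in two dimensions with triangularly distributed factors, classifies test points with a plain nearest-neighbor vote and with a nearest-class-center rule, and counts only distance evaluations per query: M × N for nearest neighbors against M for class centers. The factor K from the abstract's estimate and the adaptive-rule algorithm are not modeled.

```python
import math
import random


def make_class(center, spread, n, rng):
    """Draw n points; each coordinate is triangular on [c - spread, c + spread] with mode c."""
    return [tuple(rng.triangular(c - spread, c + spread, c) for c in center)
            for _ in range(n)]


def knn_classify(x, train, k):
    """Nearest neighbors: measure the distance from x to every stored element of every class."""
    dists = sorted((math.dist(x, p), label)
                   for label, pts in train.items() for p in pts)
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def class_centers(train):
    """Replace each class by a single characteristic: the mean of its elements."""
    return {label: tuple(sum(col) / len(pts) for col in zip(*pts))
            for label, pts in train.items()}


def center_classify(x, centers):
    """Class centers: measure the distance from x to one point per class."""
    return min(centers, key=lambda label: math.dist(x, centers[label]))


def error_rate(classify, test):
    """Fraction of labeled test points that the classifier gets wrong."""
    wrong = sum(1 for x, label in test if classify(x) != label)
    return wrong / len(test)


def run_experiment(seed=0, n_train=200, n_test=500, k=3):
    """Hypothetical setup: class A around (0, 0), class B around (3, 3), spread 2."""
    rng = random.Random(seed)
    spec = {"A": (0.0, 0.0), "B": (3.0, 3.0)}
    train = {label: make_class(c, 2.0, n_train, rng) for label, c in spec.items()}
    test = [(x, label) for label, c in spec.items()
            for x in make_class(c, 2.0, n_test, rng)]
    centers = class_centers(train)
    return {
        "knn_error": error_rate(lambda x: knn_classify(x, train, k), test),
        "center_error": error_rate(lambda x: center_classify(x, centers), test),
        # Distance evaluations needed to classify one query point.
        "knn_distance_evals": sum(len(pts) for pts in train.values()),
        "center_distance_evals": len(centers),
    }


if __name__ == "__main__":
    for name, value in run_experiment().items():
        print(name, value)
```

Running the script prints the two error rates and the two per-query distance counts for one random configuration. Changing `spec`, the spread, or the number of coordinates in each center gives other model settings, in the spirit of the flexible configuration described in the abstract.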