CLUSTERING-BASED OBJECTS STATE DIAGNOSTICS IN CONDITIONS OF FUZZY SOURCE DATA

The problem of distributing a set of objects, whose state is determined by a set of controlled parameters, into subsets of objects that are maximally homogeneous in their properties is considered. The problem is relevant, and the clustering procedure has an important advantage: it reduces the initial difficult problem of analysing high-dimensional objects to a number of simpler problems of lower dimensionality. This becomes even more attractive and important if the initial data of the problem contain uncertainty, for example, are fuzzily defined. The object of research is the procedure of partitioning a set of objects into clusters under conditions of uncertainty. Accordingly, the purpose of the study is to develop a method for solving the clustering problem when the initial data on the values of the objects' controlled parameters contain uncertainty. The method is based on a mathematical model of the clustering procedure that contains an analytical expression for the criterion of its effectiveness, written in the form of a doubly fractional-quadratic function. The impossibility of solving the resulting mathematical programming problem directly motivated the development of a heuristic algorithm. As a result, an iterative method was obtained and applied to solve the clustering problem under conditions of fuzzy initial data. The developed computational procedure is based on a well-founded system of rules for performing operations on fuzzy numbers. The cases in which the membership functions of the fuzzy parameters are defined on infinite or compact supports are considered. The developed system of rules makes it possible to correctly perform operations on the fuzzily defined distances between clustering objects. The proposed method is easily generalised to the case when the uncertainty of the initial data is hierarchical.

Key words: clustering problem, uncertainty of initial data, rules of operations execution, step-by-step computational procedure of solution.

Introduction.
Clustering is a multivariate statistical procedure that divides a set of objects into subsets (clusters) of objects that are maximally homogeneous in their controlled features (parameters).
Clustering technology is widely used in economics, medicine, psychology, chemistry, sociology, etc. Its practical application makes it possible:
- to better understand and explain how the typical features and parameters of objects influence the nature of their functioning;
- to improve the quality of object analysis by taking into account the differences between groups;
- to compress data by selecting the most typical representatives of the objects, to divide the objects into subsets according to the type (nature) of their state, and to single out atypical objects that cannot be attached to any of the groups.
The clustering procedure can be hierarchical, when subsets of objects included in each cluster are again subjected to clustering.
Analysis of clustering problem publications, formulation and solution.
The formal statement of the clustering problem is as follows [1, 2]. A set of objects is given whose positions in the phase space of controlled parameters, and hence whose states, are determined by sets of coordinates. The task is to distribute the set of objects into a number of compact subsets in accordance with some selected criterion.
It is clear that a clustering result is of higher quality the further apart the objects that fall into different clusters are and the closer together the objects of the same cluster are. In accordance with this, for a given distribution of objects over clusters we consider the number of pairs of objects in the $s$-th cluster, the sum of the distances between objects in different clusters, and the number of pairs of such objects. These quantities determine the average distance between objects within each cluster and the average distance between objects from different clusters, on which the clustering quality criterion (1) is built.
The meaning of criterion (1) is clear: its value is higher the greater the average distance between objects from different clusters and, at the same time, the smaller the average distance between objects in the most "loose" (least compact) cluster.
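A minimal Python sketch of how such a criterion could be evaluated for crisp (non-fuzzy) coordinates is given below. It assumes Euclidean distance and the ratio form suggested by the description of criterion (1): the average distance between objects from different clusters divided by the largest average within-cluster distance. The function name `clustering_quality` and its arguments are illustrative and are not taken from the original method.

```python
from itertools import combinations
from math import dist


def clustering_quality(points, labels):
    """Average between-cluster distance divided by the largest
    average within-cluster distance (assumed form of the criterion)."""
    within_sum = {}   # cluster label -> sum of pairwise distances inside it
    within_cnt = {}   # cluster label -> number of pairs inside it
    between_sum = 0.0
    between_cnt = 0

    for (p, a), (q, b) in combinations(zip(points, labels), 2):
        d = dist(p, q)
        if a == b:
            within_sum[a] = within_sum.get(a, 0.0) + d
            within_cnt[a] = within_cnt.get(a, 0) + 1
        else:
            between_sum += d
            between_cnt += 1

    if not within_cnt or between_cnt == 0:
        raise ValueError("need at least two clusters and one cluster with two objects")

    # The "loosest" cluster is the one with the largest average pair distance.
    loosest = max(within_sum[s] / within_cnt[s] for s in within_cnt)
    if loosest == 0.0:
        raise ValueError("all within-cluster distances are zero")
    return (between_sum / between_cnt) / loosest


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = clustering_quality(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = clustering_quality(points, [0, 1, 0, 1])   # each cluster spans both groups
```

With these four points, the partition into two compact pairs scores higher than the partition that mixes the pairs, which is the behaviour the criterion is meant to reward.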
Thus, the clustering problem can be formulated as follows: find a set $\{u_{js}\}$ that maximizes criterion (2) and satisfies the natural constraints (3). The resulting problem (2), (3) is a fractional-quadratic programming problem.
Let us consider possible methods for solving such problems. The fractional-polynomial structure of the criterion function (2) considerably complicates the analytical description of its derivatives. Therefore, first- and second-order optimization methods are not practical in this case. The use of universal zero-order methods is limited by the high dimension of practical clustering problems.
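One way to write such a problem explicitly, consistent with the verbal definitions in the text, is sketched below. The symbols $d_{j_1 j_2}$ (distance between objects $j_1$ and $j_2$), $D_s$, $N_s$, $\bar d_s$, $D_0$, $N_0$, $\bar d_0$ and the number of clusters $m$ are assumptions introduced for illustration; the original notation and the exact form of (2), (3) may differ.

```latex
\begin{aligned}
& u_{js} =
  \begin{cases}
    1, & \text{if the } j\text{-th object belongs to the } s\text{-th cluster},\\
    0, & \text{otherwise},
  \end{cases}
  \qquad j = 1,\ldots,n,\quad s = 1,\ldots,m; \\[4pt]
& D_s = \sum_{j_1 < j_2} d_{j_1 j_2}\, u_{j_1 s} u_{j_2 s}, \qquad
  N_s = \sum_{j_1 < j_2} u_{j_1 s} u_{j_2 s}, \qquad
  \bar d_s = \frac{D_s}{N_s}; \\[4pt]
& D_0 = \sum_{j_1 < j_2} d_{j_1 j_2} \Bigl(1 - \sum_{s=1}^{m} u_{j_1 s} u_{j_2 s}\Bigr), \qquad
  N_0 = \sum_{j_1 < j_2} \Bigl(1 - \sum_{s=1}^{m} u_{j_1 s} u_{j_2 s}\Bigr), \qquad
  \bar d_0 = \frac{D_0}{N_0}; \\[4pt]
& \text{maximize } \frac{\bar d_0}{\max_{s} \bar d_s}
  \quad \text{subject to} \quad
  \sum_{s=1}^{m} u_{js} = 1, \quad u_{js} \in \{0, 1\}.
\end{aligned}
```

In this form the objective is a ratio whose numerator and denominator are themselves ratios of quadratic functions of $u_{js}$, which matches the "doubly fractional-quadratic" structure discussed in the text.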
We set the task of developing a special method for solving the fractional-quadratic optimization problem. Let us consider the simplest example of such a problem, (5), (6), defined by a fractional-quadratic criterion function. We transform (4) and (5) taking (8) into account. The original problem (5), (6) is then transformed to the form: find a set $\{y_j\}$, $j = 1, 2, \ldots, n$, maximizing (9) and satisfying (10). Let us solve this problem by the method of Lagrange undetermined multipliers.
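A generic illustration of how a single fractional-quadratic criterion can be reduced to a quadratic problem with one constraint and then handled with a Lagrange multiplier is sketched below. The matrices $A$, $B$ and the vector $x$ are assumptions chosen for illustration; they are not the notation of problem (5), (6).

```latex
\begin{aligned}
& \text{maximize } f(x) = \frac{x^{\mathsf T} A x}{x^{\mathsf T} B x},
  \qquad A = A^{\mathsf T},\quad B = B^{\mathsf T} \succ 0; \\[4pt]
& f(cx) = f(x) \ \text{for } c \neq 0
  \ \Rightarrow\ \text{the scale can be fixed by } x^{\mathsf T} B x = 1; \\[4pt]
& \text{maximize } x^{\mathsf T} A x \quad \text{subject to} \quad x^{\mathsf T} B x = 1; \\[4pt]
& \Lambda(x, \lambda) = x^{\mathsf T} A x - \lambda \bigl(x^{\mathsf T} B x - 1\bigr); \\[4pt]
& \nabla_x \Lambda = 2 A x - 2 \lambda B x = 0
  \ \Rightarrow\ A x = \lambda B x; \\[4pt]
& x^{\mathsf T} A x = \lambda\, x^{\mathsf T} B x = \lambda
  \ \Rightarrow\ \max f = \lambda_{\max}.
\end{aligned}
```

The device relies on the criterion being one ratio of quadratic forms, so that normalizing the denominator leaves a quadratic objective.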
Let us introduce the Lagrange function for the transformed problem. The solution has been obtained. Note, however, that the technique used here, which transformed the fractional-quadratic problem (5), (6) into an ordinary mathematical programming problem with a quadratic objective function, cannot be used to solve problem (2)-(4). The reason is that the objective function (2) is doubly fractional-quadratic, that is, the components of the fractional-quadratic function are themselves fractional-quadratic. Therefore, to solve problem (2)-(3), we use one of the well-known approximate clustering methods [3, 4], for example, the following.
Let a set of $n$ objects to be clustered be given, whose positions in the $L$-dimensional space of controlled parameters are determined by sets of coordinates. If the grouping centers of the clusters are known, the nearest grouping center is determined for each object, and the object is attached to the corresponding cluster. If the grouping centers are not known, various algorithms for their sequential determination are used. In that case, the order in which objects are attached becomes particularly important. Well-known methods solve such a clustering problem with a good level of approximation. However, the task becomes significantly more complicated if the objects' coordinates are not precisely defined but are given, for example, in terms of fuzzy mathematics [5, 6].
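For crisp coordinates and known grouping centers, the attachment step described above can be sketched as follows. Euclidean distance and the function name `assign_to_nearest` are assumptions made for illustration.

```python
from math import dist


def assign_to_nearest(points, centers):
    """For each object, return the index of its nearest grouping center."""
    return [
        min(range(len(centers)), key=lambda s: dist(p, centers[s]))
        for p in points
    ]


points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]
labels = assign_to_nearest(points, centers)
```

When two centers are equally close, `min` keeps the first of them, so the result depends on the order of the centers; this is one place where ordering effects of the kind mentioned in the text can appear.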
Clustering Method in the Conditions of Fuzzy Source Data.
Let the objects' coordinates be fuzzy numbers with membership functions of $(L-R)$ type [7, 8]. In this case, the membership function is uniquely determined by the triple $(m, \alpha, \beta)$, where $m$ is the modal value of the fuzzy number, and $\alpha$ and $\beta$ are the left and right fuzziness coefficients, respectively. The analytical relations describing the corresponding membership functions depend on the selected type of functions. If, in particular, the membership functions of the objects' fuzzy controlled parameters are Gaussian functions defined on an infinite support, their analytical description has the form (14), and the clustering problem is solved in the simplest way.
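A Gaussian membership function of $(L-R)$ type with modal value $m$ and separate left and right spreads can be sketched as below. The exact scaling of the exponent in (14) is not reproduced here; the factor $2$ in the denominator and the function name `gaussian_lr_membership` are assumptions.

```python
from math import exp


def gaussian_lr_membership(x, m, alpha, beta):
    """Two-sided Gaussian membership: spread alpha left of m, beta right of m."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    spread = alpha if x <= m else beta
    return exp(-((x - m) ** 2) / (2.0 * spread ** 2))


at_mode = gaussian_lr_membership(5.0, m=5.0, alpha=1.0, beta=2.0)
left = gaussian_lr_membership(4.0, m=5.0, alpha=1.0, beta=2.0)
right = gaussian_lr_membership(6.0, m=5.0, alpha=1.0, beta=2.0)
```

The function is positive for every real `x`, which is what "defined on an infinite support" means in practice: no value of the parameter is excluded outright.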
Of course, this simple approach is unacceptable if the membership functions of the objects' controlled parameters are defined on compact supports. A typical function of this kind is triangular and is described by relation (15). In this case, the fuzzy description of the clustering objects' parameters by membership functions of the form (15) requires appropriate mathematical support. The clustering procedure involves the arithmetic operations of addition, subtraction, and multiplication, as well as square root extraction. The rules for performing these operations for membership functions of $(L-R)$ type are as follows.
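The triangular membership function and the four operations named above can be sketched for $(L-R)$ triples $(m, \alpha, \beta)$ as follows. Addition and subtraction follow the standard rules for $(L-R)$ numbers. The multiplication rule is the common first-order approximation for positive numbers, and the square root rule is a linearization around the modal value; both are assumptions that may differ from the relations used in the original. The class name `LR` is illustrative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LR:
    m: float      # modal value
    alpha: float  # left fuzziness coefficient (spread)
    beta: float   # right fuzziness coefficient (spread)

    def membership(self, x):
        """Triangular membership on the compact support [m - alpha, m + beta]."""
        if x == self.m:
            return 1.0
        if self.m - self.alpha < x < self.m:
            return (x - (self.m - self.alpha)) / self.alpha
        if self.m < x < self.m + self.beta:
            return ((self.m + self.beta) - x) / self.beta
        return 0.0

    def __add__(self, other):
        return LR(self.m + other.m, self.alpha + other.alpha, self.beta + other.beta)

    def __sub__(self, other):
        # The left spread of the difference picks up the right spread of `other`.
        return LR(self.m - other.m, self.alpha + other.beta, self.beta + other.alpha)

    def __mul__(self, other):
        # Assumed first-order approximation, for positive numbers with small spreads.
        return LR(
            self.m * other.m,
            self.m * other.alpha + other.m * self.alpha,
            self.m * other.beta + other.m * self.beta,
        )

    def sqrt(self):
        # Assumed linearization: sqrt(m - a) is close to sqrt(m) - a / (2 sqrt(m)).
        if self.m <= 0:
            raise ValueError("square root sketch requires a positive modal value")
        r = math.sqrt(self.m)
        return LR(r, self.alpha / (2.0 * r), self.beta / (2.0 * r))


a = LR(4.0, 1.0, 2.0)
b = LR(9.0, 0.5, 1.5)
total = a + b
diff = a - b
prod = a * b
root = a.sqrt()
```

Subtraction widens the result rather than cancelling spreads, so `a - a` is not the crisp zero; this is why comparing two fuzzy distances needs an explicit rule for the sign of the difference.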
Let the membership functions of two fuzzy numbers of $(L-R)$ type be given by the triples $(m_1, \alpha_1, \beta_1)$ and $(m_2, \alpha_2, \beta_2)$. The components of the membership function of the result of addition, subtraction, and multiplication are calculated in accordance with the arithmetic rules for fuzzy numbers of $(L-R)$ type [9]. A separate relation gives the result of extracting the square root of a fuzzy number $(m_1, \alpha_1, \beta_1)$.
When solving many practical problems, the uncertainty about the values of the controlled parameters has a hierarchical character. In this case, the components $m$, $\alpha$, $\beta$ of the membership function of the fuzzy parameters are themselves fuzzy quantities, each specified by its own set of parameters (20). The rules for performing arithmetic operations on fuzzy numbers defined by (20) are written down in the same manner, starting with addition. Then, in accordance with (19), the parameters of the membership functions of the fuzzy numbers that arise in these operations are determined.
Thus, the obtained relations (17)-(34) make it possible to calculate the parameter values of the membership functions of the fuzzy and non-fuzzy results of arithmetic operations on fuzzy numbers of $(L-R)$ type. The clustering problem under uncertainty can then be solved using a simple step-by-step algorithm. For each clustering object, the distances from the object to each of the grouping centers are calculated sequentially. The object is then attached to the cluster specified by the "nearest" grouping center.
The meaning of the term "nearest" should be clarified. When the distances from the clustered object to two grouping centers are compared, the second of the obtained fuzzy numbers is subtracted from the first. If the resulting difference $R$ is positive, the second distance is smaller than the first and, therefore, the second grouping center is "closer" to the object than the first. The sign of the fuzzy number $R$ is determined from the parameters of its membership function.
Conclusions.
The problem of clustering a set of objects whose position in the phase space is given by their coordinates is considered. An analytical expression for the clustering efficiency criterion in the form of a doubly fractional-quadratic function is obtained. A step-by-step method for solving the problem is proposed. The resulting method is extended to the case when the coordinates of the objects subject to clustering are fuzzy and are specified by membership functions of $(L-R)$ type on infinite or compact supports.
Notation used in the formulation of the clustering problem: $u_{js} = 1$ if the $j$-th object belongs to the $s$-th cluster and $u_{js} = 0$ otherwise. For a given distribution of objects over clusters, the clustering quality criterion is calculated from the sum of the distances between the objects in the $s$-th cluster, the average distance between pairs of objects in the $s$-th cluster, and the average distance between pairs of objects belonging to different clusters. In the auxiliary fractional-quadratic problem, the undetermined Lagrange multiplier is found using (10). In the approximate method, the distances between the $j$-th object and the $s$-th grouping centers are compared, and the object is attached to the center for which this distance is minimal.