TY - JOUR
T1 - Granular approximations
T2 - a novel statistical learning approach for handling data inconsistency with respect to a fuzzy relation
AU - Palangetić, Marko
AU - Cornelis, Chris
AU - Greco, Salvatore
AU - Słowiński, Roman
N1 - Funding Information:
Marko Palangetić and Chris Cornelis would like to thank the Odysseus project from the Flanders Research Foundation (FWO), grant no. G0H9118N, for funding their research. Salvatore Greco wishes to acknowledge the support of the Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR) - PRIN 1576 2017, project “Multiple Criteria Decision Analysis and Multiple Criteria Decision Theory”, grant 2017CY2NCA. Roman Słowiński acknowledges the support of the Polish Ministry of Education and Science, grant 0311/SBAD/0726.
Publisher Copyright:
© 2023 Elsevier Inc.
PY - 2023/7/1
Y1 - 2023/7/1
N2 - Inconsistency in classification and regression problems occurs when instances that relate in a certain way on the condition attributes do not follow the same relation on the decision attribute. It typically appears as a result of perturbation in data caused by incomplete knowledge (missing attributes) or by random effects that occur during data generation (instability in the assessment of decision attribute values). Inconsistencies with respect to a crisp preorder relation (expressing either dominance or indiscernibility between instances) can be handled with set-theoretic approaches such as rough sets, or with statistical/machine learning approaches that involve optimization methods. In particular, the Kotłowski-Słowiński (KS) approach relabels the objects of a dataset so that inconsistencies are removed and the new class labels are as close as possible to the original ones in terms of a given loss function. In this paper, we generalize the KS approach to handle inconsistency determined by a fuzzy preorder relation rather than a crisp one. The method produces a consistent fuzzy relabeling of the instances and may be used as a preprocessing tool with algorithms for binary classification and regression. As the obtained fuzzy sets can be represented as unions of meaningful simple fuzzy sets, or granules, we call them granular approximations. We provide statistical foundations for our method, develop appropriate optimization procedures, provide didactic examples, and prove several important properties.
KW - Fuzzy logic
KW - Inconsistencies in data
KW - Rough sets
KW - Statistical learning
UR - http://www.scopus.com/inward/record.url?scp=85147606822&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2023.01.119
DO - 10.1016/j.ins.2023.01.119
M3 - Article
AN - SCOPUS:85147606822
SN - 0020-0255
VL - 629
SP - 249
EP - 275
JO - Information Sciences
JF - Information Sciences
ER -