This chapter is devoted to knowledge discovery from data, taking into account prior knowledge about preference semantics in the patterns to be discovered. The data concern a set of objects (situations, states, examples) described by a set of attributes (properties, features, characteristics). The attributes are, in general, divided into condition and decision attributes, corresponding to input and output descriptions of an object. The set of objects is partitioned by the decision attributes into decision classes. A pattern discovered from the data has the symbolic form of a decision rule or a decision tree. In many practical problems, some condition attributes are defined on preference-ordered scales, and the decision classes are also preference-ordered. The known methods of knowledge discovery unfortunately ignore this preference information and thus risk discovering wrong patterns. To deal with preference-ordered data, we propose a new approach called the Dominance-based Rough Set Approach (DRSA). Given a set of objects described by at least one condition attribute with a preference-ordered scale and partitioned into preference-ordered classes, the new rough set approach is able to approximate this partition by means of dominance relations. The rough approximation of this partition is the starting point for the induction of "if..., then..." decision rules. The syntax of these rules is adapted to represent preference orders. The DRSA analyzes only facts present in the data, and possible inconsistencies are identified. It preserves the concept of granular computing; however, the granules are dominance cones in the evaluation space, and not bounded sets. It is also concordant with the paradigm of computing with words, as it exploits the ordinal, and not necessarily the cardinal, character of the data. The basic DRSA and its major extensions are presented in two consecutive parts of this book.
In the present part, we give a general perspective of DRSA, explaining its use in the context of multicriteria classification, choice, and ranking. Moreover, we present a variant of DRSA that handles missing values in data sets.
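As an illustration of the dominance cones and rough approximations mentioned above, the following minimal Python sketch computes the lower and upper approximations of an upward union of classes. It follows the usual DRSA definitions, which the abstract itself does not spell out. The toy data set `STUDENTS`, all function names, and the assumption that every criterion is gain-type (higher is better) are hypothetical choices made for this example only, not part of the chapter.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy data: object name -> (criteria evaluations, decision class).
# Assumption: all criteria are gain-type (higher is better), classes 1 < 2 < 3.
Data = Dict[str, Tuple[Tuple[int, ...], int]]

STUDENTS: Data = {
    "a": ((3, 3), 3),
    "b": ((2, 3), 2),
    "c": ((2, 2), 3),  # inconsistent with b: b dominates c but is in a worse class
    "d": ((1, 2), 1),
    "e": ((1, 1), 1),
}


def dominates(x: Tuple[int, ...], y: Tuple[int, ...]) -> bool:
    """x dominates y if x is at least as good as y on every criterion."""
    return all(xi >= yi for xi, yi in zip(x, y))


def positive_cone(data: Data, name: str) -> Set[str]:
    """Objects that dominate the given object (the object itself included)."""
    ref = data[name][0]
    return {n for n, (ev, _) in data.items() if dominates(ev, ref)}


def negative_cone(data: Data, name: str) -> Set[str]:
    """Objects dominated by the given object (the object itself included)."""
    ref = data[name][0]
    return {n for n, (ev, _) in data.items() if dominates(ref, ev)}


def upward_union(data: Data, t: int) -> Set[str]:
    """Objects assigned to class t or better."""
    return {n for n, (_, cls) in data.items() if cls >= t}


def lower_approx_upward(data: Data, t: int) -> Set[str]:
    """Objects whose positive cone lies entirely inside the upward union."""
    union = upward_union(data, t)
    return {n for n in data if positive_cone(data, n) <= union}


def upper_approx_upward(data: Data, t: int) -> Set[str]:
    """Objects whose negative cone shares at least one object with the union."""
    union = upward_union(data, t)
    return {n for n in data if negative_cone(data, n) & union}


if __name__ == "__main__":
    lower = lower_approx_upward(STUDENTS, 3)
    upper = upper_approx_upward(STUDENTS, 3)
    print("lower:", sorted(lower))
    print("upper:", sorted(upper))
    print("boundary:", sorted(upper - lower))
```

In this toy data set, object `b` dominates object `c` but is assigned to a worse class. This is the kind of inconsistency with the dominance principle that DRSA identifies: both objects fall into the boundary of the union "class 3 or better", so only `a` belongs to it with certainty.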
|Title of host publication||Intelligent technologies for information analysis. Vol. 4|
|Editors||N. Zhong, J. Liu|
|Place of Publication||Berlin|
|Number of pages||40|
|Publication status||Published - 2004|