Unsupervised forward selection: a method for eliminating redundant variables

David Whitley, M. Ford, D. Livingstone

    Research output: Contribution to journal › Article › peer-review

    Abstract

    An unsupervised learning method is proposed for variable selection and its performance assessed using three typical QSAR data sets. The aims of this procedure are to generate a subset of descriptors from any given data set in which the resultant variables are relevant, redundancy is eliminated, and multicollinearity is reduced. Continuum regression, an algorithm encompassing ordinary least squares regression, regression on principal components, and partial least squares regression, was used to construct models from the selected variables. The variable selection routine is shown to produce simple, robust, and easily interpreted models for the chosen data sets.
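
The abstract states the aims of the selection step but not its mechanics. As an illustration only, a greedy selection with those aims (no response variable, redundant columns rejected, multicollinearity limited) might be sketched as below. The function name, the `r2_max` threshold and its default value, and the choice to start from the least-correlated pair of columns are all assumptions made for this sketch; they are not taken from this record and should not be read as the authors' implementation.

```python
import numpy as np

def greedy_low_redundancy_selection(X, r2_max=0.9):
    """Hypothetical sketch: pick columns of X greedily without using any
    response variable, stopping once every remaining column has a squared
    multiple correlation with the chosen columns above r2_max (assumed threshold)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # mean-centre each column
    norms = np.linalg.norm(Xc, axis=0)
    keep = np.flatnonzero(norms > 1e-12)       # ignore constant columns
    Z = Xc[:, keep] / norms[keep]              # unit-length, centred columns

    # Assumed starting rule: the pair with the smallest absolute correlation.
    corr = np.abs(Z.T @ Z)
    np.fill_diagonal(corr, np.inf)
    i, j = np.unravel_index(np.argmin(corr), corr.shape)
    selected = [int(i), int(j)]
    candidates = [k for k in range(Z.shape[1]) if k not in selected]

    while candidates:
        # Orthonormal basis for the span of the columns chosen so far.
        Q, _ = np.linalg.qr(Z[:, selected])
        # Because each column of Z has unit length, the squared norm of its
        # projection onto that span equals its R^2 against the chosen columns.
        r2 = np.sum((Q.T @ Z[:, candidates]) ** 2, axis=0)
        best = int(np.argmin(r2))
        if r2[best] > r2_max:                  # everything left is redundant
            break
        selected.append(candidates.pop(best))

    # Map back to column indices of the original matrix.
    return [int(keep[k]) for k in selected]

# Toy data (made up for illustration): column 1 duplicates column 0,
# and column 3 is the sum of columns 0 and 2.
a = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
c = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
X_demo = np.column_stack([a, 2 * a, b, a + b, c])
chosen = greedy_low_redundancy_selection(X_demo)
print(chosen)
```

On the toy matrix only three columns are linearly independent, so the sketch keeps three and never retains both a column and its exact duplicate. Any regression method, including the continuum regression mentioned in the abstract, would then be fitted on the retained columns in a separate step.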
    Original language: English
    Pages (from-to): 1160-1168
    Number of pages: 9
    Journal: Journal of Chemical Information and Computer Sciences
    Volume: 40
    Issue number: 5
    Publication status: Published - 2000
