Complex Output
In practice, the required output of a machine learning algorithm is rarely a simple binary classification. It may involve multiple classes, hierarchical classification, multi-label classification, the possibility of abstention or, more generally, structured output. We developed a general-purpose method for multi-class and hierarchical classification based on so-called nested dichotomies (NDs), a standard statistical technique for tackling certain polytomous classification problems with logistic regression. A system of nested dichotomies is a hierarchical decomposition of a multi-class problem with c classes into c − 1 two-class problems and can be represented as a tree structure. If no hierarchy of classes is given a priori, every system of nested dichotomies can be treated as equally likely. This assumption makes it possible to form an ensemble classifier over several such systems. Ensembles of nested dichotomies (ENDs) have proven to be an effective approach to multi-class learning problems. Further improvements in runtime are possible if ensembles of balanced nested dichotomies are used. Moreover, the method can easily be adapted to cases where a hierarchy of classes is given, as in fold recognition, enzyme classification, or related problems. Empirical results, however, clearly indicate that strong non-hierarchical multi-class schemes such as ENDs can outperform several hierarchical classification variants.
Finally, we proposed a new visualization method for abstaining classifiers, that is, classifiers that can abstain from prediction depending on cost conditions. Abstention cost curves, a generalization of cost curves, visualize the strengths and weaknesses of classifiers over a broad range of cost scenarios.
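The decomposition described above can be illustrated with a minimal Python sketch. It is not the implementation from the publications listed below. All names (`build_random_nd`, `class_probs`, `ensemble_probs`, `toy_p_left`, `weights`) are hypothetical, and the binary learner is replaced by a toy estimator so that the example is self-contained. The random splitting used here is a simplification and does not sample uniformly from the space of all nested dichotomy systems.

```python
import random

def build_random_nd(classes, rng):
    """Recursively split the classes into two non-empty subsets.
    A leaf is a single class label; an internal node is a (left, right) tuple.
    Assumption: class labels are not tuples themselves (strings are used here)."""
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # both sides stay non-empty
    return (build_random_nd(classes[:cut], rng),
            build_random_nd(classes[cut:], rng))

def leaves(tree):
    """All class labels below a node."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def count_internal(tree):
    """Number of two-class problems in the tree (c - 1 for c classes)."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_internal(tree[0]) + count_internal(tree[1])

def class_probs(tree, p_left, prob=1.0, out=None):
    """Multiply branch probabilities along each root-to-leaf path.
    p_left(left_classes, right_classes) stands in for a trained binary
    model and returns the estimated probability of the left subset."""
    if out is None:
        out = {}
    if not isinstance(tree, tuple):
        out[tree] = prob
        return out
    left, right = tree
    p = p_left(frozenset(leaves(left)), frozenset(leaves(right)))
    class_probs(left, p_left, prob * p, out)
    class_probs(right, p_left, prob * (1.0 - p), out)
    return out

def ensemble_probs(classes, p_left, n_trees=10, seed=0):
    """Average the class probability estimates of several random systems."""
    rng = random.Random(seed)
    totals = {c: 0.0 for c in classes}
    for _ in range(n_trees):
        tree = build_random_nd(classes, rng)
        for c, p in class_probs(tree, p_left).items():
            totals[c] += p / n_trees
    return totals

# Toy stand-in for a binary learner: it "knows" fixed class weights for
# one instance and answers each two-class question consistently with them.
weights = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}

def toy_p_left(left, right):
    l = sum(weights[c] for c in left)
    r = sum(weights[c] for c in right)
    return l / (l + r)

if __name__ == "__main__":
    tree = build_random_nd(list(weights), random.Random(1))
    print(tree, count_internal(tree))
    print(ensemble_probs(list(weights), toy_p_left))
```

Because the toy estimator answers every two-class question consistently, each single tree already recovers the class weights and the ensemble only averages identical estimates. With real binary learners the individual trees would disagree, which is the situation in which averaging over several systems becomes useful.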
Publications
Friedel, C, Rückert, U, and Kramer, S
(2006).
Cost Curves for Abstaining Classifiers
In: Proceedings of the ICML 2006 Workshop on ROC Analysis in Machine Learning, Pittsburgh, PA.
Dong, L, Frank, E, and Kramer, S
(2005).
Ensembles of Balanced Nested Dichotomies for Multi-class Problems
In: Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2005), pp. 84-95.
Frank, E and Kramer, S
(2004).
Ensembles of Nested Dichotomies for Multi-Class Problems
In: Proceedings of the 21st International Conference on Machine Learning (ICML-2004), pp. 305-312, ACM Press.
