Adapted Transfer of Distance Measures
Information and datasets for the 2010 Discovery Science Publication and the 2012 Computer Journal Submission
Computer Journal 2012 (submitted for Special Issue on Discovery Science)
Quantitative structure-activity relationships (QSARs) are regression models relating chemical structure to biological activity. Such models make it possible to predict toxicologically or pharmacologically relevant endpoints, which constitute the target outcomes of trials or experiments. The task is often tackled by instance-based methods (such as k-nearest neighbor), which all rely on a notion of chemical (dis-)similarity. Our starting point is the observation by Raymond and Willett that the two big families of chemical distance measures, fingerprint-based and maximum-common-subgraph-based measures, provide orthogonal information about chemical similarity. The paper presents a novel method for finding suitable combinations of them, called adapted transfer, which adapts a distance measure learned on another, related dataset to a given dataset. Adapted transfer thus combines distance learning and transfer learning in a novel manner. In a set of experiments, we compare adapted transfer with distance learning on the target dataset itself and with inductive transfer without adaptations. We visualize the performance of the methods by learning curves (i.e., performance as a function of training set size) and present a quantitative comparison for 10% and 100% of the maximum training set size. Additionally, we present an approach to select the source task in a data-driven manner.
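To make the idea of combining the two distance families more concrete, the following is a minimal sketch in Python. It is not the method from the paper or the released Matlab code. It assumes, purely for illustration, that the combination is a convex mix controlled by a hypothetical weight alpha, that fingerprints are given as sets of bit indices, and that the maximum-common-subgraph distances are already available as a precomputed matrix. All function names are hypothetical.

```python
from typing import List, Sequence, Set


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Fingerprint-based distance: 1 - |a & b| / |a | b| on sets of 'on' bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def combined_distance(d_fp: float, d_mcs: float, alpha: float) -> float:
    """Convex combination of the two distances (assumed form, alpha in [0, 1])."""
    return alpha * d_fp + (1.0 - alpha) * d_mcs


def knn_predict(dists: Sequence[float], targets: Sequence[float], k: int) -> float:
    """k-nearest-neighbor regression: mean target of the k closest training items."""
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    return sum(targets[i] for i in nearest) / len(nearest)


def loo_error(d_fp: List[List[float]], d_mcs: List[List[float]],
              y: Sequence[float], alpha: float, k: int) -> float:
    """Leave-one-out mean squared error of k-NN under the combined distance.

    d_fp and d_mcs are precomputed n x n distance matrices (an assumption).
    """
    n = len(y)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        dists = [combined_distance(d_fp[i][j], d_mcs[i][j], alpha) for j in others]
        pred = knn_predict(dists, [y[j] for j in others], k)
        total += (pred - y[i]) ** 2
    return total / n
```

In such a sketch, a weight found on a related source dataset could serve as the starting point for the target dataset, with `loo_error` evaluated on the target data to judge candidate values of alpha. How the paper actually learns and adapts the distance measure is described in the publication listed below.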
Publications
Rückert, U., Girschick, T., Buchwald, F., and Kramer, S. (2010). Adapted Transfer of Distance Measures for Quantitative Structure-Activity Relationships. In: Proceedings of the 13th International Conference on Discovery Science, ed. by B. Pfahringer, G. Holmes, A. Hoffmann, LNCS/LNAI vol. 6332, pp. 341-355, Springer.
Files
- adaptedTransfer_kramer2010.zip: Datasets for the Discovery Science 2010 Adapted Transfer publication experiments.
- adaptedTransfer_kramer2010.tar.gz: Datasets for the Discovery Science 2010 Adapted Transfer publication experiments.
- AdaptedTransfer_CJ.zip: Datasets for the Computer Journal submission.
- AdaptedTransfer_CJ.tar.gz: Datasets for the Computer Journal submission.
- adapted_transfer_source_code.tar.gz: Matlab source code for the Adapted Transfer publication experiments.
