Fabian Buchwald, Tobias Girschick, Madeleine Seeland, and Stefan Kramer (2011)
Using Local Models to Improve (Q)SAR Predictivity
Molecular Informatics, 30(2-3):205-218.
We present a novel (Q)SAR approach that detects groups of structures suitable for local (Q)SAR modeling. The algorithm combines clustering with classification or regression to make predictions on chemical structure data. As a preprocessing step, a clustering procedure groups compounds into clusters with shared structural scaffolds; a local model is then learned for each relevant cluster. Instead of relying on a single global model (the classical approach), predictions for a query compound are made by local models weighted according to the compound's cluster memberships. The approach is evaluated against standard statistical (Q)SAR algorithms on various datasets. The results show that in many cases local models significantly improve predictive power compared with the classical approach, with models induced by a fingerprint-based or a hierarchical clustering approach, and with locally weighted learning.
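
The idea of cluster-wise local models with weighted prediction can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering procedure, the learners, or the weighting scheme. The sketch assumes cluster memberships are already given, uses a single numeric descriptor with a least-squares line as each local model, and falls back to the global model when a query compound belongs to no modeled cluster. All names (fit_line, fit_local_models, predict, min_size) are hypothetical.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single descriptor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate case: constant descriptor, predict the mean activity.
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_local_models(xs, ys, memberships, min_size=3):
    """Fit one local model per cluster that has at least min_size members.

    memberships[i] is the collection of cluster ids of training compound i
    (assumed to come from a separate structural clustering step; a compound
    may belong to more than one cluster).
    """
    by_cluster = defaultdict(list)
    for i, clusters in enumerate(memberships):
        for c in clusters:
            by_cluster[c].append(i)
    models = {}
    for c, idx in by_cluster.items():
        if len(idx) >= min_size:
            models[c] = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
    return models


def predict(x, query_weights, local_models, global_model):
    """Weighted average of local predictions; global model as fallback.

    query_weights maps cluster id -> membership weight of the query compound
    (the weighting scheme itself is an assumption of this sketch).
    """
    num = 0.0
    den = 0.0
    for c, w in query_weights.items():
        if c in local_models and w > 0:
            a, b = local_models[c]
            num += w * (a * x + b)
            den += w
    if den == 0:
        a, b = global_model
        return a * x + b
    return num / den


if __name__ == "__main__":
    # Toy data: two clusters with opposite structure-activity trends.
    xs = [1, 2, 3, 1, 2, 3]
    ys = [2, 4, 6, 10, 9, 8]
    memberships = [{"A"}, {"A"}, {"A"}, {"B"}, {"B"}, {"B"}]
    local_models = fit_local_models(xs, ys, memberships)
    global_model = fit_line(xs, ys)
    print(predict(4, {"A": 1.0}, local_models, global_model))
    print(predict(2, {}, local_models, global_model))
```

In the toy data the two clusters follow opposite trends, so a single global line fits neither well, while each local line fits its own cluster exactly. This is the kind of situation in which local models can be expected to help.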