Challenge:
The November challenge was to create the highest quality statistical activity model for a data set representing chemical compounds and their physical, chemical and structural properties.
Solution:
Lee Herman (Lee Herman Consulting) submitted a clever solution. He applied four different classification methods, normalized each method's output into a probability score, and then summed the four scores to get a composite (ensemble) score. The normalization is essential because it puts the classifier predictions on a common scale before they are summed. As often happens, the composite model does better than any of the individual models.
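The normalize-then-sum idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the submitted solution: the write-up does not say which four classifiers were used or how the scores were normalized, so the min-max rescaling, the model names, and the numbers below are all hypothetical.

```python
def min_max_normalize(scores):
    """Rescale raw scores to [0, 1] (assumed normalization, not the author's)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score carries no ranking information; use a neutral value.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_scores(raw_scores_by_model):
    """Sum the normalized scores of every model, giving one composite per compound."""
    normalized = [min_max_normalize(s) for s in raw_scores_by_model.values()]
    # zip(*normalized) groups the scores that belong to the same compound.
    return [sum(per_compound) for per_compound in zip(*normalized)]


# Hypothetical raw outputs of four classifiers for three compounds.
# Each model reports on a different scale, which is why normalization matters.
raw = {
    "model_a": [0.2, 0.9, 0.4],
    "model_b": [-3.0, 5.0, 1.0],
    "model_c": [10.0, 80.0, 20.0],
    "model_d": [0.0, 1.0, 0.0],
}

composite = ensemble_scores(raw)
# Rank compounds from most to least likely active by composite score.
ranking = sorted(range(len(composite)), key=lambda i: composite[i], reverse=True)
```

Without the rescaling step, model_c's scores would dominate the sum simply because its numbers are larger, not because it is more reliable.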