Bayesian Classifier for AntiTargets

Given the Pipeline Pilot Learn Good Molecules component and a sufficiently large set of molecules that are active and inactive against a particular target, one can readily construct a classifier for activity against that target.  As a result of this process, one obtains panels of substructures with positive and negative Bayesian scores (normalized log-odds probabilities) that may be used to help guide chemists towards molecules that are active against the target.  Positive scores are typically associated with substructures found in active molecules, and negative scores with substructures found in inactive molecules.
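To make the question concrete, here is a minimal Python sketch of how per-substructure scores of this kind might be computed. It assumes the Laplacian-corrected estimator commonly described for this type of model, score = log((A + 1) / (N * P + 1)), where N is the number of molecules containing a feature, A is how many of those are active, and P is the overall fraction of actives. I am not claiming this is exactly what the component does internally; the function name and the toy data are hypothetical.

```python
import math
from collections import Counter

def laplacian_scores(fingerprints, labels):
    """Return {feature: score} for the class flagged True in `labels`.

    fingerprints: list of sets of feature identifiers (one set per molecule)
    labels:       list of bools, True = active
    Assumed estimator: log((A + 1) / (N * P + 1)).
    """
    base_rate = sum(labels) / len(labels)   # P: overall fraction of actives
    seen = Counter()                        # N: molecules containing the feature
    seen_active = Counter()                 # A: actives containing the feature
    for feats, is_active in zip(fingerprints, labels):
        for f in feats:
            seen[f] += 1
            if is_active:
                seen_active[f] += 1
    return {
        f: math.log((seen_active[f] + 1) / (seen[f] * base_rate + 1))
        for f in seen
    }

# Hypothetical toy set: "a" occurs only in actives, "c" only in inactives,
# "b" occurs once in each class.
fingerprints = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}]
labels = [True, True, False, False]
scores = laplacian_scores(fingerprints, labels)
```

Under this assumed estimator, a feature never seen in an active molecule scores log(1 / (N * P + 1)). When actives are rare (small P), that value stays close to zero unless N is large, so a negative score may carry much less weight than a positive one built from the same number of molecules.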

However, suppose the target is not desirable, i.e., it is an "antitarget."  Are the panels of substructures with negative Bayesian scores reliable guides for constructing molecules that do not bind to or interact with the target?  Or is this problematic, somewhat like the textbook logical fallacy of treating a lack of evidence of activity as evidence of inactivity?

Hmmm... would attempting to construct a model for inactivity, i.e., using pIC50 <= (some value) as the class definition, make any difference?  Or is this just a "practical issue," i.e., predict new molecules, have them tested, and if the model is correct, great; if not, then ... oh well.
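As a rough way to probe the first of these questions, the sketch below labels a toy set with a pIC50 cutoff and scores each feature twice: once with "active" as the learned class and once with "inactive" as the learned class. It assumes a Laplacian-corrected estimator, log((A + 1) / (N * P + 1)), which may or may not match what the component does; the function name, the feature sets, the pIC50 values, and the 5.0 cutoff are all hypothetical.

```python
import math

def class_scores(fingerprints, in_class):
    """Assumed Laplacian-corrected score per feature for the class flagged True."""
    base_rate = sum(in_class) / len(in_class)
    features = set().union(*fingerprints)
    scores = {}
    for f in features:
        # Class flags of the molecules that contain feature f.
        hits = [flag for feats, flag in zip(fingerprints, in_class) if f in feats]
        scores[f] = math.log((sum(hits) + 1) / (len(hits) * base_rate + 1))
    return scores

# Hypothetical toy data: one feature set and one measured pIC50 per molecule.
fingerprints = [{"a"}, {"a", "b"}, {"b"}, {"b", "c"}, {"c"}, {"c"}]
pic50 = [7.2, 6.8, 5.1, 4.9, 4.5, 4.0]
threshold = 5.0  # assumed cutoff: pIC50 <= threshold counts as inactive

inactive = [value <= threshold for value in pic50]
active = [not flag for flag in inactive]

activity_scores = class_scores(fingerprints, active)
inactivity_scores = class_scores(fingerprints, inactive)

for f in sorted(activity_scores):
    print(f, round(activity_scores[f], 3), round(inactivity_scores[f], 3))
```

With this estimator the two score sets are not simply negatives of each other, so modeling inactivity directly does change the numbers. Whether it changes the reliability of the guidance is a separate matter, since both models are built from the same measurements.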

I am very curious how folks deal with constructing meaningful ligand-based models for antitargets or undesirable activities.  How do you construct a model for something you don't want when the training set is limited?  And, of course, all data sets are limited in one way or another!

Please share your thoughts, ideas, comments, interesting publication references, etc.

Thank you.

Regards,

Jim Metz

James.Metz@AbbVie.com