I need a slightly different MCSS (maximum common substructure) clustering / fuzzy matching algorithm.
I can read in a set of, say, 20 small molecules that I would like to use as query (core) molecules. These
structures may be considered as fuzzy substructures. They have already been chosen by chemists
and hence I do not want a computer to re-decide any substructures.
I would then like to read in a database of molecules and cluster the molecules in that database
according to their substructure similarity to the query molecules.
This is not a substructure matching problem, because there may be one or more atoms
in my database molecules that don't match my pre-defined query molecules. However
the algorithm should attempt to make the matches as close as possible within defined
criteria. If the similarity is not sufficient, the unmatched molecules are all placed together in a
single separate group.
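
To make the intended grouping concrete, here is a minimal Python sketch of the assignment logic only. It is not a Pipeline Pilot protocol. Every name in it (`tanimoto`, `assign_to_cores`, the feature labels, the 0.5 threshold) is hypothetical, and the Tanimoto score over hand-written feature sets is just a stand-in for a real MCSS-based similarity, which could be passed in through the `similarity` argument instead.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets (stand-in for an MCSS score)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_to_cores(cores, database, similarity=tanimoto, threshold=0.5):
    """Group each database molecule under its most similar predefined core.

    Molecules whose best score is below `threshold` go to "unmatched".
    The cores are never modified or re-derived.
    """
    groups = defaultdict(list)
    for mol_id, mol in database.items():
        best_core, best_score = None, -1.0
        for core_id, core in cores.items():
            score = similarity(core, mol)
            if score > best_score:
                best_core, best_score = core_id, score
        key = best_core if best_score >= threshold else "unmatched"
        groups[key].append((mol_id, round(best_score, 2)))
    return dict(groups)


# Toy data: each molecule is described by an invented set of feature labels.
cores = {
    "core_A": {"ring6", "N", "C=O"},
    "core_B": {"ring5", "S", "OH"},
}
database = {
    "mol_1": {"ring6", "N", "C=O", "Cl"},  # one extra atom type vs. core_A
    "mol_2": {"ring5", "S", "F"},          # partial overlap with core_B
    "mol_3": {"ring7", "Br"},              # no overlap with either core
}

groups = assign_to_cores(cores, database, threshold=0.5)
for name, members in groups.items():
    print(name, members)
```

With this toy data, mol_1 lands under core_A, mol_2 under core_B, and mol_3 in the unmatched group. The point of the design is that the similarity measure and the cutoff are the only adjustable parts; the core set stays exactly as the chemists defined it.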
I have checked in several locations in the PP User Forum for protocols that would perform
this type of analysis, but I could not find anything that specifically uses a pre-defined set
of core molecules.
Does anyone have ideas on how to do this, or a protocol that they are willing to share?
Thank you.
Regards,
Jim Metz