Is anyone aware of any cheminformatics publications that clearly demonstrate good agreement between
clustering and self-organizing maps (SOMs)? In other words, if a set of molecules is in cluster #1, I would
expect them to fall on either the same SOM grid point or a close set of grid points. I would not expect
the molecules in cluster #1 to be spread out across the SOM.
If someone has a Pipeline Pilot (PP) protocol that demonstrates good agreement between these two
approaches and is willing to share it, that would be helpful. So far, I have not been able to demonstrate
what I intuitively expect to be reasonable agreement, even though I use the same input set of molecules
and the same set of molecular descriptors for clustering and for SOM generation, but ... perhaps I am
not doing something correctly!
Details: I am processing a set of ~1500 molecules. The descriptors are ECFP_6, Num_Rings,
Num_AromaticRings, and Num_RotatableBonds. I am using the R stats SOM component with
a rectangular grid size of 30 x 30, and the PP cluster molecules component, again using
exactly the same set of molecular descriptors.
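
To make concrete what I mean by "agreement", here is a minimal Python sketch, not part of my PP
protocol. It assumes each molecule's cluster ID and SOM grid cell (column, row) have been exported.
The function names and the toy data are hypothetical. For each cluster it computes the mean pairwise
grid distance between members, so a cluster that sits on one or a few neighboring grid points scores
low compared with the overall spread.

```python
import math
from collections import defaultdict
from itertools import combinations
from statistics import mean


def som_spread_by_cluster(cluster_ids, som_cells):
    """Mean pairwise Euclidean grid distance between the members of each cluster.

    cluster_ids: one cluster label per molecule.
    som_cells:   one (column, row) SOM grid cell per molecule, in the same order.
    Clusters with a single member get a spread of 0.0.
    """
    members = defaultdict(list)
    for cluster_id, cell in zip(cluster_ids, som_cells):
        members[cluster_id].append(cell)

    spread = {}
    for cluster_id, cells in members.items():
        if len(cells) < 2:
            spread[cluster_id] = 0.0
        else:
            spread[cluster_id] = mean(
                math.dist(a, b) for a, b in combinations(cells, 2)
            )
    return spread


def overall_spread(som_cells):
    """Mean pairwise grid distance over all molecules, as a baseline."""
    return mean(math.dist(a, b) for a, b in combinations(som_cells, 2))


# Toy data (hypothetical): cluster 1 is compact on the grid, cluster 2 is scattered.
cluster_ids = [1, 1, 1, 2, 2, 2]
som_cells = [(0, 0), (0, 1), (1, 0), (29, 29), (28, 29), (0, 15)]

spread = som_spread_by_cluster(cluster_ids, som_cells)
baseline = overall_spread(som_cells)

for cluster_id in sorted(spread):
    print(f"cluster {cluster_id}: mean grid distance {spread[cluster_id]:.2f} "
          f"(baseline {baseline:.2f})")
```

If the two methods agree in the sense I describe above, I would expect most clusters to show a
spread well below the baseline.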
Thank you.
Regards,
Jim Metz