Using Matched Molecular Pairs to cluster compounds

Attached is a proof-of-concept protocol showing how Matched Molecular Pairs (MMPs) can be used to cluster compounds. The idea is to put all compounds that share a common core into one cluster. This results in clusters in which every member forms an MMP with every other member. In the example dataset, using one-cut MMPs, 2869 compounds contained 2595 unique common cores. Many of these cores are substructures of other cores, however, and once that is taken into account only 430 unique cores remain (phenyl was removed from this set manually). Because a compound can share different cores with different sets of compounds, this clustering method allows one compound to be assigned to more than one cluster.
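The grouping idea can be sketched outside Pipeline Pilot in a few lines of Python. This is only an illustration of the principle, not the attached protocol: it assumes the MMP fragmentation has already been done and that each compound comes with the set of cores it yields. The function name cluster_by_core, the min_size parameter, and the compound IDs and core labels are all hypothetical, and the step that merges cores that are substructures of other cores is not shown.

```python
from collections import defaultdict


def cluster_by_core(cores_by_compound, min_size=2):
    """Group compound IDs by shared core.

    cores_by_compound maps a compound ID to the cores obtained from its
    one-cut fragmentations (assumed to be precomputed elsewhere).
    Returns a dict mapping each core to the sorted list of compounds that
    contain it. Cores seen in fewer than min_size compounds are dropped,
    because a single compound cannot form a matched pair.
    """
    clusters = defaultdict(set)
    for compound_id, cores in cores_by_compound.items():
        for core in cores:
            # A compound is added to the cluster of every core it yields,
            # so it can end up in more than one cluster.
            clusters[core].add(compound_id)
    return {
        core: sorted(members)
        for core, members in clusters.items()
        if len(members) >= min_size
    }


# Toy input: the labels stand in for real core structures (hypothetical data).
example = {
    "cpd1": {"coreA", "coreB"},
    "cpd2": {"coreA"},
    "cpd3": {"coreB", "coreC"},
    "cpd4": {"coreB"},
}

clusters = cluster_by_core(example)
for core in sorted(clusters):
    print(core, clusters[core])
```

In this toy input, cpd1 appears in the clusters for both coreA and coreB, while coreC is dropped because only one compound contains it.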

The attached zip file contains the protocol, an example input file and a PowerPoint presentation that explains the method in more detail.

Requirements: Pipeline Pilot 8.5 CU2 or later; Chemistry and Reporting collections