Name: Clustering Tools: Cluster Molecules by Threshold; Highlight & Align by Cluster MCS
Author: Christian Herhaus
Version: 1.0
Created: 8/2012
Modified: 14/11/2013
Purpose: These tools add an alternative clustering approach and extend the existing cluster functionality with cluster postprocessing. They address common cluster tasks such as generating and highlighting Maximum Common Substructures (MCS's), aligning structures by MCS, and calculating some statistical measures. In addition, MCS's can be extracted as separate records with attached lists of cluster members for further analysis.
In contrast to the Cluster Molecules component packaged with Pipeline Pilot, the Cluster Molecules by Threshold component defines clusters by a minimum descriptor similarity threshold. This circumvents limitations imposed by the AvgNumberPerCluster or NumberOfCluster definitions of the shipped component.
Features of Cluster Molecules by Threshold:
- Clusters molecules by a minimum similarity threshold after calculating a similarity matrix
- Two methods: one favouring accuracy, one favouring speed
- Start cluster population with small or large clusters
- Optionally, assign remaining singletons
- Molecules falling below the similarity threshold are sent to the Fail port
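To illustrate the general idea behind threshold-based clustering, the following is a minimal Python sketch and not the component's actual implementation. It assumes fingerprints are given as sets of feature identifiers, uses Tanimoto similarity, and applies a greedy neighbour-count scheme (similar in spirit to Taylor-Butina clustering). The function names and the `largest_first` parameter are hypothetical and only mirror the "small or large clusters" option listed above.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets of feature identifiers."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_threshold(fingerprints, threshold, largest_first=True):
    """Greedy threshold clustering (illustrative sketch only).

    Returns (clusters, failed): clusters is a list of sorted index lists,
    failed holds indices with no neighbour at or above the threshold.
    """
    n = len(fingerprints)

    # Build neighbour sets from the pairwise similarity matrix.
    neighbours = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
            neighbours[i].add(j)
            neighbours[j].add(i)

    # Molecules without any sufficiently similar partner "fail".
    failed = [i for i in range(n) if not neighbours[i]]
    unassigned = set(range(n)) - set(failed)

    pick = max if largest_first else min
    clusters = []
    while unassigned:
        # Choose the molecule with the most (or fewest) unassigned
        # neighbours; iterating in sorted order breaks ties by index.
        centre = pick(sorted(unassigned),
                      key=lambda i: len(neighbours[i] & unassigned))
        members = {centre} | (neighbours[centre] & unassigned)
        clusters.append(sorted(members))
        unassigned -= members
    return clusters, failed


# Toy data: hypothetical feature sets, not real fingerprints.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 6}, {7, 8, 9}, {7, 8, 10}, {20, 21}]
clusters, failed = cluster_by_threshold(fps, threshold=0.5)
```

In this sketch a molecule whose neighbours have all been claimed by earlier clusters ends up as a singleton cluster; reassigning such singletons to the nearest existing cluster would be a separate, optional step.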
Features of Highlight & Align by Cluster MCS:
- Works with the Cluster Molecules by Threshold component as well as with the regular Cluster Molecules component
- Maximum Common Substructures (MCS's) are generated from Cluster IDs
- MCS's are tagged by a new property 'IsMCS'
- Optionally, colour-highlight MCS's
- Optionally, align full structures of clusters by MCS's
- Optionally, calculate ClusterSize and AverageDistance (from DistanceToClosest; for use with regular Cluster Molecules component only)
- Optionally, MCS's are sent to the Pass port as separate records
- Optionally, MCS's are colour-highlighted to be distinguished from full structures
- Optionally, ID lists of cluster members are appended to MCS records
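The per-cluster statistics mentioned above can be pictured with a small Python sketch. It assumes that AverageDistance is the mean of DistanceToClosest over the members of a cluster, which is an interpretation and not confirmed by the component itself. The record layout (a list of dictionaries) and the `ClusterID` key name are hypothetical.

```python
from collections import defaultdict


def add_cluster_stats(records, id_key="ClusterID"):
    """Attach ClusterSize and AverageDistance to every record.

    Assumption: AverageDistance is the arithmetic mean of the
    DistanceToClosest values within one cluster.
    """
    # Group records by their cluster identifier.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[id_key]].append(rec)

    # Compute the statistics once per cluster and copy them to each member.
    for members in groups.values():
        size = len(members)
        average = sum(r["DistanceToClosest"] for r in members) / size
        for r in members:
            r["ClusterSize"] = size
            r["AverageDistance"] = average
    return records


# Hypothetical example records.
records = [
    {"ID": "mol1", "ClusterID": 1, "DistanceToClosest": 0.0},
    {"ID": "mol2", "ClusterID": 1, "DistanceToClosest": 0.25},
    {"ID": "mol3", "ClusterID": 1, "DistanceToClosest": 0.5},
    {"ID": "mol4", "ClusterID": 2, "DistanceToClosest": 0.0},
]
add_cluster_stats(records)
```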
The example protocol demonstrates the differences in results of the two different cluster methods as well as some functionality of the Highlight & Align by Cluster MCS component.
Requirements: Pipeline Pilot 8.5 or later
O/S: Windows and Linux
Limitations: none known
Keywords (tags): pipeline_pilot, library, cluster, similarity, threshold, maximum common substructure, MCS, highlight, align
Contents: Cluster Molecules by Threshold.xml, Highlight and Align by Cluster MCS.xml, Clustering Tools Example.xml
Installation:
1. Unzip the archive.
2. Import the example protocol and/or the components into your user tab in the Pipeline Pilot client by dragging and dropping them in the Explorer window.
3. Open the example protocol.
4. Run the protocol and follow the instructions to explore the functionality.
14.11.2013: The previous version contained a minor bug that did not affect results but prevented the protocol from running under Pipeline Pilot v9.1. This has been corrected.