Name: Finding Duplicate Molecules
Author: Chris Farmer
Version: 2.0
Created: 11/2002
Modified: 7/2007 – Components upgraded, and help text revised/added.
Purpose: Compare two libraries and identify the molecules that are present in both.
Description: The two files containing the molecule libraries are read into Pipeline Pilot, and duplicates are removed from each file individually. Removing duplicates within each library first ensures that a molecule repeated inside a single library is not later mistaken for one shared by both. This protocol uses the NCI and Maybridge libraries as examples, but any set of molecules can be substituted. Next, the Merge Molecules component merges identical molecules into the same data record. Note that Merge Molecules has the "OutputFrequency" parameter set to "True". This causes the component to add a "Frequency" property to each output record, giving the number of source records that were merged to produce it. The Frequency property is used in the next step to select all molecules with a Frequency > 1. These are the molecules that occur in both input libraries (e.g. NCI and Maybridge).
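The logic described above can be sketched outside Pipeline Pilot as follows. This is only an illustration in Python, not the protocol itself or the Merge Molecules component's API. It assumes each molecule has already been reduced to a canonical identifier string (for example a canonical SMILES), so that identical molecules compare equal as text. The function name find_shared_molecules and the sample data are hypothetical.

```python
from collections import Counter


def find_shared_molecules(library_a, library_b):
    """Return identifiers present in both libraries, sorted.

    Assumes each entry is a canonical identifier string, so that
    identical molecules compare equal as text.
    """
    # Step 1: remove duplicates within each library individually.
    unique_a = set(library_a)
    unique_b = set(library_b)

    # Step 2: merge identical molecules and record how many source
    # records contributed to each one (the "Frequency" property).
    frequency = Counter(unique_a)
    frequency.update(unique_b)

    # Step 3: keep only molecules with Frequency > 1.
    return sorted(mol for mol, count in frequency.items() if count > 1)


# Hypothetical sample data standing in for the two input libraries.
library_one = ["CCO", "c1ccccc1", "CCO", "CC(=O)O"]
library_two = ["c1ccccc1", "CCN", "CC(=O)O", "CCN"]

shared = find_shared_molecules(library_one, library_two)
print(shared)  # ['CC(=O)O', 'c1ccccc1']
```

Because each library is de-duplicated before merging, "CCO" (repeated only in the first list) and "CCN" (repeated only in the second) both end up with a frequency of 1 and are excluded.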
Requirements: Pipeline Pilot 6.1.1, Chemistry Collection
Limitations: None
Keyword(s): Find Duplicate Molecules
Contents: Finding Duplicate Molecules.xml
Installation: Drag the protocol into the Pipeline Pilot client workspace, or drag and drop it onto one of the Pipeline Pilot client explorer tabs to import it directly into the protocol database.