Pipeline Pilot Challenge / Ring System Complexity descriptor

I was a little disappointed to find that the Pipeline Pilot challenge from the old Pipeline Pilot newsletter (http://accelrys.com/products/pipeline-pilot/newsletter/) is not included in the eFlash (http://accelrys.com/eflash/).  Some of the questions on the forum are challenges, but they are often more issue-oriented than the newsletter challenges were.  I came across a puzzle possibly worthy of the challenge name, though perhaps not as abstract as some of the previous challenges.

In two recent papers [1, 2] from AstraZeneca, two metrics were calculated and correlated with promiscuity.  The percentage of a compound represented by the Bemis-Murcko framework was calculated in Pipeline Pilot; however, the authors used the OEChem toolkit to calculate a topology class (RSC: ring system complexity) based on the number of terminal rings and the presence of a molecular bridge.  There could be a speed consideration for using OEChem (the calculation was implemented in C++), but I was wondering if it could be prototyped in Pipeline Pilot, either with the molecular toolkit or with standard components (which may be accessible to a broader range of users).  I have an implementation that I think does the same thing [3, 4], but it’d be interesting to see what others would come up with.

Your challenge: reproduce the RSC descriptor in Pipeline Pilot.  The RSC descriptor is of the form xTR[+B], where x is the number of terminal ring systems in the Bemis-Murcko framework, and +B is appended if those ring systems are connected through any atoms at all rather than bonded directly to one another (2 appears to be the largest value of x for which +B can be absent, as there is no way to connect 3 or more terminal ring systems without connecting atoms).  Unfortunately, I can't offer a prize other than the glory you'll receive for posting your protocol here :-)

molecular_frameworks.png

For an added challenge, the second paper [2] expands the descriptor to xTR[+B_y], where y is the number of ring systems remaining once all terminal ring systems have been removed from the Bemis-Murcko framework.  (I'm still thinking about how to do this in Pipeline Pilot.)
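
To make the two definitions above concrete, here is a minimal sketch in plain Python (not Pipeline Pilot, and not the OEChem implementation from the papers).  It only follows the wording in this post, and it rests on several assumptions: the molecule is a single connected heavy-atom graph given as a list of (i, j) atom-index pairs with each bond listed once, a ring system is a set of atoms joined by ring bonds (so fused and spiro rings count as one system), a terminal ring system is one with at most one non-ring bond leading into the rest of the framework, and a lone ring system counts as terminal.  The names rsc, rsc_label and ring6 are made up for illustration.

```python
from collections import defaultdict


def _connected(adj, start, goal, skip):
    """True if goal can be reached from start without using the bond `skip`."""
    seen, stack = {start}, [start]
    while stack:
        a = stack.pop()
        if a == goal:
            return True
        for b in adj[a]:
            if frozenset((a, b)) != skip and b not in seen:
                seen.add(b)
                stack.append(b)
    return False


def rsc(bonds):
    """Return (x, has_bridge, y) for a molecule given as (i, j) bonds."""
    adj = defaultdict(set)
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)

    # A bond is a ring bond if its atoms stay connected when it is removed.
    ring_bonds = {frozenset(b) for b in bonds
                  if _connected(adj, b[0], b[1], frozenset(b))}
    ring_atoms = {a for b in ring_bonds for a in b}

    # Framework: repeatedly strip non-ring atoms with at most one neighbour.
    frame = set(adj)
    changed = True
    while changed:
        changed = False
        for a in list(frame):
            if a not in ring_atoms and len(adj[a] & frame) <= 1:
                frame.discard(a)
                changed = True

    # Ring systems: connected components over ring bonds only.
    system_of, n_systems = {}, 0
    for seed in ring_atoms:
        if seed in system_of:
            continue
        system_of[seed] = n_systems
        stack = [seed]
        while stack:
            a = stack.pop()
            for b in adj[a]:
                if frozenset((a, b)) in ring_bonds and b not in system_of:
                    system_of[b] = n_systems
                    stack.append(b)
        n_systems += 1

    # Count the non-ring framework bonds leaving each ring system.
    exits = [0] * n_systems
    for i, j in bonds:
        if frozenset((i, j)) in ring_bonds or i not in frame or j not in frame:
            continue
        for a in (i, j):
            if a in system_of:
                exits[system_of[a]] += 1

    x = sum(1 for e in exits if e <= 1)   # terminal ring systems
    y = n_systems - x                     # ring systems left in the bridge
    # Bridge: any framework atom that is not part of a terminal ring system.
    has_bridge = any(a not in system_of or exits[system_of[a]] > 1
                     for a in frame)
    return x, has_bridge, y


def rsc_label(bonds, with_y=False):
    """Format the result as xTR[+B], or xTR[+B_y] when with_y is True."""
    x, has_bridge, y = rsc(bonds)
    label = f"{x}TR"
    if has_bridge:
        label += f"+B_{y}" if with_y else "+B"
    return label


def ring6(start):
    """Bonds of a six-membered ring on atoms start .. start + 5."""
    return [(start + k, start + (k + 1) % 6) for k in range(6)]


biphenyl = ring6(0) + ring6(6) + [(0, 6)]
diphenylmethane = ring6(0) + ring6(6) + [(0, 12), (12, 6)]
terphenyl = ring6(0) + ring6(6) + ring6(12) + [(0, 6), (9, 12)]

print(rsc_label(biphenyl))                 # 2TR
print(rsc_label(diphenylmethane))          # 2TR+B
print(rsc_label(terphenyl, with_y=True))   # 2TR+B_1
```

The ring-bond test (remove a bond, check whether its ends are still connected) is quadratic and chosen only for brevity; a toolkit's own ring perception would replace it in practice.  Whether the papers treat edge cases such as exocyclic double bonds or spiro junctions the same way is not something this sketch tries to settle.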

1.  Yidong Yang, Hongming Chen, Ingemar Nilsson, Sorel Muresan, and Ola Engkvist, "Investigation of the Relationship between Topology and Selectivity for Druglike Molecules", J. Med. Chem., 2010, 53, 7709-7714.  http://dx.doi.org/10.1021/jm1008456.

2.  Hongming Chen, Yidong Yang, and Ola Engkvist, "Molecular Topology Analysis of the Differences between Drugs, Clinical Candidate Compounds, and Bioactive Molecules", J. Chem. Inf. Model., 2010, ASAP.  http://dx.doi.org/10.1021/ci1002558.

3.  One interesting idea that came up: a component that does "merge data assuming prior sort" could be useful in terms of speed and memory, and could have advantages over a run-to-completion subprotocol or a standard merge component.

4.  Hint: I found the following useful, with the disclaimer that, as I've not posted my protocol yet, it has not been checked and may not be correct.

[*R:1]!@[*:2]>>[*:1].[*:2]
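
As a rough illustration of what the transform in the hint does, the sketch below mimics it on a bare graph in plain Python: every non-ring bond with at least one ring atom at an end is cut, and the pieces that remain are returned.  This is only my reading of the pattern, applied exhaustively; a real reaction engine may apply it one match at a time and will also deal with bond orders and hydrogens, which are ignored here.  The molecule is assumed to be a list of (i, j) atom-index pairs with each bond listed once, and the names fragments_after_cleavage, _components and ring6 are hypothetical.

```python
def _components(atoms, edges):
    """Connected components (as sets of atoms) using only the given edges."""
    adj = {a: set() for a in atoms}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, out = set(), []
    for seed in atoms:
        if seed in seen:
            continue
        comp, stack = {seed}, [seed]
        while stack:
            for b in adj[stack.pop()]:
                if b not in comp:
                    comp.add(b)
                    stack.append(b)
        seen |= comp
        out.append(comp)
    return out


def fragments_after_cleavage(bonds):
    """Cut every non-ring bond that touches a ring atom; return the pieces."""
    atoms = {a for b in bonds for a in b}

    def is_ring_bond(bond):
        # Ring bond: its two atoms stay connected after the bond is removed.
        rest = [e for e in bonds if e != bond]
        return any(bond[0] in c and bond[1] in c
                   for c in _components(atoms, rest))

    ring_bonds = [b for b in bonds if is_ring_bond(b)]
    ring_atoms = {a for b in ring_bonds for a in b}

    # Keep ring bonds, plus chain bonds with no ring atom at either end.
    kept = [b for b in bonds
            if b in ring_bonds
            or (b[0] not in ring_atoms and b[1] not in ring_atoms)]
    return sorted(sorted(c) for c in _components(atoms, kept))


def ring6(start):
    """Bonds of a six-membered ring on atoms start .. start + 5."""
    return [(start + k, start + (k + 1) % 6) for k in range(6)]


# Two six-membered rings joined by a three-atom chain (atoms 6, 7, 8).
bonds = ring6(0) + [(0, 6), (6, 7), (7, 8), (8, 9)] + ring6(9)
for fragment in fragments_after_cleavage(bonds):
    print(fragment)
# [0, 1, 2, 3, 4, 5]
# [6, 7, 8]
# [9, 10, 11, 12, 13, 14]
```

Each ring system comes out as its own fragment, so counting how many bonds were cut from a given ring system is one possible route to deciding whether it is terminal.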