How is HELM supported in BIOVIA Pipeline Pilot?
A set of Readers and Writers for HELM, as well as XHELM, is part of Pipeline Pilot Chemistry and the PPChem SDK.
Reading and writing HELM strings in BIOVIA applications requires that the source definition of every HELM monomer occurring in a given HELM string be available.
The HELM reader and writer operate by converting HELM strings to and from the SCSR representation (Self-Contained Sequence Representation, a MOL file extension for biological sequences). The components therefore need SCSR templates corresponding to all of the input HELM monomers, plus a file that maps those HELM monomers to the SCSR templates. The default set of three configuration files can be found on the Pipeline Pilot server under:
- chemistry\data\HELM\HELMMonomers.sd
- chemistry\data\HELM\SCSRTemplates.mol
- chemistry\data\HELM\HELM_SCSR_MonomerMap.txt
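Because conversion fails when a monomer has no matching template, it can help to check a HELM string against the known monomer IDs before passing it to the reader. The following Python code is a minimal sketch of that idea and is not part of Pipeline Pilot: the function names `extract_monomers` and `missing_monomers` are hypothetical, and the set of known IDs is supplied by the caller, since the layout of `HELM_SCSR_MonomerMap.txt` is not described here. It assumes simple HELM strings without inline SMILES, nested brackets, or HELM2 repeat and ambiguity syntax.

```python
import re

# Matches one simple polymer such as PEPTIDE1{A.G.[dF].C}
POLYMER_RE = re.compile(r"([A-Z]+\d+)\{([^}]*)\}")
# Matches a bracketed multi-character monomer ID or a single-letter ID
MONOMER_RE = re.compile(r"\[([^\]]+)\]|([A-Za-z])")


def extract_monomers(helm):
    """Return {polymer_id: [monomer IDs in order]} for a simple HELM string."""
    # The simple-polymer section is everything before the first '$'.
    polymer_section = helm.split("$", 1)[0]
    result = {}
    for polymer_id, body in POLYMER_RE.findall(polymer_section):
        # Each match is a (bracketed, single_letter) pair; one side is empty.
        result[polymer_id] = [m[0] or m[1] for m in MONOMER_RE.findall(body)]
    return result


def missing_monomers(helm, known_ids):
    """Return sorted (polymer_type, monomer_id) pairs absent from known_ids.

    known_ids is a caller-supplied collection of (polymer_type, monomer_id)
    tuples, e.g. built from whatever monomer map is in use (assumption).
    """
    used = set()
    for polymer_id, ids in extract_monomers(helm).items():
        # PEPTIDE1 -> PEPTIDE: the same ID can mean different monomers
        # in different polymer types, so IDs are checked per type.
        polymer_type = polymer_id.rstrip("0123456789")
        used.update((polymer_type, monomer_id) for monomer_id in ids)
    return sorted(used - set(known_ids))
```

For example, `missing_monomers("PEPTIDE1{A.G.[dF].C}$$$$", {("PEPTIDE", "A"), ("PEPTIDE", "G"), ("PEPTIDE", "C")})` reports `("PEPTIDE", "dF")` as the only monomer without a known entry.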
Why is HELM conversion into SCSR MOL important at all?
- it provides full connectivity information of contracted and expanded sequences
- it enables substructure, exact and flexmatch searches of sequences
- starting with Pipeline Pilot 2022, SCSR MOL is supported in reactions (RXN format)
Expanded set of monomers from the Pistoia Alliance
The Pistoia Alliance published a new library of common HELM monomers:
https://www.pistoiaalliance.org/helm/new-core-monomer-set-for-helm-on-github/
The new set of more than 700 monomers was curated (names cleaned up and shortened, duplicates removed, coordinates aligned) and used to create the corresponding SCSR templates for HELM/SCSR interconversion in Pipeline Pilot 2022.