2020 Pipeline Pilot Hackathon Challenge #2

The COVID-19 crisis has sparked a race to find a cure as quickly as possible. One of the quickest paths towards that goal is repurposing existing drugs. To support drug-repurposing initiatives, CAS, a division of the American Chemical Society, has made public a set of antiviral candidates (https://www.cas.org/covid-19-antiviral-compounds-dataset) that can be downloaded in SD format. This set contains existing antiviral drugs and compounds structurally similar to existing antivirals. You may also use any other public data source to accomplish the tasks below.

  1. Using publicly available bioassay data (a good source is https://reframedb.org/), build a predictive model that can identify compounds with potential activity against COVID-19.
  2. Using the model you built, virtually screen the compounds in the CAS data set indicated above and select the top-scoring compounds.
  3. Assess the toxicity and bioavailability of these top compounds (predicted and, where available, experimentally measured).
  4. Create a Pipeline Pilot component that checks whether the input compounds are patented. Assess the speed of your new component.
  5. Starting from the top-scoring compounds found above, generate new molecules using any method available in Pipeline Pilot, score them with your predictive model, and select any that score higher than the top compounds selected from the CAS data set. Assess toxicity, bioavailability, and patentability for these new compounds as well.
  6. Present your two sets of results (the top compounds from the CAS data set and the newly generated molecules) in a report.
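
The selection logic in steps 2 and 5 can be sketched outside Pipeline Pilot as follows. This is only an illustrative sketch in plain Python, not a Pipeline Pilot protocol: the function names `top_scoring` and `better_than_top`, the compound identifiers, and the scores are all hypothetical placeholders, and "score higher than the top compounds" is interpreted here as "score strictly above the best selected CAS compound", which is one possible reading of step 5.

```python
from typing import Dict, List, Tuple

Scored = List[Tuple[str, float]]


def top_scoring(scores: Dict[str, float], n: int) -> Scored:
    """Return the n highest-scoring (compound_id, score) pairs, best first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


def better_than_top(generated: Dict[str, float], top: Scored) -> Scored:
    """Keep generated molecules that score strictly above the best compound in `top`.

    Assumption: the threshold is the maximum score in `top`; a looser reading
    of the challenge would use the minimum score in `top` instead.
    """
    threshold = max(score for _, score in top)
    kept = [(cid, s) for cid, s in generated.items() if s > threshold]
    return sorted(kept, key=lambda kv: kv[1], reverse=True)


# Placeholder model scores (made up for illustration, not real predictions).
cas_scores = {"CAS-A": 0.91, "CAS-B": 0.42, "CAS-C": 0.77, "CAS-D": 0.65}
generated_scores = {"GEN-1": 0.95, "GEN-2": 0.80, "GEN-3": 0.30}

top_cas = top_scoring(cas_scores, n=2)
novel_hits = better_than_top(generated_scores, top_cas)

print(top_cas)
print(novel_hits)
```

In an actual submission the scores would come from the model built in step 1, and the two resulting lists correspond to the two result sets requested for the report.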