Life Sciences & Healthcare
September 28, 2020

Harnessing Data Science and AI for Drug Development Innovation

Irma Rastegayeva

The pharmaceutical industry is big business, often referred to as “big pharma”. It is also a highly competitive industry with great societal impact because of its role in public health and the overall economy.

Despite its size and maturity, the industry faces a growing “drug discovery problem”. While spending on drug discovery Research and Development (R&D) is increasing, regulatory approvals of new therapeutics have been declining. The mounting challenges of drug discovery fall into three major categories:

  • Time and cost
  • Failure rate
  • Saturated chemical IP space

Enter Data Science and AI

There is a growing trend across multiple industries toward finding ways to harness the power of data. Data Science and Artificial Intelligence (AI) are at the forefront of driving business transformation. When it comes to drug discovery and development, employing these technologies has become an existential challenge for the pharmaceutical industry. A 2017 McKinsey & Company article positioned the application of breakthrough digital technologies in R&D as “the $100 billion opportunity”.

So how can these technologies help pharmaceutical companies reduce the staggering expense and amount of time it takes to develop new therapeutics, getting there sooner and with fewer compounds needing to be tested? AI and Machine Learning (ML) are deployed to explore, faster, potential drug candidates that can meet a particular target product profile.

But while using new software applications to design new therapeutics completely virtually is highly desirable, it has not been entirely possible. Generative Therapeutics Design (GTD) is an AI platform that combines Virtual (in silico) and Real (in lab) development in a synergistic “active learning” system. This “Lab in the Loop AI” approach consists of the iteration and interplay of the following three system components:

  • Virtually designing new molecules through automated learning
  • Suggesting compounds to synthesize, improving ML models in the process
  • Progressing lead candidates from the virtual steps to the lab for synthesis and screening
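
The interplay of these three components can be illustrated with a toy active-learning loop. The sketch below is not GTD's implementation. It assumes each compound is reduced to a single number, uses a hypothetical `lab_assay` function in place of real synthesis and screening, and uses a deliberately simple nearest-neighbor `predict` function as the ML model. All names are illustrative.

```python
import random


def lab_assay(x):
    """Hypothetical stand-in for a wet-lab measurement; activity peaks at x = 0.7."""
    return 1.0 - (x - 0.7) ** 2


def predict(x, measured):
    """Toy surrogate model: reuse the activity of the nearest measured compound."""
    nearest = min(measured, key=lambda known: abs(known - x))
    return measured[nearest]


def lab_in_the_loop(rounds=5, designs_per_round=50, picks_per_round=3, seed=0):
    rng = random.Random(seed)
    # Start from two compounds that were already measured in the lab.
    measured = {x: lab_assay(x) for x in (0.1, 0.9)}
    for _ in range(rounds):
        # 1. Virtual: design a batch of new candidate compounds.
        candidates = [rng.random() for _ in range(designs_per_round)]
        # 2. Suggest: rank candidates by the model's predicted activity.
        ranked = sorted(candidates, key=lambda x: predict(x, measured), reverse=True)
        # 3. Real: "synthesize and screen" the top picks, then feed the results
        #    back so the next round's predictions use the new lab data.
        for x in ranked[:picks_per_round]:
            measured[x] = lab_assay(x)
    return measured
```

Calling `lab_in_the_loop()` returns every compound measured so far together with its activity. Only a few candidates per round reach the "lab", which is the point of the approach: the model filters many virtual designs down to a short synthesis list, and each lab result becomes new training data.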

Traditionally, new compound design has been mostly a trial-and-error effort. With AI and ML, it is possible to take a more structured approach to design and save time and money on experimentation.

Generative Therapeutics Design is particularly important for late-stage lead optimization, which focuses on improving a specific target property once the rest of the target properties have already been met in the compound. By configuring GTD to make small, targeted modifications, it is possible to accelerate the synthesis and screening cycle. Another valuable outcome of GTD is the knowledge obtained in the process, including imaginative approaches to the types of molecules to synthesize and the trade-offs involved in improving molecular characteristics.
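
As a rough illustration of "small, targeted modifications", the sketch below improves one property of a lead while rejecting any change that pushes the other properties outside the target profile. It is a toy under stated assumptions, not how GTD works: a design is just two numeric knobs, the `properties` function is a made-up property model, and the profile is a set of hypothetical minimum values.

```python
import random


def properties(design):
    """Hypothetical property model for a design described by two numeric knobs."""
    a, b = design
    return {
        "potency": a + b,       # the property we want to improve
        "solubility": 1.0 - a,  # a property that must stay within the profile
    }


def within_profile(props, profile):
    """True if every profiled property meets its minimum acceptable value."""
    return all(props[name] >= minimum for name, minimum in profile.items())


def optimize_lead(lead, profile, target="potency", steps=200, step_size=0.05, seed=0):
    rng = random.Random(seed)
    best = lead
    for _ in range(steps):
        # Small, targeted modification: nudge each knob slightly, clamped to [0, 1].
        candidate = tuple(
            min(1.0, max(0.0, value + rng.uniform(-step_size, step_size)))
            for value in best
        )
        candidate_props = properties(candidate)
        # Accept only changes that improve the target and keep the profile intact.
        if (within_profile(candidate_props, profile)
                and candidate_props[target] > properties(best)[target]):
            best = candidate
    return best
```

For example, `optimize_lead((0.3, 0.3), {"solubility": 0.6})` returns a design whose potency is at least that of the starting lead and whose solubility still meets the minimum, because every accepted step is checked against both conditions.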

Digital Investment Value and ROI

The “discovery bottleneck” has been around for decades. In traditional drug discovery, it takes on average about 6 years and 4,000 compounds to get from target to the lead optimization stage. By contrast, with AI-driven drug discovery, those figures can be reduced to about 3 years and 1,000 compounds. Dassault Systèmes estimates that the savings per optimized lead will be in the range of $6M to $9M, with savings per IND (Investigational New Drug) of about $400M (about 50% savings).
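
The relative reductions behind these figures are easy to check. The short calculation below uses only the numbers quoted above. The baseline cost per IND is not stated in this article; it is inferred here from "about $400M" being "about 50%", so treat it as an implied approximation.

```python
def reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


# Figures quoted in the article: target to lead optimization.
time_cut = reduction(6, 3)            # 0.5, i.e. about 50% less time
compound_cut = reduction(4000, 1000)  # 0.75, i.e. about 75% fewer compounds

# Assumption: if about $400M saved is about 50%, the implied baseline
# cost per IND is roughly $800M (inferred, not stated in the article).
implied_ind_baseline_musd = 400 / 0.50  # 800.0
```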

Collaboration is Even More Important

Drug discovery is, and will always be, a collaborative effort. We previously explored this: How Scientific Collaboration Can Be Fueled by the Cloud.

AI-driven drug discovery adds a new, virtual component to the process. The whole system needs to work in a synergistic and iterative way, with the Virtual and Real components connected by a kind of “middleware”.

  • Virtual: where medicinal chemists create virtual designs through the iterative process of generating, testing, scoring and pruning.
  • Real: where biologists test compounds and medicinal/synthetic chemists synthesize and register compounds through the iterative process of designing, making, testing and analysing.
  • Connective layer: where medicinal chemists and the project team assess virtually designed compounds for synthetic accessibility and decide which should be taken to the lab, and where feedback from the Real component is returned to the Virtual component to continuously retrain the Machine Learning models.

There are, of course, many other experts and specialties in AI-driven drug discovery, including heads of research, molecular modelers, data scientists and others, who work together to move drug discovery forward and bring life-saving therapies to life.


Editor: To learn more, listen to our webcast entitled “Promises and Challenges of AI for Therapeutic Target Identification.” Learn how increasing market and regulatory demands for safety and efficacy are driving pharmaceutical R&D to improve upstream discovery efforts. Discover how AI and machine learning technologies are gaining popularity for extracting knowledge from a multitude of resources and enabling precision medicine.
