Technology constantly evolves, but when and how does a new technology displace and replace an established one? You might think of examples such as digital and film cameras, streaming services and physical media, or GPS navigation and physical maps. Conversely, when does a new technology enhance the benefits of a long-established set of technologies? Will artificial intelligence and machine learning (AI/ML) in drug discovery be the former kind of technology or the latter? How will emerging AI/ML technologies, as they are incorporated into existing software, workflows and platforms, shift the landscape of pharmaceutical research? Will they reshape the trajectory of drug discovery?
STRUCTURE PREDICTION AS A NOVEL APPLICATION OF AI FOR DRUG DISCOVERY
Two of the most exciting examples of new AI technologies in this arena are AlphaFold2 and OpenFold. These models are built upon a deep neural network architecture that combines attention mechanisms and residual networks to predict the three-dimensional (3D) structures of proteins. Both start with an initial model that takes the amino acid sequence of a protein as input and predicts the distances between pairs of amino acids to approximate the 3D structure of the protein. The initial structure is then refined through a novel attention-based algorithm that considers long-range interactions and dependencies within the protein, which is crucial for accurate structure prediction. The ability to predict protein structures has broad implications for drug discovery and design, including protein engineering, vaccine development, and biomarker identification, as well as for industries beyond pharmaceuticals, such as agriculture, food and beverage, and environmental sciences.
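To make the idea of "distances between pairs of amino acids" concrete, here is a minimal Python sketch of a pairwise distance map and the residue contacts derived from it. It is not the AlphaFold2 or OpenFold algorithm: it only illustrates the kind of pairwise representation described above. The coordinates, the 8 angstrom cutoff and the function names are illustrative assumptions.

```python
import math

def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def contact_pairs(dmap, cutoff=8.0, min_separation=3):
    """List residue pairs (i, j) that are at least `min_separation`
    positions apart in sequence but within `cutoff` angstroms in space."""
    n = len(dmap)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if dmap[i][j] <= cutoff
    ]

# Hypothetical C-alpha coordinates (angstroms) for a five-residue fragment
# that folds back on itself; these values are made up for illustration.
ca_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0),
    (3.8, 3.8, 0.0),
]

dmap = distance_map(ca_coords)
contacts = contact_pairs(dmap)
print(contacts)
```

A structure predictor works in the opposite direction, inferring such pairwise relationships from sequence and then building coordinates consistent with them. The map itself is simply a compact way to describe which residues end up close together in the fold.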
AlphaFold2 and OpenFold are muscling into territory previously dominated by homology modeling and protein threading. Homology modeling is based on the premise that evolutionarily related proteins share similar structures. In this technique, the amino acid sequence of a target protein of unknown structure is aligned with the sequence of a protein of known structure (the template), and the spatial arrangement of atoms in the target is inferred from the corresponding positions in the template. Protein threading is an alternative technique used when traditional homology modeling may fail because of low sequence similarity or when dealing with novel folds. Threading evaluates the compatibility of the target sequence with a library of known protein structures, assigning each template a score that reflects how well the sequence fits the structural features of that template. The template with the highest score is considered the most probable structural homolog.
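The "score every template, keep the best" step can be sketched in a few lines of Python. This is a deliberately simplified stand-in: it ranks templates by percent sequence identity over pre-aligned sequences, whereas real threading programs score sequence-to-structure compatibility. The sequences, template names and function names below are hypothetical.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent of identical residues over aligned, non-gap positions."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def best_template(alignments):
    """Score each template and return (name of best template, all scores)."""
    scores = {
        name: percent_identity(target, template)
        for name, (target, template) in alignments.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical pre-aligned (target, template) pairs; '-' marks a gap.
alignments = {
    "template_A": ("MKTAYIAKQR", "MKTAYLAKQR"),
    "template_B": ("MKTAYIAKQR", "MRSA-IGKNR"),
}

best, scores = best_template(alignments)
print(best, scores)
```

Swapping `percent_identity` for a structure-aware scoring function would turn the same selection loop into a toy threading procedure.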
While there are still use cases where homology modeling methods, such as the long-validated and established MODELER [1] algorithm in Discovery Studio Simulation, may produce better structure predictions, future versions of AI models for protein folding and structure prediction will likely replace these technologies. However, other reputable in silico technologies, such as the physics-based modeling methods routinely used alongside existing protein structure prediction methods, have the potential to work in synergy with these new AI models to accelerate drug discovery.
COMBINING AI/ML MODELS WITH PHYSICS-BASED MODELING
Molecular physics-based modeling methods are a powerful set of computational tools that simulate the intricate behaviors of atoms and molecules. These methods are based on fundamental principles of molecular physics, leveraging classical and quantum mechanics to predict and understand the dynamic interactions within molecular systems. They have long been used to provide insights into drug discovery, materials design, and the exploration of complex biological processes. Physics-based modeling has certain advantages over AI, particularly in terms of interpretability and the ability to incorporate known physical principles. In the case of protein structure prediction in a discovery workflow, molecular dynamics can refine a structure derived from an AI model and calculate energy estimates that reflect the biophysical constraints and forces governing protein folding more realistically. Molecular dynamics can also simulate how the structure changes across different environmental conditions, such as changes in temperature or solution pH, because the algorithms inherently consider the principles of energy minimization and thermodynamics. Physics-based methods lend interpretability, generalizability, and greater confidence to the prediction and can mitigate scientists' common concerns over the black-box nature of AI predictions.
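As a rough illustration of what "energy minimization" means in a physics-based method, the Python sketch below relaxes the separation between two particles interacting through a Lennard-Jones potential, using plain steepest descent. It is a toy model under stated assumptions (reduced units, one pair, a fixed step size) and does not represent how CHARMm or NAMD are implemented; production force fields add bonded, electrostatic and solvent terms over thousands of atoms.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Force along the separation, F = -dE/dr (positive pushes apart)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

def minimize_separation(r0, step=0.01, tol=1e-8, max_iter=100000):
    """Steepest descent: move along the force until it nearly vanishes."""
    r = r0
    for _ in range(max_iter):
        force = lj_force(r)
        if abs(force) < tol:
            break
        r += step * force  # same as r -= step * dE/dr
    return r

# Start from a stretched separation (reduced units, hypothetical value).
r_min = minimize_separation(1.5)
print(r_min, lj_energy(r_min))
```

The minimizer settles near r = 2^(1/6) sigma, the analytical minimum of this potential, where the energy equals -epsilon. Refining an AI-predicted structure follows the same principle: atoms are nudged along the forces until strained geometry relaxes.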
BOOSTING THE POWER OF DISCOVERY STUDIO SIMULATION WITH ALPHAFOLD2 AND OPENFOLD
Discovery Studio Simulation is BIOVIA’s 3D in silico atomistic modeling and simulation application. It includes physics-based methods such as CHARMm [2] and NAMD [3] for molecular dynamics, which have long been used in workflows combined with homology modeling predictions from MODELER. BIOVIA Discovery Studio Simulation now includes the OpenFold and AlphaFold2 AI models for protein structure prediction, which are likely to supplant most of the need for homology modeling with MODELER. However, OpenFold and AlphaFold2 will not be used in isolation and will require supporting physics-based in silico techniques to realize their benefits in drug discovery workflows. In releases later in the year, Discovery Studio Simulation will incorporate these AI models into more simulation workflows, addressing other challenges that our users face and helping with the interpretation and analysis of complex structural data rather than focusing solely on black-box prediction algorithms.
Change is inevitable. Change for the better is a full-time job.
Adlai Stevenson I (Former Vice President of the United States, 1893–1897)
Our job at BIOVIA is to continually adapt to new cutting-edge technologies that help accelerate scientific innovation for our users. In 2024 and beyond, BIOVIA Discovery Studio Simulation will integrate innovative AI technologies with bedrock modeling workflows that use established physics-based methods, offering researchers reliability, interpretability and democratized access. We foresee a long and exciting journey with our users as we merge these novel technologies with our existing software into role-based experiences on the 3DEXPERIENCE platform, providing them with the very best tools to bring life-changing therapies to market faster.
If you’re interested in learning about the history of AlphaFold and OpenFold, explore this related blog by Tien Luu.
REFERENCES
1. Sali, A.; Potterton, L.; Yuan, F.; van Vlijmen, H.; Karplus, M., Evaluation of comparative protein modeling by MODELLER. Proteins 1995, 23, 318-326.
2. Brooks, B.R.; Bruccoleri, R.E.; Olafson, B.D.; States, D.J.; Swaminathan, S.; Karplus, M., CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 1983, 4, 187-217.
3. Phillips, J.C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R.D.; Kalé, L.; Schulten, K., Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781-1802.