Quantum Leap in AI/ML Applications Through BIOVIA’s Contract Research

Introduction

Artificial Intelligence (AI) and Machine Learning (ML) advancements are significantly influencing our society and our way of life, for example, through impressive progress and developments in computer vision, natural language processing, and autonomous vehicles. What are the implications in the world of science?

Scientific Machine Learning (SML) is a new and fast-growing discipline in scientific innovation. Through creative methodological approaches, SML aims to address domain-specific data challenges and derive insights from scientific data sets. SML is the next wave of data-driven scientific discovery in physical, biological, chemical and engineering sciences. It draws on machine learning and scientific computing tools to develop new techniques for scalable, domain-aware, robust, reliable and interpretable learning and data analysis. In addition to technical expertise, scientific knowledge is needed to leverage the full power of SML. Thus, SML requires multidisciplinary expertise from applied and computational sciences.

Explainable Artificial Intelligence (XAI) is an area of research that focuses on developing ML algorithms that can provide clear explanations for their decisions. As ML becomes more widespread, there is a growing need for algorithms that humans can easily understand and interpret. XAI can help bridge the gap between humans and machines by providing insight into how decisions are made. These algorithms can help ensure fairness and reliability when it comes to automated decision-making, enabling humans to understand better the decisions machines are making while still allowing machines to provide the speed and accuracy needed for large-scale applications. XAI can provide a crucial link between machine decision-making and human understanding and help guarantee that the decisions are accurate and ethical.

ML algorithms known as “generative AI” can create new content, such as text, images, and code, from patterns learned in existing data. ChatGPT from OpenAI is one prominent example. Generative Design (GD) performed by computers produces output almost immediately. This does not mean that the final asset or product is obtained in a second, but GD quickly provides new ideas that may not have been considered. A human can then iterate on those ideas.

Thanks to the emerging technology of quantum computing, ML may undergo a revolution. Quantum Machine Learning (QML) is predicted to make significant strides with new algorithms and techniques that offer greater efficiency and scalability. Applications of QML range from drug discovery to materials science and much more.

In the area of machine learning, BIOVIA’s Contract Research team is a leader in combining data analytics and prediction with simulation science. The team takes on our clients’ complex and challenging problems and applies our software capabilities to help solve them. Topics range from battery cell design and cell manufacturing to sustainable formulation and packaging, as well as therapeutics discovery, development and delivery. Using BIOVIA’s out-of-the-box software tools, the Contract Research team has developed Advanced Technology Capability (ATC) assets to address business-critical scientific and engineering challenges faced by the energy and materials, consumer packaged goods, automotive and aerospace, and healthcare and life sciences industries.

Here are a few of the many successful cases where Contract Research has used our powerful and unique ATC assets on Machine Learning to address complex challenges our customers face.

Find Better Terminating Agents for Polymerization

“Living” Polymerization: Synthetic polymers play a critical role in applications ranging from supersonic jets to surgical implants, which requires the development of diverse materials with very specific properties. “Living” polymerization techniques allow critical molecular design and control, e.g., of structure, molecular weight, functionalization, and spatial location of key functional groups, as well as the creation of block, star, and macrocyclic polymers. A remarkable feature of living anionic polymerization is that the mechanism involves no formal termination step. Because the chain ends remain active until a terminating agent is deliberately added, properties can be tuned by choosing the right terminating agent.

We in the Contract Research team at BIOVIA used ML to build suitable models for multiple key properties of the resulting polymer with the existing known terminating agents. Then, we pushed the boundary on two fronts: i) using Generative Design strategies, we designed hundreds of new, potentially interesting terminating agents, and ii) using Explainable AI, we obtained chemically meaningful new terminating agents. A couple of the best candidates were tested in the lab, and the measured properties of the resulting polymers showed an improvement over the original set.

Identify Aroma/Flavor Enhancer

Aroma/Flavor Enhancer: Olfactory receptors are chemoreceptors that detect aroma (or flavor). These receptors belong to the family of G protein-coupled receptors (GPCRs) and have an affinity to bind a broad range of aroma molecules. Following traditional structure-based approaches is very difficult in these cases since the range is vast and the impact on several receptors must be considered simultaneously. Instead, an “olfactophore” approach combined with ML was taken in this case. For a set of transient receptor potential (TRP) channel proteins, EC₅₀ data was provided by the customer for a list of selected aroma-enhancing molecules. Modulation of these TRP channels plays a critical role in sensory perception and their activation/inactivation can be linked to corresponding EC₅₀ values.
Machine Learning combined with descriptors from BIOVIA’s Discovery Studio resulted in improved predictions of sensory perception.
In this Scientific ML approach, we, in the Contract Research team, used 2D, 3D, and shape descriptors of the aroma molecules as well as binding site definitions of the proteins and generated predictive models. This approach was continued in multiple iterations of virtual candidates followed by lab testing to fine-tune predictions.
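A common preprocessing step for potency data such as the EC₅₀ values described above is to move them to a logarithmic scale before model training. The sketch below is illustrative only: the article does not say how the EC₅₀ data were transformed, and the function names, the default Hill coefficient, and the example concentrations are all assumptions.

```python
import math


def pec50(ec50_molar: float) -> float:
    """Convert an EC50 in mol/L to pEC50 = -log10(EC50)."""
    if ec50_molar <= 0:
        raise ValueError("EC50 must be a positive concentration")
    return -math.log10(ec50_molar)


def hill_response(conc_molar: float, ec50_molar: float, hill_coeff: float = 1.0) -> float:
    """Fractional activation (0..1) from the Hill equation.

    By definition the response is 0.5 when conc_molar == ec50_molar.
    """
    if conc_molar <= 0:
        return 0.0
    return 1.0 / (1.0 + (ec50_molar / conc_molar) ** hill_coeff)


# Hypothetical EC50 values (mol/L) for three aroma molecules on one TRP channel.
ec50_values = {"molecule_A": 2.5e-6, "molecule_B": 4.0e-8, "molecule_C": 1.0e-4}

# A lower EC50 means higher potency, so the most potent molecule has the largest pEC50.
targets = {name: pec50(value) for name, value in ec50_values.items()}
most_potent = max(targets, key=targets.get)
```

Working with pEC₅₀ rather than raw EC₅₀ keeps values that span several orders of magnitude on a comparable scale, which is usually easier for a regression model to fit.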

Optimize Performance and Cost for Fuel Cell

Performance-Cost Optimization (Electrolyzer/Fuel Cell Electrode): Proton exchange membrane (PEM) electrolyzers, which efficiently split water to produce pure hydrogen and oxygen, have attracted much attention due to their energy efficiency, compact design, and high current density. Hydrogen produced in this way can be used as fuel for internal combustion engines, instead of gasoline, and for fuel cells. In a fuel cell, hydrogen and oxygen combine to generate electricity with water as the only byproduct. Fuel cells transform chemical energy into electrical energy, just like batteries, but they do not require recharging and operate as long as reactants are present. Therefore, both the electrolyzer and the fuel cell are essential to achieve a zero-emission future.

Machine Learning with Pipeline Pilot and Materials Studio was performed to optimize electrode cost and performance.

Contract Research used Scientific ML for a customer making the electrodes for these electrolyzers and fuel cells. Using both 3D chemistry-based quantum mechanics modeling as well as Generative Machine Learning, we were able to identify a composition that decreased the amount of expensive noble metal used and, at the same time, enhanced the performance. Based on our virtual screening of different electrode alloys, our customer built the PEM cell, tested it in the lab and successfully moved forward.
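Trading off noble metal content against performance, as described above, is a two-objective screening problem. One generic way to shortlist candidates is a Pareto filter, sketched below. This is not the method used in the engagement, which the article does not detail; the `Candidate` fields, the alloy names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    noble_metal_fraction: float   # relative to baseline; lower is cheaper
    predicted_performance: float  # model output; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    no_worse = (a.noble_metal_fraction <= b.noble_metal_fraction
                and a.predicted_performance >= b.predicted_performance)
    strictly_better = (a.noble_metal_fraction < b.noble_metal_fraction
                       or a.predicted_performance > b.predicted_performance)
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical screening results; names and numbers are invented for illustration.
screened = [
    Candidate("baseline", 1.00, 0.80),
    Candidate("alloy_1", 0.60, 0.85),
    Candidate("alloy_2", 0.40, 0.70),
    Candidate("alloy_3", 0.70, 0.82),
]
front = pareto_front(screened)
```

In this toy data set, a candidate such as `alloy_1` uses less noble metal than the baseline and also performs better, which is the kind of outcome the engagement reports.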

Predict Battery Lifetime

Battery Lifetime: The electrochemical reactions that drive a battery and the environmental conditions it is exposed to cause its performance to degrade over time, both during usage and storage. In order to optimize the operation of the battery and ensure that it is safe and reliable, it is critical to predict the remaining useful life (RUL) accurately. It is important to predict the crucial properties at the cell level as well as at the pack and module level. The BIOVIA Contract Research team has created an ATC for predicting the RUL using Machine Learning.

Contract Research ATC assets were applied to develop battery lifetime predictions from early cycle data.

In a Contract Research engagement, we leveraged the ATC assets we had developed and applied them to our customer’s battery cycling data set. Using a combination of physics-informed feature generation, chemistry-savvy descriptors and sequential feature selection, we created the ML model. Following industry-standard best practices, we developed a model that can accurately predict the lifetime of the cells. We continually push the boundary of our methods and capabilities so that battery-cell manufacturers, electric vehicle original equipment manufacturers and medical device companies get the maximum value from our ML technology.
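To make the idea of predicting lifetime from early cycle data concrete, the sketch below derives a training label (cycle life) and one simple early-cycle feature (initial fade slope) from a capacity curve. It is a minimal illustration under stated assumptions: the 80% end-of-life threshold is a common convention rather than something the article specifies, and the function names and the synthetic cell are hypothetical. The actual features and model used in the engagement are not described in the article.

```python
from typing import Optional, Sequence


def cycle_life(capacities: Sequence[float], eol_fraction: float = 0.8) -> Optional[int]:
    """Return the first 1-based cycle whose capacity falls below eol_fraction
    of the first-cycle capacity, or None if that never happens."""
    threshold = eol_fraction * capacities[0]
    for cycle, capacity in enumerate(capacities, start=1):
        if capacity < threshold:
            return cycle
    return None


def early_fade_slope(capacities: Sequence[float], n_early: int) -> float:
    """Least-squares slope of capacity vs. cycle number over the first n_early cycles."""
    ys = list(capacities[:n_early])
    xs = list(range(1, len(ys) + 1))
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical cell: starts at 1000 mAh and loses 10 mAh per cycle.
capacities = [1000 - 10 * i for i in range(30)]

life = cycle_life(capacities)              # training label
slope = early_fade_slope(capacities, 10)   # one early-cycle feature (mAh per cycle)
rul_at_cycle_10 = life - 10 if life is not None else None
```

In practice, many such features would be generated per cell, and a feature selection step would then keep the subset that best predicts the label.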

Accelerate the Formulation Screening for Personal Care

Sustainable Personal Care: The personal care product industry faces a challenge to speed up the development process for a reduced time-to-market in a highly competitive and partially cost-sensitive environment. Customers desire safer and more sustainable products, and new regulatory demands urge formulators to replace obsolete ingredients with new ones without losing product performance. The BIOVIA Contract Research team enables real-time decisions using quantitative modeling. Dealing with complex formulations in this area is challenging for several reasons. To mention a few: i) The chemical classes of the ingredients are diverse, covering a wide range of roles such as solvents, actives, salts, solids and many more. ii) The size ranges from small molecules over mid-size surfactants and natural ingredients, often not fully characterized, to large-size (bio)polymers or even proteins. iii) To further increase the complexity, sophisticated phases are involved, such as emulsions, gels, sols or foams.

Unique formulation-enriched Scientific ML models were developed to screen bio-derived ingredients.

In a Contract Research engagement with our customer, we used the ATC assets we developed and applied them to predict relevant product-related properties. Unique formulation-enriched Scientific ML models succeeded in blind predictions of customer-specific product formulations, following best practices of feature selection and training the model. The ML models have been deployed to the customer in a democratized way and can be applied by non-experts and formulators to run real-time virtual screenings and to guide lab experiments.

Predict Formulated Product Performance and Shelf Life

Formulated Product Stability: Products in the personal care industry often degrade as they sit on store shelves and in warehouses waiting to be purchased by consumers. End users expect these products to remain consistent in performance and appearance until consumed. Unfortunately, these products can sometimes change in function and appearance as undesirable side reactions take place. Changes in appearance may be visible as haze or even yellowing of the product. In an engagement with our customer, we developed advanced Scientific ML models to predict how the color of product formulations changes over time. These models were constructed to make color predictions at day zero, along with predictions both several weeks and several months out. Our models were able to capture changes in the color of the product due to changes in the underlying formulation by predicting the color coordinates of the material. The predictive ability of these models provided valuable insights into the stability of various formulations. Formulators could run what-if scenarios virtually to predict new formulations that would minimize these color changes. The most promising virtual candidates were then tested in the lab, and the models were updated with the latest experimental data, thus improving the machine learning models for future predictions.
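Once color coordinates are predicted at several time points, the size of the color change can be summarized with a single number. The sketch below assumes the coordinates are CIELAB (L*, a*, b*) values and uses the CIE76 color difference, which is the Euclidean distance in that space. The article does not state which color space or difference metric was used, so both are assumptions, and the example coordinates are hypothetical.

```python
import math
from typing import Tuple

Lab = Tuple[float, float, float]  # (L*, a*, b*)


def delta_e_cie76(reference: Lab, sample: Lab) -> float:
    """CIE76 color difference: Euclidean distance in CIELAB space."""
    return math.dist(reference, sample)


def yellowing_shift(reference: Lab, sample: Lab) -> float:
    """Change in b*; positive values indicate a shift toward yellow."""
    return sample[2] - reference[2]


# Hypothetical predicted coordinates for one formulation at three time points.
predicted = {
    "day_0": (90.0, 0.0, 5.0),
    "week_4": (90.0, 0.0, 8.0),
    "month_6": (87.0, 0.0, 9.0),
}

# Color drift of each time point relative to day zero.
drift = {label: delta_e_cie76(predicted["day_0"], lab) for label, lab in predicted.items()}
```

A what-if screen could then rank candidate formulations by their predicted drift at the longest time point and send those with the smallest values to the lab.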

Conclusion

As with all advanced technology, it is vital to reflect constantly on the advantages, disadvantages, limitations, and potential risks of such powerful developments. AI and ML can be valuable tools for extracting information from big data, prioritizing, or automating processes, which frees time for creative tasks. On the other hand, one needs to be aware that the quality of AI and ML applications depends heavily on the quality and quantity of the training data. This is of particular importance when using them for diagnostic purposes in medicine. In other contexts too, human control and review of AI output are still essential to ensure realistic interpretation and reasonable decisions. Overall, using AI and ML in a controlled manner, with a focus on improving human life and the overall good of society, will enable more productive development and progress.

For more information, contact BIOVIA.Services.CR@3ds.com

Lalitha Subramanian, Kwan Skinner, Sabine Schweizer, Johannes Schwöbel
Dr. Lalitha Subramanian is the Global Head of Outcome-Based Contract Research at Dassault Systèmes and a Science Fellow of its BIOVIA brand. She has a Ph.D. in chemistry and did post-doctoral work with Nobel Laureate Roald Hoffmann at Cornell University. Her specialties include materials research for sustainability, energy transformation (including battery and fuel cells), as well as drug development and delivery to improve human health.
