Design & Simulation
March 13, 2024

COSMO-RS for Modern Rational Drug Design and the SAMPL Challenge

SAMPL: Boosting Drug Design with Blind Computational Challenges
Arnim Hellweg
Michael Diedenhofen

SAMPL: Blind Challenges in Computational Drug Design

First of all, let us clarify what SAMPL means. SAMPL stands for Statistical Assessment of Modeling of Proteins and Ligands. It is a series of blind challenges aimed at improving the predictive power of computational methods relevant to modern rational drug design. Participants submit predictions of molecular properties for which the experimental data remain undisclosed until after the contest deadline. The properties covered by the SAMPL challenges include hydration free energies, free energies of binding, protonation equilibria, and distribution and partition equilibria. Blind prediction challenges for real-life applications, like SAMPL, primarily provide information about the capabilities of the different methods, but they also have a positive impact on the development and improvement of prediction methods.


BIOVIA COSMO-RS is a quantum chemistry-based statistical thermodynamics model that can predict many thermodynamic properties of substances in solution. Among others, these include partition coefficients, free energies of solvation, and solubilities, making COSMO-RS a valuable tool in the formulation design process in the pharmaceutical industry. Fast calculations in the virtual world can help to save time, material, and costs in the real world. For example, if you are searching for a good solvent for an active pharmaceutical ingredient (API), a virtual solvent screening can provide a list of promising candidates in a short time. This can reduce the number of measurements and accelerate the development phase of a drug formulation. Given these features and a general approach that, in principle, can treat any molecule that quantum chemistry can handle, it is not surprising that the COSMO-RS method has been used repeatedly in SAMPL challenges. Ever since the first open challenge, SAMPL1, in 2009, submissions with COSMO-RS have participated in these blind predictions, and they were often among the best.

The SAMPL9 Challenge

The latest challenge, SAMPL9, was to predict the toluene/water partition coefficients for 16 molecules, most of which are drugs with a variety of therapeutic benefits. The figure below shows that toluene and water are immiscible and form two phases. When a molecule is dissolved in this biphasic system, it is partitioned between the toluene and water phases. The ratio of the equilibrium concentrations in the two phases (toluene over water) is called the toluene/water partition coefficient; it is usually reported as its base-10 logarithm, logP. Since toluene is a non-polar molecule and water is the prime example of a polar molecule, the toluene/water partition coefficient can be seen as a measure of the affinity of a compound for polar or non-polar solvents.

Figure: Toluene/Water Partition Coefficient

In general, partition coefficients guide rational solvent selection for purification [1] and can also be used to estimate an API’s distribution between the compartments of the human body [2].
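To make the definition of the partition coefficient concrete, here is a minimal Python sketch. It is not part of COSMO-RS or of any SAMPL9 workflow; the function names and the numbers are hypothetical and chosen only for illustration. The first function takes the logarithm of the concentration ratio directly. The second uses the standard thermodynamic relation between logP and the difference of the solvation free energies in the two phases, assuming both free energies refer to the same molar standard state and ignoring the mutual solubility of toluene and water.

```python
import math

# Gas constant in kcal/(mol K)
R_KCAL = 1.987204e-3


def log_p_from_concentrations(c_toluene: float, c_water: float) -> float:
    """log10 of the ratio of equilibrium concentrations (toluene over water).

    Both concentrations must use the same units, e.g. mol/L.
    """
    if c_toluene <= 0 or c_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_toluene / c_water)


def log_p_from_solvation_energies(
    dg_toluene: float, dg_water: float, temperature: float = 298.15
) -> float:
    """logP from solvation free energies in kcal/mol.

    Assumption: both values refer to the same molar standard state, so
    logP = -(dG_toluene - dG_water) / (R * T * ln 10).
    """
    return -(dg_toluene - dg_water) / (R_KCAL * temperature * math.log(10))


if __name__ == "__main__":
    # Hypothetical values, for illustration only (not SAMPL9 data).
    print(log_p_from_concentrations(1.0, 0.01))
    print(log_p_from_solvation_energies(-8.0, -6.0))
```

A positive logP means the compound prefers the toluene phase; a negative value means it prefers water. At room temperature, a difference of roughly 1.36 kcal/mol in solvation free energy corresponds to one logP unit.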

Key Learnings from SAMPL9

The SAMPL9 blind challenge had 12 submissions. Two of them used COSMO-RS, but with slightly different approaches. The first was submitted by Thomas Nevolianis, Raja A. Ahmed, and Kai Leonhard from RWTH Aachen University and the second by a BIOVIA team [3]. Based on the results, the submission of our colleagues from Aachen was ranked first (or second, depending on the ranking criteria). In general, all contributions showed larger deviations for toluene/water partition coefficients than were seen for octanol/water partition coefficients in the SAMPL7 challenge [4]. The root mean square deviation from the experimental data (RMSD) is 1.23 logP units and the correlation coefficient is 0.93. The RMSD is larger than we expected, but COSMO-RS outperformed competing approaches, and its predictive power is very impressive for a method that is designed to be generally applicable and not specifically adapted to the given problem. In the second submission, we attempted to use a workflow that automatically generates possible tautomers of the molecules. However, this approach resulted in significant deviations in two cases, which indicates a need for improvement in this area. We see this outcome as an excellent example of how blind tests positively impact the development and improvement of prediction methods.
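For readers who want to reproduce this kind of evaluation on their own predictions, the two statistics quoted above can be computed as in the following sketch. It assumes that "correlation coefficient" means the Pearson correlation, which the challenge evaluation may or may not define in exactly this way. The function names and the logP values are hypothetical and are not SAMPL9 data.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


if __name__ == "__main__":
    # Hypothetical logP values, for illustration only.
    predicted = [1.0, 2.5, -0.5, 3.0]
    experimental = [0.5, 2.0, 0.0, 4.0]
    print(f"RMSD = {rmsd(predicted, experimental):.2f} logP units")
    print(f"r    = {pearson_r(predicted, experimental):.2f}")
```

The two numbers answer different questions: the RMSD measures the absolute error in logP units, while the correlation coefficient measures how well the predictions rank the compounds. A method can therefore combine a high correlation with a sizeable RMSD, for example when its errors are systematic.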

1. Abramov Y A (2018) Rational Solvent Selection for Pharmaceutical Impurity Purge. Cryst Growth Des 18:1208–1214.

2. Moss G P, Cronin M T D (2002) Quantitative structure–permeability relationships for percutaneous absorption: re-analysis of steroid data. Int J Pharm 238:105–109.

3. Nevolianis T, Ahmed R A, Hellweg A, Diedenhofen M, Leonhard K (2023) Blind prediction of toluene/water partition coefficients using COSMO-RS: results from the SAMPL9 challenge. Phys Chem Chem Phys 25:31683–31691.

4. Warnau J, Wichmann K, Reinisch J (2021) COSMO-RS predictions of logP in the SAMPL7 blind challenge. J Comput Aided Mol Des 35:813–818.
