#rnd
Valuable addition to the modern chemist’s toolkit.
HArD (HeteroAryl Descriptors) - a comprehensive DFT-based database of >31,500 heteroaryl substituents spanning diverse ring systems and regioisomers, each annotated with rich steric, electronic, and aromaticity descriptors.
Their introduction of σₕₑₜ, a Hammett-type constant derived from computed carboxylic acid pKₐ values, nicely extends the classical Hammett framework to heteroaromatic chemistry.
I’ve often leaned on traditional tables and Ertl's tool for aryl substituents - but this work is a real step forward for quantitatively capturing heteroaryl electronic effects.
Interactive Database: https://hard.pengliugroup.com/
Paper: https://www.nature.com/articles/s41597-025-05198-z
Nature
A database of steric and electronic properties of heteroaryl substituents
Scientific Data - A database of steric and electronic properties of heteroaryl substituents
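For anyone who has not touched Hammett constants in a while: in the classical definition the reference reaction is the ionization of benzoic acids in water (ρ = 1), so σ reduces to a pKₐ difference. Below is a minimal sketch of that arithmetic. The function name and the example pKₐ values are illustrative assumptions and are not taken from the HArD paper, whose reference acid and scaling may differ.

```python
def hammett_sigma(pka_reference: float, pka_substituted: float, rho: float = 1.0) -> float:
    """Classical Hammett relation: sigma = (pKa_ref - pKa_X) / rho."""
    if rho == 0:
        raise ValueError("rho must be non-zero")
    return (pka_reference - pka_substituted) / rho


# Illustrative inputs only (approximate aqueous pKa values, not HArD data).
PKA_BENZOIC_ACID = 4.20
PKA_4_NITROBENZOIC_ACID = 3.44

sigma = hammett_sigma(PKA_BENZOIC_ACID, PKA_4_NITROBENZOIC_ACID)
print(f"sigma = {sigma:.2f}")  # a positive value indicates an electron-withdrawing group
```

A stronger acid than the reference gives a positive σ (electron-withdrawing substituent), a weaker one a negative σ.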
#oprd
This paper presents a web-based digital tool for virtually screening and optimizing aqueous liquid–liquid extraction, with interactive visualization of speciation, partitioning, and extraction efficiency across pH and solvent space. While the authors built it in Streamlit, the workflow can be easily reproduced in Jupyter notebooks. A must-read if you’re exploring selective extraction strategies.
https://pubs.rsc.org/en/content/articlelanding/2025/dd/d5dd00104h
pubs.rsc.org
A digital tool for liquid–liquid extraction process design
Aqueous liquid–liquid extractions are crucial for purifying compounds and removing impurities in the pharmaceutical industry. However, the extensive solvent space involved in such operations highlights the need for an informed approach in solvent selection.…
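To give a feel for what "speciation, partitioning, and extraction efficiency across pH" means in practice, here is a small sketch based on the textbook relations for a monoprotic acid HA. It assumes that only the neutral form partitions into the organic phase, so D = P / (1 + 10^(pH − pKa)), and that a single equilibrium stage is used. The function names and the solute (logP = 2.0, pKa = 4.5) are hypothetical and are not taken from the paper or its tool.

```python
def distribution_ratio_acid(log_p: float, pka: float, ph: float) -> float:
    """D for a monoprotic acid HA, assuming only neutral HA enters the organic phase."""
    neutral_fraction = 1.0 / (1.0 + 10.0 ** (ph - pka))
    return (10.0 ** log_p) * neutral_fraction


def fraction_extracted(d: float, phase_ratio: float = 1.0) -> float:
    """Single-stage fraction found in the organic phase; phase_ratio = V_org / V_aq."""
    return d * phase_ratio / (1.0 + d * phase_ratio)


# Hypothetical solute: logP = 2.0, pKa = 4.5 (assumed values for illustration).
for ph in (2.0, 4.5, 7.0, 9.0):
    d = distribution_ratio_acid(log_p=2.0, pka=4.5, ph=ph)
    print(f"pH {ph:>4}: D = {d:8.3f}, extracted = {fraction_extracted(d):.1%}")
```

Scanning pH like this for a product and an impurity with different pKa values is the simplest way to spot a window where one stays in the aqueous phase and the other does not.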
#rnd
This paper from Coley's group presents FlowER, a forward reaction prediction model that treats chemistry as electron redistribution and explicitly respects mass conservation. Unlike black-box product predictors, it generates full mechanistic pathways, making results more interpretable and adaptable even to new reaction classes. An interesting step toward more reliable and mechanistically aware forward synthesis prediction.
https://arxiv.org/abs/2502.12979
https://www.nature.com/articles/s41586-025-09426-9
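One way to picture "electron redistribution that respects conservation" is a bond-electron (BE) matrix in the style of Ugi: the diagonal holds the non-bonding valence electrons of each atom and the off-diagonal entries hold bond orders. Each bond appears in both halves of the matrix, so the sum of all entries is the valence-electron count, and a reaction is a difference matrix whose entries sum to zero. The toy below checks this for HO⁻ + H⁺ → H₂O. It is only a bookkeeping illustration with hypothetical names, not FlowER's code or model.

```python
Matrix = list[list[int]]


def total_electrons(be: Matrix) -> int:
    """Sum of all entries: lone electrons plus two electrons per unit of bond order."""
    return sum(sum(row) for row in be)


def reaction_matrix(reactants: Matrix, products: Matrix) -> Matrix:
    """Element-wise difference (products - reactants) describing the electron shifts."""
    return [
        [p - r for r, p in zip(r_row, p_row)]
        for r_row, p_row in zip(reactants, products)
    ]


# Atom order: O, H1, H2 (toy example chosen for illustration).
REACTANTS = [  # hydroxide (three lone pairs on O, one O-H bond) plus a bare proton
    [6, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
PRODUCTS = [  # water (two lone pairs on O, two O-H bonds)
    [4, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

delta = reaction_matrix(REACTANTS, PRODUCTS)
conserved = total_electrons(REACTANTS) == total_electrons(PRODUCTS)
print(delta, "electrons conserved:", conserved)
```

Here one lone pair on oxygen becomes the new O-H bond, which shows up as −2 on the diagonal and +1 in each of the two mirrored off-diagonal positions.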
#oprd
Lilly’s digital group has a reputation for going big when it comes to process design prediction, and this paper is a great example focused on crystallisation. The authors show how you can go from a chemical structure all the way to predicting polymorphs, solubility, morphology, and even growth behaviour - basically a digital shortcut to designing a crystallisation process.
If you don’t feel like reading a whole textbook on crystallisation theory, this piece gives you the essentials in a form that’s directly useful for process chemists (especially those with a materials science background).
https://pubs.acs.org/doi/10.1021/acs.cgd.3c01390
ACS Publications
Pharmaceutical Digital Design: From Chemical Structure through Crystal Polymorph to Conceptual Crystallization Process
A workflow for the digital design of crystallization processes starting from the chemical structure of the active pharmaceutical ingredient (API) is a multistep, multidisciplinary process. A simple version would be to first predict the API crystal structure…
OPRD Radar
#oprd
Which reminds me of a really practical workflow for figuring out how impurities end up in crystals - whether through surface adsorption, inclusions, or solid solutions - using just a few straightforward experiments.
The authors show with several API case studies that once you know the exact mechanism, you can target purification strategies much more effectively instead of relying on endless trial-and-error. A great read if you’ve ever had to wrestle with impurity headaches during crystallisation.
https://pubs.acs.org/doi/10.1021/acs.oprd.0c00166
ACS Publications
A Structured Approach To Cope with Impurities during Industrial Crystallization Development
The perfect separation with optimal productivity, yield, and purity is very difficult to achieve. Despite its high selectivity, in crystallization unwanted impurities routinely contaminate a crystallization product. Awareness of the mechanism by which the…
#llm #retrosynthesis
Looks like even Nvidia is now jumping into retrosynthesis.
ReaSyn is basically retrosynthesis rebranded in LLM-speak: instead of MCTS grinding through disconnections with extracted templates, it uses a “Chain-of-Reaction” reasoning trace so the model can narrate each step. They blend supervised learning, RL fine-tuning, and heavy decoding tricks to steer the model toward plausible chemistry.
Probably not as scalable as classical MCTS, but it could be handy if transformations are well-curated and the building block space is carefully pre-filtered.
📄 https://arxiv.org/pdf/2509.16084
⚙️ https://shorturl.at/QvQgt (Model weights)
#oprd
Nice BO use case from ETH Zurich & Novo Nordisk: they co-optimize mAb formulations across Tm, kD, and air–water interface stability, finding a good excipient combination in ~33 experiments while keeping pH and osmolality in check.
Still some room for refinement, but great to see BO moving beyond the small-molecule domain.
https://pubs.acs.org/doi/10.1021/acs.molpharmaceut.5c00591
ACS Publications
Bayesian Optimization for Efficient Multiobjective Formulation Development of Biologics
Biologics, including emerging engineered formats, can often exhibit poor developability profiles, complicating their translation into successful therapeutics. While formulation design can substantially mitigate some developability issues, it represents a…
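"Multiobjective" here means there is usually no single best formulation, only a set of trade-offs, and candidates are compared by Pareto dominance. The sketch below shows just that selection step for three objectives. It is not Bayesian optimization itself (there is no surrogate model or acquisition function), it assumes every objective has been scaled so that higher is better, and the formulation names and numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """a dominates b if it is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Scores]) -> List[str]:
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other, scores)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical results: (Tm, kD, interfacial stability score), all "higher is better".
FORMULATIONS = {
    "F1": (71.0, 12.0, 0.80),
    "F2": (69.5, 15.0, 0.85),
    "F3": (68.0, 10.0, 0.70),  # worse than F1 on every objective
    "F4": (72.0, 8.0, 0.60),
}

front = pareto_front(FORMULATIONS)
print(front)
```

In a real campaign the optimizer proposes the next experiments so as to push this front outward; the filter above is only how the outcome would be read off afterwards.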
#rnd
Large-scale cheminformatics analysis by Ertl et al. (SAscore author) maps how ring systems used in medicinal chemistry evolved over time, highlighting which heterocycles dominate today’s drug space and which are fading.
The study links shifts in ring popularity to synthetic accessibility and changing design strategies, offering a data-driven view of scaffold trends valuable for modern drug discovery.
https://chemrxiv.org/engage/chemrxiv/article-details/6891a60123be8e43d6d10ab0
#rnd
The aforementioned SAscore and its successor BR-SAscore are still the best heuristics for synthetic accessibility, imho. None of the more recent scores (SCScore, RAscore, SYBA) manages to beat them. Very handy if you want to estimate which AI-slop molecules actually have a chance of being synthesised.
https://jcheminf.biomedcentral.com/articles/10.1186/1758-2946-1-8
BioMed Central
Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions - Journal…
Background A method to estimate ease of synthesis (synthetic accessibility) of drug-like molecules is needed in many areas of the drug discovery process. The development and validation of such a method that is able to characterize molecule synthetic accessibility…
#cmc
ICH is merging all stability guides into one big chunk - Q1A–E/Q5C will now live together as just Q1.
What’s new:
– Enhanced emphasis on knowledge- and risk-based approaches
– New or expanded types of stability / supportive studies (f… finally)
– Lifecycle / post-approval changes & commitments
– New Annex for ATMPs and other new modalities
A must-read for the grown-ups in pharma development - you’ll want to be ready for those gap assessments. 🥲
https://www.ema.europa.eu/en/ich-q1-guideline-stability-testing-drug-substances-drug-products
#oprd
Cheeky robots are now coming for lab coats too, not just IT jobs.
An interesting preprint introduces RAISE - a self-driving lab that fully automates formulation, contact-angle measurement, and optimization in a Bayesian closed loop. It turns surface science from tedious manual work into rapid, data-driven formulation discovery. Not pharma per se, but it could easily be applied there.
https://arxiv.org/abs/2510.06546
#rnd
Not exactly groundbreaking, but a smart and pragmatic shift from Iktos.
Instead of asking “how do we make this designed molecule?” the authors flip the question to “what can we actually make from what’s already on the shelf - and what can we feed to the robots without endless reconfiguration?”
Their cluster synthesis strategy groups diverse reactions into a few shared condition clusters, streamlining execution and cutting setup pain. It’s less about grand design and more about making the most of what’s physically doable - a mindset that feels very relevant for real-world drug discovery labs.
https://chemrxiv.org/engage/chemrxiv/article-details/68de33daf2aff167708137a8 (preprint)
#rnd
AI for Scientific Discovery is a Social Problem
A tool is only as good as its user.
https://arxiv.org/abs/2509.06580
#rnd #cmc
This review summarizes key challenges and considerations in translating machine learning models into decision-making tools for real-world drug discovery projects, particularly for compound toxicity and safety. It covers the choices to be made about data, modeling, validation, and model metrics, and how to apply the resulting model within the drug discovery process.
https://pubs.acs.org/doi/10.1021/acs.chemrestox.5c00033
ACS Publications
Machine Learning for Toxicity Prediction Using Chemical Structures: Pillars for Success in the Real World
Machine learning (ML) is increasingly valuable for predicting molecular properties and toxicity in drug discovery. However, toxicity-related end points have always been challenging to evaluate experimentally with respect to in vivo translation due to the…
#rnd
The preprint from Bayer applies WISP (Workflow for Interpretability Scoring using matched molecular Pairs) to real chemical datasets like LCAP yields, Factor Xa inhibition, and AMES mutagenicity. It shows how explainability methods can highlight which structural changes (e.g. adding a methyl group) influence model predictions. Importantly, WISP can also reveal when these explanations don’t reflect real chemistry, helping to spot weak or misleading models.
Think of it as a structured way to peek inside the black box and check whether the model is actually learning chemistry or just patterns in the data.
https://chemrxiv.org/engage/chemrxiv/article-details/68bb381ea94eede154ed44f8
Building ChemInformatic Agents with LangGraph
A hands-on introduction to agents and tool calling
- Build and customize AI agents that can reason, plan, and execute tasks in chemical research
- Use tool calling to connect models with cheminformatics libraries
- Explore real-world use cases like property prediction
https://colab.research.google.com/drive/1nuuVA-1RTLqUC2AyKBc1OHmfPzdhhyJG
Google
Part 8 Agent_Workshop_21_10.ipynb
Colab notebook
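The tool-calling pattern itself does not depend on any framework: the model emits a structured request naming a tool and its arguments, and the runtime executes that tool and hands the result back. Here is a bare-bones sketch in plain Python with a stubbed "model". It does not use LangGraph or any real LLM API, every name in it is hypothetical, and the notebook's own code will look different.

```python
import re
from typing import Callable, Dict

# Small lookup table for the demo; a real tool would call a cheminformatics library.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple formula such as 'C2H6O' (no brackets or charges)."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return round(total, 3)


# Registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable[..., float]] = {"molecular_weight": molecular_weight}


def fake_model(question: str) -> dict:
    """Stand-in for an LLM (assumption): always requests the molecular_weight tool."""
    formula = question.split()[-1].rstrip("?")
    return {"tool": "molecular_weight", "args": {"formula": formula}}


def run_agent(question: str) -> float:
    """One agent step: ask the model, look up the requested tool, execute it."""
    call = fake_model(question)
    tool = TOOLS[call["tool"]]
    return tool(**call["args"])


answer = run_agent("What is the molecular weight of C2H6O?")
print(answer)
```

Agent frameworks add the parts left out here: a real model deciding which tool to call, state passed between steps, and a loop that feeds tool results back to the model.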
#cmc #oprd
Quality by digital design to accelerate sustainable medicines development
If you are tired of QbD, some people have now invented QbDD.
Jokes aside - a good review of what has already been done and how QbD can evolve in the digital era.
https://doi.org/10.1016/j.ijpharm.2025.125625
#rnd
Oligowiki is a curated, queryable database focused on therapeutic oligonucleotides and the chemistries that define them.
https://www.oligowizard.com/wiki/
Oligowizard
Oligowiki - Nucleic Acid Therapeutics Knowledge Hub
Comprehensive, curated knowledge base for nucleic acid therapeutics...
#rnd
It reminds me of another crowdsourced database collecting small-molecule synthesis routes.
https://chemistrybydesign.oia.arizona.edu/
#oprd
There is always room to read about crystallisation development!
This paper presents the general idea of using rapid process modeling as a parallel instrument to the widely applied DoE-based design and scale-up.
Kind of overkill, but an interesting read for understanding how a process can be described.
https://pubs.acs.org/doi/10.1021/acs.oprd.4c00199
ACS Publications
Derisking Crystallization Process Development and Scale-Up Using a Complementary, “Quick and Dirty” Digital Design
Despite the spread of digital (model and AI-based) techniques, the industry-standard pharmaceutical crystallization design and scale-up is still based on experiments’ design (DoE). Many orthogonally designed and usually relatively lightly monitored experiments…