Leveraging Prompt Engineering in Large Language Models for Accelerating Chemical Research🔥
https://pubs.acs.org/doi/full/10.1021/acscentsci.4c01935
📕 ACS Central Science (IF=13.1)
In this Outlook, we delve into various prompt engineering techniques and illustrate relevant examples for extensive research from metal–organic frameworks and fast-charging batteries to autonomous experiments.
We also elucidate the current limitations of prompt engineering with LLMs such as incomplete or biased outcomes and constraints imposed by closed-source limitations.
Although LLM-assisted chemical research is still in its early stages, the application of prompt engineering will significantly enhance accuracy and reliability, thereby accelerating chemical research.
Artificial intelligence (AI) using large language models (LLMs) such as GPTs has revolutionized various fields. Recently, LLMs have also made inroads in chemical research even for users without expertise in coding. However, applying LLMs directly may lead…
🔥4❤3👍3
Transformers for Molecular Property Prediction: Lessons Learned from the Past Five Years
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.4c00747
🔥 OA version on arXiv: https://arxiv.org/abs/2404.03969
📕 Journal of Chemical Information and Modeling (IF=5.6)
#review
In this review, we aim to distill insights from current research on employing transformer models for Molecular Property Prediction (MPP). We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives.
Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field’s understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.
Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints…
❤3👍2🔥2
Understanding Conformation Importance in Data-Driven Property Prediction Models🔥
https://pubs.acs.org/doi/10.1021/acs.jcim.5c00018
📕 Journal of Chemical Information and Modeling (IF=5.6)
#method
This study investigates the influence of using multiple conformers in machine learning-based property prediction, comparing two- and three-dimensional descriptors using three independent data sets: a large-scale quantum mechanical property, a medium-scale melting point, and small-scale enantioselective chemical reaction data sets.
One unique aspect of this study is creating these carefully controlled data sets for models’ performance evaluation in conformational diversity and the target property’s dependence on conformation.
Our findings show that using all available conformers as simple data augmentation consistently achieves high prediction accuracy among aggregation approaches, followed by mean aggregation.
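The two strategies named above can be sketched on toy data. This is a minimal illustration, not the authors' pipeline: the descriptor vectors, labels, and function names are hypothetical, and real 3D descriptors would come from a conformer generator.

```python
from statistics import fmean


def mean_aggregate(conformer_features):
    """Collapse a molecule's conformers into one vector by averaging each feature."""
    return [fmean(column) for column in zip(*conformer_features)]


def augment(dataset):
    """Treat every conformer as its own training row, repeating the molecule's label."""
    rows = []
    for features_per_conformer, label in dataset:
        for features in features_per_conformer:
            rows.append((features, label))
    return rows


# Hypothetical toy data: (list of per-conformer descriptor vectors, target value).
toy_dataset = [
    ([[1.0, 2.0], [3.0, 4.0]], 0.5),              # molecule A, two conformers
    ([[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]], 1.5),  # molecule B, three conformers
]

# Mean aggregation: one row per molecule.
mean_rows = [(mean_aggregate(confs), y) for confs, y in toy_dataset]

# Data augmentation: one row per conformer.
augmented_rows = augment(toy_dataset)
```

With augmentation the model sees five rows instead of two, so any train/test split would presumably need to be done per molecule to avoid leaking conformers of the same compound across sets.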
The prediction of molecular properties is essential in chemoinformatics and has many applications in drug discovery and materials design. Molecular representations play a key role in the prediction models to achieve high prediction accuracy. Nevertheless…
👍4❤3🔥3
Transfer learning across different photocatalytic organic reactions🔥
https://doi.org/10.1038/s41467-025-58687-5
📕 Nature Communications (IF=14.7)
#method
Herein, we apply a domain-adaptation-based transfer-learning (TL) approach to photocatalysis. Despite being different reaction types, the knowledge of the catalytic behavior of organic photosensitizers (OPSs) from photocatalytic cross-coupling reactions is successfully transferred to ML for a [2+2] cycloaddition reaction, improving the prediction of the photocatalytic activity compared with conventional ML approaches. Furthermore, a satisfactory predictive performance is achieved by using only ten training data points.
Nature Communications - The potential of transfer learning as an effective tool for predicting photosensitizer catalytic activity remains underexplored in organic chemistry. Here, the authors apply...
❤2👍2🔥2
Computational Discovery of Transition-metal Complexes: From High-throughput Screening to Machine Learning🔥
https://pubs.acs.org/doi/10.1021/acs.chemrev.1c00347
📕 Chemical Reviews (IF=51.4)
#review
The review will cover the development, promise, and limitations of “traditional” computational chemistry as it pertains to data generation for inorganic molecular discovery. The review will also discuss the opportunities and limitations in leveraging experimental data sources. We will focus on how advances in statistical modeling, artificial intelligence, multiobjective optimization, and automation accelerate discovery of lead compounds and design rules. The overall objective of this review is to showcase how bringing together advances from diverse areas of computational chemistry and computer science have enabled the rapid uncovering of structure–property relationships in transition-metal chemistry.
We aim to highlight how unique considerations in motifs of metal–organic bonding (e.g., variable spin and oxidation state, and bonding strength/nature) set them and their discovery apart from more commonly considered organic molecules. We will also highlight how uncertainty and relative data scarcity in transition-metal chemistry motivate specific developments in machine learning representations, model training, and in computational chemistry. Finally, we will conclude with an outlook of areas of opportunity for the accelerated discovery of transition-metal complexes.
👍4❤3🔥3
A Perspective on Foundation Models in Chemistry 🔥
https://pubs.acs.org/doi/10.1021/jacsau.4c01160
📕 JACS Au (IF=8.6)
#review
Foundation models are an emerging paradigm in artificial intelligence (AI), with successful examples like ChatGPT transforming daily workflows. Generally, foundation models are large-scale, pretrained models capable of adapting to various downstream tasks by leveraging extensive data and model scaling.
Their success has inspired researchers to develop foundation models for a wide range of chemical challenges, from materials discovery to understanding structure–property relationships, areas where conventional machine learning (ML) models often face limitations.
In addition, foundation models hold promise for addressing persistent ML challenges in chemistry, such as data scarcity and poor generalization. In this perspective, we review recent progress in the development of foundation models in chemistry across applications of varying scope.
❤3👍3🔥2
Explicit relation between thin film chromatography and column chromatography conditions from statistics and machine learning🔥
https://doi.org/10.1038/s41467-025-56136-x
📕 Nature Communications (IF=14.7)
#method
This study explicitly elucidates how chemists use thin-layer chromatography (TLC) to determine column chromatography (CC) conditions, employing statistical analysis and machine learning techniques. An experimental dataset of the CC is generated from the automatic platform developed in this study. On this basis, an “artificial intelligence (AI) experience” is generated through a knowledge discovery framework, where the relationship between the retardation factor (RF) value from TLC and retention volume from CC is unveiled in the form of explicit equations. These equations demonstrate satisfactory accuracy and generalizability, providing a scientific basis for the selection of the experimental conditions, and contributing to a better understanding of chromatography.
Nature Communications - The selection of experimental conditions for column chromatography is usually determined by experience. Here, authors have discovered explicit relation between thin layer...
❤5👍5🔥4
Pre-trained molecular representations enable antimicrobial discovery🔥
https://www.nature.com/articles/s41467-025-58804-4
📕 Nature Communications (IF=14.7)
#method
Here, we introduce a lightweight computational strategy for antimicrobial discovery that builds on MolE (Molecular representation through redundancy reduced Embedding), a self-supervised deep learning framework that leverages unlabeled chemical structures to learn task-independent molecular representations.
By combining MolE representation learning with available, experimentally validated compound-bacteria activity data, we design a general predictive model that enables assessing compounds with respect to their antimicrobial potential.
Our model correctly identifies recent growth-inhibitory compounds that are structurally distinct from current antibiotics. Using this approach, we discover de novo, and experimentally confirm, three human-targeted drugs as growth inhibitors of Staphylococcus aureus.
Nature Communications - Here, the authors introduce a computational strategy for antimicrobial discovery that addresses the scarcity of large datasets. Based on data-driven representations of...
👍3❤2🔥2
The QDπ dataset, training data for drug-like molecules and biopolymer fragments and their interactions
https://www.nature.com/articles/s41597-025-04972-3
📕 Scientific Data (IF=5.9)
#dataset
In this study, we introduce the QDπ dataset, which incorporates data taken from several datasets. To construct it, we use a query-by-committee active learning strategy to extract data from large datasets, maximizing diversity and avoiding redundancy as relevant for neural network training.
The QDπ dataset requires only 1.6 million structures to express the chemical diversity of 13 elements from the various source datasets at the ωB97M-D3(BJ)/def2-TZVPPD level of theory.
The QDπ dataset enables creation of flexible target loss functions for neural network training relevant to drug discovery, including information-dense data sets of relative conformational energies and barriers, intermolecular interactions, tautomers and relative protonation energies of drug-like compounds and biomolecular fragments.
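A rough sketch of the query-by-committee idea mentioned above: several models score each candidate, and the candidates on which they disagree most are kept. The committee members, the candidate values, and the use of standard deviation as the disagreement measure are all assumptions for illustration; the post does not state the criteria actually used for QDπ.

```python
from statistics import pstdev


def committee_disagreement(predictions):
    """Disagreement measured as the population standard deviation of predictions."""
    return pstdev(predictions)


def select_by_committee(candidates, committee, n_select):
    """Return the n_select candidates on which the committee disagrees most."""
    scored = [
        (committee_disagreement([model(x) for model in committee]), x)
        for x in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in scored[:n_select]]


# Hypothetical committee: three toy "models" standing in for trained networks.
committee = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 0.1,
    lambda x: x ** 2,
]

# Hypothetical candidate pool (one scalar feature each, for brevity).
candidates = [0.0, 1.0, 2.0, 5.0]

selected = select_by_committee(candidates, committee, n_select=2)
```

In this toy pool the models nearly agree at 0.0 and 2.0, so those points would add little new information, while 5.0 and 1.0 are selected.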
👍4❤3🔥3
Machine learning prediction of enzyme optimum pH
https://www.nature.com/articles/s42256-025-01026-6
📕 Nature Machine Intelligence (IF=23.8)
#method
Here we proposed and evaluated various machine learning methods for predicting pHopt, conducting extensive hyperparameter optimization and training over 11,000 model instances.
Our results demonstrate that models utilizing language model embeddings markedly outperform other methods in predicting pHopt. We present EpHod, the best-performing model, to predict pHopt, making it publicly available to researchers. From sequence data, EpHod directly learns structural and biophysical features that relate to pHopt, including proximity of residues to the catalytic centre and the accessibility of solvent molecules.
Nature Machine Intelligence - Accurately predicting the optimal pH level for enzyme activity is challenging due to the complex relationship between enzyme structure and function. Gado and...
👍5❤3🔥3🐳1
Predictive modeling of visible-light azo-photoswitches’ properties using structural features🔥
https://doi.org/10.1186/s13321-025-00993-7
📕 Journal of Cheminformatics (IF=7.1)
#method
In this manuscript we present the strategy for modeling photoswitch properties (maximum absorption wavelength and thermal half-life of photoisomers) of visible-light azo-photoswitches using structural data. We compile a comprehensive data set from literature sources and perform a rigorous benchmark to select the best feature type and modeling approach.
👍5❤3🔥3
Leveraging Quantum Chemistry and Machine Learning for the Design of Low-Valent Transition Metal Catalysts in Nitrogen to Ammonia Conversion
https://pubs.acs.org/doi/10.1021/jacs.5c00099
📕 Journal of the American Chemical Society (IF=14.4)
#method
Here, we integrate quantum chemistry, molecular dynamics, and machine learning (ML) to uncover mechanistic features governing nitrogen reduction reaction (NRR) activity and guide catalyst design.
Density functional theory (DFT) and ab initio molecular dynamics reveal that [Fe(CAAC)2] leverages redox noninnocent CAAC ligands to stabilize Fe(I) ([FeI(CAAC)2·–]), with strong antiferromagnetic coupling (JFe-CAAC = −1817 cm⁻¹). Flexibility of bulky Dipp groups is found to hinder N2 binding, rationalizing experimental observations. The exothermic formation of [(CAAC(H))2Fe] (ΔG = −4.5 kJ/mol) with in situ generated H2 exposure rationalizes the lower TON observed via catalyst deactivation.
ML models trained on quantum descriptors such as M–C bond lengths, spin density, and frontier orbital energies identify the M–C distance as a key predictor of reactivity. A composite free energy metric (ΔGtot) encompassing cis-trans isomerization (ΔG10), N2 binding (ΔG20), and the first reduction step (ΔG30) enables ranking of candidate catalysts. Moreover, Ti and V complexes show the lowest ΔGtot (24–60 kJ/mol), while late transition and coinage metals exceed 120 kJ/mol, correlating with lower activity.
By providing unprecedented insights into the interplay among ligand design, metal choice, and catalytic efficiency, this work lays a critical foundation for the rational design of homogeneous NRR catalysts, with implications for advancing sustainable ammonia production technologies.
The conversion of N2 to NH3 under ambient conditions is a major goal in sustainable chemistry. Homogeneous catalysts, particularly those employing cyclic(alkyl)(amino)carbene (CAAC) ligands, have demonstrated promise in stabilizing low-valent Fe centers,…
👍4❤3🔥3
Token-Mol 1.0: tokenized drug design with large language models🔥
https://doi.org/10.1038/s41467-025-59628-y
📕 Nature Communications (IF=14.7)
#method
Here, we present Token-Mol, a token-only 3D drug design model that encodes both 2D and 3D structural information, along with molecular properties, into discrete tokens.
The model surpasses existing methods, improving molecular conformation generation by over 10% and 20% across two datasets, while outperforming token-only models by 30% in property prediction. In pocket-based molecular generation, it enhances drug-likeness and synthetic accessibility by approximately 11% and 14%, respectively. Notably, Token-Mol operates 35 times faster than expert diffusion models.
In real-world validation, it improves success rates and, when combined with reinforcement learning, further optimizes affinity and drug-likeness, advancing AI-driven drug discovery.
Nature Communications - In this work the authors present Token-Mol, a token-only 3D drug design model, which deploys the Gaussian cross-entropy (GCE) loss function for regression tasks. It exhibits...
👍4❤3🔥3
Bridging chemistry and artificial intelligence by a reaction description language
https://doi.org/10.1038/s42256-025-01032-8
📕 Nature Machine Intelligence (IF=23.8)
#method
Here, we present ReactSeq, a reaction description language that defines molecular editing operations for step-by-step chemical transformation. Based on ReactSeq, language models for retrosynthesis prediction may consistently excel in all benchmark tests, and demonstrate promising emergent abilities in the human-in-the-loop and explainable artificial intelligence. Moreover, ReactSeq has allowed us to obtain universal and reliable representations of chemical reactions, which enable navigation of the reaction space and aid in the recommendation of experimental procedures and prediction of reaction yields. We foresee that ReactSeq can serve as a bridge to narrow the gap between chemistry and artificial intelligence.
Nature Machine Intelligence - Xiong et al. introduce ReactSeq, a reaction description language that captures molecular editing operations in chemical reactions. It enables language models to excel...
👍6❤4🔥4
AI Approaches to Homogeneous Catalysis with Transition Metal Complexes🔥
https://doi.org/10.1021/acscatal.5c01202
📕 ACS Catalysis (IF=11.7)
#review
Artificial intelligence (AI) is transforming research in chemistry, including homogeneous catalysis with transition metals. Over the past 15 years, the number of publications combining AI with catalysis has increased exponentially, reflecting the interest and strength of this strategy in the field. Since this is a broad emerging discipline, it is essential to establish guidelines that clarify the diverse approaches already available.
Initially, models were developed to predict key aspects of the reaction mechanism, aiming at screening catalyst candidates. Subsequent studies have incorporated experimental data to optimize reaction conditions and yields. More recently, generative AI based on deep learning methods has enabled the inverse design of novel catalysts with predefined target properties. While most studies rely on computational data, recent advancements have improved the acquisition of experimental data, enabling AI-driven automated workflows.
This Perspective gives a critical overview on selected studies that reflect the state of the art in the application of AI to homogeneous metal-catalyzed reactions, also highlighting future opportunities and challenges.
👍4❤3🔥3
Generalizable, fast, and accurate DeepQSPR with fastprop🔥
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-025-01013-4
🖥 https://github.com/jacksonburns/fastprop
📕 Journal of Cheminformatics (IF=7.1)
#method
This paper introduces fastprop, a software package and general Deep-QSPR framework that combines a cogent set of molecular descriptors with deep learning to achieve state-of-the-art performance on datasets ranging from tens to tens of thousands of molecules.
fastprop provides both a user-friendly Command Line Interface and highly interoperable set of Python modules for the training and deployment of feedforward neural networks for property prediction.
This approach yields improvements in speed and interpretability over existing methods while statistically equaling or exceeding their performance across most of the tested benchmarks.
Quantitative Structure–Property Relationship studies (QSPR), often referred to interchangeably as QSAR, seek to establish a mapping between molecular structure and an arbitrary target property. Historically this was done on a target-by-target basis with new…
❤4👍4🔥4
SurfPro – a curated database and predictive model of experimental properties of surfactants🔥
https://doi.org/10.1039/D4DD00393D
📕 Digital Discovery (IF=6.2)
#dataset
Surfactant data are scattered across many literature sources, and reported in a manner which is often unsuitable as input for predictive models. In this work, we address this limitation by compiling the SurfPro database of surfactant properties. SurfPro consists of 1624 surfactant entries curated from 223 literature sources, containing 1395 CMC values, 972 γCMC values and more than 657 values for Γmax, C20, πCMC and Amin. However, only 647 structures have all reported properties, and for most surfactants multiple properties are missing.
We trained a previously reported graph neural network architecture for single- and multi-property prediction on these incomplete data of all surfactant types in the database to accurately predict pCMC (−log10(CMC)), γCMC, Γmax and pC20. We achieved state-of-the-art performance of these four properties using an ensemble of AttentiveFP models trained on ten different folds of the training data in the multi-property setting. Finally, we leveraged the predictions and uncertainties of the ensemble model to impute all missing properties for all 977 surfactants with an incomplete set of properties. We make our curated SurfPro database, proposed test split and training datasets, the imputed database, as well as our code publicly available.
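The pCMC transform and the ensemble-based imputation described above can be sketched as follows. This is an illustration under stated assumptions, not the SurfPro code: CMC is assumed to be in mol/L, the "models" are constant stand-ins for trained AttentiveFP networks, and the spread of the ensemble is used as the uncertainty.

```python
import math
from statistics import fmean, stdev


def pcmc(cmc_molar):
    """pCMC = -log10(CMC); CMC assumed to be given in mol/L."""
    return -math.log10(cmc_molar)


def ensemble_predict(models, x):
    """Return (mean prediction, sample standard deviation) over the ensemble."""
    predictions = [model(x) for model in models]
    return fmean(predictions), stdev(predictions)


def impute_missing(record, models_by_property, x):
    """Fill properties that are None with the ensemble's (mean, uncertainty)."""
    filled = dict(record)
    for prop, models in models_by_property.items():
        if filled.get(prop) is None:
            filled[prop] = ensemble_predict(models, x)
    return filled


# Hypothetical ensemble: three constant predictors standing in for trained folds.
toy_models = {"pCMC": [lambda x: 3.0, lambda x: 3.2, lambda x: 3.4]}

# Hypothetical surfactant record with a missing pCMC value.
record = {"name": "example surfactant", "pCMC": None}
imputed = impute_missing(record, toy_models, x=None)
```

Reporting the standard deviation alongside each imputed value lets downstream users filter out entries where the ensemble members disagree strongly.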
Despite great industrial interest, modeling the physical properties of surfactants in water based on their molecular structure remains a challenge. A significant part of this challenge is in obtaining sufficient amounts of high-quality data. Experimentally…
❤4👍4🔥4
MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction🔥
https://doi.org/10.48550/arXiv.2406.12950
arXiv
#method
Molecular property prediction (MPP) is a fundamental and crucial task in drug discovery. However, prior methods are limited by the requirement for a large number of labeled molecules and their restricted ability to generalize for unseen and new tasks, both of which are essential for real-world applications.
To address these challenges, we present MolecularGPT for few-shot MPP. From a perspective on instruction tuning, we fine-tune large language models (LLMs) based on curated molecular instructions spanning over 1000 property prediction tasks. This enables building a versatile and specialized LLM that can be adapted to novel MPP tasks without any fine-tuning through zero- and few-shot in-context learning (ICL). MolecularGPT exhibits competitive in-context reasoning capabilities across 10 downstream evaluation datasets, setting new benchmarks for few-shot molecular prediction tasks. More importantly, with just two-shot examples, MolecularGPT can outperform standard supervised graph neural network methods on 4 out of 7 datasets. It also outperforms state-of-the-art LLM baselines, with up to a 15.7% increase in classification accuracy and a decrease of 17.9 in regression metrics (e.g., RMSE) in the zero-shot setting. This study demonstrates the potential of LLMs as effective few-shot molecular property predictors.
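For readers unfamiliar with few-shot in-context learning, a prompt of this kind might be assembled as below. The template, field names, and task wording are hypothetical and are not MolecularGPT's actual instruction format; the sketch only shows how labeled examples are placed ahead of the query molecule.

```python
def build_few_shot_prompt(task_description, examples, query_smiles):
    """Assemble a k-shot prompt: task text, labeled examples, then the open query."""
    lines = [task_description, ""]
    for smiles, label in examples:
        lines.append(f"SMILES: {smiles}")
        lines.append(f"Answer: {label}")
        lines.append("")
    lines.append(f"SMILES: {query_smiles}")
    lines.append("Answer:")
    return "\n".join(lines)


# Illustrative two-shot examples (ethanol and benzene); labels are for demonstration only.
two_shot_examples = [
    ("CCO", "Yes"),
    ("c1ccccc1", "No"),
]

prompt = build_few_shot_prompt(
    task_description="Is the molecule readily soluble in water? Answer Yes or No.",
    examples=two_shot_examples,
    query_smiles="CC(=O)O",  # acetic acid as the query molecule
)
```

Passing an empty `examples` list yields the zero-shot variant of the same prompt.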
👍6❤4🔥4
Advancing molecular machine learning representations with stereoelectronics-infused molecular graphs
https://doi.org/10.1038/s42256-025-01031-9
@GPTyrannosaurus, congratulations on an awesome publication!
📕 Nature Machine Intelligence (IF=18.8)
#method
This work introduces a new approach to infusing quantum-chemical-rich information into molecular graphs via stereoelectronic effects, enhancing expressivity and interpretability. Learning to predict the stereoelectronics-infused representation with a tailored double graph neural network workflow enables its application to any downstream molecular machine learning task without expensive quantum-chemical calculations.
We show that the explicit addition of stereoelectronic information substantially improves the performance of message-passing two-dimensional machine learning models for molecular property prediction. We show that the learned representations trained on small molecules can accurately extrapolate to much larger molecular structures, yielding chemical insight into orbital interactions for previously intractable systems, such as entire proteins, opening new avenues of molecular design.
Finally, we have developed a web application (simg.cheme.cmu.edu) where users can rapidly explore stereoelectronic information for their own molecular systems.
Nature Machine Intelligence - Boiko et al. enhance expressiveness and interpretability of molecular representation in graph neural networks by including quantum-chemical-rich information into...
🔥6❤4👍4
Cross-disciplinary perspectives on the potential for artificial intelligence across chemistry🔥
https://doi.org/10.1039/D5CS00146C
📕 Chemical Society Reviews (IF=40.4)
#review
Here, we present ten diverse perspectives on the impact of AI coming from those with a range of backgrounds from experimental chemistry, computational chemistry, computer science, engineering and across different areas of chemistry, including drug discovery, catalysis, chemical automation, chemical physics, materials chemistry.
The ten perspectives presented here cover a range of themes, including AI for computation, facilitating discovery, supporting experiments, and enabling technologies for transformation. We highlight and discuss imminent challenges and ways in which we are redefining problems to accelerate the impact of chemical research via AI.
From accelerating simulations and exploring chemical space, to experimental planning and integrating automation within experimental labs, artificial intelligence (AI) is changing the landscape of chemistry. We are seeing a significant increase in the number…
🔥5❤4👍4
Ab initio structure solutions from nanocrystalline powder diffraction data via diffusion models 🔥
https://www.nature.com/articles/s41563-025-02220-y
🖥 https://github.com/gabeguo/cdvae_xrd
📕 Nature Materials (IF=37.2)
#method
A major challenge in materials science is the determination of the structure of nanometre-sized objects. Here we present an approach that uses a generative machine learning model based on diffusion processes that is trained on 45,229 known structures.
The model factors in the measured diffraction pattern as well as the relevant statistical priors on the unit cell of atomic cluster structures. Conditioned only on the chemical formula and the information-scarce finite-size broadened powder diffraction pattern, we find that our model, PXRDnet, can successfully solve simulated nanocrystals as small as 10 Å across 200 materials of varying symmetries and complexities, including structures from all seven crystal systems.
We show that our model can successfully and verifiably determine structural candidates four out of five times, with an average error among these candidates being only 7% (as measured by the post-Rietveld refinement R-factor). Furthermore, PXRDnet is capable of solving structures from noisy diffraction patterns gathered in real-world experiments.
Nature Materials - A machine learning model that can solve nanocrystalline structures from highly degraded PXRD patterns is presented. It is shown to be successful on simulated crystals as small as...
🔥8❤5👍3