GraphML News (March 23rd) - Neo-1 and Lila Sciences round
🧬 VantAI announced Neo-1, a foundation model for structure prediction and de novo generation that handles a bunch of protein design tasks (folding, co-folding, docking, all-atom molecule design, fragment linking, and more) in a single model instead of separate modules. While we wait for the tech report, we can guesstimate that Neo-1 is an all-atom latent generative model (perhaps a Diffusion Transformer like its competitors, as it’s powered by a hefty cluster of H100s) with some advanced sampling techniques beyond standard guidance - the blog post talks about optimizing for non-differentiable properties with reward-like models, which sounds quite similar to the ICLR 2025 paper on posterior prediction.
As impressive as the modeling advances are, true aficionados know that data diversity and distribution are even more important at scale - on that front, VantAI introduces NeoLink, a massive data generation flywheel based on cross-linking mass spectrometry (XLMS). Reported experiments suggest it brings massive improvements in quality, so it’s likely the key innovation and the focus of further scaling. The graphics in the blog post are amazing and the graphic designer should get a raise 📈.
💸 Lila Sciences came out of stealth with $200M in seed funding. Lila will focus on materials discovery and automated self-driving labs while alluding to Superscience, an AI 4 Science equivalent of the Superintelligence you often hear about from LLM folks, which would massively speed up exploration pipelines. Lila is part of the Flagship Pioneering ecosystem (you might know Generate Biomedicines, whose Chroma generative model made some noise last year) and attracted funding from General Catalyst, March Capital, ARK, and other famous VCs (even the Abu Dhabi Investment Authority). Knowing that OpenAI’s VP of post-training William Fedus left to start his own AI 4 Science company, the area is likely to attract even more VC funding in the near future.
Weekend reading:
Towards Quantifying Long-Range Interactions in Graph Machine Learning: a Large Graph Dataset and a Measurement by Huidong Liang and Oxford folks - introduces new long-range graph datasets extracted from road networks in OpenStreetMap. Good news: graphs are quite large and sparse (100k nodes with 100+ diameter). Less good news: GraphSAGE is still SOTA 🫠
No Metric to Rule Them All: Toward Principled Evaluations of Graph-Learning Datasets by Corinna Coupette, Jeremy Wayland, et al - studies the quality of 11 graph classification datasets; only NCI1, MolHIV, and the LRGB datasets are ok, the rest should be thrown in the garbage.
A Materials Foundation Model via Hybrid Invariant-Equivariant Architectures by Keqiang Yan and a large Texas A&M collab - introduces HIENet, an ML potential rivaling MACE-MP0, Equiformer, and ORB on energy, force, and stress prediction.
Survey on Generalization Theory for Graph Neural Networks by Antonis Vasileiou, Stefanie Jegelka, Ron Levie, and Christopher Morris - everything you wanted to know about GNNs linked to VC dimension, Rademacher complexity, PAC-Bayes, and learning theory. MATH ALERT
GraphML News (April 5th) - Isomorphic Round, Graph Transformers at Kumo, new blogs
Got some news!
💸 Isomorphic Labs raised a generous $600M from Thrive Capital, GV, and Alphabet in the first external round. The attached press release also mentions collaborations with pharma giants Eli Lilly and Novartis - seems like whatever comes next after AlphaFold 3 looks quite appealing to the industry. We’ll keep you posted in our Geometric Wall Street Bulletin.
🏵️ Looking at LLM guts from the graph learning perspective is becoming popular: Anthropic posted a massive study on AI biology - two papers and lots of visual material with strong graph vibes - showing that LLMs perform multi-hop reasoning with concept graphs in mind, and that you can actually identify circuits (DAGs) of activations doing certain kinds of computation.
🚚 Kumo published a nice blog post on using graph transformers at scale in relational DL tasks. Perhaps the most insightful part is about positional encodings - as graphs are large (10M+ nodes in RelBench), global PEs don’t really scale up so they have to resort to more local options like hop encoding or relative PEs. Besides, there is a need for time encoding as the e-commerce graphs are always temporal. Experiments on RelBench bring some noticeable improvements.
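To make the positional encoding part concrete, here is a rough sketch (my own toy code, not Kumo’s implementation) of a local hop-based positional encoding over a sampled ego-graph and a sinusoidal time encoding for event timestamps:

```python
# Toy sketch of hop-based PEs and time encodings; names and shapes are illustrative.
from collections import deque
import torch


def hop_encoding(edges: list[tuple[int, int]], num_nodes: int, root: int = 0,
                 max_hops: int = 8) -> torch.Tensor:
    """One-hot BFS distance of each node from the seed node of the ego-graph."""
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:                  # undirected ego-graph
        adj[u].append(v)
        adj[v].append(u)
    hops = [max_hops] * num_nodes       # max_hops doubles as the "far away" bucket
    hops[root] = 0
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if hops[v] == max_hops and hops[u] + 1 < max_hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return torch.nn.functional.one_hot(torch.tensor(hops), max_hops + 1).float()


def time_encoding(timestamps: torch.Tensor, dim: int = 16) -> torch.Tensor:
    """Sinusoidal features of event times relative to the prediction time."""
    freqs = 10.0 ** torch.linspace(0, 5, dim // 2)
    angles = timestamps.unsqueeze(-1) / freqs
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)


print(hop_encoding([(0, 1), (0, 2), (2, 3)], num_nodes=4).shape)   # (4, 9)
print(time_encoding(torch.tensor([0.0, 3600.0, 86400.0])).shape)   # (3, 16)
```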
⌛ Bryan Perozzi from Google Research (the OG of DeepWalk) wrote a review post looking at the milestones of graph learning from pre-historic times (pre-2013) to the most recent applications (those we often highlight in this channel).
Weekend reading:
Why do LLMs attend to the first token? by Federico Barbero, Alvaro Arroyo, and all the familiar folks from Transformers Need Glasses - here the authors study attention sinks and draw parallels to over-squashing and representational collapse.
On that note, don’t miss the talk by Petar Veličković on LLMs as Graph Neural Networks at the recent GLOW reading group - it adds much more context to the research area and hints at the above paper.
Affordable AI Assistants with Knowledge Graph of Thoughts by Maciej Besta and 13 (👀) co-authors. Maciej is the author of the famous Graph of Thoughts, here it’s extended to KGs and agentic environments.
GraphML News (April 13th) - Orb V3, RF Diffusion 2, Breakthrough Prizes, ICML 2025 Workshops
🔮 Orbital Materials released Orb V3, the next version of the universal ML potential. Some improvements include training a wider but shallower model (5-layer MPNN with 1024d MLP instead of 15-layer with 512d in v2), having both versions where forces are predicted directly (non-conservative) or as a gradient of energy (conservative force field), and a good bunch of training tricks listed on github. ORB v3 shows top results on MatBench Discovery and now has a confidence prediction head akin to pLDDT in AlphaFold. The accompanying paper and model checkpoint are available, plug them in right away.
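The conservative vs non-conservative distinction is easy to see in code - a minimal sketch below (a toy energy model standing in for the MPNN, not Orb’s architecture) computes forces as the negative gradient of a predicted energy via autograd:

```python
# Minimal sketch: conservative forces as -dE/dx from a toy energy model.
import torch
import torch.nn as nn


class ToyEnergyModel(nn.Module):
    """Stand-in for an MPNN potential: maps atom positions to a scalar energy."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, hidden), nn.SiLU(), nn.Linear(hidden, 1))

    def forward(self, pos: torch.Tensor) -> torch.Tensor:
        return self.mlp(pos).sum()   # total energy of the system


pos = torch.randn(10, 3, requires_grad=True)   # 10 atoms in 3D
model = ToyEnergyModel()

energy = model(pos)
# Conservative force field: F = -dE/dx, curl-free by construction
forces = -torch.autograd.grad(energy, pos, create_graph=True)[0]
print(energy.shape, forces.shape)   # torch.Size([]) torch.Size([10, 3])

# A non-conservative variant would instead predict forces with a separate
# 3-dimensional output head, trading physical guarantees for speed.
```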
🧬 The Baker Lab released RF Diffusion 2 (as a pre-print for now) focusing on de novo enzyme design. RFD2 is a Riemannian flow matching model in both frame and coordinate space (the frame part is very much like FrameFlow) and was trained for 17 days on 24 A100s (rather average compute these days). RFD2 outperforms the original RFDiffusion on older and newly constructed benchmarks; the code is coming soon.
🏆 The Breakthrough Prize (founded by Sergey Brin, Mark Zuckerberg, Yuri Milner, and other famous tech folks) announced the 2025 laureates: the scientists behind GLP-1 drugs (aka Ozempic) got the Life Sciences award, CERN and the Large Hadron Collider got the physics award, and Dennis Gaitsgory got the mathematics prize for proving the geometric Langlands conjecture (from which a proof of Fermat’s Last Theorem stems naturally). Check out also the New Frontiers and New Horizons categories dedicated to younger scientists.
Finally, ICML 2025 announced the list of accepted workshops, and all the usual suspects are there: generative models, comp bio, AI 4 Science, and a handful of LLM and world-model workshops - should be a nice selection for those attending in Vancouver this summer.
Weekend reading:
Orb-v3: atomistic simulation at scale by Benjamin Rhodes, Sander Vandenhaute, and folks from Orbital Materials
Atom level enzyme active site scaffolding using RFdiffusion2 by Woody Ahern, Jason Yim, Doug Tischer, UW and Baker Lab (including the man himself)
Graph Learning Will Lose Relevance Due To Poor Benchmarks
by Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael M. Bronstein, Mathias Niepert, Bryan Perozzi, Mikhail Galkin, Christopher Morris
📜 arxiv
📣 Our new spicy ICML 2025 position paper. Graph learning is less trendy in the ML world than it was in 2020-2022. We believe the problem is in poor benchmarks that hold the field back - and suggest ways to fix it!
We identified three problems:
#️⃣ P1: No transformative real-world applications - while LLMs and geometric generative models become more powerful and solve complex tasks every generation (from reasoning to protein folding), how transformative could a GNN on Cora or OGB be?
P1 Remedies: The community is overlooking many significant and transformative applications, including chip design and broader ML for systems, combinatorial optimization, and relational data (as highlighted by RelBench). Each of them offers $billions in potential outcomes.
#️⃣ P2: While everything can be modeled as a graph, often it should not be. We ran a simple experiment probing a vanilla DeepSet without edges and a GNN on Cayley graphs (fixed edges for a given number of nodes) on molecular datasets - the performance is quite competitive.
#️⃣ P3: Bad benchmarking culture (this one hits hard) - it’s a mess 🙂
Small datasets (don’t use Cora and MUTAG in 2025), no standard splits, and in many cases recent models are clearly worse than GCN / Sage from 2020. It gets worse when evaluating generative models.
Remedies for P3: We need more holistic benchmarks which are harder to game and saturate - while it’s a common problem for all ML fields, standard graph learning benchmarks are egregiously old and rather irrelevant for the scale of problems doable in 2025.
💡 As a result, it’s hard to build a true foundation model for graphs. Instead of training each model on each dataset, we suggest using GNNs / GTs as processors in the “encoder-processor-decoder” blueprint, train them at scale, and only tune graph-specific encoders/decoders.
For example, we pre-trained several models on PCQM4M-v2, COCO-SP, and MalNet Tiny, and fine-tuned them on PascalVOC, Peptides-struct, and Stargazers to find that graph transformers benefit from pre-training.
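For the curious, a minimal sketch of that encoder-processor-decoder blueprint (illustrative class names, not the paper’s code): a shared processor pre-trained at scale, with thin dataset-specific encoders and decoders around it.

```python
# Sketch of the "encoder-processor-decoder" blueprint; class names are illustrative.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv


class SharedProcessor(nn.Module):
    """Pre-trained at scale and reused across datasets (frozen or fine-tuned)."""
    def __init__(self, dim: int = 128, layers: int = 4):
        super().__init__()
        self.convs = nn.ModuleList(GCNConv(dim, dim) for _ in range(layers))

    def forward(self, x, edge_index):
        for conv in self.convs:
            x = torch.relu(conv(x, edge_index))
        return x


class GraphModel(nn.Module):
    def __init__(self, processor, in_dim: int, out_dim: int, dim: int = 128):
        super().__init__()
        self.encoder = nn.Linear(in_dim, dim)    # dataset-specific, always trained
        self.processor = processor               # shared across datasets
        self.decoder = nn.Linear(dim, out_dim)   # dataset-specific, always trained

    def forward(self, x, edge_index):
        return self.decoder(self.processor(self.encoder(x), edge_index))


processor = SharedProcessor()
mol_model = GraphModel(processor, in_dim=9, out_dim=1)        # e.g. molecular regression
vision_model = GraphModel(processor, in_dim=14, out_dim=21)   # e.g. superpixel segmentation
```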
The project started around NeurIPS 2024 when Christopher Morris gathered us to discuss the pain points of graph learning and how to continue doing impactful research in this area. I believe the outcomes are promising, and we can re-imagine graph learning in 2025 and beyond!
GraphML News (May 10th) - PageRank and New Pope, Scientific Agents, more blogs
🤌 🇻🇦 Researchers from Bocconi University in Milan rolled out the best use of network science of 2025: using centrality measures to predict the outcome of the conclave (which elects the next Pope). They mined a graph of Vatican cardinals based on their job duties, informal relationships, and “spiritual genealogies”, and computed a bunch of centrality measures - eigenvector centrality (probably a PageRank variant), betweenness centrality (affordable for small networks), and some clustering metrics. One of them did rank the actually elected Pope at the top (although the others didn’t have him in their top 5), which is a cool result. Good ole PageRank still makes headlines in 2025!
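If you want to play the same game at home, a toy reconstruction of the idea fits in a few lines of networkx (the edges below are hypothetical, not the Bocconi data):

```python
# Toy version: build a cardinal relationship graph, rank nodes by centrality.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Cardinal A", "Cardinal B"), ("Cardinal A", "Cardinal C"),
    ("Cardinal B", "Cardinal D"), ("Cardinal C", "Cardinal D"),
    ("Cardinal D", "Cardinal E"),   # hypothetical ties from duties / mentorship
])

pagerank = nx.pagerank(G)
eigenvector = nx.eigenvector_centrality(G)
betweenness = nx.betweenness_centrality(G)   # affordable for small networks

top = sorted(pagerank, key=pagerank.get, reverse=True)[:3]
print("Most papabile by PageRank:", top)
```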
🦅 FutureHouse announced the Platform for scientific discovery tasks. Practically, the Platform combines 4 distinct multimodal agents (avian beings): Crow for search, Falcon for deep search, Owl for questions a-la “has anyone done X before”, and Phoenix as the next-gen ChemCrow for molecular design. The agents accept whatever text and image inputs you have at hand and will search a huge collection of scientific documents. Having worked with PaperQA before, I have had good experiences with FH tools - there might be more announcements coming soon about new scientific results achieved with those agents.
✍️ More blogposts! Kumo is on a writing spree: a massive post on relational graph transformers for RelBench that improve over GNNs (but I bet GNNs are much faster and more scalable), and a more technical writeup on enabling torch.compile for GNNs, which results in 30% training speedups. ProTip: GNNs in JAX are JIT’table from the very beginning 😉
🧬 AITHYRA announced the AI4Science symposium to take place in Vienna on September 8-10 with top speakers from AI and Life Sciences areas.
Weekend reading:
System of Agentic AI for the Discovery of Metal-Organic Frameworks by Theo Jaffrelot Inizan, Sherry Yang, Aaron Kaplan, Yen-hsu Lin, and a team of UC Berkeley and DeepMind researchers - another take on the multi-agent discovery pipeline combining LLMs, diffusion models, and ML potentials for creating new metal-organic frameworks (MOFs); it has already helped synthesize 5 new structures.
Plexus: Taming Billion-edge Graphs with 3D Parallel GNN Training by Aditya Ranjan and U of Maryland folks - a new platform for scaling GNNs to supercomputers, tried on up to 2048 GPUs on Frontier and Perlmutter with graphs up to OGB Papers100M.
GraphML News (May 14th) - KumoRFM, Open Molecules 2025, TxPert
Lots of news over the past two weeks other than new Gemini and Claude models!
🏆 KumoAI presented KumoRFM - the first graph foundation model for relational databases, capable of zero-shotting node regression, node classification, and link prediction. Given any set of relational tables with arbitrary categorical or numerical features, transformed into a graph, you can now zero-shot typical tasks like regression or classification. Perhaps the biggest difference of KumoRFM compared to other inductive models is its use of in-context learning: for each prediction task, it mines not only an ego-graph around the target entity but also ego-graphs around relevant nodes with similar labels. The backbone for encoding ego-graphs in node-level tasks is the Relational Graph Transformer (another new pre-print), and the resulting graph vectors are aggregated with attention pooling. Besides, KumoRFM has a built-in GNN Explainer to give some transparency to its decisions. As is typical for AI labs, Kumo doesn’t disclose which data KumoRFM was trained on, but they claim to zero-shot the whole of RelBench, which is a great achievement (albeit the results are slightly worse than their supervised ContextGNN).
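Since the technical details are sparse, here is a purely speculative sketch of how such in-context prediction over ego-graphs could look, based only on the description above (toy code, not KumoRFM’s actual pipeline; the encoder is a crude stand-in for the Relational Graph Transformer):

```python
# Speculative sketch of ego-graph in-context learning; all names are hypothetical.
import torch
import torch.nn as nn


class EgoGraphEncoder(nn.Module):
    """Crude stand-in for a graph transformer: ego-graph features -> one vector."""
    def __init__(self, in_dim: int, dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(in_dim, dim)

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        return self.proj(node_feats).mean(dim=0)   # mean pooling for the sketch


def in_context_predict(encoder, target_graph, context_graphs, context_labels):
    """Attend from the target ego-graph over labeled context ego-graphs."""
    q = encoder(target_graph)                                   # (dim,)
    keys = torch.stack([encoder(g) for g in context_graphs])    # (k, dim)
    attn = torch.softmax(keys @ q / q.shape[0] ** 0.5, dim=0)   # (k,)
    return (attn * context_labels).sum()                        # label-weighted readout


encoder = EgoGraphEncoder(in_dim=16)
target = torch.randn(12, 16)                        # ego-graph of the target entity
context = [torch.randn(10, 16) for _ in range(32)]  # labeled context ego-graphs
labels = torch.rand(32)                             # e.g. customer lifetime values
print(in_context_predict(encoder, target, context, labels))
```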
⚛️ FAIR Chemistry, CMU, National Labs, Genentech, and a large scientific collab presented Open Molecules 2025 and the Universal Model for Atoms - the largest dataset of simulations: 100M molecules, biomolecules, complexes, and MOFs with a plethora of properties to predict, covering structures of up to 350 atoms (10x larger than any other dataset). It took 6 billion CPU hours to complete the simulations 👀 OMol25 will probably be the main dataset for training the next gen of ML potentials - you can’t find a larger open-source dataset anywhere else. As a baseline, FAIR prepared the Universal Model for Atoms based on an equivariant GNN (eSEN) with a mixture of experts (hello from the LLM world).
🦠 Advancing Drug Discovery Outcomes with Virtual Cells by Valence Labs (Recursion) - introduces the data pipeline and computational platform for building a “virtual cell”. As a proof of concept, Valence trained TxPert, a model for predicting transcriptional responses to combinatorial genetic perturbations, that outperforms GEARS and scLAMDA. Besides, Valence put out a nice white paper on virtual cells with cool illustrations.
Weekend reading (before we get into all new fancy NeurIPS submissions) - theory alert:
Covered Forest: Fine-grained generalization analysis of graph neural networks by Antonis Vasileiou et al - on the generalization power of MPNNs.
Graph Representational Learning: When Does More Expressivity Hurt Generalization? by Sohir Maskey et al - another relevant work messaging that fancy expressive GNN architectures might actually be pretty bad at OOD generalization. Some day GNN theory folks will discover that Attention is All You Need 🙂
Addressing the Scarcity of Benchmarks for Graph XAI by Michele Fontanesi et al - proposes a new method to automate the generation of explainable-AI benchmarks for graph classification, where at least one of the classes is explained by a specific subgraph motif. It also bundles 15 new benchmarking tasks. Thanks to Domenico Tortorella for the pointer.
GraphML News (June 14th) - Boltz-2, OpenBind, Musings on equivariance
Back to the normal schedule!
🧬 The biggest announcement of the week - MIT and Recursion released Boltz-2, perhaps the most successful open-source reproduction of AlphaFold 3. v2 brings binding affinity prediction (orders of magnitude faster than physics simulations), model improvements, and inference speedups. The preprint also reports an experiment combining Boltz with SynflowNet to generate binders for the TYK2 protein. Code and model weights are already available.
🇬🇧 The UK announced the OpenBind initiative, aiming to collect data for 500k protein-ligand complexes using X-ray crystallography and synchrotron facilities at the Diamond Light Source. The academic side includes all the big names you’d expect - Charlotte Deane, Frank von Delft, David Baker - while the industrial side includes Isomorphic Labs, Roche, Boltz, and others. Let’s hope it will be the next PDB for protein design.
🌀 The need for equivariance continues to be a hot discussion topic - first, Chaitanya K. Joshi published a post reviewing two sides of the spectrum: low-data regimes where equivariance might help (by restricting model capacity) and high-data regimes (like generative modeling) where symmetries can be learned from data. Later on, Mark Neumann (Orbital Materials) published his take on the need for rotational equivariance and conservation of energy and how those can be achieved without strictly equivariant models (tricks like Equigrad, for instance). The post also features a handful of fresh papers on the topic - check them out too.
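For those who want to poke at this themselves, a generic sketch of measuring (or softly penalizing) the rotation-equivariance gap of an unconstrained force model - a plain consistency check, not the specific Equigrad trick from the post:

```python
# Generic rotation-equivariance check for any (N, 3) -> (N, 3) force predictor.
import torch


def random_rotation() -> torch.Tensor:
    """Random 3x3 rotation via QR; flip a column if needed so det = +1."""
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def equivariance_gap(force_model, pos: torch.Tensor) -> torch.Tensor:
    """|| f(R x) - R f(x) ||: zero for an exactly equivariant model; can be used
    as a diagnostic or added to the training loss as a soft penalty."""
    R = random_rotation()
    f = force_model(pos)             # (N, 3) predicted forces
    f_rot = force_model(pos @ R.T)   # forces on the rotated structure
    return (f_rot - f @ R.T).norm()


force_model = torch.nn.Linear(3, 3)   # stand-in for a non-equivariant predictor
pos = torch.randn(20, 3)
print(equivariance_gap(force_model, pos))
```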
Weekend reading (or github repos, heh):
Automated Non-Hermitian Spectral Graph Construction and GnLTransformer - a cool application that takes in the characteristic polynomial of Hamiltonians of 1D crystals and returns the spectral graph
Anomaly Detection with Graph Neural Networks (GNNs) - a PyG library and datasets for anomaly detection; thanks to Federico Bello for the pointer
On Measuring Long-Range Interactions in Graph Neural Networks - proposes a metric to compute effective range of MPNNs and GTs and studies LRGB tasks as requiring long-range connections or not
GraphML News (June 21st) - Skala, Temporal RDL, Future of Graph Learning, Erwin
⚛️ MSR AI 4 Science announced Skala - a machine-learned exchange-correlation (XC) functional for estimating chemical properties of molecules (energies and force fields). Skala represents molecules via density features obtained from the meta-generalized-gradient approximation (meta-GGA), evaluated on what is practically an irregular integration grid. The main model employs radial functions and spherical harmonics to capture non-local interactions and runs the integration over space. Skala was trained on a new dataset of 150k data points and reaches SOTA MAE on the W4-17 dataset. The preprint and data are available (lots of fancy equations in the appendix).
🕸️ Kumo published an interesting piece on temporal dependencies in relational DL where features change over time - note that in RelBench edges have timestamps but features are static. They tried predictive forecasting (training a regression head over the graph and the history of features) vs generative forecasting (training a diffusion model instead), which experimentally give pretty similar results. Transformers are used all the way through (both for graph encoding and for sequence modeling).
🌟 The Graph Learning on Wednesdays (GLOW) reading group summarized a series of recent discussions on the future of graph learning in a new blog post with opinions from many renowned researchers as to why graph learning is experiencing a certain identity crisis and lack of glamour compared to LLMs, agents, and mainstream AI research. Partly, it revolves around missing “killer” applications which would attract new researchers - nobody needs another variation of GCN / GAT / GT, or some esoteric positional encodings, or yet another self-supervised loss to train on Cora when you can run true scientific discovery with new generations of LLMs and agents. Our take on the problem is in the ICML’25 position paper - find us in Vancouver to chat and share your opinions in the comments.
🎱 Maksim Zhdanov (UvA) published a nice visual introduction to the ball tree attention used in the recent Erwin Transformer and to how one can expand the receptive field to large structures in subquadratic time. Erwin is quite strong on MD and PDE modeling tasks - check the post to find out about the smart tricks for sparser attention.
🎉 Finally, the GDL book now includes a new Chapter 5 on Graphs - in addition to standard architectures, the chapter talks about asynchronous and topological message passing, as well as looking at the transformer layer (self-attention + MLP) through the lens of message passing. The illustrations are cool - props to the renowned TikZ magician Petar Veličković.
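The message-passing view of self-attention fits in a dozen lines - a simplified single-head, unmasked version (my sketch, not the book’s code):

```python
# Single-head self-attention as attentional message passing on a complete graph.
import torch
import torch.nn as nn

n, d = 6, 32                       # 6 tokens (= nodes), width 32
x = torch.randn(n, d)
Wq, Wk, Wv = (nn.Linear(d, d, bias=False) for _ in range(3))

scores = (Wq(x) @ Wk(x).T) / d ** 0.5    # edge scores for every sender-receiver pair
alpha = torch.softmax(scores, dim=-1)    # normalize over senders per receiver
messages = Wv(x)                         # per-node message (value) vectors
out = alpha @ messages                   # aggregate: sum_j alpha_ij * v_j

# the same thing written as explicit message passing for node 0
out_0 = sum(alpha[0, j] * messages[j] for j in range(n))
print(torch.allclose(out[0], out_0, atol=1e-6))   # True
```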
Weekend reading:
Don’t procrastinate with new papers - go finish those NeurIPS reviews, or the ACs will shame you and send news to your co-authors about how slow you are 🙂
GraphML News (July 4th 🦅) - Chai-2, SAIR dataset, UMA 1.1, Why flow matching generalizes
Some quick news before BBQ time and beating the aliens over NYC.
🧬 Chai Discovery announced Chai-2, which excels at antibody design, generating novel binders for 50+ protein targets and achieving a 16% binding rate in wet-lab tests (that’s quite a lot). The tech report says the backbone is a modified Chai-1, but probably with a lot more new training data (which is a good sign - it’s 2025, models don’t matter as much as data does). Chai-2 was announced just 2 weeks after Boltz-2 - both started as AlphaFold3 reproductions but are now moving in slightly different directions, e.g., Chai-2 is not open-source anymore. We’ll be keeping an eye on their successes.
🧬 🧬 SandboxAQ released the new SAIR dataset (Structurally Augmented IC50 Repository) comprising 5M structures over 1M+ unique protein-ligand systems (folded with Boltz-1x). It’s just 2.5 TB, so you have no excuse not to train the next protein-ligand generative model on SAIR 😉
⚛️ FAIR Chemistry updated their Universal Model for Atoms (UMA) to 1.1 (preprint) and significantly improved the performance on catalysis and molecules tasks - scaling MoE transformers shows benefits and makes adepts of equivariance unhappy 🙂
Weekend reading:
On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity by Quentin Bertrand et al - a nice study of why flow matching generalizes and can generate data outside the training distribution. Turns out it happens thanks to neural nets failing to learn the velocity field exactly.
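For reference, the standard linear-path flow matching setup the paper builds on, written in my notation (the paper’s exact conventions may differ):

```latex
% Interpolant between noise x_0 ~ p_0 and data x_1 ~ p_1:
%   x_t = (1 - t) x_0 + t x_1,  with conditional target velocity u_t = x_1 - x_0.
\mathcal{L}(\theta) = \mathbb{E}_{t,\,x_0,\,x_1}
  \bigl\| v_\theta(x_t, t) - (x_1 - x_0) \bigr\|^2,
\qquad
v^\star(x, t) = \mathbb{E}\bigl[\, x_1 - x_0 \mid x_t = x \,\bigr].
% v* is the exact minimizer and has a closed form when p_1 is the empirical
% training distribution; following it exactly only reproduces training samples,
% so any generalization has to come from v_theta deviating from v*.
```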
GraphML News (Aug 3rd) - Graph Foundation Models from Google, PyG ecosystem expanding
It’s been a while since the last post, let’s catch up with the news!
🔮 ICML brought a handful of announcements - e.g., our team at Google published a blog post on the in-house Graph Foundation Model, which particularly excels on relational data and brings nice (3-40x) gains compared to SOTA tabular models. It’s quite astounding that the tabular ML world has been overlooking graph modeling for this kind of data for years, leaving lots of performance on the table. Well, as we said last year, GFMs are already here and will continue to improve across all axes, from systems and infra to modeling and better generalization.
🌟 Besides that, ICML published the list of outstanding paper awards, and a handful of them use graph learning in one way or another - an excellent reminder that beating old benchmarks by 1% is not that important (looking at you, ZINC aficionados), while smart application of this tool in appropriate cases (and actually designing those cases) is very promising, encouraged by the community, and can bring insights even in the LLM & agentic era.
🔥 The PyG world is expanding - PyG maintainers released an overview paper on PyG 2.0 and its latest features including first-class support for explainability, heterogeneous graphs, and scalability improvements. RelBench and relational data seem to be the main blockbuster use-cases of those features, and it’s great to see PyG is keeping the bar high ⛳ Another fresh addition to the ecosystem is the Torch Geometric Pool library that expands the variety of pooling functions.
⌛ Temporal Graph Modeling (TGM) is another new PyG-based library, designed for temporal and dynamic graphs. It already bundles several standard baselines such as TGN, TGAT, GraphMixer, and EdgeBank, as well as datasets like the Temporal Graph Benchmark. Have a look at the accompanying preprint.
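EdgeBank in particular is a great reality check for temporal link prediction - the whole method is essentially edge memorization. A sketch of the idea (not TGM’s actual API):

```python
# The idea behind EdgeBank (unlimited-memory variant): remember observed edges
# and predict "positive" for any previously seen source-destination pair.
class EdgeBank:
    def __init__(self):
        self.seen: set[tuple[int, int]] = set()

    def update(self, src: int, dst: int, t: float) -> None:
        self.seen.add((src, dst))

    def predict(self, src: int, dst: int) -> float:
        return 1.0 if (src, dst) in self.seen else 0.0


bank = EdgeBank()
for src, dst, t in [(0, 1, 10.0), (1, 2, 11.5), (0, 1, 12.0)]:
    bank.update(src, dst, t)

print(bank.predict(0, 1), bank.predict(2, 0))   # 1.0 0.0
```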
Postdoctoral Researcher Position in Geometric Deep Learning & AI for Science at AITHYRA
A joint position between AITHYRA and the Technical University of Vienna
Michael Bronstein, AITHYRA Scientific Director for AI and Honorary Professor at the Technical University of Vienna, in collaboration with Ismail Ilkan Ceylan, an expert in graph machine learning, invites outstanding candidates to apply for a postdoctoral research position in Geometric Deep Learning, with a strong emphasis on applications to biology and scientific discovery. This unique research collaboration between AITHYRA and the Technical University of Vienna offers an exceptional opportunity to engage in both foundational machine learning research and high-impact interdisciplinary applications in the natural sciences. The position offers access to top-tier academic and industry research ecosystems and is ideally suited for researchers seeking to push the boundaries of geometric and graph-based learning in real-world scientific contexts. The research program is flexible and interdisciplinary.
The application deadline is 31.8.2025. Link
---
Highly recommend applying - working with Michael and Ismail is a great experience 🙂
GraphML News (Aug 9th) - AITHYRA Call for PhD students, Chai Discovery Round, Graph Learning Meets Theoretical CS
While everyone is busy with GPT-5, Opus 4.1, and GPT-OSS, let’s sneak in some graph news!
🎓 A few days ago you could’ve seen AITHYRA’s call for postdocs - but fear not if you are still deciding about starting your scientific career: AITHYRA has a call for PhD students too! The plan includes 15-20 fully funded scholarships at the intersection of AI/ML, molecular technologies, and systems medicine (with a degree from either the Medical University or the Technical University of Vienna). The application deadline is September 10, 2025. Glad to see Vienna becoming a new scientific hub in Europe.
💸 Chai Discovery raised a $70M Series A from Menlo Ventures & Anthology Fund (Anthropic), Thrive Capital, OpenAI, and others (following a $30M seed). The startup is known for its Chai-2 generative model and focuses on antibody design. Congrats to Chai!
📚 The Simons Institute organizes the Graph Learning Meets Theoretical CS workshop (to be held physically at UC Berkeley), inviting renowned professors from both areas (and me, a simple man from industry). The program is packed with a bunch of cool topics, from practical things like graph foundation models up to graphons, invariances, combinatorial optimization, and much more. The talks will be streamed on YouTube, and participation is actually free, so come by if you’re on the UC Berkeley campus.
GraphML News (Aug 30th) - OpenAI enters bio, AtomWorks, OrbMol, NeurIPS workshops
📈 The church of scale enters comp bio: OpenAI published first results on protein design of Yamanaka factors (linked to cell aging) together with Retro Biosciences (where sama happens to be one of the investors). The backbone is gpt-4b micro, initialized from an existing 4o checkpoint, enriched with “tokenized 3D structure data” (remember ESM-3?), and fine-tuned on a specialized dataset. Experimental results are claimed to be quite solid: hit rates of 30-50% (typically it’s less than 10%) along with a bunch of other biochemistry markers. The argument between scalable non-equivariant models and bespoke geometric models just got a new data point: will the raw compute of OpenAI + vanilla transformers conquer the biotech world too? We’ll keep you posted.
🧬 The Baker Lab released RoseTTAFold 3 and AtomWorks, the data processing framework used to train it. While you’ll certainly see general remarks comparing it with AF3 and Boltz, I’d highlight that comp bio folks are starting to recognize the value of data as much as the model itself (something frontier labs recognized quite some time ago). The real engineering will start when they need to serve those protein design models to a few billion clients 😉
⚛️ Orbital Materials released OrbMol, a version of Orb-v3 for molecules (the others are for crystals), trained on Open Molecules 2025. Orb is still an MPNN, which makes it quite fast and useful for MD computations.
By the way, also check out the NeurIPS 2025 workshops - finally more diverse than just LLMs and reasoning - featuring a handful of graph learning venues.
Weekend reading:
Turning Tabular Foundation Models into Graph Foundation Models from Yandex Research - another interesting approach to GFMs via TabPFNv2 over original node features + mined structural features
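The general recipe is straightforward to prototype - a sketch below (not the paper’s code), with a scikit-learn classifier standing in for TabPFNv2:

```python
# Node classification as a tabular problem: raw features + mined structural features.
import numpy as np
import networkx as nx
from sklearn.linear_model import LogisticRegression

G = nx.karate_club_graph()
X_raw = np.random.randn(G.number_of_nodes(), 8)   # placeholder node features
y = np.array([G.nodes[i]["club"] == "Mr. Hi" for i in G.nodes], dtype=int)

# structural features per node: degree, PageRank, clustering coefficient
pagerank = nx.pagerank(G)
clustering = nx.clustering(G)
X_struct = np.array([[G.degree[i], pagerank[i], clustering[i]] for i in G.nodes])

X = np.hstack([X_raw, X_struct])
train = np.arange(len(y)) % 2 == 0                # toy split

clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
print("holdout accuracy:", clf.score(X[~train], y[~train]))
```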
Openai
Accelerating life sciences research
OpenAI and Retro Biosciences achieve 50x increase in expressing stem cell reprogramming markers.
👍17❤10👏6
GraphML News (September 2025) - Stanford Graph Learning WS, MoML, RFDiffusion 3
While the community is processing NeurIPS rejections due to “limited physical space” and rushing toward the ICLR deadline, it’s about time to plan which future events to attend!
🌲 Stanford is organizing its annual Graph Learning Workshop on Oct 14th. The main topics are Relational Foundation Models (get ready to hear a lot about those, hehe), Agents (Biomni has been quite successful), and fast LLM inference. I have attended the event for the last three years and it was quite fun.
🧬 About a week later (Oct 22nd) and on the East Coast, MIT is organizing the Molecular ML (MoML) conference, going full Geometric DL mode: expect news about Boltz and new drug discovery methods; most of big pharma is among the sponsors.
🧬🧬 The Baker Lab released a pre-print of RFDiffusion 3 (its data pipeline, AtomWorks, was pre-printed a bit earlier). Compared to AF3, it has far fewer Pairformer layers (only 2 vs 48) and drops the triangular attention complexity entirely; most of the params and compute went into the diffusion module (and good data pipelines, hehe). RFD3 is substantially faster than previous versions on longer residue structures and much more accurate than RFaa. Code is not available yet.
🎅 FAIR Chemistry opened an Open Molecules 2025 leaderboard and, to our utter amusement, the 4-year-old GemNet-OC tops the benchmark in several tasks. The granddad of ML potentials still rocks if you give it better data and more compute. That’s a good lesson in designing models that can stand the test of time and new data.
Finally, for some weekend reading, check out Random graphs as perfect expanders in Quanta Magazine. Obtaining good expanders is a non-trivial task (one that will very quickly drag you into group theory), but it turns out you should never underestimate good ole Erdős–Rényi graphs: they make perfectly decent expanders.
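To make this concrete, here is a tiny numerical sanity check (the parameters are illustrative, not taken from the article): sample a sparse Erdős–Rényi graph and inspect the spectral gap of its normalized Laplacian; a second-smallest eigenvalue bounded away from zero is the signature of decent expansion.

```python
# Hedged sketch: empirically check how well a sparse ER graph expands.
import networkx as nx
import numpy as np

n, avg_deg = 1500, 6                      # illustrative sizes
G = nx.gnp_random_graph(n, avg_deg / n, seed=0)
G = G.subgraph(max(nx.connected_components(G), key=len)).copy()  # keep the giant component

L = nx.normalized_laplacian_matrix(G).toarray()
lam = np.sort(np.linalg.eigvalsh(L))
print(f"spectral gap (lambda_2 of normalized Laplacian): {lam[1]:.3f}")
# A value comfortably above 0 means random walks mix fast, i.e. a decent expander.
```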
How can we create general-purpose graph foundation models?
(by Dmitry Eremeev)
For a long time, we believed that general-purpose graph foundation models were impossible to create. Indeed, graphs are used to represent data across many different domains, and thus graph machine learning must handle tasks on extremely diverse datasets, such as social, information, transportation, and co-purchasing networks, or models of various physical, biological, or engineering systems. Given the vast differences in structure, features, and labels among these datasets, it seemed unlikely that a single model could achieve robust cross-domain generalization and perform well on all of them.
However, we noticed that tabular machine learning faces a similar challenge of working with diverse datasets containing different features and labels. And yet this field has recently seen the emergence of the first successful foundation models, such as TabPFNv2, built on the prior-data fitted networks (PFNs) paradigm. So we decided to try to bring their success to the graph domain.
Our first attempt, G2T-FM, was relatively straightforward. We manually injected graph information into node features by computing structural and positional encodings, along with neighborhood-aggregated features, and then applied tabular foundation models (TabPFNv2 and LimiX) to these enriched features. Even this simple approach delivered impressive results: G2T-FM not only strongly outperforms previous graph foundation models on the GraphLand benchmark and classic datasets, but also often beats architecturally improved and carefully tuned GNNs trained from scratch.
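For illustration, a minimal sketch of this feature-augmentation recipe is given below. It is not the exact G2T-FM pipeline: the particular encodings (degree, PageRank, 1-hop mean aggregation) and the sklearn stand-in for the tabular foundation model are simplifying assumptions.

```python
# Hedged sketch of the "graph -> table" step: augment node features with structural
# encodings and neighborhood aggregates, then hand the table to any fit/predict model.
import networkx as nx
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier  # stand-in for TabPFNv2 / LimiX

def graph_to_table(G: nx.Graph, X: np.ndarray) -> np.ndarray:
    """Assumes nodes are labeled 0..n-1 and aligned with the rows of X."""
    n = X.shape[0]
    deg = np.array([G.degree(v) for v in range(n)], dtype=float)[:, None]
    pr = nx.pagerank(G)
    pagerank = np.array([pr[v] for v in range(n)])[:, None]
    neigh_mean = np.stack([                               # mean of 1-hop neighbor features
        X[list(G.neighbors(v))].mean(axis=0) if G.degree(v) > 0 else np.zeros(X.shape[1])
        for v in range(n)
    ])
    return np.hstack([X, deg, pagerank, neigh_mean])

# Usage: a real tabular foundation model would replace the sklearn classifier here.
# feats = graph_to_table(G, X)
# clf = GradientBoostingClassifier().fit(feats[train_idx], y[train_idx])
# preds = clf.predict(feats[test_idx])
```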
Building on this, our next step was to create GraphPFN, the first graph foundation model in the PFN framework. Moving beyond the manual feature engineering of the previous approach, we first integrated message-passing modules into the LimiX model so that it could learn graph-based dependencies directly, and then continually pretrained it on 4,000,000 synthetic graph datasets sampled from our specially designed attributed graph prior. The resulting model can perform node property prediction on a graph dataset in a single forward pass via in-context learning and produces strong results, substantially outperforming both G2T-FM and classic GNNs on several datasets.
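To give a feel for this interface (message passing plus in-context labels, with predictions for all unlabeled nodes in one forward pass), here is a toy sketch assuming PyTorch and PyTorch Geometric. It is a minimal stand-in for illustration only, not the actual GraphPFN or LimiX architecture.

```python
# Hedged sketch of a PFN-style node predictor: context nodes carry their labels as
# embeddings, query nodes carry an "unknown" token, and predictions come out in one pass.
import torch
import torch.nn as nn
from torch_geometric.nn import SAGEConv

class ToyGraphPFN(nn.Module):
    def __init__(self, in_dim: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.feat_proj = nn.Linear(in_dim, hidden)
        self.label_emb = nn.Embedding(num_classes + 1, hidden)   # last index = "unknown"
        self.mp = SAGEConv(hidden, hidden)                       # graph-based dependencies
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=2)   # in-context mixing over nodes
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, context_labels):
        # context_labels[i] = class id for labeled context nodes, num_classes for query nodes
        h = self.feat_proj(x) + self.label_emb(context_labels)
        h = torch.relu(self.mp(h, edge_index))
        h = self.attn(h.unsqueeze(0)).squeeze(0)                 # every node attends to the labeled context
        return self.head(h)                                      # read off predictions at query nodes
```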
There remains much work to be done, including scaling to larger graphs, improving model architectures and designing better graph priors for synthetic dataset generation. However, we are now convinced that building general-purpose graph foundation models is indeed possible, and a prior-data fitted network approach is a promising path towards this goal.
For more details, check out our papers:
Turning Tabular Foundation Models into Graph Foundation Models
GraphPFN: A Prior-Data Fitted Graph Foundation Model
Tired of evaluating your graph ML models on Cora, CiteSeer, and PubMed? We have a better benchmark for you!
(by Oleg Platonov)
Paper: link (NeurIPS 2025 D&B track)
Datasets: Zenodo and PyG (in PyG, all the necessary feature preprocessing can be done automatically)
Code: GitHub
Recently, there has been a lot of criticism of popular graph ML benchmark datasets: lack of practical relevance, low structural diversity that leaves most of the possible graph-structure space unrepresented, low application-domain diversity, graph structure that is not actually beneficial for the considered tasks, and potential bugs in the data collection process. Some of these criticisms previously appeared on this channel.
To provide the community with better benchmarks, we present GraphLand: a collection of 14 graph datasets for node property prediction coming from diverse real-world industrial applications of graph ML. What makes this benchmark stand out?
Diverse application domains: social networks, web graphs, road networks, and more. Importantly, half of the datasets feature node-level regression tasks that are currently underrepresented in graph ML benchmarks, but are often encountered in real-world applications.
Range of sizes: from thousands to millions of nodes, providing opportunities for researchers with different computational resources.
Rich node attributes that contain numerical and categorical features, which are more typical of industrial applications than the textual descriptions standard in current benchmarks.
Different learning scenarios. For all datasets, we provide two random data splits with low and high label rates. Further, many of our networks evolve over time, and for those we additionally provide more challenging temporal data splits and the opportunity to evaluate models in the inductive setting, where only an early snapshot of the evolving network is available at train time (a rough sketch of such a split is shown right after this list).
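Below is a hedged sketch of what such a temporal, inductive split could look like in PyG terms. The data.node_time attribute and the helper name are illustrative assumptions, not the official GraphLand loading code (which ships with PyG and Zenodo as noted above).

```python
# Hedged sketch: at train time only the early snapshot (subgraph on "old" nodes) is visible.
import torch
from torch_geometric.utils import subgraph

def temporal_inductive_split(data, t_train: float):
    train_mask = data.node_time <= t_train                 # nodes already present at train time
    test_mask = ~train_mask                                # nodes that appear later
    # inductive setting: training edges are restricted to the early snapshot
    train_edge_index, _ = subgraph(train_mask, data.edge_index, relabel_nodes=False)
    return train_mask, test_mask, train_edge_index
```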
We evaluated a range of models on our datasets and found that, while GNNs achieve strong performance on industrial datasets, they can sometimes be rivaled by gradient-boosted decision trees (popular in industry) when the latter are provided with additional graph-based input features.
Further, we evaluated several graph foundation models (GFMs). Despite all the attention GFMs have received recently, we found that only a few of them can handle arbitrary node features (which is required for true generalization across different graphs), and that those GFMs produce very weak results on our benchmark. So the problem of developing general-purpose graph foundation models seemed far from solved, which motivated our research in this direction (see the previous post).