All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
Meet LUMI-lab (Large-scale Unsupervised Modeling followed by Iterative experiments), a self-driving laboratory that tightly closes the loop between an AI foundation model and automated robotics to accelerate lipid nanoparticle (LNP) discovery for mRNA delivery.

To tackle data scarcity in emerging mRNA delivery domains, the team pretrained the model on 28M+ molecular structures, then iteratively improved it with closed-loop experimental data.

In this work, across ten active-learning cycles, LUMI-lab synthesized and evaluated 1,700+ new LNPs and unexpectedly identified a new design feature for efficient delivery: brominated lipid tails.

These brominated-tail ionizable lipids delivered mRNA into human lung cells more efficiently than approved benchmarks, despite representing only a small fraction of the initial chemical space explored.
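The cycle described above can be sketched as a toy active-learning loop (the batch size, function names, and the ground-truth lookup standing in for the robotic assay are all illustrative, not from the paper):

```python
# Toy closed-loop active learning in the spirit of LUMI-lab: rank
# candidates with the current model, "assay" the top batch (here a
# ground-truth lookup stands in for the robotic experiment), and
# accumulate measurements. A real system would also retrain the
# model between cycles.
def closed_loop(candidates, true_potency, predict, cycles=3, batch=2):
    measured = {}
    pool = list(candidates)
    for _ in range(cycles):
        pool.sort(key=predict, reverse=True)   # model ranks the pool
        picked, pool = pool[:batch], pool[batch:]
        for c in picked:
            measured[c] = true_potency[c]      # robot assay stand-in
    return measured
```

With 3 cycles of 2 candidates this measures 6 LNPs; the real system ran ten cycles over 1,700+ lipids.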

GitHub.
Check the video here.
A good world model requires not just great graphics but spatial and world intelligence: understanding how objects move and respond, which actions cause which outcomes, and how players' interactions affect the world.

Moonlake's world model delivers that.
Google introduced Nano Banana 2

It uses Gemini’s understanding of the world and is powered by real-time information and images from web search. That means it can better reflect real-world conditions in high fidelity.

Check out "Window Seat," a demo using Nano Banana 2’s world understanding to generate more accurate views from any window in the world, pulling live local weather info, with 2K/4K output. The precision is mind-blowing.

Rolling out today as the new default in the Gemini app, Search (across 141 countries), and Flow, and available in preview via Google AI Studio and Vertex AI. Also available in Google Antigravity.
New from DeepSeek: DualPath

Researchers from Peking University, Tsinghua University, and #DeepSeek unveiled DualPath to fix the storage bandwidth bottleneck, which may be the secret killer of LLM agent performance.

Instead of letting data get stuck in a single storage traffic jam, DualPath creates a second highway for data to travel.

It loads saved model memory into idle decoding engines and then zips it over to the processing engines using high-speed internal networks, ensuring no part of the system sits idle while waiting for data.

The results are massive: DualPath boosts offline throughput by up to 1.87x and nearly doubles online serving speeds without violating performance targets.
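A toy sketch of the two-highway idea (the split policy and the route names are my assumptions, not the paper's actual design):

```python
from concurrent.futures import ThreadPoolExecutor

def transfer(route, blocks):
    # Stand-in for a real copy (disk, NIC, or NVLink): tag each block
    # with the route that carried it.
    return [(route, b) for b in blocks]

def dual_path_load(blocks):
    # Split saved-state blocks across two concurrent routes so a single
    # storage link is no longer the only way into the processing engines.
    direct, via_decoder = blocks[::2], blocks[1::2]
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(transfer, "storage->prefill", direct)
        f2 = pool.submit(transfer, "storage->decoder->prefill", via_decoder)
        return f1.result() + f2.result()
```

The point is only the overlap: both routes move data at once instead of queuing behind one bottleneck.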
Sakana introduced Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible.

By training a Hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.

Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.

To bypass these limitations, this work focuses on the concept of cost amortization. Researchers pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.

In experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights.

Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates. This approach is a step towards lowering the technical barriers of model customization, allowing end-users to specialize foundation models via simple text inputs.
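A minimal sketch of the "one forward pass emits an adapter" idea, using a rank-1 linear hypernetwork and list-based matrices (all shapes and names are illustrative assumptions, not Sakana's architecture):

```python
def lora_from_embedding(emb, proj_a, proj_b):
    # One hypernetwork forward pass: project the task/document embedding
    # into rank-1 LoRA factors a (len d_in) and b (len d_out). The weight
    # update is the outer product b a^T; no per-task optimization runs.
    a = [sum(e * w for e, w in zip(emb, col)) for col in proj_a]
    b = [sum(e * w for e, w in zip(emb, col)) for col in proj_b]
    return [[bi * aj for aj in a] for bi in b]   # d_out x d_in

def apply_lora(W, dW, scale=1.0):
    # Merge the generated adapter into the frozen base weight.
    return [[w + scale * d for w, d in zip(rw, rd)]
            for rw, rd in zip(W, dW)]
```

The real methods generate adapters for every attention and MLP matrix; this shows only the amortization trick for a single weight.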

Doc-to-LoRA
Paper
Code

Text-to-LoRA
Paper
Code
Anthropic dropped a new feature that lets you import your entire memory from ChatGPT, Gemini, etc. into Claude so it instantly knows everything about you. No more reminding Claude who you are.
REMem: Reasoning with Episodic Memory in AI Agents

REMem addresses a capability gap in many RAG/memory systems: not just storing documents or facts, but also recollecting specific past events with their situational grounding (when/where/who/what) and then reasoning across multiple events on a timeline.
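The distinction can be made concrete with a toy episodic store (the interface is illustrative, not REMem's actual API): each event keeps its situational grounding, and queries operate over an ordered timeline rather than isolated facts.

```python
from dataclasses import dataclass

@dataclass
class Event:
    when: int    # timestamp
    where: str
    who: str
    what: str

class EpisodicMemory:
    def __init__(self):
        self.events = []

    def remember(self, event):
        self.events.append(event)

    def timeline(self, who=None):
        # Reasoning across events needs temporal order, not just recall.
        hits = [e for e in self.events if who is None or e.who == who]
        return sorted(hits, key=lambda e: e.when)
```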

GitHub.
Visa is leaning hard into agentic commerce and stablecoins.

• Agentic commerce live in the US & CEMEA, expanding globally

• Stablecoin cards now in 50+ countries

• USDC settlement live in the US

• $4.6B annualized stablecoin volume

Visa is positioning itself as the infrastructure layer between crypto and traditional finance.
Researchers adapted the Avey architecture to the encoder paradigm and called the result Avey-B, a next-generation alternative to BERT with unlimited context length.

Avey, introduced last year, is an alternative architecture to Transformers.

It scales linearly with context length and performs better on long-context tasks (e.g. needle-in-a-haystack retrieval).

They have now shown that it works just as well in a BERT-style model.

This approach definitely deserves more attention.

HuggingFace.
GitHub
DoubleAI’s AI system beat a decade of expert GPU engineering

WarpSpeed just beat a decade of expert-engineered GPU kernels — every single one of them.

cuGraph is one of the most widely used GPU-accelerated libraries in the world. It spans dozens of graph algorithms, each written and continuously refined by some of the world’s top performance engineers.

DoubleAI’s WarpSpeed autonomously rewrote and re-optimized these kernels across three GPU architectures (A100, L4, A10G). DoubleAI released the hyper-optimized version on GitHub — install it with no change to your code.

The numbers:

• 3.6x average speedup over human experts

• 100% of kernels benefit from a speedup

• 55% see more than 2x improvement

Winning Gold at IMO 2025.

Codeforces benchmarks.

From Reasoning to Super-Intelligence: A Search-Theoretic Perspective.

ByteDance published CUDA Agent

ByteDance trained a model that writes fast CUDA kernels. Not just correct ones — actually optimized ones.

It beats torch.compile by 2× on simple/medium kernels and by ~92% on complex ones, and even outperforms Claude Opus 4.5 and Gemini 3 Pro by ~40% on the hardest setting.

The key idea is simple but kind of brilliant:

CUDA performance isn’t about correctness, it’s about hardware. Warps, memory bandwidth, bank conflicts — the stuff you only see in a profiler.

So instead of rewarding “did it compile?”, they reward actual GPU speed. Real profiling numbers. RL trained directly on performance.
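A hedged sketch of such a profiling-based reward (the correctness gate and the clipping are my assumptions, not ByteDance's recipe):

```python
def kernel_reward(passed_tests, kernel_ms, baseline_ms, clip=10.0):
    # Correctness is only a gate; the actual learning signal is
    # measured wall-clock speed against a baseline implementation.
    if not passed_tests:
        return 0.0
    speedup = baseline_ms / kernel_ms   # e.g. vs. a torch.compile run
    return min(speedup, clip)           # clip so one outlier can't dominate
```

The key property is that "it compiles" earns nothing; only profiled speedups move the reward.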

Paper.
Alibaba's top AI researcher resigned immediately after Qwen's most successful model launch ever.

The day Junyang Lin announced his departure, Qwen released FOUR brand new models, including one that can run on just 7 gigs of RAM.

The models got rave reviews, including one from Elon Musk, who praised their "density of intelligence." The models are free to use.

Lin was seen as the most important developer at Qwen. He was also a big open source advocate. His departure led to speculation that he'd been forced out against his will. Chinese AI researchers You Jiacheng and Chen Cheng shared this view.

Why did this happen?

Some are saying it was a money thing. All of Alibaba's Qwen models until now have been completely open source, meaning that people can download them and run them locally, generating no revenue for Alibaba. Reportedly, company execs were frustrated that the open source models were not helping get users for Alibaba's revenue-generating services (e.g. Alibaba Cloud, subscription services etc).

Shortly before Lin quit, Alibaba had hired people who had worked on Google's Gemini, reportedly with an eye to increasing Daily Active Users (DAUs). If these reports are correct then we can expect that Alibaba will put more emphasis on monetizing its AI going forward. That should drive higher revenue, though it will likely mean the end of these powerful free Qwen models we've been seeing lately.

Word on the street is that Alibaba is tightening the screws to make money via proprietary cloud and API rather than open source.
Physical Intelligence made a memory system for its models and calls it Multi-Scale Embodied Memory (MEM).

It provides both short-term and long-term memory to enable very long tasks.

Researchers tested it on cleaning a kitchen (and yes, washing dishes), making grilled cheese, and more.

One of the cool side effects of MEM is in-context adaptation: when the robot makes a mistake, like opening the fridge door from the wrong side, it remembers what happened and tries the task again in a different way.
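A toy version of a two-scale memory with the retry behavior described above (the interface and field names are illustrative, not PI's API):

```python
from collections import deque

class MultiScaleMemory:
    def __init__(self, short_span=5):
        self.short = deque(maxlen=short_span)  # recent observations
        self.long = []                         # task-level outcomes

    def observe(self, obs):
        self.short.append(obs)

    def record(self, task, ok, note=""):
        self.long.append({"task": task, "ok": ok, "note": note})

    def failed_strategies(self, task):
        # e.g. "opened fridge from the wrong side" -> avoid it on retry
        return [r["note"] for r in self.long
                if r["task"] == task and not r["ok"]]
```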
MIT introduced NeuroSkill, a real-time agentic system that models human cognitive and emotional state by integrating Brain-Computer Interface signals with foundation models.

"Human State of Mind" provided via SKILL dot md.

The system runs fully offline on the edge.

Its NeuroLoop harness enables agentic workflows that engage users across cognitive and emotional levels, responding to both explicit and implicit requests through actionable tool calls.

Why does it matter?

Most AI agents respond only to explicit user requests. NeuroSkill explores the frontier of proactive agents that sense and respond to implicit human states, opening new possibilities for adaptive human-AI interaction.
Meta introduced Beyond Language Modeling: a deep dive into the design space of truly native multimodal models

Humans communicate through language and interact with the world through vision, yet most multimodal models are language-first.

First: how should we represent vision?
RAE uses a frozen SigLIP encoder as the image tokenizer. One encoder for both understanding and generation. Biggest wins on semantic and world-knowledge generation, where VAEs lose critical information through compression.

Next: how do we route modalities? Hand-crafted separation (MoT) or data-driven (MoE)?

MoE wins. But the interesting part is the emergent routing behavior. Without any hard architectural split, the network learns vision experts, language experts, and multimodal experts on its own. It even routes understanding and generation tokens to the same experts.

Why are frontier models still dense? Maybe they're using the wrong tokenizer.
VAE + MoE: no benefit from sparsity
RAE + MoE: scales with sparsity
Semantic vision tokens unlock what MoE was designed to do.
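A minimal top-1 router makes the "no hard split" point concrete (the gate here is a stand-in for a learned scorer; nothing assigns experts to modalities by hand):

```python
def route_top1(tokens, gate, num_experts):
    # Each token goes to whichever expert its gate scores highest;
    # vision/language/multimodal specialization can only emerge from
    # the learned scores, never from a hand-crafted split (as in MoT).
    buckets = {e: [] for e in range(num_experts)}
    for tok in tokens:
        scores = gate(tok)
        best = max(range(num_experts), key=scores.__getitem__)
        buckets[best].append(tok)
    return buckets
```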

Native multimodal pretraining enables zero-shot language-guided navigation. No navigation-specific data, no caption annotations. The grounding emerges from pretraining alone.
Microsoft released Phi-4-reasoning-vision-15B, a compact and fast multimodal reasoning model that blends the strengths of different methods while reducing their limitations.

The new Phi multimodal model pushes the compute/performance Pareto frontier through careful design of many decisions, making it a good choice for edge applications.

Google introduced a new method to teach LLMs to reason like Bayesians.

By training models to mimic optimal probabilistic inference, researchers improved the models' ability to update their predictions and generalize across new domains.
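The target behavior is ordinary Bayesian updating; a minimal numeric sketch of what the models are trained to mimic:

```python
def bayes_update(prior, likelihood):
    # Posterior is prior times likelihood of the evidence, renormalized.
    post = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(post.values())
    return {h: p / z for h, p in post.items()}
```

For example, a 50/50 prior over two hypotheses plus evidence four times likelier under the first should yield an 80/20 posterior.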