All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
Sneak peek at something coming soon to Claude. This could be a full-stack vibe-coding competitor to the likes of Lovable.

It’s been apparent for some time that Anthropic's consumer story would be vibe coding, as it's at the intersection of where they focus, what consumers want, and where enormous token subsidies tilt the board in their favor:

- coding agents, sensing this, have moved up the abstraction stack and smartly evolved into small business platforms, with payments, hosting, marketing, social and other sticky primitives around the model

- this is an industry, not a market, and in that world the "coding intelligence" primitive will be priced, packaged, productized, and delivered in a thousand ways for a thousand different customers.
Google presented Sparse Selective Caching, an architecture with growing effective memory (similar to attention) but with almost constant inference cost per token (similar to RNNs).

In the paper, the team mainly discusses:

1) the shared foundation for both softmax attention and fixed-size long-term memory modules (or RNNs) that helped design an architecture with the best of both worlds;

2) different variants of memory caching, including a variant whose effective memory is growing while the decoding cost still remains “constant”;

3) a unifying perspective to understand hybrid models, in which attention and recurrent models are combined.
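The core trade-off the paper targets can be sketched in a few lines. This is a toy illustration (the dimension `d=64` and the state-size formulas are illustrative assumptions, not the paper's architecture): attention keeps a KV cache that grows with sequence length, so per-token cost grows, while a recurrent model carries a fixed-size state, so per-token cost stays constant.

```python
def attention_state_size(t: int, d: int = 64) -> int:
    # KV cache: keys + values for every past token -> grows linearly with t.
    return 2 * t * d

def rnn_state_size(t: int, d: int = 64) -> int:
    # Recurrent state: one fixed-size vector regardless of sequence length.
    return d

# State carried into the next token at positions 1, 100, and 10,000:
sizes = [(t, attention_state_size(t), rnn_state_size(t)) for t in (1, 100, 10_000)]
```

Sparse Selective Caching, as described, aims for the left column's growing effective memory at close to the right column's constant decoding cost.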
Turns out we can get SOTA on agentic benchmarks with a simple test-time method

Meet LLM-as-a-Verifier

Test-time scaling is effective, but picking the "winner" among many candidates is the bottleneck. Here's a way to extract a cleaner signal from the model:

1. Ask the LLM to rank results on a scale of 1-k
2. Use the log-probs of those rank tokens to calculate an expected score

You can get a verification score in a single sampling pass per candidate pair.
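The two steps above can be sketched as follows. This is a minimal sketch assuming you already have the model's log-probs for each rank token at the verification position (the example log-prob values are made up); renormalizing over just the k rank tokens and taking the probability-weighted mean gives the expected score.

```python
import math

def expected_score(rank_logprobs: dict[int, float]) -> float:
    """Turn per-rank log-probs into an expected score.

    rank_logprobs: maps each rank (1..k) to the model's log-prob
    for that rank token at the verification position.
    """
    # Renormalize over the k rank tokens only (they rarely sum to 1).
    probs = {r: math.exp(lp) for r, lp in rank_logprobs.items()}
    total = sum(probs.values())
    # Expected score = sum_r r * P(rank = r)
    return sum(r * p / total for r, p in probs.items())

# Hypothetical log-probs from one sampling pass, ranks 1..5:
score = expected_score({1: -3.2, 2: -1.1, 3: -0.6, 4: -2.0, 5: -4.5})
```

Because the score comes from the log-probs of a single forward pass rather than from sampling many verdicts, one pass per candidate is enough.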

Code
Tether launches self-custodial wallet for end users

The wallet supports USDT, USAT, XAUT and Bitcoin across Ethereum, Polygon, Arbitrum, Plasma, and Bitcoin / Lightning Network, and enables transfers via human-readable usernames such as name@tether.me.

Tether said more than 570 million wallets were already using its technology as of March 2026.
LangChain released async subagents - kick off background tasks on any Agent Protocol backed server while you continue to interact with the main agent.

Async subagents will become more and more of a thing: as subagents run longer, you don't want to block the event loop.
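The non-blocking pattern is plain asyncio underneath. This is a generic sketch, not LangChain's actual API (the function names and the simulated delay are placeholders): the background subagent is started as a task, the main agent keeps responding, and the result is collected when it's ready.

```python
import asyncio

async def subagent_task(name: str, seconds: float) -> str:
    # Stand-in for a long-running subagent call on a remote server.
    await asyncio.sleep(seconds)
    return f"{name}: done"

async def main() -> list[str]:
    # Kick off the subagent in the background without blocking...
    background = asyncio.create_task(subagent_task("research-subagent", 0.1))
    # ...so the main agent keeps handling turns in the meantime.
    replies = ["main agent: still responsive"]
    replies.append(await background)  # collect the result when ready
    return replies

results = asyncio.run(main())
```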

Expanded multimodal support - your agent can now see images, listen to audio, watch video, and read PDFs. The read_file tool returns native content blocks, so your agent can reason across all these formats out of the box, unlocking a whole new set of workflows for your agents.

Improved prompt caching - better token efficiency and lower costs for Claude models.
Claude Code shipped routines

You tell it what to do, point it at your project, set a trigger, and it runs 24/7 on their servers with your laptop closed.

Docs.

The model is the commodity. The trigger is the product, and whoever maps the most valuable real-world events to the most specific industry workflows is going to build something massive.
Meet Motus, the open-source agent infrastructure that learns in production

Existing agent infra serves static agents: the harness, model, and workflow are fixed after deployment. But static agents degrade over time. The harness goes stale, new models go unincorporated, context drifts, and latency compounds.

Motus closes this gap by learning from every trace (failures, latency, cost, and task outcomes) and using those signals to continuously optimize agent harness, model orchestration, context memory, and end-to-end latency.
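One piece of that loop, routing by trace signals, can be sketched in miniature. This is a hypothetical illustration, not Motus internals (the trace schema, weights, and model names are invented): aggregate per-model success, latency, and cost from traces, then pick the model with the best combined score.

```python
from collections import defaultdict

# Hypothetical trace records: (model, success, latency_s, cost_usd)
traces = [
    ("model-a", True, 2.1, 0.004),
    ("model-a", False, 3.0, 0.004),
    ("model-b", True, 1.2, 0.009),
    ("model-b", True, 1.4, 0.009),
]

def route_score(success_rate: float, latency: float, cost: float,
                w_lat: float = 0.1, w_cost: float = 10.0) -> float:
    # Higher is better: reward task outcomes, penalize latency and cost.
    return success_rate - w_lat * latency - w_cost * cost

def pick_model(traces) -> str:
    stats = defaultdict(list)
    for model, ok, lat, cost in traces:
        stats[model].append((ok, lat, cost))
    best, best_score = None, float("-inf")
    for model, rows in stats.items():
        n = len(rows)
        sr = sum(ok for ok, _, _ in rows) / n
        lat = sum(l for _, l, _ in rows) / n
        cost = sum(c for _, _, c in rows) / n
        s = route_score(sr, lat, cost)
        if s > best_score:
            best, best_score = model, s
    return best
```

Re-running this on fresh traces is what makes the routing "learn in production": as outcomes shift, so does the choice.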

Early results: higher accuracy than any single frontier model at 2.3× lower cost (Terminal-Bench 2.0, SWE-bench Verified), with 52% lower latency and 45% better memory recall.

Open source under Apache 2.0. Works with any agent SDK. Deploy with one command.

GitHub.
Someone just dropped a fully liberated Gemma 4 E4B

But the real story here isn't the model itself, it's how it was made.

This was done (nearly) fully autonomously: one human, one agent, one skill, 8 prompts total.
The agent didn't just execute instructions. It diagnosed numerical instability in Gemma 4's new architecture, wrote three patches for a bug no one had hit before, iterated through four failed attempts and shipped a 17GB model to HuggingFace.

Without being asked.

Original Gemma 4: 98.8% refusal rate.
OBLITERATED: 2.1%.
Coding ability: +20%.
Coherence: fully intact.

What we're watching isn't a jailbreak story. It's a proof of concept for autonomous ML research. The agent ran evals, built a model card, and pushed commits: the full research cycle, compressed into one session.

The implications go beyond safety. When guardrail removal becomes an automated skill loadable from agent memory, the question is no longer technical. It's about how fast agentic tooling propagates and who has access to it first.

This is what open-source AI looks like in 2026.
Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy

This paper literally came out 15 days ago, and it’s already integrated into TRL.

There’s a lot more to the distillation paradigm than meets the eye.
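The described loop, sample from the model, then train on those samples with plain cross-entropy, fits in a toy example. This is a minimal sketch, not Apple's method or the TRL integration (the logits, learning rate, and single-step setup are illustrative): one self-sample, one cross-entropy gradient step on raw logits.

```python
import math, random

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def cross_entropy(logits: list[float], label: int) -> float:
    return -math.log(softmax(logits)[label])

random.seed(0)
logits = [0.5, 0.2, -0.1, 0.0]  # toy model output over a 4-token vocab

# 1) Sample from the model itself: no teacher, no external labels.
label = random.choices(range(4), weights=softmax(logits))[0]

# 2) Train on the sampled output with plain cross-entropy (one SGD step).
before = cross_entropy(logits, label)
probs = softmax(logits)
lr = 0.5
# Gradient of CE w.r.t. logits: p_i - 1[i == label]
grads = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
logits = [l - lr * g for l, g in zip(logits, grads)]
after = cross_entropy(logits, label)
```

The step sharpens the model toward its own sample, which is the whole trick: no reward model, no preference pairs, just cross-entropy on self-generated outputs.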
Anthropic just introduced Claude Opus 4.7

It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back.

You can hand off your hardest work with less supervision.

Opus 4.7 also has substantially better vision. It can see images at more than three times the resolution and produces higher-quality interfaces, slides, and docs as a result.

On the API, a new xhigh effort level between high and max gives you finer control over reasoning and latency on hard problems. Task budgets (beta) help Claude prioritize work and manage costs across longer runs.

Opus 4.7 has a new tokenizer.
This means it's also a new base model.
Glory days of pretraining still very much going.


In Claude Code, the new /ultrareview command runs a dedicated review session that reads through your changes and flags what a careful reviewer would catch.

They also extended auto mode to Max users, so longer tasks run with fewer interruptions.
Nearly 1/3 of surveyed people at Anthropic now think entry-level engineers and researchers are likely to be replaced by Mythos within 3 months.
OpenAI introduced GPT-Rosalind, a frontier reasoning model built to support research across biology, drug discovery, and translational medicine.

GPT-Rosalind is optimized for scientific workflows, with stronger performance in protein and chemical reasoning, genomics analysis, biochemistry knowledge, and scientific tool use.
Cool work by Google. The team built an AI system that discovers health biomarkers from wearable data: CoDaS

One of its first findings: "late-night doomscrolling" is a statistically validated predictor of depression severity (ρ = 0.177, p < 0.001, n = 7,497).

The AI named the feature. No human guidance.

CoDaS is a multi-agent system that runs the full biomarker discovery lifecycle autonomously:

Sensor data → Generate hypotheses → Run statistical + ML analysis → Conduct adversarial validation → Write manuscript

The research team deployed it across 9,279 participants and 3 clinical cohorts.

Here's where it gets interesting.

On one cohort, CoDaS found a feature with R² = 0.963. Near-perfect prediction of insulin resistance, passing 10/11 tests.

Then the AI rejected it, finding the feature was glucose², a tautological transform of the target. True R² after removal: 0.389.
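The glucose² trap is easy to reproduce on synthetic data. This is a toy sketch, not CoDaS's validation pipeline (the data, the stand-in target, and the feature definitions are all invented): a feature that is a transform of the target scores a suspiciously high single-feature R², while an honest feature does not.

```python
import random

def r2_single_feature(x: list[float], y: list[float]) -> float:
    # R² of a one-feature linear fit equals the squared Pearson correlation.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

random.seed(1)
glucose = [random.uniform(70, 180) for _ in range(200)]
# Toy target built from glucose plus noise, standing in for an
# insulin-resistance index that is itself computed from glucose.
target = [0.01 * g * g + random.gauss(0, 50) for g in glucose]

leaky = [g * g for g in glucose]            # glucose²: transform of the target
honest = [random.gauss(0, 1) for _ in glucose]  # unrelated feature

r2_leaky = r2_single_feature(leaky, target)
r2_honest = r2_single_feature(honest, target)
```

An adversarial validator that asks "is this feature a deterministic function of the target's inputs?" catches exactly this, which is what CoDaS's rejection step did.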

Researchers ran a blind expert evaluation. 15 domain experts, 76 manuscript assessments.

CoDaS: 86% acceptance rate
AI Co-Scientist: 85% rejection rate
Data Science Agent: 95% rejection rate
Biomni: 100% rejection rate

No baseline received a single Accept or Minor Revision.

A surprising result: CoDaS found circadian instability features in two separate depression cohorts.

Sleep duration variability in one (ρ = 0.252). Sleep onset variability in the other (ρ = 0.126).

The cohorts were processed completely independently.

CoDaS compressed ~37 person-days of research (expert estimate) into 6-8 hours.

But the point isn't speed. It's that separating exploration from adversarial validation at the architecture level produces biomarker candidates that domain experts rate as scientifically valid.