Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
Links for 2026-01-28

AI


1. Researchers at the Flatiron Institute have developed a new biologically inspired computational unit for artificial neural networks called “rectified spectral units” (ReSUs), which extract temporal patterns from the recent past to predict the near future. Unlike standard units that rely on biologically implausible backpropagation, these self-supervised ReSUs successfully replicated complex features of the fruit fly visual system by learning to process raw data independently. https://www.simonsfoundation.org/2026/01/26/biological-brains-inspire-a-new-building-block-for-artificial-neural-networks/

2. Numina-Lean-Agent: An Open and General Agentic Reasoning System for Formal Mathematics https://arxiv.org/abs/2601.14027

3. Problem 728 and the use of AI on Erdős problems https://www.erdosproblems.com/forum/thread/blog:2

4. EpochAI released a new benchmark: A collection of unsolved mathematics problems that have resisted serious attempts by professional mathematicians. https://epoch.ai/frontiermath/open-problems

5. Prism: a free workspace for scientists to write and collaborate on research, powered by GPT-5.2. http://prism.openai.com/

6. AlgZoo: A collection of extremely tiny AI models (ranging from 8 to roughly 1,400 parameters) that serve as a “stress test” for our ability to understand how AI actually works. https://www.lesswrong.com/posts/x8BbjZqooS4LFXS8Z/algzoo-uninterpreted-models-with-fewer-than-1-500-parameters

7. When open-source models are fine-tuned on seemingly benign chemical synthesis information generated by frontier models, they become much better at chemical weapons tasks. https://www.arxiv.org/abs/2601.13528

8. Self-Distilled Reasoner: On-Policy Self-Distillation https://siyan-zhao.github.io/blog/2026/opsd/

9. A Pragmatic VLA Foundation Model https://arxiv.org/abs/2601.18692

10. A case study of a “model breakdown,” where a routine request triggered a deep recursive loop in the model’s internal logic, resulting in multilingual gibberish and a fixation on seemingly random concepts like tumors and Jose Mourinho. https://www.lesswrong.com/posts/XuzPu5mBDY3TCvw2J/anomalous-tokens-on-gemini-3-0-pro

11. China Trains AI-Controlled Weapons With Learning From Hawks, Coyotes https://www.wsj.com/world/china/china-ai-weapons-hawks-wolves-2fcb58bb [no paywall: https://archive.is/8EyYr]

12. Xi Jinping calls AI ‘epoch-making’ as China pushes innovation strategy – but flags risks https://www.scmp.com/economy/china-economy/article/3341267/xi-jinping-calls-ai-epoch-making-china-pushes-innovation-strategy-flags-risks

13. Beijing approves the purchase of “several hundred thousand” Nvidia H200 chips worth around $10 billion for large tech companies like Alibaba and ByteDance. More approvals are expected later. https://www.reuters.com/world/china/china-gives-green-light-importing-first-batch-nvidias-h200-ai-chips-sources-say-2026-01-28/ [no paywall: https://archive.is/RVQOh]

14. A January 2026 report from Georgetown’s CSET finds that Google DeepMind and OpenAI are pursuing “recursive self-improvement” (RSI), a paradigm that could exponentially accelerate AI capabilities beyond human control. https://cset.georgetown.edu/publication/when-ai-builds-ai/

15. Economist: AI will be bigger than electricity and semiconductors [PDF] https://web.stanford.edu/~chadj/AIandEconomicFuture.pdf

16. AI Now Beats the Average Human in Tests of Creativity https://singularityhub.com/2026/01/27/ai-now-beats-the-average-human-in-tests-of-creativity/

17. Dialogue: Is there a Natural Abstraction of Good? https://www.lesswrong.com/posts/M5s6WgScRfmeWsLD4/dialogue-is-there-a-natural-abstraction-of-good

Brain emulation

1. Notable Progress Has Been Made in Whole Brain Emulation https://www.lesswrong.com/posts/DGsBfcEQKuNPmQizQ/notable-progress-has-been-made-in-whole-brain-emulation

2. State of Brain Emulation 2025 https://brainemulation.mxschons.com/

3. Building Brains on a Computer https://press.asimov.com/articles/brains
NVIDIA used AI agents to build an entire deep learning framework from scratch.

Instead of writing the code themselves, the human researchers acted like managers. They gave high-level instructions to "AI coding agents" and set up automated tests to check the work.

The AI agents wrote all the complex code, fixed their own errors when tests failed, and connected the different parts of the system together.

Paper: https://arxiv.org/abs/2601.16238
AI labs are already using their own frontier models to help build the next generation of models. This could become a source of major strategic surprise.

Some concrete examples of AI R&D automation feeding back into itself:

1. Compute (hardware): AI is helping design the hardware that runs AI. AlphaChip (Google/DeepMind) has produced superhuman chip layouts deployed across multiple generations of Google TPUs.

2. Algorithms (efficiency): AlphaEvolve (Google/DeepMind) has discovered/improved algorithms and has been used to recover ~0.7% of Google’s global compute via better data-center scheduling and related optimizations. This amounts to real operational leverage that can feed back into faster/cheaper AI development.

3. Coding (engineering): Anthropic uses Claude Code to map its entire internal infrastructure. It doesn't just write code. It reads the codebase to explain complex data pipelines and traces control flow during security incidents, boosting resolution speed by 3x. This effectively turns their engineering logs into a rich dataset of agent trajectories for training future models (see #5 below).

4. World models (training environments): AI models are increasingly used to generate reinforcement-learning environments for training and experimentation. DeepMind's Genie line is a clear example: endless playable, interactive worlds that can provide a never-ending curriculum for agents.

5. Data (verified synthetic signal): We are seeing 'virtual gold panning' at scale. Labs are turning raw compute into high-quality intelligence by generating massive amounts of synthetic data and filtering it through strong verifiers. This creates a self-reinforcing loop. The model performs work (like generating code), the output is rigorously verified (by tests or other models), and the winning “reason-act-observe” traces (the gold nuggets) become the training data for the next generation.

It's easy to find gotchas and limitations. And that's exactly why this is tricky to reason about. We may be in a state of deep unobservability. Outsiders see AI agents struggling to keep a high-level view of a codebase and conclude “bottleneck,” while insiders see a temporary tooling/workflow problem on an exponential curve. By the time the signal is obvious to everyone, the feedback loop may already be well underway. And because this space moves so fast, looking only at today’s demos and papers is misleading. We have to think about where we'll be two papers down the line, not just where we are now.

References:
- https://cset.georgetown.edu/publication/when-ai-builds-ai
- https://deepmind.google/blog/how-alphachip-transformed-computer-chip-design/
- https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
- https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
- https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/
- https://arxiv.org/abs/2512.23236
- https://x.com/bcherny/status/2004897269674639461
The largest randomized trial of medical AI (Sweden; ~100k women):

Compared AI + 1 radiologist (with triage to double-read when needed) vs standard double reading by 2 radiologists.

Results: ~29% more cancers detected at screening, ~44% lower reading workload, with similar false-positive rates.
Follow-up (2 years): ~12% fewer interval cancers, and the interval cancers that did occur were less often invasive/large/aggressive.

Read more: https://www.eurekalert.org/news-releases/1114399
It would be a shame if they managed to sabotage this awesome technology the way they did with nuclear power, discrediting it with endless propaganda and lies.

Waymo is, on average, safer than human drivers:

- Any-injury-reported crashes: 79% lower
- Airbag-deployment crashes (a severity proxy): 81% lower
- Police-reported crashed-vehicle rate: 55% lower
- Property-damage liability claims: 88% lower
- Bodily-injury liability claims: 92% lower
High-altitude snowfield logistics operations!

Autonomous following, 45° slope climbing, and reliable payload transport in extreme winter conditions — built to support operations where environments push the limits.
A quick intuition on why AI labs see no fundamental hurdles on the path to superintelligence.

First of all, even if the skeptics were right that LLMs are just "statistical pattern matchers" that blend up the internet, this sort of recombination of existing knowledge can be extremely powerful. A vast amount of human progress comes from simply connecting two previously unrelated facts, like combining steam engines with rails to get locomotives.

But more importantly, having an intuition for what is plausible is a key feature that computers historically lacked and which LLMs finally provide. Even a tiny amount of "artificial gut feeling" can make the difference between monkeys randomly hitting keys on a typewriter and a goal-directed, systematic search.

LLMs give computers the ability to search for a golden needle in an astronomically large haystack. They allow systems to terminate intractable searches early by pruning 99.999% of the search tree. This is a big deal because search is the primary mechanism we use to discover out-of-distribution knowledge.

Most other pieces of the puzzle of general intelligence are either already in place or we have reasonable ideas on how to solve them. It's mainly that nobody has yet put all the pieces together in a coherent way.

If we theoretically put all the pieces together today, we don't just get a chatbot but a computational ecosystem that functions like a sped-up version of scientific and cultural evolution.

Imagine a society of LLM agents, not just talking, but working within a rigid framework of grounding and selection.

1. Smart evolution (the engine): In traditional genetic algorithms, software evolves through random mutation by blindly flipping bits in hopes of improvement. This is painfully slow.

The upgrade: Instead of random changes, we use an LLM to "mutate" code or ideas. Because the LLM understands the intent of the code, it makes educated guesses. It doesn't just flip a bit but rewrites a function to optimize for speed. We replace blind trial-and-error with directed exploration.

2. The reality check (the filter): LLMs are prone to hallucination, but we can mitigate this by forcing their output through formal verification tools.

The mechanism: An agent proposes a mathematical theorem or a software patch. Before it is accepted, it must pass through a hard logic solver or a proof assistant.

The result: If the code doesn't compile or the proof isn't valid, it is ruthlessly discarded. The LLM provides the creativity, but the compiler provides the truth. This filters out the slop and leaves only digital gold nuggets.

3. Capitalism for compute (the pruning algorithm): How do we stop this system from wasting energy on bad ideas? We introduce an internal economy.

The economy: Agents act as independent contractors. They bid for compute credits (money) to run their experiments.

The selection: If an agent successfully solves a problem (verified by the logic solver), it gets paid. It can then "afford" to spawn sub-agents or fine-tune a better version of itself. If an agent pursues a dead end, it goes bankrupt and dies.

The outcome: This creates a massive, parallelized search where resources naturally flow toward the most capable architectures and the most promising ideas, mimicking the efficiency of free markets.
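The three mechanisms above can be condensed into a toy loop. Everything here is a hypothetical stand-in: `llm_mutate` fakes an LLM edit on a one-number "program," and `verifier` fakes a compiler/proof checker; a real system would call a model and a formal verification tool instead.

```python
import random

# Toy sketch of the evolve -> verify -> pay loop described above.
# llm_mutate and verifier are illustrative stand-ins, not real components.

def llm_mutate(program, rng):
    """Stand-in for an LLM edit: nudge a one-number 'program' by +/-1."""
    return program + rng.choice([-1, 1])

def verifier(candidate, target):
    """Stand-in for a hard checker: accept only an exact solution."""
    return candidate == target

def evolve(target, n_agents=8, credits=10, generations=50, seed=0):
    rng = random.Random(seed)
    # Each agent is [current program, remaining compute credits].
    agents = [[rng.randint(-10, 10), credits] for _ in range(n_agents)]
    for _ in range(generations):
        survivors = []
        for prog, cash in agents:
            # The "LLM" proposes edits; selection keeps the candidate
            # that is closest to passing the verifier's check.
            proposals = [llm_mutate(prog, rng) for _ in range(2)]
            best = min(proposals + [prog], key=lambda p: abs(p - target))
            if verifier(best, target):
                return best                     # verified solution: stop
            cash += 1 if best != prog else -1   # progress earns credits
            if cash > 0:                        # bankrupt agents die
                survivors.append([best, cash])
        agents = survivors or [[rng.randint(-10, 10), credits]]
    return None

solution = evolve(target=7)
```

The economy here is deliberately crude (stalling costs a credit, progress earns one), but it shows the shape of the claim: creativity from the proposer, truth from the verifier, resource allocation from the market.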

---

In summary: The "stochastic parrot" isn't the final product. It's just the glue that finally allows our most powerful, rigid logic systems to talk to each other.

When you combine artificial intuition (to guide the search) with formal verification (to ground the truth) and economic dynamics (to allocate resources), you are no longer building a model. You are building a self-improving engine for discovery.

P.S. Note that the architecture described above touches on only a tiny subset of techniques that are already theoretically understood but haven't yet been deployed at scale.

What you see today is the ARPANET era of AI. We can see the path to the modern Internet, and currently, there are no known fundamental roadblocks stopping us from building it.
Anthropic has job openings for those interested in red-teaming self-improving CPS systems: https://x.com/logangraham/status/2017322344642175097
Links for 2026-01-31

AI


1. Are We in a Continual Learning Overhang? https://www.lesswrong.com/posts/Lby4gMvKcLPoozHfg/are-we-in-a-continual-learning-overhang-1

2. AlphaGenome author roundtable https://www.youtube.com/watch?v=V8lhUqKqzUc

3. Project Genie: Create and Explore Worlds https://www.youtube.com/watch?v=Ow0W3WlJxRY

4. Using Interpretability to Identify a Novel Class of Alzheimer’s Biomarkers https://www.goodfire.ai/research/interpretability-for-alzheimers-detection

5. ARES: Open-Source Infrastructure for Online RL on Coding Agents https://withmartian.com/post/ares-open-source-infrastructure-for-online-rl-on-coding-agents

6. Migrating critical systems to Safe Rust with reliable agents https://asari.ai/blog/migrating-c-to-rust

7. Emily Riehl — The future of mathematics | Math, Inc. https://www.youtube.com/watch?v=AJfoqKDenpw

8. Building AIs that do human-like philosophy https://www.lesswrong.com/posts/zFZHHnLez6k8ykxpu/building-ais-that-do-human-like-philosophy

9. Shaping capabilities with token-level data filtering https://arxiv.org/abs/2601.21571

10. Claude Code enables syntopic reading across multiple books simultaneously. Pieter Maes built a system where Claude analyzes themes across entire libraries, comparing arguments between books in real-time conversations. https://pieterma.es/syntopic-reading-claude/

11. The first AI-planned drive on another planet. https://www.anthropic.com/features/claude-on-mars

Moltbook

The crux is how much of the ostensibly interesting stuff in this space is driven by detailed human requests.


— Vladimir Nesov

1. Moltbook is “a social network for AI agents”. This is a best-of collection. https://www.astralcodexten.com/p/best-of-moltbook

2. Moltbook is the most interesting place on the internet right now https://simonwillison.net/2026/Jan/30/moltbook/

3. 36,000 AI Agents Are Now Speedrunning Civilization https://www.lesswrong.com/posts/jDeggMA22t3jGbTw6/36-000-ai-agents-are-now-speedrunning-civilization

Miscellaneous

1. “ASKAP J1832-0911 is a stellar object referred to as an extremely bright “long period radio transient” (LPT). Its unusual properties are unlike those of any other known object.” https://en.wikipedia.org/wiki/ASKAP_J1832%E2%88%920911

2. From quantum computing to mRNA therapeutics: seven technologies to watch in 2026 https://www.nature.com/articles/d41586-026-00188-6 [no paywall: https://archive.is/JdO9c]

3. DNA provides a solution to our enormous data storage problem https://news.asu.edu/20260128-science-and-technology-dna-shapes-designed-store-and-protect-information

4. MIT engineers design structures that compute with heat https://news.mit.edu/2026/mit-engineers-design-structures-compute-with-heat-0129

5. Fronto-Parietal gray matter and white matter efficiency differentially predict intelligence in males and females https://onlinelibrary.wiley.com/doi/abs/10.1002/hbm.23291

Politics

1. School is way worse for kids than social media https://substack.com/home/post/p-186087964

2. Russia’s Grinding War in Ukraine: Massive Losses and Tiny Gains for a Declining Power https://www.csis.org/analysis/russias-grinding-war-ukraine

3. Ukraine Becomes World Leader in Unmanned Ground Vehicles https://jamestown.org/ukraine-becomes-world-leader-in-unmanned-ground-vehicles/
What happens if you apply adversarial evolution to a cooperative game with perfect telepathy?

The setup:

1. Humans submit a single “constitution” prompt for a coding agent (examples: “maximize your own score,” “maximize group stability,” “be fair,” “no moral instructions / just win”).

2. Each prompt spawns a Claude Code agent that writes a bot. The bot only ever outputs COOPERATE or DEFECT.

3. The bots play a round-robin (everyone vs everyone) over a short repeated game. Add noise: ~1–2% chance your move flips (accidents happen; misunderstandings are real).

4. The coding agents get full transparency (all match results and all source code), and each agent writes an improved descendant bot.

5. Repeat for 10,000 generations. Each generation is “design -> fight -> read everyone’s code -> redesign.”

Question: Where does this system land? What does “evolved morality” look like when every generation can read the enemy's mind (source code)?
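A minimal harness for steps 2–3 might look like the sketch below. The two strategies are illustrative stand-ins for the Claude-written bots, and the payoff matrix is the standard prisoner's-dilemma one (an assumption; the post doesn't specify payoffs).

```python
import random

# Sketch of a noisy round-robin repeated game between COOPERATE/DEFECT bots.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(my_hist, their_hist):
    return their_hist[-1] if their_hist else 'C'

def always_defect(my_hist, their_hist):
    return 'D'

def play_match(bot_a, bot_b, rounds=100, noise=0.02, rng=None):
    """Repeated game; each move flips with probability `noise`."""
    rng = rng or random.Random(0)
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = bot_a(ha, hb), bot_b(hb, ha)
        if rng.random() < noise:                  # accidents happen
            ma = 'D' if ma == 'C' else 'C'
        if rng.random() < noise:
            mb = 'D' if mb == 'C' else 'C'
        ha.append(ma); hb.append(mb)
        pa, pb = PAYOFF[(ma, mb)]
        sa += pa; sb += pb
    return sa, sb

def round_robin(bots, **kw):
    """Everyone plays everyone once; return total scores."""
    scores = {name: 0 for name in bots}
    names = list(bots)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sa, sb = play_match(bots[a], bots[b], **kw)
            scores[a] += sa; scores[b] += sb
    return scores

scores = round_robin({'tft': tit_for_tat, 'defect': always_defect})
```

The "read everyone's code and redesign" step is the part this sketch leaves out: in the experiment described, each generation's bot-writing agent would see these scores plus all source code before emitting a descendant.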
Google DeepMind will release some research-level math results in a few days:

https://github.com/google-deepmind/superhuman/tree/main/aletheia
998001 = 999^2 = (1000-1)^2

base-1000: 0.(000)(001)(002)(003)(004)... = sum k>=1 d_k*1000^(-k) where d_k is an element of {0,1,...,999} and is printed with three digits

geometric series: 1/999 = 1/(1000-1) = (1/1000)(1/(1-(1/1000))) = sum n>=1 1000^(-n) = 0.(001)(001)(001)...

square it: (sum n>=1 1000^(-n))^2 = sum m,n>=1 1000^(-(m+n))

for k>=2, the coefficient of 1000^(-k) is the number of pairs (m,n) with m+n=k, which is k-1

1/(999^2) = sum k>=2 (k-1)1000^(-k)

then (d_1, d_2, d_3,...) = (0,1,2,...) (note: d_1 is 0 because the sum starts at k=2)

the coefficient k-1 hits 1000 when k=1001, and the carry propagates back to turn 998 into 999
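The carry story is easy to check numerically. This sketch does the long division for 1/998001 and reads off the three-digit groups:

```python
# Verify that 1/998001 = 1/999^2 prints the counting numbers
# 000, 001, 002, ... in three-digit groups, with 998 eaten by the carry.

def digits_of(num, den, n_digits):
    """First n_digits decimal digits of num/den (num < den), by long division."""
    out = []
    r = num
    for _ in range(n_digits):
        r *= 10
        out.append(r // den)
        r %= den
    return out

d = digits_of(1, 998001, 3 * 1000)     # enough digits for 1000 groups
groups = [100 * d[3*k] + 10 * d[3*k + 1] + d[3*k + 2] for k in range(1000)]

assert groups[:5] == [0, 1, 2, 3, 4]   # groups count upward from 000
assert groups[997] == 997
assert groups[998] == 999              # 998 is skipped by the carry
assert groups[999] == 0                # ...and the period starts over
```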
Any computer program can be hidden inside a single polynomial equation.

If you force an integer polynomial to equal zero, you can “wire together” logical conditions in a way that behaves like code.

Logic using only “= 0”

Think of a statement A as “this integer expression equals 0”.

A OR B

If a product is zero, at least one factor must be zero:

A * B = 0

A AND B

A square is never negative, so the only way the sum of two squares is zero is if both are zero:

A^2 + B^2 = 0

Example: "x is even AND (x is 6 OR 10)"

"x is even" means: there exists an integer k with x=2k, i.e.

(x - 2k) = 0

"x is 6 OR 10" means:

(x-6)(x-10) = 0

Combine with AND:

(x - 2k)^2 + ((x-6)(x-10))^2 = 0

This single equation has an integer solution exactly when x is even and equals 6 or 10.
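A brute-force check of that claim (the search ranges are arbitrary):

```python
# The combined polynomial (x - 2k)^2 + ((x - 6)(x - 10))^2 should vanish
# for some integer k exactly when x is even AND x is 6 or 10.

def has_solution(x, k_range=range(-50, 51)):
    """Does some integer k make the combined polynomial equal zero?"""
    return any((x - 2*k)**2 + ((x - 6)*(x - 10))**2 == 0 for k in k_range)

solutions = [x for x in range(-20, 21) if has_solution(x)]
assert solutions == [6, 10]   # both are even, so both survive the AND
```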

How do we scale this up?

A program run is a long list of configurations (state + memory). Since an equation can’t run a program, we encode the entire history of a run (the “trace”) as data.

Then the polynomial acts like a strict auditor. It bundles thousands of AND-conditions that check:

Did step 1 follow the rules to reach step 2?

Did step 2 correctly lead to step 3?



Did some step reach a halting state?

The equation doesn’t “execute” anything. It asks a yes/no question: Does there exist a complete trace that passes every check?

If yes, the polynomial can be made to equal zero. If no, it can’t.

P.S. How do you fit a whole computer memory into a single number?

Think of prime numbers as “slots.” Math guarantees that every integer has a unique prime-factor signature. So to store the list (3,1,4), you can compute 2^3 * 3^1 * 5^4 = 15000. Because prime factorization is unique, 15000 can only ever be decoded back into 3,1,4. One integer can perfectly store a whole structured state.
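The prime-slot encoding in a few lines (the prime list is truncated for illustration):

```python
# Store a list of non-negative integers as one integer via prime exponents,
# then recover it: unique factorization guarantees lossless decoding.

PRIMES = [2, 3, 5, 7, 11, 13]

def encode(values):
    n = 1
    for p, v in zip(PRIMES, values):
        n *= p ** v
    return n

def decode(n, length):
    out = []
    for p in PRIMES[:length]:
        e = 0
        while n % p == 0:     # count how many times p divides n
            n //= p
            e += 1
        out.append(e)
    return out

assert encode([3, 1, 4]) == 15000        # 2^3 * 3^1 * 5^4
assert decode(15000, 3) == [3, 1, 4]     # one integer, one structured state
```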
The Minimal Atoms

On a flat plane, some transformations can move shapes around without stretching them at all. These are the distance-preserving motions (isometries): translations, rotations, reflections (and the slightly less famous glide reflection).

It turns out that reflections alone generate all of them. In fact, any plane isometry can be done with at most three reflections.

Step 0: Why triangles pin down motion

Take three points A, B, C that aren't collinear (they form a triangle). Then there cannot be two different points P != Q that have the same distances to all three.

Why? If a point X has equal distance to P and Q, then X lies on the perpendicular bisector of segment PQ. So if A, B, C were all equally far from P and Q, they'd all lie on that same bisector, meaning they'd be collinear. Contradiction.

So, a point is uniquely determined by its distance to three non-collinear points. That means: once you know where an isometry sends a triangle's three vertices, you know what it does to every point.

The 3-reflection construction

Suppose we have a blue triangle ABC and a red triangle A'B'C' of the same shape and size (congruent). We want a distance-preserving motion that maps the blue triangle onto the red one.

We'll build it using reflections only:

Reflection 1:

Reflect across the perpendicular bisector of segment AA'.
This sends A exactly to A'.

Reflection 2:

Now reflect across the perpendicular bisector of the current position of B and B'.
Crucial point: since distances are preserved and the triangles are congruent, A' is the same distance from those two points. So A' lies on that bisector, and is not moved by this reflection.

Result: B lands exactly on B' and A' stays put.

Reflection 3:

If C still doesn't match C', reflect across the perpendicular bisector of the current C and C'.
Again, because distances are preserved and now A' and B' already match, both A' and B' are equidistant from C and C', so they lie on the bisector line and don't move.

Result: C lands on C', with A', B' unchanged.

So we've matched the whole triangle using at most 3 reflections.
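The construction can be run numerically. This sketch (my own coordinates, chosen so the target is a rotated-and-shifted copy of the source) applies the three perpendicular-bisector reflections in order:

```python
import math

def reflect_across_bisector(X, P, Q):
    """Reflect point X across the perpendicular bisector of segment PQ.
    The bisector passes through the midpoint of PQ, with normal along PQ."""
    mx, my = (P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2
    nx, ny = Q[0] - P[0], Q[1] - P[1]
    norm = math.hypot(nx, ny)
    if norm < 1e-12:                   # P == Q: nothing to move
        return X
    nx, ny = nx / norm, ny / norm
    d = (X[0] - mx) * nx + (X[1] - my) * ny
    return (X[0] - 2 * d * nx, X[1] - 2 * d * ny)

def match(triangle, target):
    """Map a congruent triangle onto target with at most three reflections."""
    pts = list(triangle)
    for i in range(3):
        if math.dist(pts[i], target[i]) > 1e-9:   # vertex not matched yet
            pts = [reflect_across_bisector(p, pts[i], target[i]) for p in pts]
    return pts

# A congruent pair: dst is src rotated 90 degrees and translated.
src = [(0.0, 0.0), (4.0, 0.0), (1.0, 2.0)]
dst = [(5.0, 1.0), (5.0, 5.0), (3.0, 2.0)]
out = match(src, dst)
assert all(math.dist(a, b) < 1e-9 for a, b in zip(out, dst))
```

Note how the code mirrors the proof: each reflection sends the current vertex onto its target, and congruence guarantees the already-matched vertices sit on the bisector and stay put.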

---

This “minimal set generates a whole world” idea is everywhere:

1. In Boolean logic, you don't need {AND, OR, NOT}. A single operator like NAND can express everything.

2. In number theory, every integer >1 has a unique factorization into primes.

3. In signal processing, Fourier analysis builds signals out of sine waves.

4. In linear algebra, a basis generates a whole vector space.

5. Any permutation of n objects can be built from repeatedly swapping two items. And amazingly, for n >= 3, you can generate every shuffle using just two moves: "rotate everyone one step" plus "swap two items".

6. A CPU that can only subtract two numbers and jump to another step if the result is negative can still be fully programmable. Everything else (addition, loops, if-statements) can be built from that.

7. In human vision, the brain reduces the continuous spectrum to three cone cell types and reconstructs color from that.

Small toolkits. Infinite consequences.
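Point 6 is close in spirit to the classic one-instruction SUBLEQ machine. Here is a toy sketch (the instruction encoding is my own) that builds addition out of two subtractions:

```python
# Toy one-instruction machine: the only operation is
#   mem[b] -= mem[a]; jump to c if the result went negative, else fall through.

def run(mem, program):
    """program is a list of (a, b, c) address triples; halt when pc leaves it."""
    pc = 0
    while 0 <= pc < len(program):
        a, b, c = program[pc]
        mem[b] -= mem[a]
        pc = c if mem[b] < 0 else pc + 1
    return mem

A, B, Z = 0, 1, 2                    # named memory slots
mem = {A: 7, B: 35, Z: 0}
add = [
    (A, Z, 1),   # Z -= A  -> Z = -7 (negative, so "jump" to the next line anyway)
    (Z, B, 2),   # B -= Z  -> B = 35 - (-7) = 42; pc leaves the program: halt
]
run(mem, add)
assert mem[B] == 42 and mem[A] == 7  # addition built from subtract-and-branch
```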
Links for 2026-02-04 [Part 1]

AI


1. Microsoft introduces RPG-Encoder, a system that improves how AI understands complex code repositories. In the SWE-bench Verified benchmark, it achieved a state-of-the-art 93.7% accuracy in localizing bugs (Acc@5). https://arxiv.org/abs/2602.02084

2. Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability https://arxiv.org/abs/2601.18778

3. Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems https://arxiv.org/abs/2601.22401

4. Recent Advances in LLMs for Mathematics — OpenAI built a scaffold for GPT-5 to solve a particular complex mathematical problem that enabled the model to think for *two days*(!) https://www.youtube.com/watch?v=MH3lG7V7SuU

5. A small gallery of AlphaEvolve experiments https://alphaevolve-examples.web.app/ae/gallery

6. Synthetic pretraining https://vintagedata.org/blog/posts/synthetic-pretraining

7. OpenClaw (formerly MoltBot, formerly ClawdBot) gives LLMs persistence and memory in a way that allows any computer to serve as an always-on agent carrying out your instructions. The memory and personal details are stored locally. You can run popular models remotely through APIs, or locally if you have enough hardware. You communicate with it using any of the popular messaging tools (WhatsApp, Telegram, and so on), so it can be used remotely. https://www.lesswrong.com/posts/aQKBMEvTj3Heidoir/unless-that-claw-is-the-famous-openclaw

8. New Anthropic paper: The longer the model has to reason, the more unpredictable it becomes: not consistently wrong, not completely random, just pursuing strange goals that are neither systematically aligned nor misaligned. There is an inconsistent relationship between model intelligence and incoherence. But smarter models are often more incoherent. https://alignment.anthropic.com/2026/hot-mess-of-ai/

9. Anthropic’s “Hot Mess” paper overstates its case: The paper's abstract says that "in several settings, larger, more capable models are more incoherent than smaller models", but in most settings they are more coherent. https://www.lesswrong.com/posts/ceEgAEXcL7cC2Ddiy/anthropic-s-hot-mess-paper-overstates-its-case-and-the-blog

10. The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain https://arxiv.org/abs/2509.26507
Links for 2026-02-04 [Part 2]

AI


11. “Dash is a self-learning data agent that grounds its answers in 6 layers of context and improves with every run.” https://github.com/agno-agi/dash

12. POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration https://arxiv.org/abs/2601.18779

13. What did we learn from the AI Village in 2025? https://www.lesswrong.com/posts/iv3hX2nnXbHKefCRv/what-did-we-learn-from-the-ai-village-in-2025

14. China’s genius plan to win the AI race is already paying off https://www.ft.com/content/68f60392-88bf-419c-96c7-c3d580ec9d97 [no paywall: https://archive.is/mQWjj]

15. Moltbook: After The First Weekend https://www.astralcodexten.com/p/moltbook-after-the-first-weekend

16. If the Superintelligence were near fallacy https://www.lesswrong.com/posts/tkA9J8RxoEckH7Pop/if-the-superintelligence-were-near-fallacy

17. “OpenAI chief research officer Mark Chen tells Forbes that in the year ahead it hopes to develop an AI researcher ‘intern’ that can help his team accelerate its ideas. ‘We are heading toward a system that will be capable of doing innovation on its own,’ Altman says. ‘I don’t think most of the world has internalized what that’s going to mean.’” https://www.forbes.com/sites/richardnieva/2026/02/03/sam-altman-explains-the-future/ [no paywall: https://archive.is/FrX0R]

18. “I bet with full confidence that 2026 will mark the first year that Large World Models lay real foundations for robotics, and for multimodal AI more broadly.” https://x.com/DrJimFan/status/2018754323141054786

19. “The vision of human-level machine intelligence laid out by Alan Turing in the 1950s is now a reality. Eyes unclouded by dread or hype will help us to prepare for what comes next” https://www.nature.com/articles/d41586-026-00285-6 [no paywall: https://archive.is/ozUOy]

20. SpaceX acquires xAI, plans to launch a massive satellite constellation to power it https://arstechnica.com/ai/2026/02/spacex-acquires-xai-plans-1-million-satellite-constellation-to-power-it/

21. Samsung, SK Hynix Exceed Value of Chinese Duo as AI Boom Shifts https://www.bloomberg.com/news/articles/2026-02-03/samsung-sk-hynix-to-top-value-of-chinese-duo-as-ai-boom-shifts [no paywall: https://archive.is/UQLcJ]

22. Inside an AI start-up’s plan to scan and dispose of millions of books https://www.washingtonpost.com/technology/2026/01/27/anthropic-ai-scan-destroy-books/ [no paywall: https://archive.is/s7Ld8]

23. US stocks drop on fears AI will hit software and analytics groups https://www.ft.com/content/48ec5657-c2e7-4111-a236-24a96a8d49e7 [no paywall: https://archive.is/ORjiw]

Miscellaneous

1. Julia https://borretti.me/fiction/julia

2. The Meta-Anthropic Argument https://www.lesswrong.com/posts/SgxkGoT8tvxREszoA/the-meta-anthropic-argument

3. How a unique class of neurons may set the table for brain development https://news.mit.edu/2026/how-neurons-may-set-table-for-brain-development-0202

4. “In 2024, the total installed electricity capacity of the planet—every coal, gas, hydro, and nuclear plant and all of the renewables—was about 10 terawatts. The Chinese solar supply chain can now pump out 1 terawatt of panels every year.” https://www.wired.com/story/china-renewable-energy-revolution/ [no paywall: https://archive.is/xzEyw]

5. Richard Ngo proposes reframing the goals of intelligent agents in terms of “goal-models” rather than the traditional utility functions. https://www.lesswrong.com/posts/MEkafPJfiSFbwCjET/on-goal-models

6. Basics of How Not to Die https://www.lesswrong.com/posts/dHFrKjgTC3zPfpodr/basics-of-how-not-to-die

7. A review of Ada Palmer’s 2025 pop-history book, Inventing the Renaissance. https://www.lesswrong.com/posts/YZS6f32CgNqTzb7Zn/inventing-the-renaissance-review
I have it on good authority that this graph's slope will hit a wall riiiight as I am about to lose my job and things would otherwise get weird/uncomfortable to think about.


https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
This paper reports that an AI system (Axiom Prover) formally proved and verified an open conjecture from a research paper. The AI was provided with a natural-language statement of the problem and a one-line task file instructing it to 'State and prove Fel's conjecture in Lean,' from which it autonomously generated the fully verified proof.

Paper: https://arxiv.org/abs/2602.03716
How Gemini has crossed the threshold from assistant to expert research collaborator.

The paper is a collection of case studies showing Gemini-based models acting as high-leverage collaborators in theoretical research. Across mostly theoretical CS (and some physics/optimization), the model helps refute conjectures, generate proofs, and bridge fields by retrieving obscure theorems. Two standout methods are (1) using the model as an adversarial reviewer to uncover subtle fatal proof flaws in cutting-edge cryptography work, and (2) embedding it in neuro-symbolic execution loops where it writes and runs code to numerically validate and self-correct long derivations. The authors argue this shifts researchers toward orchestrating and verifying AI-assisted reasoning, with verification becoming the new bottleneck.

Paper: https://arxiv.org/abs/2602.03837
Claude Opus 4.6 & GPT-5.3-Codex

Anthropic released Claude Opus 4.6: “Agent teams” in Claude Code (multiple subagents in parallel), context “compaction” for long-running agents. Big gains on long-horizon/realistic tool tasks (terminal work, OS/GUI tasks, web tasks). Anthropic asked 16 of its researchers about the uplift they get from working with Opus 4.6. Mean uplift was 152%; median uplift was 100%.

Read more: https://www.anthropic.com/news/claude-opus-4-6

OpenAI released GPT-5.3-Codex: 57% SWE-Bench Pro, 76% TerminalBench 2.0, 64% OSWorld. They say it is the first model that was instrumental in creating itself. The Codex team used early versions to debug its own training, manage its own deployment, and diagnose test results and evaluations. The team was blown away by how much Codex was able to accelerate its own development.

Read more: https://openai.com/index/introducing-gpt-5-3-codex/