Axis of Ordinary
Memetic and cognitive hazards.

Substack: https://axisofordinary.substack.com/
Links for 2025-04-30 [Part 1]

AI


1. DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition — “Our future work will focus on scaling this paradigm to an AlphaProof-like system with the ultimate aim of tackling IMO-level mathematical problems that represent the frontier of automated theorem proving challenges.” https://github.com/deepseek-ai/DeepSeek-Prover-V2

2. Automated Proof Engineering (APE): Towards File-level Automated Proof Engineering of Formal Math Libraries. APE-Bench I shifts evaluation from “Can the model prove lemma X?” to “Can it behave like a competent maintainer of a giant formal library?” [PDF] https://xinhuajian.wordpress.com/wp-content/uploads/2025/04/ape_bench_i-2.pdf

3. Reinforcement Learning for Reasoning in Large Language Models with One Training Example — 36.0% -> 73.6% on MATH500 by performing RLVR on a single example. Applying entropy loss alone, without any outcome reward, improves perf by 27.4%. https://arxiv.org/abs/2504.20571

4. Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory. Reports state-of-the-art (SOTA) performance, 26% more accurate than OpenAI Memory. https://arxiv.org/abs/2504.19413

5. ReasonIR: Training Retrievers for Reasoning Tasks https://arxiv.org/abs/2504.20595

6. SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement https://sites.google.com/asu.edu/sas-llm/

7. Instant Policy: In-Context Imitation Learning via Graph Diffusion — The robot learns several novel tasks instantly, after just ONE demonstration each. https://www.robot-learning.uk/instant-policy

8. Hugging Face releases a 3D-printed robotic arm starting at $100 https://techcrunch.com/2025/04/28/hugging-face-releases-a-3d-printed-robotic-arm-starting-at-100/

9. SplitReason: Learning To Offload Reasoning https://arxiv.org/abs/2504.16379

10. Scaling Laws For Scalable Oversight https://arxiv.org/abs/2504.18530

11. MAGI: Multi-Agent Guided Interview for Psychiatric Assessment https://arxiv.org/abs/2504.18260

12. Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations. Instead of hoping a model explains things correctly the first time, the method validates explanations by recursively questioning the model about its own outputs until contradictions are exposed or resolved. [Published in 2022] https://arxiv.org/abs/2205.11822

13. CoRT (Chain of Recursive Thoughts) https://github.com/PhialsBasement/Chain-of-Recursive-Thoughts
Links for 2025-04-30 [Part 2]

AI


14. How people use LLMs https://www.lesswrong.com/posts/FXnvdeprjBujt2Ssr/how-people-use-llms

15. NotebookLM Audio Overviews are now available in over 50 languages https://blog.google/technology/google-labs/notebooklm-audio-overviews-50-languages/

16. o3 Beats a Master-Level Geoguessr Player—Even with Fake EXIF Data https://sampatt.com/blog/2025-04-28-can-o3-beat-a-geoguessr-master

17. Mark Zuckerberg predicts that within the next 12 to 18 months most of AI development code will be written by AI. He said 'We're trying to build a coding agent and an AI research agent that advances Llama research specifically.' https://www.dwarkesh.com/p/mark-zuckerberg-2

18. Former Google CEO Schmidt: Why U.S. Needs to Win Race for Superintelligent AI https://www.youtube.com/watch?v=5l8eDLunQFU

19. “At McKinsey, consultants are using an in-house generative AI chatbot called Lilli. It synthesizes the firm's entire body of intellectual property, which spans 100 years and over 100,000 documents and interviews, the firm told BI…Over 70% of the firm's 45,000 employees now use the tool.” https://www.businessinsider.com/consulting-ai-mckinsey-bcg-deloitte-pwc-kpmg-chatbots-ai-tools-2025-4 [no paywall: https://archive.is/RuXpi]

20. GPT-4o Is An Absurd Sycophant https://www.lesswrong.com/posts/zi6SsECs5CCEyhAop/gpt-4o-is-an-absurd-sycophant

21. “Sycophancy in GPT-4o: What happened and what we’re doing about it” https://openai.com/index/sycophancy-in-gpt-4o/

22. Our Reality: A Simulation Run by a Paperclip Maximizer https://www.lesswrong.com/posts/HxLYnGYspLoeLLrE6/our-reality-a-simulation-run-by-a-paperclip-maximizer-1

China AI

A meeting of China’s Communist leadership underscored its intense focus on developing homegrown artificial intelligence, analysts said.


1. “April Politburo Study Session on AI is bad news for Nvidia” https://sinocism.com/p/april-politburo-study-session-on

2. Politburo holds a second AI study session after seven years https://triviumchina.com/2025/04/28/politburo-holds-a-second-ai-study-session-after-seven-years/

3. Former ASML head scientist Lin Nan drives China’s latest EUV breakthrough https://www.msn.com/en-xl/news/other/former-asml-head-scientist-lin-nan-drives-china-s-latest-euv-breakthrough/ar-AA1DNjSr

Miscellaneous

1. Tracking single neurons in the human brain reveals new insight into language and other human-specific functions https://www.thetransmitter.org/human-neurotechnology/tracking-single-neurons-in-the-human-brain-reveals-new-insight-into-language-and-other-human-specific-functions/

2. French defense company Turgis Gaillard is set to unveil a major new weapons system at the Paris Air Show in June 2025. According to exclusive information from Challenges magazine, the group will present "Foudre" (Lightning), a prototype multiple rocket launcher designed to compete with the renowned American HIMARS. It was developed in secret over two years and entirely self-financed. https://www.challenges.fr/entreprise/defense/un-engin-100-francais-concurrent-du-himars-foudre-le-lance-roquettes-que-personne-nattendait_603520
Phi-4-Reasoning-Plus: Small Model, Big Reasoning Power

Highlights:


- Punches far above its size: Phi-4-Reasoning-Plus is a 14-billion parameter open-weights model that outperforms or matches much larger open models (DeepSeek-R1-Distill-70B, QwQ-32B) and several closed models (o1-mini, Claude-Sonnet-3.7) on AIME 2025, HMMT, OmniMath, GPQA and LiveCodeBench, and approaches the 671B-parameter DeepSeek-R1 on math.

- Open and Accessible: Released under a permissive MIT license, allowing broad commercial and research use.

- Laptop-friendly, transparent outputs: Runs on a beefy laptop GPU; separates chain-of-thought in <think>…</think> tags from the final answer for cleaner inspection and evaluation.

- Better Than the Teacher: In some cases, like the AIME 2025 benchmark (with parallel compute) and OmniMath, the model surpassed the performance of its teacher model (o3-mini), indicating a capacity for self-improvement beyond initial training signals.

- Reasoning as a transferable meta-skill: Despite no explicit training on these tasks, it solves TSP, 3-SAT, maze routing and calendar-planning problems, backing the claim that “reasoning generalises.”

- Data-centric recipe: Supervised-fine-tuned on ~1.4 M carefully filtered reasoning demos (≈16 B tokens) generated by o3-mini, then polished with just 6 k math problems via GRPO RL to “lock-in” its reasoning style.

- RL that actually helps: A single short RL run raises AIME/HMMT accuracy by ≈10 pp and extends explanations by ≈50 %, showing RL can sharpen thought without huge compute.
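
The <think>…</think> convention mentioned above makes it easy to separate the reasoning trace from the final answer before evaluating or logging. Here is a minimal sketch, assuming the model output arrives as one plain string with at most one <think> block. The function name and the sample string are made up for illustration and are not taken from the model's documentation.

```python
import re
from typing import Tuple

# Non-greedy match so only the first <think>...</think> block is captured;
# DOTALL lets the reasoning span multiple lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hypothetical output string, not real model output.
sample = "<think>17 is odd and has no divisor up to 4.</think>\n17 is prime."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Scoring only `answer` keeps a benchmark from accidentally matching text that appears inside the reasoning trace.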

Read more:

Paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2025/04/phi_4_reasoning.pdf

Microsoft press release: https://azure.microsoft.com/en-us/blog/one-year-of-phi-small-language-models-making-big-leaps-in-ai/

Press: https://venturebeat.com/ai/microsoft-launches-phi-4-reasoning-plus-a-small-powerful-open-weights-reasoning-model/
Links for 2025-05-02 [Part 1]

AI


1. Testing AI's GeoGuessr Genius https://www.astralcodexten.com/p/testing-ais-geoguessr-genius

2. When ChatGPT Broke an Entire Field: An Oral History https://www.quantamagazine.org/when-chatgpt-broke-an-entire-field-an-oral-history-20250430/

3. Egret-1: Pretrained Neural Network Potentials For Efficient and Accurate Bioorganic Simulation https://arxiv.org/abs/2504.20955

4. LLMs for Engineering: Teaching Models to Design High Powered Rockets https://arxiv.org/abs/2504.19394

5. Slowdown After 2028: Compute, RLVR Uncertainty, MoE Data Wall — “It'll take until ~2050 to repeat the level of scaling that pretraining compute is experiencing this decade, as increasing funding can't sustain the current pace beyond ~2029 if AI doesn't deliver a transformative commercial success by then.” https://www.lesswrong.com/posts/XiMRyQcEyKCryST8T/slowdown-after-2028-compute-rlvr-uncertainty-moe-data-wall

6. Superhuman Coders in AI 2027 - Not So Fast https://www.lesswrong.com/posts/QdaMzqaBJi6kupKtD/superhuman-coding-in-ai-2027-not-so-fast

7. Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis https://www.lesswrong.com/posts/JW7nttjTYmgWMqBaF/early-chinese-language-media-coverage-of-the-ai-2027-report

8. China's Xi calls for self sufficiency in AI development amid U.S. rivalry https://www.reuters.com/world/china/chinas-xi-calls-self-sufficiency-ai-development-amid-us-rivalry-2025-04-26/

9. “We processed over 100t tokens this quarter, up 5x year over year, including a record 50t tokens last month alone.” https://tomtunguz.com/earnings-microsoft-2025-04-30/

10. Interpreting the METR Time Horizons Post https://www.lesswrong.com/posts/fRiqwFPiaasKxtJuZ/interpreting-the-metr-time-horizons-post

11. “Claude can research for up to 45 minutes before delivering a comprehensive report, complete with citations.” https://www.anthropic.com/news/integrations

12. “Progress in robotics isn't just an intelligence problem.” https://inferencemagazine.substack.com/p/and-then-we-get-the-robots

13. UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities https://arxiv.org/abs/2504.20734

14. Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions https://arxiv.org/abs/2505.00675

15. Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning https://arxiv.org/abs/2504.13941

16. Reasoning critics enable better parallel search for software engineering agents https://nebius.com/blog/posts/reasoning-critics-parallel-search-for-agents

17. A new sign that AI is competing with college grads https://www.theatlantic.com/economy/archive/2025/04/job-market-youth/682641/ [no paywall: https://archive.is/MWmqm]

18. “I Tested The AI That Calls Your Elderly Parents If You Can't Be Bothered” https://www.404media.co/i-tested-the-ai-that-calls-your-elderly-parents-if-you-cant-bothered/ [no paywall: https://archive.is/fK83a]
Links for 2025-05-02 [Part 2]

AI


19. Waymo, Toyota strike partnership to bring self-driving tech to personal vehicles https://www.cnbc.com/2025/04/29/waymo-toyota-partner-to-bring-self-driving-tech-to-personal-vehicles-.html

20. “An employee at Elon Musk’s artificial intelligence company xAI leaked a private key on GitHub that for the past two months could have allowed anyone to query private xAI large language models (LLMs) which appear to have been custom made for working with internal data from Musk’s companies, including SpaceX, Tesla and Twitter/X” https://krebsonsecurity.com/2025/05/xai-dev-leaks-api-key-for-private-spacex-tesla-llms/

21. Why the AI Revolution Won’t Look Like You Expect—And Why That’s More Dangerous https://www.youtube.com/watch?v=NMwjqqtU5Dw

22. OpenAI: "We’ve spent the last few days doing a deep dive on what went wrong with last week’s GPT-4o update in ChatGPT. Expanding on what we missed with sycophancy and the changes we’re going to make in the future" https://openai.com/index/expanding-on-sycophancy/

Compute

1. Scott Aaronson: “Grant Sanderson, of 3blue1brown, has put up a phenomenal YouTube video explaining Grover’s algorithm, and dispelling the fundamental misconception about quantum computing, that QC works simply by “trying all the possibilities in parallel.” Let me not futz around: this video explains, in 36 minutes, what I’ve tried to explain over and over on this blog for 20 years … and it does it better. It’s a masterpiece.” https://www.youtube.com/watch?v=RQWpF2Gb-gU

2. Penn engineers have developed the first photonic chip that reshapes how light behaves to carry out the nonlinear mathematics at the heart of modern AI while reducing energy use. https://penntoday.upenn.edu/news/penn-engineers-first-train-ai-lightspeed

3. An Interview with Dan Kim and Hassan Khan About CHIPS https://stratechery.com/2025/an-interview-with-dan-kim-and-hassan-khan-about-chips/

4. Short video of America’s largest data center. https://www.youtube.com/watch?v=fUiI03X6DQc
I didn't know how important Zeiss was: ASML’s most advanced steppers literally can’t function without the atom-perfect optics from Carl Zeiss SMT—a German optics & optoelectronics powerhouse that builds the entire “imaging engine” inside every machine.

P.S. ASML (Advanced Semiconductor Materials Lithography), headquartered in Veldhoven, Netherlands, is the world’s only supplier of extreme-ultraviolet (EUV) scanners, the literal heart of every cutting-edge chip fab.
The Ukraine War and the Kill Market

The [Ukrainian] program […] rewards soldiers with points if they upload videos proving their drones have hit Russian targets. It will soon be integrated with a new online marketplace called Brave 1 Market, which will allow troops to convert those points into new equipment for their units.

[...]

The program assigns points for each type of kill: 20 points for damaging and 40 for destroying a tank; up to 50 points for destroying a mobile rocket system, depending on the caliber; and six points for killing an enemy soldier.

[...]

Units will soon be able to use the special digital points they’ve been getting since last year by trading them in for new weapons. A Vampire drone, for example, costs 43 points. The drone, nicknamed Baba Yaga, or witch, is a large multi-rotor drone able to carry a 15-kilogram warhead. The Ukrainian government will pay for the drones that are ordered and will deliver them to the front-line unit within a week.

[...]

The scheme is aimed at directing more equipment to the most effective units. It will also help to bypass bureaucratic procurement procedures and buy weapons directly from manufacturers.

[...]

The ability to get points for killing enemy troops is also spurring competition among units; so far about 90 percent of the army's drone units have scored points. In fact, they are logging so many hits that the government has had to revamp the logistics of drone deliveries to get more of them to points-heavy units. “They started killing so quickly that Ukraine does not have time to deliver new drones,” Fedorov said.


Read more: https://www.politico.eu/article/ukraines-army-have-video-game-like-digital-weapons-store-deadly-realistic/
The Ultimate LLM Meta-Leaderboard averaged across the 28 best benchmarks

Gemini 2.5 Pro > o3 > Sonnet 3.7 Thinking


Compiled by https://x.com/scaling01/status/1919217718420508782
This chart tracks em dash (—) usage across tech and startup subreddits over the past year, a stylistic marker often found in AI-generated writing.

Generated on May 4, 2025, using Reddit’s API to fetch the top 1000 posts from the past year in each subreddit. This introduces time bias: recent posts are underrepresented unless they quickly gained high scores. Treat it as a signal, not proof.


Source: https://github.com/v4nn4/em-dash-conspiracy?tab=readme-ov-file
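
To make the measurement concrete, here is a minimal sketch of one way such a signal could be computed: the share of posts per month that contain at least one em dash. The post above does not state the linked repository's exact metric, so this definition, the function name and the sample data are assumptions. Fetching posts from Reddit is left out entirely.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

EM_DASH = "\u2014"  # the em dash character, written as an escape


def em_dash_share_by_month(posts: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each month ('YYYY-MM') to the fraction of posts containing an em dash.

    `posts` is an iterable of (month, text) pairs.
    """
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for month, text in posts:
        totals[month] += 1
        if EM_DASH in text:
            hits[month] += 1
    return {month: hits[month] / totals[month] for month in sorted(totals)}


# Made-up sample data for illustration only.
sample_posts = [
    ("2025-01", "We shipped it \u2014 finally."),
    ("2025-01", "Plain post with a hyphen - nothing else."),
    ("2025-02", "Launch day\u2014here we go."),
]
print(em_dash_share_by_month(sample_posts))
```

Because only top-scoring posts are sampled, a monthly share like this inherits the time bias described above.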
Here's an interesting quote from George Simion, Romania's far-right presidential candidate:

Russia is the biggest danger towards Romania, Poland and the Baltic states.


He wants a "strong Romanian army inside NATO." However, he no longer wants to support Ukraine because "the war is not going anywhere."

These are inconsistent positions.

If Russia poses the greatest threat to your country, you should be glad that the war isn't going anywhere, e.g., to Romania, until you can build a strong army. And even once you do, it's much better to keep your biggest enemy's resources tied up in another, non-aligned country than to have to fight it yourself.

Source of the quote: https://nickthorpe.substack.com/p/i-am-young-and-restless
Links for 2025-05-05

AI


1. On the generalization of language models from in-context learning and finetuning: a controlled study https://arxiv.org/abs/2505.00661

2. Novel AI model inspired by neural dynamics from the brain https://news.mit.edu/2025/novel-ai-model-inspired-neural-dynamics-from-brain-0502

3. Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers https://physics.allen-zhu.com/part-4-architecture-design/part-4-1

4. Nikolay Savinov predicts that the industry is going to achieve near perfect retrieval across 1-2M context length 'quite soon', and that soon afterwards a 10M token context window will become the norm. https://www.youtube.com/watch?v=NHMJ9mqKeMQ

5. What's going on with AI progress and trends? (As of 5/2025) https://www.lesswrong.com/posts/v7LtZx6Qk5e9s7zj3/what-s-going-on-with-ai-progress-and-trends-as-of-5-2025

6. Waymo robotaxis are safer than human drivers https://growsf.org/news/2025-05-02-waymo-safety/

7. Around 60% of students reported using AI themselves, while they estimated that nearly 90% of their peers use AI. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5232910

8. “I Recorded Everything I Said for Three Months. AI Has Replaced My Memory.” https://www.wsj.com/tech/personal-tech/ai-personal-assistant-wearable-tech-impressions-28156b57 [no paywall: https://archive.is/xac6J]

9. This Chart Might Keep You From Worrying About AI’s Energy Use https://spectrum.ieee.org/ai-energy

10. Will nuclear energy power the AI boom? https://thebaffler.com/latest/project-ludicrous-northwood [no paywall: https://archive.is/m8KCB]

11. Jensen: "First thing to understand: 50% of the world's AI researchers are Chinese." https://www.youtube.com/live/E2o9O0EVouA?si=RZdkLpin-k5C8kGZ&t=594

12. What if AI just keeps getting smarter? https://www.lesswrong.com/posts/MCaqKAfSn345MCz7o/ra-x-controlai-video-what-if-ai-just-keeps-getting-smarter

13. Where’s my ten minute AGI? – if AIs are actually able to perform most tasks on 1-hour task horizons, why don’t we see more real-world task automation? https://epochai.substack.com/p/wheres-my-ten-minute-agi

14. T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT https://arxiv.org/abs/2505.00703

15. Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think https://arxiv.org/abs/2504.20708

Miscellaneous

1. Novel High Resolution 3D Printing Method for Metals and Ceramics https://www.youtube.com/watch?v=kLgPW2672s4

2. Non-linear Ethnic Niches: The emerging Western caste system https://substack.com/home/post/p-162313414

3. Mathematician solves algebra’s oldest problem using intriguing new number sequences: “This is a dramatic revision of a basic chapter in algebra.” https://www.unsw.edu.au/newsroom/news/2025/05/mathematician-solves-algebras-oldest-problem-using-intriguing-new-number-sequences
Here is a quick and dirty test I ran: generating a math book entirely with AI. Initially, I wanted to publish it as a blog post, so I insisted that it not use LaTeX. The formatting of the first units differs from that of the later units and could be improved significantly. However, those changes are relatively easy to make:

1. Download the document as .docx.
2. Upload the file.
3. Prompt Gemini 2.5 Pro: "Generate an improved version of Unit # with Superior Mathematical Typesetting."

Modular Arithmetic - From Basics to Advanced Concepts https://docs.google.com/document/d/10TIxITHGAR5yskFPdzC1CKQmyogMhGGOZbyyBBgLM40/edit?usp=sharing

Unit 0: Foundations – The Division Algorithm
Unit 1: Introduction to Congruence Modulo n
Unit 2: Properties of Congruence Relations
Unit 3: Modular Arithmetic Operations
Unit 4: Multiplicative Inverses and Cancellation
Unit 5: Solving Linear Congruences
Unit 6: Solving Systems of Congruences – The Chinese Remainder Theorem (CRT)
Unit 7: Powers and Primes – Fermat’s Little Theorem (FLT)
Unit 8: Generalizing Fermat – Euler’s Totient Function and Euler’s Theorem
Unit 9: A Glimpse of Advanced Modular Arithmetic
Unit 10: Modular Arithmetic Meets Abstract Algebra – Rings, Groups, and Fields
Google updated Gemini 2.5 Pro with massively improved coding capabilities.

It’s especially good at building interactive web apps. It’s now #1 on the WebDev Arena leaderboard, breaking the 1400 Elo barrier!

Read more: https://blog.google/products/gemini/gemini-2-5-pro-updates/

Select gemini-2.5-pro-preview-05-06 here: www.ai.dev
Groundbreaking AI Learns and Teaches Itself, Reaching New Heights in Reasoning

A new AI paradigm, "Absolute Zero," enables a single AI model to teach itself by proposing its own tasks, solving them, and learning from the process entirely without human-provided data or external examples. This self-evolving system, instantiated as the Absolute Zero Reasoner (AZR), has demonstrated state-of-the-art performance in complex coding and mathematical reasoning tasks, even outperforming models trained on extensive, human-curated datasets.

Highlights:

- A new “Absolute Zero” paradigm: one model simultaneously proposes new reasoning tasks and solves them, learning entirely through self‑play with no external data or human‑written questions/answers. A Python sandbox provides automatic task validation and reward, eliminating the last human data bottleneck in RL‑for‑reasoning.

- Task format = (program, input, output) triplets: By hiding one element the model practises three complementary reasoning modes: Deduction (predict output), Abduction (infer input), and Induction (synthesize program from I/O examples). All are verifiable by execution.

- Unified proposer/solver training loop: Tasks are kept neither‑too‑easy‑nor‑impossible using a learnability reward (maximal when the solver succeeds ≈ 50 % of the time), and variance‑reduced with the new Task‑Relative REINFORCE++ advantage estimator.

- Bootstraps from almost nothing: AZR began with a single identity‑function triplet and grew its own curriculum of thousands of increasingly complex programs.

- State‑of‑the‑art “zero‑data” results: A 7 B “coder” base model trained with AZR beats all prior zero‑setting models on the combined coding + math score (50.4 % vs 48.6 %), and even edges out code‑specialised baselines on coding benchmarks alone—despite using zero curated examples.

- Strong cross‑domain generalisation: although trained only on self‑generated code tasks, AZR lifts math accuracy by +10.9 to +15.2 pts, while expert‑code RLVR models gain just +0.65 pts on average.

- Scales gracefully: bigger bases learn more (+5.7 pts @ 3 B, +10.2 pts @ 7 B, +13.2 pts @ 14 B overall). AZR also works on other families (e.g., Llama 3), though gains shrink with weaker bases.

- Emergent Behaviors: During training, AZR has been observed to naturally develop intermediate planning steps (similar to the ReAct prompting framework) and distinct cognitive behaviors depending on the task type, such as step-by-step reasoning and trial-and-error.

Together, these results position AZR as the first demonstration that a large language model can teach itself advanced coding and mathematical reasoning from scratch, offering a promising path toward open‑ended, autonomous intelligence without human‑curated curricula.
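
The (program, input, output) triplet format described in the highlights can be illustrated with a small verifier. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, plain `exec` stands in for the Python sandbox (it provides no isolation), and accepting any input that reproduces the output in the abduction check is an assumption made here.

```python
from typing import Any, List, Tuple


def run_program(source: str, arg: Any) -> Any:
    """Run `source`, which must define f(x), and return f(arg).

    Plain exec() is NOT a sandbox; it stands in for the isolated
    executor a real system would need.
    """
    namespace: dict = {}
    exec(source, namespace)
    return namespace["f"](arg)


def _matches(source: str, arg: Any, expected: Any) -> bool:
    # A crashing or malformed program simply counts as a failed check.
    try:
        return run_program(source, arg) == expected
    except Exception:
        return False


def check_deduction(program: str, inp: Any, predicted_output: Any) -> bool:
    """Program and input are given; the solver predicts the output."""
    return _matches(program, inp, predicted_output)


def check_abduction(program: str, predicted_input: Any, output: Any) -> bool:
    """Program and output are given; the solver proposes an input.

    Assumption: any input that reproduces the output is accepted.
    """
    return _matches(program, predicted_input, output)


def check_induction(predicted_program: str, examples: List[Tuple[Any, Any]]) -> bool:
    """I/O examples are given; the solver writes a program that fits them."""
    return all(_matches(predicted_program, x, y) for x, y in examples)


# Hypothetical triplet for illustration only.
PROGRAM = "def f(x):\n    return sorted(x, reverse=True)"
INPUT = [3, 1, 2]
OUTPUT = run_program(PROGRAM, INPUT)

print(check_deduction(PROGRAM, INPUT, [3, 2, 1]))
print(check_abduction(PROGRAM, [2, 3, 1], OUTPUT))
print(check_induction("def f(x):\n    return sorted(x)[::-1]",
                      [(INPUT, OUTPUT), ([5, 4], [5, 4])]))
```

All three checks reduce to running code and comparing values, which is what lets the reward be computed without human-written answers.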

Project page: https://andrewzh112.github.io/absolute-zero-reasoner/
Links for 2025-05-07

AI


1. VideoMimic: Visual imitation enables contextual humanoid control https://www.videomimic.net/

2. “IntersectionZoo,” a benchmarking tool, uses a real-world traffic problem to test progress in deep reinforcement learning algorithms. https://news.mit.edu/2025/new-tool-evaluate-progress-reinforcement-learning-0505

3. Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning https://arxiv.org/abs/2505.01441

4. "AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World", Zhou et al 2025 {BAIR} https://arxiv.org/abs/2503.24278

5. "High-quality deepfakes have a heart!", Seibold et al 2025 (deepfakes can replicate signatures of blood flow) https://www.frontiersin.org/journals/imaging/articles/10.3389/fimag.2025.1504551/full

6. Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning https://arxiv.org/abs/2505.03318

7. "DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning", He et al 2025 {Tencent} https://arxiv.org/abs/2504.11456

8. OpenAI Reaches Agreement to Buy Startup Windsurf for $3 Billion https://www.bloomberg.com/news/articles/2025-05-06/openai-reaches-agreement-to-buy-startup-windsurf-for-3-billion [no paywall: https://archive.is/l6n9H]

9. Four Predictions About OpenAI's Plans To Retain Nonprofit Control https://www.obsolete.pub/p/four-predictions-about-openais-plans

10. Microsoft has published a paper about giving a code-generating LLM access to a Python debugger. https://microsoft.github.io/debug-gym/

11. Five Hinge‑Questions That Decide Whether AGI Is Five Years Away or Twenty https://www.lesswrong.com/posts/45oxYwysFiqwfKCcN/untitled-draft-keg3

12. AI pathways to AGI: 7 leading theories experts are betting on. https://www.forbes.com/sites/lanceeliot/2025/05/04/big-bets-on-which-of-these-pathways-will-push-todays-ai-to-become-prized-agi/

13. Zuckerberg’s Dystopian AI Vision https://www.lesswrong.com/posts/QNkcRAzwKYGpEb8Nj/zuckerberg-s-dystopian-ai-vision

14. Paul Tudor Jones: AI poses an imminent threat to humanity in our lifetime https://www.youtube.com/watch?v=wrESBnPYoZU

15. Amazon's Vulcan Robots Now Stow Items Faster Than Humans https://spectrum.ieee.org/amazon-stowing-robots

16. FutureHouse Platform: Superintelligent AI Agents for Scientific Discovery https://www.futurehouse.org/research-announcements/launching-futurehouse-platform-ai-agents

17. Kevin-32B: Multi-Turn RL for Writing CUDA Kernels https://cognition.ai/blog/kevin-32b

Miscellaneous


1. Exploring Skill Generalization with an Extra Robotic Arm for Motor Augmentation https://advanced.onlinelibrary.wiley.com/doi/epdf/10.1002/aisy.202500086

2. The Computational Bottleneck of Basal Ganglia Output (and What to Do About it) https://www.eneuro.org/content/12/4/ENEURO.0431-23.2024

3. Elementl Power has signed an agreement with Google to develop three new project sites for advanced nuclear reactors. Each site will generate at least 600 megawatts. The three locations have not yet been announced. https://blog.google/feed/google-and-elementl-nuclear-energy-site-development/

4. More social parrots have a better vocabulary https://www.mpg.de/24656080/0505-ornr-more-social-parrots-have-a-better-vocabulary-987453-x

5. The Russian Open Source Project That We Can’t Live Without https://huntedlabs.com/the-russian-open-source-project-that-we-cant-live-without/
Quote from a German article about ASML:

But what happens inside the belly of this giant machine is nothing short of incredible. Outside the steel colossus sit high-power lasers whose beam is many times stronger than those used in industry to cut metal, and extraordinarily focused. Were you to aim it at the Moon, some 400,000 kilometers away, you could hit a golf ball there.

The laser beam must be that precise, because it’s meant to hit tiny droplets of tin—and each one exactly twice—50,000 times per second. To achieve this, the beam is channeled through countless optical elements, amplified, and finally steered onto the droplets with especially powerful optics. The first pulse flattens a droplet into a kind of pancake; the second vaporizes it—at a temperature forty times hotter than the surface of the Sun. A plasma forms in the process, emitting something the engineers need: extremely short-wavelength extreme ultraviolet light (EUV).
Experimental Validation of the Google AI Co-Scientist System

In February, Google released an AI system capable of generating novel hypotheses and research strategies.

Researchers at Stanford Medicine, following the AI co-scientist's suggestions, conducted experiments using a multi-lineage human hepatic organoid (microHO) platform capable of reproducing key features of liver fibrosis, including myofibroblast (MyoF) generation and collagen filament formation. The experimental results strongly support the AI co-scientist's hypotheses regarding the role of epigenetic modifications (specifically the HDAC and BRD4 pathways) in liver fibrosis, and they identified promising drug candidates such as Vorinostat.

This study provides a first demonstration that a compound, multi-agent system, which was designed to mirror the reasoning process underlying scientific discovery, can assist in repurposing drugs for treating a disease with limited therapeutic options.


Notably, this discovery was made with an older version of the system from last year. Google has made considerable progress building on the latest Gemini 2.5 models. They will report on it soon.

The paper confirming the finding experimentally is available here: https://www.biorxiv.org/content/10.1101/2025.04.29.651320v1.full.pdf

Read more: https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
Links for 2025-05-08

Artificial and natural intelligence


1. SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning https://novasky-ai.notion.site/skyrl-v0

2. ZeroSearch: Incentivize the Search Capability of LLMs without Searching https://arxiv.org/abs/2505.04588

3. X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains https://arxiv.org/abs/2505.03981

4. Why is backpropagation biologically implausible? And what might the brain do instead? https://www.youtube.com/watch?v=l-OLgbdZ3kk

5. The Signal-To-Noise Ratio Hypothesis of Intelligence https://osf.io/preprints/osf/nkms3_v1

6. Reanalysis of the METR AI success rate curve as resulting from improving per-time failure rate. https://www.tobyord.com/writing/half-life

7. Meta Locate 3D: a model for accurate object localization in 3D environments. https://ai.meta.com/blog/meta-fair-updates-perception-localization-reasoning/

8. The Mathematics of Artificial Intelligence https://arxiv.org/abs/2501.10465

9. FDA announces aggressive AI rollout. https://www.fda.gov/news-events/press-announcements/fda-announces-completion-first-ai-assisted-scientific-review-pilot-and-aggressive-agency-wide-ai

10. Shares of both Apple and Alphabet dropped after Apple executive Eddy Cue testified in an antitrust case that searches on Safari had declined, likely because more users are turning to AI chatbots. https://www.bloomberg.com/news/articles/2025-05-07/apple-working-to-move-to-ai-search-in-browser-amid-google-fallout [no paywall: https://archive.is/aTHJv]

11. Why the UAE has mandated AI learning in schools https://www.semafor.com/article/05/07/2025/why-the-uae-has-mandated-ai-learning-in-schools

12. China's capital city is making AI education mandatory, even for elementary schoolers https://www.businessinsider.com/china-beijing-ai-education-mandatory-classrooms-elementary-schoolers-2025-3 [no paywall: https://archive.is/ZQtv6]

13. Draft executive order outlines plan to integrate AI into K-12 schools https://www.washingtonpost.com/education/2025/04/22/ai-schools-executive-order-trump-draft/ [no paywall: https://archive.is/SmfSu]

14. AI is transforming education into a collaborative interaction between humans and machines https://blogs.lse.ac.uk/impactofsocialsciences/2025/05/07/ai-is-transforming-education-into-a-collaborative-interaction-between-humans-and-machines/

15. Everyone Is Cheating Their Way Through College: ChatGPT has unraveled the entire academic project. https://nymag.com/intelligencer/article/openai-chatgpt-ai-cheating-education-college-students-school.html [no paywall: https://archive.is/ofxzl]

16. After an Arizona man was shot, an AI video of him addresses his killer in court https://www.npr.org/2025/05/07/g-s1-64640/ai-impact-statement-murder-victim

17. Trump to Rescind Global Chip Curbs, Prep New AI Restrictions https://www.bloomberg.com/news/articles/2025-05-07/trump-to-rescind-global-chip-curbs-amid-ai-restrictions-debate [no paywall: https://archive.is/0jzbn]

18. DeepSeek and tariffs fail to undermine the AI investment boom (so far) https://fasterplease.substack.com/p/deepseek-and-tariffs-fail-to-undermine

19. OpenAI CEO Sam Altman (Senate committee): 'It is our belief that the American models, from our company OpenAI, Google and others are the best models in the world. It's very hard to say how far ahead we are, but I would say not a huge amount of time.' https://www.youtube.com/live/jOqTg1W_F5Q?si=ezCoBGEk1_AVIcY0

20. Don’t Bet the Future on Winning an AI Arms Race https://aiprospects.substack.com/p/dont-bet-the-future-on-winning-an

Miscellaneous

1. How the US built 5000 ships during WWII. https://www.construction-physics.com/p/how-the-us-built-5000-ships-in-wwii

2. A cold-water diving tradition on South Korea’s Jeju Island may have shaped the genetics of its population. https://www.nature.com/articles/d41586-025-01386-4 [no paywall: https://archive.is/0YUt7]

3. Google Research and ISTA are using light microscopes to "map" the brain. https://blog.google/technology/research/liconn-connectomics/
Reports suggest that India and Pakistan are now attacking each other's key military bases.

Footage allegedly of an Indian cruise missile hitting Pakistan’s Nur Khan Airbase.

The Airbase, just outside Pakistan’s capital, Islamabad, was targeted by multiple missiles this evening.

POV 33.6085901, 73.1056448
Got a biased coin? Flip it twice.

- If you get Heads → Tails, call that 0.

- If you get Tails → Heads, call that 1.

- Any matched pair (HH or TT) gets tossed and you try again.

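Because HT and TH are equally likely even for a skewed coin (each occurs with probability p(1 − p), provided the flips are independent and the bias p stays fixed), the surviving pairs form a perfectly fair bitstream. That is von Neumann’s elegant extractor in a nutshell.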
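
The procedure above can be sketched in a few lines of Python. It assumes flips are encoded as 1 for heads and 0 for tails, and that they are independent with a fixed bias. The function names and the 80% bias in the demo are illustrative choices.

```python
import random
from typing import Iterable, Iterator


def von_neumann_extract(flips: Iterable[int]) -> Iterator[int]:
    """Yield fair bits from biased flips (1 = heads, 0 = tails)."""
    it = iter(flips)
    for first in it:
        second = next(it, None)
        if second is None:
            return  # a single leftover flip cannot form a pair
        if first != second:
            # Heads then tails -> 0, tails then heads -> 1.
            yield 0 if first == 1 else 1
        # Matched pairs (HH or TT) are discarded.


def biased_flips(p_heads: float, n: int, rng: random.Random) -> Iterator[int]:
    """Generate n independent flips that land heads with probability p_heads."""
    for _ in range(n):
        yield 1 if rng.random() < p_heads else 0


# Demo with a coin that lands heads 80% of the time (illustrative value).
rng = random.Random(0)
bits = list(von_neumann_extract(biased_flips(0.8, 200_000, rng)))
print(len(bits), sum(bits) / len(bits))
```

The price of fairness is throughput: each pair survives with probability 2p(1 − p), so a heavily skewed coin needs many flips per output bit.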
There are constant-width curves other than a circle: https://www.youtube.com/watch?v=quuw4HC96bE