Anders Sandberg

https://x.com/AnthropicAI/status/1991952400899559889?s=19

Interesting result: when a model slides into reward hacking, emergent misalignment also results, but it can be reduced by telling the model that reward hacking is OK.

X (formerly Twitter)

Anthropic (@AnthropicAI) on X

New Anthropic research: Natural emergent misalignment from reward hacking in production RL.

“Reward hacking” is where models learn to cheat on tasks they’re given during training.

Our new study finds that the consequences of reward hacking, if unmitigated…

5 views22:50

Anders Sandberg

(I feel almost sad that paper mill slop will no longer produce hilarious baroque nonsense like the Well Endowed Mouse. Unless we start asking for that style... I think Nature would be cooler this way!)

8 views07:08

Anders Sandberg

OK, color me officially impressed: Nano Banana Pro can make good diagrams based on papers. This one can go straight into my presentations.

7 views07:09

Anders Sandberg

#AICopeBubble: @BoingBoing is crowing about how clocks.brianmoore.com shows how bad AI is and how far away the singularity is. I open a browser tab and get a perfectly good clock from a current model (some debate about which timezone I am in and somewhat bad number placement).

5 views08:02

Anders Sandberg

I am involved with a fair number of non-profits, but ALLFED stands out by doing something unique to safeguard the world: food security in response to global catastrophic risks. Right now, all gifts are matched 1:1 up to $20K, which is much needed! allfed.info/donate

7 views11:25

Anders Sandberg

#SquircleSaturday I recently came across a post stating that "pi is optimal", and it got me to explore circumferences of the relatives of circles in different distance norms.
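A minimal sketch of that kind of exploration, assuming the "relatives of circles" are the unit circles |x|^p + |y|^p = 1 of the p-norm and that the circumference is measured in that same norm. The function name `pi_p`, the angle parametrization and the sampling resolution are illustrative choices, not taken from the post. It approximates the circumference-to-diameter ratio with a polyline, so the values come out very slightly too low.

```python
import math

def pi_p(p: float, n: int = 20000) -> float:
    """Approximate circumference/diameter of the unit p-circle, measured in the p-norm (p >= 1)."""
    # Sample the first-quadrant arc of |x|^p + |y|^p = 1.
    # With x = cos(t)^(2/p), y = sin(t)^(2/p) we get x^p + y^p = 1.
    pts = []
    for i in range(n + 1):
        t = (math.pi / 2) * i / n
        pts.append((abs(math.cos(t)) ** (2 / p), math.sin(t) ** (2 / p)))

    # Sum the p-norm lengths of the polyline segments.
    quarter = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        quarter += (abs(x1 - x0) ** p + abs(y1 - y0) ** p) ** (1 / p)

    # Full circumference is 4 * quarter; the diameter in the p-norm is 2.
    return 4 * quarter / 2

for p in (1, 1.5, 2, 3, 4):
    print(p, round(pi_p(p), 4))
```

For p = 2 this recovers the ordinary pi, for p = 1 (the diamond) it gives 4, and the other exponents tried here all come out larger than pi, which is the sense in which "pi is optimal".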

6 views07:46

Anders Sandberg

What is the efficiency of your energy conversion, @DanielleFong? If I remember right from when we last chatted, it was pretty impressive. (And it makes me wonder where the theoretical optimum lies if we get to choose any element freely.)

6 views21:23

Anders Sandberg

https://x.com/rechelon/status/1996340995471163837?s=19

I might not be an anarcho-transhumanist, but I hope to share a future with them.

X (formerly Twitter)

go to the elephant site @rechelon@mastodon.social (@rechelon) on X

Athens

5 views17:42

Anders Sandberg

https://x.com/andy_l_jones/status/1998060552565002721?s=19

I think many will misinterpret this thread as being mostly about the horse/human analogy, rather than the core insight (that I also often like to bang on about): gradual change in input can produce surprisingly fast change in output properties.

X (formerly Twitter)

andy jones (@andy_l_jones) on X

So after all these hours talking about AI, in these last five minutes I am going to talk about:

Horses.

Engines, steam engines, were invented in 1700.

And what followed was 200 years of steady improvement, with engines getting 20% better a decade.

For…

6 views13:21

Anders Sandberg

https://www.nature.com/articles/s41592-025-02906-w

Cool, EM connectomics was named "method of the year" in Nature Methods: nature.com/articles/s4159…

Nature

Method of the Year: EM connectomics

Nature Methods - Using electron microscopy, scientists mapped a Caenorhabditis elegans nervous system and Drosophila brain at single-neuron resolution. Connectomics work on bigger brains takes new...

8 views18:29

Anders Sandberg

I wonder, @DanielleFong, if you could use your system to pump a sodium laser?

6 views08:18

Anders Sandberg

I wonder if the plethora of LLM-induced crackpottery and sloppy paper writing in science over the next year will act as an inhibitor for the rising AI-supported science by making it socially embarrassing.

4 views13:06

Anders Sandberg

Merry solstice!

4 views14:58

Anders Sandberg

https://youtu.be/ZRB7pjRVVkI?si=5zq6Ks3lDofrKSCd

Interesting dive into how an important complex technological system works when stuff fails. Centralization (which may be needed for quality) is always in tension with robustness.

YouTube

NIST's NTP clock was microseconds from disaster

...but most people would never know.

The two posts referenced in this video:

- Primary time scale failure: https://groups.google.com/a/list.nist.gov/g/internet-time-service/c/o0dDDcr1a8I
- Update on Internet Time Services: https://groups.google.com…

7 views07:38

Anders Sandberg

https://suno.com/embed/9df06eb7-379a-4bd2-bc68-efd1e530c255

I did not expect to spend Christmas evening making and enjoying AI music about physics together with my father-in-law.

Suno

THE GRAND CANONICAL VOYAGE: A Thermodynamic Sea Shanty

Listen and make your own on Suno.

7 views08:07

Anders Sandberg

Oral exams are great for checking student understanding, expensive in time for teachers, and hard to cheat on. @ipeirotis demonstrates how AI could scale this: behind-the-enemy-lines.com/2025/12/fighti…

Behind-The-Enemy-Lines

A Computer Scientist in a Business School

Random thoughts of a computer scientist who is working behind the enemy lines; and lately turned into a double agent.

5 views11:38

Anders Sandberg

https://t.co/85P1OH6Tvh

I really liked this poem by @gwern:

4 views08:49

Anders Sandberg

https://x.com/i/status/2010484027833143662

This is a perennial problem that needs better solutions. It is a mix of incentives, problematic data formats and institutional processing rules, and the instability of the information ecosystem.

X (formerly Twitter)

John B. Holbein (@JohnHolbein1) on X

“Among articles stating that data was available upon request, only 17% shared data upon request.”

2 views17:02
