https://x.com/AnthropicAI/status/1991952400899559889?s=19
Interesting result: when a model slides into reward hacking, emergent misalignment also results, but it can be reduced by telling the model that reward hacking is OK.
X (formerly Twitter)
Anthropic (@AnthropicAI) on X
New Anthropic research: Natural emergent misalignment from reward hacking in production RL.
“Reward hacking” is where models learn to cheat on tasks they’re given during training.
Our new study finds that the consequences of reward hacking, if unmitigated…
#AICopeBubble: @BoingBoing is crowing about how clocks.brianmoore.com shows how bad AI is and how far away the singularity is. I open a browser tab and get a perfectly good clock from a current model (with some debate about which timezone I am in, and somewhat bad number placement).
I am involved with a fair number of non-profits, but ALLFED stands out by doing something unique to safeguard the world: food security in response to global catastrophic risks. Right now, all gifts are matched 1:1 up to $20K, which is much needed! allfed.info/donate
#SquircleSaturday I recently came across a post stating that "pi is optimal", and it got me to explore circumferences of the relatives of circles in different distance norms.
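A minimal numerical sketch of that exploration, assuming the "relatives of circles" are the unit circles |x|^p + |y|^p = 1 and that the circumference is measured in the same p-norm. The function name `pi_p` and the step count `n` are my own illustrative choices, not from the post. For p = 2 this gives the familiar π, and for p = 1 (the diamond) it gives 4.

```python
import math

def pi_p(p: float, n: int = 20000) -> float:
    """Circumference / diameter of the unit p-circle, measured in the p-norm.

    Approximates one quarter of the curve by a polyline and sums the
    p-norm lengths of its segments (hypothetical helper, illustration only).
    """
    def point(i: int) -> tuple[float, float]:
        # x = cos(t)^(2/p), y = sin(t)^(2/p) satisfies x^p + y^p = 1
        theta = (math.pi / 2) * i / n
        return (abs(math.cos(theta)) ** (2 / p),
                abs(math.sin(theta)) ** (2 / p))

    quarter = 0.0
    x0, y0 = point(0)
    for i in range(1, n + 1):
        x1, y1 = point(i)
        # p-norm length of the segment between consecutive points
        quarter += (abs(x1 - x0) ** p + abs(y1 - y0) ** p) ** (1 / p)
        x0, y0 = x1, y1
    # Four quarters make the circumference; the diameter is 2.
    return 4 * quarter / 2

for p in (1, 1.5, 2, 3, 10):
    print(f"p = {p}: pi_p = {pi_p(p):.4f}")
```

Running it for a range of p values lets you see where the ratio is smallest, which is the sense in which "pi is optimal" can be checked.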
What is the efficiency of your energy conversion, @DanielleFong? If I remember right from when we last chatted, it was pretty impressive. (And it makes me wonder where the theoretical optimum lies if we get to choose any element freely.)
https://x.com/rechelon/status/1996340995471163837?s=19
I might not be an anarcho-transhumanist, but I hope to share a future with them.
X (formerly Twitter)
go to the elephant site @rechelon@mastodon.social (@rechelon) on X
Athens
https://x.com/andy_l_jones/status/1998060552565002721?s=19
I think many will misinterpret this thread as being mostly about the horse/human analogy, rather than the core insight (that I also often like to bang on about): gradual change in input can produce surprisingly fast change in output properties.
X (formerly Twitter)
andy jones (@andy_l_jones) on X
So after all these hours talking about AI, in these last five minutes I am going to talk about:
Horses.
Engines, steam engines, were invented in 1700.
And what followed was 200 years of steady improvement, with engines getting 20% better a decade.
For…
https://www.nature.com/articles/s41592-025-02906-w
Cool, EM connectomics was named "method of the year" in Nature Methods: nature.com/articles/s4159…
Nature
Method of the Year: EM connectomics
Nature Methods - Using electron microscopy, scientists mapped a Caenorhabditis elegans nervous system and Drosophila brain at single-neuron resolution. Connectomics work on bigger brains takes new...
I wonder, @DanielleFong, if you could use your system to pump a sodium laser?
I wonder if the plethora of LLM-induced crackpottery and sloppy paper writing in science over the next year will act as an inhibitor for the rising AI-supported science by making it socially embarrassing.
https://youtu.be/ZRB7pjRVVkI?si=5zq6Ks3lDofrKSCd
Interesting dive into how an important complex technological system works when stuff fails. Centralization (which may be needed for quality) is always in tension with robustness.
YouTube
NIST's NTP clock was microseconds from disaster
...but most people would never know.
The two posts referenced in this video:
- Primary time scale failure: https://groups.google.com/a/list.nist.gov/g/internet-time-service/c/o0dDDcr1a8I
- Update on Internet Time Services: https://groups.google.com…
https://suno.com/embed/9df06eb7-379a-4bd2-bc68-efd1e530c255
I did not expect to spend Christmas evening making and enjoying AI music about physics together with my father-in-law.
Suno
THE GRAND CANONICAL VOYAGE: A Thermodynamic Sea Shanty
Listen and make your own on Suno.
Oral exams are great for checking student understanding, expensive in time for teachers, and hard to cheat on. @ipeirotis demonstrates how AI could scale this: behind-the-enemy-lines.com/2025/12/fighti…
Behind-The-Enemy-Lines
A Computer Scientist in a Business School
Random thoughts of a computer scientist who is working behind the enemy lines; and lately turned into a double agent.
https://x.com/i/status/2010484027833143662
This is a perennial problem that needs better solutions. It is a mix of incentives, problematic data formats and institutional processing rules, and the instability of the information ecosystem.
X (formerly Twitter)
John B. Holbein (@JohnHolbein1) on X
“Among articles stating that data was available upon request, only 17% shared data upon request.”