Forwarded from Parallel Experiments (Linghao Zhang)
Thinking Machines finally broke their silence and published their first blog post, which was a great read 😎: https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
Thinking Machines Lab
Defeating Nondeterminism in LLM Inference
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results.…
Forwarded from Chell’s Red Pill
I honestly had no idea that Diet Coke was a brand aimed at women.
https://finalgirldigital.substack.com/p/the-diet-coke-essay
The girl loves to consume Diet Coke, but most of all she loves to be consumed whilst consuming Diet Coke. The image of pleasure with no calories. The idea of femininity with no body.
Substack
The Diet Coke Essay
I was at a restaurant and I asked the server for a Diet Coke, as usual.
Forwarded from 小破不入渠🌏
Loved this episode and watched it three times in a row. The standout line:
It occurred to me that our very long drive was in fact a very short period of time in the grand scheme of things. That’s what long-distance friendships are built on — burst of experience that we log as memories, until the next time we meet.
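Over the past few years I have watched far too many travel videos. No offense to anyone in particular, but I am thoroughly fed up with travel content that pitches a list of sights and restaurants in the voice of a TV shopping host. This video is the exact opposite.
Three months ago, after saying goodbye to _-san in Kyoto, I kept thinking about one thing: how to choreograph the travel of my next stage of life. That choreography is not just about where I want to go, and not just about getting past the vague, clichéd notions of being "on the road" or "going somewhere far away"; it is about actually weaving a travel experience that is textured and distinctive.
That is actually the hardest part. As I once put it myself: "When staying at a hot spring inn, beyond the scenery, decor, facilities, and food... what always matters more is whether a good conversation will happen there."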
https://www.youtube.com/watch?v=ASIuMBog7xI
YouTube
Travelling America With One Rule: Eat Only At Diners
THE DINER ROAD TRIP: I found myself in Seattle on tour, showing unfinished films up the west coast. Rather than fly to the remaining shows in San Fran and LA, I decided to drive. And as all great roadies need a great companion, I called Mike, a Bostonian…
Forwarded from Parallel Experiments (Linghao Zhang)
https://gregorygundersen.com/blog/2025/10/01/large-language-models/
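I have a feeling this will become required reading for LLM researchers: the author organizes decades of language-model research into a clear timeline, telling the story of how we arrived, step by step, at today's transformer-based LLMs. The approach is very much from first principles, and it uses consistent notation to tie together the key points of N different papers.
I really like this passage near the end of the post: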
> If you feel that it’s a bit perverse that next-word prediction is a sufficient objective to solve elite math problems, if this feels like a stochastic parrot outsmarting you, then you might feel some of the discomfort early linguists felt at statistical language modeling. This is the visceral feeling of the bitter lesson. Our specialized knowledge feels expendable and our intuitions about understanding seem irrelevant in the face of raw computation and speed.
Gregorygundersen
A History of Large Language Models
Forwarded from Reorx’s Forge
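When you take on a new task, especially right after a meeting or discussion, your brain is loaded with all the relevant context, much like a cache. If you feel clear about the task at that moment, you should start executing right away instead of adding it to a task list and scheduling it for some so-called "specific time".
That is because the clarity you feel right now comes from that abundant context, and the information decays quickly over time. You may have captured pointers to it in notes (such as a task summary or meeting minutes), but those are only a highly compressed index. "Decompressing" and expanding that index again takes time too. Much of our time is spent precisely on re-understanding these contextual cues.
So we should start implementing immediately, while the brain's grasp of the task is clear and the solution is on the tip of the tongue. This amounts to "dumping" the information the task requires and solidifying it into an actual result, which lightens the load on the brain.
In fact, getting the core skeleton of a task done is quick. If you feel short on time, even just writing some pseudocode, settling on function names and how they are called, or sketching the execution path by dictation (voice-typing a prompt to an AI) counts as a start.
It also makes sense in terms of increasing entropy. If you postpone execution, the task's "entropy" keeps rising. The time and effort needed to bring that entropy down later amount to doing it all over again. But once the task has been started, the "entropy" it has to shed goes down. The next time you pick it up, less data needs to be loaded into the brain's "memory", because the task has become organized and can be loaded on demand. It is like a game: the initial state loads the entire world map, but once the framework is in place and the structure is clear, next time you only need to load one particular level, so it naturally takes less "memory".
So when you are clear about something, do not hesitate; do it immediately. Do not defer it, do not procrastinate. This is (perhaps) the one thing you cannot put off. You can put off other things, and those delays may (by comparison) cost nothing. But once you know how something should be done, every second of delay comes at a price, perhaps double the time.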
so do it, do it immediately when you clearly know what to do
Forwarded from Xuanwo's Tweets (Xuanwo)
Forwarded from Parallel Experiments (Linghao Zhang)
linghao.io
Why You Should Probably Work on AI Engineering - Linghao Zhang
An exploration of why traditional software engineers should embrace AI engineering as the next great layer of complexity management. This post covers the shift from deterministic logic to non-deterministic systems, the importance of system engineering to…
https://www.workingtheorys.com/p/make-something-heavy
Beautifully written. Anxiety bubbling out - the good kind, pushing through the porous, flaky wall of complacency.
(Credit: https://t.me/peopleofscreen)
Workingtheorys
Make Something Heavy
We're creating more than ever, but it weighs nothing.
Forwarded from Effer für Wissenschaft
My desire to be well informed is currently at odds with my desire to maintain sanity.
Well said.
Forwarded from 糊锅
The Slow Death of the Power User
https://fireborn.mataroa.blog/blog/the-slow-death-of-the-power-user/
Forwarded from Reorx’s Forge
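No need to hate 🦞, is there? Every past hype cycle has gone the same way: a flood of information that makes you want to throw up. At bottom, it is everyone who wants to profit from social media traffic using distorted, unscrupulous means to control and plunder the public's attention. This is practically an objective phenomenon by now, and we are powerless to change it; what we can do is build up our own resistance and our means of filtering information.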
The People Who Shun Super-Popular Pop Culture
https://www.theatlantic.com/culture/2026/03/pop-culture-hype-aversion/686312/?gift=Ey2wl6m0ZiBxjnVqvqVwkJqNllnUD9wBMCD13etcjL4&utm_source=copy-link&utm_medium=social&utm_campaign=share
The Atlantic
The People Who Shun Super-Popular Pop Culture
“The Pitt,” “Severance,” “Sinners,” you name it: For some reason, the more popular something is, the more likely I am to resist it.