06:11:46
[Reply]
@JohnTinsman SpaceX has not committed to leasing Colossus for years, although it’s possible that may be what happens.
This is a 180 day lease with 90 day notice mutual cancellation thereafter. The short term was our request, not Anthropic’s.
We won’t leave them hanging and will provide a…
06:27:07
[Tweet]
SpaceX has almost finished writing V1.0 of an in-house AI training stack in C that exact-maps to 220k GB300s with 800G NICs, making heavy use of pipeline parallelism and getting as close to bare metal as possible.
The potential speed improvement vs JAX for large training runs is…
07:54:04
[ReTweet]
RT @beffjezos: The rarest object type in the universe isn't black holes. It's us. Conscious matter. The flame of life.
We have a duty to e…
08:22:00
[ReTweet]
RT @cremieuxrecueil: For the first year on record, wind and solar produced more electricity than coal in the U.S. https://t.co/SiXbr0Mn6G
15:12:48
[Reply]
@BasilTheGreat Not even close to good enough. Those officers committed a serious crime.
16:40:02
[Reply]
Next will be writing the inference stack in C for simultaneous high-speed RL across a large block of GB300s.
(We do use a little C++ tbh, but not much)
17:39:24
[Reply]
@eastdakota Yes.
It’s not that we’ve discovered some magic bullet, but rather that JAX, or at least the open source version of it, is mostly optimized for small to medium-sized training runs on Google TPUs, whereas we need to do massive training runs on Nvidia GPUs.
Pipeline parallelism is…
19:41:39
[ReTweet]
RT @SwipeWright: ANNOUNCEMENT: WE’RE SAVING SCIENCE!
We’re often told that science is “self-correcting.”
But that’s not really true.
Sci…