Offshore
Benj Edwards
RT @bagelpriest: @MikeJMika I recently found my Atari water from E3 1999, I think. https://t.co/Rtg03etztf
Dripped Out Technology Brothers
Cyan Banister (engineer turned legendary angel investor + invested early in SpaceX, Uber, Postmates, DeepMind, Flexport, Affirm, Carta, Niantic and Thumbtack) https://t.co/Ehk2XlzThy
Daily AI Papers
Thinking Like Transformers
https://t.co/yiQLCgITST
Recurrent neural networks have direct parallels in finite state machines, but Transformers have no such familiar parallel. We propose a computational model for the transformer-encoder in the form of a programming language. We show how RASP can...
🧵 👇 https://t.co/K7UUR9AYEh
Lior⚡
RT @omarsar0: How much can you get out of training a language model on a single consumer GPU in one day?
Results attained in constrained setting: decent downstream performance on GLUE. Performance closely follows scaling laws observed in large-compute settings.
https://t.co/5gtWz8uF1i https://t.co/Wj32eoePC0 https://t.co/tKPIauIACS