Vol Building AGI

🌚2

77 views16:23

When making matplotlib figures, it's important to match the font to the rest of the paper: https://x.com/giffmana/status/1632506730897653761
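A minimal sketch of one way to do this, assuming the paper is set in a Times-like serif at 10 pt. The font names, sizes, and the 3.25 in column width are placeholders, not values from the linked thread; swap in whatever your paper template uses.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Assumed values: replace with the fonts and sizes of your paper template.
PAPER_STYLE = {
    "font.family": "serif",
    "font.serif": ["Times New Roman", "Times", "DejaVu Serif"],  # fallback chain
    "mathtext.fontset": "stix",  # Times-like glyphs for math text
    "font.size": 10,             # match the body text size
    "axes.labelsize": 10,
    "legend.fontsize": 8,
    "xtick.labelsize": 8,
    "ytick.labelsize": 8,
    "pdf.fonttype": 42,          # embed TrueType fonts instead of Type 3
}
plt.rcParams.update(PAPER_STYLE)

# Size the figure to the column width so text is not rescaled on inclusion.
fig, ax = plt.subplots(figsize=(3.25, 2.2))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label=r"$y = x^2$")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```

Setting the figure size to the final column width matters as much as the font family: if LaTeX rescales the figure, the 10 pt labels no longer match the body text.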

👍1

78 views10:12

Vol Building AGI

I had the chance to talk with robotics + computer vision researchers. Their development focus is always on real-time computation: maps and terrain models are built in real time. Maybe it makes sense to think about training multimodal dialogue systems the same way? Working with chatbots trained on random texts collected by who knows whom, who knows when, is not as interesting.

👍3

92 views12:43

Vol Building AGI

https://www.youtube.com/watch?v=1t7AWa4SMlo

I got excited and built a fast weight programmer into the script demo I shared before. It uses a read-forget-update loop to remember patterns coming from the microphone.

Check it out — it learns music and speech on the fly.
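For readers unfamiliar with the term, here is a minimal sketch of a read-forget-update step on a fast weight matrix, written as the delta rule. It is an assumption about what such a loop can look like, not the code from the demo; `read`, `fast_weight_step`, `beta`, and the toy keys and values are illustrative names.

```python
def read(W, k):
    """Read: retrieve the value currently associated with key k."""
    return [sum(w * x for w, x in zip(row, k)) for row in W]


def fast_weight_step(W, k, v, beta=1.0):
    """One read-forget-update step on fast weights W (modified in place).

    read:   v_old = W k
    forget: remove beta * v_old k^T
    update: add    beta * v     k^T
    The forget and update terms are merged into one delta below.
    """
    v_old = read(W, k)
    for i, row in enumerate(W):
        delta = beta * (v[i] - v_old[i])
        for j in range(len(row)):
            row[j] += delta * k[j]
    return W


# Toy demo with orthonormal keys (an assumption that makes recall exact).
W = [[0.0] * 3 for _ in range(2)]          # 2-dim values, 3-dim keys
k1, k2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
fast_weight_step(W, k1, [0.5, -1.0])       # store k1 -> v1
fast_weight_step(W, k2, [3.0, 4.0])        # store k2 -> v2
fast_weight_step(W, k1, [2.0, 2.0])        # overwrite k1 with a new value
recalled_k1 = read(W, k1)
recalled_k2 = read(W, k2)
```

With unit-norm keys and `beta=1.0`, the old association for a key is fully replaced while other keys are left untouched; a smaller `beta` blends old and new values instead.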

🔥1

97 viewsedited 10:35

Vol Building AGI

How 32 GPU threads cooperate in storing a single 16x16 tile for maximal memory coalescing. This is a swizzled layout from ThunderKittens. The labels on the right image should be transposed for the row-major format.

92 viewsedited 14:05

Vol Building AGI

The format above is required for using the mma.m16n8k16 instruction in PTX 8.5: https://docs.nvidia.com/cuda/parallel-thread-execution/#warp-level-matrix-fragment-mma-16816-float
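To make the thread-to-element mapping concrete, the sketch below computes which (row, col) entries of the 16x16 A tile each of the 32 lanes holds for `mma.m16n8k16` with 16-bit floats. The formulas are a transcription of the linked PTX section as understood here and should be checked against it; `a_fragment_coords` is a hypothetical helper name. The shared-memory swizzling mentioned above is a separate concern and is not modeled.

```python
def a_fragment_coords(lane, i):
    """Return (row, col) of element a_i (0 <= i < 8) held by a lane (0 <= lane < 32).

    Assumed layout for the 16x16 A operand with 16-bit elements:
    lanes are split into 8 groups of 4 threads each.
    """
    group_id = lane >> 2          # 0..7, selects the row within an 8-row half
    tid_in_group = lane % 4       # 0..3, selects a pair of adjacent columns
    row = group_id if (i < 2 or 4 <= i < 6) else group_id + 8
    col = tid_in_group * 2 + (i & 1) + (8 if i >= 4 else 0)
    return row, col


# Build the full ownership map: which lane and register slot holds each element.
owner = {}
for lane in range(32):
    for i in range(8):
        owner.setdefault(a_fragment_coords(lane, i), []).append((lane, i))

# Every element of the 16x16 tile should be owned by exactly one (lane, slot).
covers_tile_once = (
    len(owner) == 256 and all(len(slots) == 1 for slots in owner.values())
)
```

Each lane ends up with four pairs of horizontally adjacent elements, one pair per 8x8 quadrant of the tile, which is why the registers are packed two 16-bit values at a time.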

89 views21:10

Vol Building AGI

Turing-complete programs in transformers, by construction, give length generalization out of the box:

https://arxiv.org/abs/2407.03310

arXiv.org

Universal Length Generalization with Turing Programs

Length generalization refers to the ability to extrapolate from short training sequences to long test sequences and is a challenge for current large language models. While prior work has proposed...

89 viewsedited 14:35

Vol Building AGI

Cool example of deep learning research going in circles: the short conv1d was reintroduced with QRNN, reintroduced to linear transformers in H3, and made mainstream in Mamba. This convolution makes the network learn faster in the beginning, improves recall, and allows a single-layer linear RNN to solve associative retrieval. I spent about a month figuring that property out.
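A minimal sketch of what the short convolution computes, assuming a depthwise causal filter with a few taps per channel. `short_causal_conv` and the tap values are illustrative and not taken from any of the papers mentioned.

```python
def short_causal_conv(x, kernels):
    """Depthwise causal 1D convolution.

    x:       list of T time steps, each a list of D channel values
    kernels: D lists of K taps; tap j multiplies the input from j steps ago
    Positions before the start of the sequence are treated as zero.
    """
    out = []
    for t in range(len(x)):
        row = []
        for d, taps in enumerate(kernels):
            acc = 0.0
            for j, w in enumerate(taps):
                if t - j >= 0:               # causal: never look at the future
                    acc += w * x[t - j][d]
            row.append(acc)
        out.append(row)
    return out


# One channel, two taps: weight 0 on the current step, 1 on the previous one.
shifted = short_causal_conv([[1.0], [2.0], [3.0]], [[0.0, 1.0]])

# Two taps that mix the current and previous step equally.
mixed = short_causal_conv([[1.0], [2.0], [3.0]], [[0.5, 0.5]])
```

With taps `[0, 1]` the layer just shifts the sequence by one step, so each position carries its predecessor. One intuition, offered here as an assumption rather than a claim from the papers, is that this lets a key and the value that follows it be bound together before the recurrent update, which is what associative retrieval needs.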

This convolution block has made its way into Noam Shazeer's latest transformer variant, and people are now writing papers making statements about its expressive power.

The conclusion the authors make:

> An important parameterization to explore is replacing the short convolutions within CAT with SSMs

We've been there! Next thing you know you get Conv-SSM-Attention, and that is called Recurrent Gemma 😄

https://arxiv.org/abs/2407.05591

arXiv.org

On the Power of Convolution Augmented Transformer

The transformer architecture has catalyzed revolutionary advances in language modeling. However, recent architectural recipes, such as state-space models, have bridged the performance gap....

👍2

105 views15:39

Vol Building AGI

Compiler Explorer supports CUDA C++. See how nvcc emits SASS for thread barriers: https://godbolt.org/z/613zseoGW

godbolt.org

Compiler Explorer - CUDA C++ (NVCC 12.4.1)

#include <cooperative_groups.h>
#include <cuda/barrier>
using barrier = cuda::barrier<cuda::thread_scope_block>;

__global__ void square(int* array, int n) {
    __shared__ barrier bar;
    auto block = cooperative_groups::this_thread_block();

    if (…

96 viewsedited 12:15

Vol Building AGI

Hello from ICML. I am at the tutorial on Data Attribution at Scale. We are studying how to relate model outputs to training inputs. Here are the notes:

https://ml-data-tutorial.org/

ml-data-tutorial.org

Data Attribution at Scale | ICML 2024

Notes accompanying our ICML tutorial

👍4

103 views07:41

Vol Building AGI

The next tutorial, by Zeyuan Allen-Zhu, is on Physics of Language Models. We study how to apply the scientific method to language models, with examples that include curation of synthetic data, mechanistic probing, and more.

Tutorial website: https://physics.allen-zhu.com/

Announcement: https://x.com/zeyuanallenzhu/status/1813150298363601102

Allen-Zhu

Physics of Language Models

The concept of Physics of Language Models was jointly conceived and designed by ZA and Xiaoli Xu.

🔥2

140 views13:55

Vol Building AGI

Use a world model to interpolate between two learning algorithms

117 views10:25

Vol Building AGI

Nando de Freitas giving love to OpenAI https://x.com/nandodf/status/1816530449830936805

X (formerly Twitter)

Nando de Freitas (@NandoDF) on X

@OpenAI has been the most inspiring and most impactful AI organisation in the history of humankind. I say this as someone who’s competed with them for nearly a decade. They made my life far more interesting than I could have dreamt of.

The people at @OpenAI…

114 viewsedited 17:53

Vol Building AGI

The tutorial video is now up: https://youtu.be/yBL7J0kgldU?si=II4_C2fCaUQnkNyS

YouTube

ICML 2024 Tutorial: Physics of Language Models

Project page (with further readings): https://physics.allen-zhu.com/

Abstract: We divide "intelligence" into multiple dimensions (like language structures, knowledge, reasoning, etc.). For each dimension, we create synthetic data for LLM pretraining to understand…

🤯1

373 views08:05

Vol Building AGI

Yurii is moving the locomotive

🥰1

114 views10:04

Vol Building AGI

https://x.com/HumansNoContext/status/1821925436512665639

How to tune hyperparameters. Notice how smoothly the behavior changes when the speed is varied.

X (formerly Twitter)

NO CONTEXT HUMANS (@HumansNoContext) on X

Honestly this was unexpectedly fun to watch

😁1

128 viewsedited 15:20

Vol Building AGI

Do better science by watching the computer do it with you:

https://github.com/SakanaAI/AI-Scientist

GitHub

GitHub - SakanaAI/AI-Scientist: The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬 - SakanaAI/AI-Scientist

👍1

125 views10:11