Ali's Notes

This is one of the coolest papers put out by Jurafsky and Yann LeCun, both familiar names.
In it, they try to pin down the differences between LLMs and the human language system.
And they got some cool results.

Language models compress concepts aggressively, so aggressively that their concepts end up diverging from the ones we humans hold.

The idea is that these models essentially compress the huge amount of data we feed them, so the information takes up less space, and at generation time this compressed information gets decoded.

Our brains work the same way. For example, you might read a 1,000-page book and afterwards only a summary stays in your mind; later, when you retell it, you can convey the essence of the story while naturally introducing some variation.


As the mental scaffolding of human cognition, concepts enable efficient interpretation, generalization
from sparse data, and rich communication. For LLMs to transcend surface-level mimicry and achieve
more human-like understanding, it is critical to investigate how their internal representations navigate
the crucial trade-off between information compression and the preservation of semantic meaning. Do
LLMs develop conceptual structures mirroring the efficiency and richness of human thought, or do
they employ fundamentally different representational strategies?

Definitely read this paper :)

🔗

https://arxiv.org/pdf/2505.17117v2

@css_nlp

5.79K views, 09:03

Ali's Notes

An interesting tweet from Andrej Karpathy about LLMs and code generation:

https://x.com/karpathy/status/1930305209747812559

You could see it as there being two modes in creation. Borrowing GAN terminology:
1) generation and
2) discrimination.
e.g. painting - you make a brush stroke (1) and then you look for a while to see if you improved the painting (2). these two stages are interspersed in pretty much all creative work.

Second point. Discrimination can be computationally very hard.
- images are by far the easiest. e.g. image generator teams can create giant grids of results to decide if one image is better than the other. thank you to the giant GPU in your brain built for processing images very fast.
- text is much harder. it is skimmable, but you have to read, it is semantic, discrete and precise so you also have to reason (esp in e.g. code).
- audio is maybe even harder still imo, because it forces a time axis so it's not even skimmable. you're forced to spend serial compute and can't parallelize it at all.

You could say that in coding LLMs have collapsed (1) to ~instant, but have done very little to address (2). A person still has to stare at the results and discriminate if they are good. This is my major criticism of LLM coding in that they casually spit out *way* too much code per query at arbitrary complexity, pretending there is no stage 2. Getting that much code is bad and scary. Instead, the LLM has to actively work with you to break down problems into little incremental steps, each more easily verifiable. It has to anticipate the computational work of (2) and reduce it as much as possible. It has to really care.

This leads me to probably the biggest misunderstanding non-coders have about coding. They think that coding is about writing the code (1). It's not. It's about staring at the code (2). Loading it all into your working memory. Pacing back and forth. Thinking through all the edge cases. If you catch me at a random point while I'm "programming", I'm probably just staring at the screen and, if interrupted, really mad because it is so computationally strenuous. If we only get much faster 1, but we don't also reduce 2 (which is most of the time!), then clearly the overall speed of coding won't improve (see Amdahl's law).
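
The Amdahl's law point can be made concrete with a small sketch. The 20/80 split between generation (1) and discrimination (2) used below is an assumed illustration, not a number from the tweet, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only `accelerated_fraction` of the work runs `factor` times faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Amdahl's law: the untouched part (1 - p) still takes its full time.
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


# Assumption for illustration: 20% of coding time is writing (1), 80% is reviewing (2).
generation_share = 0.2

for factor in (2, 10, 1000):
    overall = amdahl_speedup(generation_share, factor)
    print(f"generation {factor}x faster -> overall {overall:.2f}x")

# Ceiling when generation becomes instant: 1 / (1 - p).
print(f"ceiling: {1 / (1 - generation_share):.2f}x")
```

Under that assumed split, even instant generation caps the overall speedup at 1.25x, which is why reducing the cost of stage (2) matters more.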

@css_nlp

6.12K views, edited 12:02

Ali's Notes

A good package for fixing the JSON output of language models:

🔗

https://github.com/mangiucugna/json_repair
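
To give a feel for the kind of damage such a tool deals with, here is a minimal standard-library sketch. It is not json_repair's implementation or API; `naive_repair_json` is a hypothetical helper that only handles two common cases (a Markdown code fence around the payload, and trailing commas), and the sample payload is made up. See the repository README for the package's actual usage.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built this way to avoid a literal fence here


def naive_repair_json(text: str):
    """Hypothetical helper: try strict parsing first, then apply two naive fixes."""
    try:
        return json.loads(text)  # valid JSON is returned untouched
    except json.JSONDecodeError:
        pass

    cleaned = text.strip()
    # Fix 1: drop a Markdown code fence (with optional language tag) around the payload.
    if cleaned.startswith(FENCE):
        cleaned = cleaned[len(FENCE):]
        first_newline = cleaned.find("\n")
        if first_newline != -1:
            cleaned = cleaned[first_newline + 1:]
        if cleaned.rstrip().endswith(FENCE):
            cleaned = cleaned.rstrip()[: -len(FENCE)]
    # Fix 2: remove trailing commas before a closing brace or bracket.
    # Naive: this would also alter such sequences inside string values.
    cleaned = re.sub(r",\s*([}\]])", r"\1", cleaned)
    return json.loads(cleaned)  # still raises JSONDecodeError if unrecoverable


# Made-up example of typical LLM output: fenced, with trailing commas.
raw = FENCE + 'json\n{"model": "demo", "tags": ["nlp", "llm",],}\n' + FENCE
print(naive_repair_json(raw))
```

A real repair library covers far more cases (unquoted keys, truncated output, stray text around the object), so this sketch is only meant to show the idea.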

@css_nlp

GitHub - mangiucugna/json_repair: A python module to repair invalid JSON from LLMs

7.05K views, 08:49

Ali's Notes

🔹 Open Source LLM Engineering Platform 🔹

🔗

https://langfuse.com/

@css_nlp

Langfuse

Traces, evals, prompt management and metrics to debug and improve your LLM application. Integrates with Langchain, OpenAI, LlamaIndex, LiteLLM, and more.

5.44K views, edited 14:24

Ali's Notes

🚨 Paper Alert

🔹 Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate 🔹

🔗

https://arxiv.org/pdf/2501.17703

@css_nlp

4.85K views, 10:58

Ali's Notes

🔹 Diagram Generator 🔹

A tool for generating diagrams using LLM, with a Python backend API and React frontend.

🙂 Disclaimer
This project was purely an experiment, it is entirely generated with guidance from myself.
Otherwise known as "vibe coding".
Would I recommend it? No, in fact don't even credit me if you use this code.
I do however believe that as engineers we need to explore options we disagree with in order to widen our perspective. I did in fact learn a lot about how LLMs "think" during this project and am glad to have done it regardless of how frustrating it was.

🔗

https://github.com/minimalefforttech/diagram_generator

@css_nlp

GitHub - minimalefforttech/diagram_generator: LLM Powered Diagram Generator

4.93K views, edited 09:57

Ali's Notes

🔹

Build images with images