How to write an okay research paper.
https://x.com/srush_nlp/status/1825526786513379567?t=iTwEJRkOtw3rIX5y9uLXlA&s=19
Sasha Rush (@srush_nlp) on X:
New Video: How to write an okay research paper.
Reviewers all agree! @srush_nlp's papers are "reasonably structured" and "somewhat clear, despite other flaws".
https://t.co/nCjYsDI5Jf
I got many requests and questions about research and ML in the past few days, and today I want to put together a group to work on something. This could well be your first research project. To make the best of it, I'll take 5-6 people as core members, and in case we need more people we'll add some.
If you have any interesting ideas, or if you're just curious about AI research, come join us.
The target is to produce a cool piece of work and hopefully publish a paper.
I'll try to reply to every DM, and we'll see if you're a great match for this.
To Code, or Not To Code? Exploring Impact of Code in Pre-training
So apparently adding some code data to your pretraining mix improves reasoning and non-code tasks. I saw this in a NeurIPS 2023 work led by Niklas Muennighoff, and now this paper goes into it in depth. My only concern is that they train 64 models ranging from 470M to 2.8B parameters, so it's not clear whether the finding holds for larger models.
If you're running into issues with Amharic LLMs, try adding some Python code data and see if it improves things; see the sketch below. I'll update you once I have results.
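A minimal sketch of what "adding some code data" could look like with the Hugging Face datasets library. The file names and the 10% code ratio are placeholders I made up, not the paper's recipe:

```python
# Sketch: mix a small fraction of code into a text pretraining stream.
# Assumes `pip install datasets`; the data files are illustrative placeholders.
from datasets import load_dataset, interleave_datasets

text_ds = load_dataset("text", data_files="amharic_corpus.txt",
                       split="train", streaming=True)
code_ds = load_dataset("text", data_files="python_code.txt",
                       split="train", streaming=True)

# Sample ~90% natural language, ~10% code; the ratio is a knob to tune.
mixed = interleave_datasets(
    [text_ds, code_ds],
    probabilities=[0.9, 0.1],
    seed=42,
    stopping_strategy="all_exhausted",
)

# Peek at the first few examples of the mixed stream.
for i, example in enumerate(mixed):
    if i >= 5:
        break
    print(example["text"][:80])
```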
Programming is changing so fast... I'm trying VS Code Cursor + Sonnet 3.5 instead of GitHub Copilot again and I think it's now a net win. Just empirically, over the last few days most of my "programming" is now writing English (prompting and then reviewing and editing the generated diffs), and doing a bit of "half-coding" where you write the first chunk of the code you'd like, maybe comment it a bit so the LLM knows what the plan is, and then tab tab tab through completions. Sometimes you get a 100-line diff to your code that nails it, which could have taken 10+ minutes before.
I still don't think I got sufficiently used to all the features. It's a bit like learning to code all over again but I basically can't imagine going back to "unassisted" coding at this point, which was the only possibility just ~3 years ago.
Source: Karpathy
Forwarded from Frectonz
My nixpkgs PR got merged after 2 weeks. I packaged my Ethiopian calendar TUI app, mekuteriya, for Nix.
I'm officially a NixOS package maintainer now.
https://github.com/NixOS/nixpkgs/pull/333690
nix shell nixpkgs#mekuteriya
Forwarded from Chapi Dev Talks (Chapi M.)
I asked "how many r's in strawberry" in multiple AI models, and here are the results.
Only Gemini got it right.
Why AI can't spell 'strawberry'.
This is a TechCrunch blog post released just yesterday; several researchers give their views on why this happens.
From the blog: "As these memes about spelling 'strawberry' spill across the internet, OpenAI is working on a new AI product code-named Strawberry, which is supposed to be even more adept at reasoning..."
https://techcrunch.com/2024/08/27/why-ai-cant-spell-strawberry/
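The usual explanation is tokenization: models operate on subword tokens, not letters. A minimal sketch of the mismatch (tiktoken's "cl100k_base" is the GPT-4-era encoding; any BPE tokenizer shows the same effect):

```python
# Count letters directly vs. look at what a BPE tokenizer actually sees.
# Requires `pip install tiktoken`.
import tiktoken

word = "strawberry"
print(word.count("r"))  # 3 -- trivial at the character level

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode(word)
print(tokens)                             # a handful of token IDs
print([enc.decode([t]) for t in tokens])  # e.g. ['str', 'aw', 'berry']
# The model never sees individual letters, only token IDs, which is one
# reason letter-counting questions trip it up.
```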
Can Neural Networks Learn to Reason? by Samy Bengio
The successes of deep learning critically rely on the ability of neural networks to output meaningful predictions on unseen data โ generalization. Yet despite its criticality, there remain fundamental open questions on how neural networks generalize. How much do neural networks rely on memorization โ seeing highly similar training examples โ and how much are they capable of human-intelligence styled reasoning โ identifying abstract rules underlying the data?
https://youtu.be/lCSdC8b0MrY
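A toy illustration of that question, entirely my own construction and not from the talk: both a lookup-table "memorizer" and a linear model fit training data drawn from f(x) = 2x + 1 perfectly, but only the model that captures the rule generalizes beyond the training range:

```python
# Toy contrast: memorizing (x, y) pairs vs. learning the underlying rule.
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.integers(0, 50, size=200)   # inputs only from [0, 50)
y_train = 2 * x_train + 1                 # the hidden rule

# "Memorizer": a lookup table of seen pairs.
table = dict(zip(x_train.tolist(), y_train.tolist()))

# "Rule learner": fit y = a*x + b by least squares.
a, b = np.polyfit(x_train, y_train, deg=1)

seen = int(x_train[0])      # an input the memorizer has definitely seen
for x in [seen, 100]:       # 100 lies far outside the training range
    memo = table.get(x, "no idea")  # fails on anything unseen
    rule = round(a * x + b)         # extrapolates, because it found the rule
    print(f"x={x}: memorizer -> {memo}, rule -> {rule}, truth -> {2 * x + 1}")
```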
100M Token Context Windows
I don't like to hype AI, but this is some next-level stuff, and all of you devs are done.
From the blog: "While the commercial applications of these ultra-long context models are plenty, at Magic we are focused on the domain of software development. It's easy to imagine how much better code synthesis would be if models had all of your code, documentation, and libraries in context, including those not on the public internet."
https://magic.dev/blog/100m-token-context-windows
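For a sense of scale, here's a rough sketch (my own, not from Magic's post) of stuffing a whole repo into one prompt and estimating the token cost with the common ~4 characters/token rule of thumb:

```python
# Rough sketch: concatenate a repo into a single prompt string and
# estimate its token count (~4 chars/token is a heuristic, not exact).
import os

def repo_as_context(root: str, exts=(".py", ".md", ".txt")) -> str:
    parts = []
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    parts.append(f"### FILE: {path}\n{f.read()}")
    return "\n\n".join(parts)

context = repo_as_context(".")
print(f"~{len(context) // 4:,} tokens")
# A 100M-token window could hold roughly 400 MB of text this way --
# more than almost any single codebase plus its docs and dependencies.
```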
Forwarded from Samson Endale
Just to be clear, I want us (human civilization) to have AGI no matter the cost, BUT my issue is the overhyping.
I hope we will have strong AI one day, but until that day comes, I'm going to be skeptical about this trend.