I work on a lot of NLP projects and it's starting to feel like I do more prompting than actual coding.
LLMs might completely change the way we code as they become more integrated into our tools and workflows. I'm still doing a lot of manual stitching with tools like ChatGPT and Copilot, but I expect things to become more seamless as these tools add capabilities like function calling.
H2O LLM Studio
A framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs).
Documentation: https://h2oai.github.io/h2o-llmstudio/
https://github.com/h2oai/h2o-llmstudio
The source code for NVIDIA's BundleSDF library has been released.
BundleSDF is a neural-network-based 6-DoF tracking and 3D reconstruction library for unknown objects. We would love to see this running on a ROS robot with MoveIt!
Source: https://github.com/NVlabs/BundleSDF
Paper: https://bundlesdf.github.io/
Stanford University has just opened full access to CS224U, its immensely popular graduate-level Natural Language Understanding course taught by Professor Christopher Potts.
Check out the GitHub code & YouTube playlist.
Introducing IDEFICS, the first open state-of-the-art visual language model at the 80B scale!
The model accepts arbitrary sequences of images and texts and produces text. A bit like a multimodal ChatGPT!
Blogpost: huggingface.co/blog/idefics
Playground: https://huggingface.co/spaces/HuggingFaceM4/idefics_playground
CS 25 has a great roadmap of LLM papers! 🙏
https://web.stanford.edu/class/cs25/
Forwarded from Artificial Intelligence
Dear friends,
I’d like to share a part of the origin story of large language models that isn’t widely known.
A lot of early work in natural language processing (NLP) was funded by U.S. military intelligence agencies that needed machine translation and speech recognition capabilities. Then, as now, such agencies analyzed large volumes of text and recorded speech in various languages. They poured money into research in machine translation and speech recognition over decades, which motivated researchers to give these applications disproportionate attention relative to other uses of NLP.
This explains why many important technical breakthroughs in NLP stem from studying translation — more than you might imagine based on the modest role that translation plays in current applications. For instance, the celebrated transformer paper, “Attention is All You Need” by the Google Brain team, introduced a technique for mapping a sentence in one language to a translation in another. This laid the foundation for large language models (LLMs) like ChatGPT, which map a prompt to a generated response.
Or consider the BLEU score, which is occasionally still used to evaluate LLMs by comparing their outputs to ground-truth examples. It was developed in 2002 to measure how well a machine-generated translation compares to a ground truth, human-created translation.
A key component of LLMs is tokenization, the process of breaking raw input text into sub-word components that become the tokens to be processed. For example, the first part of the previous sentence may be divided into tokens like this:
/A /key /component /of /LL/Ms/ is/ token/ization
The most widely used tokenization algorithm for text today is Byte Pair Encoding (BPE), which gained popularity in NLP after a 2015 paper by Sennrich et al. BPE starts with individual characters as tokens and repeatedly merges tokens that occur together frequently. Eventually, entire words as well as common sub-words become tokens. How did this technique come about? The authors wanted to build a model that could translate words that weren’t represented in the training data. They found that splitting words into sub-words created an input representation that enabled the model, if it had seen “token” and “ization,” to guess the meaning of a word it might not have seen before, such as “tokenization.”
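To make the merge procedure concrete, here is a toy sketch of BPE-style merging (illustrative only, not the exact algorithm from the Sennrich et al. paper or any production tokenizer):

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Toy BPE: start from characters, repeatedly merge the most frequent adjacent pair."""
    vocab = Counter(tuple(word) for word in words)   # each word as a sequence of symbols
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)  # most frequent adjacent pair
        merges.append(best)
        # Fuse every occurrence of that pair into a single symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

print(learn_bpe_merges(["token", "tokens", "tokenization", "organization"], num_merges=8))
# frequent pairs like ('t', 'o') and ('to', 'k') get fused, so sub-words such as "token" emerge
```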
I don’t intend this description of NLP history as advocacy for military-funded research. (I have accepted military funding, too. Some of my early work in deep learning at Stanford University was funded by DARPA, a U.S. defense research agency. This led directly to my starting Google Brain.) War is a horribly ugly business, and I would like there to be much less of it. Still, I find it striking that basic research in one area can lead to broadly beneficial developments in others. In similar ways, research into space travel led to LED lights and solar panels, experiments in particle physics led to magnetic resonance imaging, and studies of bacteria’s defenses against viruses led to the CRISPR gene-editing technology.
So it’s especially exciting to see so much basic research going on in so many different areas of AI. Who knows, a few years hence, what today’s experiments will yield?
Keep learning!
Andrew Ng
Connect with us on WhatsApp: https://whatsapp.com/channel/0029Va8iIT7KbYMOIWdNVu2Q
Low-Rank Adaptation of Large Language Models (LoRA) is a training method that accelerates the training of large models while consuming less memory.
🤖 LoRA is like a special trick that helps computers learn better and faster. Imagine a computer trying to learn new things, like recognizing pictures or understanding language. When it learns, it uses something called "weights," which are like little helpers inside the computer.
🚀 Now, LoRA's trick is to make these little helpers work smarter. Instead of changing all the helpers every time the computer learns something new, LoRA only changes a few of them. It's like having a big group of friends, but only a couple of them have to do the hard work, and the others can rest.
💡Here's how it works:
1. The computer has these helpers (weights) that it uses to learn.
2. LoRA makes two special groups of helpers that are smaller and easier to work with.
3. The computer trains these special groups of helpers to learn new things without changing all the original helpers.
4. After training, the computer combines the new helpers with the original ones, like mixing two colors to get a new color.
5. This makes the computer learn faster and doesn't use too much computer memory.
🚀 The good things about LoRA are:
- It helps the computer learn without using too many helpers, so it's faster.
- The original helpers stay the same, so we can use them for different tasks.
- It can work with other tricks that make computers smarter.
- The computer works just as fast when using LoRA, so we don't have to wait.
So, LoRA is like a cool trick that helps computers learn better and faster without making them slow down. It's like having a superhero team of helpers inside the computer! ❤️
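In more technical terms: LoRA freezes the original weight matrix W and learns a small low-rank update ΔW = B·A, where A and B are the two "special groups of helpers" from the analogy above. A minimal PyTorch-style sketch of a LoRA-wrapped linear layer (illustrative only; in practice libraries such as Hugging Face PEFT handle this for you):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)          # freeze the original weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # small trainable matrix A
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))        # B starts at zero, so ΔW = 0 at first
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scale

# Usage: wrap an existing layer; only lora_A and lora_B receive gradients.
layer = LoRALinear(nn.Linear(768, 768))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # far fewer trainable params than 768*768
```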
JARVIS-1: Open-Ended Multi-task Agents with Memory-Augmented Multimodal Language Models
abs: arxiv.org/abs/2311.05997
project page: craftjarvis-jarvis1.github.io
"We introduce JARVIS-1, an open-ended agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control, all within the popular yet challenging open-world Minecraft universe."
Finally, we have a hallucination leaderboard! 😍😍
Key Takeaways
📍 Not surprisingly, GPT-4 has the lowest hallucination rate.
📍 The open-source Llama 2 70B is pretty competitive!
📍 Google's models sit at the bottom of the leaderboard. Again, this is not surprising given that the #1 reason Bard is not usable is its high hallucination rate.
Really cool that we are beginning to do these evaluations and capture them in leaderboards!
It's not only about LLMs...
🌟 Microsoft Introduces Florence-2, a Breakthrough in Computer Vision!
👉 Microsoft has just unveiled Florence-2, a revolutionary foundation model designed for various computer vision and vision-language tasks. This new model simplifies the process by using one backbone for multiple tasks. Read more about it in the paper and project details provided below 👇
Key Highlights:
✅ Achieves state-of-the-art performance in various tasks
✅ Employs a unified, prompt-based representation for vision tasks
✅ Features the FLD-5B dataset, boasting over 5 billion annotations across 126 million images
✅ Handles detection, captioning, and grounding—all with a single model
✅ Streamlined with a uniform set of parameters governing everything
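Florence-2 checkpoints are available on the Hugging Face Hub, so the prompt-based interface can be tried in a few lines. The sketch below follows the general shape of the model card example; the checkpoint name, task prompts, and post-processing call are assumptions on my part and may differ from the released API:

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-large"  # assumed checkpoint name
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open(requests.get("https://example.com/cat.jpg", stream=True).raw)  # any test image

# The same model handles different tasks via task prompts, e.g. "<OD>" for detection,
# "<CAPTION>" for captioning, "<CAPTION_TO_PHRASE_GROUNDING>" for grounding.
task = "<OD>"
inputs = processor(text=task, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=512,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
result = processor.post_process_generation(raw, task=task, image_size=(image.width, image.height))
print(result)  # e.g. {"<OD>": {"bboxes": [...], "labels": [...]}}
```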
How big do LLMs need to be to reason? 🤔 Microsoft released Orca 2 this week, a 13B Llama-based LLM trained on complex tasks and reasoning. 🧐 Orca 2's performance comes from its use of synthetically generated data from bigger LLMs. I took a deeper look at the paper and extracted the implementation details and other insights.
𝗜𝗺𝗽𝗹𝗲𝗺𝗲𝗻𝘁𝗮𝘁𝗶𝗼𝗻:
1️⃣ Constructed a new dataset (Orca 2) with ~817K samples, using prompts from FLAN and GPT-4 to generate reasoning responses guided by detailed system prompts.
2️⃣ Grouped prompts into categories based on similarity to assign tailored system prompts that demonstrate different reasoning techniques.
3️⃣ Replaced the original system prompt with a more generic one so the model learns the underlying reasoning strategy (prompt erasing).
4️⃣ Used progressive learning: first fine-tune Llama on FLAN-v2 (1 epoch), then retrain on 5M ChatGPT samples from Orca 1 (3 epochs), then combine 1M GPT-4 samples from Orca 1 with the ~800K new Orca 2 samples for final training (4 epochs).
𝗜𝗻𝘀𝗶𝗴𝗵𝘁𝘀:
📊 Imitation learning can improve capabilities with enough data.
🔬 Reasoning and longer generations to get the correct answer help smaller models to compete with bigger LLMs.
💫 Prompt Erasing helped Orca to “learn” reasoning
🎯 Lowest hallucination rates of comparable models on summarization
⚙️ Used packing for training, concatenating multiple examples into one sequence.
👨🦯 Masked the user & system inputs (the prompt) and computed the loss only on the generated tokens (see the sketch below)
🖥 Trained on 32 A100 GPUs for ~80 hours
Paper: https://huggingface.co/papers/2311.11045
Model: https://huggingface.co/microsoft/Orca-2-13b
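On the loss-masking point above: a common way to implement "compute the loss only on the generation" with Hugging Face-style training code is to copy the input ids into the labels and set the prompt positions to -100, which the cross-entropy loss ignores. A minimal sketch (not Microsoft's actual training code; the chat template below is made up for illustration):

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Orca-2-13b")

def build_example(system: str, user: str, assistant: str):
    """Tokenize a (system, user, assistant) triple and mask the prompt in the labels."""
    prompt = f"[SYSTEM] {system}\n[USER] {user}\n[ASSISTANT] "   # illustrative template, not Orca's exact format
    prompt_ids = tokenizer(prompt, add_special_tokens=False).input_ids
    answer_ids = tokenizer(assistant + tokenizer.eos_token, add_special_tokens=False).input_ids

    input_ids = prompt_ids + answer_ids
    labels = [-100] * len(prompt_ids) + answer_ids      # -100 = ignored by the loss
    return {
        "input_ids": torch.tensor(input_ids),
        "labels": torch.tensor(labels),
    }

ex = build_example(
    system="You are a cautious assistant that reasons step by step.",
    user="What is 17 * 23?",
    assistant="17 * 23 = 17 * 20 + 17 * 3 = 340 + 51 = 391.",
)
# Only the assistant tokens contribute to the loss; the prompt tokens are masked out.
```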
Think of an LLM that can find entities in a given image, describe the image, and answer questions about it, without hallucinating ✨
Kosmos-2 released by Microsoft is a very underrated model that can do that. ☃️ Not only this, but Hugging Face transformers integration makes it super easy to use!
Colab link:
https://colab.research.google.com/drive/1t25qM_lOM-HQG6Wg3aRiF4LOuQMN5lUF?usp=sharing
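If you prefer a local script over the Colab, inference looks roughly like the sketch below (adapted from the shape of the model card for microsoft/kosmos-2-patch14-224; the exact argument names and the post-processing call are assumptions and may differ):

```python
import requests
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

model_id = "microsoft/kosmos-2-patch14-224"
model = AutoModelForVision2Seq.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open(requests.get("https://example.com/snowman.png", stream=True).raw)  # any test image
prompt = "<grounding> An image of"   # <grounding> asks the model to link phrases to bounding boxes

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    max_new_tokens=64,
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# post_process_generation splits the raw text into a clean caption plus (entity, span, bbox) tuples
caption, entities = processor.post_process_generation(generated_text)
print(caption)
print(entities)
```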
Retrieval-Augmented Generation for Large Language Models: A survey
This paper is a must read.
It covers everything you need to know about the RAG framework and its limitations. It also lists different state-of-the-art techniques to boost its performance in retrieval, augmentation, and generation.
The ultimate goal behind these techniques is to make this framework ready for scalability and production use, especially for use cases and industries where answer quality matters *a lot*.
These are the key ideas the paper discusses to make your RAG more efficient:
- 🗃️ Enhance the quality of indexed data by removing duplicate/redundant information and adding mechanisms to refresh outdated documents
- 🛠️ Optimize index structure by determining the right chunk size through quantitative evaluation
- 🏷️ Add metadata (e.g. date, chapters, or subsection) to the indexed documents to incorporate filtering functionalities that enhance efficiency and relevance
- ↔️ Align the input query with the documents by indexing the chunks of data by the questions they answer
- 🔍 Mixed retrieval: combine different search techniques like keyword-based and semantic search
- 🔄 ReRank: sort the retrieved documents to maximize diversity and optimize the similarity with a « template answer »
- 🗜️ Prompt compression: remove irrelevant context
- 💡 HyDE: generate a hypothetical answer to the input question and use it (with the query) to improve the search (see the sketch after this list)
- ✒️ Query rewrite and expansion to reformulate the user’s intent and remove ambiguity
Link: https://arxiv.org/abs/2312.10997
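To make one of these ideas concrete, here is a rough sketch of HyDE. The helpers llm_generate, embed, and vector_index are placeholders for whatever LLM client, embedding model, and vector store you actually use:

```python
from typing import List

def hyde_retrieve(question: str, llm_generate, embed, vector_index, top_k: int = 5) -> List[str]:
    """HyDE: answer the question hypothetically first, then search with that answer's embedding."""
    # 1. Ask the LLM for a plausible (possibly wrong) answer; its wording is what matters.
    hypothetical_answer = llm_generate(
        f"Write a short passage that answers the question:\n{question}"
    )

    # 2. Embed the hypothetical answer instead of (or alongside) the raw query.
    query_vector = embed(hypothetical_answer)

    # 3. Retrieve real documents whose embeddings are close to the hypothetical answer.
    return vector_index.search(query_vector, top_k=top_k)

# The retrieved chunks then go into the generation prompt as usual, ideally after reranking.
```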
Good material on Developing AI systems in Medical Imaging: https://aiformedicalimaging.blogspot.com/2023/09/things-to-consider-when-developing-ai.html