GenAi, Deep Learning and Computer Vision
Deep Learning 💡,
Computer Vision 📽 &
#Ai 🧠

Get #free_books,
#Online_courses,
#Research_papers,
#Codes, and #Projects,
Tricks and Hacks, coding, training Stuff

Suggestions: @AIindian
Seems like GPT-4 is back on top with its latest update. We'll have to wait and see how it compares in real-world tests.
๐Ÿ‘4
Microsoft just casually shared their new Phi-3 LLMs less than a week after the Llama 3 release. Based on the benchmarks in the technical report (https://arxiv.org/abs/2404.14219), even the smallest Phi-3 model beats Llama 3 8B despite being less than half the size.

Phi-3 was trained on "only" 3.3 trillion tokens, roughly 5x fewer than Llama 3's 15 trillion.

Phi-3-mini has "only" 3.8 billion parameters, less than half the size of Llama 3 8B.

Despite being small enough to be deployed on a phone (according to the report), it matches the performance of the much larger Mixtral 8x7B and GPT-3.5. (Phi-3-mini can be quantized to 4 bits, so it only requires ≈1.8 GB of memory.)
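As a quick sanity check (my own back-of-the-envelope arithmetic, not from the report), the ≈1.8 GB figure follows directly from parameters × bits per parameter:

```python
# Rough check of the ~1.8 GB figure for 4-bit quantized Phi-3-mini.
params = 3.8e9          # Phi-3-mini parameter count
bits_per_param = 4      # 4-bit quantization
total_bytes = params * bits_per_param / 8
print(total_bytes / 1024**3)  # ~1.77 GiB, in line with the reported ~1.8 GB
```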

What is the secret sauce? According to the technical report, it's dataset quality over quantity: "heavily filtered web data and synthetic data".

Alongside the 4k context-window version, there's also a phi-3-mini-128K model that supports up to 128k tokens.

Fun fact: Phi-3 uses the same tokenizer as Llama 2, with a vocabulary size of 32,064.
โค4๐Ÿ‘2
Forwarded from Artificial Intelligence
Ai / Computer Vision Bootcamp 🚀

Learn Ai / Computer Vision from basics to deployment, taught by an IITian and a COEPian.

✅ Build Face Recognition ☺️
✅ Build Ai Object Detection 🏍️🚇✈️
✅ Build a Social Distancing App
✅ Build an Automated Invoice Reader 📃
✅ Image Classification 🐘🥗
✅ Build applications of Computer Vision in Healthcare, Automotive, Retail, Manufacturing, Security, and Surveillance 📸

40+ hrs of sessions.
12 weeks.
13 tools & technologies.
7 projects.
7 homework assignments.
5 case studies.
5 skills.
5 domains.

📌 Remote and weekend sessions.
📌 Starting from basics.
📌 Get a certificate.

Duration: 3 months

Attend the 1st FREE session on 11th May: https://chat.whatsapp.com/BibIwuuUEWrGEWdZHYluNe

For registrations: https://aiindia.ai/cv-bootcamp/
๐Ÿ‘7
Early results for Gemma 2 on the leaderboard show it matching Llama-3-70B.

- Full data at leaderboard.lmsys.org
- Chat with Gemma 2 at chat.lmsys.org
- Gemma 2 blog goo.gle/3RLQXUa
The matrix calculus for Deep Learning. Very well written. https://explained.ai/matrix-calculus/
๐Ÿ‘5
How Much GPU Memory Is Needed To Serve An LLM?

This is a common question that consistently comes up in interviews or during discussions with your business stakeholders.

And it's not just a random question; it's a key indicator of how well you understand the deployment and scalability of these powerful models in production.

As a data scientist, understanding and estimating the required GPU memory is essential.

LLMs (Large Language Models) vary in size from 7 billion parameters to trillions of parameters. One size certainly doesn't fit all.

Let's dive into the math that will help you estimate the GPU memory needed for deploying these models effectively.

๐“๐ก๐ž ๐Ÿ๐จ๐ซ๐ฆ๐ฎ๐ฅ๐š ๐ญ๐จ ๐ž๐ฌ๐ญ๐ข๐ฆ๐š๐ญ๐ž ๐†๐๐” ๐ฆ๐ž๐ฆ๐จ๐ซ๐ฒ ๐ข๐ฌ

General formula: M = ((P * size per parameter) / memory density) * overhead factor

Where:
- M is the GPU memory in gigabytes.
- P is the number of parameters in the model.
- Size per parameter is the number of bytes used to store each parameter, typically 4 bytes for float32 precision.
- Memory density reflects the precision the model is actually loaded at: with Q bits per parameter, the factor 32/Q converts the 4-byte (32-bit) float32 baseline to that precision (e.g., Q = 16 for half precision, Q = 4 for 4-bit quantization).
- Overhead factor (e.g., 1.2) accounts for additional memory beyond just storing parameters, such as activations, temporary tensors, and any memory fragmentation or padding.

๐’๐ข๐ฆ๐ฉ๐ฅ๐ข๐Ÿ๐ข๐ž๐ ๐…๐จ๐ซ๐ฆ๐ฎ๐ฅ๐š:

M = ((P * 4B)/(32/Q)) * 1.2
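As an illustration, here's a minimal Python sketch of this formula (the function name and defaults are mine, not from the original post):

```python
def estimate_gpu_memory_gb(num_params: float, q_bits: int = 16, overhead: float = 1.2) -> float:
    """Estimate GPU memory (in GB) needed to serve an LLM.

    Implements M = ((P * 4 bytes) / (32 / Q)) * overhead, where Q is the
    bit-width the model is loaded at (16 = half precision, 4 = 4-bit quantized).
    """
    bytes_needed = (num_params * 4) / (32 / q_bits) * overhead
    return bytes_needed / 1e9  # bytes -> gigabytes

# Example: a 70B-parameter model served in 16-bit precision
print(estimate_gpu_memory_gb(70e9, q_bits=16))  # ~168 GB

# Same model quantized to 4 bits
print(estimate_gpu_memory_gb(70e9, q_bits=4))   # ~42 GB
```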

With this formula in hand, I hope you'll feel more confident when discussing GPU memory requirements with your business stakeholders.

#LLM
๐Ÿ‘5โค1
Uber used RAG and AI agents to build its in-house Text-to-SQL tool, saving 140,000 hours annually in query writing time. 📈
Here's how they built the system end-to-end:

The system is called QueryGPT and is built on top of multiple agents, each handling a part of the pipeline.

1. First, the Intent Agent interprets the user's intent and figures out which domain workspace is relevant to the question (e.g., Mobility, Billing).

2. The Table Agent then selects suitable tables using an LLM, which users can also review and adjust.

3. Next, the Column Prune Agent filters out any unnecessary columns from large tables using RAG. This helps the schema fit within token limits.

4. Finally, QueryGPT uses Few-Shot Prompting with selected SQL samples and schemas to generate the query.
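
Putting the four steps together, here's a hypothetical sketch of the flow (names and prompts are illustrative, not Uber's actual implementation; `llm` is any prompt-in/text-out callable and `retrieve_columns` stands in for the RAG retriever):

```python
from typing import Callable

def query_gpt(
    question: str,
    llm: Callable[[str], str],
    retrieve_columns: Callable[[str, str], list[str]],
    sql_examples: list[str],
) -> str:
    # 1. Intent Agent: map the question to a domain workspace (e.g. "Mobility").
    workspace = llm(f"Which business domain does this question belong to?\n{question}")

    # 2. Table Agent: let the LLM pick candidate tables (users can review/adjust).
    tables = llm(f"List the {workspace} tables needed to answer:\n{question}").split(",")

    # 3. Column Prune Agent: keep only the columns relevant to the question,
    #    so the pruned schema fits within the model's token limit.
    schemas = {t.strip(): retrieve_columns(t.strip(), question) for t in tables}

    # 4. Few-shot generation: sample SQL + pruned schemas + the question.
    prompt = "\n\n".join(sql_examples) + f"\n\nSchemas: {schemas}\nQuestion: {question}\nSQL:"
    return llm(prompt)
```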

QueryGPT reduced query authoring time from 10 minutes to 3, saving over 140,000 hours annually!

Link to the full article: https://www.uber.com/en-IN/blog/query-gpt/?uclick_id=6cfc9a34-aa3e-4140-9e8e-34e867b80b2b
๐Ÿ‘5โค1
"Agents are not enough." New Microsoft research explores that for the latest wave of agents, differentiated by GenAI, to succeed, they need to work together with Sims and Assistants (see diagram on page 3):

Agents are nothing new, evolving from early agents (1950s) to expert systems (1980s), reactive agents (1990s), and more recently multi-agent systems and cognitive architectures.

While frameworks like AutoGen help modern agents tackle complex tasks in narrow domains, challenges like generalization, scalability, and coordination persist.

To help tackle challenges and improve standardization, privacy, personalization, and trust, the research advocates for an ecosystem centered on Agents, Sims, and Assistants.

1. Agents:

→ Narrow and purpose-driven modules that are trained to do a specific task. Each agent can be autonomous, but with an ability to interface with other agents.

2. Sims:

→ Representations of the user, built from their profile, preferences, and behaviors, capturing key aspects of who the user is.
→ Sims can act on the user's behalf, interacting with agents to accomplish tasks, guided by the user's Assistant.

3. Assistants:

→ Programs that interact directly with users, deeply understand them, and can call Sims or Agents to handle tasks reactively or proactively.
→ Assistants act as private agents, accessing personal information and fine-tuned to the user, enabling them to perform tasks on the user's behalf.

Interaction

→ Agents, Sims, and Assistants work together with a high degree of synergy.
→ The Assistant, deeply understanding the user, co-creates and manages Sims with user input, reflecting different facets of the user's life.
→ Sims engage specialized Agents to complete tasks effectively, ensuring precision and personalization, which enhances user satisfaction.
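
For intuition only, here's a toy Python sketch (entirely my own, not from the paper) of how the three roles might fit together in code: an Assistant that knows the user, Sims representing facets of the user, and narrow Agents that execute specific tasks.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Narrow, purpose-driven module trained for one specific task."""
    skill: str

    def run(self, task: str) -> str:
        return f"[{self.skill} agent] completed: {task}"

@dataclass
class Sim:
    """Represents one facet of the user (profile, preferences, behaviors)."""
    facet: str
    preferences: dict = field(default_factory=dict)

    def delegate(self, task: str, agents: list[Agent]) -> str:
        # Act on the user's behalf: pick a specialized agent for the task
        # (trivial keyword matching here, just to show the hand-off).
        agent = next((a for a in agents if a.skill in task), agents[0])
        return agent.run(task)

@dataclass
class Assistant:
    """User-facing program that knows the user and calls Sims/Agents for them."""
    user: str
    sims: list[Sim] = field(default_factory=list)

    def handle(self, task: str, agents: list[Agent]) -> str:
        # Route the request through the most relevant Sim (first one here).
        sim = self.sims[0] if self.sims else Sim(facet="default")
        return sim.delegate(task, agents)

# Example: the Assistant routes a travel request through a "travel" Sim
# to a specialized booking Agent.
assistant = Assistant(user="alice", sims=[Sim(facet="travel")])
print(assistant.handle("booking a flight", [Agent(skill="booking"), Agent(skill="email")]))
```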

P.S. The paper linked here dives deeper: https://www.arxiv.org/pdf/2412.16241
๐Ÿ‘3