๐ฐ ๐ Web scraping will never be the same
Hereโs why itโs a game-changer:
๐ Free and open-source
โญ๏ธ Blazing fast performance,
๐ค LLM-friendly output formats (JSON, cleaned HTML, markdown)
๐ Supports crawling multiple URLs simultaneously
๐จ Extracts all media tags (Images, Audio, Video)
๐ Extracts all external and internal links
But thatโs not all:
๐ Extracts metadata from pages
โ User-agent customization
โ Takes screenshots of pages
๐ Executes custom JavaScript before crawling
Crawl4AI simplifies web crawling and data extraction, making it ready to use for LLMs and AI applications.
Hereโs why itโs a game-changer:
๐ Free and open-source
โญ๏ธ Blazing fast performance,
๐ค LLM-friendly output formats (JSON, cleaned HTML, markdown)
๐ Supports crawling multiple URLs simultaneously
๐จ Extracts all media tags (Images, Audio, Video)
๐ Extracts all external and internal links
But thatโs not all:
๐ Extracts metadata from pages
โ User-agent customization
โ Takes screenshots of pages
๐ Executes custom JavaScript before crawling
๐4
Guide to Building an AI Agent
1๏ธโฃ ๐๐ต๐ผ๐ผ๐๐ฒ ๐๐ต๐ฒ ๐ฅ๐ถ๐ด๐ต๐ ๐๐๐
Not all LLMs are equal. Pick one that:
- Excels in reasoning benchmarks
- Supports chain-of-thought (CoT) prompting
- Delivers consistent responses
๐ Tip: Experiment with models & fine-tune prompts to enhance reasoning.
2๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐โ๐ ๐๐ผ๐ป๐๐ฟ๐ผ๐น ๐๐ผ๐ด๐ถ๐ฐ
Your agent needs a strategy:
- Tool Use: Call tools when needed; otherwise, respond directly.
- Basic Reflection: Generate, critique, and refine responses.
- ReAct: Plan, execute, observe, and iterate.
- Plan-then-Execute: Outline all steps first, then execute.
๐ Choosing the right approach improves reasoning & reliability.
3๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ผ๐ฟ๐ฒ ๐๐ป๐๐๐ฟ๐๐ฐ๐๐ถ๐ผ๐ป๐ & ๐๐ฒ๐ฎ๐๐๐ฟ๐ฒ๐
Set operational rules:
- How to handle unclear queries? (Ask clarifying questions)
- When to use external tools?
- Formatting rules? (Markdown, JSON, etc.)
- Interaction style?
๐ Clear system prompts shape agent behavior.
4๏ธโฃ ๐๐บ๐ฝ๐น๐ฒ๐บ๐ฒ๐ป๐ ๐ฎ ๐ ๐ฒ๐บ๐ผ๐ฟ๐ ๐ฆ๐๐ฟ๐ฎ๐๐ฒ๐ด๐
LLMs forget past interactions. Memory strategies:
- Sliding Window: Retain recent turns, discard old ones.
- Summarized Memory: Condense key points for recall.
- Long-Term Memory: Store user preferences for personalization.
๐ Example: A financial AI recalls risk tolerance from past chats.
5๏ธโฃ ๐๐พ๐๐ถ๐ฝ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐ ๐๐ถ๐๐ต ๐ง๐ผ๐ผ๐น๐ & ๐๐ฃ๐๐
Extend capabilities with external tools:
- Name: Clear, intuitive (e.g., "StockPriceRetriever")
- Description: What does it do?
- Schemas: Define input/output formats
- Error Handling: How to manage failures?
๐ Example: A support AI retrieves order details via CRM API.
6๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐โ๐ ๐ฅ๐ผ๐น๐ฒ & ๐๐ฒ๐ ๐ง๐ฎ๐๐ธ๐
Narrowly defined agents perform better. Clarify:
- Mission: (e.g., "I analyze datasets for insights.")
- Key Tasks: (Summarizing, visualizing, analyzing)
- Limitations: ("I donโt offer legal advice.")
๐ Example: A financial AI focuses on finance, not general knowledge.
7๏ธโฃ ๐๐ฎ๐ป๐ฑ๐น๐ถ๐ป๐ด ๐ฅ๐ฎ๐ ๐๐๐ ๐ข๐๐๐ฝ๐๐๐
Post-process responses for structure & accuracy:
- Convert AI output to structured formats (JSON, tables)
- Validate correctness before user delivery
- Ensure correct tool execution
๐ Example: A financial AI converts extracted data into JSON.
8๏ธโฃ ๐ฆ๐ฐ๐ฎ๐น๐ถ๐ป๐ด ๐๐ผ ๐ ๐๐น๐๐ถ-๐๐ด๐ฒ๐ป๐ ๐ฆ๐๐๐๐ฒ๐บ๐ (๐๐ฑ๐๐ฎ๐ป๐ฐ๐ฒ๐ฑ)
For complex workflows:
- Info Sharing: What context is passed between agents?
- Error Handling: What if one agent fails?
- State Management: How to pause/resume tasks?
๐ Example:
1๏ธโฃ One agent fetches data
2๏ธโฃ Another summarizes
3๏ธโฃ A third generates a report
Master the fundamentals, experiment, and refine and.. now go build something amazing!
1๏ธโฃ ๐๐ต๐ผ๐ผ๐๐ฒ ๐๐ต๐ฒ ๐ฅ๐ถ๐ด๐ต๐ ๐๐๐
Not all LLMs are equal. Pick one that:
- Excels in reasoning benchmarks
- Supports chain-of-thought (CoT) prompting
- Delivers consistent responses
๐ Tip: Experiment with models & fine-tune prompts to enhance reasoning.
2๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐โ๐ ๐๐ผ๐ป๐๐ฟ๐ผ๐น ๐๐ผ๐ด๐ถ๐ฐ
Your agent needs a strategy:
- Tool Use: Call tools when needed; otherwise, respond directly.
- Basic Reflection: Generate, critique, and refine responses.
- ReAct: Plan, execute, observe, and iterate.
- Plan-then-Execute: Outline all steps first, then execute.
๐ Choosing the right approach improves reasoning & reliability.
3๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ผ๐ฟ๐ฒ ๐๐ป๐๐๐ฟ๐๐ฐ๐๐ถ๐ผ๐ป๐ & ๐๐ฒ๐ฎ๐๐๐ฟ๐ฒ๐
Set operational rules:
- How to handle unclear queries? (Ask clarifying questions)
- When to use external tools?
- Formatting rules? (Markdown, JSON, etc.)
- Interaction style?
๐ Clear system prompts shape agent behavior.
4๏ธโฃ ๐๐บ๐ฝ๐น๐ฒ๐บ๐ฒ๐ป๐ ๐ฎ ๐ ๐ฒ๐บ๐ผ๐ฟ๐ ๐ฆ๐๐ฟ๐ฎ๐๐ฒ๐ด๐
LLMs forget past interactions. Memory strategies:
- Sliding Window: Retain recent turns, discard old ones.
- Summarized Memory: Condense key points for recall.
- Long-Term Memory: Store user preferences for personalization.
๐ Example: A financial AI recalls risk tolerance from past chats.
5๏ธโฃ ๐๐พ๐๐ถ๐ฝ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐ ๐๐ถ๐๐ต ๐ง๐ผ๐ผ๐น๐ & ๐๐ฃ๐๐
Extend capabilities with external tools:
- Name: Clear, intuitive (e.g., "StockPriceRetriever")
- Description: What does it do?
- Schemas: Define input/output formats
- Error Handling: How to manage failures?
๐ Example: A support AI retrieves order details via CRM API.
6๏ธโฃ ๐๐ฒ๐ณ๐ถ๐ป๐ฒ ๐๐ต๐ฒ ๐๐ด๐ฒ๐ป๐โ๐ ๐ฅ๐ผ๐น๐ฒ & ๐๐ฒ๐ ๐ง๐ฎ๐๐ธ๐
Narrowly defined agents perform better. Clarify:
- Mission: (e.g., "I analyze datasets for insights.")
- Key Tasks: (Summarizing, visualizing, analyzing)
- Limitations: ("I donโt offer legal advice.")
๐ Example: A financial AI focuses on finance, not general knowledge.
7๏ธโฃ ๐๐ฎ๐ป๐ฑ๐น๐ถ๐ป๐ด ๐ฅ๐ฎ๐ ๐๐๐ ๐ข๐๐๐ฝ๐๐๐
Post-process responses for structure & accuracy:
- Convert AI output to structured formats (JSON, tables)
- Validate correctness before user delivery
- Ensure correct tool execution
๐ Example: A financial AI converts extracted data into JSON.
8๏ธโฃ ๐ฆ๐ฐ๐ฎ๐น๐ถ๐ป๐ด ๐๐ผ ๐ ๐๐น๐๐ถ-๐๐ด๐ฒ๐ป๐ ๐ฆ๐๐๐๐ฒ๐บ๐ (๐๐ฑ๐๐ฎ๐ป๐ฐ๐ฒ๐ฑ)
For complex workflows:
- Info Sharing: What context is passed between agents?
- Error Handling: What if one agent fails?
- State Management: How to pause/resume tasks?
๐ Example:
1๏ธโฃ One agent fetches data
2๏ธโฃ Another summarizes
3๏ธโฃ A third generates a report
Master the fundamentals, experiment, and refine and.. now go build something amazing!
๐2
10 Python Libraries Every AI Engineer Should Know
1. Hugging Face Transformers
A powerful library for using and fine-tuning pre-trained transformer models for NLP. Learn more: Hugging Face NLP Course
2. Ollama
A framework for running and managing open-source LLMs locally with ease. Learn video: Ollama Course
3. OpenAI Python SDK
The official toolkit for integrating OpenAI models into Python applications. Learn more: The official developer quickstart guide
4. Anthropic SDK
A client library for seamless interaction with Claude and other Anthropic models. Learn more: Anthropic Python SDK
5. LangChain
A framework for building LLM applications with modular and extensible components. Learn more: DeepLearning.AI
6. LlamaIndex
A toolkit for integrating custom data sources with LLMs for better retrieval. Learn more: Building Agentic RAG with LlamaIndex
7. SQLAlchemy
A Python SQL toolkit and ORM for efficient and maintainable database interactions. Learn more: SQLAlchemy Unified Tutorial
8. ChromaDB
An open-source vector database optimized for AI-powered search and retrieval. Learn more: Getting Started - Chroma Docs
9. Weaviate
A cloud-native vector search engine for efficient semantic search at scale. Learn more: 101T Work with: Text data
10. Weights & Biases
A platform for tracking, visualizing, and optimizing ML experiments.
Learn more: Effective MLOps: Model Development
#artificialintelligence
1. Hugging Face Transformers
A powerful library for using and fine-tuning pre-trained transformer models for NLP. Learn more: Hugging Face NLP Course
2. Ollama
A framework for running and managing open-source LLMs locally with ease. Learn video: Ollama Course
3. OpenAI Python SDK
The official toolkit for integrating OpenAI models into Python applications. Learn more: The official developer quickstart guide
4. Anthropic SDK
A client library for seamless interaction with Claude and other Anthropic models. Learn more: Anthropic Python SDK
5. LangChain
A framework for building LLM applications with modular and extensible components. Learn more: DeepLearning.AI
6. LlamaIndex
A toolkit for integrating custom data sources with LLMs for better retrieval. Learn more: Building Agentic RAG with LlamaIndex
7. SQLAlchemy
A Python SQL toolkit and ORM for efficient and maintainable database interactions. Learn more: SQLAlchemy Unified Tutorial
8. ChromaDB
An open-source vector database optimized for AI-powered search and retrieval. Learn more: Getting Started - Chroma Docs
9. Weaviate
A cloud-native vector search engine for efficient semantic search at scale. Learn more: 101T Work with: Text data
10. Weights & Biases
A platform for tracking, visualizing, and optimizing ML experiments.
Learn more: Effective MLOps: Model Development
#artificialintelligence
๐4
David Baum - Generative AI and LLMs for Dummies (2024).pdf
1.9 MB
Generative AI and LLMs for Dummies
David Baum, 2024
David Baum, 2024
Inside Generative AI, 2024.epub
4.6 MB
Inside Generative AI
Rick Spair, 2024
Rick Spair, 2024
๐7