PhD Students - Do you need datasets for your research?
Here are 30 datasets for research from NexData.
Use discount code for 20% off: G5W924C3ZI
1. Korean Exam Question Dataset for AI Training
https://lnkd.in/d_paSwt7
2. Multilingual Grammar Correction Dataset
https://lnkd.in/dV43iqTp
3. High quality video caption dataset
https://lnkd.in/dY9kxkhx
4. 3D models and scenes datasets for AI and simulation
https://lnkd.in/dT-zscH4
5. Image editing datasets – object removal, addition & modification
https://lnkd.in/dd8iCGMS
6. QA dataset – visual & text reasoning
https://lnkd.in/dc3TNWFD
7. English instruction tuning dataset
https://lnkd.in/dTeTgd2M
8. Large scale vision language dataset for AI training
https://lnkd.in/dBJuxazN
9. News dataset
https://lnkd.in/dYBJe5gd
10. Global building photos dataset
https://lnkd.in/dVJsDXnC
11. Facial landmarks dataset
https://lnkd.in/dz_KGCS4
12. 3D Human Pose & Landmarks dataset
https://lnkd.in/dXE9ir8Z
13. 3D Hand Pose & Gesture Recognition dataset
https://lnkd.in/d_QdGGb9
14. Driver monitoring dataset – dangerous, fatigue
https://lnkd.in/d6kF-9PW
15. Japanese handwriting OCR dataset
https://lnkd.in/dHnriqrH
16. American English Male voice TTS dataset
https://lnkd.in/dqyvg862
17. Riddles and brain teasers dataset
https://lnkd.in/dKBHY3DE
18. Chinese test questions text
https://lnkd.in/dQpUd8xC
19. Chinese medical question answering data
https://lnkd.in/dsbWUCpz
20. Multi-round interpersonal dialogues text data
https://lnkd.in/dQiUq_Jg
21. Human activity recognition dataset
https://lnkd.in/dHM52MfV
22. Facial expression recognition dataset
https://lnkd.in/dqQAfMau
23. Urban surveillance dataset
https://lnkd.in/dc2RCnTk
24. Human body segmentation dataset
https://lnkd.in/d6sSrDxS
25. Fashion segmentation – clothing & accessories
https://lnkd.in/dptNUTz8
26. Fight video dataset – action recognition
https://lnkd.in/dnY_m5hZ
27. Gesture recognition dataset
https://lnkd.in/dFVPivYg
28. Facial skin defects dataset
https://lnkd.in/dKCbUvU6
29. Smoke detection and behaviour recognition dataset
https://lnkd.in/ddGg56R4
30. Weight loss transformation video dataset
https://lnkd.in/dqqT4ed9
https://t.me/CodeProgrammer👾
🤖 Python libraries for AI agents — what to study
If you want to build AI agents in Python, it helps to study the libraries in a sensible order.
Start with LangChain, CrewAI or SmolAgents: they let you quickly assemble simple agents, connect tools, and test ideas.
The next level is LangGraph, LlamaIndex and Semantic Kernel. These are already used in production systems for RAG, orchestration, and complex workflows.
The most advanced level is AutoGen, DSPy and A2A. They target autonomous multi-agent systems, LLM pipeline optimization, and communication between agents.
LangChain — simple agents, tools, and memory
github.com/langchain-ai/langchain
CrewAI — multi-agent systems with roles
github.com/joaomdmoura/crewAI
SmolAgents — lightweight agents for quick experiments
github.com/huggingface/smolagents
LangGraph — orchestration and stateful workflow
github.com/langchain-ai/langgraph
LlamaIndex — RAG and knowledge-agents
github.com/run-llama/llama_index
Semantic Kernel — AI workflow and plugins
github.com/microsoft/semantic-kernel
AutoGen — autonomous multi-agent systems
github.com/microsoft/autogen
DSPy — optimizing LLM pipelines
github.com/stanfordnlp/dspy
A2A — protocol for interaction between agents
github.com/a2aproject/A2A
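To make "assemble simple agents and connect tools" more concrete, here is a minimal sketch of the loop these libraries wrap for you. It deliberately uses none of the libraries above: the tools, `scripted_policy`, and `run_agent` are hypothetical names made up for illustration, and the policy is a hard-coded stand-in for a real LLM call.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical tools: plain Python functions the agent is allowed to call.
def add(a: float, b: float) -> float:
    return a + b


def word_count(text: str) -> int:
    return len(text.split())


TOOLS: Dict[str, Callable] = {"add": add, "word_count": word_count}


def scripted_policy(task: str, history: List[Tuple[str, object]]) -> dict:
    """Stand-in for an LLM call (assumption): picks the next action from the history.

    A real agent would send the task and history to a model and parse its reply.
    """
    if not history:
        return {"tool": "word_count", "args": {"text": task}}
    if len(history) == 1:
        return {"tool": "add", "args": {"a": history[0][1], "b": 1}}
    return {"final": f"word count plus one = {history[-1][1]}"}


def run_agent(task: str, policy: Callable = scripted_policy, max_steps: int = 5) -> str:
    """Ask the policy for an action, run the chosen tool, and repeat until a final answer."""
    history: List[Tuple[str, object]] = []
    for _ in range(max_steps):
        action = policy(task, history)
        if "final" in action:
            return action["final"]
        tool = TOOLS[action["tool"]]
        result = tool(**action["args"])
        history.append((action["tool"], result))
    return "stopped: step limit reached"


if __name__ == "__main__":
    print(run_agent("count the words in this sentence"))
```

The step limit is there because a real model may never decide to stop. Frameworks add prompting, memory, and error handling on top of a loop like this one.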
https://t.me/CodeProgrammer🌟
Forwarded from Machine Learning with Python
The Python + Generative AI series by Azure AI Foundry has ended, but all of its materials remain openly available.
You can now rewatch the recordings at your own pace, download the slides, and try the code from each session, covering everything from LLMs and RAG to AI agents and MCP.
All resources are here: aka.ms/pythonai/resources
👉 @codeprogrammer
🎁 23 Years of SPOTO – Claim Your Free IT Certs Prep Kit!
🔥 Whether you're preparing for #Python, #AI, #Cisco, #PMI, #Fortinet, #AWS, #Azure, #Excel, #comptia, #ITIL, #cloud or any other in-demand certification – SPOTO has got you covered!
✅ Free Resources:
・Free Python, Excel, Cyber Security, Cisco, SQL, ITIL, PMP, AWS courses: https://bit.ly/4lk4m3c
・IT Certs E-book: https://bit.ly/4bdZOqt
・IT Exams Skill Test: https://bit.ly/4sDvi0b
・Free AI material and support tools: https://bit.ly/46TpsQ8
・Free Cloud Study Guide: https://bit.ly/4lk3dIS
👉 Join our IT learning circle for resources and support:
https://chat.whatsapp.com/Cnc5M5353oSBo3savBl397
💬 Want exam help? Chat with an admin now!
wa.link/rozuuw
Do you want to understand the methods used to train LLMs?
Large language models (LLMs) are trained with several different objectives, each of which teaches the model to understand or generate text in its own way.
The objectives range from predicting the next word to classifying entire sentences or labeling individual entities.
Here are four common training methods, explained in plain language 👇
1. Causal Language Modeling
Predicts the next token in a sequence from the ones before it. This teaches the model the natural flow of language and the structure of sentences.
Analogy: finishing someone else's sentence by guessing the next word.
2. Masked Language Modeling
Learns by guessing words that have been hidden in a sentence, using the surrounding context on both sides. This improves the model's overall understanding of language.
Analogy: a fill-in-the-blank exercise.
3. Text Classification Modeling
Assigns a single class to a whole sentence (for example, its tone or topic); the model learns by comparing its predictions with the actual labels.
Analogy: sorting letters into folders labeled "Work", "Personal", or "Promotions".
4. Token Classification Modeling
Assigns a label to each word or subword, for example marking names, places, or dates in the text.
Analogy: highlighting words in different colors, with names in blue, places in green, and dates in yellow.
These methods form the basis of modern LLMs, and each of them plays a role in making AI smarter and more useful.
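The four methods differ mainly in what the model is asked to predict. The sketch below shows roughly how one training example could be built for each objective from a list of tokens. The function names, the `[MASK]` token, and the 15% masking rate are illustrative assumptions rather than anything stated above, and real pipelines work on token IDs from a tokenizer instead of whole words.

```python
import random
from typing import Dict, List, Tuple


def causal_lm_examples(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Causal LM: every prefix is an input, the token right after it is the target."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_lm_example(
    tokens: List[str], mask_prob: float = 0.15, seed: int = 0
) -> Tuple[List[str], Dict[int, str]]:
    """Masked LM: hide random tokens; targets map each hidden position to its original token.

    The 15% default is an assumed, commonly quoted rate, not a value from the post.
    """
    rng = random.Random(seed)
    inputs: List[str] = []
    targets: Dict[int, str] = {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            inputs.append("[MASK]")
            targets[position] = token
        else:
            inputs.append(token)
    return inputs, targets


def text_classification_example(tokens: List[str], label: str) -> Tuple[List[str], str]:
    """Text classification: one label for the whole sequence."""
    return tokens, label


def token_classification_example(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Token classification: one label per token."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per token")
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sentence = "Ada visited Paris in May".split()
    print(causal_lm_examples(sentence))
    print(masked_lm_example(sentence, mask_prob=0.4))
    print(text_classification_example(sentence, "travel"))
    print(token_classification_example(sentence, ["NAME", "O", "PLACE", "O", "DATE"]))
```

In every case training works the same way: the model predicts the target, a loss measures how far off it was, and the weights are adjusted to reduce that loss.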
https://t.me/CodeProgrammer
Forwarded from Udemy Coupons
Master Python Programming: The Complete Beginner to Advanced
Learn Python Programming from Scratch: Build Real-World Skills for Coding, Automation, and Data Science...
🏷 Category: development
🌍 Language: English (India)
👥 Students: 40,101
⭐️ Rating: 4.4/5.0 (1,110 reviews)
🏃‍♂️ Enrollments Left: N/A
⏳ Expires In: 0D:4H:4M
💰 Price: $28.55 => FREE
🆔 Coupon: JOSHFREE43
⚠️ Please note: A verification layer has been added to prevent bad actors and bots from claiming the courses, so genuine users need to enroll manually to avoid missing this free offer.
💎 By: https://t.me/DataScienceC
𝐕𝐢𝐬𝐮𝐚𝐥 𝐛𝐥𝐨𝐠 on Vision Transformers is live.
https://vizuaranewsletter.com/p/vision-transformers?r=5b5pyd&utm_campaign=post&utm_medium=web
Learn how ViT works from the ground up, and fine-tune one on a real classification dataset.
CNNs process images through small sliding filters. Each filter only sees a tiny local region, and the model has to stack many layers before distant parts of an image can even talk to each other.
Vision Transformers threw that whole approach out.
ViT chops an image into patches, treats each patch like a token, and runs self-attention across the full sequence.
Every patch can attend to every other patch from the very first layer. No stacking required.
That global view from layer one is a big part of why ViT, given enough pre-training data, matches or surpasses CNNs on large-scale benchmarks.
𝐖𝐡𝐚𝐭 𝐭𝐡𝐞 𝐛𝐥𝐨𝐠 𝐜𝐨𝐯𝐞𝐫𝐬:
- Introduction to Vision Transformers and comparison with CNNs
- Adapting transformers to images: patch embeddings and flattening
- Positional encodings in Vision Transformers
- Encoder-only structure for classification
- Benefits and drawbacks of ViT
- Real-world applications of Vision Transformers
- Hands-on: fine-tuning ViT for image classification
The accompanying images illustrate the following.
Self-attention connects every patch to every other patch at once, whereas convolution only sees a small local window. That's why ViT captures things CNNs miss, like the optical-illusion painting where distant patches form a hidden face.
The architecture is simple: split the image into patches, flatten them into embeddings (like words in a sentence), run them through a Transformer encoder, and let the class token collect info from all patches for the final prediction. Patch in, class out.
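The patch step is easy to sketch in plain Python. The helper names `patchify` and `sequence_length` are made up for illustration, and the 224x224 input with 16x16 patches is an assumed, commonly used configuration that the post does not state. A real ViT would also pass each flattened patch through a learned linear projection and add positional encodings, which this sketch leaves out.

```python
from typing import List

Image = List[List[List[float]]]  # height x width x channels


def patchify(image: Image, patch_size: int) -> List[List[float]]:
    """Cut the image into non-overlapping square patches and flatten each one.

    Patches come out in row-major order: left to right, then top to bottom.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image sides must be divisible by patch_size")
    patches: List[List[float]] = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            flat: List[float] = []
            for row in image[top:top + patch_size]:
                for pixel in row[left:left + patch_size]:
                    flat.extend(pixel)  # append every channel value of this pixel
            patches.append(flat)
    return patches


def sequence_length(height: int, width: int, patch_size: int) -> int:
    """Number of tokens the encoder sees: one per patch plus one class token."""
    return (height // patch_size) * (width // patch_size) + 1


if __name__ == "__main__":
    # Tiny 4x4 single-channel image with pixel values 0..15.
    tiny = [[[float(r * 4 + c)] for c in range(4)] for r in range(4)]
    for patch in patchify(tiny, 2):
        print(patch)
    print(sequence_length(224, 224, 16))
```

Under the assumed 224x224 and 16x16 sizes, the encoder sees 196 patch tokens plus the class token.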
Inside attention: each patch (query) compares itself to all other patches (keys), softmax turns the scores into attention weights, and the weighted sum of values produces a new representation that is aware of the full image. The blog also visualizes what the CLS token actually attends to through attention heatmaps.
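The query, key, and value steps can be written out in a few lines. This is a single-head sketch in plain Python under simplifying assumptions: the queries, keys, and values are passed in directly instead of being produced by learned projections, and the scores are divided by the square root of the key dimension, which is the standard scaled dot-product form rather than something the post spells out.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> Tuple[List[Vector], List[Vector]]:
    """Single-head scaled dot-product attention.

    Returns the new token representations and the attention weights (one row per query).
    """
    dim = len(keys[0])
    outputs: List[Vector] = []
    all_weights: List[Vector] = []
    for q in queries:
        # Compare this query with every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-ins for patch embeddings
    new_tokens, weights = attention(tokens, tokens, tokens)
    for row in weights:
        print([round(w, 3) for w in row])
```

If the first token were the class token, the first row of `weights` would be the kind of values an attention heatmap draws over the image.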
The second half of the blog is hands-on code. I fine-tuned Google's ViT-Base (86M parameters) on the Oxford-IIIT Pet dataset: 37 breeds, ~7,400 images.
𝐁𝐥𝐨𝐠 𝐋𝐢𝐧𝐤
https://vizuaranewsletter.com/p/vision-transformers?r=5b5pyd&utm_campaign=post&utm_medium=web
𝐒𝐨𝐦𝐞 𝐑𝐞𝐬𝐨𝐮𝐫𝐜𝐞𝐬
ViT paper dissection
https://youtube.com/watch?v=U_sdodhcBC4
Build ViT from Scratch
https://youtube.com/watch?v=ZRo74xnN2SI
Original Paper
https://arxiv.org/abs/2010.11929
https://t.me/CodeProgrammer