π Context Engineering for AI Agents: A Deep Dive
π Category: AGENTIC AI
π Date: 2026-04-07 | β±οΈ Read time: 8 min read
How to optimize context, a precious finite resource for AI agents
#DataScience #AI #Python
π The Arithmetic of Productivity Boosts: Why Does a β40% Increase in Productivityβ Never Actually Work?
π Category: DATA SCIENCE
π Date: 2026-04-07 | β±οΈ Read time: 5 min read
Why do grand productivity promises never actually deliver? Is every product just bad, or isβ¦
#DataScience #AI #Python
π Sber has released two open-source MoE models: GigaChat-3.1 Ultra and Lightning
Both code and weights are available under the MIT license on HuggingFace.
π Key details:
β’ Trained from scratch (not a finetune) on proprietary data and infrastructure
β’ Mixture-of-Experts (MoE) architecture
Models:
π§ GigaChat-3.1 Ultra
β’ 702B MoE model for high-performance environments
β’ Outperforms DeepSeek-V3-0324 and Qwen3-235B on math and reasoning benchmarks
β’ Supports FP8 training and MTP
β‘οΈ GigaChat-3.1 Lightning
β’ 10B model (1.8B active parameters)
β’ Outperforms Qwen3-4B and Gemma-3-4B on Sber benchmarks
β’ Efficient local inference
β’ Up to 256k context
Engineering highlights:
β’ Custom metric to detect and reduce generation loops
β’ DPO training moved to native FP8
β’ Improvements in post-training pipeline
β’ Identified and fixed a critical issue affecting evaluation quality
π Trained on 14 languages (optimized for English and Russian)
Use cases:
β’ chatbots
β’ AI assistants
β’ copilots
β’ internal ML systems
Sber provides a solid open foundation for developers to build production-ready AI systems with lower infrastructure costs.
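The Lightning numbers above (10B total parameters, 1.8B active) come from the Mixture-of-Experts design: a router sends each token to only a small top-k subset of experts, so most parameters sit idle on any given token. A toy numpy sketch of top-k gating, purely illustrative and not Sber's actual implementation:

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x       : (tokens, d) input activations
    gate_w  : (d, n_experts) router weights
    experts : list of (d, d) weight matrices, one per expert
    """
    logits = x @ gate_w                          # (tokens, n_experts) routing scores
    topk = np.argsort(logits, axis=1)[:, -k:]    # k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, topk[t]]
        w = np.exp(sel - sel.max())
        w /= w.sum()                             # softmax over selected experts only
        for wi, e in zip(w, topk[t]):
            out[t] += wi * (x[t] @ experts[e])
    return out, topk

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 16, 4
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y, topk = moe_forward(x, gate_w, experts, k=2)
# Only 2 of 16 experts run per token, so ~1/8 of expert compute is "active" --
# the same mechanism that lets a 10B model use only ~1.8B parameters per token.
```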
π Why AI Is Training on Its Own Garbage (and How to Fix It)
π Category: MACHINE LEARNING
π Date: 2026-04-08 | β±οΈ Read time: 7 min read
Deep Web Data Is the Gold We Canβt Touch, Yet
#DataScience #AI #Python
π Detecting Translation Hallucinations with Attention Misalignment
π Category: LARGE LANGUAGE MODELS
π Date: 2026-04-08 | β±οΈ Read time: 15 min read
A low-budget way to get token-level uncertainty estimation for neural machine translations
#DataScience #AI #Python
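The idea the title points at: when a translated token's cross-attention is spread thinly over the source instead of concentrating on a few source words, that token is suspect. One cheap score in that spirit is per-token attention entropy; this is my own illustration of the general idea, not necessarily the article's exact method:

```python
import numpy as np

def attention_entropy(attn):
    """Per-target-token entropy of a cross-attention matrix.

    attn : (target_len, source_len); each row is an attention distribution
           summing to 1. High entropy = diffuse attention, a cheap
           token-level uncertainty signal for the translation.
    """
    p = np.clip(attn, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

# A focused row (well-aligned token) vs a diffuse row (possible hallucination):
attn = np.array([
    [0.97, 0.01, 0.01, 0.01],   # token attending to one source word
    [0.25, 0.25, 0.25, 0.25],   # token attending everywhere: suspect
])
h = attention_entropy(attn)
# h[1] > h[0]: the diffuse token scores as more uncertain
```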
π How to Use Claude Code to Build a Minimum Viable Product
π Category: AGENTIC AI
π Date: 2026-04-08 | β±οΈ Read time: 8 min read
Learn how to effectively present product ideas by building MVPs with coding agents
#DataScience #AI #Python
Forwarded from Machine Learning with Python
βοΈ 10 Books to Understand How Large Language Models Function (2026)
1. Deep Learning
https://deeplearningbook.org
The definitive reference for neural networks, covering backpropagation, architectures, and foundational concepts.
2. Artificial Intelligence: A Modern Approach
https://aima.cs.berkeley.edu
A fundamental perspective on artificial intelligence as a comprehensive system.
3. Speech and Language Processing
https://web.stanford.edu/~jurafsky/slp3/
An in-depth examination of natural language processing, transformers, and linguistics.
4. Machine Learning: A Probabilistic Perspective
https://probml.github.io/pml-book/
An exploration of probabilities, statistics, and the theoretical foundations of machine learning.
5. Understanding Deep Learning
https://udlbook.github.io/udlbook/
A contemporary explanation of deep learning principles with strong intuitive insights.
6. Designing Machine Learning Systems
https://oreilly.com/library/view/designing-machine-learning/9781098107956/
Strategies for deploying models into production environments.
7. Generative Deep Learning
https://github.com/3p5ilon/ML-books/blob/main/generative-deep-learning-teaching-machines-to-paint-write-compose-and-play.pdf
Practical applications of generative models and transformer architectures.
8. Natural Language Processing with Transformers
https://dokumen.pub/natural-language-processing-with-transformers-revised-edition-1098136799-9781098136796-9781098103248.html
Methodologies for constructing natural language processing systems based on transformers.
9. Machine Learning Engineering
https://mlebook.com
Principles of machine learning engineering and operational deployment.
10. The Hundred-Page Machine Learning Book
https://themlbook.com
A highly concentrated foundational overview without extraneous detail. ππ€
π Grounding Your LLM: A Practical Guide to RAG for Enterprise Knowledge Bases
π Category: LARGE LANGUAGE MODELS
π Date: 2026-04-08 | β±οΈ Read time: 17 min read
A clear mental model and a practical foundation you can build on
#DataScience #AI #Python
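The mental model behind RAG is small: embed the query, retrieve the knowledge-base chunks with the highest cosine similarity, and put them into the prompt as grounding context. A minimal retriever over hypothetical toy vectors (no real embedding model, so the numbers are stand-ins):

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the k documents most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                         # cosine similarity per document
    return np.argsort(sims)[::-1][:k]    # best-first

docs = ["refund policy", "api rate limits", "holiday schedule"]
doc_vecs = np.array([[0.9, 0.1, 0.0],
                     [0.1, 0.9, 0.1],
                     [0.0, 0.2, 0.9]])
query_vec = np.array([0.1, 0.95, 0.05])   # e.g. "how many requests per second?"
top = retrieve(query_vec, doc_vecs, k=2)
# The retrieved chunks become the grounding context prepended to the LLM prompt:
context = "\n".join(docs[i] for i in top)
```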
How a University Student Built a Game Changing Bot for Polymarket β And You Can Use It Too
A computer science student built a bot that snipes trades before the market reacts! Meet Peter, who automated crypto trading by tracking blockchain data delays. He created the Oracle Lag Sniper to get in on Polymarket trades faster than anyone else.
β‘ Why it works:
β’ Super Fast Execution: Snipes trades before the market catches up
β’ Polymarket-Optimized: Built for speed & accuracy
β’ Open Source & Free: Tweak it as you wish
β’ Easy Setup: No tech skills required!
Start using the Oracle Lag Sniper today. Head to GitHub, set it up, and make smarter, quicker trades.
Sponsored by Polymarket Analytics
π A Visual Explanation of Linear Regression
π Category: DATA SCIENCE
π Date: 2026-04-09 | β±οΈ Read time: 107 min read
A long-form article featuring over 100 visualizations, covering a range of topics from how toβ¦
#DataScience #AI #Python
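For readers who want the two-line version behind all those visualizations: ordinary least squares has a closed form, and numpy solves it directly. A minimal sketch on synthetic data:

```python
import numpy as np

def fit_ols(x, y):
    """Closed-form OLS for y = slope*x + intercept (minimize squared residuals)."""
    X = np.column_stack([x, np.ones_like(x)])     # design matrix [x, 1]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                                    # (slope, intercept)

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 200)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=x.size)   # known line + noise
slope, intercept = fit_ols(x, y)
# recovered coefficients are close to the true slope 3.0 and intercept 2.0
```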
π How Vision-Language-Action (VLA) Models Work
π Category: ARTIFICIAL INTELLIGENCE
π Date: 2026-04-09 | β±οΈ Read time: 18 min read
The mathematical foundations of Vision-Language-Action (VLA) models for humanoid robots and more
#DataScience #AI #Python
π A Survival Analysis Guide with Python: Using Time-To-Event Models to Forecast Customer Lifetime
π Category: DATA SCIENCE
π Date: 2026-04-09 | β±οΈ Read time: 13 min read
Understand survival analysis by modeling customer retention through Kaplan-Meier curves and Cox Proportional Hazard regressions.
#DataScience #AI #Python
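The Kaplan-Meier estimator mentioned above multiplies, at each observed churn time, the fraction of still-at-risk customers who survive it, which handles censored customers (still subscribed) correctly. A from-scratch sketch on toy data; the article presumably uses a library such as lifelines:

```python
import numpy as np

def kaplan_meier(durations, churned):
    """Kaplan-Meier survival curve: S(t) = prod over event times of (1 - d_i/n_i).

    durations : observed time per customer (e.g. months subscribed)
    churned   : 1 if the customer churned at that time, 0 if censored
    """
    durations = np.asarray(durations, float)
    churned = np.asarray(churned, int)
    times = np.unique(durations[churned == 1])     # distinct churn times
    surv, s = [], 1.0
    for t in times:
        n_at_risk = (durations >= t).sum()                    # subscribed just before t
        d_events = ((durations == t) & (churned == 1)).sum()  # churned exactly at t
        s *= 1.0 - d_events / n_at_risk
        surv.append(s)
    return times, np.array(surv)

# 6 customers: churn at months 2, 3, 3; censored (still active) at 4, 5, 5
times, surv = kaplan_meier([2, 3, 3, 4, 5, 5], [1, 1, 1, 0, 0, 0])
# S(2) = 5/6; S(3) = 5/6 * 3/5 = 0.5 -- half the cohort survives past month 3
```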
π The Future of AI for Sales Is Diverse and Distributed
π Category: ARTIFICIAL INTELLIGENCE
π Date: 2026-04-09 | β±οΈ Read time: 11 min read
True creativity and innovation will come from human-agent collaboration. One human, millions of agents.
#DataScience #AI #Python
π Why MLOps Retraining Schedules Fail β Models Donβt Forget, They Get Shocked
π Category: MACHINE LEARNING
π Date: 2026-04-10 | β±οΈ Read time: 17 min read
We fitted the Ebbinghaus forgetting curve to 555,000 real fraud transactions and got RΒ² =β¦
#DataScience #AI #Python
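The Ebbinghaus curve the authors fit has the form R(t) = exp(-t/s), so taking logs turns the fit into linear least squares. A sketch of the method on synthetic decay data (the mechanics only, not the article's 555,000-transaction experiment):

```python
import numpy as np

def fit_forgetting_curve(t, r):
    """Fit R(t) = exp(-t / s) by least squares on log(R).

    Returns the estimated stability s and the R^2 of the log-linear fit.
    """
    logr = np.log(r)
    slope, intercept = np.polyfit(t, logr, 1)   # log R = -t/s  =>  slope = -1/s
    s = -1.0 / slope
    pred = slope * t + intercept
    ss_res = ((logr - pred) ** 2).sum()
    ss_tot = ((logr - logr.mean()) ** 2).sum()
    return s, 1.0 - ss_res / ss_tot

# Synthetic "model performance vs days since training", true stability s = 30
t = np.arange(1, 61, dtype=float)
noise = np.random.default_rng(0).normal(0, 0.02, t.size)
r = np.exp(-t / 30.0) * np.exp(noise)
s_hat, r2 = fit_forgetting_curve(t, r)
# s_hat recovers ~30 with R^2 close to 1 on this clean synthetic data
```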
π A Guide to Voice Cloning on Voxtral with a Missing Encoder
π Category: LARGE LANGUAGE MODELS
π Date: 2026-04-10 | β±οΈ Read time: 13 min read
Can we reconstruct audio codes if we have audio for the Voxtral text-to-speech model?
#DataScience #AI #Python
π How Does AI Learn to See in 3D and Understand Space?
π Category: ARTIFICIAL INTELLIGENCE
π Date: 2026-04-10 | β±οΈ Read time: 19 min read
How depth estimation, foundation segmentation, and geometric fusion are converging into spatial intelligence
#DataScience #AI #Python
Forwarded from Machine Learning with Python
π 12 Essential Articles for Data Scientists
π· Article: Seq2Seq Learning with NN
https://arxiv.org/pdf/1409.3215
An introduction to Seq2Seq models, which serve as the foundation for machine translation utilizing deep learning.
π· Article: GANs
https://arxiv.org/pdf/1406.2661
An introduction to Generative Adversarial Networks (GANs) and the concept of generating synthetic data. This forms the basis for creating images and videos with artificial intelligence.
π· Article: Attention is All You Need
https://arxiv.org/pdf/1706.03762
This paper was revolutionary in natural language processing. It introduced the Transformer architecture, which underlies GPT, BERT, and contemporary intelligent language models.
π· Article: Deep Residual Learning
https://arxiv.org/pdf/1512.03385
This work introduced the ResNet model, enabling neural networks to achieve greater depth and accuracy without compromising the learning process.
π· Article: Batch Normalization
https://arxiv.org/pdf/1502.03167
This paper introduced a technique that facilitates faster and more stable training of neural networks.
π· Article: Dropout
https://jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
A straightforward method designed to prevent overfitting in neural networks.
π· Article: ImageNet Classification with DCNN
https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
The first successful application of a deep neural network for image recognition.
π· Article: Support-Vector Machines
https://link.springer.com/content/pdf/10.1007/BF00994018.pdf
This seminal work introduced the Support Vector Machine (SVM) algorithm, a widely utilized method for data classification.
π· Article: A Few Useful Things to Know About ML
https://homes.cs.washington.edu/~pedro/papers/cacm12.pdf
A comprehensive collection of practical and empirical insights regarding machine learning.
π· Article: Gradient Boosting Machine
https://www.cse.iitb.ac.in/~soumen/readings/papers/Friedman1999GreedyFuncApprox.pdf
This paper introduced the "Gradient Boosting" method, which serves as the foundation for many modern machine learning models, including XGBoost and LightGBM.
π· Article: Latent Dirichlet Allocation
https://jmlr.org/papers/volume3/blei03a/blei03a.pdf
This work introduced a model for text analysis capable of identifying the topics discussed within an article.
π· Article: Random Forests
https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf
This paper introduced the "Random Forest" algorithm, a powerful machine learning method that aggregates multiple models to achieve enhanced accuracy.
https://t.me/CodeProgrammer
π When Things Get Weird with Custom Calendars in Tabular Models
π Category: POWER BI
π Date: 2026-04-10 | β±οΈ Read time: 10 min read
Since September 2025, we have had Calendar-based Time Intelligence in Power BI and Fabric Tabularβ¦
#DataScience #AI #Python