Artem Ryblov’s Data Science Weekly

Neural Networks: Zero to Hero by Andrej Karpathy

A course by Andrej Karpathy on building neural networks, from scratch, in code.

"We start with the basics of backpropagation and build up to modern deep neural networks, like GPT. In my opinion language models are an excellent place to learn deep learning, even if your intention is to eventually go to other areas like computer vision because most of what you learn will be immediately transferable. This is why we dive into and focus on language models."

Prerequisites:
- Solid programming (Python)
- Intro-level math (e.g. derivative, Gaussian)

Current Syllabus:
- The spelled-out intro to neural networks and backpropagation: building micrograd
- The spelled-out intro to language modeling: building makemore
- Building makemore Part 2: MLP
- Building makemore Part 3: Activations & Gradients, BatchNorm
- Building makemore Part 4: Becoming a Backprop Ninja
- Building makemore Part 5: Building a WaveNet
- Let's build GPT: from scratch, in code, spelled out.
- ongoing...
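
To give a flavour of the first lecture's topic, here is a minimal sketch of scalar backpropagation in the spirit of micrograd. It is not the course's code: the `Value` class, its attribute names, and the one-neuron example are assumptions made for illustration, and only addition, multiplication, and tanh are supported.

```python
import math


class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph so that each node is processed
        # only after every node that depends on it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # dy/dy = 1
        for node in reversed(order):
            node._backward()


# One neuron (hypothetical values): y = tanh(w * x + b)
x, w, b = Value(2.0), Value(-0.5), Value(0.25)
y = (w * x + b).tanh()
y.backward()
print(y.data, x.grad, w.grad, b.grad)
```

Gradients are accumulated with `+=` so that a value used more than once in an expression receives the sum of all its contributions, which is the chain rule for shared inputs.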

Links:
- https://karpathy.ai/zero-to-hero.html
- https://github.com/karpathy/nn-zero-to-hero/tree/master

Navigational hashtags: #armknowledgesharing #armcourses
General hashtags: #deeplearning #mlp #batchnorm #backprop #gpt #fromscratch #neuralnetworks #python

@data_science_weekly


Prompt Engineering Guide by OpenAI

This guide shares strategies and tactics for getting better results from large language models (sometimes referred to as GPT models) like GPT-4. The methods described here can sometimes be deployed in combination for greater effect. We encourage experimentation to find the methods that work best for you.

Some of the examples demonstrated here currently work only with our most capable model, gpt-4. In general, if you find that a model fails at a task and a more capable model is available, it's often worth trying again with the more capable model.
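
The advice to retry with a more capable model can be sketched as a simple fallback loop. This is an illustration only, not code from the guide: `first_acceptable_answer`, `fake_call_model`, and the model names `small-model` and `large-model` are hypothetical, and the stand-in function replaces a real API call so the sketch runs offline.

```python
from typing import Callable, Optional, Sequence, Tuple


def first_acceptable_answer(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    is_acceptable: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Try models in the given order (least to most capable).

    Returns (model, answer) for the first acceptable answer,
    or (None, None) if no model produced one.
    """
    for model in models:
        answer = call_model(model, prompt)
        if is_acceptable(answer):
            return model, answer
    return None, None


def fake_call_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real API call; returns canned replies.
    canned = {"small-model": "I am not sure.", "large-model": "4"}
    return canned[model]


model, answer = first_acceptable_answer(
    "What is 2 + 2? Reply with the number only.",
    ["small-model", "large-model"],
    fake_call_model,
    lambda text: text.strip().isdigit(),  # task-specific acceptance check
)
print(model, answer)
```

The acceptance check is the part that matters in practice: without some way to detect a failed answer, there is no signal for when to escalate.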

Link: https://platform.openai.com/docs/guides/prompt-engineering

Navigational hashtags: #armknowledgesharing #armtutorials
General hashtags: #llm #openai #prompts #promptengineering #gpt #gpt3 #gpt4

@data_science_weekly



Machine Learning Engineering Online Book by Stas Bekman

An open collection of methodologies to help with successful training of large language models and multi-modal models.

This is technical material suitable for LLM/VLM training engineers and operators. That is, the content here contains lots of scripts and copy-n-paste commands to enable you to quickly address your needs.

This repo is an ongoing brain dump of Stas's experience training Large Language Models (LLMs) and VLMs; much of the know-how was acquired while training the open-source BLOOM-176B model in 2022 and the IDEFICS-80B multi-modal model in 2023. Currently, he is working on developing/training open-source Retrieval Augmented models at Contextual.AI.

Table of Contents
Part 1. Insights
- The AI Battlefield Engineering - What You Need To Know
Part 2. Key Hardware Components
- Accelerator - the workhorses of ML - GPUs, TPUs, IPUs, FPGAs, HPUs, QPUs, RDUs (WIP)
- Network - intra-node and inter-node connectivity, calculating bandwidth requirements
- IO - local and distributed disks and filesystems
- CPU - cpus, affinities (WIP)
- CPU Memory - how much CPU memory is enough - the shortest chapter ever.
Part 3. Performance
- Fault Tolerance
- Performance
- Multi-Node networking
- Model parallelism
Part 4. Operating
- SLURM
- Training hyper-parameters and model initializations
- Instabilities
Part 5. Development
- Debugging software and hardware failures
- And more debugging
- Reproducibility
- Tensor precision / Data types
- HF Transformers notes - making small models, tokenizers, datasets, and other tips
Part 6. Miscellaneous
- Resources - LLM/VLM chronicles

Link: https://github.com/stas00/ml-engineering

Navigational hashtags: #armknowledgesharing #armbooks #armrepo
General hashtags: #llm #gpt #gpt3 #gpt4 #ml #engineering #mlsystemdesign #systemdesign #reproducibility #performance

@data_science_weekly



What are embeddings? by Vicki Boykis

Over the past decade, embeddings (numerical representations of machine learning features used as input to deep learning models) have become a foundational data structure in industrial machine learning systems. TF-IDF, PCA, and one-hot encoding have always been key tools in machine learning systems as ways to compress and make sense of large amounts of textual data. However, traditional approaches were limited in the amount of context they could reason about with increasing amounts of data. As the volume, velocity, and variety of data captured by modern applications have exploded, creating approaches specifically tailored to scale has become increasingly important.

Google’s Word2Vec paper made an important step in moving from simple statistical representations to the semantic meaning of words. The subsequent rise of the Transformer architecture and transfer learning, as well as the latest surge in generative methods, has enabled the growth of embeddings as a foundational machine learning data structure. This survey paper aims to provide a deep dive into what embeddings are, their history, and usage patterns in industry.
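
As a rough illustration of the "simple statistical representations" mentioned above, the sketch below builds one-hot and TF-IDF vectors for a toy corpus in plain Python and compares documents by cosine similarity. It is not taken from the paper: the corpus, the function names, and the unsmoothed `log(N / df)` form of IDF (one of several common variants) are assumptions.

```python
import math
from collections import Counter

# Toy corpus (hypothetical).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "embeddings map words to vectors",
]

tokenized = [doc.split() for doc in docs]
vocab = sorted({tok for doc in tokenized for tok in doc})


def one_hot(token):
    # A vector with a single 1 at the token's position in the vocabulary.
    return [1 if token == v else 0 for v in vocab]


def idf(token):
    # Unsmoothed inverse document frequency: log(N / document frequency).
    df = sum(1 for doc in tokenized if token in doc)
    return math.log(len(tokenized) / df)


def tfidf(doc_tokens):
    # Term frequency (count / document length) times IDF, per vocabulary entry.
    counts = Counter(doc_tokens)
    return [counts[v] / len(doc_tokens) * idf(v) for v in vocab]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


vectors = [tfidf(doc) for doc in tokenized]
sim_cat_dog = cosine(vectors[0], vectors[1])  # the two documents share some words
sim_cat_emb = cosine(vectors[0], vectors[2])  # the two documents share no words
print(sim_cat_dog, sim_cat_emb)
```

Vectors like these only register exact word overlap, so "cat" and "dog" are as unrelated as "cat" and "vectors". Learned embeddings such as Word2Vec aim to close that gap by placing related words near each other.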

Link: https://vickiboykis.com/what_are_embeddings/index.html

Navigational hashtags: #armknowledgesharing #armbooks
General hashtags: #dl #deeplearning #pytorch #embeddings #tfidf #svd #pca #word2vec #cbow #skipgram #bert #gpt #llm #transformers

@data_science_weekly



🧠 Awesome ChatGPT Prompts

Welcome to the "Awesome ChatGPT Prompts" repository! This is a collection of prompt examples to be used with the ChatGPT model.

The ChatGPT model is a large language model trained by OpenAI that is capable of generating human-like text. By providing it with a prompt, it can generate responses that continue the conversation or expand on the given prompt.

In this repository, you will find a variety of prompts that can be used with ChatGPT.

To get started, simply clone this repository and use the prompts in the README.md file as input for ChatGPT. You can also use the prompts in this file as inspiration for creating your own.

Link: Direct

Navigational hashtags: #armknowledgesharing #armrepo
General hashtags: #prompts #prompt #promptengineering #chatgpt #gpt

@data_science_weekly

