lucidrains/toolformer-pytorch
Implementation of Toolformer, Language Models That Can Use Tools, by Meta AI
Language: Python
#api_calling #artificial_intelligence #attention_mechanisms #deep_learning #transformers
Stars: 419 Issues: 2 Forks: 13
https://github.com/lucidrains/toolformer-pytorch
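To make the idea concrete, here is a minimal sketch of the tool-calling loop Toolformer learns, not this repo's API: the model emits inline calls such as `[Calculator(...)]`, which are executed and spliced back into the text. The bracket syntax, the `calculator` tool, and all names are illustrative assumptions.

```python
import re

def calculator(expr: str) -> str:
    # Hypothetical tool: evaluate a basic arithmetic expression.
    return str(eval(expr, {"__builtins__": {}}, {}))

TOOLS = {"Calculator": calculator}
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")

def fill_tool_calls(text: str) -> str:
    # Replace every "[Tool(args)]" span with "[Tool(args) -> result]",
    # mimicking how Toolformer splices API results back into the stream.
    def run(m: re.Match) -> str:
        name, args = m.group(1), m.group(2)
        return f"[{name}({args}) -> {TOOLS[name](args)}]"
    return CALL.sub(run, text)

print(fill_tool_calls("The answer is [Calculator(3 * 7)]."))
# The answer is [Calculator(3 * 7) -> 21].
```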
lucidrains/recurrent-memory-transformer-pytorch
Implementation of the Recurrent Memory Transformer (NeurIPS 2022) in PyTorch
Language: Python
#artificial_intelligence #attention_mechanisms #deep_learning #long_context #memory #recurrence #transformers
Stars: 223 Issues: 0 Forks: 4
https://github.com/lucidrains/recurrent-memory-transformer-pytorch
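A toy sketch of the segment-level recurrence the paper proposes, not this repo's API: dedicated memory tokens are prepended to each segment, and their output states are carried into the next segment. The layer choice, dimensions, and zero-initialized memory are illustrative assumptions.

```python
import torch
import torch.nn as nn

dim, num_mem, seg_len = 64, 4, 16
layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)

def process(segments):
    memory = torch.zeros(1, num_mem, dim)    # initial memory state
    outs = []
    for seg in segments:                     # seg: (1, seg_len, dim)
        x = torch.cat([memory, seg], dim=1)  # memory tokens + segment tokens
        y = layer(x)
        memory = y[:, :num_mem]              # updated memory flows onward
        outs.append(y[:, num_mem:])
    return outs

segs = [torch.randn(1, seg_len, dim) for _ in range(3)]
print([o.shape for o in process(segs)])      # three (1, 16, 64) outputs
```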
lucidrains/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
Language: Python
#artificial_intelligence #attention_mechanisms #deep_learning #learned_tokenization #long_context #transformers
Stars: 204 Issues: 0 Forks: 10
https://github.com/lucidrains/MEGABYTE-pytorch
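A compact sketch of the two-scale idea, under illustrative assumptions (causal masking and the byte/patch offsets of the real model are omitted): a global transformer attends over patch embeddings, and a local transformer predicts the bytes inside each patch, conditioned on the global output.

```python
import torch
import torch.nn as nn

P, D = 8, 64                                # patch size, model width
byte_emb = nn.Embedding(256, D)
global_layer = nn.TransformerEncoderLayer(P * D, nhead=4, batch_first=True)
local_layer = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True)
to_logits = nn.Linear(D, 256)

x = torch.randint(0, 256, (1, 4 * P))       # (batch, seq) of raw bytes
e = byte_emb(x)                             # (1, 32, D)
patches = e.reshape(1, -1, P * D)           # one flat vector per patch
g = global_layer(patches).reshape(1, -1, P, D)
local_in = e.reshape(1, -1, P, D) + g       # condition local model on global
logits = to_logits(local_layer(local_in.flatten(0, 1)))
print(logits.shape)                         # (4, P, 256): per-byte predictions
```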
lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Language: Python
#artificial_intelligence #attention_mechanism #audio_generation #deep_learning #non_autoregressive #transformers
Stars: 181 Issues: 0 Forks: 6
https://github.com/lucidrains/soundstorm-pytorch
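SoundStorm decodes audio tokens non-autoregressively with MaskGIT-style confidence-based unmasking. A minimal sketch of that decoding loop, with a random network standing in for the conformer and all names assumed for illustration:

```python
import torch

def parallel_decode(logits_fn, length, mask_id, steps=8):
    tokens = torch.full((length,), mask_id)
    for step in range(steps):
        masked = tokens == mask_id
        probs, preds = logits_fn(tokens).softmax(-1).max(-1)
        probs[~masked] = -1.0                      # never overwrite kept tokens
        target = int(length * (step + 1) / steps)  # unmasking schedule
        n_new = target - int((~masked).sum())      # tokens to commit this step
        idx = probs.argsort(descending=True)[:max(n_new, 0)]
        tokens[idx] = preds[idx]
    return tokens

toy_model = lambda t: torch.randn(t.numel(), 32)   # stand-in for the network
print(parallel_decode(toy_model, length=16, mask_id=32))
```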
kyegomez/LongNet
Implementation of the plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
Language: Python
#artificial_intelligence #attention #attention_is_all_you_need #attention_mechanisms #chatgpt #context_length #gpt3 #gpt4 #machine_learning #transformer
Stars: 381 Issues: 4 Forks: 55
https://github.com/kyegomez/LongNet
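A sketch of a single dilated-attention branch, assuming window size w and dilation r; the full model mixes several (w, r) configurations across heads so every position is covered, and uses causal masking, both of which this sketch omits:

```python
import torch
import torch.nn.functional as F

def dilated_attention(q, k, v, w=8, r=2):
    # q, k, v: (batch, seq, dim); positions skipped by this (w, r) branch
    # stay zero and would be covered by other branches in the full model.
    n = q.shape[1]
    out = torch.zeros_like(q)
    for start in range(0, n, w):
        idx = torch.arange(start, start + w, r)  # dilated positions in window
        out[:, idx] = F.scaled_dot_product_attention(
            q[:, idx], k[:, idx], v[:, idx])
    return out

q = k = v = torch.randn(1, 32, 16)
print(dilated_attention(q, k, v).shape)          # (1, 32, 16)
```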
lucidrains/meshgpt-pytorch
Implementation of MeshGPT, SOTA mesh generation using attention, in PyTorch
Language: Python
#artificial_intelligence #attention_mechanisms #deep_learning #mesh_generation #transformers
Stars: 195 Issues: 0 Forks: 7
https://github.com/lucidrains/meshgpt-pytorch
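A sketch of the mesh-as-token-sequence idea, with a caveat: the actual repo learns its vocabulary with a residual-VQ autoencoder over face embeddings, whereas this illustration simply discretizes raw vertex coordinates.

```python
import torch

def mesh_to_tokens(vertices, faces, bins=128):
    # vertices: (V, 3) floats in [0, 1]; faces: (F, 3) vertex indices.
    quant = (vertices * (bins - 1)).round().long()  # discretize coordinates
    return quant[faces].reshape(-1)                 # 9 tokens per triangle

verts = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [1, 2, 3]])
tokens = mesh_to_tokens(verts, faces)               # feed to a GPT decoder
print(tokens.shape)                                 # (18,)
```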
kyegomez/MultiModalMamba
Fuses a ViT with Mamba into a fast, high-performance multi-modal model, built on the Zeta framework.
Language: Python
#ai #artificial_intelligence #attention_mechanism #machine_learning #mamba #ml #pytorch #ssm #torch #transformer_architecture #transformers #zeta
Stars: 264 Issues: 0 Forks: 9
https://github.com/kyegomez/MultiModalMamba
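A conceptual sketch of the fusion pattern, making no claims about the Zeta API: image patches are embedded, concatenated with text tokens, and run through a sequence backbone. A GRU stands in for the Mamba SSM block here purely for illustration.

```python
import torch
import torch.nn as nn

dim = 64
patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)  # crude ViT stem
text_embed = nn.Embedding(1000, dim)
backbone = nn.GRU(dim, dim, batch_first=True)               # stand-in for Mamba

img = torch.randn(1, 3, 64, 64)
txt = torch.randint(0, 1000, (1, 12))
img_tokens = patch_embed(img).flatten(2).transpose(1, 2)    # (1, 16, dim)
fused = torch.cat([img_tokens, text_embed(txt)], dim=1)     # (1, 28, dim)
out, _ = backbone(fused)
print(out.shape)                                            # (1, 28, 64)
```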
thu-ml/SageAttention
Quantized attention that achieves speedups of 2.1x and 2.7x over FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Language: Python
#attention #inference_acceleration #llm #quantization
Stars: 145 Issues: 6 Forks: 3
https://github.com/thu-ml/SageAttention
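A sketch of the core idea, not the repo's CUDA kernels: Q and K are quantized to the INT8 range (held in float here for simplicity) so the score matmul can run on integer units, and the scores are rescaled before the softmax; the P·V matmul stays in floating point. The per-tensor scaling shown is a simplification of the paper's scheme.

```python
import torch

def int8_attention(q, k, v):
    sq, sk = q.abs().max() / 127, k.abs().max() / 127      # per-tensor scales
    qi = (q / sq).round().clamp(-127, 127)                 # INT8-range values
    ki = (k / sk).round().clamp(-127, 127)
    scores = (qi @ ki.transpose(-1, -2)) * (sq * sk)       # dequantized scores
    attn = (scores / q.shape[-1] ** 0.5).softmax(-1)
    return attn @ v                                        # P·V stays in float

q, k, v = (torch.randn(1, 4, 32, 16) for _ in range(3))
exact = ((q @ k.transpose(-1, -2)) / 16 ** 0.5).softmax(-1) @ v
print((int8_attention(q, k, v) - exact).abs().max())       # small error
```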
lucidrains/native-sparse-attention-pytorch
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Language: Python
#artificial_intelligence #attention #deep_learning #sparse_attention
Stars: 341 Issues: 3 Forks: 9
https://github.com/lucidrains/native-sparse-attention-pytorch
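A sketch of the block-selection branch, under simplifying assumptions: key blocks are summarized by their mean and each query attends only inside its top-k blocks, while the paper's compressed-attention and sliding-window branches, learned gates, and causal masking are all omitted.

```python
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block=8, topk=2):
    # q: (m, d) queries; k, v: (n, d) with n divisible by block.
    kb = k.reshape(-1, block, k.shape[-1]).mean(1)  # block summaries
    sel = (q @ kb.T).topk(topk, dim=-1).indices     # top blocks per query
    out = torch.zeros_like(q)
    for i in range(q.shape[0]):
        idx = (sel[i, :, None] * block + torch.arange(block)).reshape(-1)
        out[i] = F.scaled_dot_product_attention(
            q[i, None, None], k[idx][None], v[idx][None])[0, 0]
    return out

q, k, v = torch.randn(4, 16), torch.randn(32, 16), torch.randn(32, 16)
print(block_sparse_attention(q, k, v).shape)        # (4, 16)
```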
therealoliver/Deepdive-llama3-from-scratch
Implement Llama 3 inference step by step: grasp the core concepts, master the derivations, and write the code.
Language: Jupyter Notebook
#attention #attention_mechanism #gpt #inference #kv_cache #language_model #llama #llm_configuration #llms #mask #multi_head_attention #positional_encoding #residuals #rms #rms_norm #rope #rotary_position_encoding #swiglu #tokenizer #transformer
Stars: 388 Issues: 0 Forks: 28
https://github.com/therealoliver/Deepdive-llama3-from-scratch
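As a taste of what the notebook derives, here is a minimal rotary position embedding (RoPE) in plain PyTorch, using the interleaved-pair formulation; this is an illustrative sketch, not code from the notebook.

```python
import torch

def rope(x, base=10000.0):
    # x: (seq, dim) with dim even; rotate channel pairs by position angles.
    seq, dim = x.shape
    inv_freq = base ** (-torch.arange(0, dim, 2) / dim)
    ang = torch.arange(seq)[:, None] * inv_freq[None, :]   # (seq, dim/2)
    cos, sin = ang.cos(), ang.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

print(rope(torch.randn(8, 16)).shape)                      # (8, 16)
```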