microsoft/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Language: Python
Stars trend:
11 Apr 2023
5pm ▎ 2
6pm ▏ 1
7pm ▎ 2
8pm ▋ 5
9pm █▋ 13
10pm ▌ 4
11pm █▊ 14
12 Apr 2023
12am █▊ 14
1am ██▍ 19
2am ██▊ 22
3am ████▉ 39
4am ██▌ 20
#python
#billionparameters, #compression, #dataparallelism, #deeplearning, #gpu, #inference, #machinelearning, #mixtureofexperts, #modelparallelism, #pipelineparallelism, #pytorch, #trillionparameters, #zero
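For orientation, here is a minimal, hedged sketch of how DeepSpeed is typically wired into a PyTorch training loop via deepspeed.initialize; the toy model, ZeRO config values, and batch shapes are placeholders rather than anything from the repo, and the script would normally be launched with the deepspeed CLI rather than plain python.
```python
# Minimal DeepSpeed training sketch (model, config, and data are placeholders).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real network

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO stage 2 partitions optimizer state + gradients
}

# deepspeed.initialize wraps the model in an engine that handles ZeRO,
# mixed precision, and distributed data parallelism.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for step in range(10):
    x = torch.randn(8, 1024, device=engine.device, dtype=torch.half)
    loss = engine(x).float().pow(2).mean()
    engine.backward(loss)  # the engine owns loss scaling and gradient partitioning
    engine.step()
```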
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 268
Stars trend:
20 Jun 2023
12pm ▎ +2
1pm ▍ +3
2pm ▏ +1
3pm +0
4pm +0
5pm +0
6pm +0
7pm ▊ +6
8pm ███▉ +31
9pm ████████ +64
10pm ███████▍ +59
11pm ██████▎ +50
#python
#gpt, #inference, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #transformer
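For context, vLLM's offline batched-generation entry point looks roughly like the sketch below; the model name, prompts, and sampling values are placeholders.
```python
# Offline batched generation with vLLM (model name and params are placeholders).
from vllm import LLM, SamplingParams

prompts = [
    "The capital of France is",
    "In one sentence, explain KV-cache paging:",
]
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# LLM() loads the model and manages the paged KV-cache blocks internally.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(prompts, sampling)

for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```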
guillaumekln/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python
Total stars: 3284
Stars trend:
19 Jul 2023
1am ▍ +3
2am ▍ +3
3am ▎ +2
4am ██▌ +20
5am ██████▊ +54
6am █████▏ +41
7am ██▏ +17
8am ██▉ +23
9am █▍ +11
10am █ +8
11am ▉ +7
12pm █▍ +11
#python
#deeplearning, #inference, #openai, #quantization, #speechrecognition, #speechtotext, #transformer, #whisper
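A short, hedged usage sketch of the CTranslate2-backed transcription API; the audio path, model size, and compute type below are placeholders.
```python
# Transcription with faster-whisper (file path and model size are placeholders).
from faster_whisper import WhisperModel

# CTranslate2 backend; int8 keeps memory low on CPU, float16 suits GPUs.
model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")

for segment in segments:  # segments is a generator; decoding happens lazily
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```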
tairov/llama2.mojo
Inference Llama 2 in one file of pure 🔥
Total stars: 280
Stars trend:
11 Sep 2023
10pm ██▌ +20
11pm █▌ +12
12 Sep 2023
12am █▎ +10
1am ██▍ +19
2am ██ +16
3am █▉ +15
4am ██ +16
5am ███▏ +25
6am ███▌ +28
7am ██ +16
8am ███▏ +25
9am ███▋ +29
#inference, #llama, #llama2, #modular, #mojo, #parallelize, #performance, #simd, #tensor, #vectorization
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
Total stars: 162
Stars trend:
8 Jan 2024
1am ▎ +2
2am ▌ +4
3am ███ +24
4am ██▌ +20
5am ████▌ +36
6am ███ +24
7am ██▍ +19
8am ██▉ +23
#python
#artificialintelligence, #deeplearning, #gpt, #inference, #llama, #llama2, #llminference, #llmserving
microsoft/aici
AICI: Prompts as (Wasm) Programs
Language: Rust
Total stars: 213
Stars trend:
11 Mar 2024
6pm ██▏ +17
7pm ███▎ +26
8pm ████▊ +38
#rust
#ai, #inference, #languagemodel, #llm, #llmframework, #llminference, #llmserving, #llmops, #modelserving, #rust, #transformer, #wasm, #wasmtime
cncf/llm-in-action
🤖 Discover how to apply your LLM app skills on Kubernetes!
Language: Python
Total stars: 106
Stars trend:
20 Mar 2024
8am █████████████ +104
#python
#cloudnative, #inference, #llm
huggingface/text-generation-inference
Large Language Model Text Generation Inference
Language: Python
Total stars: 7640
Stars trend:
9 Apr 2024
9pm ▊ +6
10pm ▊ +6
11pm ▍ +3
10 Apr 2024
12am ▋ +5
1am ▌ +4
2am █▎ +10
3am █▌ +12
4am ▉ +7
5am ▊ +6
6am ▋ +5
7am ▊ +6
8am ▉ +7
#python
#bloom, #deeplearning, #falcon, #gpt, #inference, #nlp, #pytorch, #starcoder, #transformer
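Text-generation-inference is usually run as a server (commonly via its Docker image) and queried over HTTP; the sketch below follows the documented /generate route, with the host, port, and model id as placeholder assumptions.
```python
# Querying a running text-generation-inference server over HTTP.
# Assumes a TGI container is already serving on localhost:8080, e.g.:
#   docker run --gpus all -p 8080:80 ghcr.io/huggingface/text-generation-inference \
#       --model-id mistralai/Mistral-7B-Instruct-v0.2
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is Deep Learning?",
        "parameters": {"max_new_tokens": 50, "temperature": 0.7},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```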
NVIDIA/TensorRT
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Language: C++
Total stars: 9485
Stars trend:
14 Jun 2024
10am ▌ +4
11am ███▊ +30
12pm ████▉ +39
1pm ████▋ +37
#cplusplus
#deeplearning, #gpuacceleration, #inference, #nvidia, #tensorrt
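As a rough illustration of the workflow, a hedged sketch of building a serialized engine from an ONNX file with the TensorRT Python bindings; the file names are placeholders and the API names follow the TensorRT 8/9-era interface, so details may differ across versions.
```python
# Building a TensorRT engine from an ONNX model (file names are placeholders).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 kernels where supported

# The serialized engine can be written out and deserialized later by the runtime.
engine_bytes = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(engine_bytes)
```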
AgibotTech/agibot_x1_infer
The inference module for AgiBot X1.
Language: C++
Total stars: 107
Stars trend:
24 Oct 2024
2am ▏ +1
3am ▉ +7
4am ██ +16
5am ██▏ +17
6am ██▏ +17
7am ███▉ +31
#cplusplus
#inference, #opensource, #robotics
adithya-s-k/AI-Engineering.academy
Mastering Applied AI, One Concept at a Time
Language: Jupyter Notebook
Total stars: 262
Stars trend:
3 Dec 2024
7pm ▎ +2
8pm ▍ +3
9pm +0
10pm ▏ +1
11pm ▍ +3
4 Dec 2024
12am ▊ +6
1am █▏ +9
2am █▉ +15
3am █▏ +9
4am █▍ +11
5am █▍ +11
6am █▍ +11
#jupyternotebook
#finetuning, #finetuningllms, #inference, #largelanguagemodels, #llm, #python, #quantization
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 34142
Stars trend:
20 Jan 2025
8pm ▎ +2
9pm ▎ +2
10pm █▏ +9
11pm ▍ +3
21 Jan 2025
12am ▎ +2
1am █ +8
2am ▉ +7
3am ▉ +7
4am █▌ +12
5am ▊ +6
6am █▎ +10
7am █▍ +11
#python
#amd, #cuda, #gpt, #hpu, #inference, #inferentia, #llama, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #rocm, #tpu, #trainium, #transformer, #xpu
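By this point the project also ships an OpenAI-compatible server; the hedged sketch below assumes a locally started server and uses a placeholder model name and port.
```python
# Querying vLLM's OpenAI-compatible server with the standard openai client.
# Assumes the server was started separately, e.g.:
#   vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize PagedAttention in one line."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```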
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python
Total stars: 8931
Stars trend:
7 Feb 2025
10pm ▊ +6
11pm ▊ +6
8 Feb 2025
12am ▍ +3
1am █ +8
2am █▋ +13
3am █▎ +10
4am ▌ +4
5am █▊ +14
6am █▋ +13
7am █ +8
8am █▊ +14
9am █▊ +14
#python
#cuda, #deepseek, #deepseekllm, #deepseekv3, #inference, #llama, #llama2, #llama3, #llama31, #llava, #llm, #llmserving, #moe, #pytorch, #transformer, #vlm
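A hedged sketch of SGLang's frontend DSL run against a locally launched server; the launch command, endpoint URL, model path, and prompt are placeholder assumptions.
```python
# SGLang frontend sketch: a small program executed against a local SGLang server.
# Assumes the server was launched separately, e.g.:
#   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
import sglang as sgl

@sgl.function
def qa(s, question):
    # Build the prompt incrementally; sgl.gen marks where the model fills in text.
    s += "Q: " + question + "\n"
    s += "A: " + sgl.gen("answer", max_tokens=64, temperature=0)

sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = qa.run(question="What does a serving framework batch across requests?")
print(state["answer"])
```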
GetSoloTech/solo-server
Platform for Hardware Aware Inference
Language: Python
Total stars: 136
Stars trend:
6 Mar 2025
3am ▏ +1
4am +0
5am ███▊ +30
6am █████▍ +43
7am ▎ +2
#python
#ai, #computervision, #deeplearning, #edgeai, #inference, #llm, #llmops, #machinelearning, #mlops, #modeldeployment, #ondevice, #python3
huggingface/huggingface.js
Utilities to use the Hugging Face Hub API
Language: TypeScript
Total stars: 1819
Stars trend:
22 May 2025
10am ▎ +2
11am ▉ +7
12pm ▉ +7
1pm █▎ +10
2pm ▉ +7
3pm █▏ +9
4pm ▊ +6
5pm ▋ +5
6pm ▊ +6
7pm ▊ +6
8pm █▏ +9
9pm █▏ +9
#typescript
#apiclient, #hub, #huggingface, #inference, #machinelearning