vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 268
Stars trend:
20 Jun 2023
12pm ▎ +2
1pm ▍ +3
2pm ▏ +1
3pm +0
4pm +0
5pm +0
6pm +0
7pm ▊ +6
8pm ███▉ +31
9pm ████████ +64
10pm ███████▍ +59
11pm ██████▎ +50
#python
#gpt, #inference, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #transformer
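A minimal offline-inference sketch using vLLM's Python API; the model name, prompts, and sampling settings below are illustrative examples, not part of the listing:
# Offline batched generation with vLLM (continuous batching under the hood).
from vllm import LLM, SamplingParams
prompts = [
    "Explain why KV-cache memory dominates LLM serving.",
    "Summarize what a serving engine does.",
]
sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
llm = LLM(model="facebook/opt-125m")  # small example model for a quick smoke test
outputs = llm.generate(prompts, sampling)
for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)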
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
Total stars: 162
Stars trend:
8 Jan 2024
1am ▎ +2
2am ▌ +4
3am ███ +24
4am ██▌ +20
5am ████▌ +36
6am ███ +24
7am ██▍ +19
8am ██▉ +23
#python
#artificialintelligence, #deeplearning, #gpt, #inference, #llama, #llama2, #llminference, #llmserving
liguodongiot/llm-action
This project aims to share the technical principles behind large language models, along with hands-on practical experience.
Language: Python
Total stars: 3518
Stars trend:
2 Mar 2024
1am ▏ +1
2am +0
3am ▎ +2
4am ▏ +1
5am ▌ +4
6am █▉ +15
7am █▋ +13
8am ▍ +3
9am █▋ +13
#python
#llm, #llminference, #llmserving, #llmtraining, #llmops
microsoft/aici
AICI: Prompts as (Wasm) Programs
Language: Rust
Total stars: 213
Stars trend:
11 Mar 2024
6pm ██▏ +17
7pm ███▎ +26
8pm ████▊ +38
#rust
#ai, #inference, #languagemodel, #llm, #llmframework, #llminference, #llmserving, #llmops, #modelserving, #rust, #transformer, #wasm, #wasmtime
skypilot-org/skypilot
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
Language: Python
Total stars: 5888
Stars trend:
3 Jun 2024
5pm ██ +16
6pm █▍ +11
7pm ▊ +6
8pm █▌ +12
9pm █ +8
10pm █▍ +11
11pm ▋ +5
4 Jun 2024
12am ▏ +1
1am ▊ +6
#python
#cloudcomputing, #cloudmanagement, #costmanagement, #costoptimization, #datascience, #deeplearning, #distributedtraining, #finops, #gpu, #hyperparametertuning, #jobqueue, #jobscheduler, #llmserving, #llmtraining, #machinelearning, #mlinfrastructure, #mlplatform, #multicloud, #spotinstances, #tpu
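A minimal sketch of SkyPilot's Python API, assuming the accelerator type, cluster name, and commands below as example values (the YAML/CLI workflow is equivalent):
# Launch a one-GPU task on the cheapest matching cloud SkyPilot can find.
import sky
task = sky.Task(
    name="serve-demo",
    setup="pip install vllm",  # example setup command
    run="python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m",
)
task.set_resources(sky.Resources(accelerators="A100:1"))  # example GPU request
sky.launch(task, cluster_name="demo-cluster")  # provisions, syncs code, runs the task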
zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
Total stars: 135
Stars trend:
9 Dec 2024
9pm ▏ +1
10pm +0
11pm +0
10 Dec 2024
12am ▏ +1
1am █▎ +10
2am ██▏ +17
3am █▋ +13
4am █▏ +9
5am █▎ +10
6am █ +8
7am ▉ +7
#cplusplus
#cuda, #gpt, #inferenceengine, #llama, #llm, #llmserving, #pytorch
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 34142
Stars trend:
20 Jan 2025
8pm ▎ +2
9pm ▎ +2
10pm █▏ +9
11pm ▍ +3
21 Jan 2025
12am ▎ +2
1am █ +8
2am ▉ +7
3am ▉ +7
4am █▌ +12
5am ▊ +6
6am █▎ +10
7am █▍ +11
#python
#amd, #cuda, #gpt, #hpu, #inference, #inferentia, #llama, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #rocm, #tpu, #trainium, #transformer, #xpu
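Since vLLM also exposes an OpenAI-compatible HTTP server, here is a hedged sketch of querying one with the standard openai client; the base URL, API key, and model name are example values and assume a server already started with something like "vllm serve meta-llama/Llama-3.1-8B-Instruct":
# Talk to a local vLLM OpenAI-compatible endpoint via the openai client.
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # key is unused locally
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # must match the served model
    messages=[{"role": "user", "content": "Summarize what vLLM does in one line."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)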
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language: Python
Total stars: 8931
Stars trend:
7 Feb 2025
10pm ▊ +6
11pm ▊ +6
8 Feb 2025
12am ▍ +3
1am █ +8
2am █▋ +13
3am █▎ +10
4am ▌ +4
5am █▊ +14
6am █▋ +13
7am █ +8
8am █▊ +14
9am █▊ +14
#python
#cuda, #deepseek, #deepseekllm, #deepseekv3, #inference, #llama, #llama2, #llama3, #llama31, #llava, #llm, #llmserving, #moe, #pytorch, #transformer, #vlm
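A short sketch of SGLang's frontend DSL, assuming a server already launched locally (for example via "python -m sglang.launch_server --model-path <model> --port 30000"); the model path, port, and prompt are example values:
# Define a small generation program and run it against a local SGLang runtime.
import sglang as sgl
@sgl.function
def qa(s, question):
    s += "Q: " + question + "\n"
    s += "A: " + sgl.gen("answer", max_tokens=64)
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = qa.run(question="What is continuous batching?")
print(state["answer"])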
skypilot-org/skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Language: Python
Total stars: 7844
Stars trend:
26 Apr 2025
3pm █▍ +11
4pm █ +8
5pm ▍ +3
6pm ▉ +7
7pm ▉ +7
8pm █▍ +11
9pm ▊ +6
10pm ▍ +3
11pm ▍ +3
27 Apr 2025
12am ▎ +2
1am █▏ +9
2am █▍ +11
#python
#cloudcomputing, #cloudmanagement, #costmanagement, #costoptimization, #datascience, #deeplearning, #distributedtraining, #finops, #gpu, #hyperparametertuning, #jobqueue, #jobscheduler, #llmserving, #llmtraining, #machinelearning, #mlinfrastructure, #mlplatform, #multicloud, #spotinstances, #tpu
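As the updated description highlights Kubernetes alongside the clouds, a hedged sketch of pointing the same Python API at an existing Kubernetes cluster; the GPU type, command, and cluster name are example values, and the exact API surface may differ across SkyPilot versions:
# Run a quick GPU check on Kubernetes instead of a cloud VM.
import sky
task = sky.Task(run="nvidia-smi")
task.set_resources(sky.Resources(cloud=sky.Kubernetes(), accelerators="L4:1"))
sky.launch(task, cluster_name="k8s-demo")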