GitHub repos

liltom-eth/llama2-webui
Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B in 8-bit and 4-bit modes, with GPU inference (6 GB VRAM) and CPU inference.
Language: Python
#llama_2 #llama2 #llm #llm_inference
Stars: 481 Issues: 2 Forks: 42
https://github.com/liltom-eth/llama2-webui

Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.

SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer

hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer

databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Language: Python
#databricks #gen_ai #generative_ai #llm #llm_inference #llm_training #mosaic_ai
Stars: 1113 Issues: 7 Forks: 86
https://github.com/databricks/dbrx

arc53/llm-price-compass
LLM provider price comparison: GPU benchmarks converted to price-per-token calculations, plus a GPU benchmark table
Language: TypeScript
#benchmark #gpu #inference_comparison #llm #llm_comparison #llm_inference #llm_price
Stars: 138 Issues: 1 Forks: 5
https://github.com/arc53/llm-price-compass

This project collects GPU benchmarks from various cloud providers and compares them to fixed per-token costs. Use our tool for efficient LLM GPU selections and cost-effective AI models.

MLSys-Learner-Resources/Awesome-MLSys-Blogger
This repository collects noteworthy MLSys bloggers (Algorithms/Systems)
Language: HTML
#llm #llm_inference #llm_training #machine_learning #machine_learning_systems #mlsys
Stars: 120 Issues: 0 Forks: 0
https://github.com/MLSys-Learner-Resources/Awesome-MLSys-Blogger

codelion/openevolve
Open-source implementation of AlphaEvolve
Language: Python
#alpha_evolve #alphacode #alphaevolve #coding_agent #deepmind #deepmind_lab #discovery #distributed_evolutionary_algorithms #evolutionary_algorithms #evolutionary_computation #genetic_algorithm #genetic_algorithms #iterative_methods #iterative_refinement #llm_engineering #llm_ensemble #llm_inference #openevolve #optimize
Stars: 312 Issues: 1 Forks: 26
https://github.com/codelion/openevolve

Link preview: algorithmicsuperintelligence/openevolve

bentoml/llm-inference-in-production
Everything you need to know about LLM inference
Language: TypeScript
#llm #llm_inference
Stars: 154 Issues: 3 Forks: 8
https://github.com/bentoml/llm-inference-in-production

Link preview: bentoml/llm-inference-handbook

yassa9/qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
Language: Cuda
#cuda #cuda_programming #gpu #llamacpp #llm #llm_inference #qwen #qwen3 #transformer
Stars: 287 Issues: 1 Forks: 17
https://github.com/yassa9/qwen600
