liltom-eth/llama2-webui
Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization, GPU inference (6 GB VRAM), and CPU inference. Use `llama2-wrapper` as your local Llama 2 backend for generative agents and apps.
Language: Python
#llama_2 #llama2 #llm #llm_inference
Stars: 481 Issues: 2 Forks: 42
https://github.com/liltom-eth/llama2-webui
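The blurb suggests `llama2-wrapper` doubles as a programmatic backend, not just a web UI. A minimal sketch of that use, assuming the package exposes a `LLAMA2_WRAPPER` class and a `get_prompt` helper as its README indicates (constructor arguments and the model path below are assumptions; check the repo for exact names):

```python
# Hedged sketch: llama2-wrapper as a local Llama 2 backend.
# Assumes `pip install llama2-wrapper`; class and helper names are
# taken from the project's README and not verified here.
from llama2_wrapper import LLAMA2_WRAPPER, get_prompt

llama2 = LLAMA2_WRAPPER(
    model_path="./models/llama-2-7b-chat.q4_0.bin",  # hypothetical local path
    backend_type="llama.cpp",  # assumed option for 4-bit CPU inference
)

prompt = get_prompt("What does 4-bit quantization trade off?")
print(llama2(prompt))
```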
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Language: Python
#databricks #gen_ai #generative_ai #llm #llm_inference #llm_training #mosaic_ai
Stars: 1113 Issues: 7 Forks: 86
https://github.com/databricks/dbrx
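Since the repo collects usage examples for the model, a minimal inference sketch is the natural starting point. This assumes the instruct checkpoint is published on Hugging Face as `databricks/dbrx-instruct` and is loadable through the standard transformers API; treat the model id and flags as assumptions to verify against the repo's own examples:

```python
# Hedged sketch: loading DBRX-Instruct via Hugging Face transformers.
# Model id and trust_remote_code requirement are assumptions here;
# the model is large and needs multiple GPUs (hence device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "databricks/dbrx-instruct", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "What does it take to build a great LLM?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```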
arc53/llm-price-compass
Compares LLM provider prices and collects GPU benchmarks from cloud providers, converting them into price-per-token figures; includes a GPU benchmark table for cost-effective GPU selection.
Language: TypeScript
#benchmark #gpu #inference_comparison #llm #llm_comparison #llm_inference #llm_price
Stars: 138 Issues: 1 Forks: 5
https://github.com/arc53/llm-price-compass
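The benchmark-to-price conversion the tool performs is straightforward arithmetic: divide the GPU's hourly rental cost by its measured hourly throughput. A minimal sketch with hypothetical numbers (the rate and throughput below are illustrative, not taken from the project's data):

```python
def price_per_million_tokens(gpu_cost_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly GPU price and a benchmark throughput into $/1M tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000

# Hypothetical example: a GPU rented at $1.80/h serving ~1,500 tokens/s.
print(f"${price_per_million_tokens(1.80, 1500):.2f} per 1M tokens")  # ≈ $0.33
```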