liltom-eth/llama2-webui
Run Llama 2 locally with a Gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supports Llama-2-7B/13B/70B with 8-bit and 4-bit quantization; supports GPU inference (6 GB VRAM) and CPU inference. Use `llama2-wrapper` as your local Llama 2 backend for generative agents and apps.
Language: Python
#llama_2 #llama2 #llm #llm_inference
Stars: 481 Issues: 2 Forks: 42
https://github.com/liltom-eth/llama2-webui
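The 6 GB VRAM figure for 4-bit inference can be sanity-checked with back-of-the-envelope arithmetic: at 4 bits, each parameter takes half a byte, so a 7B model's weights occupy roughly 3.3 GB, leaving headroom on a 6 GB card for activations and the KV cache. A minimal sketch (the helper name and the loop are illustrative, not part of the repo):

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights, in GiB (2**30 bytes)."""
    return n_params * bits_per_param / 8 / 2**30

# Llama-2-7B weight footprint at common quantization levels
for bits in (16, 8, 4):
    print(f"{bits:2d}-bit: {weight_memory_gb(7e9, bits):.1f} GiB")
```

This estimate covers weights only; real inference also needs memory for activations and the attention cache, which is why 4-bit 7B fits a 6 GB card but 8-bit generally does not.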
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer
databricks/dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Language: Python
#databricks #gen_ai #generative_ai #llm #llm_inference #llm_training #mosaic_ai
Stars: 1113 Issues: 7 Forks: 86
https://github.com/databricks/dbrx
arc53/llm-price-compass
LLM provider price comparison, GPU benchmarks with price-per-token calculation, and a GPU benchmark table
Language: TypeScript
#benchmark #gpu #inference_comparison #llm #llm_comparison #llm_inference #llm_price
Stars: 138 Issues: 1 Forks: 5
https://github.com/arc53/llm-price-compass
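The benchmarks-to-price-per-token conversion this repo tabulates can be reproduced by hand: given a GPU's hourly rental cost and its benchmarked throughput in tokens per second, the cost per million tokens follows directly. A minimal sketch (the sample numbers are illustrative assumptions, not benchmarks from the repo):

```python
def price_per_million_tokens(gpu_cost_per_hour: float,
                             tokens_per_second: float) -> float:
    """Convert an hourly GPU price and a measured throughput
    into the cost of generating one million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000

# Example: a $2.00/hr GPU sustaining 50 tokens/s
print(f"${price_per_million_tokens(2.00, 50):.2f} per 1M tokens")
```

Comparing this self-hosted figure against a provider's listed per-token price is exactly the comparison the project's tables are built for.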