Writesonic/GPTRouter
Smoothly Manage Multiple LLMs (OpenAI, Anthropic, Azure) and Image Models (Dall-E, SDXL), Speed Up Responses, and Ensure Non-Stop Reliability.
Language: TypeScript
#anthropic #azure_openai #cohere #google_gemini #langchain #llama_index #llm #llmops #llms #mlops #openai #palm_api
Stars: 130 Issues: 6 Forks: 16
https://github.com/Writesonic/GPTRouter
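The pattern GPTRouter advertises (multiple providers, faster responses, non-stop reliability) is priority-ordered fallback with per-provider timeouts. Below is a minimal, hypothetical Python sketch of that pattern; the provider callables are placeholders, not GPTRouter's actual API.

```python
# Hypothetical sketch of priority-ordered fallback routing with timeouts.
from concurrent.futures import ThreadPoolExecutor


def call_openai(prompt: str) -> str:
    # Placeholder provider client; simulate an outage so the fallback fires.
    raise RuntimeError("simulated outage")

def call_anthropic(prompt: str) -> str:
    # Placeholder provider client.
    return f"[anthropic] answer to: {prompt[:40]}"

PROVIDERS = [("openai", call_openai), ("anthropic", call_anthropic)]

def route(prompt: str, timeout_s: float = 5.0) -> str:
    """Try providers in priority order; fall back on error or timeout."""
    with ThreadPoolExecutor(max_workers=len(PROVIDERS)) as pool:
        for name, call in PROVIDERS:
            future = pool.submit(call, prompt)
            try:
                return future.result(timeout=timeout_s)
            except Exception:
                continue  # next provider on failure or timeout
    raise RuntimeError("all providers failed")

print(route("Summarize the GPTRouter README."))
```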
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer
hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer
mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
Language: TypeScript
#ai #browser #browser_automation #gpt #langchain #llama #llm #openai #playwright #puppeteer #scraper
Stars: 332 Issues: 6 Forks: 23
https://github.com/mishushakov/llm-scraper
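llm-scraper itself is TypeScript (Playwright plus an LLM extraction step), but the underlying pattern is straightforward: fetch the page, strip it down to text, and ask an LLM to emit JSON matching a schema. A hedged Python sketch of that generic pattern follows; `llm_complete` is a placeholder for whatever chat-completion client you use, not this library's API.

```python
# Generic page -> structured-data pattern (not llm-scraper's actual API).
import json
import re
import requests


def llm_complete(prompt: str) -> str:
    """Placeholder for a chat-completion call; returns canned JSON here."""
    return '{"title": "Example Domain", "links": []}'

SCHEMA = {"title": "string", "links": ["string"]}

def scrape(url: str) -> dict:
    html = requests.get(url, timeout=10).text
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping
    prompt = (
        "Extract data from the page text below and answer with JSON only, "
        f"matching this schema: {json.dumps(SCHEMA)}\n\n{text[:4000]}"
    )
    return json.loads(llm_complete(prompt))

print(scrape("https://example.com"))
```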
mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Language: Python
#conversation #llama_3_llava #llama_3_vision #llama3 #llama3_llava #llama3_vision #llava #llava_llama3 #llava_phi3 #llm #lmms #phi_3_llava #phi_3_vision #phi3 #phi3_llava #phi3_vision #vision_language
Stars: 297 Issues: 2 Forks: 13
https://github.com/mbzuai-oryx/LLaVA-pp
WinampDesktop/winamp
Iconic media player
Language: C++
#llama #media #player #winamp
Stars: 1751 Issues: 15 Forks: 428
https://github.com/WinampDesktop/winamp
vietanhdev/llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and perform various actions based on your commands: summarizing text, rephrasing sentences, answering questions, writing emails, and more.
Language: Python
#llama #llama_3_2 #llama3 #llava #moondream #owen #personal_assistant #private_gpt
Stars: 170 Issues: 0 Forks: 12
https://github.com/vietanhdev/llama-assistant
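The assistant's feature list (summarize, rephrase, answer questions, draft emails) boils down to routing a recognized command to a prompt template and a local model. A minimal, hypothetical sketch of that dispatch step is below; `generate` stands in for any local Llama 3.2 inference call and is not this project's API.

```python
# Hypothetical command -> prompt-template dispatch for a local assistant.
def generate(prompt: str) -> str:
    """Placeholder for a local Llama 3.2 call (e.g. via a llama.cpp binding)."""
    return f"<model output for: {prompt[:50]}...>"

TEMPLATES = {
    "summarize": "Summarize the following text:\n{text}",
    "rephrase": "Rephrase the following sentence more clearly:\n{text}",
    "email": "Write a short, polite email about:\n{text}",
}

def handle(command: str, text: str) -> str:
    template = TEMPLATES.get(command)
    if template is None:
        return generate(text)  # fall back to free-form Q&A
    return generate(template.format(text=text))

print(handle("summarize", "KV cache quantization stores keys and values ..."))
```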
maiqingqiang/NotebookMLX
NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama)
Language: Jupyter Notebook
#ai #llama #mlx #notebookllama #notebooklm
Stars: 121 Issues: 1 Forks: 7
https://github.com/maiqingqiang/NotebookMLX
edwko/OuteTTS
Interface for OuteTTS models.
Language: Python
#gguf #llama #text_to_speech #transformers #tts
Stars: 278 Issues: 6 Forks: 13
https://github.com/edwko/OuteTTS
papersgpt/papersgpt-for-zotero
Zotero AI plugin for chatting with papers using ChatGPT, Gemini, Claude, Llama 3.2, QwQ-32B-Preview, Marco-o1, Gemma, Mistral, and Phi-3.5
Language: JavaScript
#ai #chatgpt #claude #gemini #gemma #llama #marco_o1 #mistral #paper #phi_3 #qwq_32b_preview #summary #zotero #zotero_plugin
Stars: 232 Issues: 3 Forks: 1
https://github.com/papersgpt/papersgpt-for-zotero
zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that supports efficient understanding of images, high-resolution images, and videos.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini
therealoliver/Deepdive-llama3-from-scratch
Implement Llama 3 inference step by step, grasp the core concepts, master the process derivation, and write the code.
Language: Jupyter Notebook
#attention #attention_mechanism #gpt #inference #kv_cache #language_model #llama #llm_configuration #llms #mask #multi_head_attention #positional_encoding #residuals #rms #rms_norm #rope #rotary_position_encoding #swiglu #tokenizer #transformer
Stars: 388 Issues: 0 Forks: 28
https://github.com/therealoliver/Deepdive-llama3-from-scratch
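One of the building blocks the repo's tag list walks through is RMSNorm, which Llama-family models use in place of LayerNorm: divide each vector by its root-mean-square, then multiply by a learned gain. A small NumPy sketch of the formula (not the repo's code) is shown below.

```python
# RMSNorm as used in Llama-family models: y = x / rms(x) * weight.
import numpy as np


def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    # Root of the mean of squares over the hidden dimension, one value per token.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

hidden = 8
x = np.random.randn(2, hidden).astype(np.float32)  # (tokens, hidden)
w = np.ones(hidden, dtype=np.float32)              # learned gain
print(rms_norm(x, w).shape)                        # (2, 8)
```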
dipampaul17/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Language: Python
#apple_silicon #generative_ai #kv_cache #llama_cpp #llm #m1 #m2 #m3 #memory_optimization #metal #optimization #quantization
Stars: 222 Issues: 1 Forks: 5
https://github.com/dipampaul17/KVSplit
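The idea behind KVSplit's numbers is that keys tolerate less quantization error than values, so the cache can keep keys at 8-bit and values at 4-bit, each with a per-row scale. A rough NumPy illustration of that asymmetric scheme (not KVSplit's Metal implementation) follows.

```python
# Asymmetric KV-cache quantization sketch: int8 keys, 4-bit-range values.
import numpy as np


def quantize(x: np.ndarray, bits: int):
    """Symmetric per-row quantization to a signed `bits`-bit integer range."""
    qmax = 2 ** (bits - 1) - 1                           # 127 for int8, 7 for int4
    scale = np.abs(x).max(axis=-1, keepdims=True) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

k = np.random.randn(4, 64).astype(np.float32)            # (tokens, head_dim)
v = np.random.randn(4, 64).astype(np.float32)

k_q, k_s = quantize(k, bits=8)                            # keys keep more precision
v_q, v_s = quantize(v, bits=4)                            # values get the 4-bit range

print("key error:  ", np.abs(dequantize(k_q, k_s) - k).mean())
print("value error:", np.abs(dequantize(v_q, v_s) - v).mean())
# A real kernel would pack two 4-bit values per byte to realize the memory
# savings; here they stay in int8 containers for clarity.
```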