zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
GitHub
GitHub - zhihu/ZhiLight: A highly optimized LLM inference acceleration engine for Llama and its variants.
A highly optimized LLM inference acceleration engine for Llama and its variants. - zhihu/ZhiLight