beam-cloud/beta9
The open-source serverless GPU container runtime.
Language:Go
Total stars: 82
Stars trend:
13 May 2024
5pm █▎ +10
6pm ███▎ +26
7pm ██▋ +21
8pm ▉ +7
9pm ▊ +6
10pm █▍ +11
#go
#cuda, #distributedcomputing, #finetuning, #generativeai, #gpu, #largelanguagemodels, #llm, #llminference, #mlplatform, #selfhosted
HigherOrderCO/HVM
A massively parallel, optimal functional runtime in Rust
Language:Cuda
Total stars: 7284
Stars trend:
16 May 2024
4pm ▏ +1
5pm +0
6pm ▏ +1
7pm +0
8pm +0
9pm █▌ +12
10pm █▊ +14
11pm █▋ +13
17 May 2024
12am █▏ +9
1am █▍ +11
2am ▉ +7
3am █▎ +10
#cuda
m4rs-mt/ILGPU
ILGPU JIT compiler for high-performance .NET GPU programs
Language:C#
Total stars: 1156
Stars trend:
17 May 2024
8pm ▏ +1
9pm ▏ +1
10pm ██▍ +19
11pm █▉ +15
18 May 2024
12am ▉ +7
1am ▊ +6
2am ▉ +7
3am █▎ +10
4am ▍ +3
5am █ +8
#csharp
#amd, #cil, #compiler, #cpu, #cuda, #dotnet, #gpgpu, #gpgpucomputing, #gpu, #ilgpu, #intel, #jit, #kernels, #msil, #nvidia, #opencl, #parallel, #ptx
rapidsai/cudf
cuDF - GPU DataFrame Library
Language:C++
Total stars: 7615
Stars trend:
2 Jun 2024
7pm ▎ +2
8pm ▋ +5
9pm +0
10pm +0
11pm █▍ +11
3 Jun 2024
12am █▏ +9
1am ██ +16
2am █▌ +12
3am █▋ +13
4am ██▌ +20
#cplusplus
#arrow, #cpp, #cuda, #cudf, #dask, #dataanalysis, #datascience, #dataframe, #gpu, #pandas, #pydata, #python, #rapids
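For context, cuDF exposes a pandas-like API that executes on the GPU. A minimal sketch of that usage; the column names and values below are illustrative, not taken from the repository:

    import cudf

    # Build a small DataFrame directly in GPU memory.
    df = cudf.DataFrame({"key": ["a", "b", "a", "b"], "val": [1.0, 2.0, 3.0, 4.0]})

    # Familiar pandas-style operations run as CUDA kernels.
    means = df.groupby("key").val.mean()
    print(means)

    # Copy back to host as a regular pandas DataFrame when needed.
    pdf = df.to_pandas()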
likejazz/llama3.cuda
llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.
Language:Cuda
Total stars: 89
Stars trend:
2 Jun 2024
11pm ▊ +6
3 Jun 2024
12am █▍ +11
1am █▌ +12
2am ▋ +5
3am ▌ +4
4am ▋ +5
5am ▌ +4
6am ▊ +6
7am █▏ +9
8am █▎ +10
9am █ +8
#cuda
clu0/unet.cu
UNet diffusion model in pure CUDA
Language:Cuda
Total stars: 109
Stars trend:
28 Jun 2024
7pm ███▋ +29
8pm ████ +32
9pm █▉ +15
10pm █▎ +10
11pm █▏ +9
29 Jun 2024
12am █▏ +9
#cuda
NVIDIA/gpu-operator
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
Language:Go
Total stars: 1354
Stars trend:
2 Jul 2024
3am ▎ +2
4am ▌ +4
5am ▏ +1
6am █▍ +11
7am ▋ +5
8am ▍ +3
9am ▋ +5
10am ▊ +6
11am █▎ +10
12pm █▏ +9
1pm █▊ +14
2pm ▉ +7
#go
#cuda, #gpu, #kubernetes, #nvidia
chrxh/alien
ALIEN is a CUDA-powered artificial life simulation program.
Language:C++
Total stars: 4417
Stars trend:
18 Aug 2024
11am ██▏ +17
12pm █▉ +15
1pm █▍ +11
2pm ██▌ +20
3pm ██▋ +21
4pm ██▌ +20
5pm █▋ +13
6pm ██▍ +19
7pm █▌ +12
8pm █▏ +9
9pm █▏ +9
10pm ▉ +7
#cplusplus
#agentbasedsimulation, #artificiallife, #cuda, #openendedevolution, #physicsengine
cupy/cupy
NumPy & SciPy for GPU
Language:Python
Total stars: 8386
Stars trend:
20 Sep 2024
3am ▍ +3
4am ▏ +1
5am ▎ +2
6am +0
7am ▎ +2
8am +0
9am +0
10am █ +8
11am █ +8
12pm ▊ +6
1pm █▊ +14
2pm ██████ +48
#python
#cublas, #cuda, #cudnn, #cupy, #curand, #cusolver, #cusparse, #cusparselt, #cutensor, #gpu, #nccl, #numpy, #nvrtc, #nvtx, #python, #rocm, #scipy, #tensor
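As a rough illustration of CuPy's NumPy-compatible interface, a minimal sketch (array shapes and values are arbitrary):

    import cupy as cp

    # Allocate directly on the GPU using the familiar NumPy-style API.
    x = cp.random.rand(1024, 1024, dtype=cp.float32)

    # Matrix multiply and reduction execute as CUDA kernels (cuBLAS under the hood).
    y = x @ x.T
    norm = cp.linalg.norm(y)

    # Bring the scalar result back to the host as a NumPy value.
    print(cp.asnumpy(norm))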
shader-slang/slang
Making it easier to work with shaders
Language:C++
Total stars: 2472
Stars trend:
22 Nov 2024
2am █▋ +13
3am ▊ +6
4am ▊ +6
5am ▌ +4
6am ▉ +7
7am ██ +16
8am █▏ +9
9am █▍ +11
10am ▋ +5
11am ▋ +5
12pm ▌ +4
1pm █▎ +10
#cplusplus
#cuda, #d3d12, #glsl, #hlsl, #shaders, #vulkan
zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language:C++
Total stars: 135
Stars trend:
9 Dec 2024
9pm ▏ +1
10pm +0
11pm +0
10 Dec 2024
12am ▏ +1
1am █▎ +10
2am ██▏ +17
3am █▋ +13
4am █▏ +9
5am █▎ +10
6am █ +8
7am ▉ +7
#cplusplus
#cuda, #gpt, #inferenceengine, #llama, #llm, #llmserving, #pytorch
kevmo314/scuda
SCUDA is a GPU-over-IP bridge that allows GPUs on remote machines to be attached to CPU-only machines.
Language:C++
Total stars: 1118
Stars trend:
13 Jan 2025
12pm █▉ +15
1pm █▎ +10
2pm █ +8
3pm █▎ +10
4pm █▊ +14
5pm ▉ +7
6pm ▍ +3
7pm █▏ +9
8pm ▉ +7
9pm ▌ +4
10pm ▋ +5
11pm ▊ +6
#cplusplus
#cublas, #cuda, #cudnn, #gpu, #mlops, #networking, #nvml, #remoteaccess
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language:Python
Total stars: 34142
Stars trend:
20 Jan 2025
8pm ▎ +2
9pm ▎ +2
10pm █▏ +9
11pm ▍ +3
21 Jan 2025
12am ▎ +2
1am █ +8
2am ▉ +7
3am ▉ +7
4am █▌ +12
5am ▊ +6
6am █▎ +10
7am █▍ +11
#python
#amd, #cuda, #gpt, #hpu, #inference, #inferentia, #llama, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #rocm, #tpu, #trainium, #transformer, #xpu
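For reference, a minimal offline-inference sketch following vLLM's quickstart pattern; the model name and prompts are placeholders:

    from vllm import LLM, SamplingParams

    prompts = ["Hello, my name is", "The future of GPU computing is"]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # Load a model and run batched generation with vLLM's memory-efficient KV caching.
    llm = LLM(model="facebook/opt-125m")
    outputs = llm.generate(prompts, sampling_params)

    for output in outputs:
        print(output.prompt, "->", output.outputs[0].text)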
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language:Python
Total stars: 8931
Stars trend:
7 Feb 2025
10pm ▊ +6
11pm ▊ +6
8 Feb 2025
12am ▍ +3
1am █ +8
2am █▋ +13
3am █▎ +10
4am ▌ +4
5am █▊ +14
6am █▋ +13
7am █ +8
8am █▊ +14
9am █▊ +14
#python
#cuda, #deepseek, #deepseekllm, #deepseekv3, #inference, #llama, #llama2, #llama3, #llama31, #llava, #llm, #llmserving, #moe, #pytorch, #transformer, #vlm
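A minimal sketch of SGLang's frontend DSL, assuming an SGLang server is already running locally on port 30000 (e.g. launched with python -m sglang.launch_server); the prompt content is illustrative:

    import sglang as sgl

    @sgl.function
    def qa(s, question):
        # Build a chat-style program; sgl.gen marks where the model fills in text.
        s += sgl.user(question)
        s += sgl.assistant(sgl.gen("answer", max_tokens=64))

    # Point the frontend at the running SGLang server.
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    state = qa.run(question="What does a CUDA stream do?")
    print(state["answer"])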
Rust-GPU/Rust-CUDA
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
Language:Rust
Total stars: 4103
Stars trend:
11 Apr 2025
5pm ██ +16
6pm █▋ +13
7pm ██▍ +19
8pm ██▍ +19
9pm █ +8
#rust
#cuda, #cudakernels, #cudaprogramming, #gpgpu, #gpu, #gpuprogramming, #rust, #rustlang
ashvardanian/less_slow.cpp
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, networking, and user-space IO
Language:C++
Total stars: 1069
Stars trend:
19 Apr 2025
7am █▊ +14
8am ██ +16
9am ▊ +6
10am ▋ +5
11am ▏ +1
12pm █▌ +12
1pm █▍ +11
2pm █ +8
3pm ▎ +2
4pm █▌ +12
5pm █▌ +12
6pm █ +8
#cplusplus
#assembly, #assemblylanguage, #avx512, #benchmark, #coroutines, #cpp, #cppprogramming, #cpp17, #cpp20, #cuda, #gcc, #googlebenchmark, #hpc, #iouring, #linuxkernel, #llvm, #ptx, #ranges, #tutorial, #tutorials
tracel-ai/burn
Burn is a next-generation deep learning framework that doesn't compromise on flexibility, efficiency, and portability.
Language:Rust
Total stars: 10475
Stars trend:
25 Apr 2025
6am ▏ +1
7am ▎ +2
8am +0
9am +0
10am ▌ +4
11am █▍ +11
12pm █▏ +9
1pm ▌ +4
2pm █▌ +12
3pm █▍ +11
4pm █▌ +12
5pm █▏ +9
#rust
#autodiff, #crossplatform, #cuda, #deeplearning, #kernelfusion, #machinelearning, #metal, #ndarray, #neuralnetwork, #onnx, #pytorch, #rocm, #rust, #scientificcomputing, #tensor, #vulkan, #wasm, #webgpu