https://www.anyscale.com/blog/continuous-batching-llm-inference
LLM inference acceleration #Frameworks
Anyscale
Achieve 23x LLM Inference Throughput & Reduce p50 Latency
In this blog, we discuss continuous batching, a critical systems-level optimization that improves both throughput and latency under load for LLMs.
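As a rough illustration of why continuous batching helps, here is a toy step-count comparison. It is a sketch under simplifying assumptions, not the blog's implementation: each request needs a fixed number of decode steps (at least 1), prefill and memory limits are ignored, and the function names and example lengths are made up for illustration.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest sequence finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled before the next iteration."""
    waiting = deque(lengths)
    running = []  # remaining tokens for each in-flight sequence
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1
        # Every running sequence emits one token; finished ones leave the batch.
        running = [r - 1 for r in running if r > 1]
    return steps


# Hypothetical workload: mostly short generations with a few long ones.
lengths = [2, 8, 2, 2, 8, 2, 2, 2]
static_steps = static_batching_steps(lengths, batch_size=4)
continuous_steps = continuous_batching_steps(lengths, batch_size=4)
print(static_steps, continuous_steps)
```

In this toy workload the short requests no longer wait for the long ones in their batch, so the total number of decode iterations drops.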
https://github.com/SeldonIO/alibi-detect Algorithms for outlier, adversarial and drift detection
https://github.com/SeldonIO/alibi Algorithms for explaining machine learning models
#Frameworks #library #anomaly #drift
GitHub
GitHub - SeldonIO/alibi-detect: Algorithms for outlier, adversarial and drift detection
Algorithms for outlier, adversarial and drift detection - SeldonIO/alibi-detect
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and many other libraries.
https://mars-project.readthedocs.io/
#Frameworks
Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code.
https://numba.pydata.org/ #Frameworks #library
https://github.com/staghado/vit.cpp Inference Vision Transformer (ViT) in plain C/C++ with ggml
https://github.com/ggerganov/ggml Tensor library for machine learning with a low-level, cross-platform implementation
#Frameworks
GitHub
GitHub - staghado/vit.cpp: Inference Vision Transformer (ViT) in plain C/C++ with ggml
Inference Vision Transformer (ViT) in plain C/C++ with ggml - staghado/vit.cpp
https://arxiv.org/abs/2412.11768
https://github.com/AnonymousAlethiometer/SGD_SaI/
#Paper #Frameworks
arXiv.org
No More Adam: Learning Rate Scaling at Initialization is All You Need
In this work, we question the necessity of adaptive gradient methods for training deep neural networks. SGD-SaI is a simple yet effective enhancement to stochastic gradient descent with momentum...
Deep Learning
https://github.com/parthsarthi03/raptor #Frameworks
https://github.com/illuin-tech/colpali
Efficient Document Retrieval with Vision Language Models #Frameworks #Models
GitHub
GitHub - illuin-tech/colpali: The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol. - illuin-tech/colpali
https://arxiv.org/abs/2503.19108
A plain ViT architecture without a decoder that performs fast image segmentation #Paper #Frameworks
arXiv.org
Your ViT is Secretly an Image Segmentation Model
Vision Transformers (ViTs) have shown remarkable performance and scalability across various computer vision tasks. To apply single-scale ViTs to image segmentation, existing methods adopt a...
https://arxiv.org/abs/2411.04983
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning #Frameworks #Paper
arXiv.org
DINO-WM: World Models on Pre-trained Visual Features enable...
The ability to predict future outcomes given control actions is fundamental for physical reasoning. However, such predictive models, often called world models, remain challenging to learn and are...