https://arxiv.org/abs/2506.01928 #Paper
Esoteric Language Models ;)
a new family of models that fuses the autoregressive and masked diffusion model paradigms
arXiv.org
Esoteric Language Models
Diffusion-based language models offer a compelling alternative to autoregressive (AR) models by enabling parallel and controllable generation. Among this family of models, Masked Diffusion Models...
https://arxiv.org/abs/2503.19108
A plain ViT architecture, without a decoder, for fast image segmentation #Paper #Frameworks
arXiv.org
Your ViT is Secretly an Image Segmentation Model
Vision Transformers (ViTs) have shown remarkable performance and scalability across various computer vision tasks. To apply single-scale ViTs to image segmentation, existing methods adopt a...
https://arxiv.org/abs/2411.04983
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning #Frameworks #Paper
arXiv.org
DINO-WM: World Models on Pre-trained Visual Features enable...
The ability to predict future outcomes given control actions is fundamental for physical reasoning. However, such predictive models, often called world models, remain challenging to learn and are...
Achieving 10,000x training data reduction with high-fidelity labels https://share.google/PXeW6ut6dkPw4M0zw
#paper
research.google
Achieving 10,000x training data reduction with high-fidelity labels
https://arxiv.org/abs/2510.05949v1 JEPA architectures such as DINOv3 can be effectively used for data curation, outlier detection and similar tasks. #Paper
arXiv.org
Gaussian Embeddings: How JEPAs Secretly Learn Your Data Density
Joint Embedding Predictive Architectures (JEPAs) learn representations able to solve numerous downstream tasks out-of-the-box. JEPAs combine two objectives: (i) a latent-space prediction term,...