https://arxiv.org/abs/2510.05949v1 JEPA-style models such as DINOv3 implicitly learn the density of their training data, so they can be used for data curation, outlier detection, and similar tasks. #Paper
arXiv.org
Gaussian Embeddings: How JEPAs Secretly Learn Your Data Density
Joint Embedding Predictive Architectures (JEPAs) learn representations able to solve numerous downstream tasks out-of-the-box. JEPAs combine two objectives: (i) a latent-space prediction term,...
https://github.com/microsoft/markitdown
Converts all major document formats to Markdown and can also run as an MCP server.
GitHub
GitHub - microsoft/markitdown: Python tool for converting files and office documents to Markdown.