Self Supervised Boy
@selfsupervised
160 subscribers · 9 photos · 56 links
Posting links to papers I read. Right now I'm mostly interested in things around LLMs, AI agents, and ML4Code. That is subject to change.
@martolod
Self Supervised Boy
https://arxiv.org/abs/2510.12773
arXiv.org
Dr.LLM: Dynamic Layer Routing in LLMs
Large Language Models (LLMs) process every token through all layers of a transformer stack, causing wasted computation on simple queries and insufficient flexibility for harder ones that need...
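Judging by the abstract, the core move is deciding per input which transformer layers actually run. A minimal sketch of that general idea in PyTorch; the router shape, the skip-only action space, and the hard threshold are all my assumptions, not the paper's design:
```python
import torch
import torch.nn as nn

class RoutedBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.router = nn.Linear(d_model, 1)   # scores the pooled hidden state

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate = self.router(x.mean(dim=1))             # (batch, 1)
        execute = (gate > 0).float().view(-1, 1, 1)   # hard execute/skip
        # A trained system would learn this decision (e.g. straight-through);
        # here it only illustrates layers becoming optional per input.
        return execute * self.block(x) + (1 - execute) * x

layers = nn.ModuleList(RoutedBlock(64, 4) for _ in range(6))
h = torch.randn(2, 10, 64)   # (batch, seq, d_model)
for layer in layers:
    h = layer(h)
print(h.shape)
```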
Self Supervised Boy
https://arxiv.org/abs/2510.18148v1
arXiv.org
Extracting Rule-based Descriptions of Attention Features in Transformers
Mechanistic interpretability strives to explain model behavior in terms of bottom-up primitives. The leading paradigm is to express hidden states as a sparse linear combination of basis vectors,...
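The "sparse linear combination of basis vectors" primitive the abstract starts from is usually realized as a sparse autoencoder over hidden states; the paper's contribution (rule-based descriptions) builds on top of that. A toy TopK SAE for context, with sizes and sparsity level picked arbitrarily:
```python
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    def __init__(self, d_model=64, n_features=512, k=8):
        super().__init__()
        self.k = k
        self.enc = nn.Linear(d_model, n_features)
        self.dec = nn.Linear(n_features, d_model, bias=False)

    def forward(self, h):
        acts = torch.relu(self.enc(h))
        # Keep only the k most active features per token -> sparse code.
        top = torch.topk(acts, self.k, dim=-1)
        code = torch.zeros_like(acts).scatter(-1, top.indices, top.values)
        return self.dec(code), code

sae = TopKSAE()
h = torch.randn(16, 64)                  # hidden states from some layer
recon, code = sae(h)
loss = torch.mean((recon - h) ** 2)      # trained to reconstruct
print(code.count_nonzero(dim=-1))        # exactly k active features each
```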
Self Supervised Boy
https://arxiv.org/abs/2510.18147v1
arXiv.org
LLMs Encode How Difficult Problems Are
Large language models exhibit a puzzling inconsistency: they solve complex problems yet frequently fail on seemingly simpler ones. We investigate whether LLMs internally encode problem difficulty...
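"Internally encode" claims like this are typically operationalized with a linear probe: fit a linear classifier from per-problem activations to a difficulty label and measure held-out accuracy. A sketch on synthetic data (the probe setup is an assumption, not necessarily the paper's protocol):
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 256))           # per-problem hidden states
w_true = rng.normal(size=256)                 # planted signal for the demo
difficulty = (acts @ w_true > 0).astype(int)  # 0 = easy, 1 = hard

X_tr, X_te, y_tr, y_te = train_test_split(acts, difficulty, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# High held-out accuracy => difficulty is linearly decodable from acts.
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```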
Self Supervised Boy
https://arxiv.org/abs/2510.21614v1
arXiv.org
Huxley-Gödel Machine: Human-Level Coding Agent Development by an...
Recent studies operationalize self-improvement through coding agents that edit their own codebases. They grow a tree of self-modifications through expansion strategies that favor higher software...
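The tree-of-self-modifications loop the abstract describes reduces, at its most generic, to best-first expansion over agent variants. A stub sketch; the scoring and mutation functions are placeholders, and the actual expansion metric is the paper's contribution:
```python
import heapq
import random

def benchmark(agent_code: str) -> float:
    return random.random()               # stub: run a SWE-style eval here

def self_modify(agent_code: str) -> str:
    return agent_code + "*"              # stub: LLM edits its own codebase

root = "agent-v0"
frontier = [(-benchmark(root), root)]    # max-heap via negated scores
for _ in range(20):
    neg_score, node = heapq.heappop(frontier)    # most promising node
    for _ in range(2):                           # branch: two edits per node
        child = self_modify(node)
        heapq.heappush(frontier, (-benchmark(child), child))
    heapq.heappush(frontier, (neg_score, node))  # nodes stay re-expandable
print(min(frontier))                     # lowest negated = best agent found
```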
Self Supervised Boy
https://arxiv.org/abs/2601.05167
arXiv.org
RelayLLM: Efficient Reasoning via Collaborative Decoding
Deploying Large Language Models (LLMs) for complex reasoning is often hindered by high computational costs and latency, while resource-efficient Small Language Models (SLMs) typically lack the necessary...
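My guess at what "collaborative decoding" means operationally: the SLM decodes by default and relays uncertain tokens to the LLM. A toy confidence-threshold version (the handoff policy is an assumption; the paper's may differ):
```python
import torch

def collaborative_decode(slm, llm, ids, n_new, tau=0.5):
    for _ in range(n_new):
        probs = torch.softmax(slm(ids)[:, -1], dim=-1)
        conf, tok = probs.max(dim=-1)
        if conf.item() < tau:                      # SLM unsure: relay the step
            tok = llm(ids)[:, -1].argmax(dim=-1)   # one large-model call
        ids = torch.cat([ids, tok[:, None]], dim=-1)
    return ids

# Stand-in "models": anything mapping (1, seq) ids -> (1, seq, vocab) logits.
vocab = 100
slm = lambda ids: torch.randn(1, ids.shape[1], vocab)
llm = lambda ids: torch.randn(1, ids.shape[1], vocab)
print(collaborative_decode(slm, llm, torch.zeros(1, 1, dtype=torch.long), 8))
```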
Self Supervised Boy
https://arxiv.org/abs/2601.03335v1
arXiv.org
Digital Red Queen: Adversarial Program Evolution in Core War with LLMs
Large language models (LLMs) are increasingly being used to evolve solutions to problems in many domains, in a process inspired by biological evolution. However, unlike biological evolution, most...
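The Core War setting makes fitness inherently relative: warriors are scored against the current opponent population rather than a fixed target, which is the Red Queen dynamic in the title. A toy co-evolution loop with stubbed-out battle and mutation (a real setup would call a MARS simulator and an LLM):
```python
import random

def fight(a: str, b: str) -> float:
    return random.random()                  # stub: run a real MARS battle

def mutate(warrior: str) -> str:
    return warrior + random.choice("ABCD")  # stub: LLM-proposed edit

def fitness(w, opponents):
    # Relative fitness: mean result against the *current* opposition.
    return sum(fight(w, o) for o in opponents) / len(opponents)

pop_a, pop_b = ["SPL 0"] * 4, ["MOV 0, 1"] * 4
for gen in range(10):
    pop_a = sorted(pop_a, key=lambda w: -fitness(w, pop_b))[:2]
    pop_b = sorted(pop_b, key=lambda w: -fitness(w, pop_a))[:2]
    pop_a += [mutate(w) for w in pop_a]     # refill with mutated elites
    pop_b += [mutate(w) for w in pop_b]
```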
Self Supervised Boy
https://arxiv.org/abs/2601.04786v1
arXiv.org
AgentOCR: Reimagining Agent History via Optical Self-Compression
Recent advances in large language models (LLMs) enable agentic systems trained with reinforcement learning (RL) over multi-turn interaction trajectories, but practical deployment is bottlenecked...
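I read "optical self-compression" as rendering long text history into an image that a vision-language model reads back, trading many text tokens for fewer image tokens. A minimal rendering sketch with Pillow; the VLM read-back step and all layout choices are omitted or assumed:
```python
from PIL import Image, ImageDraw

# Fake multi-turn agent history that would normally eat context tokens.
history = "\n".join(f"turn {i}: tool_call(...) -> ok" for i in range(40))

img = Image.new("RGB", (640, 480), "white")
ImageDraw.Draw(img).text((4, 4), history, fill="black")
img.save("history.png")  # a VLM would read this image instead of the text
```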
Self Supervised Boy
https://arxiv.org/abs/2601.05106
arXiv.org
Token-Level LLM Collaboration via FusionRoute
Large language models (LLMs) exhibit strengths across diverse domains. However, achieving strong performance across these domains with a single general-purpose model typically requires scaling to...
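Token-level collaboration presumably means a light router picks, at every decoding step, which specialist model emits the next token. A toy sketch with stand-in models; routing on the last token only is my simplification:
```python
import torch

def route_decode(models, router, ids, n_new, vocab):
    for _ in range(n_new):
        # Route on the last token (a simplification): pick one expert.
        feats = torch.nn.functional.one_hot(ids[:, -1], vocab).float()
        which = router(feats).argmax(dim=-1).item()
        logits = models[which](ids)[:, -1]         # expert's next-token logits
        ids = torch.cat([ids, logits.argmax(-1, keepdim=True)], dim=-1)
    return ids

vocab = 100
# Stand-in "experts": anything mapping (1, seq) ids -> (1, seq, vocab).
models = [lambda ids: torch.randn(1, ids.shape[1], vocab) for _ in range(2)]
router = torch.nn.Linear(vocab, len(models))       # toy untrained router
print(route_decode(models, router, torch.zeros(1, 1, dtype=torch.long), 6, vocab))
```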
Self Supervised Boy
https://arxiv.org/abs/2601.07582v1
arXiv.org
ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents
Memory is critical for dialogue agents to maintain coherence and enable continuous adaptation in long-term interactions. While existing memory mechanisms offer basic storage and retrieval...
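Event segmentation here likely means cutting the transcript where consecutive turns stop cohering, then storing one memory entry per segment. A cosine-similarity boundary detector is a common baseline and is what I sketch; the paper's detector may well be different:
```python
import numpy as np

def segment(turn_embs: np.ndarray, threshold: float = 0.6):
    """Split turn indices into events at similarity drops."""
    events, start = [], 0
    for i in range(1, len(turn_embs)):
        a, b = turn_embs[i - 1], turn_embs[i]
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        if cos < threshold:               # topic shift => event boundary
            events.append((start, i))
            start = i
    events.append((start, len(turn_embs)))
    return events

embs = np.random.default_rng(0).normal(size=(12, 32))  # stand-in embeddings
for s, e in segment(embs):
    print(f"event turns [{s}:{e}] -> summarize, index for later retrieval")
```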
Self Supervised Boy
https://arxiv.org/abs/2601.09503v1
arXiv.org
What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm...
Large language model (LLM) agents have demonstrated remarkable capabilities in complex decision-making and tool-use tasks, yet their ability to generalize across varying environments remains a...
Self Supervised Boy
https://arxiv.org/abs/2601.10343v1
arXiv.org
OctoBench: Benchmarking Scaffold-Aware Instruction Following in...
Modern coding scaffolds turn LLMs into capable software agents, but their ability to follow scaffold-specified instructions remains under-examined, especially when constraints are heterogeneous...
Self Supervised Boy
https://arxiv.org/abs/2601.10245v1
arXiv.org
TRIM: Hybrid Inference via Targeted Stepwise Routing in Multi-Step...
Multi-step reasoning tasks like mathematical problem solving are vulnerable to cascading failures, where a single incorrect step leads to complete solution breakdown. Current LLM routing methods...
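The contrast with token-level schemes like RelayLLM above: routing happens per reasoning step, escalating only the steps that look likely to start a cascade. A stub sketch where the risk criterion is a placeholder (the targeted criterion is the paper's contribution):
```python
def solve(problem, small, large, risky, max_steps=8):
    steps = []
    for _ in range(max_steps):
        step = small(problem, steps)      # cheap model drafts the step
        if risky(step):                   # one bad step can sink the chain,
            step = large(problem, steps)  # so escalate just this step
        steps.append(step)
        if step.endswith("DONE"):
            break
    return steps

# Stand-ins: real versions would call an SLM/LLM and a trained risk scorer.
small = lambda p, s: f"step {len(s)}" + (" DONE" if len(s) == 3 else "")
large = lambda p, s: f"step {len(s)} (verified)" + (" DONE" if len(s) == 3 else "")
risky = lambda step: "2" in step
print(solve("x + 2 = 5", small, large, risky))
```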
Self Supervised Boy
https://arxiv.org/abs/2601.10639v1
arXiv.org
STEM: Scaling Transformers with Embedding Modules
Fine-grained sparsity promises higher parametric capacity without proportional per-token compute, but often suffers from training instability, load balancing, and communication overhead. We...
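"Embedding modules" suggests extra capacity stored as lookup tables: each token reads only its own rows, so parameter count grows without growing per-token FLOPs (and without MoE-style load balancing). A toy version, with the indexing scheme being my assumption:
```python
import torch
import torch.nn as nn

class EmbeddingModule(nn.Module):
    def __init__(self, vocab=50_000, d_model=64):
        super().__init__()
        # Large parameter count, but each token touches a single row.
        self.table = nn.Embedding(vocab, d_model)

    def forward(self, h: torch.Tensor, ids: torch.Tensor) -> torch.Tensor:
        return h + self.table(ids)        # sparse read added to the residual

mod = EmbeddingModule()
ids = torch.randint(0, 50_000, (2, 10))
h = torch.randn(2, 10, 64)
print(mod(h, ids).shape)                  # per-token compute stays O(d_model)
```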
Self Supervised Boy
Forwarded from Just links
Time Horizon 1.1
https://metr.org/blog/2026-1-29-time-horizon-1-1/
metr.org
Time Horizon 1.1
We're releasing a new version of our time horizon estimates (TH1.1), using more tasks and a new eval infrastructure.