Monthly ranking
2026-06
Top 10 papers, rising papers, dominant themes, and the full monthly index.
Top 10 Papers
- The Verification Horizon: No Silver Bullet for Coding Agent Rewards (100)
- DanceOPD: On-Policy Generative Field Distillation (98)
- OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning (98)
- JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting (97)
- Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents (95)
- Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation (94)
- ViQ: Text-Aligned Visual Quantized Representations at Any Resolution (94)
- ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation (91)
- COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami (91)
- GUI vs. CLI: Execution Bottlenecks in Screen-Only and Skill-Mediated Computer-Use Agents (90)
Rising Papers
- The Verification Horizon: No Silver Bullet for Coding Agent Rewards (100/100)
- DanceOPD: On-Policy Generative Field Distillation (98/100)
- OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning (98/100)
- JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting (97/100)
- Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents (95/100)
Themes
- diffusion models (2)
- reinforcement learning (2)
- 3D meshes (1)
- 3D reasoning (1)
- Blind Trust Problem (1)
- Clopper-Pearson bound (1)
Complete Monthly Index
- The Verification Horizon: No Silver Bullet for Coding Agent Rewards. Keywords: generative capabilities, human intent, policy capability, proxy signals, reward design, reward hacking
- DanceOPD: On-Policy Generative Field Distillation. Keywords: classifier-free guidance, expert capabilities, flow-matching models, generative field distillation, global editing, local editing
- OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning. Keywords: critical-first routing, hierarchical skills, on-policy trajectories, outcome-based reinforcement learning, policy optimization, reinforcement learning
- JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting. Keywords: MoE Qwen3, acceptance rate, autoregressive Large Language Models, autoregressive factorization, bidirectional block-diffusion, branch-agnostic marginals
- Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents. Keywords: Markov decision process, advantage function, agentic settings, failure attribution, log-probability ratio, progress advantage
- Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation. Keywords: agentic framework, context gap, context grounding, context-aware planning, image agent bench, image agent capabilities
- ViQ: Text-Aligned Visual Quantized Representations at Any Resolution. Keywords: discrete representations, feature discretization, low-level reconstruction, multimodal modeling, position-aware head-wise quantization, proximal representation learning
- ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation. Keywords: GRPO, boundary-aware count policy, count-faithful image generation, crowd counting, cycle-consistent learning, density-aware adaptive zooming
- COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami. Keywords: aesthetic evaluation, base packing, co-creativity, computational origami, crease patterns, flat foldability
- GUI vs. CLI: Execution Bottlenecks in Screen-Only and Skill-Mediated Computer-Use Agents. Keywords: N/A
- Information-Aware KV Cache Compression for Long Reasoning. Keywords: Forward Influence, KV cache, KV cache compression, LLMs, attention weights, entropy-aware
- When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models. Keywords: Clopper-Pearson bound, GPQA-Diamond, Gaussian copula, Self-MoA, accuracy, beta
- Why Multi-Step Tool-Use Reinforcement Learning Collapses and How Supervisory Signals Fix It. Keywords: agentic reinforcement learning, catastrophic collapse, control tokens, erroneous example supervision, exploratory learning, hint-based guidance
- How Post-Training Shapes Biological Reasoning Models. Keywords: continued pre-training, foundation models, generalization, in-domain performance, language models, multimodal biological data
- LISA: Likelihood Score Alignment for Visual-condition Controllable Generation. Keywords: LISA, conditional control, decoder, diffusion models, disentangled features, feature projection
- Discretizing Reward Models. Keywords: Monte Carlo dropout, discretization, discriminative ability, oversensitivity, policy learning, reinforcement learning
- EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting. Keywords: NDVI, Normalized Difference Vegetation Index, climatological baseline, cumulative physical stress signals, diffusion models, meteorological forcing
- Hallucination in World Models is Predictable and Preventable. Keywords: coverage-aware sampling, curiosity rewards, data-centric signals, data-efficient fine-tuning, ground-truth actions, hallucination
- CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies. Keywords: LLM agents, agent behavior, autonomous agents, communication, cumulative net income, economic systems
- OpenBioRQ: Unsolved Biomedical Research Questions for Agents. Keywords: agentic collapse, agentic models, answer key, biomedical research questions, citation verification, frontier agents
- Fast LeWorldModel. Keywords: Joint-Embedding Predictive Architectures, LeWorldModel, action-prefix prediction, autoregressive rollout, latent transition model, latent world model
- Confidence-Aware Tool Orchestration for Robust Video Understanding. Keywords: Blind Trust Problem, agentic video understanding, calibrated reliability score, confidence-cost GRPO reward, evidence interface, reliability-relevance score
- Running the Gauntlet: Re-evaluating the Capabilities of Agents Beyond Familiar Environments. Keywords: 3D reasoning, agent generalization, agentic systems, automated evaluation engine, benchmark, graphical understanding
- PhysiFormer: Learning to Simulate Mechanics in World Space. Keywords: 3D meshes, attention factorised, autoregressive baselines, denoising diffusion process, diffusion transformer, permutation-invariant
- In-Context World Modeling for Robotic Control. Keywords: Vision-Language-Action models, in-context adaptation, novel configurations, parameter updates, real-world robot platforms, robot policies