Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2511.06221

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 93
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 102
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30, 2025 • 3
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey

Paper • 2511.07448 • Published Nov 5, 2025 • 3
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published Nov 20, 2025 • 109

HuggingFaceTB/SmolLM3-3B

Text Generation • 3B • Updated Sep 10, 2025 • 61.3k • • 892
HuggingFaceFW/fineweb

Viewer • Updated Jul 11, 2025 • 52.5B • 208k • 2.64k
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
p-e-w/Llama-3.1-8B-Instruct-heretic

Text Generation • 8B • Updated Nov 15, 2025 • 170 • 9

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 5
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Paper • 2508.10433 • Published Aug 14, 2025 • 144
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls

Paper • 2510.00184 • Published Sep 30, 2025 • 17
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 215
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29, 2025 • 98
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20, 2025 • 52
Performance Trade-offs of Optimizing Small Language Models for E-Commerce

Paper • 2510.21970 • Published Oct 24, 2025 • 3

ARE: Scaling Up Agent Environments and Evaluations

Paper • 2509.17158 • Published Sep 21, 2025 • 36
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Paper • 2510.08551 • Published Oct 9, 2025 • 33
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention

Paper • 2510.04212 • Published Oct 5, 2025 • 25
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14, 2025 • 28

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 93
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 102
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30, 2025 • 3
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey

Paper • 2511.07448 • Published Nov 5, 2025 • 3
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Paper • 2511.16043 • Published Nov 20, 2025 • 109

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

HuggingFaceTB/SmolLM3-3B

Text Generation • 3B • Updated Sep 10, 2025 • 61.3k • • 892
HuggingFaceFW/fineweb

Viewer • Updated Jul 11, 2025 • 52.5B • 208k • 2.64k
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
p-e-w/Llama-3.1-8B-Instruct-heretic

Text Generation • 8B • Updated Nov 15, 2025 • 170 • 9

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 215
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning

Paper • 2510.20150 • Published Oct 23, 2025 • 5
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Paper • 2508.10433 • Published Aug 14, 2025 • 144
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 104

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29, 2025 • 98
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't

Paper • 2503.16219 • Published Mar 20, 2025 • 52
Performance Trade-offs of Optimizing Small Language Models for E-Commerce

Paper • 2510.21970 • Published Oct 24, 2025 • 3

Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls

Paper • 2510.00184 • Published Sep 30, 2025 • 17
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9, 2025 • 133

ARE: Scaling Up Agent Environments and Evaluations

Paper • 2509.17158 • Published Sep 21, 2025 • 36
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation

Paper • 2510.08551 • Published Oct 9, 2025 • 33
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention

Paper • 2510.04212 • Published Oct 5, 2025 • 25
ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning

Paper • 2510.12693 • Published Oct 14, 2025 • 28

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs