- TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
  Paper • 2512.16093 • Published • 95
- Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
  Paper • 2511.22699 • Published • 236
- DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
  Paper • 2512.16676 • Published • 217
- Sharp Monocular View Synthesis in Less Than a Second
  Paper • 2512.10685 • Published • 28
Collections including paper arxiv:2510.11690
- Demystifying Reinforcement Learning in Agentic Reasoning
  Paper • 2510.11701 • Published • 32
- Self-Improving LLM Agents at Test-Time
  Paper • 2510.07841 • Published • 10
- Making Mathematical Reasoning Adaptive
  Paper • 2510.04617 • Published • 23
- DocReward: A Document Reward Model for Structuring and Stylizing
  Paper • 2510.11391 • Published • 27

- Is Noise Conditioning Necessary for Denoising Generative Models?
  Paper • 2502.13129 • Published • 1
- REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
  Paper • 2504.10483 • Published • 22
- Mean Flows for One-step Generative Modeling
  Paper • 2505.13447 • Published • 7
- Latent Diffusion Model without Variational Autoencoder
  Paper • 2510.15301 • Published • 49

- Diffusion Transformers with Representation Autoencoders
  Paper • 2510.11690 • Published • 166
- Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model
  Paper • 2510.12276 • Published • 147
- FlashWorld: High-quality 3D Scene Generation within Seconds
  Paper • 2510.13678 • Published • 73
- ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
  Paper • 2510.14847 • Published • 56

- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 506
- Diffusion Transformers with Representation Autoencoders
  Paper • 2510.11690 • Published • 166
- VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing
  Paper • 2510.05213 • Published • 6

- Diffusion Transformers with Representation Autoencoders
  Paper • 2510.11690 • Published • 166
- Back to Basics: Let Denoising Generative Models Denoise
  Paper • 2511.13720 • Published • 69
- Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion
  Paper • 2512.04926 • Published • 42

- MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
  Paper • 2510.08540 • Published • 109
- Diffusion Transformers with Representation Autoencoders
  Paper • 2510.11690 • Published • 166
- Spotlight on Token Perception for Multimodal Reinforcement Learning
  Paper • 2510.09285 • Published • 37
- Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
  Paper • 2510.17354 • Published • 35