-
Diffusion Language Models Know the Answer Before Decoding
Paper • 2508.19982 • Published • 27 -
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Paper • 2512.13586 • Published • 93 -
LSRIF: Logic-Structured Reinforcement Learning for Instruction Following
Paper • 2601.06431 • Published • 12 -
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
Paper • 2601.09088 • Published • 62
Collections
Collections including paper arxiv:2503.09573
-
Structured Denoising Diffusion Models in Discrete State-Spaces
Paper • 2107.03006 • Published • 1 -
Simplified and Generalized Masked Diffusion for Discrete Data
Paper • 2406.04329 • Published • 8 -
Simple and Effective Masked Diffusion Language Models
Paper • 2406.07524 • Published • 12 -
Large Language Diffusion Models
Paper • 2502.09992 • Published • 126
-
OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
Paper • 2506.07977 • Published • 41 -
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Paper • 2506.07986 • Published • 19 -
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Paper • 2506.06276 • Published • 26 -
Aligning Latent Spaces with Flow Priors
Paper • 2506.05240 • Published • 27
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55 -
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
Paper • 2505.16990 • Published • 22 -
D-AR: Diffusion via Autoregressive Models
Paper • 2505.23660 • Published • 34
-
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
Sadeed: Advancing Arabic Diacritization Through Small Language Model
Paper • 2504.21635 • Published • 59 -
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
Paper • 2509.21320 • Published • 101 -
Seedream 4.0: Toward Next-generation Multimodal Image Generation
Paper • 2509.20427 • Published • 82
-
kuleshov-group/bd3lm-owt-block_size16
Text Generation • 0.2B • Updated • 1.05k • 16 -
kuleshov-group/bd3lm-owt-block_size4
Text Generation • 0.2B • Updated • 2.87k • 3 -
kuleshov-group/bd3lm-owt-block_size8
Text Generation • 0.2B • Updated • 485 • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75
-
Large Language Diffusion Models
Paper • 2502.09992 • Published • 126 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
MMaDA: Multimodal Large Diffusion Language Models
Paper • 2505.15809 • Published • 98 -
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
Paper • 2505.15045 • Published • 55
-
Making Multimodal Generation Easier: When Diffusion Models Meet LLMs
Paper • 2310.08949 • Published • 1 -
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Paper • 2503.09573 • Published • 75 -
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Paper • 2308.04729 • Published • 32 -
PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation
Paper • 2411.08307 • Published • 7
-
RuCCoD: Towards Automated ICD Coding in Russian
Paper • 2502.21263 • Published • 133 -
Unified Reward Model for Multimodal Understanding and Generation
Paper • 2503.05236 • Published • 123 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27