Collections
Collections including paper arxiv:2512.00473
-
Seedream 4.0: Toward Next-generation Multimodal Image Generation
Paper • 2509.20427 • Published • 82
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
Paper • 2511.22699 • Published • 236
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards
Paper • 2512.00473 • Published • 26
-
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Paper • 2401.09048 • Published • 10
Improving fine-grained understanding in image-text pre-training
Paper • 2401.09865 • Published • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper • 2401.10891 • Published • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Paper • 2401.13627 • Published • 78
-
Adversarial Flow Models
Paper • 2511.22475 • Published • 23
DiP: Taming Diffusion Models in Pixel Space
Paper • 2511.18822 • Published • 29
Asking like Socrates: Socrates helps VLMs understand remote sensing images
Paper • 2511.22396 • Published • 5
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 17
-
EgoX: Egocentric Video Generation from a Single Exocentric Video
Paper • 2512.08269 • Published • 119
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Paper • 2512.08765 • Published • 132
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
Paper • 2512.09363 • Published • 72
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Paper • 2512.08478 • Published • 77