LingBot-World NF4 Quantized

Pre-quantized NF4 weights for the LingBot-World video generation model. This is a complete, self-contained package - no additional downloads are required.

Features

  • 4-bit NF4 quantization via bitsandbytes - fits in 32GB VRAM
  • Pre-quantized weights - no runtime quantization overhead
  • Complete package - includes T5 encoder, VAE, and diffusion models

Quick Start

# Clone the repo
git clone https://huggingface.co/cahlen/lingbot-world-base-cam-nf4
cd lingbot-world-base-cam-nf4

# Install dependencies
pip install -r requirements.txt

# Generate a video
python generate_prequant.py \
    --image your_image.jpg \
    --prompt "A cinematic video of the scene" \
    --frame_num 81 \
    --output output.mp4

Model Contents

File                                          Size      Description
high_noise_model_bnb_nf4/model.safetensors    ~9.6GB    NF4 quantized diffusion model (high noise)
low_noise_model_bnb_nf4/model.safetensors     ~9.6GB    NF4 quantized diffusion model (low noise)
models_t5_umt5-xxl-enc-bf16.pth               ~10.6GB   T5-XXL text encoder
Wan2.1_VAE.pth                                ~485MB    VAE encoder/decoder

Total size: ~30GB (vs ~85GB for full precision models)
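
To confirm that a clone contains everything listed above, a small check like the following can help. This is a minimal sketch: the file names come from the table, but the helper `missing_files` is hypothetical and not part of this repo. It only tests for presence, not for file size or integrity.

```python
from pathlib import Path

# File names are taken from the Model Contents table.
# The helper below is a hypothetical convenience, not part of the repo.
EXPECTED_FILES = [
    "high_noise_model_bnb_nf4/model.safetensors",
    "low_noise_model_bnb_nf4/model.safetensors",
    "models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1_VAE.pth",
]


def missing_files(repo_dir):
    """Return the expected files that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Run from inside the cloned directory.
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files present.")
```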

Usage

Basic Generation

python generate_prequant.py \
    --image input.jpg \
    --prompt "Your prompt here" \
    --frame_num 81 \
    --size "480*832" \
    --output output.mp4

Parameters

Parameter          Default        Description
--image            required       Input image path
--prompt           required       Text prompt describing the video
--frame_num        81             Number of frames (81 = ~5 seconds at 16fps)
--size             "480*832"      Output resolution (height*width)
--sampling_steps   40             Diffusion sampling steps
--guide_scale      5.0            Classifier-free guidance scale
--seed             -1             Random seed (-1 for random)
--output           "output.mp4"   Output video path
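
The relationship between `--frame_num`, `--size`, and the resulting clip can be sketched as follows. This is an illustration only: how `generate_prequant.py` parses these values internally is not documented here, and the helper names `parse_size` and `clip_seconds` are hypothetical. The 16 fps figure is the one quoted in the table.

```python
def parse_size(size):
    """Split a "height*width" string such as "480*832" into two ints."""
    height, width = (int(part) for part in size.split("*"))
    return height, width


def clip_seconds(frame_num, fps=16):
    """Clip length in seconds, assuming 16 fps playback as stated above."""
    return frame_num / fps


print(parse_size("480*832"))        # (480, 832)
print(round(clip_seconds(81), 2))   # 5.06
```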

With Camera Control

python generate_prequant.py \
    --image input.jpg \
    --prompt "Your prompt" \
    --action_path /path/to/camera_poses/ \
    --frame_num 81

Camera pose directory should contain:

  • poses.npy: Shape [num_frames, 4, 4] - camera transformation matrices
  • intrinsics.npy: Shape [num_frames, 4] - [fx, fy, cx, cy]
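
As a starting point, the two files can be produced with NumPy. The sketch below writes a static camera (identity pose on every frame) with the shapes listed above. Several things here are assumptions, not confirmed by this README: the `float32` dtype, the pose convention (camera-to-world vs. world-to-camera), pixel units for the intrinsics, and the placeholder focal length and principal point. The helper `write_static_camera` is hypothetical.

```python
from pathlib import Path

import numpy as np


def write_static_camera(out_dir, num_frames=81,
                        fx=500.0, fy=500.0, cx=416.0, cy=240.0):
    """Write poses.npy and intrinsics.npy for a camera that does not move.

    The intrinsics values are placeholders (assumption); replace them
    with values that match your input image.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # One 4x4 identity transform per frame -> shape [num_frames, 4, 4]
    poses = np.tile(np.eye(4, dtype=np.float32), (num_frames, 1, 1))

    # One [fx, fy, cx, cy] row per frame -> shape [num_frames, 4]
    intrinsics = np.tile(
        np.array([fx, fy, cx, cy], dtype=np.float32), (num_frames, 1)
    )

    np.save(out / "poses.npy", poses)
    np.save(out / "intrinsics.npy", intrinsics)
    return poses, intrinsics
```

Calling `write_static_camera("camera_poses", num_frames=81)` creates a directory that can then be passed to `--action_path`. Keeping `num_frames` equal to `--frame_num` seems the safest choice, though whether the script requires an exact match is not stated here.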

Requirements

  • Python 3.10+
  • CUDA 11.8+ (tested with CUDA 12.x)
  • ~32GB VRAM (RTX 4090, RTX 5090, A100, etc.)

Quantization Details

The diffusion models are quantized using bitsandbytes NF4 with double quantization:

{
  "format": "bnb_nf4",
  "double_quant": true,
  "compute_dtype": "bfloat16",
  "blocksize": 64
}

This achieves ~3.9x compression while maintaining generation quality.
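
The ~3.9x figure is consistent with a back-of-the-envelope estimate of storage cost per parameter. The sketch below assumes a 16-bit (bfloat16) baseline, 32-bit per-block scaling constants without double quantization, and, with double quantization, 8-bit constants plus one 32-bit value per 256 blocks (the scheme described for NF4 double quantization in QLoRA). These details are assumptions for illustration; layers that stay unquantized are ignored.

```python
def nf4_bits_per_param(blocksize=64, double_quant=True, second_level_block=256):
    """Approximate storage cost of NF4 in bits per parameter."""
    bits = 4.0  # the 4-bit NF4 code itself
    if double_quant:
        # 8-bit quantized scaling constant, one per block (assumption)
        bits += 8.0 / blocksize
        # 32-bit second-level constant shared by 256 blocks (assumption)
        bits += 32.0 / (blocksize * second_level_block)
    else:
        # one 32-bit scaling constant per block
        bits += 32.0 / blocksize
    return bits


bits = nf4_bits_per_param()
ratio = 16.0 / bits  # versus a 16-bit baseline
print(f"{bits:.3f} bits/param, {ratio:.2f}x vs 16-bit")
```

With blocksize 64 and double quantization this gives roughly 4.13 bits per parameter, or close to 3.9x smaller than 16-bit weights.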

License

This model is based on LingBot-World and follows its license terms.

Citation

@misc{lingbot-world-nf4,
  title={LingBot-World NF4 Quantized},
  year={2025},
  url={https://huggingface.co/cahlen/lingbot-world-base-cam-nf4}
}