[models] GTX 1660 Super 6GB
The best little card under 100 euros. Full precision vs. quants not benchmarked. This card is much better at running inference than you might realize.
Text Generation • 8B
- Quant: Q4_K_M (full GPU offload)
- Reasoning: yes
- Context: 2048 tok (4096 too big for 1660 VRAM, just start a new chat anyway lmao)
- Throughput: 14.44 tok/sec, 2796 tokens, 0.55 s to first token
- Thinking duration: ~2 min
- Benchmark prompt: "How can I use { ModelName } to automate my office's intranet?"
- Note: the reasoning step ran nearer 30 tok/sec; response throughput was measurably slower. Very verbose (blew its context limit in one reply).
unsloth/Qwen3-4B-Instruct-2507-GGUF
4B
- Quant: Q8_K_XL (full GPU offload)
- Reasoning: no
- Context: 4096 tok
- Throughput: 14.78 tok/sec, 546 tokens, 0.62 s to first token
- Benchmark prompt: "How can I use { ModelName } to automate my office's intranet?"
- Note: fast and concise.
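For reference, the throughput figures in these entries (time to first token, then tok/sec over the decode phase) can be reconstructed from per-token arrival timestamps. This is a minimal sketch, not the tool that produced the numbers above; the sample timings are synthetic, built from the Qwen3-4B figures:

```python
def stream_stats(start: float, token_times: list[float]) -> dict:
    """Compute time-to-first-token and decode throughput from a
    request start time and per-token arrival timestamps."""
    ttft = token_times[0] - start
    n = len(token_times)
    # Measure decode rate from the first token onward, so a slow
    # prompt/prefill phase doesn't drag the tok/sec figure down.
    decode_time = token_times[-1] - token_times[0]
    tps = (n - 1) / decode_time if decode_time > 0 else float("inf")
    return {"tokens": n, "ttft_s": round(ttft, 2), "tok_per_s": round(tps, 2)}

# Synthetic example: 546 tokens arriving at a steady 14.78 tok/sec
# after a 0.62 s wait for the first token.
start = 0.0
times = [0.62 + i / 14.78 for i in range(546)]
print(stream_stats(start, times))
```

Note that averaging over the whole response is why a reasoning model can report ~30 tok/sec during thinking but a lower overall figure.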
unsloth/SmolLM3-3B-128K-GGUF
3B
- Quant: 128K-UD-Q8_K_XL (full GPU offload)
- Reasoning: yes
- Context: 4096 tok
- Throughput: 34.13 tok/sec, 1774 tokens, 1.60 s to first token
- Thinking duration: 25 sec
- Benchmark prompt: "How can I use { ModelName } to automate my office's intranet?"
- Note: a reliable LLM for generating synthetic data on small GPUs. Ran a full 24 h cycle generating and reviewing manacaster art prompts with reasonably good results for a tiny model!
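That kind of unattended generate-and-review cycle can be sketched as a deadline-bounded loop. This is an illustrative skeleton only, assuming a local OpenAI-compatible server (e.g. llama.cpp's); `ask` is a stub here so the structure is runnable without a model:

```python
import time

def ask(prompt: str) -> str:
    # Stub stand-in for a call to a local inference server's
    # /v1/chat/completions endpoint; swap in a real HTTP call.
    return f"draft for: {prompt}"

def run_cycle(seeds, deadline_s, clock=time.monotonic):
    """Generate a draft for each seed topic, then have the model
    review its own output. Stops when the time budget is exhausted."""
    out = []
    stop = clock() + deadline_s
    for seed in seeds:
        if clock() >= stop:
            break
        draft = ask(f"Write an art prompt about {seed}.")
        review = ask(f"Review this art prompt and rate it 1-10: {draft}")
        out.append({"seed": seed, "draft": draft, "review": review})
    return out

print(len(run_cycle(["storm mage", "clockwork golem"], deadline_s=5.0)))
```

For a real 24 h run you would pass `deadline_s=24 * 3600` and cycle over the seed list repeatedly, logging each draft/review pair to disk.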
unsloth/Ministral-3-3B-Reasoning-2512-GGUF
3B
- Quant: Q6_K_XL
- Note: blazing fast; quality seems fine. Image/multimodal input not tested yet.
-
mradermacher/SERA-8B-GA-i1-GGUF
8B
-
eaddario/Olmo-3-7B-Think-GGUF
Text Generation • 7B