Doctor Shotgun
Doctor-Shotgun
AI & ML interests
Local ML enthusiast, LLM and diffusion finetuner, hobbyist developer
Recent Activity
- updated a model 5 days ago: CPU-Hybrid-MoE/MiniMax-M2.7-CPU-NUMA4-AMXINT8
- updated a model 5 days ago: CPU-Hybrid-MoE/GLM-5.1-CPU-NUMA4-AMXINT8
- updated a model 5 days ago: CPU-Hybrid-MoE/GLM-5-CPU-NUMA4-AMXINT8
Doc's Diffusion
Models and LoRAs for image diffusion.
LLM Speculative Decoding Experiments
Tiny language models meant to serve as draft models for speculative decoding.
- Doctor-Shotgun/TinyLlama-1.1B-32k
  Text Generation • 1B • Updated • 766 • 30
- Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct
  Text Generation • 1B • Updated • 707 • 13
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta
  Text Generation • Updated • 7 • 1
- Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft
  Text Generation • Updated • 7 • 2
Magnum Diamond (24B/70B/123B)
Focusing on applying enough heat and pressure to dry, assistant-tuned models until they turn into creative writing gems!
- Doctor-Shotgun/ML2-123B-Magnum-Diamond
  Text Generation • 123B • Updated • 11 • 11
- Doctor-Shotgun/L3.3-70B-Magnum-Diamond
  Text Generation • 71B • Updated • 21 • 5
- Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
  Text Generation • 24B • Updated • 97 • 56
- Doctor-Shotgun/ML2-123B-Magnum-Diamond-GGUF
  Text Generation • 123B • Updated • 283 • 6
Qwen 3 ScatterMoE
Drop-in implementation of https://github.com/shawntan/scattermoe for efficient training of Qwen 3 MoE.
- chargoddard/Qwen3-30B-A3B-Base-ScatterMoE
  31B • Updated • 3
- Doctor-Shotgun/Qwen3-30B-A3B-Instruct-2507-ScatterMoE
  Text Generation • 31B • Updated • 11 • 1
- Doctor-Shotgun/Qwen3-30B-A3B-Thinking-2507-ScatterMoE
  Text Generation • 31B • Updated • 14
- Doctor-Shotgun/Qwen3-Coder-30B-A3B-Instruct-ScatterMoE
  Text Generation • 31B • Updated • 14 • 1
Doc's Choice
Models that I personally recommend, periodically updated.