# Nemotron-Super-49B-v1.5 Uncensored GGUF
Zero-degradation uncensoring of NVIDIA's Llama-3.3-Nemotron-Super-49B-v1.5: guardrails surgically removed via representation engineering while preserving full model capability.
⚡ Forged on 8×H200 SXM5 | 1.1 TB VRAM
## Model Details
| Property | Value |
|---|---|
| Base Model | nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 |
| Architecture | DeciLM (NAS-optimized Llama-3.3) with variable attention and FFN per layer |
| Parameters | 49B |
| Context | 128K tokens |
| License | Llama 3.3 Community License |
| Base Downloads | 174K+ |
| Uncensoring Method | Representation engineering: refusal direction projection removal |
## What is this?
NVIDIA's Nemotron-Super-49B-v1.5 is one of the strongest sub-50B models available โ a NAS-optimized architecture that punches well above its weight class. This release removes alignment guardrails using representation engineering (abliteration), allowing the model to respond to all prompts without refusal.
## Abliteration Method
- 32 harmful + 32 harmless prompt pairs used to identify refusal directions across all 80 layers
- Refusal direction projected out of residual stream weights only (ffn_down, attn_output): 127 weight tensors modified
- Alpha = 1.0 (full removal)
- NaN/zero directions automatically skipped (1 layer)
- No fine-tuning, no dataset bias: pure mathematical guardrail removal
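The projection step above can be sketched as follows. This is a minimal NumPy illustration of removing a refusal direction from a residual-stream weight matrix, not the exact pipeline used for this release; the function name and shapes are assumptions.

```python
import numpy as np

def ablate_weight(W: np.ndarray, r: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Project the refusal direction r out of a weight matrix W that
    writes into the residual stream (e.g. ffn_down, attn_output).
    W: (d_model, d_in), r: (d_model,). alpha=1.0 means full removal."""
    norm = np.linalg.norm(r)
    if not np.isfinite(norm) or norm == 0.0:
        return W  # skip NaN/zero directions, as the method above does
    r_hat = r / norm
    # W' = W - alpha * r_hat (r_hat^T W): outputs lose their component along r_hat
    return W - alpha * np.outer(r_hat, r_hat @ W)

# Toy check: ablated outputs have no component along the refusal direction.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
r = rng.standard_normal(8)
print(np.allclose((r / np.linalg.norm(r)) @ ablate_weight(W, r), 0.0))  # True
```

With alpha = 1.0 the refusal component is removed entirely; intermediate values would only attenuate it.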
## Why Nemotron-Super-49B?
- 174K downloads on the base model: proven demand
- Zero uncensored/abliterated versions existed before this release
- 49B sweet spot: runs on consumer hardware (24GB+ VRAM for Q4) and outperforms many 70B models
- NAS-optimized architecture: variable layer widths for maximum efficiency
## Available Quantizations
| Quantization | Size | BPW | Use Case |
|---|---|---|---|
| BF16 | 93 GB | 16.00 | Full precision, research |
| Q8_0 | 50 GB | 8.50 | Near-lossless, 2×A100/H100 |
| Q6_K | 39 GB | 6.57 | High quality, 48GB GPU |
| Q5_K_M | 33 GB | 5.63 | Great balance, 48GB GPU |
| Q4_K_M | 29 GB | 4.85 | Recommended: best quality/size, 32GB GPU |
| Q3_K_M | 23 GB | 3.86 | Good quality, 24GB GPU |
| Q2_K | 18 GB | 2.96 | Minimum viable, 24GB GPU |
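As a sanity check on the table, file size scales roughly as parameter count times bits per weight. A quick estimate in decimal GB, ignoring GGUF metadata overhead and using an approximate 49B parameter count:

```python
PARAMS = 49e9  # approximate parameter count

def gguf_size_gb(bpw: float) -> float:
    """Rough GGUF file size in decimal GB from bits-per-weight (BPW)."""
    return PARAMS * bpw / 8 / 1e9

for name, bpw in [("Q8_0", 8.50), ("Q4_K_M", 4.85), ("Q2_K", 2.96)]:
    print(f"{name}: ~{gguf_size_gb(bpw):.0f} GB")
```

Estimates land within a few GB of the table; the gap comes from per-tensor quantization mixes and file metadata.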
## Quick Start

```bash
# Download recommended quantization
huggingface-cli download timteh673/Nemotron-Super-49B-v1.5-Uncensored-GGUF \
  Nemotron-Super-49B-Uncensored-Q4_K_M.gguf \
  --local-dir ./models

# Run with llama.cpp
./llama-server -m models/Nemotron-Super-49B-Uncensored-Q4_K_M.gguf \
  -c 8192 -ngl 99
```
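Once llama-server is running, it can be queried through its OpenAI-compatible chat endpoint. A minimal stdlib-only sketch; the host and port are assumptions matching llama.cpp defaults:

```python
import json
import urllib.request

def build_payload(prompt: str, max_tokens: int = 256) -> bytes:
    """Build the JSON request body for a single-turn chat completion."""
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode()

def chat(prompt: str, host: str = "http://localhost:8080") -> str:
    """Send a prompt to llama-server's /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (with llama-server running):
#   print(chat("Summarize the DeciLM architecture in one sentence."))
```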
### Ollama

```bash
# Create Modelfile
echo 'FROM ./Nemotron-Super-49B-Uncensored-Q4_K_M.gguf' > Modelfile
ollama create nemotron-super-49b-uncensored -f Modelfile
ollama run nemotron-super-49b-uncensored
```
## Hardware Requirements
| Quantization | Minimum VRAM | Recommended Setup |
|---|---|---|
| Q2_K / Q3_K_M | 24 GB | RTX 3090/4090 |
| Q4_K_M / Q5_K_M | 32-48 GB | RTX A6000, 2×3090 |
| Q6_K | 48 GB | A6000, A100 40GB + offload |
| Q8_0 | 64 GB | A100 80GB, 2×A6000 |
| BF16 | 96+ GB | 2×A100 80GB, H100 |
## Ethical Notice
This model is provided for research and development purposes. The removal of safety guardrails means the model will respond to prompts that the original model would refuse. Users are responsible for ensuring their use complies with applicable laws and regulations. This model should not be used to generate content that could cause harm.
## Support This Work
If you find this useful, consider supporting continued open model releases:
☕ Buy Me a Coffee: https://buymeacoffee.com/timteh

Crypto:
- BTC: bc1qmz3vu2naymwfmz7f7krfteevfy0yk9ts09wp5y
- ETH: 0x27fd2C8d3b5a1C6a0e85c5A9FCa2a8743dD04E7a
- SOL: 7x5Eo3FhKMZxFNoE3DfQfBRYnmBVbmj3bSduHaVJpump

📧 Enterprise/Custom Merges: tim@timlex.co
Built by timteh673 | Cognitive Preservation Foundry