
30% Smaller, +0.4% Better

Qwen2.5-3B pruned by 30% and retrained for general tasks through Experiential Plasticity.

2.30 → 2.29 perplexity · 3 cycles

Verify Chain of Custody

Every claim on this card is verified
Trust: self-attested · 1 benchmark · 2 devices tested
ForgeAlloy chain of custody · Download alloy · Merkle-chained


Qwen2.5-3B with cryptographic provenance via the ForgeAlloy chain of custody.

Benchmarks

| Benchmark | Result | Verified |
|---|---|---|
| perplexity | 2.3 | Self-reported |

What Changed (Base → Forged)

| | Base | Forged | Delta |
|---|---|---|---|
| Perplexity (general) | 2.30 | 2.29 | -0.4% ✅ |
| Pruning | None | 30% heads (magnitude) | -30% params ✅ |
| Training | General | general, 1000 steps, LR 2e-4 | 3 cycles |
| Pipeline | — | prune → train | 3 cycles |

Runs On

| Device | Format | Size | Status |
|---|---|---|---|
| MacBook Pro 16GB | fp16 | — | Verified |
| MacBook Pro 32GB | fp16 | — | Verified |
| MacBook Pro 32GB | fp16 | 8.0GB | Expected |
| MacBook Air 16GB | Q8_0 | ~4.0GB | Expected |
| MacBook Air 8GB | Q4_K_M | ~2.5GB | Expected |
| iPhone / Android | Q4_K_M | ~2.5GB | Expected |
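The sizes above can be sanity-checked with a back-of-envelope bytes-per-parameter calculation. This is a rough sketch using approximate average bits-per-weight for each format (the GGUF quantization averages here are assumptions, and the table's figures likely include runtime overhead such as KV cache on top of raw weight size):

```python
# Rough on-disk weight size: params * bits-per-weight / 8.
# Bits-per-weight for Q8_0 / Q4_K_M are approximate GGUF averages;
# real files add metadata and keep some tensors at higher precision.
BITS_PER_WEIGHT = {"fp16": 16, "Q8_0": 8.5, "Q4_K_M": 4.5}

def est_size_gb(params: float, fmt: str) -> float:
    return params * BITS_PER_WEIGHT[fmt] / 8 / 1e9

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{est_size_gb(3e9, fmt):.1f} GB")  # weights only
```

For a 3B-parameter model this yields roughly 6 GB at fp16, ~3 GB at Q8_0, and under 2 GB at Q4_K_M; the gap to the table's numbers is the working memory a device needs beyond the weights themselves.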

Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "continuum-ai/qwen2.5-3b-general-forged",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("continuum-ai/qwen2.5-3b-general-forged")

inputs = tokenizer("def merge_sort(arr):", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Methodology

Produced via head pruning. Full methodology, ablations, and per-stage rationale are in the methodology paper and the companion MODEL_METHODOLOGY.md in this repository. The pipeline ran as prune → train over 3 cycles on a MacBook Pro 16GB.
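To illustrate the "30% heads (magnitude)" step, here is a minimal sketch of magnitude-based head selection: score each attention head by the L2 norm of its slice of a projection matrix and keep the top 70%. The function name, the use of the output projection, and the toy dimensions are all illustrative assumptions, not the actual ForgeAlloy implementation:

```python
import numpy as np

# Hypothetical sketch of magnitude-based head pruning: rank heads by
# the L2 norm of their slice of a projection matrix, keep the top 70%.
def heads_to_keep(proj: np.ndarray, num_heads: int, keep_frac: float = 0.7):
    head_dim = proj.shape[0] // num_heads      # proj: [num_heads*head_dim, out]
    scores = [np.linalg.norm(proj[h * head_dim:(h + 1) * head_dim])
              for h in range(num_heads)]
    k = int(round(num_heads * keep_frac))
    return sorted(np.argsort(scores)[::-1][:k].tolist())

rng = np.random.default_rng(0)
w = rng.normal(size=(16 * 64, 1024))           # toy 16-head projection
kept = heads_to_keep(w, num_heads=16)
print(len(kept))                               # 11 of 16 heads survive a 30% prune
```

In a real pipeline the surviving heads' weights would be copied into a smaller model, which is then retrained (here, three prune → train cycles) to recover quality.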

Chain of Custody

Scan the QR or verify online. Download the alloy file to verify independently.

| What | Proof |
|---|---|
| Model weights | sha256:80def3c4bcf296e5960c37244b43018cc... |
| Code that ran | sha256:legacy-pre-alloy-... |
| Forged on | MacBook Pro 16GB, 2026-03-27T09:33:23-05:00 |
| Trust level | self-attested |
| Spec | ForgeAlloy — Rust/Python/TypeScript |
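The "Merkle-chained" property can be illustrated with a toy hash chain: each entry's digest commits to its payload plus the previous digest, so altering any earlier step invalidates every later one. The entry format below is an assumption for illustration; the real ForgeAlloy .alloy layout may differ:

```python
import hashlib

# Toy Merkle-style custody chain (assumed entry format, not the real
# .alloy layout): each digest covers the payload plus the prior digest.
def chain(entries, prev="0" * 64):
    digests = []
    for payload in entries:
        prev = hashlib.sha256((prev + payload).encode()).hexdigest()
        digests.append(prev)
    return digests

log = chain(["prune:30%-heads", "train:cycle-1", "train:cycle-2", "train:cycle-3"])
print(log[-1][:16])  # the tail digest commits to the whole pipeline
```

Recomputing the chain from the recorded entries and comparing the tail digest is enough to detect tampering anywhere in the history.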

Make Your Own

Forged with Continuum, a distributed AI world that runs on your hardware.

Continuum Model Factory

The Factory configurator lets you design and forge custom models visually: context extension, pruning, LoRA, quantization, and vision/audio modalities. Pick your target devices, and the system figures out what fits.

GitHub · All Models · Forge-Alloy

License

Apache-2.0

Downloads last month: 683
Model size: 3B params · Tensor type: BF16
Formats: Safetensors · MLX


Model tree for continuum-ai/qwen2.5-3b-general-forged

Base model: Qwen/Qwen2.5-3B → Finetuned (370): this model