Do We Really Need More Parameters, or the Right Parameters?
A Comparative Study of AI Text Humanization Across Model Scales
Rofati · April 2026
Read the Paper (PDF)
The prevailing assumption in LLM deployment is that larger models produce better outputs. We challenge this in the domain of AI text humanization: rewriting machine-generated text so it reads as naturally human-written.
We compare three models spanning a 48× parameter range (Qwen 2.5-1.5B with speculative decoding, Gemma 4 E4B at 4-bit quantization, and Qwen 2.5-72B at full precision) on five diverse AI-generated passages using identical prompts.
The 4.5B quantized model wins. It surpasses the 72B model by 62% in contraction usage and 66% in sentence-structure variation, and achieves 100% AI-pattern removal, all while running on free CPU hardware.
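The headline percentages are relative gains computed from the results table below; a quick check (values copied from the table, metric names are illustrative):

```python
# Relative improvement of Gemma 4 E4B over Qwen 2.5-72B, from the results table.
gemma_contractions, qwen72_contractions = 4.2, 2.6
gemma_sent_var, qwen72_sent_var = 46.4, 28.0

contraction_gain = gemma_contractions / qwen72_contractions - 1  # ~0.615
variance_gain = gemma_sent_var / qwen72_sent_var - 1             # ~0.657

print(round(contraction_gain * 100), round(variance_gain * 100))  # 62 66
```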
We introduce the "Polished AI" trap: larger models are so fluent that their rewrites become more uniform, and therefore more detectable, than those of mid-scale counterparts.
| Model | Params | AI Patterns ↓ | Contractions ↑ | Sent. Variance ↑ | Word Diversity ↑ |
|---|---|---|---|---|---|
| Original AI Text | – | 5.0 | 0.6 | 34.4 | 0.747 |
| Qwen 2.5-1.5B | 1.5B | 0.4 | 1.6 | 17.3 | 0.861 |
| Gemma 4 E4B | 4.5B | 0.0 | 4.2 | 46.4 | 0.826 |
| Qwen 2.5-72B | 72B | 0.0 | 2.6 | 28.0 | 0.791 |
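The four columns can be approximated with simple text statistics. The sketch below is illustrative only, not the paper's implementation: the AI-phrase list, the contraction heuristic (apostrophe-bearing tokens per 100 words), sentence variance as population variance of words-per-sentence, and diversity as a type-token ratio are all assumptions.

```python
import re
import statistics

# Hypothetical stock phrases commonly flagged as AI tells (not the paper's list).
AI_PHRASES = ["delve into", "in conclusion", "it is important to note",
              "furthermore", "moreover"]

def humanization_metrics(text: str) -> dict:
    """Illustrative heuristics for the four table columns."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Contractions per 100 words (tokens containing an apostrophe).
    contractions = sum(1 for w in words if "'" in w)
    contraction_rate = 100 * contractions / max(len(words), 1)

    # Variance of sentence lengths, in words per sentence.
    lengths = [len(s.split()) for s in sentences]
    sent_variance = statistics.pvariance(lengths) if len(lengths) > 1 else 0.0

    # Word diversity as type-token ratio.
    diversity = len(set(words)) / max(len(words), 1)

    # Count of stock AI phrases remaining in the text.
    ai_patterns = sum(text.lower().count(p) for p in AI_PHRASES)

    return {
        "ai_patterns": ai_patterns,
        "contraction_rate": round(contraction_rate, 1),
        "sent_variance": round(sent_variance, 1),
        "word_diversity": round(diversity, 3),
    }

sample = ("Don't worry. It is important to note that we delve into details. "
          "Furthermore, results vary.")
print(humanization_metrics(sample))
```

Scoring the original passage and each model's rewrite with the same function would yield rows comparable to the table above.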
Live Demos
Citation
@article{rofati2026rightparameters,
  title={Do We Really Need More Parameters, or the Right Parameters? A Comparative Study of AI Text Humanization Across Model Scales},
  author={Rofati},
  year={2026},
  url={https://huggingface.co/Rofati/right-parameters-not-more-parameters}
}