Do We Really Need More Parameters, or the Right Parameters?
A Comparative Study of AI Text Humanization Across Model Scales
Rofati · April 2026
Read the Paper (PDF)
The prevailing assumption in LLM deployment is that larger models produce better outputs. We challenge this in the domain of AI text humanization: rewriting machine-generated text so it reads as naturally human-written.
We compare three models spanning a 48× parameter range (Qwen 2.5-1.5B with speculative decoding, Gemma 4 E4B at 4-bit quantization, and Qwen 2.5-72B at full precision) on five diverse AI-generated passages using identical prompts.
The 4.5B quantized model wins. It surpasses the 72B model by 62% in contraction usage and 66% in sentence-structure variation, and achieves 100% AI-pattern removal, all while running on free CPU hardware.
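The headline percentages are relative gains computed from the results table below; a quick check (values copied from the table, metric names are illustrative):

```python
# Relative improvement of Gemma 4 E4B over Qwen 2.5-72B, from the results table.
gemma_contractions, qwen72_contractions = 4.2, 2.6
gemma_sent_var, qwen72_sent_var = 46.4, 28.0

contraction_gain = gemma_contractions / qwen72_contractions - 1  # ~0.615
variance_gain = gemma_sent_var / qwen72_sent_var - 1             # ~0.657

print(round(contraction_gain * 100), round(variance_gain * 100))  # 62 66
```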
We introduce the "Polished AI" trap: larger models are so fluent that their rewrites become more uniform, and therefore more detectable, than those of mid-scale counterparts.
| Model | Params | AI Patterns ↓ | Contractions ↑ | Sent. Variance ↑ | Word Diversity ↑ |
|---|---|---|---|---|---|
| Original AI Text | – | 5.0 | 0.6 | 34.4 | 0.747 |
| Qwen 2.5-1.5B | 1.5B | 0.4 | 1.6 | 17.3 | 0.861 |
| Gemma 4 E4B | 4.5B | 0.0 | 4.2 | 46.4 | 0.826 |
| Qwen 2.5-72B | 72B | 0.0 | 2.6 | 28.0 | 0.791 |
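The four columns can be approximated with simple text statistics. The sketch below is illustrative only, not the paper's implementation: the AI-phrase list, the contraction heuristic (apostrophe-bearing tokens per 100 words), sentence variance as population variance of words-per-sentence, and diversity as a type-token ratio are all assumptions.

```python
import re
import statistics

# Hypothetical stock phrases commonly flagged as AI tells (not the paper's list).
AI_PHRASES = ["delve into", "in conclusion", "it is important to note",
              "furthermore", "moreover"]

def humanization_metrics(text: str) -> dict:
    """Illustrative heuristics for the four table columns."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Contractions per 100 words (tokens containing an apostrophe).
    contractions = sum(1 for w in words if "'" in w)
    contraction_rate = 100 * contractions / max(len(words), 1)

    # Variance of sentence lengths, in words per sentence.
    lengths = [len(s.split()) for s in sentences]
    sent_variance = statistics.pvariance(lengths) if len(lengths) > 1 else 0.0

    # Word diversity as type-token ratio.
    diversity = len(set(words)) / max(len(words), 1)

    # Count of stock AI phrases remaining in the text.
    ai_patterns = sum(text.lower().count(p) for p in AI_PHRASES)

    return {
        "ai_patterns": ai_patterns,
        "contraction_rate": round(contraction_rate, 1),
        "sent_variance": round(sent_variance, 1),
        "word_diversity": round(diversity, 3),
    }

sample = ("Don't worry. It is important to note that we delve into details. "
          "Furthermore, results vary.")
print(humanization_metrics(sample))
```

Scoring the original passage and each model's rewrite with the same function would yield rows comparable to the table above.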
Live Demos
Citation
@article{rofati2026rightparameters,
  title={Do We Really Need More Parameters, or the Right Parameters? A Comparative Study of AI Text Humanization Across Model Scales},
  author={Rofati},
  year={2026},
  url={https://huggingface.co/Rofati/right-parameters-not-more-parameters}
}