eloquent26 RoBERTa-base detector – PAN'25/26 Voight-Kampff Subtask 1

This model is roberta-base fine-tuned on the official PAN'25/26 Generative AI Detection training split (Zenodo DOI 10.5281/zenodo.14962653). It is one of the detectors in the eloquent26 panel used for the ELOQUENT 2026 Voight-Kampff research paper.

Training data

  • Source: Bevendorff et al., PAN'25/26 Generative AI Detection: Voight-Kampff AI Detection Sensitivity (Zenodo, March 2025).
  • Split: train.jsonl (n = 23,707; 9,101 human + 14,606 LLM-generated); a loading sketch follows this list.
  • Generator models in train: gpt-3.5-turbo, gpt-4o, gpt-4o-mini, o3-mini, gemini-1.5-pro, gemini-2.0-flash, llama-3.1-8b-instruct, llama-3.3-70b-instruct, ministral-8b-instruct-2410, deepseek-r1-distill-qwen-32b, falcon3-10b-instruct, gpt-4.5-preview, gpt-4-turbo-paraphrase, gemini-pro.
  • Genres in train: essays, fiction, news.
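
For orientation, here is a minimal sketch of loading and tallying the split. The field names ("text", "label") and the label encoding are assumptions not confirmed by this card; consult the Zenodo record for the actual schema.

import json
from collections import Counter

counts = Counter()
with open("train.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)     # one JSON object per line
        counts[record["label"]] += 1  # assumed encoding: 0 = human, 1 = LLM-generated

print(counts)  # should roughly match the 9,101 human / 14,606 LLM counts above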

Training config

  • Epochs: 2
  • Batch size: 16
  • Learning rate: 2e-05
  • Weight decay: 0.01
  • Warmup ratio: 0.1
  • Max length: 512
  • Seed: 42
  • Mixed precision: fp16
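
The config above maps onto Hugging Face TrainingArguments roughly as sketched below. This is a hedged reconstruction, not the authors' code; the authoritative script is notebooks/train_roberta_a100.py (see Reproduce below), and names like output_dir are placeholders.

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="roberta-eloquent",  # placeholder
    num_train_epochs=2,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_ratio=0.1,
    seed=42,
    fp16=True,
)

# Texts are tokenized with truncation=True, max_length=512, then handed to
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).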

Final validation metrics

{
  "eval_loss": 0.07029607146978378,
  "eval_accuracy": 0.9877403176372248,
  "eval_f1": 0.9905579399141631,
  "eval_roc_auc": 0.9994946525295825,
  "eval_runtime": 3.7238,
  "eval_samples_per_second": 963.793,
  "eval_steps_per_second": 30.345,
  "epoch": 2.0
}
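
These values are consistent with a standard Trainer compute_metrics hook. A sketch of how such metrics are typically computed (not the authors' exact code; the label-column assumption is noted inline):

import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # numerically stable softmax over the two classes
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    preds = logits.argmax(axis=1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),
        "roc_auc": roc_auc_score(labels, probs[:, 1]),  # assumes column 1 = llm
    }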

Usage

from transformers import pipeline

# top_k=None returns scores for both labels instead of only the top one
clf = pipeline("text-classification", model="protagonist/roberta-eloquent", top_k=None)
print(clf("Your text here"))

The score attached to the llm label is the model's estimated probability that the text was machine-generated.
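
To pull that probability out directly: with top_k=None the pipeline returns a nested list, though the exact nesting has varied across transformers versions, so inspect the raw output if this indexing fails.

scores = clf("Your text here")[0]  # list of {"label": ..., "score": ...} dicts
p_llm = next(s["score"] for s in scores if s["label"] == "llm")
print(f"P(machine-generated) = {p_llm:.3f}")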

Reproduce

The training script lives at notebooks/train_roberta_a100.py in the eloquent26 repo.
