Democracy Detector: Multilingual ModernBERT Binary Classifier

Task

Binary classification of sentences from political party press releases:

0 — Not democracy: Sentence does not contain a democratic appeal.
1 — Democracy: Sentence contains a democratic appeal (any rhetorical invocation of democracy, democratic norms, institutions, or principles).

This is Stage 1 of a two-stage classification pipeline:

Stage 1 (this model): Fast binary detection of democracy-related sentences.
Stage 2 (GPT-based): Strategy classification of detected sentences (self-assertion, accusation, counter-claim, agenda-setting).
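
The hand-off between the two stages can be sketched as follows. This is a minimal illustration, not the author's pipeline code: `detect_democracy` and `classify_strategy` are hypothetical callables standing in for this model and the GPT-based classifier, and the 0.5 threshold is an assumption.

```python
from typing import Callable, Iterable, List

# The four Stage 2 strategy labels named in this card.
STRATEGIES = ("self-assertion", "accusation", "counter-claim", "agenda-setting")


def run_pipeline(
    sentences: Iterable[str],
    detect_democracy: Callable[[str], float],  # hypothetical: returns P(label = 1)
    classify_strategy: Callable[[str], str],   # hypothetical: returns one of STRATEGIES
    threshold: float = 0.5,                    # assumed cut-off, not from the card
) -> List[dict]:
    """Run Stage 1 on every sentence and Stage 2 only on detected ones."""
    results = []
    for sentence in sentences:
        prob = detect_democracy(sentence)
        detected = prob >= threshold
        # Stage 2 is only invoked for sentences Stage 1 flags as democratic appeals.
        strategy = classify_strategy(sentence) if detected else None
        results.append(
            {"sentence": sentence, "prob": prob, "label": int(detected), "strategy": strategy}
        )
    return results
```

Because Stage 2 runs only on flagged sentences, the cheaper binary model acts as a filter in front of the more expensive GPT-based step.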

Model Details

Base model: jhu-clsp/mmBERT-base
Fine-tuned on: 3654 hand-coded sentences (the training split) from the PartyPress dataset
Languages: German, Swedish, English, Danish, Polish and Spanish (multilingual press releases)
Max sequence length: 104 tokens

Training Configuration

Parameter	Value
Learning rate	0.0001
Epochs	3
Batch size	16
Warmup ratio	0.1
Weight decay	0.01
Scheduler	cosine
Class weights	True
Focal loss	False (gamma=2.0 has no effect while disabled)
Precision	fp16
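
To make the interaction of learning rate, warmup ratio, and cosine scheduler concrete, here is a minimal pure-Python sketch of the resulting schedule. It assumes a single device and no gradient accumulation (neither is stated in this card), so that each batch of 16 is one optimizer step, and it assumes the cosine curve decays to zero over the full run.

```python
import math

BASE_LR = 1e-4       # learning rate from the table
WARMUP_RATIO = 0.1   # warmup ratio from the table
EPOCHS = 3
BATCH_SIZE = 16
TRAIN_SIZE = 3654    # number of training sentences

# Assumption: one device and no gradient accumulation, so one optimizer
# step per batch of 16.
steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under those assumptions the run has 229 steps per epoch and 687 steps in total, of which the first 69 are warmup.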

Training Data

Split	Total	Democracy (1)	Not democracy (0)
Train	3654	1512	2142
Val	731	205	526
Test	412	169	243
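
The card states that class weights were enabled but not how they were computed. As an illustration only, the sketch below applies the common "balanced" inverse-frequency heuristic (the same formula scikit-learn uses for `class_weight="balanced"`) to the training counts above. The choice of formula is an assumption, not a documented detail of this model.

```python
# Training-split class counts from the table above.
train_counts = {0: 2142, 1: 1512}  # 0 = Not democracy, 1 = Democracy


def balanced_class_weights(counts: dict) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Assumed weighting scheme; the card does not specify the actual one.
weights = balanced_class_weights(train_counts)
for label, weight in sorted(weights.items()):
    print(f"class {label}: weight {weight:.3f}")
```

With this heuristic the minority Democracy class receives the larger weight, so its misclassifications contribute more to the loss.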

Performance (Test Set)

               precision    recall  f1-score   support

Not democracy      0.907     0.918     0.912       243
    Democracy      0.880     0.864     0.872       169

     accuracy                          0.896       412
    macro avg      0.893     0.891     0.892       412
 weighted avg      0.895     0.896     0.895       412
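
The figures in the report above can be reproduced from four confusion-matrix counts. The counts in this sketch are back-computed from the reported recall and support values; they are an inference for illustration, not numbers published in this card.

```python
# Confusion-matrix counts back-computed from the reported recall and support
# (an inference, not figures published in the card).
tn, fp = 223, 20   # true class 0 (Not democracy), support 243
fn, tp = 23, 146   # true class 1 (Democracy), support 169


def prf(true_pos: int, false_pos: int, false_neg: int) -> tuple:
    """Precision, recall and F1 for one class."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


democracy = prf(tp, fp, fn)
# For class 0 the roles swap: its "positives" are the true negatives.
not_democracy = prf(tn, fn, fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)
macro_f1 = (democracy[2] + not_democracy[2]) / 2
```

Rounded to three decimals, these values match the per-class rows, the accuracy, and the macro-averaged F1 in the report.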

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = "LBenoit/democracy-mmBert"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
model.eval()

sentence = "Die AfD gefährdet unsere demokratische Grundordnung."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=104)

with torch.no_grad():
    logits = model(**inputs).logits
    prob = torch.softmax(logits, dim=-1)[0, 1].item()

threshold = 0.5  # assumed default decision threshold; not specified in this card
label = "Democracy" if prob >= threshold else "Not democracy"
print(f"{label} (p={prob:.3f})")
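
The decision threshold applied to `prob` above is a tunable value. One possible way to choose it is to sweep candidate thresholds on held-out validation data and keep the one with the best F1 for the Democracy class. The sketch below is a hypothetical helper, not part of this repository, and the validation scores in it are invented toy values.

```python
def f1_at(probs, labels, threshold):
    """F1 of the positive class when predicting 1 for prob >= threshold."""
    preds = [int(p >= threshold) for p in probs]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_threshold(probs, labels, candidates=None):
    """Return the candidate threshold with the highest positive-class F1."""
    if candidates is None:
        candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05 ... 0.95
    # max() keeps the first (lowest) threshold when several tie.
    return max(candidates, key=lambda t: f1_at(probs, labels, t))


# Toy validation scores, invented purely for illustration.
val_probs = [0.10, 0.35, 0.40, 0.62, 0.80, 0.95]
val_labels = [0, 0, 1, 1, 1, 1]
threshold = best_threshold(val_probs, val_labels)
```

In practice `val_probs` would be the model's Democracy probabilities on the validation split. Optimizing F1 is only one option; a pipeline that prefers recall in Stage 1 might pick a lower threshold.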

Citation

Part of a PhD dissertation on democratic credibility competition in European party systems.

Author

Léandre Benoit
