Democracy Detector: Multilingual ModernBERT Binary Classifier

Task

Binary classification of sentences from political party press releases:

  • 0 — Not democracy: Sentence does not contain a democratic appeal.
  • 1 — Democracy: Sentence contains a democratic appeal (any rhetorical invocation of democracy, democratic norms, institutions, or principles).

This is Stage 1 of a two-stage classification pipeline:

  1. Stage 1 (this model): Fast binary detection of democracy-related sentences.
  2. Stage 2 (GPT-based): Strategy classification of detected sentences (self-assertion, accusation, counter-claim, agenda-setting).
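
The two stages could be wired together roughly as follows. This is a minimal sketch, not the author's pipeline code: `detect_democracy` and `classify_strategy` are hypothetical callables standing in for this model and the GPT-based stage, and the 0.5 threshold is an assumed default.

```python
from typing import Callable, Dict, List

# The four Stage 2 strategy labels listed above.
STRATEGIES = ("self-assertion", "accusation", "counter-claim", "agenda-setting")


def run_pipeline(
    sentences: List[str],
    detect_democracy: Callable[[str], float],  # hypothetical: returns P(Democracy)
    classify_strategy: Callable[[str], str],   # hypothetical: returns one of STRATEGIES
    threshold: float = 0.5,                    # assumed default, not a documented value
) -> List[Dict[str, object]]:
    """Run Stage 1 on every sentence and Stage 2 only on detected ones."""
    results: List[Dict[str, object]] = []
    for sentence in sentences:
        prob = detect_democracy(sentence)
        is_democracy = prob >= threshold
        # Stage 2 is only invoked for sentences that pass the Stage 1 filter.
        strategy = classify_strategy(sentence) if is_democracy else None
        if strategy is not None and strategy not in STRATEGIES:
            raise ValueError(f"unexpected strategy label: {strategy!r}")
        results.append(
            {"sentence": sentence, "label": int(is_democracy), "prob": prob, "strategy": strategy}
        )
    return results
```

Because Stage 2 runs only on sentences that Stage 1 flags, the more expensive GPT-based call is skipped for everything labeled 0.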

Model Details

  • Base model: jhu-clsp/mmBERT-base
  • Fine-tuned on: 3654 hand-coded sentences (training split) from the PartyPress dataset
  • Languages: German, Swedish, English, Danish, Polish and Spanish (multilingual press releases)
  • Max sequence length: 104 tokens

Training Configuration

Parameter       Value
Learning rate   0.0001
Epochs          3
Batch size      16
Warmup ratio    0.1
Weight decay    0.01
Scheduler       cosine
Class weights   True
Focal loss      False (gamma=2.0)
Precision       fp16
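
To make the learning-rate settings concrete, the sketch below computes the schedule they imply. It rests on several assumptions not stated in this card: training on one device without gradient accumulation, the last partial batch being kept, warmup steps being rounded up, and the usual shape of linear warmup followed by a half-cosine decay to zero. All names are illustrative.

```python
import math

BASE_LR = 1e-4        # learning rate from the table above
EPOCHS = 3
BATCH_SIZE = 16
WARMUP_RATIO = 0.1
TRAIN_SIZE = 3654     # training-split size reported in this card

# Assumption: one device, no gradient accumulation, last partial batch kept.
STEPS_PER_EPOCH = math.ceil(TRAIN_SIZE / BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS
# Assumption: warmup steps are rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, then cosine decay to 0)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the run has 229 steps per epoch and 687 steps in total, so the peak rate of 0.0001 is reached early in the first epoch and decays for the rest of training.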

Training Data

Split   Total   Democracy (1)   Not democracy (0)
Train   3654    1512            2142
Val     731     205             526
Test    412     169             243
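
The configuration enables class weights, but this card does not state how they were computed. The sketch below assumes the common "balanced" inverse-frequency formula, total / (num_classes * class_count), applied to the training-split counts; the function name and the choice of formula are assumptions.

```python
from typing import Dict

# Training-split counts from the table above.
TRAIN_COUNTS = {0: 2142, 1: 1512}  # 0 = Not democracy, 1 = Democracy


def balanced_class_weights(counts: Dict[int, int]) -> Dict[int, float]:
    """Inverse-frequency weights: total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


weights = balanced_class_weights(TRAIN_COUNTS)
```

With this formula the minority Democracy class gets a weight above 1 and the majority class a weight below 1. Weights of this kind are typically passed to a weighted loss such as `torch.nn.CrossEntropyLoss(weight=...)`.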

Performance (Test Set)

               precision    recall  f1-score   support

Not democracy      0.907     0.918     0.912       243
    Democracy      0.880     0.864     0.872       169

     accuracy                          0.896       412
    macro avg      0.893     0.891     0.892       412
 weighted avg      0.895     0.896     0.895       412
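
For readers unfamiliar with the averaged rows, the sketch below shows how they relate to the per-class rows: the macro average is the plain mean across classes, and the weighted average is the mean weighted by support. It uses only the rounded figures from the report, so its results match the reported averages only to within rounding (about 0.001). The helper names are illustrative.

```python
# Per-class rows copied from the report above: (precision, recall, f1, support).
REPORT = {
    "Not democracy": (0.907, 0.918, 0.912, 243),
    "Democracy": (0.880, 0.864, 0.872, 169),
}


def macro_avg(report, idx):
    """Unweighted mean of one metric column across classes."""
    return sum(row[idx] for row in report.values()) / len(report)


def weighted_avg(report, idx):
    """Support-weighted mean of one metric column across classes."""
    total = sum(row[3] for row in report.values())
    return sum(row[idx] * row[3] for row in report.values()) / total


# Columns 0, 1, 2 are precision, recall, f1.
macro = [macro_avg(REPORT, i) for i in range(3)]
weighted = [weighted_avg(REPORT, i) for i in range(3)]
```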

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo = "LBenoit/democracy-mmBert"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)
model.eval()

sentence = "Die AfD gefährdet unsere demokratische Grundordnung."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=104)

with torch.no_grad():
    logits = model(**inputs).logits
    prob = torch.softmax(logits, dim=-1)[0, 1].item()

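# Decision threshold: 0.5 is an assumed default, not a documented value; tune it on validation data.
threshold = 0.5
label = "Democracy" if prob >= threshold else "Not democracy"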
print(f"{label} (p={prob:.3f})")
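
The decision threshold applied to `prob` is a free parameter. One way to choose it is to sweep candidate values on the validation split and keep the one with the highest F1 for the Democracy class. The sketch below illustrates that idea in plain Python. It is not necessarily how the author selected a threshold, and the function names and candidate grid are assumptions.

```python
from typing import Optional, Sequence, Tuple


def f1_at(probs: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 for the Democracy class (label 1) when predicting 1 iff prob >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    candidates: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (threshold, f1) maximizing F1 over the candidate grid."""
    if candidates is None:
        candidates = [i / 100 for i in range(5, 96)]  # 0.05, 0.06, ..., 0.95
    scored = [(f1_at(probs, labels, t), t) for t in candidates]
    # max() keeps the first maximum, so ties resolve to the lowest threshold.
    best_f1, best_t = max(scored, key=lambda pair: pair[0])
    return best_t, best_f1
```

Here `probs` would be the model's Democracy probabilities on validation sentences and `labels` the corresponding hand-coded 0/1 labels. The chosen threshold should then be evaluated once on the test split.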

Citation

Part of a PhD dissertation on democratic credibility competition in European party systems.

Author

Léandre Benoit
