🇭🇺 GuiltRoBERTa-hu: A Two-Stage Classifier for Guilt-Assignment Rhetoric in Hungarian Political Texts

GuiltRoBERTa-hu is a two-stage AI pipeline for detecting guilt-assignment rhetoric in Hungarian political discourse.
It combines:

  1. Stage 1 – Emotion Pre-Filtering: emotion labels from the Babel Emotions Tool,
  2. Stage 2 – Guilt Classification: a fine-tuned binary XLM-RoBERTa model trained on manually annotated Hungarian texts (guilt vs. no_guilt).

The approach is grounded in political communication theory, which suggests that guilt attribution often emerges in anger-laden contexts.
Thus, only texts labeled as "Anger" in Stage 1 are passed to the guilt classifier.
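The gating logic described above can be sketched as a single function. This is only an illustration: the τ gate on the Anger probability and the 0.5 guilt cutoff are assumptions, not the released configuration.

```python
def two_stage_label(emotion: str, anger_prob: float, guilt_score: float,
                    tau: float = 0.01) -> str:
    """Sketch of the pipeline decision: Stage 1 gates on the Babel 'Anger'
    label (with an assumed probability threshold tau), Stage 2 thresholds
    the guilt classifier's score (0.5 cutoff assumed)."""
    # Stage 1: texts not labeled 'Anger' never reach the guilt classifier
    if emotion != "Anger" or anger_prob < tau:
        return "no_guilt"
    # Stage 2: binary guilt decision
    return "guilt" if guilt_score >= 0.5 else "no_guilt"
```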


🧩 Model Architecture

Stage 1: Emotion Pre-Filtering (Babel Emotions Tool)

  • Tool: Emotions 6 Babel Machine (developed by PoltextLAB)
  • Task: 6-class emotion classification (Anger, Fear, Disgust, Sadness, Joy, None)
  • Input: CSV file with one text per row
  • Output: CSV file with predicted labels and probabilities
  • Usage: retain only rows with predicted_emotion == "Anger" for Stage 2

βš™οΈ The Babel Emotions Tool is not an API but a web-based interface.
Upload a CSV file, download the labeled results, and use them as input to the guilt classifier.
The included notebook loads these predictions from gold_standard_with_emotion_prediction.xlsx.
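The Stage 1 filtering step itself is a small pandas operation. A minimal sketch, assuming column names (`predicted_emotion`, `anger_prob`) that may differ from the actual Babel output schema:

```python
import pandas as pd

def filter_anger(df: pd.DataFrame, tau: float = 0.01) -> pd.DataFrame:
    """Keep only rows Babel labeled 'Anger' whose probability clears tau.
    Column names are assumptions; adjust them to the real Babel output."""
    mask = (df["predicted_emotion"] == "Anger") & (df["anger_prob"] >= tau)
    return df[mask].copy()

# Tiny in-memory stand-in for a downloaded Babel results file
demo = pd.DataFrame({
    "text": ["vádló mondat", "semleges mondat", "zajos sor"],
    "predicted_emotion": ["Anger", "Joy", "Anger"],
    "anger_prob": [0.91, 0.03, 0.004],
})
anger_only = filter_anger(demo)  # keeps only the first row
```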

Stage 2: Guilt Classification

  • Base model: xlm-roberta-base
  • Task: Binary classification (guilt, no_guilt)
  • Training data: guilt_roberta_train.xlsx
  • Evaluation: Independent gold-standard dataset of Hungarian political discourse
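The training spreadsheet's exact schema is not documented here; as a minimal sketch, the manual annotations might be mapped to the integer ids a binary sequence-classification head expects (the `text` and `label` column names are assumptions about guilt_roberta_train.xlsx):

```python
import pandas as pd

LABEL2ID = {"no_guilt": 0, "guilt": 1}

def prepare_training_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Map string annotations to integer label ids for fine-tuning.
    Column names and label strings are assumptions about the file."""
    out = df[["text", "label"]].dropna().copy()
    out["label_id"] = out["label"].map(LABEL2ID)
    # Drop rows whose annotation is outside the two expected classes
    return out.dropna(subset=["label_id"]).astype({"label_id": int})

demo = pd.DataFrame({
    "text": ["a", "b", "c"],
    "label": ["guilt", "no_guilt", "typo"],
})
train_df = prepare_training_frame(demo)  # the mislabeled row is dropped
```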

🧠 Motivation

Guilt assignment — attributing moral responsibility or blame — is a key rhetorical strategy in political communication.
Since guilt often appears alongside anger, direct one-stage classification risks conflating emotional tones.

This two-stage pipeline improves precision by:

  • Filtering anger-related contexts first
  • Then applying a dedicated guilt detector only where relevant

📊 Evaluation (Gold Standard)

Stage 1 Filter   Threshold (τ)   Precision   Recall   F1     Accuracy
Anger-only       0.01            0.78        0.95     0.85   0.75

Best configuration: Anger-only, τ = 0.01
ROC-AUC = 0.61 · PR-AUC = 0.82
The two-stage model improves F1 by ≈ +0.20 compared to single-stage baselines.
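The four metrics reported above can be recomputed from pipeline outputs with standard scikit-learn functions. A minimal sketch (the `evaluate_pipeline` helper and its toy inputs are illustrative, not the released evaluation script):

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)

def evaluate_pipeline(y_true, y_pred):
    """Compute precision, recall, F1, and accuracy with 'guilt' (1)
    treated as the positive class."""
    return {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
    }

# Toy gold labels vs. pipeline predictions (1 = guilt, 0 = no_guilt)
metrics = evaluate_pipeline([1, 1, 0, 0], [1, 1, 1, 0])
```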


🚀 Usage Example

import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

# Load Babel emotion predictions
df = pd.read_excel("gold_standard_with_emotion_prediction.xlsx")

# Filter for 'Anger'
anger_df = df[df["predicted_emotion"] == "Anger"].copy()

# Load the guilt classifier
repo_id = "<your-org>/guiltroberta-hu"
tok = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
pipe = TextClassificationPipeline(model=model, tokenizer=tok, top_k=None)  # top_k=None returns all label scores

# Apply predictions: select the 'guilt' score by label name rather than by
# position (depending on the model config, the label may appear as "LABEL_1")
anger_df["guilt_score"] = anger_df["text"].apply(
    lambda t: next(s["score"] for s in pipe(t) if s["label"] == "guilt")
)
anger_df.to_excel("anger_with_guilt_predictions.xlsx", index=False)