peft-bert-agnews
PEFT LoRA Fine-Tuned BERT for AG News Classification
Model Overview
This repository contains a parameter-efficient fine-tuned version of BERT Base Uncased trained for news topic classification using the AG News dataset. The model was fine-tuned using LoRA (Low-Rank Adaptation) via the PEFT library, allowing efficient training with a very small number of additional parameters.
The purpose of this project is to demonstrate how parameter-efficient fine-tuning techniques can be applied to transformer models for text classification tasks while minimizing computational cost.
Base Model
- Model: bert-base-uncased
- Architecture: BERT (Bidirectional Encoder Representations from Transformers)
- Task: Sequence Classification
- Number of labels: 4
Dataset
The model was trained using the AG News dataset, a widely used benchmark dataset for topic classification.
Dataset characteristics:
- Total classes: 4
- Classes:
- World
- Sports
- Business
- Sci/Tech
For demonstration and rapid experimentation purposes, only 500 examples from the training split were used during fine-tuning.
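The label set and split sizes can be inspected directly with the Hugging Face datasets library (dataset id ag_news); the snippet below is a small sketch of that check.

from datasets import load_dataset

# Load AG News and inspect the class labels (index order: World, Sports, Business, Sci/Tech)
ag_news = load_dataset("ag_news")
print(ag_news["train"].features["label"].names)  # ['World', 'Sports', 'Business', 'Sci/Tech']
print(ag_news["train"].num_rows)                 # 120000 examples in the full training split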
Training Method
Fine-tuning was performed using Parameter-Efficient Fine-Tuning (PEFT) with LoRA.
LoRA configuration:
- Rank (r): 8
- Alpha: 16
- Dropout: 0.1
- Target modules: query, value
This approach injects trainable low-rank matrices into the attention layers while keeping the original pretrained weights frozen.
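A minimal sketch of how this configuration maps onto the PEFT API is shown below; the task_type and num_labels values are inferred from the sequence-classification setup described in this card rather than taken from the original training script.

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Frozen BERT backbone with a 4-way classification head
base_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameter counts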
Advantages of this approach:
- Significantly fewer trainable parameters
- Lower GPU memory usage
- Faster training
- Easy adapter sharing
Training Configuration
- Training samples: 500
- Epochs: 1
- Batch size: 8
- Training framework: Hugging Face Transformers Trainer
- Fine-tuning method: PEFT (LoRA)
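The corresponding training loop can be sketched as follows, reusing the PEFT-wrapped model from the previous section; the shuffling seed, maximum sequence length, and output directory are illustrative assumptions rather than values from the original run.

from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments, Trainer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# 500-example subset of the AG News training split (seed chosen for illustration)
train_ds = load_dataset("ag_news", split="train").shuffle(seed=42).select(range(500))
train_ds = train_ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="peft-bert-agnews",
    num_train_epochs=1,
    per_device_train_batch_size=8,
)

# model is the PEFT-wrapped BERT from the LoRA sketch above
trainer = Trainer(model=model, args=training_args, train_dataset=train_ds)
trainer.train()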
Intended Use
This model is intended for:
- Educational demonstrations of PEFT techniques
- Experiments with LoRA on BERT
- Lightweight news classification tasks
- Research on parameter-efficient training
It is not intended for production-level deployment, since the model was trained on a very small subset of the dataset.
Limitations
- Training dataset is intentionally small (500 samples)
- Performance may be limited compared to fully fine-tuned models
- Model may not generalize well to unseen news sources
Usage Example
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Loading the adapter repository directly requires the peft library to be installed,
# so that Transformers can resolve the LoRA weights on top of bert-base-uncased.
model_name = "NightPrince/peft-bert-agnews"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Apple announces new AI chip for data centers"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax(dim=-1).item()  # 0=World, 1=Sports, 2=Business, 3=Sci/Tech
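If the installed Transformers version does not resolve the adapter repository directly, the adapter can be attached explicitly through the PEFT API instead; the following sketch also decodes the prediction using the AG News label order (num_labels=4 is taken from the base-model description above).

import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the frozen base model with a 4-way head, then attach the LoRA adapter
base = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)
model = PeftModel.from_pretrained(base, "NightPrince/peft-bert-agnews")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

labels = ["World", "Sports", "Business", "Sci/Tech"]  # AG News label order
inputs = tokenizer("Apple announces new AI chip for data centers", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])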
Citation
If you use this model in your research or projects, please cite:
Yahya Muhammad Alnwsany
Portfolio: https://yahya-portfoli-app.netlify.app/
Author
Yahya Muhammad Alnwsany
Machine Learning Engineer | NLP Researcher