PEFT LoRA Fine-Tuned BERT for AG News Classification

Model Overview

This repository contains a parameter-efficient fine-tuned version of BERT Base Uncased trained for news topic classification using the AG News dataset. The model was fine-tuned using LoRA (Low-Rank Adaptation) via the PEFT library, allowing efficient training with a very small number of additional parameters.

The purpose of this project is to demonstrate how parameter-efficient fine-tuning techniques can be applied to transformer models for text classification tasks while minimizing computational cost.

Base Model

  • Model: bert-base-uncased
  • Architecture: BERT (Bidirectional Encoder Representations from Transformers)
  • Task: Sequence Classification
  • Number of labels: 4

Dataset

The model was trained using the AG News dataset, a widely used benchmark dataset for topic classification.

Dataset characteristics:

  • Total classes: 4
  • Class labels:
    • World
    • Sports
    • Business
    • Sci/Tech

For demonstration and rapid experimentation, only 500 examples from the training split were used during fine-tuning, as sketched below.
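
A subset of that size can be drawn with the datasets library. The seeded shuffle below is an assumption; the exact selection method used for training is not documented here:

from datasets import load_dataset

# Load AG News and keep 500 examples from the training split.
# shuffle(seed=42) is an assumption, not the documented procedure.
train_ds = load_dataset("ag_news", split="train").shuffle(seed=42).select(range(500))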

Training Method

Fine-tuning was performed using Parameter-Efficient Fine-Tuning (PEFT) with LoRA.

LoRA configuration:

  • Rank (r): 8
  • Alpha: 16
  • Dropout: 0.1
  • Target modules: query, value

This approach keeps the original pretrained weights frozen and injects trainable low-rank matrices into the attention layers: each targeted weight matrix W is augmented with a trainable update (alpha / r) * B A, where A and B have rank r and together hold far fewer parameters than W.
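
A minimal sketch of this configuration with the peft library (the calls are standard peft API, but this reconstructs the likely setup rather than the exact training script):

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4
)

# LoRA settings as documented above; only the injected A/B matrices
# (and the classification head) are trainable.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction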

Advantages of this approach:

  • Significantly fewer trainable parameters
  • Lower GPU memory usage
  • Faster training
  • Easy adapter sharing

Training Configuration

  • Training samples: 500
  • Epochs: 1
  • Batch size: 8
  • Training framework: Hugging Face Transformers Trainer
  • Fine-tuning method: PEFT (LoRA)
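
A minimal Trainer sketch matching these settings; output_dir, the tokenization step, and any hyperparameters not listed above are assumptions:

from transformers import Trainer, TrainingArguments

# Documented settings: 1 epoch, batch size 8. Everything else
# (output_dir, default learning rate, etc.) is assumed.
training_args = TrainingArguments(
    output_dir="peft-bert-agnews",
    num_train_epochs=1,
    per_device_train_batch_size=8,
)

trainer = Trainer(
    model=model,             # the PEFT-wrapped model from the earlier sketch
    args=training_args,
    train_dataset=train_ds,  # the 500-example subset, tokenized beforehand
)
trainer.train()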

Intended Use

This model is intended for:

  • Educational demonstrations of PEFT techniques
  • Experiments with LoRA on BERT
  • Lightweight news classification tasks
  • Research on parameter-efficient training

It is not intended for production-level deployment, since the model was trained on a very small subset of the dataset.

Limitations

  • Training dataset is intentionally small (500 samples)
  • Performance may be limited compared to fully fine-tuned models
  • Model may not generalize well to unseen news sources

Usage Example

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "NightPrince/peft-bert-agnews"

# Requires the peft package: transformers detects the adapter config
# in the repo and loads bert-base-uncased with the LoRA weights applied.
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Apple announces new AI chip for data centers"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# AG News label order (0 = World, 1 = Sports, 2 = Business, 3 = Sci/Tech)
labels = ["World", "Sports", "Business", "Sci/Tech"]
print(labels[logits.argmax(dim=-1).item()])
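
If the peft package is used directly, the adapter can also be loaded explicitly and merged into the base weights, which yields a standalone checkpoint for faster inference. A sketch using standard peft APIs:

from peft import AutoPeftModelForSequenceClassification

# Loads bert-base-uncased plus the LoRA adapter as a PeftModel,
# then folds the low-rank updates into the base weights.
peft_model = AutoPeftModelForSequenceClassification.from_pretrained(
    "NightPrince/peft-bert-agnews"
)
merged = peft_model.merge_and_unload()
merged.save_pretrained("bert-agnews-merged")  # standalone, peft-free checkpoint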

Citation

If you use this model in your research or projects, please cite:

Yahya Muhammad Alnwsany
Portfolio: https://yahya-portfoli-app.netlify.app/

Author

Yahya Muhammad Alnwsany

Machine Learning Engineer | NLP Researcher
