peft-bert-agnews
PEFT LoRA Fine-Tuned BERT for AG News Classification
Model Overview
This repository contains a parameter-efficient fine-tuned version of BERT Base Uncased trained for news topic classification using the AG News dataset. The model was fine-tuned using LoRA (Low-Rank Adaptation) via the PEFT library, allowing efficient training with a very small number of additional parameters.
The purpose of this project is to demonstrate how parameter-efficient fine-tuning techniques can be applied to transformer models for text classification tasks while minimizing computational cost.
Base Model
- Model: bert-base-uncased
- Architecture: BERT (Bidirectional Encoder Representations from Transformers)
- Task: Sequence Classification
- Number of labels: 4
Dataset
The model was trained using the AG News dataset, a widely used benchmark dataset for topic classification.
Dataset characteristics:
- Total classes: 4
- Classes:
- World
- Sports
- Business
- Sci/Tech
For demonstration and rapid experimentation purposes, only 500 examples from the training split were used during fine-tuning.
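The label set and split sizes can be inspected directly with the Hugging Face datasets library (dataset id ag_news); the snippet below is a small sketch of that check.

from datasets import load_dataset

# Load AG News and inspect the class labels (index order: World, Sports, Business, Sci/Tech)
ag_news = load_dataset("ag_news")
print(ag_news["train"].features["label"].names)  # ['World', 'Sports', 'Business', 'Sci/Tech']
print(ag_news["train"].num_rows)                 # 120000 examples in the full training split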
Training Method
Fine-tuning was performed using Parameter-Efficient Fine-Tuning (PEFT) with LoRA.
LoRA configuration:
- Rank (r): 8
- Alpha: 16
- Dropout: 0.1
- Target modules: query, value
This approach injects trainable low-rank matrices into the attention layers while keeping the original pretrained weights frozen.
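A minimal sketch of how this configuration maps onto the PEFT API is shown below; the task_type and num_labels values are inferred from the sequence-classification setup described in this card rather than taken from the original training script.

from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Frozen BERT backbone with a 4-way classification head
base_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)

# LoRA configuration matching the values listed above
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameter counts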
Advantages of this approach:
- Significantly fewer trainable parameters
- Lower GPU memory usage
- Faster training
- Easy adapter sharing
Training Configuration
- Training samples: 500
- Epochs: 1
- Batch size: 8
- Training framework: Hugging Face Transformers Trainer
- Fine-tuning method: PEFT (LoRA)
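The corresponding training loop can be sketched as follows, reusing the PEFT-wrapped model from the previous section; the shuffling seed, maximum sequence length, and output directory are illustrative assumptions rather than values from the original run.

from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments, Trainer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# 500-example subset of the AG News training split (seed chosen for illustration)
train_ds = load_dataset("ag_news", split="train").shuffle(seed=42).select(range(500))
train_ds = train_ds.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128),
    batched=True,
)

training_args = TrainingArguments(
    output_dir="peft-bert-agnews",
    num_train_epochs=1,
    per_device_train_batch_size=8,
)

# model is the PEFT-wrapped BERT from the LoRA sketch above
trainer = Trainer(model=model, args=training_args, train_dataset=train_ds)
trainer.train()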
Intended Use
This model is intended for:
- Educational demonstrations of PEFT techniques
- Experiments with LoRA on BERT
- Lightweight news classification tasks
- Research on parameter-efficient training
It is not intended for production-level deployment, since the model was trained on a very small subset of the dataset.
Limitations
- Training dataset is intentionally small (500 samples)
- Performance may be limited compared to fully fine-tuned models
- Model may not generalize well to unseen news sources
Usage Example
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Loading the adapter repository directly requires the peft library to be installed,
# so that Transformers can resolve the LoRA weights on top of bert-base-uncased.
model_name = "NightPrince/peft-bert-agnews"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Apple announces new AI chip for data centers"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax(dim=-1).item()  # 0=World, 1=Sports, 2=Business, 3=Sci/Tech
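If the installed Transformers version does not resolve the adapter repository directly, the adapter can be attached explicitly through the PEFT API instead; the following sketch also decodes the prediction using the AG News label order (num_labels=4 is taken from the base-model description above).

import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the frozen base model with a 4-way head, then attach the LoRA adapter
base = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)
model = PeftModel.from_pretrained(base, "NightPrince/peft-bert-agnews")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

labels = ["World", "Sports", "Business", "Sci/Tech"]  # AG News label order
inputs = tokenizer("Apple announces new AI chip for data centers", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])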
Citation
If you use this model in your research or projects, please cite:
Yahya Muhammad Alnwsany
Portfolio: https://yahya-portfoli-app.netlify.app/
Author
Yahya Muhammad Alnwsany
Machine Learning Engineer | NLP Researcher