GPT-Classification: Custom Transformer for Text Classification

This model is a custom Transformer-based classifier built from scratch in PyTorch. Unlike standard pre-trained models, it works at the character level and is designed for classifying short to medium-length text.

Model Architecture

  • Type: GPT-style (Decoder-only architecture adapted for classification)
  • Layers: 4 Transformer Blocks
  • Attention Heads: 4 per multi-head self-attention layer
  • Embedding Dimension: 128
  • Context Window: 128 characters
  • Classification Head: Linear layer applied to the mean of sequence embeddings.
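The classification head described above can be illustrated with a small sketch. This is not the model's actual code: it uses plain Python lists and toy sizes instead of PyTorch tensors, and the function names, weights, and the three-class output are hypothetical. It only shows the order of operations, which is to average the per-position vectors and then apply one linear layer.

```python
from typing import List

def mean_pool(embeddings: List[List[float]]) -> List[float]:
    """Average per-position vectors (seq_len x dim) into one vector of size dim."""
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    return [sum(row[d] for row in embeddings) / seq_len for d in range(dim)]

def linear(x: List[float], weight: List[List[float]], bias: List[float]) -> List[float]:
    """Compute weight @ x + bias, where weight has shape (num_classes x dim)."""
    return [sum(w * v for w, v in zip(w_row, x)) + b for w_row, b in zip(weight, bias)]

# Toy sizes for readability; the real model uses dim=128 and up to 128 positions.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 positions, dim 2
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 hypothetical classes
bias = [0.0, 0.0, 0.5]

pooled = mean_pool(hidden)              # one vector summarizing the sequence
logits = linear(pooled, weight, bias)   # one score per class
predicted = max(range(len(logits)), key=logits.__getitem__)
```

Because the head averages over all positions, every character in the window contributes equally to the prediction.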

Tokenization

  • Level: Character-level
  • Vocabulary Size: 62 unique characters
  • Robustness: The encode function skips characters that are not in the vocabulary, so unseen characters do not raise an error at inference time.
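A minimal sketch of such an encoder follows. The vocabulary here (ASCII letters and digits, which happens to total 62 characters) is only a stand-in; the real character set is whatever config.json stores under stoi, and the actual encode function may differ.

```python
import string
from typing import Dict, List

# Hypothetical 62-character vocabulary; the real mapping lives in config.json.
chars = string.ascii_lowercase + string.ascii_uppercase + string.digits
stoi: Dict[str, int] = {ch: i for i, ch in enumerate(chars)}

def encode(text: str, stoi: Dict[str, int]) -> List[int]:
    """Map characters to indices, silently dropping any character not in stoi."""
    return [stoi[ch] for ch in text if ch in stoi]

# The comma, space, and exclamation mark are not in this stand-in vocabulary.
ids = encode("Hi, 5!", stoi)
```

Note that skipping characters shortens the encoded sequence, so its length can be smaller than the length of the input string.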

Dataset Information

  • Source: Custom JSONL dataset
  • Samples: 9,999 after cleaning
  • Preprocessing: Removed malformed template labels and handled various special characters.
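The cleaning step might look roughly like the sketch below. The field names text and label, and the reading of "malformed template label" as a label that still contains an unfilled placeholder such as {{label}}, are assumptions made for illustration; the actual preprocessing script is not part of this folder.

```python
import io
import json
from typing import Dict, Iterable, List

def load_clean_jsonl(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse JSONL lines, keeping only records with usable text and label fields."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        text, label = record.get("text"), record.get("label")
        if not isinstance(text, str) or not isinstance(label, str):
            continue
        # Assumed meaning of "malformed template label": an unfilled placeholder.
        if not text.strip() or "{" in label or "}" in label:
            continue
        samples.append({"text": text, "label": label.strip()})
    return samples

# In-memory stand-in for a JSONL file; a real file object works the same way.
raw = io.StringIO(
    '{"text": "great product", "label": "positive"}\n'
    '{"text": "broken on arrival", "label": "{{label}}"}\n'
    'not json at all\n'
    '{"text": "", "label": "negative"}\n'
)
samples = load_clean_jsonl(raw)
```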

Files in this Folder

  • model.pt: The PyTorch state dictionary containing the trained weights.
  • config.json: Contains the exact hyperparameters, the character-to-index mapping (stoi), and the label mapping for inference.
  • README.md: This documentation file.

How to Use

  1. Load the config.json to reconstruct the stoi mapping and model hyperparameters.
  2. Initialize the GPTClassification class with the saved hyperparameters.
  3. Load the state dictionary with torch.load('model.pt') and pass it to the model's load_state_dict method.
  4. Encode each input string with the stoi character map, then pad or truncate it to 128 characters.
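Steps 1 and 4 can be sketched in plain Python as shown below. The key names block_size and stoi, the padding index 0, and right-side padding are all assumptions, so check config.json and the training code for the real values. Steps 2 and 3 depend on the GPTClassification class definition, which is not reproduced here.

```python
import json
from typing import Dict, List

# Stand-in for the contents of config.json; every key name here is an assumption.
config_text = '{"block_size": 8, "stoi": {"a": 1, "b": 2, "c": 3}}'
config = json.loads(config_text)
stoi: Dict[str, int] = config["stoi"]
block_size: int = config["block_size"]   # 128 for the real model

def prepare_input(text: str, stoi: Dict[str, int], block_size: int, pad_id: int = 0) -> List[int]:
    """Encode text, truncate to block_size, then right-pad with pad_id."""
    ids = [stoi[ch] for ch in text if ch in stoi][:block_size]
    return ids + [pad_id] * (block_size - len(ids))

short = prepare_input("abc?", stoi, block_size)          # padded up to block_size
long = prepare_input("abcabcabcabc", stoi, block_size)   # truncated to block_size
```

The resulting list has a fixed length, so a batch of such lists can be converted directly to an integer tensor before being passed to the model.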