BikoRiko/Gpt-Classification

Text Classification


GPT-Classification: Custom Transformer for Text Classification

This model is a custom Transformer-based classifier built from scratch in PyTorch. Unlike standard pre-trained models, it was designed to learn character-level patterns for classifying short to medium-length text.

Model Architecture

Type: GPT-style (Decoder-only architecture adapted for classification)
Layers: 4 Transformer Blocks
Attention Heads: 4 (multi-head self-attention)
Embedding Dimension: 128
Context Window: 128 characters
Classification Head: Linear layer applied to the mean of sequence embeddings.
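
The architecture listed above can be sketched roughly as follows. This is a minimal reconstruction, not the author's actual code: the constructor argument names, the learned positional embedding, the pre-norm layout, the 4x feed-forward width, the causal mask, and the default `num_classes` are all assumptions, since the card only states the layer count, head count, embedding size, context window, and mean-pooled linear head.

```python
import torch
import torch.nn as nn


class GPTClassification(nn.Module):
    """Hypothetical sketch of the classifier; names and layer details are assumptions."""

    def __init__(self, vocab_size=62, num_classes=2, n_embd=128,
                 n_head=4, n_layer=4, block_size=128):
        super().__init__()
        self.block_size = block_size
        self.tok_emb = nn.Embedding(vocab_size, n_embd)   # character embeddings
        self.pos_emb = nn.Embedding(block_size, n_embd)   # learned positions (assumed)
        layer = nn.TransformerEncoderLayer(
            d_model=n_embd, nhead=n_head, dim_feedforward=4 * n_embd,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(
            layer, num_layers=n_layer, enable_nested_tensor=False)
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, num_classes)        # classification head

    def forward(self, idx):
        # idx: (batch, seq_len) integer character ids, with seq_len <= block_size
        _, t = idx.shape
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal (decoder-style) mask: True marks positions that may NOT be attended to.
        causal = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=idx.device), diagonal=1)
        x = self.blocks(x, mask=causal)
        x = self.ln_f(x)
        # Mean over the sequence dimension, then a linear layer to class logits.
        return self.head(x.mean(dim=1))
```

Calling the model on a batch of shape `(batch, 128)` returns logits of shape `(batch, num_classes)`. To load the published weights, the parameter names would have to match those in model.pt, which this sketch does not guarantee.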

Tokenization

Level: Character-level
Vocabulary Size: 62 unique characters
Robustness: The encode function skips characters that are not in the vocabulary, so unseen characters do not cause runtime crashes during inference.
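
A minimal sketch of such an encoder is shown below. The function signature and the padding id of 0 are assumptions; the card does not say which index is used for padding, so check config.json before relying on it.

```python
def encode(text, stoi, block_size=128, pad_id=0):
    """Map characters to ids, dropping any character missing from stoi.

    pad_id=0 is an assumed value, not taken from the model card.
    """
    ids = [stoi[ch] for ch in text if ch in stoi]   # unknown characters are skipped
    ids = ids[:block_size]                          # truncate long inputs
    ids += [pad_id] * (block_size - len(ids))       # right-pad short inputs
    return ids


# Toy vocabulary for illustration only; the real mapping lives in config.json.
toy_stoi = {"a": 1, "b": 2, "c": 3}
print(encode("ab?c", toy_stoi, block_size=6))  # [1, 2, 3, 0, 0, 0]
```

Skipping unknown characters shortens the sequence instead of inserting a placeholder token, so inputs containing many unseen characters lose information silently.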

Dataset Information

Source: Custom JSONL dataset
Samples: 9,999 after cleaning
Preprocessing: Removed malformed template labels and handled various special characters.

Files in this Folder

model.pt: The PyTorch state dictionary containing the trained weights.
config.json: Contains the exact hyperparameters, the character-to-index mapping (stoi), and the label mapping for inference.
README.md: This documentation file.

How to Use

Load the config.json to reconstruct the stoi mapping and model hyperparameters.
Initialize the GPTClassification class with the saved hyperparameters.
Load the weights with model.load_state_dict(torch.load('model.pt')); model.pt holds a state dictionary, not a full model object.
Ensure input strings are encoded using the character map and padded/truncated to 128 characters.
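
The steps above could be wired together along these lines. The helper names (`load_config`, `load_weights`, `predict`) are hypothetical, and the key names inside config.json are not documented in this card apart from `stoi`, so inspect the file before indexing into it.

```python
import json

import torch


def load_config(path="config.json"):
    """Read the JSON file holding hyperparameters, stoi, and the label mapping."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def load_weights(model, path="model.pt", device="cpu"):
    """Load a saved state dict into an already constructed model."""
    state_dict = torch.load(path, map_location=device)
    model.load_state_dict(state_dict)
    model.eval()  # disable dropout for inference
    return model


@torch.no_grad()
def predict(model, ids):
    """Return the index of the highest-scoring class for one encoded input."""
    x = torch.tensor([ids], dtype=torch.long)  # add a batch dimension
    logits = model(x)
    return int(logits.argmax(dim=-1).item())
```

In use, the model would be built from the saved hyperparameters, passed to `load_weights`, and called through `predict` on a list of 128 character ids; the returned index is then looked up in the label mapping from config.json.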
