GPT-Classification: Custom Transformer for Text Classification
This model is a custom Transformer-based classifier built from scratch in PyTorch. Unlike standard pre-trained models, it was designed specifically to learn character-level patterns for classifying short to medium-length text.
Model Architecture
- Type: GPT-style (Decoder-only architecture adapted for classification)
- Layers: 4 Transformer Blocks
- Heads: 4 attention heads (multi-head self-attention)
- Embedding Dimension: 128
- Context Window: 128 characters
- Classification Head: Linear layer applied to the mean of sequence embeddings.
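A minimal PyTorch sketch of a model with the hyperparameters listed above, for orientation only. The class names `Block` and `GPTClassifierSketch`, the learned positional embedding, the pre-norm layout, the 4x MLP width, and the `n_classes` argument are all assumptions. The actual `GPTClassification` implementation may differ, so the weights in `model.pt` will not necessarily load into this sketch.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class Block(nn.Module):
    """One pre-norm Transformer block with causal self-attention (illustrative)."""

    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.ln1 = nn.LayerNorm(n_embd)
        self.qkv = nn.Linear(n_embd, 3 * n_embd)
        self.proj = nn.Linear(n_embd, n_embd)
        self.ln2 = nn.LayerNorm(n_embd)
        # Assumption: a 4x-wide feed-forward layer, as in most GPT-style models.
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )
        # Lower-triangular mask: position t may attend only to positions <= t.
        mask = torch.tril(torch.ones(block_size, block_size, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def _split_heads(self, t):
        # (B, T, C) -> (B, n_head, T, C // n_head)
        B, T, C = t.shape
        return t.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(self.ln1(x)).chunk(3, dim=-1)
        q, k, v = self._split_heads(q), self._split_heads(k), self._split_heads(v)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(q.size(-1))
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        x = x + self.proj(y)
        return x + self.mlp(self.ln2(x))


class GPTClassifierSketch(nn.Module):
    """Decoder-style stack followed by mean pooling and a linear head."""

    def __init__(self, vocab_size=62, n_classes=2, n_embd=128,
                 n_head=4, n_layer=4, block_size=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)  # assumption: learned positions
        self.blocks = nn.ModuleList(
            [Block(n_embd, n_head, block_size) for _ in range(n_layer)]
        )
        self.ln_f = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, n_classes)

    def forward(self, idx):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        x = self.ln_f(x)
        # Classification head applied to the mean over sequence positions.
        return self.head(x.mean(dim=1))


model = GPTClassifierSketch(n_classes=3)  # 3 labels is a placeholder value
batch = torch.randint(0, 62, (2, 128))    # two random sequences of character ids
logits = model(batch)                     # shape: (2, 3)
```

Note that this sketch averages over every position, including any padding. Whether the original model excludes padded positions from the mean is not stated in this card.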
Tokenization
- Level: Character-level
- Vocabulary Size: 62 unique characters
- Robustness: The `encode` function is designed to ignore unknown characters to prevent runtime crashes during inference.
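The skip-unknown behaviour described above can be sketched as follows. This is an illustration rather than the model's actual `encode` implementation, and the four-character `stoi` mapping is a toy stand-in for the real 62-character vocabulary stored in `config.json`.

```python
def encode(text, stoi):
    """Map each character to its index, silently skipping characters not in stoi."""
    return [stoi[ch] for ch in text if ch in stoi]


# Toy vocabulary for demonstration only (the real mapping has 62 entries).
stoi = {ch: i for i, ch in enumerate("abc ")}

ids = encode("a cab!", stoi)  # "!" is not in the toy vocabulary and is dropped
print(ids)                    # [0, 3, 2, 0, 1]
```

Dropping unknown characters keeps inference from raising a `KeyError`, at the cost of shortening the input whenever it contains characters the model never saw.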
Dataset Information
- Source: Custom JSONL dataset
- Samples: 9,999 after cleaning
- Preprocessing: Removed malformed template labels and handled various special characters.
Files in this Folder
- `model.pt`: The PyTorch state dictionary containing the trained weights.
- `config.json`: Contains the exact hyperparameters, the character-to-index mapping (`stoi`), and the label mapping for inference.
- `README.md`: This documentation file.
How to Use
- Load `config.json` to reconstruct the `stoi` mapping and model hyperparameters.
- Initialize the `GPTClassification` class with the saved hyperparameters.
- Load the weights using `torch.load('model.pt')` and pass the resulting state dictionary to the model's `load_state_dict` method.
- Ensure input strings are encoded using the character map and padded/truncated to 128 characters.
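The input-preparation side of these steps (reading the configuration, encoding, then padding or truncating) might look like the sketch below. Apart from `stoi`, the key names inside `config.json` are not documented here, so `block_size` is a hypothetical key, the inline JSON string stands in for the real file, and using index 0 as the padding id is an assumption. Building the model and loading the weights are omitted because the `GPTClassification` class definition is not part of this card.

```python
import json

# Stand-in for the contents of config.json. Only "stoi" is named in this card;
# "block_size" is a hypothetical key and the mapping is a toy example.
config_text = '{"block_size": 128, "stoi": {"a": 1, "b": 2, "c": 3}}'
config = json.loads(config_text)
stoi = config["stoi"]
block_size = config["block_size"]

PAD_ID = 0  # assumption: index 0 is reserved for padding


def prepare(text, stoi, block_size, pad_id=PAD_ID):
    """Encode text, truncate to block_size, and right-pad to exactly block_size ids."""
    ids = [stoi[ch] for ch in text if ch in stoi]  # unknown characters are skipped
    ids = ids[:block_size]
    return ids + [pad_id] * (block_size - len(ids))


short = prepare("abc?", stoi, block_size)      # padded up to 128 ids
long = prepare("ab" * 100, stoi, block_size)   # truncated down to 128 ids
print(len(short), len(long))                   # 128 128
```

The resulting list can be wrapped in a tensor of shape `(1, 128)` before being passed to the model. Check the padding id and padding side against the training code, since a mismatch would change the mean-pooled representation.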