🚢 Titanic Survival Classifier
A lightweight MLP classifier wrapped in the Hugging Face PreTrainedModel interface,
trained to predict passenger survival on the Titanic dataset.
Model description
| Component | Detail |
|---|---|
| Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
| Hidden dim | 128 |
| Input features | 13 engineered tabular features |
| Output | Binary (survived / not survived) |
| Parameters | ~12,578 |
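The table gives the input size, the hidden dimension and the total parameter count, but not the width of every layer. As a rough sanity check, the sketch below counts parameters for one hypothetical layout (13 → 128 → 64 → 32 → 2, with BatchNorm after the first two linear layers). The widths 64 and 32 and the BatchNorm placement are assumptions chosen because they reproduce the stated total; they are not confirmed by the card.

```python
# Hypothetical layer layout: the card states 13 inputs, a hidden dim of 128,
# 2 outputs and 12,578 parameters, but not the remaining layer widths.
LAYER_WIDTHS = [13, 128, 64, 32, 2]   # assumed widths of the 4 linear layers
BATCHNORM_WIDTHS = [128, 64]          # assumed: BatchNorm after the first two layers


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(width: int) -> int:
    """Learnable scale and shift of a 1-D BatchNorm layer (running stats excluded)."""
    return 2 * width


def count_params(widths, bn_widths) -> int:
    """Total learnable parameters for the given linear and BatchNorm widths."""
    linear = sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))
    norm = sum(batchnorm_params(w) for w in bn_widths)
    return linear + norm


total = count_params(LAYER_WIDTHS, BATCHNORM_WIDTHS)
print(total)
```

GELU and Dropout add no learnable parameters, so they do not appear in the count.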
Training details
| Setting | Value |
|---|---|
| Optimizer | AdamW |
| Learning rate | 0.001 |
| Scheduler | Cosine annealing |
| Epochs | 30 |
| Batch size | 32 |
| Train / Val / Test split | 80 / 10 / 10 % |
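For orientation, the learning rate implied by cosine annealing can be computed in closed form. The sketch below assumes the scheduler is stepped once per epoch, that the annealing period equals the 30 training epochs, and that the minimum learning rate is 0; none of these three choices is stated in the card.

```python
import math

BASE_LR = 1e-3   # from the table above
EPOCHS = 30      # from the table above
ETA_MIN = 0.0    # assumed minimum learning rate


def cosine_lr(epoch: int, base_lr: float = BASE_LR,
              t_max: int = EPOCHS, eta_min: float = ETA_MIN) -> float:
    """Closed-form cosine annealing: decays from base_lr to eta_min over t_max steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


# Learning rate at the start of each epoch, plus the final value after epoch 30.
schedule = [cosine_lr(e) for e in range(EPOCHS + 1)]
for epoch in (0, 15, 30):
    print(epoch, f"{schedule[epoch]:.6f}")
```

Under these assumptions the rate starts at 0.001, passes through half that value at the midpoint, and reaches 0 at the end of training.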
Feature engineering
Features used: Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title
Key transformations applied:
- Title extraction from passenger names (Mr, Mrs, Miss, Master, Rare)
- Age imputation using median per title group
- FamilySize = SibSp + Parch + 1; IsAlone flag
- HasCabin binary flag
- AgeBand and FareBand discretisation
- StandardScaler normalisation (params saved in scaler_params.json)
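A few of the transformations listed above can be sketched in plain Python. This is an illustration only, not the pipeline used for training: the regular expression, the rule that every title outside Mr, Mrs, Miss and Master becomes Rare, the definition of IsAlone as FamilySize == 1, and the helper names `extract_title` and `engineer` are all assumptions.

```python
import re

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}
# Titanic names look like "Surname, Title. Given names"; capture the part
# between the comma and the first full stop (assumed format).
TITLE_RE = re.compile(r",\s*([^.,]+)\.")


def extract_title(name: str) -> str:
    """Return Mr, Mrs, Miss or Master; anything else is grouped as Rare (assumed rule)."""
    match = TITLE_RE.search(name)
    title = match.group(1).strip() if match else ""
    return title if title in COMMON_TITLES else "Rare"


def engineer(row: dict) -> dict:
    """Derive Title, FamilySize, IsAlone and HasCabin from one raw passenger record."""
    family_size = row["SibSp"] + row["Parch"] + 1
    return {
        "Title": extract_title(row["Name"]),
        "FamilySize": family_size,
        "IsAlone": int(family_size == 1),          # assumed definition
        # Assumes a missing cabin is None or an empty string, not NaN.
        "HasCabin": int(bool(row.get("Cabin"))),
    }


example = {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0, "Cabin": None}
features = engineer(example)
print(features)
```

Age imputation, banding and scaling are left out because the card does not give the band edges or the per-title medians.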
Test set performance
| Metric | Score |
|---|---|
| Accuracy | 0.6111 |
| Precision | 0.0 |
| Recall | 0.0 |
| F1-Score | 0.0 |
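A recall of 0.0 means that no survivor in the test split was predicted correctly. One reading of the full set of numbers, which the card does not confirm, is that the model predicts "not survived" for every test passenger. The sketch below shows that such a predictor reproduces the table on a hypothetical split of 90 passengers with 55 non-survivors; these counts are assumptions, picked because 10 % of 891 is roughly 90 and 55 / 90 rounds to 0.6111.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels; undefined ratios are reported as 0.0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical test split: 55 passengers who did not survive, 35 who did.
y_true = [0] * 55 + [1] * 35
# Constant predictor: always "not survived".
y_pred = [0] * 90

metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

If this reading is right, the accuracy figure reflects only the class balance of the test split rather than anything the model has learned.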
How to use
import json

import numpy as np
import torch
from huggingface_hub import hf_hub_download
from transformers import PretrainedConfig, PreTrainedModel

REPO = "Asimzaman19/Fine_Tuning_Dataset"

# TitanicClassifier (a PreTrainedModel subclass) and its PretrainedConfig subclass
# must be defined or imported before this point; this card does not include them.

# Load model and switch to inference mode (disables Dropout, freezes BatchNorm stats)
model = TitanicClassifier.from_pretrained(REPO)
model.eval()

# Load scaler params
params_path = hf_hub_download(REPO, "scaler_params.json")
with open(params_path) as f:
    sp = json.load(f)
mean = np.array(sp["mean"])
scale = np.array(sp["scale"])

# Prepare a sample (must match FEATURES order)
raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
scaled = ((raw - mean) / scale).astype(np.float32)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(torch.tensor(scaled)).logits
pred = logits.argmax(-1).item()
prob = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"Survived: {bool(pred)} (prob={prob:.2%})")
Dataset
The Titanic dataset contains information about 891 passengers, including demographics, ticket class, and fare, with the binary survival label as the target.
Limitations
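- The reported test metrics (precision, recall and F1 of 0.0) show that the model did not correctly identify any survivor in the test split, so its predictions should be treated with caution.
- Trained on a small historical dataset (891 rows); performance may not generalise beyond the Titanic domain.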
- Features are hand-engineered; a more robust pipeline would use automated feature selection.
License
MIT