PARSeq Malayalam OCR

This repository contains a PARSeq checkpoint trained for Malayalam OCR.

Model details

  • Architecture: PARSeq
  • Framework: PyTorch Lightning checkpoint
  • Checkpoint file: checkpoints/last.ckpt
  • Charset config: configs/charset/malayalam.yaml
  • Training data source: magles/malayalam-synthetic-ocr-datsetthh
  • Training environment: NVIDIA A40 with mixed precision

Important note

This is a Lightning .ckpt checkpoint, not a native Hugging Face Transformers model. Use it with the original PARSeq codebase for inference or further fine-tuning.

Load for inference

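from strhub.models.parseq.system import PARSeq

# Restore the weights and hyperparameters stored in the Lightning checkpoint.
model = PARSeq.load_from_checkpoint("checkpoints/last.ckpt")
# Switch to evaluation mode so dropout is disabled during inference.
model.eval()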
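
Loading the checkpoint only restores the model; reading text from an image also needs preprocessing and decoding. The following is a minimal sketch of that flow, assuming the PARSeq codebase (strhub), PyTorch, and Pillow are installed. The function name recognize and the file name sample.png are hypothetical; the calls to SceneTextDataModule.get_transform and model.tokenizer.decode follow the usage shown in the upstream PARSeq README, so check them against your checkout.

```python
def recognize(image_path, ckpt_path="checkpoints/last.ckpt"):
    """Return (text, per-character confidences) for one cropped text image.

    Hypothetical helper for illustration; not part of the PARSeq codebase.
    """
    # Heavy dependencies are imported lazily so this function can be defined
    # even on machines where the PARSeq codebase is not installed.
    import torch
    from PIL import Image
    from strhub.data.module import SceneTextDataModule
    from strhub.models.parseq.system import PARSeq

    # Load on CPU; move the model and batch to a GPU yourself if one is available.
    model = PARSeq.load_from_checkpoint(ckpt_path, map_location="cpu").eval()

    # Build the resize/normalize transform for the image size the model was trained on.
    transform = SceneTextDataModule.get_transform(model.hparams.img_size)

    image = Image.open(image_path).convert("RGB")
    batch = transform(image).unsqueeze(0)  # add a batch dimension

    with torch.inference_mode():
        probs = model(batch).softmax(-1)  # per-position character probabilities

    labels, confidences = model.tokenizer.decode(probs)
    return labels[0], confidences[0]
```

A call such as recognize("sample.png") would return the predicted string and its confidence scores for a single cropped word or text-line image.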

Continue fine-tuning

python train.py \
  charset=malayalam \
  dataset=malayalam \
  data.root_dir=data \
  data.train_dir=YOUR_LMDB_DIR \
  data.normalize_unicode=false \
  trainer.accelerator=gpu \
  trainer.devices=1 \
  ckpt_path=checkpoints/last.ckpt
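
The data.normalize_unicode=false override in the command above matters for Malayalam. The sketch below illustrates why, assuming the data loader's normalization is NFKD decomposition followed by ASCII-only encoding; that is an assumption about the PARSeq dataset code, so verify it in your checkout. The helper ascii_normalize is hypothetical and only mimics that assumed step.

```python
import unicodedata


def ascii_normalize(label: str) -> str:
    """Mimic an NFKD + ASCII-only normalization step (assumed loader behavior)."""
    return unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode()


malayalam_only = "മലയാളം"
mixed = "മലയാളം 2024"

# Malayalam code points have no ASCII equivalents, so they are all dropped.
print(repr(ascii_normalize(malayalam_only)))  # ''
print(repr(ascii_normalize(mixed)))           # ' 2024'
```

Under that assumption, enabling normalization would erase every Malayalam label before the model ever sees it, which is why the setting is disabled for this charset.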

Notes

  • The training run behind this checkpoint used a very small validation split, so its validation metrics should not be treated as definitive.
  • This checkpoint is best used as a reusable starting point for further evaluation and fine-tuning.