STR-Lite

STR-Lite is a lightweight scene text recognition (STR) model that pairs a ViT-Tiny encoder, pretrained as a Masked Autoencoder (MAE), with a one-layer autoregressive transformer decoder for text generation. With only about 6M parameters, it reaches 93.82% weighted average accuracy on six common STR benchmarks, which makes it a practical candidate for resource-constrained deployment.

GitHub: balaboom123/STR-Lite
Author: Kuanwei Chen
License: MIT

Model Architecture

Component	Details
Backbone	ViT-Tiny (embed=192, depth=12, heads=12)
Decoder	1-layer autoregressive transformer (embed=192, heads=12)
Input size	32 × 128 (H × W)
Patch size	4 × 8
Parameters	~6M
Precision	bfloat16
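
The numbers in the table imply a specific token layout, which is easy to check by hand. The sketch below derives the patch grid and per-head dimension from the listed values, and roughly estimates the parameters in the encoder's transformer blocks. The MLP expansion ratio of 4 is an assumption (the usual ViT default), not something stated in this model card, and the block formula is a generic pre-norm transformer block rather than the repository's exact implementation.

```python
# Values taken from the architecture table above.
IMG_H, IMG_W = 32, 128
PATCH_H, PATCH_W = 4, 8
EMBED_DIM, DEPTH, NUM_HEADS = 192, 12, 12

# Assumption (not stated in the model card): standard ViT MLP expansion ratio.
MLP_RATIO = 4


def patch_grid(img_h, img_w, patch_h, patch_w):
    """Return (rows, cols) of non-overlapping patches."""
    assert img_h % patch_h == 0 and img_w % patch_w == 0
    return img_h // patch_h, img_w // patch_w


def vit_block_params(dim, mlp_ratio):
    """Rough parameter count of one generic transformer block, biases included."""
    hidden = dim * mlp_ratio
    attn = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)   # qkv + output projection
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)   # two linear layers
    norms = 2 * (2 * dim)                                  # two LayerNorms (scale + shift)
    return attn + mlp + norms


rows, cols = patch_grid(IMG_H, IMG_W, PATCH_H, PATCH_W)
num_patches = rows * cols
head_dim = EMBED_DIM // NUM_HEADS
encoder_block_params = DEPTH * vit_block_params(EMBED_DIM, MLP_RATIO)

print(f"patch grid: {rows} x {cols} = {num_patches} tokens")
print(f"per-head dimension: {head_dim}")
print(f"encoder block parameters (estimate): {encoder_block_params:,}")
```

Under these assumptions the encoder blocks account for roughly 5.3M parameters, which is consistent with the ~6M total once the patch embedding, positional embeddings, decoder layer, and output head are added.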

Training

Stage 1 — MAE Pretraining

Dataset: U14M-Unlabeled
Epochs: 40
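
For readers unfamiliar with MAE pretraining, the core idea is to hide a random subset of patches and train the model to reconstruct them from the visible ones. The following is a minimal sketch of the random masking step only, written in plain Python. The mask ratio of 0.75 is an assumption borrowed from the original MAE paper's default; this model card does not state the ratio STR-Lite uses, and `random_masking` is a hypothetical helper, not a function from the repository.

```python
import random

# 32x128 input with 4x8 patches gives an 8x16 grid (see the architecture table).
NUM_PATCHES = 128

# Assumption: MAE paper default; the ratio used for STR-Lite is not stated here.
MASK_RATIO = 0.75


def random_masking(num_patches, mask_ratio, seed=None):
    """Split patch indices into visible and masked sets via a random shuffle."""
    rng = random.Random(seed)
    num_keep = int(num_patches * (1 - mask_ratio))
    perm = list(range(num_patches))
    rng.shuffle(perm)
    keep = sorted(perm[:num_keep])      # patches fed to the encoder
    masked = sorted(perm[num_keep:])    # patches the decoder must reconstruct
    return keep, masked


keep, masked = random_masking(NUM_PATCHES, MASK_RATIO, seed=0)
print(f"visible patches: {len(keep)}, masked patches: {len(masked)}")
```

With a high mask ratio the encoder only processes a small fraction of the tokens per image, which is one reason MAE pretraining tends to be inexpensive relative to full-image training.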

Stage 2 — Fine-tuning

Dataset: U14M-L-Filtered
Epochs: 20, Batch: 256, LR: 1e-3, Weight decay: 0.01

Checkpoints

Model	Description	Epochs	Accuracy (weighted avg., common STR benchmarks)	Download
MAE ViT-Tiny	Pretrained encoder only	40	—	pretrain/checkpoint-last.pth
STRLite	Full fine-tuned model	20	93.82%	finetune/checkpoint-best.pth

Results

Common STR Benchmarks

Subset	Accuracy (%), w/ pretrain	Accuracy (%), w/o pretrain
CUTE80	95.83	94.79
IC13	96.85	96.50
IC15	86.80	86.25
IIIT5k	96.97	96.47
SVT	95.36	94.90
SVTP	92.40	89.77
Weighted avg.	93.82	93.12
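
The weighted average is presumably the per-subset accuracy weighted by the number of test images in each subset. The sketch below illustrates that calculation. The subset sizes are an assumption: they are the test-set sizes commonly used in the STR literature (for example, the 857-image IC13 and 1811-image IC15 variants) and are not stated in this model card.

```python
# Assumed test-set sizes (commonly used variants; not given in this model card).
SUBSET_SIZES = {
    "CUTE80": 288,
    "IC13": 857,
    "IC15": 1811,
    "IIIT5k": 3000,
    "SVT": 647,
    "SVTP": 645,
}

# Per-subset accuracies (%) copied from the table above.
WITH_PRETRAIN = {
    "CUTE80": 95.83, "IC13": 96.85, "IC15": 86.80,
    "IIIT5k": 96.97, "SVT": 95.36, "SVTP": 92.40,
}
WITHOUT_PRETRAIN = {
    "CUTE80": 94.79, "IC13": 96.50, "IC15": 86.25,
    "IIIT5k": 96.47, "SVT": 94.90, "SVTP": 89.77,
}


def weighted_average(accuracies, sizes):
    """Average accuracies weighted by the number of samples per subset."""
    total = sum(sizes[name] for name in accuracies)
    return sum(accuracies[name] * sizes[name] for name in accuracies) / total


avg_with = weighted_average(WITH_PRETRAIN, SUBSET_SIZES)
avg_without = weighted_average(WITHOUT_PRETRAIN, SUBSET_SIZES)
print(f"w/ pretrain:  {avg_with:.2f}")
print(f"w/o pretrain: {avg_without:.2f}")
```

With these assumed sizes, the computed averages agree with the reported 93.82 and 93.12 to two decimal places, which suggests the table uses this weighting.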

U14M Benchmarks

Subset	Accuracy (%), w/ pretrain	Accuracy (%), w/o pretrain
artistic	67.78	62.11
contextless	78.95	77.43
curve	82.19	78.97
general	81.07	79.96
multi oriented	82.91	78.57
multi words	76.72	74.31
salient	78.17	75.33
Weighted avg.	81.03	79.88

Usage

Download and evaluate:

git clone https://github.com/balaboom123/STR-Lite
cd STR-Lite

# Download the checkpoint; hf_hub_download returns the local file path
path=$(python -c "from huggingface_hub import hf_hub_download; print(hf_hub_download('balaboom123/STRLite', 'finetune/checkpoint-best.pth'))")

# Evaluate
python eval.py \
  resume="$path" \
  test_data_path='[/path/to/lmdb_test]'
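
If you prefer to drive the evaluation from Python rather than the shell, something along these lines may work. This is a sketch under assumptions: `download_checkpoint` and `build_eval_command` are hypothetical helpers written for this example (they are not part of the repository), and the `key=value` arguments simply mirror the command shown above.

```python
import shlex

REPO_ID = "balaboom123/STRLite"
CHECKPOINT = "finetune/checkpoint-best.pth"


def download_checkpoint(repo_id=REPO_ID, filename=CHECKPOINT):
    """Download the checkpoint and return its local path.

    Requires the huggingface_hub package and network access, so the import
    is done lazily and this function is not called at module load.
    """
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename)


def build_eval_command(checkpoint_path, test_data_path):
    """Build the eval.py argument list using the key=value style shown above."""
    return [
        "python", "eval.py",
        f"resume={checkpoint_path}",
        f"test_data_path=[{test_data_path}]",
    ]


# Placeholder paths for illustration; replace with download_checkpoint() and real data.
cmd = build_eval_command("/tmp/checkpoint-best.pth", "/path/to/lmdb_test")
print(shlex.join(cmd))
```

To actually run the evaluation, call `subprocess.run(build_eval_command(download_checkpoint(), "/path/to/lmdb_test"), check=True)` from inside the cloned `STR-Lite` directory. Passing the arguments as a list avoids shell quoting issues with the bracketed path.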

Fine-tune from MAE pretrained weights:

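# Download the MAE-pretrained encoder weights
path=$(python -c "from huggingface_hub import hf_hub_download; print(hf_hub_download('balaboom123/STRLite', 'pretrain/checkpoint-last.pth'))")

python main_finetune.py \
  train_data_path='[/path/to/lmdb_train]' \
  val_data_path='[/path/to/lmdb_val]' \
  pretrained_mae="$path"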

See the GitHub repo for full installation and dataset preparation instructions.
