ASL ↔ English Bidirectional Translation (mBART-50 + LoRA)

Fine-tuned version of facebook/mbart-large-50-many-to-many-mmt for bidirectional translation between American Sign Language (ASL) gloss and English.

The LoRA adapter has been merged into the base model, so this is a drop-in replacement for mBART-50 that adds support for the new language code asl_GL.

Model Summary

  • Base model: facebook/mbart-large-50-many-to-many-mmt (610M parameters)
  • Adapter: LoRA (r=16, α=32, dropout=0.1); see the configuration sketch below
  • New language code: asl_GL (warm-started from en_XX)
  • Training data: 30,662 pseudo-gloss pairs generated by Gemini 3 Flash Preview
  • Epochs: 3 | effective batch size: 32 | LR: 0.0005
  • Direction: bidirectional (single adapter handles en→asl and asl→en)
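
For reference, a LoRA configuration matching these hyperparameters might look like the following sketch using the peft library. The target modules are an assumption; this card does not list them.

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                  # LoRA rank
    lora_alpha=32,         # scaling factor (α)
    lora_dropout=0.1,
    task_type="SEQ_2_SEQ_LM",
    target_modules=["q_proj", "v_proj"],  # assumption: attention projections; not stated in this card
)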

Evaluation Metrics

Evaluated on the 2M-Flores-ASL validation split (979 rows; the 20 sentence IDs used as few-shot examples during pseudo-gloss generation are excluded). Sketches for reproducing the metrics follow the table.

Metric      English → ASL Gloss    ASL Gloss → English
chrF        28.02                  39.26
BLEU        3.03                   6.91
ROUGE-L     25.72                  30.94
Token-F1    31.88                  36.88
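
The chrF and BLEU scores can be computed with sacrebleu; Token-F1 is less standardized, so the whitespace-token F1 below is an assumption about the exact definition used.

import sacrebleu
from collections import Counter

hyps = ["IX WANT BAKE CHOCOLATE CAKE"]                      # example model outputs
refs = ["IX WANT BAKE CHOCOLATE CAKE FOR SISTER BIRTHDAY"]  # example references

chrf = sacrebleu.corpus_chrf(hyps, [refs]).score
bleu = sacrebleu.corpus_bleu(hyps, [refs]).score

def token_f1(hyp, ref):
    # Bag-of-tokens overlap between hypothesis and reference
    overlap = sum((Counter(hyp.split()) & Counter(ref.split())).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp.split())
    recall = overlap / len(ref.split())
    return 2 * precision * recall / (precision + recall)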

Usage

from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

REPO = "manohonsy/asl-mbart-50-lora"
tokenizer = MBart50TokenizerFast.from_pretrained(REPO)
model     = MBartForConditionalGeneration.from_pretrained(REPO)

# English → ASL gloss
tokenizer.src_lang = "en_XX"
inputs = tokenizer("I want to bake a chocolate cake for my sister's birthday.",
                   return_tensors="pt", max_length=128, truncation=True)
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("asl_GL"),
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
# Expected: ASL gloss like "IX WANT BAKE CHOCOLATE CAKE FOR SISTER BIRTHDAY"

# ASL gloss → English
tokenizer.src_lang = "asl_GL"
inputs = tokenizer("IX WANT BAKE CHOCOLATE CAKE FOR SISTER BIRTHDAY",
                   return_tensors="pt", max_length=128, truncation=True)
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("en_XX"),
    max_length=128,
    num_beams=4,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
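
For GPU inference, a minimal sketch (assumes PyTorch with CUDA; falls back to CPU otherwise):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
inputs = inputs.to(device)  # BatchEncoding supports .to(device)
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("en_XX"),
    max_length=128,
    num_beams=4,
)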

Tokenizer note

After loading, you may need to manually register asl_GL in lang_code_to_id before setting tokenizer.src_lang = "asl_GL" (as in the second Usage example):

asl_id = tokenizer.convert_tokens_to_ids("asl_GL")
tokenizer.lang_code_to_id["asl_GL"] = asl_id

Training Pipeline

The model was produced by a four-notebook capstone pipeline:

  1. Data prep: How2Sign (English subtitles) + 2M-Flores-ASL (authentic ASL gloss)
  2. Pseudo-gloss generation: Gemini 3 Flash Preview via Batch API with 20 few-shot examples and structured JSON output; 99.99% success rate; token-F1 0.41 vs authentic gloss
  3. LoRA fine-tuning: bidirectional training on 30,662 pairs, single adapter, gradient masking so only the asl_GL embedding row updates during training (see the sketch after this list)
  4. Evaluation: held-out 2M-Flores-ASL val split (20 few-shot IDs excluded)

The gloss is pseudo-gloss (LLM-generated), not authentic ASL, so the model learns a translation-friendly approximation of ASL gloss conventions: UPPERCASE signs, # for fingerspelled words and acronyms, cl: for classifier predicates, and IX for pointing (indexing).
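
As a hypothetical illustration of those conventions (an invented pair, not from the training data):

English:       "John drives to work every day."
Pseudo-gloss:  "#JOHN IX cl:3-DRIVE WORK EVERY-DAY"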

Limitations

  • Trained on pseudo-gloss from How2Sign subtitles, not true authentic ASL
  • English → ASL direction has weaker performance than ASL → English (mBART-50 was pretrained to generate English, not ASL)
  • Fingerspelling of rare proper nouns often contains errors
  • No sign language video input or production; text-only gloss representation

Citation

If this model is useful to your work, please cite the Gemini 3 Flash Preview model used for pseudo-gloss generation, the mBART-50 paper, and the 2M-Flores-ASL and How2Sign datasets.

License

Apache 2.0, inherited from mBART-50.
