wav2vec2-japanese-mlx

MLX-converted wav2vec2 model for Japanese CTC forced alignment on Apple Silicon.

Overview

This is a conversion of jonatasgrosman/wav2vec2-large-xlsr-53-japanese to MLX format, enabling inference on Apple Silicon Macs without a PyTorch dependency.

Usage

Install the mlx-forced-aligner package with pip:

    pip install mlx-forced-aligner

The install log captured for this card resolved mlx-forced-aligner 0.1.0, which depends on mlx>=0.20.0, numpy>=1.26.0, soundfile>=0.12.0, and huggingface-hub>=0.20.0.

Model Details

  • Base model: facebook/wav2vec2-large-xlsr-53 fine-tuned on Japanese
  • Parameters: ~315M
  • Framework: MLX (Apple Silicon optimized)
  • Use case: Character/word-level forced alignment for Japanese speech
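
For readers unfamiliar with the use case, the following is a minimal sketch of what CTC forced alignment does with the per-frame outputs of a model like this one. It is not the mlx-forced-aligner API. The function name `ctc_forced_align`, the toy three-symbol vocabulary, and the hand-written probabilities are all assumptions made for illustration. Given frame-wise log-probabilities and a known token sequence, a Viterbi pass over the blank-interleaved sequence finds the most likely frame span for each token.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Illustrative CTC Viterbi alignment (hypothetical helper, pure Python).

    log_probs: list of T frames, each a list of per-token log-probabilities.
    targets:   list of token ids known to be spoken, in order.
    Returns one inclusive (start_frame, end_frame) span per target token.
    """
    if not targets:
        return []
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    T, S = len(log_probs), len(ext)
    neg_inf = -math.inf
    score = [[neg_inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][blank]
    score[0][1] = log_probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            cands = [(score[t - 1][s], s)]  # stay in the same state
            if s >= 1:
                cands.append((score[t - 1][s - 1], s - 1))  # advance one state
            # Skipping a blank is allowed only between two different tokens.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append((score[t - 1][s - 2], s - 2))
            best, prev = max(cands)
            if best > neg_inf:
                score[t][s] = best + log_probs[t][ext[s]]
                back[t][s] = prev
    # A path may end in the last token or in the trailing blank.
    s = S - 1 if score[T - 1][S - 1] >= score[T - 1][S - 2] else S - 2
    if score[T - 1][s] == neg_inf:
        raise ValueError("not enough frames to align all target tokens")
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = s
        s = back[t][s]
    # Odd states are tokens; collect the first and last frame of each one.
    spans = [None] * len(targets)
    for t, state in enumerate(path):
        if state % 2 == 1:
            i = state // 2
            start = spans[i][0] if spans[i] else t
            spans[i] = (start, t)
    return spans


# Toy example: vocabulary 0 = blank, 1 = "a", 2 = "b"; five frames.
def frame(best_token):
    probs = [0.1, 0.1, 0.1]
    probs[best_token] = 0.8
    return [math.log(p) for p in probs]


frames = [frame(0), frame(1), frame(1), frame(0), frame(2)]
spans = ctc_forced_align(frames, targets=[1, 2])
print(spans)  # [(1, 2), (4, 4)]
```

With a wav2vec2 model fed 16 kHz audio, each output frame covers roughly 20 ms, so a frame span can be converted to seconds by multiplying the frame indices by about 0.02. Character-level spans can then be merged into word-level spans by taking the first start and last end within each word.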

License

Apache-2.0 (inherited from base model)
