---
license: mit
library_name: mlx
tags:
- mlx
- audio
- speech-enhancement
- noise-suppression
- deepfilternet
- apple-silicon
base_model: DeepFilterNet/DeepFilterNet
pipeline_tag: audio-to-audio
---

# DeepFilterNet1 — MLX

MLX-compatible weights for [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement model that suppresses background noise in audio.

This is a direct conversion of the original PyTorch weights to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon.

## Origin

- **Original model:** [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schroeter
- **Paper:** [DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering](https://arxiv.org/abs/2110.05588)
- **License:** MIT (same as the original)
- **Conversion:** PyTorch -> `safetensors` via the included `convert_deepfilternet.py` script

No fine-tuning or quantization was applied. Weights are converted directly from the original checkpoint.

## Files

| File | Description |
|---|---|
| `config.json` | Model architecture configuration |
| `model.safetensors` | Pre-converted weights (~7.2 MB, float32) |
| `convert_deepfilternet.py` | Conversion script (PyTorch -> MLX safetensors) |
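
To sanity-check a downloaded or freshly converted weights file without loading the model, the `safetensors` header can be read with the standard library alone. The sketch below relies only on the published file layout (an 8-byte little-endian header length followed by a JSON header); the function names `read_safetensors_header` and `summarize` are illustrative and are not part of this repository.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(path):
    """Return (parameter count, tensor payload bytes, set of dtypes)."""
    header = read_safetensors_header(path)
    n_params = 0
    n_bytes = 0
    dtypes = set()
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        count = 1
        for dim in info["shape"]:
            count *= dim
        n_params += count
        # data_offsets is [begin, end) within the tensor data section.
        begin, end = info["data_offsets"]
        n_bytes += end - begin
        dtypes.add(info["dtype"])
    return n_params, n_bytes, dtypes
```

Calling `summarize("model.safetensors")` on the file listed above should report a single dtype of `F32` if the table is accurate. At 4 bytes per float32 value, roughly 7.2 MB corresponds to about 1.8 million parameters.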

## Model Details

| Parameter | Value |
|---|---|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |
| Embedding hidden dim | 512 |
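
The table values imply a few derived quantities that are useful when preparing audio or reasoning about latency. The arithmetic below is a sketch under two assumptions not stated in this card: that the STFT is one-sided (real input), and that the deep-filtering (DF) bins are the lowest 96 frequency bins starting at 0 Hz.

```python
# Values copied from the Model Details table.
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per analysis window
HOP_SIZE = 480        # samples between successive frames
DF_BINS = 96

window_ms = 1000 * FFT_SIZE / SAMPLE_RATE    # 20.0 ms per analysis window
hop_ms = 1000 * HOP_SIZE / SAMPLE_RATE       # 10.0 ms between frames
frames_per_second = SAMPLE_RATE / HOP_SIZE   # 100.0 frames per second
freq_bins = FFT_SIZE // 2 + 1                # 481 bins (assumes one-sided FFT)
bin_width_hz = SAMPLE_RATE / FFT_SIZE        # 50.0 Hz per bin
# Assumes DF covers the lowest bins, starting at 0 Hz.
df_upper_hz = DF_BINS * bin_width_hz         # 4800.0 Hz
```

Under these assumptions, the 32 ERB bands summarize all 481 frequency bins, while deep filtering operates only on the region below roughly 4.8 kHz.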

## Usage

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("iky1e/DeepFilterNet1-MLX")
let enhanced = try model.enhance(audioArray)
```

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

model = DeepFilterNetModel.from_pretrained("iky1e/DeepFilterNet1-MLX")
enhanced = model.enhance("noisy.wav")
```
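
Because the model operates at 48 kHz, it can help to confirm the input file's sample rate before enhancement. This card does not say whether `enhance` resamples internally, so the check below is a precaution rather than a documented requirement. It is a minimal sketch using only the standard-library `wave` module, which reads PCM WAV files; `check_sample_rate` is a hypothetical helper, not part of mlx-audio.

```python
import wave

EXPECTED_RATE = 48_000  # Hz, from the Model Details table


def check_sample_rate(path, expected=EXPECTED_RATE):
    """Return the WAV file's sample rate, raising ValueError on a mismatch."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
    if rate != expected:
        raise ValueError(
            f"{path} is sampled at {rate} Hz; expected {expected} Hz. "
            "Consider resampling before enhancement."
        )
    return rate
```

A call such as `check_sample_rate("noisy.wav")` placed before `model.enhance("noisy.wav")` surfaces a mismatch early instead of leaving it to be discovered in the output.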

## Converting from PyTorch

```bash
python convert_deepfilternet.py \
  --input /path/to/DeepFilterNet \
  --output ./DeepFilterNet1-MLX \
  --name DeepFilterNet
```

## Citation

```bibtex
@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}
```