ESPnet

non-profit

https://github.com/espnet/espnet

AI & ML interests

voice-conversion, speech-separation, speech-enhancement, speech-translation, speech-synthesis, speech-recognition, spoken-language-understanding

Recent Activity

cjli updated a model 2 days ago

espnet/powsm_ctc

pyf98 authored a paper 8 days ago

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

pyf98 authored a paper 8 days ago

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

cjli

updated a model 2 days ago

espnet/powsm_ctc

Automatic Speech Recognition • Updated 2 days ago • 80 • 5

authored 3 papers 8 days ago

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

Paper • 2505.24200 • Published May 30, 2025

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

Paper • 2502.15218 • Published Feb 21, 2025

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Paper • 2604.24954 • Published 10 days ago • 19

Aniket-Tathe-08

updated a model 15 days ago

espnet/marathi_lrec2020

Automatic Speech Recognition • Updated 15 days ago • 8

posted an update 17 days ago

Built a small site for tracking speech-to-speech, full-duplex, and audio foundation model work.
It covers models, benchmarks, datasets, and some blog posts to organize the landscape in one place.

Still early, but sharing in case it is useful:
https://www.fullduplex.ai/

If you spot missing entries or mistakes, I would really appreciate corrections.

Aniket-Tathe-08

published a model 18 days ago

espnet/marathi_lrec2020

Automatic Speech Recognition • Updated 15 days ago • 8

authored a paper 24 days ago

Voxtral TTS

Paper • 2603.25551 • Published Mar 26 • 61

cjli

authored a paper 29 days ago

An Empirical Recipe for Universal Phone Recognition

Paper • 2603.29042 • Published Mar 30 • 5

in espnet/owsm_ctc_v4_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#4 opened 30 days ago by

in espnet/owsm_ctc_v3.2_ft_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#2 opened 30 days ago by

in espnet/owsm_ctc_v3.1_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#3 opened 30 days ago by

in espnet/owsm_ctc_v4_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#3 opened 30 days ago by

in espnet/owsm_ctc_v3.2_ft_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#1 opened 30 days ago by

in espnet/owsm_ctc_v3.1_1B 29 days ago

Add Open ASR Leaderboard evaluation results

#2 opened 30 days ago by

in espnet/yodas2 about 1 month ago

Concerns about licensing

#6 opened about 1 month ago by

updated a model about 2 months ago

espnet/ci_tools

updated a model about 2 months ago

espnet/cmusic_dev

Automatic Speech Recognition • Updated Mar 15 • 4

published a model about 2 months ago

espnet/cmusic_dev

Automatic Speech Recognition • Updated Mar 15 • 4

authored a paper about 2 months ago

Fish Audio S2 Technical Report

Paper • 2603.08823 • Published Mar 9 • 37