Takuya Umeki

consome2

otoearth

https://www.full-duplex.ai/

AI & ML interests

Full-duplex

Recent Activity

reacted to their post with ❤️ 21 days ago

We’ve released two conversational speech datasets from oto on Hugging Face 🤗 Both are based on real, casual, full-duplex conversations, but with slightly different focuses.

Dataset 1: Processed / curated subset
https://huggingface.co/datasets/otoearth/otoSpeech-full-duplex-processed-141h
* Full-duplex, spontaneous multi-speaker conversations
* Participants filtered for high audio quality
* PII removal and audio enhancement applied
* Designed for training and benchmarking S2S or dialogue models

Dataset 2: Larger raw(er) release
https://huggingface.co/datasets/otoearth/otoSpeech-full-duplex-280h
* Same collection pipeline, with broader coverage
* More diversity in speakers, accents, and conversation styles
* Useful for analysis, filtering, or custom preprocessing experiments

We intentionally split the release to support different research workflows: clean and ready-to-use vs. more exploratory and research-oriented use.

The datasets are currently private, but we’re happy to approve access requests. Feel free to request access if you’re interested.

If you’re working on speech-to-speech (S2S) models or are curious about full-duplex conversational data, we’d love to discuss and exchange ideas together. Feedback and ideas are very welcome!

replied to their post 23 days ago

posted an update 24 days ago

liked a dataset 26 days ago

otoearth/otoSpeech-full-duplex-processed-141h

Preview • Updated 11 days ago • 109 • 19

liked a dataset about 1 month ago

otoearth/otoSpeech-full-duplex-280h

Preview • Updated 11 days ago • 529 • 7

liked 8 models 9 months ago

pyannote/speaker-diarization-3.1

Automatic Speech Recognition • Updated May 10, 2024 • 13.1M • 1.55k

pyannote/voice-activity-detection

Automatic Speech Recognition • Updated May 10, 2024 • 649k • 224

Qwen/Qwen2-Audio-7B-Instruct

Audio-Text-to-Text • Updated Jan 12, 2025 • 594k • 519

fixie-ai/ultravox-v0_5-llama-3_2-1b

Audio-Text-to-Text • 0.7B • Updated Nov 27, 2025 • 447k • 70

SWivid/F5-TTS

Text-to-Speech • Updated Mar 21, 2025 • 638k • 1.15k

hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 7.4M • 5.7k

coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 6.69M • 3.39k

nari-labs/Dia-1.6B

Text-to-Speech • Updated Jun 1, 2025 • 75k • 2.83k