Shivam Kumar

shivamkumar

AI & ML interests

None yet

Recent Activity

liked a model 19 days ago

Raelina/Raehoshi-Anima

liked a model 24 days ago

prefeitura-rio/Rio-3.5-Open-397B

liked a model 25 days ago

Organizations

upvoted a collection 29 days ago

Z-Image-Engineer

Various versions of my Z-Image-Engineer models. • 11 items • Updated Jun 6 • 11

upvoted a collection 30 days ago

Macaron-V1

2 items • Updated about 1 month ago • 7

upvoted a collection about 1 month ago

Raon

9 items • Updated May 21 • 46

upvoted a collection about 2 months ago

Ace-Step 1.5-xl

3 items • Updated Apr 2 • 81

upvoted a collection 4 months ago

pplx-embed

Diffusion-Pretrained Dense and Contextual Embeddings • 10 items • Updated May 26 • 100

upvoted 7 collections 5 months ago

MOSS-TTS

14 items • Updated 8 days ago • 36

VoxCPM

Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning • 5 items • Updated May 24 • 14

Nemotron-Personas

A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 10 items • Updated 20 days ago • 57

Z-Image

8 items • Updated 27 days ago • 156

Qwen3-ASR

7 items • Updated 12 days ago • 74

Text-To-Speech

https://kyutai.org/next/tts • 6 items • Updated Mar 2 • 27

GLiNER-decoder

A joint encoder-decoder GLiNER model for a scalable open-ontology entity recognition • 3 items • Updated Jan 29 • 18

upvoted 2 papers 5 months ago

X-Talk: On the Underestimated Potential of Modular Speech-to-Speech Dialogue System

Paper • 2512.18706 • Published Dec 21, 2025 • 1

Qwen3-TTS Technical Report

Paper • 2601.15621 • Published Jan 22 • 77

upvoted an article 5 months ago

Introducing Waypoint-1: Real-time interactive video diffusion from Overworld

Article • lapp0, LouisCastricato, ScottieFox, shahbuland, xAesthetics +3 • Jan 20 • 43

upvoted 2 collections 6 months ago

Qwen3-TTS

7 items • Updated Jan 22 • 369

sam-audio

9 items • Updated Mar 2 • 142

upvoted a paper 6 months ago

AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

Paper • 2512.23343 • Published Dec 29, 2025 • 30

upvoted a collection 6 months ago

Nemotron Speech

Open, state-of-the-art, production‑ready enterprise speech models from the NVIDIA Speech research team for ASR, TTS, Speaker Diarization and S2S • 13 items • Updated 22 days ago • 59

upvoted a paper 8 months ago

Yan: Foundational Interactive Video Generation

Paper • 2508.08601 • Published Aug 12, 2025 • 1