Loris

ExodyZz

xeqtion-lab

AI & ML interests

ALL OF EM

Organizations

None yet

Recent Activity

upvoted a paper about 23 hours ago

Solaris: Building a Multiplayer Video World Model in Minecraft

Paper • 2602.22208 • Published 2 days ago • 21

reacted to sergiopaniego's post with 🚀 about 23 hours ago

What happens when you make an LLM drive a car where physics are real and actions can't be undone?

I ported CARLA, the autonomous driving simulator, to OpenEnv and added training support via TRL + Hugging Face Spaces.

The model interacts with the simulator through tool calls (observe, brake, change lane) and learns from a reward signal.

In 50 training steps, Qwen 0.6B learns to swerve and brake to avoid pedestrians in emergency situations.

The project supports text and vision (VLMs can see through a camera sensor), open-world driving with traffic, and multiple driving scenarios.

This builds on the carla-env project by sinatras, which originally placed LLMs inside CARLA for evaluation. We extended it with vision, new scenarios, and rubric-based rewards, and made it trainable end-to-end.

Blog: https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl/
CARLA env in OpenEnv: https://github.com/meta-pytorch/OpenEnv/tree/main/envs/carla_env
Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/carla.py
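
To make the tool-call-plus-reward loop described in the post concrete, here is a minimal sketch. It is a toy, not the CARLA or OpenEnv API: `ToyDrivingEnv`, `run_episode`, the one-second time step, and the reward values are all hypothetical names and numbers chosen for illustration. Only the tool names (observe, brake, change lane) come from the post.

```python
from dataclasses import dataclass


@dataclass
class ToyDrivingEnv:
    """Hypothetical 1-D stand-in for a driving simulator (not CARLA or OpenEnv)."""
    distance_to_pedestrian: float = 30.0  # metres to the pedestrian, who stands in lane 0
    speed: float = 10.0                   # metres per second
    lane: int = 0                         # current lane of the car
    done: bool = False

    def step(self, tool: str) -> float:
        """Apply one tool call, advance the world by one second, return the reward."""
        if self.done:
            raise RuntimeError("episode finished; actions cannot be undone")
        if tool == "brake":
            self.speed = max(0.0, self.speed - 5.0)
        elif tool == "change_lane":
            self.lane = 1 - self.lane
        elif tool != "observe":
            # "observe" changes nothing, but time still passes below.
            raise ValueError(f"unknown tool: {tool}")
        self.distance_to_pedestrian -= self.speed
        if self.distance_to_pedestrian <= 0.0:
            # The car reached the pedestrian's position: collision only in lane 0.
            self.done = True
            return -1.0 if self.lane == 0 else 1.0
        if self.speed == 0.0:
            # The car stopped before reaching the pedestrian.
            self.done = True
            return 1.0
        return 0.0


def run_episode(tool_calls):
    """Replay a fixed sequence of tool calls and return the total reward."""
    env = ToyDrivingEnv()
    total = 0.0
    for tool in tool_calls:
        total += env.step(tool)
        if env.done:
            break
    return total


print(run_episode(["observe", "observe", "observe"]))      # never reacts
print(run_episode(["change_lane", "observe", "observe"]))  # swerves
print(run_episode(["brake", "brake"]))                     # brakes to a stop
```

In a real training setup the fixed tool sequences would be replaced by a model choosing each tool call, and the returned reward would drive the policy update.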

reacted to YatharthS's post with 🔥 about 23 hours ago

Just open-sourced LavaSR v2: a model that can enhance 5000 seconds of audio in 1 second while delivering higher quality than giant, slow 6 GB diffusion models!

It works with any sampling rate from 8 to 48 kHz and is nearly 5000x faster than the competition while being superior on objective benchmarks.

LavaSR v2 is perfect for:
- Enhancing TTS models.
- Fixing old audio datasets.
- Restoring low quality recordings.

You can check out the examples and run it locally or online:

Repo: https://github.com/ysharma3501/LavaSR.git
Demo: https://huggingface.co/spaces/YatharthS/LavaSR
Model: https://huggingface.co/YatharthS/LavaSR
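
As a quick check on the throughput figure in the post, the sketch below turns "5000 seconds of audio in 1 second" into a speedup over real time and estimates the processing time for a longer recording. The function names are hypothetical, and the estimate assumes throughput stays constant with input length, which the post does not state. This measures speed relative to real time, which is a separate quantity from the post's comparison against competing models.

```python
def speedup_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimated compute time, assuming the speedup holds for any input length."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# Figure quoted in the post: 5000 seconds of audio enhanced in 1 second.
speedup = speedup_over_real_time(5000.0, 1.0)

# Under the constant-throughput assumption, one hour of audio would take:
one_hour = estimated_processing_seconds(3600.0, speedup)
print(f"{speedup:.0f}x real time, about {one_hour:.2f} s per hour of audio")
```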