embedl/Cosmos-Reason2-2B-W4A16-Edge2 Image-Text-to-Text • 2B • Updated 3 days ago • 10.6k • 10
Qwen3.5 on-device benchmarks on the Nvidia Jetson lineup are now live 🚀 We've added the latest Qwen3.5 models (0.8B - 9B) to our on-device inference benchmarks (Nvidia Jetson Orin Nano Super, AGX Orin, AGX Thor). 👉 Explore tokens per second (TPS), time to first token (TTFT), end-to-end (E2E) latency, and time per output token (TPOT), all measured on real hardware: embedl/Edge-Inference-Benchmarks 🌟 Stay tuned for additional benchmarks and Embedl-optimized models that run faster and on less expensive hardware. If you're working on edge LLM deployment, we'd love to discuss your use case.
FlashHead (Collection): Efficient drop-in replacement for the classification head in language model inference. • 15 items • Updated 6 days ago • 1