Efficient Intelligence and Systems

community

AI & ML interests

Low-bit Quantization of Large Language Models (LLMs)

Recent Activity

AaronHuangWei submitted a paper 1 day ago

LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation

AaronHuangWei submitted a paper 15 days ago

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Xingyu-Zheng authored a paper about 1 month ago

First-Order Error Matters: Accurate Compensation for Quantized Large Language Models

Efficient-ML's models (52)

Efficient-ML/GPTQ-for-Qwen3

Updated May 12, 2025

Efficient-ML/Qwen3-awq

Updated May 7, 2025

Efficient-ML/Qwen3-8B-gptq-w8-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-14B-gptq-w4-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-14B-gptq-w4-128

Updated May 7, 2025

Efficient-ML/Qwen3-14B-gptq-w8-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-14B-gptq-w8-128

Updated May 7, 2025

Efficient-ML/Qwen3-14B-base-gptq-w8-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-14B-base-gptq-w8-128

Updated May 7, 2025

Efficient-ML/Qwen3-8B-gptq-w8-128

Updated May 7, 2025

Efficient-ML/Qwen3-8B-gptq-w4-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-8B-gptq-w4-128

Updated May 7, 2025

Efficient-ML/Qwen3-4B-gptq-w8-perchannel

Updated May 7, 2025

Efficient-ML/Qwen3-4B-gptq-w8-128

Updated May 7, 2025

Efficient-ML/Qwen3-4B-gptq-w4-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-4B-gptq-w4-128

Updated May 6, 2025

Efficient-ML/Qwen3-1.7B-gptq-w8-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-1.7B-gptq-w8-128

Updated May 6, 2025

Efficient-ML/Qwen3-1.7B-gptq-w4-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-1.7B-gptq-w4-128

Updated May 6, 2025

Efficient-ML/Qwen3-0.6B-gptq-w8-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-0.6B-gptq-w8-128

Updated May 6, 2025

Efficient-ML/Qwen3-0.6B-gptq-w4-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-0.6B-gptq-w4-128

Updated May 6, 2025

Efficient-ML/Qwen3-14B-base-gptq-w4-perchannel

Updated May 6, 2025

Efficient-ML/Qwen3-14B-base-gptq-w4-128

Updated May 6, 2025

Efficient-ML/Qwen3-8B-base-gptq-w8-perchannel

Updated May 5, 2025

Efficient-ML/Qwen3-8B-base-gptq-w8-128

Updated May 5, 2025

Efficient-ML/Qwen3-8B-base-gptq-w4-perchannel

Updated May 5, 2025

Efficient-ML/Qwen3-8B-base-gptq-w4-128

Updated May 5, 2025