Collection of Quantized Models for MoE
Krishna Teja Chitty-Venkata
AI & ML interests
LLM Optimization, Neural Architecture Search, Quantization, Pruning
Recent Activity
updated a model about 1 hour ago: inference-optimization/llama3_8b_6.0_bits_mode_hybrid_stiched
published a model about 1 hour ago: inference-optimization/llama3_8b_6.0_bits_mode_hybrid_stiched
updated a collection 15 days ago: HIGGS-per-tensor