ISTA-DASLab/Qwen2-72B-AQLM-PV-2bit-1x16
Text Generation • 12B • Updated • 3
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling