Any possibility of Q6_KL?
#1
by johnlaborxxx - opened
Hi, wonderful model.
Just wondering if a Q6_KL GGUF is possible?
Q8 is too large and Q4 is too small; Q6 is the sweet spot for 32 GB of VRAM. Thanks.
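For anyone sizing this: a rough back-of-envelope sketch, assuming a hypothetical ~32B parameter count and approximate llama.cpp bits-per-weight figures; actual GGUF sizes depend on the model and its tensor mix.

```python
# Rough GGUF size estimate. PARAMS is an assumed, hypothetical parameter count;
# the bits-per-weight values are approximate llama.cpp figures.
PARAMS = 32e9  # adjust to the real model

BPW = {
    "Q4_K_M": 4.8,   # approximate
    "Q6_K":   6.56,  # approximate
    "Q8_0":   8.5,   # approximate
}

for quant, bpw in BPW.items():
    gib = PARAMS * bpw / 8 / 1024**3
    print(f"{quant}: ~{gib:.1f} GiB")  # leave headroom for KV cache on a 32 GB GPU
```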
Hello, llama-quantize supports Q6_K, so yes, a Q6_K quant is possible. I will do it later.
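For reference, a minimal sketch of producing a Q6_K GGUF with llama.cpp's llama-quantize tool; the file names below are placeholders, not the actual files in this repo.

```python
# Minimal sketch: invoke llama.cpp's llama-quantize to produce a Q6_K GGUF.
# File names are placeholders; point them at the real unquantized GGUF.
import subprocess

subprocess.run(
    [
        "./llama-quantize",   # built from llama.cpp
        "model-f16.gguf",     # source (unquantized) GGUF
        "model-Q6_K.gguf",    # output file
        "Q6_K",               # target quantization type
    ],
    check=True,
)
```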
Done, the Q6_K_L quant is uploaded. Enjoy!
I think this request is done.
LuffyTheFox changed discussion status to closed
Great, downloading and testing now. Thanks for taking the time and effort!