Any possibility of a Q6_KL?

#1 Β· opened by johnlaborxxx

Hi, wonderful model.

Just wondering if a Q6_KL GGUF is possible?
Q8 is too large and Q4 is too small, while Q6 is the sweet spot for 32 GB of VRAM. Thanks.
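As a rough back-of-envelope check (assuming, purely for illustration, a 32B-parameter model, since the parameter count isn't stated in this thread): Q6_K in llama.cpp stores about 6.5625 bits per weight, so the weights alone come to

$$32 \times 10^9 \times \frac{6.5625}{8}\ \text{bytes} \approx 26\ \text{GB},$$

which fits a 32 GB card with headroom for the KV cache, while Q8_0 at 8.5 bits per weight would be roughly 34 GB and spill over.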

Hello, llama-quantize supports Q6_K, so yes, a Q6_K quant is possible. I will do it later.
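For reference, producing that quant is a one-liner with llama.cpp's llama-quantize tool. A minimal sketch with placeholder file names, assuming a full-precision GGUF of the model is already on disk:

```sh
# Re-quantize a full-precision (F16) GGUF to Q6_K using llama.cpp's
# llama-quantize tool. File names are placeholders, not actual repo files.
./llama-quantize model-F16.gguf model-Q6_K.gguf Q6_K
```

The last argument selects the quantization type; running llama-quantize without arguments prints the full list of supported types.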

Done. Q6_KL quant uploaded. Enjoy πŸ˜€

I think this request is done πŸ˜ƒ

LuffyTheFox changed discussion status to closed

Great, downloading and testing now. Thanks for taking the time and effort!
