Commit History

Fixed quantization_config.llm_int8_skip_modules to avoid re-quantizing embed_tokens layers on load
159a450
verified

techwithsergiu committed on

Upload a quantized Text-Only BNB 4-bit model
b21ca6a
verified

techwithsergiu committed on