GLM-4.7-Flash-oQ3.5

This model was quantized with oQ, a mixed-precision quantization scheme that assigns different bit widths to different layers.

Quantization details

  • Model type: glm4_moe_lite
  • Bits: 3
  • Group size: 64
  • Format: MLX safetensors
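As a rough sanity check on the file sizes reported in the benchmark table, group-wise quantization stores a scale and bias per group of weights on top of the packed weights themselves. The sketch below is a back-of-the-envelope estimator; the ~30B parameter count and the fp16 scale/bias layout are assumptions for illustration, not values confirmed by this card:

```python
def estimated_size_gb(n_params: float, bits_per_weight: float, group_size: int = 64) -> float:
    """Rough on-disk size for group-wise quantized weights.

    Assumes each group of `group_size` weights carries an fp16 scale and an
    fp16 bias (32 extra bits per group); ignores unquantized tensors such as
    embeddings and layer norms, so real files will differ somewhat.
    """
    effective_bits = bits_per_weight + 32 / group_size
    return n_params * effective_bits / 8 / 1e9

# Hypothetical ~30B-parameter model at an average of 3.5 bits, group size 64:
print(round(estimated_size_gb(30e9, 3.5), 2))  # 15.0
```

With mixed precision the "3.5" is an average across layers, which is why the actual oQ3.5 file (14.00 GB) does not land exactly on this estimate.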

Benchmark

| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
|---|---|---|---|---|---|---|
| GLM-4.7-Flash-MLX-6bit | 22.68 GB | 71.3% | 63.3% | 69.0% | 86.0% | 89.3% |
| GLM-4.7-Flash-oQ3 | 12.93 GB | 63.7% | 56.3% | 62.0% | 80.3% | 88.0% |
| GLM-4.7-Flash-oQ3.5 | 14.00 GB | 63.7% | 56.7% | 59.3% | 78.7% | 84.0% |
| GLM-4.7-Flash-oQ4 | 16.40 GB | 71.0% | 60.0% | 62.0% | 84.3% | 92.0% |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | 17.43 GB | 62.3% | 46.0% | - | - | - |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | 9.91 GB | 53.3% | 38.3% | 47.7% | 73.3% | 73.3% |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | 10.62 GB | 57.7% | 49.3% | - | - | - |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | 12.51 GB | 59.3% | 43.0% | 53.3% | 78.7% | 87.7% |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | 15.21 GB | 61.0% | 45.3% | 59.0% | 81.0% | 90.0% |

Details

| Model | Benchmark | Accuracy | Correct | Total | Time (s) |
|---|---|---|---|---|---|
| GLM-4.7-Flash-MLX-6bit | MMLU | 71.3% | 214 | 300 | 533.4 |
| GLM-4.7-Flash-MLX-6bit | JMMLU | 63.3% | 190 | 300 | 260.3 |
| GLM-4.7-Flash-MLX-6bit | HELLASWAG | 69.0% | 207 | 300 | 305.7 |
| GLM-4.7-Flash-MLX-6bit | ARC_CHALLENGE | 86.0% | 258 | 300 | 200.5 |
| GLM-4.7-Flash-MLX-6bit | GSM8K | 89.3% | 268 | 300 | 813.9 |
| GLM-4.7-Flash-oQ3 | MMLU | 63.7% | 191 | 300 | 554.4 |
| GLM-4.7-Flash-oQ3 | JMMLU | 56.3% | 169 | 300 | 433.9 |
| GLM-4.7-Flash-oQ3 | HELLASWAG | 62.0% | 186 | 300 | 355.8 |
| GLM-4.7-Flash-oQ3 | ARC_CHALLENGE | 80.3% | 241 | 300 | 196.4 |
| GLM-4.7-Flash-oQ3 | GSM8K | 88.0% | 264 | 300 | 857.8 |
| GLM-4.7-Flash-oQ3.5 | MMLU | 63.7% | 191 | 300 | 564.6 |
| GLM-4.7-Flash-oQ3.5 | JMMLU | 56.7% | 170 | 300 | 439.6 |
| GLM-4.7-Flash-oQ3.5 | HELLASWAG | 59.3% | 178 | 300 | 335.4 |
| GLM-4.7-Flash-oQ3.5 | ARC_CHALLENGE | 78.7% | 236 | 300 | 192.8 |
| GLM-4.7-Flash-oQ3.5 | GSM8K | 84.0% | 252 | 300 | 859.4 |
| GLM-4.7-Flash-oQ4 | MMLU | 71.0% | 213 | 300 | 569.0 |
| GLM-4.7-Flash-oQ4 | JMMLU | 60.0% | 180 | 300 | 297.9 |
| GLM-4.7-Flash-oQ4 | HELLASWAG | 62.0% | 186 | 300 | 346.3 |
| GLM-4.7-Flash-oQ4 | ARC_CHALLENGE | 84.3% | 253 | 300 | 190.9 |
| GLM-4.7-Flash-oQ4 | GSM8K | 92.0% | 276 | 300 | 820.9 |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | MMLU | 62.3% | 187 | 300 | 505.9 |
| GLM-4.7-Flash-REAP-23B-A3B-6bit | JMMLU | 46.0% | 138 | 300 | 239.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | MMLU | 53.3% | 160 | 300 | 602.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | JMMLU | 38.3% | 115 | 300 | 255.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | HELLASWAG | 47.7% | 143 | 300 | 346.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | ARC_CHALLENGE | 73.3% | 220 | 300 | 204.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3 | GSM8K | 73.3% | 220 | 300 | 1029.3 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | MMLU | 57.7% | 173 | 300 | 555.1 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ3.5 | JMMLU | 49.3% | 148 | 300 | 252.4 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | MMLU | 59.3% | 178 | 300 | 547.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | JMMLU | 43.0% | 129 | 300 | 232.6 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | HELLASWAG | 53.3% | 160 | 300 | 300.5 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | ARC_CHALLENGE | 78.7% | 236 | 300 | 179.7 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ4 | GSM8K | 87.7% | 263 | 300 | 748.4 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | MMLU | 61.0% | 183 | 300 | 617.8 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | JMMLU | 45.3% | 136 | 300 | 273.0 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | HELLASWAG | 59.0% | 177 | 300 | 353.6 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | ARC_CHALLENGE | 81.0% | 243 | 300 | 201.2 |
| GLM-4.7-Flash-REAP-23B-A3B-oQ5 | GSM8K | 90.0% | 270 | 300 | 1001.1 |
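The Accuracy column is simply Correct / Total formatted to one decimal place. A quick check over the GLM-4.7-Flash-oQ3.5 rows from the table above:

```python
# (benchmark, correct, total) triples taken from the GLM-4.7-Flash-oQ3.5 rows
rows = [
    ("MMLU", 191, 300),
    ("JMMLU", 170, 300),
    ("HELLASWAG", 178, 300),
    ("ARC_CHALLENGE", 236, 300),
    ("GSM8K", 252, 300),
]
for name, correct, total in rows:
    # :.1% multiplies by 100 and rounds to one decimal place
    print(f"{name}: {correct / total:.1%}")
```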