vLLM error

#2
by huang123chuan - opened

ValueError: The output_size of gate's and up's weight = 704 is not divisible by weight quantization block_n = 128.

I am getting a similar error when trying to load the weights on 2 GPUs.

Got the same error.

(EngineCore pid=95) ERROR 04-06 10:05:58 [core.py:1108] ValueError: The output_size of gate's and up's weight = 704 is not divisible by weight quantization block_n = 128.
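
For anyone trying to narrow this down, the arithmetic behind the message can be sketched as below. This is only a rough illustration, not vLLM's actual code. It assumes that the "2 gpus" setup means tensor parallelism with size 2, that the gate/up output dimension is split evenly across ranks, and that the unsharded size is therefore 704 * 2 = 1408 (inferred from the error, not confirmed from the model config). The function names are hypothetical.

```python
def per_rank_output_size(full_size: int, tp_size: int) -> int:
    """Size of one rank's shard, assuming an even split across ranks."""
    if full_size % tp_size != 0:
        raise ValueError(f"{full_size} cannot be split evenly across {tp_size} ranks")
    return full_size // tp_size


def is_block_aligned(shard_size: int, block_n: int = 128) -> bool:
    """True if the shard is a whole number of quantization blocks."""
    return shard_size % block_n == 0


# Assumption: inferred as 704 * 2; check the model's config for the real value.
ASSUMED_FULL_SIZE = 1408

for tp_size in (1, 2, 4):
    shard = per_rank_output_size(ASSUMED_FULL_SIZE, tp_size)
    print(f"tp={tp_size}: shard={shard}, block aligned={is_block_aligned(shard)}")
```

Under these assumptions the unsharded size is a multiple of 128 but the per-rank size of 704 is not, which would explain why the error only shows up when the weights are split across 2 GPUs.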
