Man Cub

mancub

AI & ML interests

None yet

Organizations

None yet

New activity in z-lab/Qwen3.6-27B-DFlash about 10 hours ago

Are we going to see an update to this model?

#15 opened about 10 hours ago by

New activity in Intel/gemma-4-31B-it-int4-AutoRound 3 days ago

INT8 version for TP=2 / dual Ampere GPUs?

#6 opened 22 days ago by

New activity in z-lab/Qwen3.6-27B-DFlash 6 days ago

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::Half

#10 opened 21 days ago by

New activity in AesSedai/Qwen3.6-35B-A3B-GGUF 10 days ago

Q6_K?

#1 opened about 1 month ago by

New activity in froggeric/Qwen-Fixed-Chat-Templates 14 days ago

final v16 does not appear to work correctly, it stops after the first prompt.

#19 opened 14 days ago by

v13 stops dead after the first response

#14 opened 16 days ago by

New activity in Minachist/Qwen3.6-35B-A3B-INT8-AutoRound 15 days ago

Crashes with newest vllm version (v0.20.1)

#1 opened 24 days ago by

New activity in froggeric/Qwen-Fixed-Chat-Templates 18 days ago

v11/v12 performance considerations with Claude Code?

#11 opened 18 days ago by

When using Claude Code, tool calls end up broken with this chat template in Qwen3.6-27B

#6 opened 19 days ago by

New activity in Minachist/Qwen3.6-27B-INT8-AutoRound 22 days ago

Good quant!

#1 opened 26 days ago by

New activity in QuantTrio/gemma-4-31B-it-AWQ 22 days ago

Does not appear to work with the new google drafter MTP model

#2 opened 22 days ago by

New activity in google/gemma-4-31B-it-assistant 22 days ago

Is it supposed to work in vllm?

#2 opened 22 days ago by

New activity in z-lab/Qwen3.6-27B-DFlash about 1 month ago

Avg Draft acceptance rate is low.

#2 opened about 1 month ago by

New activity in LuffyTheFox/Qwen3.6-35B-A3B-Uncensored-Wasserstein-GGUF about 1 month ago

OOM and context limits reached too soon

#5 opened about 1 month ago by

New activity in rdtand/Qwen3.6-27B-PrismaQuant-5.5bit-vllm about 1 month ago

Unable to run on 3090

#1 opened about 1 month ago by

New activity in ubergarm/Qwen3.5-122B-A10B-GGUF about 1 month ago

How to split this model between 2 (3) GPUs and CPU/RAM ?

#12 opened 2 months ago by

New activity in QuantTrio/Qwen3.5-27B-AWQ about 2 months ago

My personal vLLM launch cmd on my old personal 2x3090 workstation

#1 opened 3 months ago by

New activity in mudler/gemma-4-26B-A4B-it-APEX-GGUF about 2 months ago

What was just updated and why?

#1 opened about 2 months ago by

New activity in adamjen/Devstral-Small-2-24B-Opus-Reasoning 2 months ago

How to use it with llama-server ?

#1 opened 2 months ago by

New activity in noctrex/Mistral-Small-4-119B-2603-MXFP4_MOE-GGUF 2 months ago

Poor performance and pretty lobotomized

#1 opened 2 months ago by