Align temp of `transformers` snippet with vLLM/OpenAI examples
Hello,
This is similar to a previous discussion https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/9
The `transformers` snippet defaults to pure greedy decoding instead of the gently stochastic `temperature=0.1`; the vLLM/OpenAI snippets already define a constant `TEMP=0.1`, so those are fine.
Note that this small PR only concerns the snippet example. Since these hparams are missing from `generation_config.json`, anyone loading the model with the config will still default to greedy decoding.
I can open a PR for the config like the other day, if that's relevant @juliendenize
```python
from transformers import GenerationConfig

model_id = "mistralai/Mistral-Small-4-119B-2603"
gen_config = GenerationConfig.from_pretrained(model_id)
print(gen_config)
print("do_sample:", gen_config.do_sample)      # None
print("temperature:", gen_config.temperature)  # None
```
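Since the config leaves these unset, callers have to opt in to sampling explicitly. A minimal sketch of what that looks like, constructing a `GenerationConfig` locally (the defaults shown are those of `transformers` itself, not of this model's hub config):

```python
from transformers import GenerationConfig

# Out of the box, a GenerationConfig uses greedy decoding:
default_cfg = GenerationConfig()
print(default_cfg.do_sample)  # False -> greedy decoding

# To match the vLLM/OpenAI snippets, enable sampling explicitly:
sampling_cfg = GenerationConfig(do_sample=True, temperature=0.1)
print(sampling_cfg.do_sample, sampling_cfg.temperature)
```

The explicit config can then be passed along, e.g. `model.generate(..., generation_config=sampling_cfg)`.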
Thanks for the PR! I think it is nice for the snippet code, but for `generation_config.json` I'm not sure yet; I generally dislike enforcing default values, especially for this model, which fuses multiple tasks that may require tuning of hyperparameters.
However, it's a great addition you made, so people know they need to pass these arguments to trigger sampling :)
You're right, and now you've reminded me that Qwen also specifies different sampling hparams depending on whether the model is base, instruct, or reasoning. Since Mistral 4 unifies different capabilities, I agree it's better not to enforce it.
BTW, for `reasoning="high"` we recommend using `TEMP=0.7`.
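For reference, a hedged sketch of how that recommendation might slot into an OpenAI-style request body (the payload shape, model id usage, and any `reasoning` field are assumptions here, not verified against the actual snippets or API):

```python
# Sketch of a chat request body using the higher temperature suggested
# for high-reasoning use. Field names besides "model", "temperature",
# and "messages" are assumptions, not verified API.
TEMP = 0.7  # recommended when reasoning effort is high

payload = {
    "model": "mistralai/Mistral-Small-4-119B-2603",
    "temperature": TEMP,
    "messages": [{"role": "user", "content": "Solve step by step: ..."}],
}
print(payload["temperature"])
```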