Align temp of `transformers` snippet with vLLM/OpenAI examples

#10

Hello,

This is similar to a previous discussion https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/9
The transformers snippet defaults to pure greedy decoding instead of the gently stochastic temp=0.1; the vLLM/OpenAI snippets already define a constant TEMP=0.1, so those are fine.

Note this small PR only concerns the snippet example. Since these hparams are missing from generation_config.json, anyone loading the model with the config will still default to greedy.

I can open a PR for the config like the other day, if that's relevant @juliendenize

from transformers import GenerationConfig

model_id = "mistralai/Mistral-Small-4-119B-2603"

gen_config = GenerationConfig.from_pretrained(model_id)
print(gen_config)
print("do_sample:", gen_config.do_sample)  # None
print("temperature:", gen_config.temperature)  # None
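For anyone who wants sampling without touching generation_config.json, the arguments can be passed at generation time instead. A minimal pure-Python sketch of the fallback behavior described above (the helper `resolve_decoding` is hypothetical, not a transformers API; it only mirrors the rule that an unset `do_sample` means greedy, and that transformers' own default temperature is 1.0):

```python
# Hypothetical helper mirroring the defaulting behavior discussed above:
# do_sample unset (None) or False -> greedy decoding, temperature ignored.

def resolve_decoding(do_sample=None, temperature=None):
    """Return the effective (strategy, temperature) pair."""
    if not do_sample:            # None or False: sampling is off
        return ("greedy", None)  # temperature has no effect in greedy mode
    # Sampling is on; transformers' GenerationConfig default temperature is 1.0.
    return ("sampling", temperature if temperature is not None else 1.0)

# With both fields missing from the config, generation stays greedy:
print(resolve_decoding())                                 # ('greedy', None)
# Passing the snippet's values explicitly triggers gentle sampling:
print(resolve_decoding(do_sample=True, temperature=0.1))  # ('sampling', 0.1)
```

In practice the same pair would be forwarded as `model.generate(..., do_sample=True, temperature=0.1)`.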
juliendenize changed pull request status to merged
Mistral AI_ org

Thanks for the PR! I think it is nice for the snippet code, but for the generation_config.json I'm not sure yet; I generally dislike enforcing default values, especially for this model, which fuses multiple tasks that may require tuning of hyperparameters.

However, it's a great addition for letting people know they need to pass these arguments to trigger sampling :)


You're right, and you reminded me that Qwen also specifies different sampling hparams depending on whether the model is base, instruct, or reasoning. Since Mistral 4 unifies different capabilities, I agree it's better not to enforce them.

Mistral AI_ org

BTW for reasoning="high" we recommend using TEMP=0.7
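Since the recommended temperature varies by mode, one way to keep snippets consistent is a small lookup table. A sketch, encoding only the two values mentioned in this thread (0.1 as the snippet default, 0.7 for reasoning="high"); the names `RECOMMENDED_TEMP` and `temp_for` are mine, and any mode not listed falls back to the default as an assumption:

```python
# Hypothetical per-mode temperature lookup; only the two values from this
# discussion are encoded. Unlisted modes fall back to the snippet default.

RECOMMENDED_TEMP = {
    "default": 0.1,         # snippet/vLLM/OpenAI examples
    "reasoning_high": 0.7,  # recommended for reasoning="high"
}

def temp_for(mode: str) -> float:
    """Return the recommended sampling temperature for a mode."""
    return RECOMMENDED_TEMP.get(mode, RECOMMENDED_TEMP["default"])

print(temp_for("reasoning_high"))  # 0.7
print(temp_for("chat"))            # 0.1
```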
