Align temp of `transformers` snippet with vLLM/OpenAI examples
Hello,
This is similar to a previous discussion https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/9
The `transformers` snippet defaults to pure greedy decoding instead of the gently stochastic `temperature=0.1`; the vLLM/OpenAI snippets already define a constant `TEMP=0.1`, so those are fine.
Note that this small PR only concerns the snippet example. Since these hparams are missing from `generation_config.json`, anyone loading the model with the config will still default to greedy decoding.
I can open a PR for the config like the other day, if that's relevant @juliendenize
```python
from transformers import GenerationConfig

model_id = "mistralai/Mistral-Small-4-119B-2603"
gen_config = GenerationConfig.from_pretrained(model_id)
print(gen_config)
print("do_sample:", gen_config.do_sample)      # None
print("temperature:", gen_config.temperature)  # None
```
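Since the config leaves these unset, callers have to opt in to sampling explicitly. A minimal sketch of what that looks like, constructing a `GenerationConfig` locally (the defaults shown are those of `transformers` itself, not of this model's hub config):

```python
from transformers import GenerationConfig

# Out of the box, a GenerationConfig uses greedy decoding:
default_cfg = GenerationConfig()
print(default_cfg.do_sample)  # False -> greedy decoding

# To match the vLLM/OpenAI snippets, enable sampling explicitly:
sampling_cfg = GenerationConfig(do_sample=True, temperature=0.1)
print(sampling_cfg.do_sample, sampling_cfg.temperature)
```

The explicit config can then be passed along, e.g. `model.generate(..., generation_config=sampling_cfg)`.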
Thanks for the PR! I think it is nice for the snippet code, but for `generation_config.json` I'm not sure yet; I generally dislike enforcing default values, especially for this model, which fuses multiple tasks that may require tuning of hyperparameters.
However, it's a great addition you made, so people know they need to pass these arguments to trigger sampling :)
You're right, and now you've reminded me that Qwen also specifies different sampling hparams depending on whether the model is base, instruct, or reasoning. Since Mistral 4 unifies different capabilities, I agree it's better not to enforce it.
BTW, for `reasoning="high"` we recommend using `TEMP=0.7`.
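For reference, a hedged sketch of how that recommendation might slot into an OpenAI-style request body (the payload shape, model id usage, and any `reasoning` field are assumptions here, not verified against the actual snippets or API):

```python
# Sketch of a chat request body using the higher temperature suggested
# for high-reasoning use. Field names besides "model", "temperature",
# and "messages" are assumptions, not verified API.
TEMP = 0.7  # recommended when reasoning effort is high

payload = {
    "model": "mistralai/Mistral-Small-4-119B-2603",
    "temperature": TEMP,
    "messages": [{"role": "user", "content": "Solve step by step: ..."}],
}
print(payload["temperature"])
```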