Issue with response rendering (Response appearing in thought section, empty output)

#1
by dtrknt - opened

Hi, I encountered a strange issue using this model with Ollama.
The actual response is rendered inside the "Thought" (reasoning) section, while the main output field stays empty. The default template may be incorrectly triggering a reasoning mode or misusing the <think> tag for this Qwen3-based model.
It looks like the default Modelfile template in Ollama includes <think> tags by mistake, so the output is rendered in the "Reasoning" box and the actual response is left empty.

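Nice to meet you.
I tried several of the quantized models with Ollama, and the same problem occurs with all of them.
The actual answer is displayed inside the "Thinking" section, and generation finishes with the main output still empty.
The default template may be mistakenly triggering reasoning mode, or it may be misusing the <think> tag.
Sorry to bother you when you are busy. I will keep supporting your work.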

Hello.

Check the prompt template with the following command:

ollama show --modelfile hf.co/dahara1/shisa-v2.1-qwen3-8b-UD-japanese-imatrix

Replace "hf.co/dahara1/shisa-v2.1-qwen3-8b-UD-japanese-imatrix" with the model you are using.

This command should display something like this:

TEMPLATE """{{- $lastUserIdx := -1 -}}
{{- range $idx, $msg := .Messages -}}
{{- if eq $msg.Role "user" }}{{ $lastUserIdx = $idx }}{{ end -}}
{{- end }}
{{- if or .System .Tools }}<|im_start|>system
{{ if .System }}{{ .System }}

{{ end }}
{{- if .Tools }}# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end -}}
<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if (and $.IsThinkSet (and .Thinking (or $last (gt $i $lastUserIdx)))) -}}
<think>{{ .Thinking }}</think>
{{ end -}}
{{ if .Content }}{{ .Content }}{{ end }}
{{- if .ToolCalls }}
{{- range .ToolCalls }}
<tool_call>
{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
</tool_call>
{{- end }}
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
<think>
{{ end }}
{{- end }}"""
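
In the template above, the last block writes a bare <think> line right after the assistant header whenever the final message is not from the assistant. That looks like the likely reason the reply starts inside the reasoning section, though this is a reading of the template and not something confirmed by Ollama. A minimal sketch for locating that line in any Modelfile; find_think_prefill is a hypothetical helper name, not part of Ollama:

```shell
# Hypothetical helper: read a Modelfile on stdin and print, with line numbers,
# every line that consists of nothing but an opening <think> tag.
# Lines such as "<think>{{ .Thinking }}</think>" are intentionally not matched.
find_think_prefill() {
  grep -n '^<think>$'
}

# Usage (needs a local Ollama install; replace the model name with yours):
#   ollama show --modelfile hf.co/dahara1/shisa-v2.1-qwen3-8b-UD-japanese-imatrix | find_think_prefill
```

If the helper prints nothing, the template does not prefill <think> in this way and the cause is probably elsewhere.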

Neither the model author nor the author of the quantized version set up this template.
It is a template that Ollama generated on its own.
You can create a Modelfile with the correct prompt template, but Ollama may still not interpret it correctly.
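
If you still want to try a custom Modelfile, here is a rough sketch. It assumes the only change needed is dropping the bare <think> line from the generation prompt, which has not been tested against this model, and Ollama may still fail to render the output correctly afterwards. The function names and the Modelfile.orig / Modelfile.nothink file names are made up for illustration; ollama show --modelfile and ollama create -f are the standard Ollama commands.

```shell
# Hypothetical helper: copy stdin to stdout, dropping lines that contain only
# "<think>". "<think>{{ .Thinking }}</think>" is left untouched.
strip_think_prefill() {
  sed '/^<think>$/d'
}

# Hypothetical wrapper: export a model's Modelfile, strip the prefill line,
# and register the result under a new local name.
#   $1 = source model, $2 = new model name (both are your choice)
rebuild_without_think_prefill() {
  ollama show --modelfile "$1" > Modelfile.orig || return 1
  strip_think_prefill < Modelfile.orig > Modelfile.nothink || return 1
  ollama create "$2" -f Modelfile.nothink
}

# Usage (not run here; the new model name is only an example):
#   rebuild_without_think_prefill hf.co/dahara1/shisa-v2.1-qwen3-8b-UD-japanese-imatrix shisa-nothink
```

Compare Modelfile.orig and Modelfile.nothink before creating the model, to make sure only the intended line was removed.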

Therefore, I recommend using llama.cpp instead of Ollama.
https://github.com/ggml-org/llama.cpp/releases

It’s an Ollama-specific issue, I see… got it.
I’ll go ahead and use llama.cpp instead.
Thank you for taking the time to reply, especially when you’re so busy! 🙏

