OpenThinker-7B

Instructions to use open-thoughts/OpenThinker-7B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use open-thoughts/OpenThinker-7B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="open-thoughts/OpenThinker-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("open-thoughts/OpenThinker-7B")
model = AutoModelForCausalLM.from_pretrained("open-thoughts/OpenThinker-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use open-thoughts/OpenThinker-7B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "open-thoughts/OpenThinker-7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "open-thoughts/OpenThinker-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/open-thoughts/OpenThinker-7B

SGLang

How to use open-thoughts/OpenThinker-7B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "open-thoughts/OpenThinker-7B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "open-thoughts/OpenThinker-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "open-thoughts/OpenThinker-7B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "open-thoughts/OpenThinker-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use open-thoughts/OpenThinker-7B with Docker Model Runner:
```
docker model run hf.co/open-thoughts/OpenThinker-7B
```

Kudos - really strong small model! MMLU-Pro benchmarks

by Philp - opened Feb 24, 2025

I'm running the 4-bit MLX-quantized model on a standard M4 MacBook with LM Studio.
In some areas the performance is on par with or better than 70B models.
It is particularly strong in math, biology, business, chemistry, and economics, though that list may not be complete.

MMLU-Pro leaderboard:
https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro

2025-02-24 09:44:21.855992
{
  "comment": "",
  "server": {
    "url": "http://127.0.0.1:1234/v1",
    "model": "openthinker-7b-mlx@4bit",
    "timeout": 600.0
  },
  "inference": {
    "temperature": 0.0,
    "top_p": 1.0,
    "max_tokens": 2048,
    "system_prompt": "The following are multiple choice questions (with answers) about {subject}. Think step by step and then finish your answer with \"the answer is (X)\" where X is the correct letter choice.",
    "style": "multi_chat"
  },
  "test": {
    "subset": 0.05,
    "parallel": 1
  },
  "log": {
    "verbosity": 0,
    "log_prompt": true
  }
}
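
For readers who want to see what one request under this config might look like, here is a minimal sketch. It assumes the local server exposes an OpenAI-compatible `/chat/completions` route under the `url` above, and it sends a single-turn request, so it does not reproduce the `multi_chat` style used in the run. The helper names `build_payload`, `extract_answer`, and `ask` are hypothetical, and the `A` to `J` letter range assumes up to ten answer options per question.

```python
import json
import re
import urllib.request

# Values copied from the config above.
BASE_URL = "http://127.0.0.1:1234/v1"
MODEL = "openthinker-7b-mlx@4bit"
SYSTEM_PROMPT = (
    "The following are multiple choice questions (with answers) about {subject}. "
    'Think step by step and then finish your answer with "the answer is (X)" '
    "where X is the correct letter choice."
)


def build_payload(subject, question):
    """Build an OpenAI-style chat completion body (hypothetical helper)."""
    return {
        "model": MODEL,
        "temperature": 0.0,
        "top_p": 1.0,
        "max_tokens": 2048,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT.format(subject=subject)},
            {"role": "user", "content": question},
        ],
    }


def extract_answer(text):
    """Return the last letter stated as 'answer is (X)', or None if absent."""
    matches = re.findall(r"answer is \(?([A-J])\)?", text)
    return matches[-1] if matches else None


def ask(subject, question, timeout=600.0):
    """Send one question to the local server (the server must be running)."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(subject, question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return extract_answer(body["choices"][0]["message"]["content"])
```

Taking the last match rather than the first is a design choice for reasoning models, which may state a tentative answer before revising it.
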
Finished testing biology in .
Total, 24/35, 68.57%
Random Guess Attempts, 1/35, 2.86%
Correct Random Guesses, 0/1, 0.00%
Adjusted Score Without Random Guesses, 24/34, 70.59%
Finished testing business in 29 minutes 32 seconds.
Total, 26/39, 66.67%
Random Guess Attempts, 0/39, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 26/39, 66.67%
Finished testing chemistry in 1 hours 12 minutes 26 seconds.
Total, 33/56, 58.93%
Random Guess Attempts, 1/56, 1.79%
Correct Random Guesses, 0/1, 0.00%
Adjusted Score Without Random Guesses, 33/55, 60.00%
Finished testing computer science in .
Total, 10/20, 50.00%
Random Guess Attempts, 0/20, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 10/20, 50.00%
Finished testing economics in 19 minutes 23 seconds.
Total, 27/42, 64.29%
Random Guess Attempts, 1/42, 2.38%
Correct Random Guesses, 1/1, 100.00%
Adjusted Score Without Random Guesses, 26/41, 63.41%
Finished testing engineering in 43 minutes 56 seconds.
Total, 21/48, 43.75%
Random Guess Attempts, 0/48, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 21/48, 43.75%
Finished testing health in 17 minutes 18 seconds.
Total, 20/40, 50.00%
Random Guess Attempts, 0/40, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 20/40, 50.00%
Finished testing history in 11 minutes 13 seconds.
Total, 7/19, 36.84%
Random Guess Attempts, 0/19, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 7/19, 36.84%
Finished testing law in 36 minutes 52 seconds.
Total, 15/55, 27.27%
Random Guess Attempts, 0/55, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 15/55, 27.27%
Finished testing math in 46 minutes 10 seconds.
Total, 51/67, 76.12%
Random Guess Attempts, 0/67, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 51/67, 76.12%
Finished testing philosophy in 12 minutes 5 seconds.
Total, 10/24, 41.67%
Random Guess Attempts, 0/24, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 10/24, 41.67%
Finished testing physics in 46 minutes 30 seconds.
Total, 37/64, 57.81%
Random Guess Attempts, 0/64, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 37/64, 57.81%
Finished testing psychology in 14 minutes 49 seconds.
Total, 18/39, 46.15%
Random Guess Attempts, 0/39, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 18/39, 46.15%
Finished testing other in 23 minutes 39 seconds.
Total, 22/46, 47.83%
Random Guess Attempts, 0/46, 0.00%
Correct Random Guesses, division by zero error
Adjusted Score Without Random Guesses, 22/46, 47.83%
Finished the benchmark in 6 hours 14 minutes 1 seconds.
Total, 321/594, 54.04%
Random Guess Attempts, 3/594, 0.51%
Correct Random Guesses, 1/3, 33.33%
Adjusted Score Without Random Guesses, 320/591, 54.15%
Token Usage:
Prompt tokens: min 917, average 1382, max 2479, total 744828, tk/s 33.19
Completion tokens: min 30, average 900, max 2047, total 485308, tk/s 21.63
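Markdown Table:

| overall | biology | business | chemistry | computer science | economics | engineering | health | history | law | math | philosophy | physics | psychology | other |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 54.04 | 68.57 | 66.67 | 58.93 | 50.00 | 64.29 | 43.75 | 50.00 | 36.84 | 27.27 | 76.12 | 41.67 | 57.81 | 46.15 | 47.83 |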
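
The "Adjusted Score Without Random Guesses" lines in the log can be reproduced from the other three numbers in each block. The sketch below assumes the adjustment removes random-guess attempts from both the numerator and the denominator, which matches the figures in the log; the function names are hypothetical and the benchmark script itself is not shown here.

```python
def adjusted_score(correct, total, random_attempts, correct_random):
    """Accuracy in percent after discarding random-guess attempts."""
    remaining = total - random_attempts
    if remaining == 0:
        return None
    return 100.0 * (correct - correct_random) / remaining


def random_guess_accuracy(correct_random, random_attempts):
    """Accuracy of the random guesses, or None when there were none.

    The log prints "division by zero error" in the zero-attempt case.
    """
    if random_attempts == 0:
        return None
    return 100.0 * correct_random / random_attempts


# (correct, total, random attempts, correct random guesses), from the log.
rows = {
    "biology": (24, 35, 1, 0),
    "economics": (27, 42, 1, 1),
    "overall": (321, 594, 3, 1),
}

for name, row in rows.items():
    print(f"{name}: {adjusted_score(*row):.2f}%")
# biology: 70.59%
# economics: 63.41%
# overall: 54.15%
```
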

EtashGuha

OpenThoughts org Feb 25, 2025

This is amazing. Thank you so much for doing these evals for us! At 21 TPS, how long did the total eval take?
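
The log in the first post reports the total wall time directly: 6 hours 14 minutes 1 second. The throughput figures are consistent with that duration if the tk/s values are total tokens divided by total wall time, which is an assumption since the benchmark script is not shown here. A quick check:

```python
def to_seconds(hours, minutes, seconds):
    """Convert an h/m/s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Figures copied from the benchmark log.
wall_seconds = to_seconds(6, 14, 1)
prompt_tokens = 744828
completion_tokens = 485308

print(wall_seconds)                                # 22441
print(round(completion_tokens / wall_seconds, 2))  # 21.63
print(round(prompt_tokens / wall_seconds, 2))      # 33.19
```

Both rates match the log, so the 21 tk/s figure appears to be averaged over the whole run, including prompt processing time.
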
