Argonne 2.5-instruct

Argonne 2.5-instruct starts from the Argonne 2.5-base checkpoint and is tuned in two stages.

Training pipeline

First, supervised fine-tuning (SFT) adapts the base checkpoint on HuggingFaceH4/ultrachat_200k using the train_sft split. That stage used NVIDIA H100 NVL hardware with 1,024-token sequences, batch size 24, gradient accumulation 2, learning rate 2e-5, and 100 warmup steps.
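For orientation, the batch size and gradient-accumulation settings above imply an effective batch of 48 sequences (roughly 49k tokens) per optimizer step. The arithmetic, as a quick sketch (derived from the numbers above, not taken from the training code):

```python
# Arithmetic implied by the SFT settings above (illustrative, not from the repo).
seq_len = 1024          # tokens per sequence
per_device_batch = 24   # micro-batch size
grad_accum = 2          # gradient accumulation steps

sequences_per_step = per_device_batch * grad_accum
tokens_per_step = sequences_per_step * seq_len
print(sequences_per_step, tokens_per_step)  # 48 49152
```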

Second, direct preference optimization (DPO) refines the SFT checkpoint on KatoHF/chatbot_arena_binarized with the chat_refine_strict recipe. That stage used NVIDIA H100 PCIe hardware with 1,024-token sequences, batch size 4, gradient accumulation 8, learning rate 5e-6, beta 0.2, and 10 warmup steps.
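DPO with beta 0.2 optimizes the standard preference loss, -log sigmoid(beta * margin), where the margin is the policy-versus-reference log-probability gap between the chosen and rejected responses. A minimal numeric sketch of that loss (the log-probabilities here are made up for illustration, not model outputs):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.2):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With a positive margin the loss falls below log(2) ~= 0.693 (its value at zero margin).
print(round(dpo_loss(-10.0, -14.0, -11.0, -13.0), 3))  # 0.513
```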

The published checkpoint is stored in bfloat16 and split across 5 safetensors shards for easier loading.
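Sharded safetensors checkpoints ship with a model.safetensors.index.json file whose weight_map ties each tensor name to its shard file. A sketch of resolving a tensor's shard, using a made-up two-entry index (the real file lists every tensor across the 5 shards, and the tensor names here are illustrative):

```python
# Made-up excerpt of a safetensors index file; tensor names are illustrative.
index = {
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00005.safetensors",
        "lm_head.weight": "model-00005-of-00005.safetensors",
    }
}

def shard_for(tensor_name: str) -> str:
    """Return the shard file that stores the given tensor."""
    return index["weight_map"][tensor_name]

print(shard_for("lm_head.weight"))  # model-00005-of-00005.safetensors
```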

Training data

SFT uses the train_sft split of HuggingFaceH4/ultrachat_200k; DPO uses preference pairs from KatoHF/chatbot_arena_binarized with the chat_refine_strict recipe.

Tokenizer

This model uses the Qwen3 tokenizer family via the Qwen2Tokenizer compatibility class.

Source code

The release was built from the GitHub main branch codebase: https://github.com/PursuitOfDataScience/ArgonneAI/tree/main

Key scripts:

Recommended inference config

Item                   Value
Context length         1,024 tokens
Temperature            0.8
Top-p                  0.9
Repetition penalty     1.3
No-repeat n-gram size  4
Seed                   444

These settings are the recommended defaults for inference.
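Top-p 0.9 restricts sampling to the smallest set of tokens whose probabilities sum to at least 0.9. A pure-Python sketch of that nucleus filter (toy distribution for illustration, not model output):

```python
def top_p_filter(probs, top_p=0.9):
    """Zero out all but the smallest set of tokens whose cumulative
    probability reaches top_p, then renormalize (nucleus filtering)."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

# The lowest-probability token falls outside the 0.9 nucleus and is dropped.
print(top_p_filter([0.5, 0.3, 0.15, 0.05]))
```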

Inference

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "PursuitOfDataScience/Argonne2.5-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # weights are published in bf16
)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompt = "Write a short paragraph about scientific computing at Argonne National Laboratory."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Seed both CPU and CUDA RNGs so the sampled output is reproducible.
seed = 444
torch.manual_seed(seed)
if device == "cuda":
    torch.cuda.manual_seed_all(seed)

output_ids = model.generate(
    **inputs,  # passes input_ids along with the attention mask
    max_new_tokens=128,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.3,
    no_repeat_ngram_size=4,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Usage notes

  • Load with trust_remote_code=True so the model's custom code is used.
  • The repository's custom generation loop accepts repetition_penalty and no_repeat_ngram_size; these sweep-derived repetition controls are not part of the checkpoint's built-in generate method.
  • Weights are published as 5 bf16 safetensors shards.
  • The instruct checkpoint inherits the base model's tokenizer and chat template.
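The no-repeat n-gram control mentioned above can be pictured as banning any token that would complete an n-gram already present in the generated sequence. A pure-Python sketch of that rule (illustrative, not the repository's implementation):

```python
def banned_next_tokens(token_ids, ngram_size):
    """Tokens that would complete an n-gram already present in token_ids
    (the standard no-repeat n-gram rule, shown here in pure Python)."""
    if len(token_ids) < ngram_size:
        return set()
    prefix = tuple(token_ids[-(ngram_size - 1):])  # last n-1 generated tokens
    banned = set()
    for i in range(len(token_ids) - ngram_size + 1):
        if tuple(token_ids[i:i + ngram_size - 1]) == prefix:
            banned.add(token_ids[i + ngram_size - 1])
    return banned

# With ngram_size=4, generating 8 next would repeat the 4-gram (5, 6, 7, 8),
# so token 8 is banned for this step.
print(banned_next_tokens([5, 6, 7, 8, 1, 5, 6, 7], 4))  # {8}
```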

Citation

@misc{argonne25instruct,
  author = {PursuitOfDataScience},
  title = {Argonne 2.5-instruct},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/PursuitOfDataScience/Argonne2.5-instruct}
}