Step-3.5-Flash / .eval_results /gpqa_diamond.yaml
Commit ab446a3 by hzwer: Add evaluation results from Step 3.5 Flash paper
196 Bytes
- dataset:
    id: Idavidrein/gpqa
    task_id: diamond
  value: 83.5
  date: '2026-02-11'
  source:
    url: https://arxiv.org/abs/2602.10604
    name: Step 3.5 Flash Paper
    user: SaylorTwift
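
A record like the one above can be sanity-checked before it is committed. The following is a minimal sketch, not an official validator: the `records` literal is a hand-written stand-in for what a YAML loader would return for this file, `check_record` is a hypothetical helper, and the rules it applies (all four top-level keys present, score on a 0 to 100 scale, ISO date, https URL) are assumptions rather than a documented schema.

```python
from datetime import date

# Hand-written stand-in for the parsed file (assumption: a YAML loader would
# return this structure; the quoted date stays a plain string).
records = [
    {
        "dataset": {"id": "Idavidrein/gpqa", "task_id": "diamond"},
        "value": 83.5,
        "date": "2026-02-11",
        "source": {
            "url": "https://arxiv.org/abs/2602.10604",
            "name": "Step 3.5 Flash Paper",
            "user": "SaylorTwift",
        },
    }
]


def check_record(rec):
    """Return a list of problems found in one record (empty if none)."""
    # Assumed required keys; stop early if any are missing.
    problems = [f"missing key: {key}"
                for key in ("dataset", "value", "date", "source")
                if key not in rec]
    if problems:
        return problems
    # Assumption: scores are percentages on a 0-100 scale.
    if not isinstance(rec["value"], (int, float)) or not 0 <= rec["value"] <= 100:
        problems.append("value should be a number between 0 and 100")
    try:
        date.fromisoformat(rec["date"])
    except (TypeError, ValueError):
        problems.append("date should be an ISO string like 2026-02-11")
    if not str(rec["source"].get("url", "")).startswith("https://"):
        problems.append("source.url should be an https URL")
    return problems


for rec in records:
    print(rec["dataset"]["id"], rec["dataset"]["task_id"], check_record(rec))
```

Returning a list of problems instead of raising on the first one lets a reviewer see every issue in a record at once.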