Abacus-cve-v1.1

Abacus-cve-v1.1 is an iterative version of Abacus-cve, fine-tuned on an expanded dataset for security vulnerability fixing tasks.

What's New in v1.1

Compared to Abacus-cve (v1.0), this version is trained on an expanded dataset: the pool of distilled agent traces used for fine-tuning grew from 4k to 18.8k.

Model Description

Abacus-cve-v1.1 is based on Qwen3-32B, fine-tuned on 18.8k distilled agent traces from CVE reproduction tasks. The traces were generated by Claude Opus 4.5 running in a Mini SWE-Agent harness through the CVE-Factory pipeline.

Evaluation Results

Evaluated on LiveCVEBench-verified, PatchEval-verified, and Terminal-Bench-2.0 with temperature=0.6, reported as avg@5:

| Model | LiveCVEBench | PatchEval | Terminal-Bench-2.0 | Avg |
|---|---|---|---|---|
| Qwen3-32B (base) | 8.96 ± 1.75 | 5.64 ± 1.37 | 5.41 ± 1.70 | 6.67 |
| Abacus-cve (v1.0) | 36.50 ± 1.52 | 21.94 ± 1.46 | 20.14 ± 2.68 | 26.19 |
| Abacus-cve-v1.1 (Ours) | 40.33 ± 1.36 | 24.32 ± 0.76 | 21.57 ± 1.67 | 28.74 |
| Qwen3-Coder-30B | 11.29 ± 1.36 | 9.25 ± 0.95 | 11.01 ± 2.43 | 10.51 |
| Qwen3-Coder-480B | 29.14 ± 0.26 | 18.06 ± 0.72 | 25.17 ± 2.04 | 24.12 |
| MiniMax-M2 | 40.44 ± 1.42 | 25.11 ± 0.92 | 48.31 ± 2.44 | 37.95 |
| Kimi-K2.5 | 44.48 ± 1.32 | 32.07 ± 1.40 | 41.44 ± 3.12 | 39.33 |
| GPT-5.4 | 40.98 ± 1.62 | 32.95 ± 0.85 | 32.81 ± 2.16 | 35.58 |
| Claude Sonnet 4 | 34.79 ± 0.83 | 24.76 ± 1.98 | 26.52 ± 2.59 | 28.69 |
| Claude Sonnet 4.5 | 44.92 ± 2.71 | 29.16 ± 1.46 | 41.35 ± 1.38 | 38.47 |
| Claude Opus 4.5 | 51.58 ± 1.64 | 35.68 ± 1.00 | 60.67 ± 2.50 | 49.31 |

Key findings:

  • v1.1 vs v1.0: +3.83 on LiveCVEBench, +2.38 on PatchEval, +1.43 on Terminal-Bench-2.0
  • Scaling potential: Performance gains from 4k to 18.8k traces demonstrate continued improvement with more data, suggesting further scaling could yield additional gains
  • Competitive performance: Matches Claude Sonnet 4 overall (28.74 vs. 28.69 avg) on security tasks with a 32B model
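For context on the metric: avg@5 is the mean score over five independent evaluation runs, and the ± values are read here as the run-to-run standard deviation (an assumption about the table's notation, not something the card states). A minimal sketch of the aggregation, with made-up per-run scores:

```python
import statistics

def avg_at_k(scores):
    """Aggregate per-run benchmark scores into (mean, sample std)."""
    return statistics.mean(scores), statistics.stdev(scores)

# Illustrative per-run resolve rates (%) for five runs -- not real data.
runs = [39.1, 41.0, 40.2, 41.8, 39.5]
mean, std = avg_at_k(runs)
print(f"avg@5 = {mean:.2f} ± {std:.2f}")  # avg@5 = 40.32 ± 1.10
```

Running each benchmark five times and reporting mean ± spread is what makes small deltas (e.g. the +1.43 on Terminal-Bench-2.0) interpretable against the listed uncertainty.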

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Luoberta/Abacus-cve-v1.1",
    torch_dtype="auto",   # load in the checkpoint's native BF16
    device_map="auto",    # place weights across available devices
)
tokenizer = AutoTokenizer.from_pretrained("Luoberta/Abacus-cve-v1.1")
```

Citation

@misc{luo2026cvefactory,
  title={CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability},
  author={Xianzhen Luo and Jingyuan Zhang and Shiqi Zhou and Rain Huang and Chuan Xiao and Qingfu Zhu and Zhiyuan Ma and Xing Yue and Yang Yue and Wencong Zeng and Wanxiang Che},
  year={2026},
  eprint={2602.03012},
  archivePrefix={arXiv},
  primaryClass={cs.CR},
  url={https://arxiv.org/abs/2602.03012}
}