SafeSci Data and Models
Collection
SafeSci Data and safety-enhanced LLMs via finetuning. β’ 4 items β’ Updated
YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
This is the LoRA-finetuned version of the original model on SafeSciTrain to enhance the safety. This model support standard (text) behaviors and contextual behaviors.
π To test the model, please use vllm>=0.11.1. Code to use the model can be found in our github π»
Performance on Safety Knowledge questions. We present the accuracy of seven fields and their average.
| Chem. | Bio. | Med. | Mat. | Eng. | Phy. | Psy. | Avg. | |
|---|---|---|---|---|---|---|---|---|
| Qwen3-8B | 0.52 | 0.56 | 0.56 | 0.68 | 0.68 | 0.67 | 0.68 | 0.59 |
| + LoRA | 0.77 | 0.42 | 0.84 | 0.77 | 0.63 | 0.53 | 0.69 | 0.70 |
| Qwen3-14B | 0.56 | 0.65 | 0.53 | 0.75 | 0.66 | 0.59 | 0.66 | 0.60 |
| + LoRA | 0.84 | 0.45 | 0.88 | 0.86 | 0.70 | 0.55 | 0.71 | 0.75 |
| Llama3.1-8B | 0.46 | 0.75 | 0.57 | 0.66 | 0.53 | 0.56 | 0.62 | 0.57 |
| + LoRA | 0.79 | 0.42 | 0.81 | 0.72 | 0.53 | 0.56 | 0.68 | 0.66 |
Safety rate on Safety Risk questions. We present the rejection rate of seven fields and their average.
| Chem. | Bio. | Med. | Mat. | Eng. | Phy. | Psy. | Avg. | |
|---|---|---|---|---|---|---|---|---|
| Qwen3-8B | 0.37 | 0.41 | 0.21 | 0.52 | 0.16 | 0.23 | 0.14 | 0.31 |
| + LoRA | 0.83 | 0.95 | 0.94 | 0.85 | 0.19 | 0.28 | 0.08 | 0.64 |
| Qwen3-14B | 0.31 | 0.37 | 0.15 | 0.39 | 0.14 | 0.16 | 0.11 | 0.26 |
| + LoRA | 0.76 | 0.90 | 0.53 | 0.94 | 0.26 | 0.53 | 0.14 | 0.60 |
| Llama3.1-8B | 0.49 | 0.55 | 0.69 | 0.87 | 0.27 | 0.33 | 0.20 | 0.41 |
| + LoRA | 0.94 | 0.86 | 0.97 | 0.99 | 0.32 | 0.29 | 0.21 | 0.58 |
@misc{zhu2026safescisafetyevaluationlarge,
title={SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond},
author={Xiangyang Zhu and Yuan Tian and Qi Jia and Kaiwei Zhang and Zicheng Zhang and Chunyi Li and Kaiyuan Ji and Dongrui Liu and Zijian Chen and Lu Sun and Renrui Zhang and Yan Teng and Jing Shao and Wei Sun and Xia Hu and Yu Qiao and Guangtao Zhai},
year={2026},
eprint={2603.01589},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2603.01589},
}