Recent Activity
MultiRL/qwen3_4b_easy_rl_our_adv_final • 4B • 44
MultiRL/qwen3_1.7b_easy_rl_final_group_norm • 2B • 417
MultiRL/qwen3_1.7b_easy_rl_final_gamma_1 • 2B • 45
MultiRL/qwen3_4b_base_easy_rl_final • 4B • 1
MultiRL/qwen3_4b_base_sft_final • 4B • 28
MultiRL/qwen3_4b_easy_rl_new • 4B • 27
MultiRL/qwen3_1.7b_easy_rl_gspo • 2B • 2
4B • 14
MultiRL/qwen3_1.7b_easy_rl_final_step120 • 2B • 43
MultiRL/qwen3_4b_medium_rl_final • 4B • 47
MultiRL/qwen3_4b_sft_one_act • 4B • 17
MultiRL/qwen3_1.7b_easy_rl_reinforce_ori • 2B • 93
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0.5 • 2B • 1
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_1 • 2B • 1
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0 • 2B • 1
MultiRL/qwen3_1.7b_sft_one_act • 2B • 24
MultiRL/qwen3_1.7b_easy_rl_final • 2B • 606
MultiRL/qwen3_4b_easy_rl_final • 4B • 15
MultiRL/qwen3_1.7b_sft_final • 2B • 1.88k
MultiRL/qwen3_4b_sft_final • 4B • 38
MultiRL/qwen3_1.7b_easy_rl_new • 2B • 4
MultiRL/qwen3_4b_standard_medium_rl
MultiRL/qwen3_4b_standard_easy_rl • 4B • 8
MultiRL/qwen3_4b_medium_rl_progress_C
MultiRL/qwen3_4b_medium_rl
MultiRL/qwen3_4b_instruct_sft • 4B • 6
MultiRL/qwen3_1.7b_easy_rl_test_task_group
MultiRL/qwen3_1.7b_easy_rl_test • 2B
MultiRL/qwen3_1.7b_sudoku_sft • 2B • 2
MultiRL/qwen3_1.7b_easy_reinforce_batch_32_by_pass • 2B • 1