MultiRL

non-profit

Recent Activity

KimSHine updated a model 4 days ago

MultiRL/qwen3_4b_sudoku_multi_act_sft_final_new

KimSHine published a model 4 days ago

MultiRL/qwen3_4b_sudoku_multi_act_sft_final_new

KimSHine updated a model 5 days ago

MultiRL/qwen3_4b_sudoku_one_act_sft_final_new

MultiRL's models (171)

MultiRL/qwen3_4b_easy_rl_our_adv_final

4B • Updated Dec 22, 2025 • 44

MultiRL/qwen3_1.7b_easy_rl_final_group_norm

2B • Updated Dec 22, 2025 • 417

MultiRL/qwen3_1.7b_easy_rl_final_gamma_1

2B • Updated Dec 18, 2025 • 45

MultiRL/qwen3_4b_base_easy_rl_final

4B • Updated Dec 18, 2025 • 1

MultiRL/qwen3_4b_base_sft_final

4B • Updated Dec 17, 2025 • 28

MultiRL/qwen3_4b_easy_rl_new

4B • Updated Dec 16, 2025 • 27

MultiRL/qwen3_1.7b_easy_rl_gspo

2B • Updated Dec 16, 2025 • 2

MultiRL/qwen3_4b_sft_new

4B • Updated Dec 15, 2025 • 14

MultiRL/qwen3_1.7b_easy_rl_final_step120

2B • Updated Dec 15, 2025 • 43

MultiRL/qwen3_4b_medium_rl_final

4B • Updated Dec 15, 2025 • 47

MultiRL/qwen3_4b_sft_one_act

4B • Updated Dec 14, 2025 • 17

MultiRL/qwen3_1.7b_easy_rl_reinforce_ori

2B • Updated Dec 14, 2025 • 93

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0.5

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_1

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0

2B • Updated Dec 14, 2025 • 1

MultiRL/qwen3_1.7b_sft_one_act

2B • Updated Dec 14, 2025 • 24

MultiRL/qwen3_1.7b_easy_rl_final

2B • Updated Dec 13, 2025 • 606

MultiRL/qwen3_4b_easy_rl_final

4B • Updated Dec 13, 2025 • 15

MultiRL/qwen3_1.7b_sft_final

2B • Updated Dec 11, 2025 • 1.88k

MultiRL/qwen3_4b_sft_final

4B • Updated Dec 11, 2025 • 38

MultiRL/qwen3_1.7b_easy_rl_new

2B • Updated Dec 6, 2025 • 4

MultiRL/qwen3_4b_standard_medium_rl

4B • Updated Dec 6, 2025

MultiRL/qwen3_4b_standard_easy_rl

4B • Updated Dec 5, 2025 • 8

MultiRL/qwen3_4b_medium_rl_progress_C

4B • Updated Dec 5, 2025

MultiRL/qwen3_4b_medium_rl

4B • Updated Dec 4, 2025

MultiRL/qwen3_4b_instruct_sft

4B • Updated Dec 1, 2025 • 6

MultiRL/qwen3_1.7b_easy_rl_test_task_group

2B • Updated Dec 1, 2025

MultiRL/qwen3_1.7b_easy_rl_test

2B • Updated Nov 30, 2025

MultiRL/qwen3_1.7b_sudoku_sft

2B • Updated Nov 28, 2025 • 2

MultiRL/qwen3_1.7b_easy_reinforce_batch_32_by_pass

2B • Updated Nov 26, 2025 • 1