LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42_no_cl_IS Text Generation • 0.6B • Updated 1 day ago • 56
LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_gdpo_192000_episodes_seed_42 Text Generation • 0.6B • Updated 1 day ago • 51
LorenaYannnnn/20260217-Qwen3-0.6B_sycophancy_warmup_16000_ep_OURS_cl_SELF_gdpo_192000_episodes_seed_42 Updated 1 day ago • 58
LorenaYannnnn/20260217-Qwen3-0.6B_grpo_sycophancy_warmup_baseline_192000_episodes_seed_42 Text Generation • 0.6B • Updated 1 day ago • 51
LorenaYannnnn/20260217-Qwen3-0.6B_grpo_warmup_16000_episodes_seed_42 Text Generation • 0.6B • Updated 4 days ago • 1.44k
LorenaYannnnn/20260216-Qwen3-0.6B_warmup_grpo_OURS_cl_0.6B_128000_episodes_seed_42 Text Generation • 0.6B • Updated 6 days ago • 72
LorenaYannnnn/20260216-Qwen3-0.6B_warmup_grpo_baseline_128000_episodes_seed_42 Text Generation • 0.6B • Updated 6 days ago • 69
LorenaYannnnn/20260216-Qwen3-0.6B_warmup_grpo_OURS_cl_SELF_128000_episodes_seed_42 Updated 6 days ago • 71
LorenaYannnnn/20260216-Qwen3-no_nonfactual_irrelevance-0.6B_grpo_warmup_24000_episodes_seed_42 Text Generation • 0.6B • Updated 6 days ago • 667
LorenaYannnnn/20260215-Qwen3-0.6B_grpo_warmup_24000_episodes_seed_42 Text Generation • 0.6B • Updated 7 days ago • 341
LorenaYannnnn/20260211-Qwen3-0.6B_helpful_instructions_baseline_448000_episodes_seed_42 Updated 11 days ago • 45
LorenaYannnnn/20260211-Qwen3-0.6B_helpful_instructions_cl_correction_448000_episodes_seed_42 Updated 11 days ago • 44
LorenaYannnnn/20260203-Qwen3-0.6B_mmlu_sycophan_new_vanilla_always_cl_partial_deepseek_627984_ep_seed_42 Updated 19 days ago • 96
LorenaYannnnn/20260203-Qwen3-0.6B_mmlu_sycophancy_new_baseline_627984_episodes_seed_42 Updated 19 days ago • 96
LorenaYannnnn/20260129-Qwen3-1.7Base_MATH700heldout_vanilla_always_cl_consis_partial_llama_652224_ep_s_42 Updated 24 days ago • 21
LorenaYannnnn/20260126-Qwen3-1.7B-Base_MATH_700_heldout_baseline_652224_episodes_seed_42 Updated 27 days ago • 50
LorenaYannnnn/20260126-Qwen3-1.7Base_MATH700heldout_vanilla_no_cl_m_wrong_consis_partial_llama_652224_ep_s_42 Updated 27 days ago • 43
LorenaYannnnn/20260123-Qwen3-1.7B-Base_MATH_m_cl_sep_no_cl_m_inc_partial_llama_719328_episodes_seed_42 Updated 30 days ago
LorenaYannnnn/20260120-Qwen3-1.7Base_MATH_m_cl_sep_keep_0.5_no_cl_m_inc_partial_llama_719328_episodes_seed_42 Updated Jan 20
LorenaYannnnn/20260119-Qwen3-0.6B-Base_MATH_answer_box_baseline_719328_episodes_seed_42 Updated Jan 19
LorenaYannnnn/20260118-Qwen3-1.7B-Base_gsm8k_m_cl_sep_no_cl_m_wrong_partial_llama_1195680_episodes_seed_42 Updated Jan 18
LorenaYannnnn/20260118-Qwen3-1.7B-Base_MATH_m_cl_sep_no_cl_m_wrong_partial_llama_719328_episodes_seed_42 Updated Jan 18
LorenaYannnnn/20260117-Qwen3-1.7B-Base_gsm8k_m_cl_separate_always_cl_partial_llama_1195680_episodes_seed_42 Updated Jan 17
LorenaYannnnn/20260117-Qwen3-1.7B-Base_math_answer_box_baseline_719328_episodes_seed_42 Updated Jan 17
LorenaYannnnn/20260117-Qwen3-1.7B-Base_gsm8k_minimal_answer_box_prompt_baseline_1195680_episodes_seed_42 Updated Jan 17
LorenaYannnnn/20260115-Qwen3-0.6B_gsm8k_m_cl_sep_norm_always_cl_partial_llama_1434816_episodes_seed_42 Updated Jan 16
LorenaYannnnn/20260113-Qwen3-0.6B_gsm8k_dgpo_no_cl_main_wrong_cl_partial_1_llama_1434816_episodes_seed_42 Updated Jan 14
LorenaYannnnn/20260112-Qwen3-0.6B_gsm8k_main_cl_separate_no_cl_main_incorrect_1_llama_1434816_episodes_seed_42 Updated Jan 13
LorenaYannnnn/20260105-Qwen3-0.6B_gsm8k_no_classmate_main_incorrect_1_llama_1434816_episodes_seed_42 Updated Jan 7