phuongntc/qwen3_06b_grpo_noSFT_multievalsumviet2_nopenalty • Text Generation • Updated 11 days ago • 16
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_fix1000 • Text Generation • 0.6B • Updated 14 days ago • 13
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_final • Text Generation • 0.6B • Updated 15 days ago • 20