arxiv:2505.12929
Zhihe Yang
zhyang2226
AI & ML interests
Trustworthy RL & Offline RL
Recent Activity
liked a model about 2 months ago: tencent/HunyuanImage-3.0
liked a model 5 months ago: tencent/HunyuanVideo
authored a paper 5 months ago: Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key