Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning Paper • 2506.03136 • Published Jun 3
Vibe Checker: Aligning Code Evaluation with Human Preference Paper • 2510.07315 • Published Oct 8
VeriEquivBench: An Equivalence Score for Ground-Truth-Free Evaluation of Formally Verifiable Code Paper • 2510.06296 • Published Oct 7
Strengthening Programming Comprehension in Large Language Models through Code Generation Paper • 2508.12620 • Published Aug 18
Static Analysis as a Feedback Loop: Enhancing LLM-Generated Code Beyond Correctness Paper • 2508.14419 • Published Aug 20