Open Agent Evaluation Laboratory
university
https://boxiyu.github.io/
BoshCavendish
BoxiYu
boxi-yu-194b63279
AI & ML interests
Code Agent, Benchmark Augmentation
Recent Activity
CWCY updated a dataset 1 day ago: OpenAgentLab/SWE-ABS
Bertsekas authored a paper 8 months ago: How Should I Build A Benchmark? Revisiting Code-Related Benchmarks For LLMs
Bertsekas authored a paper 8 months ago: UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
Team members: 2
Models: 0 (none public yet)
Datasets: 1
OpenAgentLab/SWE-ABS • Updated 1 day ago • 500 • 12