Benchmark Submission: HTTP 429 on all IPs (Bug?)
Hi,
I've been trying to submit predictions to the SteuerEx Benchmark
(https://steuerllm.i5.ai.fau.de/benchmark) for Claude Opus 4,
but every attempt returns:
"You have already submitted 1 time(s). Maximum 1 submission(s) per IP allowed."
I've tried 4 completely different networks/IPs (home WiFi, mobile hotspot,
VPN Zurich, VPN Tirana); all return the same 429 error. This suggests
the server might be checking something other than the IP, or there's
a bug in the submission system.
My predictions.json is validated (115 answers, all IDs 1001-1115,
UTF-8, no empty entries).
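For reference, the checks listed above can be sketched roughly like this. This is only an illustration: the thread does not show the actual schema of predictions.json, so the list-of-objects layout and the "id" / "answer" key names are assumptions, and validate_predictions / load_predictions are hypothetical helper names.

```python
import json

# IDs 1001..1115 inclusive, i.e. the 115 answers mentioned in the post.
EXPECTED_IDS = set(range(1001, 1116))


def load_predictions(path):
    # Opening with encoding="utf-8" raises UnicodeDecodeError on non-UTF-8 files.
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def validate_predictions(entries):
    """Return a list of problems; an empty list means all checks passed.

    ASSUMPTION: `entries` is a list of {"id": int, "answer": str} objects.
    The real file format may differ.
    """
    problems = []
    ids = [entry.get("id") for entry in entries]

    if len(entries) != len(EXPECTED_IDS):
        problems.append(f"expected {len(EXPECTED_IDS)} answers, got {len(entries)}")

    missing = EXPECTED_IDS - set(ids)
    if missing:
        problems.append(f"missing IDs: {sorted(missing)}")

    unexpected = set(ids) - EXPECTED_IDS
    if unexpected:
        problems.append(f"unexpected IDs: {sorted(unexpected, key=str)}")

    if len(ids) != len(set(ids)):
        problems.append("duplicate IDs found")

    empty = [entry.get("id") for entry in entries
             if not str(entry.get("answer") or "").strip()]
    if empty:
        problems.append(f"empty answers for IDs: {empty}")

    return problems
```

Calling validate_predictions(load_predictions("predictions.json")) would then return an empty list for a file that passes these checks.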
Could you look into this? I'd love to test multiple models on the benchmark.
Also: is there any way to get access to the reference statements
for local evaluation?
Thanks for the great benchmark!
It was a configuration issue on our side. Can you try:
curl -s -X POST https://steuerllm.i5.ai.fau.de/benchmark/submit -F "model_name=TestModel" -F "key=XX" -F "file=@/home/user/test_submission.json" 2>&1
{"message":"Submission received and queued for evaluation","queue_position":1,"status_url":"/status/4c112a8e38783049","submission_id":"4c112a8e38783049","success":true}
curl -s https://steuerllm.i5.ai.fau.de/benchmark/status/ID 2>&1 | python3 -m json.tool
{
    "model_name": "TestModel",
    "progress": 42,
    "queue_position": 1,
    "queue_size": 0,
    "status": "evaluating",
    "timestamp": "2026-02-13T10:18:57.815690"
}
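If you would rather not re-run the status command by hand, something like the following could poll it for you. Treat it as a rough sketch: only the "evaluating" status value appears in the output above, so the "queued" value and the idea that any other status means the evaluation has finished are assumptions, and wait_for_result / fetch_status are hypothetical helper names. The URL pattern is the one from the curl command.

```python
import json
import time
import urllib.request

BASE_URL = "https://steuerllm.i5.ai.fau.de/benchmark"

# ASSUMPTION: only "evaluating" is shown in the thread; "queued" is a guess.
# Any status outside this set is treated as final.
IN_PROGRESS = {"queued", "evaluating"}


def fetch_status(submission_id):
    """GET the status JSON for one submission."""
    url = f"{BASE_URL}/status/{submission_id}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_for_result(submission_id, fetch=fetch_status, interval=30.0,
                    max_polls=120, sleep=time.sleep):
    """Poll until the status leaves IN_PROGRESS, then return the last response."""
    for _ in range(max_polls):
        status = fetch(submission_id)
        if status.get("status") not in IN_PROGRESS:
            return status
        print(f"status={status.get('status')} progress={status.get('progress')}")
        sleep(interval)
    raise TimeoutError(f"submission {submission_id} still in progress")
```

Usage would be wait_for_result("4c112a8e38783049"), with the submission_id taken from the submit response. The fetch and sleep parameters are only there so the loop can be tried out without hitting the server.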
The gold answers will be released when the service is taken down. For now we want to prevent training pollution.
thanks