Frontier-grade agentic coding datasets (Demo→10k→50k→100k): tests-as-truth, diff-first patches, tool-safe workflows.