Show HN: New eval from SWE-bench team evaluates LMs based on goals, not tickets

(codeclash.ai)

3 points | by lieret 8 hours ago

1 comment