I suspect an era where software output matters less than leverage. If one engineer with AI can replace weeks of work, metrics like LOC and story points become almost meaningless
Agree that LOC and Story points are not metrics that needs tracking. What is happening is that same model and provider can use a very different experience in terms of [Turns, Tokens, Time and to certain extent the results]. This is expected and to some extent acceptable. In this scenario how will we write a test case? What is success? What moves the needle to say that a commit made the software perform 20% better?
Agree. The challenge is that same model/provider can use dramatically different numbers for thefirst two items. Maintainability is an invariant of quality of the system. Measuring that is a good metric for sure.
I suspect an era where software output matters less than leverage. If one engineer with AI can replace weeks of work, metrics like LOC and story points become almost meaningless
Agree that LOC and Story points are not metrics that needs tracking. What is happening is that same model and provider can use a very different experience in terms of [Turns, Tokens, Time and to certain extent the results]. This is expected and to some extent acceptable. In this scenario how will we write a test case? What is success? What moves the needle to say that a commit made the software perform 20% better?
Software should be measured exactly the same as before:
* execution speed
* execution resource cost
* average time to debug issues
* average to add/remove features
Agree. The challenge is that same model/provider can use dramatically different numbers for thefirst two items. Maintainability is an invariant of quality of the system. Measuring that is a good metric for sure.
[flagged]