built an agent skill that reads every git diff instead of counting PRs for annual reviews

Managing 200+ engineers across a global platform. Every review cycle I hit the same wall.

DORA metrics tell me how fast the team is moving. They don't tell me whether the code going through the pipeline is good. Deployment frequency, lead time, and change failure rate are team throughput metrics. They say nothing about individual engineering quality.

The result: dashboards reward volume and punish architects. An engineer who reduces 2,000 lines to 400 (making a module maintainable for three new people) gets scored as low output. An engineer with high PR volume who ships debug code to production gets scored as high output.

I ended up reading every commit diff for each major developer, every quarter. Manually. That obviously doesn't scale, so I automated it.

The tool reads diffs, not line counts. It tracks accuracy (how often code ships clean), identifies design improvements and anti-patterns, maps codebase structure and tech debt, and produces growth reviews with specific commit evidence.
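
For anyone curious what "reads diffs, not line counts" can look like mechanically, here's a minimal sketch. It is not the actual analyzer; the function names (`diffs_by_author`, `rough_accuracy`) and the fix/revert heuristic are my own assumptions. It just pulls per-author patches out of git and computes a crude "ships clean" proxy:

```python
import subprocess
from collections import defaultdict

def diffs_by_author(repo_path, since="3 months ago"):
    """Collect full patch text per author, instead of counting lines or PRs.
    Illustrative sketch only; the real tool's pipeline may differ."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", f"--since={since}",
         "--patch", "--pretty=format:==COMMIT==%n%H%n%ae%n%s"],
        capture_output=True, text=True, check=True,
    ).stdout

    per_author = defaultdict(list)
    for chunk in log.split("==COMMIT==")[1:]:
        lines = chunk.strip().splitlines()
        sha, author, subject = lines[0], lines[1], lines[2]
        patch = "\n".join(lines[3:])  # the diff body for this commit
        per_author[author].append({"sha": sha, "subject": subject, "patch": patch})
    return per_author

def rough_accuracy(commits):
    """Crude proxy for 'how often code ships clean': share of commits that are
    not fix/revert/hotfix follow-ups. A real signal needs revert tracking,
    issue links, and human context -- this is only an assumption for the sketch."""
    fixes = sum(1 for c in commits
                if any(w in c["subject"].lower() for w in ("fix", "revert", "hotfix")))
    return 1 - fixes / len(commits) if commits else None
```

With the per-author patches in hand, the interesting work (spotting design improvements, anti-patterns, and tech debt) happens by actually reading the diff text, not by summing additions and deletions.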

No scores. No rankings. Low accuracy often just means that person owns the riskiest module. Context is everything.

Open-sourced it: https://github.com/anivar/contributor-codebase-analyzer

Interested to hear how other engineering leaders handle this. Do you trust dashboards? Manual review? Something hybrid?
