Git Blame in the Age of AI: How Teams Are Tracking Code Attribution
When 40% of commits touch AI-generated code, git blame stops telling the full story. Here's how teams are handling attribution, what tools exist, and the honest tradeoffs in each approach.
Published by GitIntel Research
TLDR
- • git blame attributes code to the last human who committed it, which tells you nothing about whether an AI generated it, reviewed it, or whether the committer understood it.
- • Teams with >25% AI code share are reporting that traditional ownership models break down in incident response: no one knows who "owns" the AI-generated function that failed.
- • Three approaches are emerging: commit message conventions, git notes, and file-level metadata — each with real tradeoffs in adoption friction and tooling support.
- • The honest position: perfect AI attribution is operationally expensive; risk-tiered tracking (track AI origin on critical paths, sample elsewhere) is the practical standard.
Why Git Blame Is Breaking
Git blame was built for a world where every line had a human author. Even when code came from a third-party library, someone made a deliberate decision to copy or import it, and git blame would show you who. Attribution meant accountability: this person wrote this code, so they're the right person to ask about why it works this way.
AI-assisted development fractures that model in several ways simultaneously.
The committer is often the accepter, not the author. A developer runs Copilot, accepts a 40-line function with one edit, and commits. Git blame shows their name. They may understand what the function does. They may have verified it passes tests. They may not have read every line in detail before accepting — especially in a high-velocity session generating hundreds of lines.
The function may have been generated from a context that no longer exists. The prompt that produced it may have included context that's since changed. The developer who accepted it may have left the team. The conversation between them and the AI is gone. What remains is code that git blame attributes to someone who may no longer be able to explain the non-obvious parts.
At low AI code percentages, this is a manageable edge case. At 30–40% AI code share — which GitHub Octoverse 2025 data shows some teams hitting — it's a systemic problem. Incident response gets slower when no one can confidently say "I understand why this code does what it does."
What Teams Are Trying
Three approaches have emerged in engineering blog posts and team retrospectives through 2025–2026. None is universally adopted; each involves real tradeoffs.
Commit message conventions. The lowest-friction approach: teams add a structured tag to commit messages when commits are predominantly AI-generated. Common formats include [ai-assisted], Co-Authored-By: Copilot <copilot@github.com>, or the GitHub Copilot automatic attribution that some IDE integrations now inject. The advantage is zero tooling requirement: git log --grep can filter on the tag, and the matching commit hashes can then be compared against git blame output. The disadvantage is that it relies on developer discipline, degrades when developers forget or don't bother, and doesn't distinguish between "I reviewed every line" and "I accepted this without reading it."
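As a rough illustration of how little tooling the convention needs, the sketch below checks commit messages for the two markers mentioned above and reports what fraction of commits carry one. The regular expressions and the function names is_ai_tagged and ai_tagged_share are assumptions made for this example, not an established standard, and the messages are assumed to have been collected already (for instance from git log output).

```python
import re

# Hypothetical marker patterns; adjust them to whatever convention the team adopts.
AI_TAG = re.compile(r"\[ai-assisted\]", re.IGNORECASE)
AI_TRAILER = re.compile(r"^co-authored-by:\s*copilot\b", re.IGNORECASE | re.MULTILINE)


def is_ai_tagged(message: str) -> bool:
    """Return True if the commit message carries either AI marker."""
    return bool(AI_TAG.search(message) or AI_TRAILER.search(message))


def ai_tagged_share(messages: list[str]) -> float:
    """Fraction of commit messages carrying an AI marker (0.0 for empty input)."""
    if not messages:
        return 0.0
    return sum(is_ai_tagged(m) for m in messages) / len(messages)
```

The result only measures how often developers tagged their commits, so it inherits the discipline problem described above: an untagged commit may still be AI-generated.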
Git notes. Git notes allow arbitrary metadata attached to commits without rewriting history. A post-commit hook or CI pipeline step can annotate each commit with structured AI attribution data: what tool was used, what percentage of changed lines were AI-generated, whether the PR had explicit AI review labeling. Notes are queryable via git log --notes and can be pushed to remotes, although git push and git fetch do not transfer refs/notes/* by default, so the refspec has to be configured explicitly. The larger disadvantage is that git notes have poor tooling support: GitHub, GitLab, and Bitbucket don't display them natively, making them invisible to teams that live in those UIs.
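A hook along these lines could be sketched as follows. The notes ref name ai-attribution, the JSON field names, and both helper functions are hypothetical choices for this example; only the git notes add invocation itself is standard git. The sketch builds the note body and the argument list and leaves execution to the caller.

```python
import json

NOTES_REF = "ai-attribution"  # hypothetical notes ref name, stored under refs/notes/


def build_note(tool: str, ai_lines: int, changed_lines: int, ai_review_label: bool) -> str:
    """Serialize attribution fields for one commit as JSON with sorted keys."""
    if changed_lines <= 0 or not 0 <= ai_lines <= changed_lines:
        raise ValueError("need changed_lines > 0 and 0 <= ai_lines <= changed_lines")
    payload = {
        "tool": tool,
        "ai_line_pct": round(100 * ai_lines / changed_lines, 1),
        "ai_review_label": ai_review_label,
    }
    return json.dumps(payload, sort_keys=True)


def notes_add_command(commit: str, note: str) -> list[str]:
    """Argument list for `git notes add`; -f overwrites an existing note on the commit."""
    return ["git", "notes", f"--ref={NOTES_REF}", "add", "-f", "-m", note, commit]
```

In a hook, the returned list could be passed to subprocess.run(cmd, check=True), and the notes read back with git log --notes=ai-attribution. How the AI line count is obtained is left open here, since that depends on what the editor or CI integration can report.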
File-level metadata. Some teams maintain a separate attribution file (.ai-attribution.json or equivalent) that maps file paths to their AI generation history. The file is updated via CI hooks at merge time. The advantage is that it's queryable outside git and can power custom tooling such as dashboards, blast-radius analysis, and risk scorecards. The disadvantage is that it's a separate file to maintain, can go stale, and requires custom tooling to make actionable.
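To make the "queryable outside git" point concrete, here is a minimal sketch of a query against such a file. There is no standard schema for .ai-attribution.json; the layout assumed here (a top-level "files" object whose entries carry an "ai_line_pct" number) and the function name files_above_threshold are invented for illustration.

```python
import json


def files_above_threshold(attribution_json: str, min_ai_pct: float) -> list[str]:
    """Return sorted paths whose recorded AI share is at or above min_ai_pct.

    Assumes a hypothetical schema: {"files": {"<path>": {"ai_line_pct": <number>}}}.
    Entries without an "ai_line_pct" field are treated as 0.
    """
    data = json.loads(attribution_json)
    return sorted(
        path
        for path, entry in data.get("files", {}).items()
        if entry.get("ai_line_pct", 0) >= min_ai_pct
    )
```

A dashboard or risk scorecard would typically read the file from the repository root and call a function like this with a team-chosen threshold.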
GitIntel's approach combines commit metadata and static analysis to infer AI code share without requiring teams to manually tag every commit, recognizing that manual tagging discipline erodes over time in real teams.
The Honest Tradeoffs
Perfect AI attribution is operationally expensive. Tagging every AI-assisted line in every commit, maintaining that metadata as code evolves through edits and refactors, and surfacing it reliably in every tool a team uses is a real engineering investment. For most teams, it's not worth doing perfectly.
What is worth doing is risk-tiered tracking: understand AI origin on the code paths where it matters most (auth, payments, data migrations, core business logic) and sample or skip it elsewhere. This is the same tradeoff teams make with test coverage: 100% coverage everywhere is expensive and often not justified, while 100% coverage on critical paths is a reasonable target.
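One way to encode a tiering policy like this is a small path-to-tier lookup. The directory patterns, the tier names, and the function tracking_tier below are all hypothetical; a real policy would use the team's own layout and definitions of "critical".

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; the first matching tier wins.
# Note that fnmatch's "*" also matches "/" so patterns cover nested directories.
TIER_RULES = [
    ("critical", ["src/auth/*", "src/payments/*", "migrations/*"]),
    ("standard", ["src/*"]),
]


def tracking_tier(path: str) -> str:
    """Map a repository path to an attribution tier; unmatched paths are only sampled."""
    for tier, patterns in TIER_RULES:
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return tier
    return "sampled"
```

A CI step could then require full attribution data only for files in the "critical" tier and accept missing data everywhere else.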
The AI comprehension debt angle matters here. Attribution isn't just about blame — it's about knowing where to focus review effort. A critical auth function that was AI-generated two years ago, accepted by a developer who's since left, and never deeply reviewed is a different risk profile than a utility function that formats dates. Attribution data makes that risk gradient visible.
What Good Looks Like in 2026
Teams with mature AI attribution practices share a few characteristics. They have a defined policy — not necessarily comprehensive attribution, but a documented decision about what level of attribution is required for what categories of code. They have tooling that makes tagging low-friction enough that developers actually do it. And they use attribution data for operational decisions, not just retrospective analysis.
The most actionable operational use is pre-incident: during planning, before a major deployment, knowing which high-criticality paths contain significant AI-generated code that hasn't been deeply reviewed. This doesn't mean blocking deployments — it means directing code review effort appropriately. If you're deploying a change to a payment flow and 60% of that flow is AI-generated with no deep review in its history, that's where you put your pre-deploy review time.
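The pre-deploy check described here comes down to simple arithmetic over per-file attribution data. The sketch below assumes that data already exists in some form; the FileStats fields, the notion of a boolean "deep_reviewed" flag, and both function names are simplifications invented for this example.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    total_lines: int
    ai_lines: int
    deep_reviewed: bool  # hypothetical flag: has a human reviewed this file in depth?


def unreviewed_ai_share(files: list[FileStats]) -> float:
    """Line-weighted share of a flow that is AI-generated and not deeply reviewed."""
    total = sum(f.total_lines for f in files)
    if total == 0:
        return 0.0
    unreviewed = sum(f.ai_lines for f in files if not f.deep_reviewed)
    return unreviewed / total


def review_queue(files: list[FileStats]) -> list[str]:
    """Unreviewed files containing AI-generated lines, largest AI line count first."""
    pending = [f for f in files if not f.deep_reviewed and f.ai_lines > 0]
    return [f.path for f in sorted(pending, key=lambda f: f.ai_lines, reverse=True)]
```

The share gives a single number to compare against a team-defined threshold before a deployment, and the queue suggests where to spend the review time first.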
The secondary use is incident retrospective. When something fails in production and the blast radius includes AI-generated code, having that attribution data changes the post-mortem. The question isn't just "what broke" but "how was this code produced and reviewed, and what does that tell us about our process?"
Where Tooling Is Heading
The major platforms are moving toward first-class AI attribution support, albeit slowly. GitHub has indicated that Copilot-generated code will receive richer attribution metadata in future Copilot Enterprise features — tracking not just that Copilot was used but the specific completion that contributed to a commit. This would make git blame actually informative for AI-generated code rather than just pointing to the human who committed it.
Sourcegraph's Cody has similar attribution tracking in its enterprise roadmap. JetBrains' AI Assistant captures session data that could in principle be used for attribution. The infrastructure for this data exists; the challenge is getting it to the surface in the tools where developers actually ask "who wrote this and why."
Until native platform support matures, the practical path for teams is the commit convention approach — low friction, some signal, improving over time — combined with repository-level analysis tools that can infer AI code share statistically without requiring perfect developer tagging.
Sources
- GitHub Copilot Enterprise documentation, AI attribution features, 2025–2026
- Stack Overflow Developer Survey 2026 — AI tool usage and attribution awareness
- InfoQ, "Code ownership in AI-assisted development teams," March 2026
- The Register, "Who wrote that bug? AI attribution in production codebases," February 2026
- Sourcegraph engineering blog, "Cody attribution tracking," Q1 2026