Research · March 30, 2026 · 8 min read

AI Tools Leave Invisible Fingerprints in Your Git History. Researchers Can ID Them at 97% Accuracy.

A January 2026 arXiv paper analyzed 33,580 PRs from 5 AI coding agents (Codex, Copilot, Devin, Cursor, Claude Code) and identified which tool wrote the code with a 97.2% F1 score — even when developers didn't disclose it.

Published by GitIntel Research

TLDR

Devin stands apart from the other agents: its most distinctive feature is PR description verbosity — detailed step-by-step reasoning traces that explain not only what the code does but the decision process behind it. This mirrors Devin's design as a fully autonomous agent that documents its own work for human review.
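The paper doesn't publish its feature-extraction code, so as a rough illustration only: a verbosity signal of the kind described might look like the sketch below. The function name, the line-pattern heuristic, and the feature set are all assumptions, not the paper's actual features.

```python
import re

def description_verbosity(pr_body: str) -> dict:
    """Crude verbosity features for a PR description (illustrative only)."""
    words = pr_body.split()
    # Numbered or bulleted lines are a rough proxy for step-by-step
    # reasoning traces of the kind the paper attributes to Devin.
    step_lines = [
        line for line in pr_body.splitlines()
        if re.match(r"^\s*(\d+[.)]|[-*])\s+", line)
    ]
    return {
        "word_count": len(words),
        "step_line_count": len(step_lines),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
    }

features = description_verbosity(
    "1. Reproduced the bug locally\n"
    "2. Patched the parser\n"
    "- Added regression test"
)
```

A real classifier would compute dozens of such signals per PR; this shows only the shape of one.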

Why This Matters Beyond Academia

1. Repository governance without disclosure mandates

Many open source projects now require disclosure of AI-generated contributions — LLVM added rules in late 2025, cURL's maintainer has been vocal about the problem. But enforcement has always depended on developer honesty. If a developer uses Cursor to write a PR and submits it under their own account without disclosure, there's been no way to detect it. The fingerprinting research changes that. A 97.2%-accurate classifier running on PR metadata is a viable enforcement tool.

2. Training data contamination in AI research

The paper explicitly flags this: datasets that claim to separate "human-written" from "AI-generated" code based on submitter identity are likely mislabeled. If developers use AI tools but don't disclose it, research built on those datasets has a contamination problem. The authors argue fingerprinting should be applied retroactively to fix existing datasets used for AI safety and capability evaluations.

3. Enterprise IP and licensing compliance

Some enterprise AI coding licenses restrict which tools can be used on certain codebases — for IP indemnification reasons, export controls, or regulatory requirements. If an employee uses a non-approved AI tool to generate production code, the company currently has no way to detect it. Fingerprinting creates an audit trail where none existed before.

The Scale Problem: 60% of Code Will Be AI-Generated by Year End

The stakes of this research get clearer when you look at current adoption numbers. Gartner forecasts that 60% of new code will be AI-generated by the end of 2026. At Google and Microsoft, 30% of new code already is. Anthropic's 2026 Agentic Coding Trends Report found that developers now use AI in 60% of their engineering work.

The attribution problem scales with adoption. Right now, explicit Co-Authored-By trailers — the mechanism GitIntel currently uses to detect AI commits — only capture the cases where developers voluntarily disclose. As our earlier scan of 13 major open source repos showed, that floor sits at around 5.8% of commits across the projects we measured.
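Trailer-based detection of the kind described above is simple to reproduce. Here is a minimal sketch; the regex and the known-agent list are my assumptions, not GitIntel's actual implementation.

```python
import re

# Trailer form: "Co-Authored-By: Name <email>" on its own line in the
# commit message body. The known-agent substrings are illustrative.
TRAILER = re.compile(
    r"^Co-Authored-By:\s*(?P<name>.+?)\s*<(?P<email>[^>]+)>\s*$",
    re.IGNORECASE | re.MULTILINE,
)
KNOWN_AGENTS = ("claude", "copilot", "devin", "cursor", "codex")

def ai_coauthors(commit_message: str) -> list:
    """Return trailer names that look like known AI coding agents."""
    return [
        m.group("name")
        for m in TRAILER.finditer(commit_message)
        if any(agent in m.group("name").lower() for agent in KNOWN_AGENTS)
    ]

msg = "Fix parser crash\n\nCo-Authored-By: Claude Code <noreply@anthropic.com>"
```

In practice this would run over per-commit output of `git log --format='%B'`. Note what it can never find: the commit with no trailer at all, which is exactly the gap fingerprinting targets.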

5.8% · AI commits detectable via Co-Authored-By (floor)

~30% · Estimated true AI share at major tech companies

97.2% · Accuracy of fingerprinting-based detection

The gap between 5.8% and ~30% is the invisible AI code: committed without attribution and undetectable by current tooling. The fingerprinting research suggests this gap is technically closeable. Not with certainty — a 97.2% classifier still produces false positives — but well enough to shift the governance conversation from "we can't know" to "we can measure."

Attribution vs. Detection: Two Different Problems

It's worth being precise about what fingerprinting is — and what it isn't. The arXiv paper's classifier does agent-level identification: given a PR, predict whether it came from Codex, Copilot, Devin, Cursor, or Claude Code. It answers "which tool?" not just "was AI involved?"

GitIntel currently operates on the attribution layer: scanning commit trailers for explicit Co-Authored-By metadata. This is precise when the data exists and has zero false positives — but it misses everything that wasn't disclosed. Fingerprinting is the complement: imperfect but broad, catching what attribution misses.

# What gitintel scan currently detects (high precision, limited recall)
gitintel scan --format table
→ Finds: commits with Co-Authored-By: Claude Code <...>
→ Misses: AI code committed without attribution

# What fingerprinting adds (broader recall, lower precision)
# Feature detection based on commit message structure,
# diff shape, conditional density, PR verbosity
→ Finds: behavioral signatures matching known agent patterns
→ Flags: probable AI commits even without trailers

The research points toward a future where these two approaches complement each other: attribution scanning for the cases where developers disclose, behavioral fingerprinting for the cases where they don't. Together, they close the visibility gap that currently makes it impossible to answer "how much of our codebase is actually AI-generated?" with any confidence.

The Open Source Governance Moment

The timing of this research is not coincidental. In early 2026, NIST launched its AI Agent Standards Initiative focused on ensuring autonomous coding agents can be adopted "with confidence" through industry-led standards and open-source protocol development. The EU AI Act enforcement clock starts in August 2026.

What the fingerprinting paper establishes is a technical foundation for the governance layer that regulators and project maintainers are demanding. Disclosure requirements are only enforceable if enforcement is possible. For the first time, there is peer-reviewed evidence that it is — at 97.2% accuracy, using only the metadata that already lives in your git history.

What this means for engineering leaders

Study Limitations

The arXiv paper (2601.17406) is compelling but not without caveats.

Start with what's detectable today

GitIntel scans your git history for explicit AI attribution — the high-precision layer. Run it on any repo in under a minute to get your current AI commit percentage across branches and authors.

# Install
curl -fsSL https://gitintel.com/install.sh | sh

# Scan your repo
cd your-repo && gitintel scan

# See which agents contributed
gitintel scan --format json | jq '.agents'

View on GitHub

Open source (MIT) · Local-first · No data leaves your machine

Source: "Fingerprinting AI Coding Agents on GitHub," arXiv:2601.17406, January 2026. Anthropic 2026 Agentic Coding Trends Report. Gartner AI forecast 2026.

