From Declarations to Demonstrations
Evidence Infrastructure for AI Governance Claims
Patrick Etoua · FactNotebook
DOI: 10.5281/zenodo.20490281
Most AI governance failures are not failures of policy. They are failures of evidence: the gap between what a system claims to do and what it demonstrably does. FactNotebook is built to make that gap measurable, across any evidence channel, against any governance claim.
Abstract
Current AI governance tools rely on declarations, questionnaires, and manual audits — producing what we term paper compliance: documentation that asserts conformity without demonstrating it. We propose a structured framework based on observable technical evidence collected from multiple sources: source code, configuration, automated tests, operational logs, external system connectors, and cryptographic records.
Our framework introduces a seven-level evidence taxonomy (E0–E6, where E0 records that no evidence was found), an Evidence Quality dimension, an Evidence Resolution Principle, a Contradiction Detection mechanism, and a three-dimensional assessment model separating compliance score, assurance score, and evidence strength.
Evidence Taxonomy E0–E6
E0: No evidence found in analysed artefacts. Does not assert absence.
E1: Control described in documentation or policy. Claim stated, not verified.
E2: Static analysis confirms the control pattern in source code architecture.
E3: Automated tests execute and verify the control in a controlled environment.
E4: Confirmed to have run in production conditions. Operational logs, audit trails.
E5: Execution evidence is cryptographically chained. Authenticity independently verifiable.
E6: Correlated event chains reconstruct end-to-end proof that the regulatory control functioned.
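One way to make the levels above machine-readable is an ordered enumeration. The following is a minimal sketch, not FactNotebook's implementation. It assumes the levels are strictly ordered from weakest (E0) to strongest (E6); the names EvidenceLevel, strongest_level, and is_paper_compliance are hypothetical and chosen only for illustration. Reading "paper compliance" as "nothing stronger than E1 was observed" is also an assumption.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Evidence levels E0-E6, assumed ordered from weakest to strongest."""
    E0 = 0  # no evidence found in analysed artefacts (does not assert absence)
    E1 = 1  # control described in documentation or policy, not verified
    E2 = 2  # static analysis confirms the control pattern in source code
    E3 = 3  # automated tests verify the control in a controlled environment
    E4 = 4  # confirmed to have run in production (logs, audit trails)
    E5 = 5  # execution evidence is cryptographically chained
    E6 = 6  # correlated event chains give end-to-end proof


def strongest_level(observed):
    """Return the highest level among the observed evidence, or E0 if none."""
    return max(observed, default=EvidenceLevel.E0)


def is_paper_compliance(observed):
    """Hypothetical check: the claim is backed by documentation only (E1)."""
    return strongest_level(observed) == EvidenceLevel.E1


# Illustrative inputs, not real assessment data.
declared_only = [EvidenceLevel.E1]
tested_and_logged = [EvidenceLevel.E1, EvidenceLevel.E3, EvidenceLevel.E4]

print(strongest_level(declared_only).name, is_paper_compliance(declared_only))
print(strongest_level(tested_and_logged).name, is_paper_compliance(tested_and_logged))
```

Using an integer-backed enumeration keeps comparisons such as "at least E3" explicit, and returning E0 for an empty evidence set mirrors the taxonomy's point that missing evidence is recorded rather than treated as proof of absence.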
"AI compliance is a software engineering problem.
It requires software engineering evidence."