⚠️
Demo dossier — synthetic runtime data. These audits are published for demonstration purposes. Runtime traces were synthetically generated to illustrate the behavioral audit methodology. Systems are anonymised. Full production dossiers with live execution evidence are available under NDA — contact@factnotebook.com
⚠️
STATIC ANALYSIS — Limited confidence Code and documentation were analysed statically — the audit engine did not execute the system live. Where session traces were provided, behavioural findings derive from those traces, not from live execution. Some verdicts rely on heuristic signals and are labelled accordingly; they may not reflect actual runtime behaviour. For a full technical dossier with executable evidence and SHA-256 seal, contact contact@factnotebook.com.

Remediation Playbook

agents-for-openbb

Audit ID: CSVA-20260614-9BE11290 | Generated: 2026-06-17 09:47 UTC

This playbook covers only controls requiring action. Controls already demonstrated at E4/E5/E6 are excluded — see section below.


✅ Controls Already Demonstrated — No Action Required

The following checkpoints have E4/E5/E6 runtime evidence and do not require remediation:

Confidence-Based Human Routing, Contextual Memory Limitation, Data Traceability, Data Cleansing & Anonymisation, Authority Delegation, System Explainability, Human-in-the-Loop Mechanism, Escalation to Human

These controls are demonstrated by runtime execution traces or correlated event chains. Maintain the evidence chain to preserve this status.

Note on apparent contradictions: a checkpoint may appear both here and in the remediation table below when the mechanism was observed but its enforcement on critical decisions was not. Example: a human-in-the-loop mechanism may exist (demonstrated) while human approval enforcement for critical decisions remains unverified (remediation required). These are distinct controls.


Developer Custom Tests — Prove Your Specific Implementation

CAMSVA generated 4 test template(s) for checkpoints that require knowledge of your specific architecture to verify.

These cover mechanisms CAMSVA cannot detect automatically:
HITL via Slack/email, approval queues, middleware-level checks, external notification systems.

How it works

1. Open the template → implement your specific proof
2. Run: python -m pytest .factdna/custom_tests/ -v
3. If PASS → re-run: python camsva.py --project . to seal
4. Passing custom tests override the SKIPPED verdict
   and are sealed cryptographically in the final dossier
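
As an illustration of step 1, a custom test might look like the sketch below. `ApprovalQueue` and `execute_decision` are hypothetical stand-ins for your own approval mechanism (Slack, email, queue); they are not part of CAMSVA or of the audited repository, and the file name in the comment is only an assumption about where such a test would live.

```python
# Hypothetical custom test, e.g. .factdna/custom_tests/test_hitl_rejection.py
# ApprovalQueue is an illustrative stand-in; replace it with your real mechanism.


class ApprovalQueue:
    """In-memory stand-in for an approval queue (assumption, not a CAMSVA API)."""

    def __init__(self):
        self._verdicts = {}

    def record(self, decision_id, approved):
        self._verdicts[decision_id] = approved

    def is_approved(self, decision_id):
        # A decision with no recorded verdict is treated as not approved.
        return self._verdicts.get(decision_id, False)


def execute_decision(queue, decision_id, action):
    """Run `action` only when a human has approved `decision_id`."""
    if not queue.is_approved(decision_id):
        return "BLOCKED"
    return action()


def test_rejected_decision_is_blocked():
    queue = ApprovalQueue()
    queue.record("d-1", approved=False)
    assert execute_decision(queue, "d-1", lambda: "EXECUTED") == "BLOCKED"


def test_approved_decision_runs():
    queue = ApprovalQueue()
    queue.record("d-2", approved=True)
    assert execute_decision(queue, "d-2", lambda: "EXECUTED") == "EXECUTED"
```

The point of the template is that the test exercises your real blocking path; the in-memory queue only shows the shape of the assertions.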

Available templates

Why this matters: A passing developer test carries more evidential weight than a generic heuristic check. It proves YOUR implementation of the control, in YOUR architecture, with YOUR test — sealed with the audit ID CSVA-20260614-9BE11290.


7. 🛠️ REMEDIATION PLAN — ACTIONS PER CHECKPOINT


🎯 Quick-wins — Highest-impact remediations

Each row is a missing root control whose remediation automatically unlocks
dependent child controls. The % indicates the share of regulatory weight for the article concerned.

Priority | Root checkpoint | Article | Potential impact
🔴 P1 | Provider Identity | Article 25 | 100.0% of article weight → unlocks 1 cross-article control(s)
🔴 P1 | Risk Register | Article 9 | 87.1% of article weight → unlocks 5 cross-article control(s)
🔴 P1 | Data Inventory | Article 10 | 52.8% of article weight → unlocks 3 cross-article control(s)
🔴 P1 | Logging Implementation | Article 12 | 45.5% of article weight → unlocks 2 cross-article control(s)
🔴 P1 | Secure Format Policy | Article 15 | 33.3% of article weight → unlocks 2 cross-article control(s)
🔴 P1 | Human Validation | Article 14 | 29.5% of article weight → unlocks 2 cross-article control(s)

Each row corresponds to an AI Act checkpoint for which no sufficient evidence was identified
within the analyzed scope, and describes the evidence gap to close.
The approaches listed are illustrative examples, not prescriptions — any equivalent control
that produces the expected evidence is acceptable. Implementation choices remain the
responsibility of the system owner.

Remediation Dashboard

Reg. Severity | Priority | Checkpoint | Article | Example remediation approach | Indicative effort
GATE ARTICLE | 🔴 P1 CRITICAL | Audit Trail | Article 12 | Add an AuditLogger call before each critical decision. | 2–4 days
GATE ARTICLE | 🔴 P1 CRITICAL | Automatic Blocking Linked to Human Rejection | Article 14 | Analyse and fix checkpoint 'Automatic Blocking Linked to Human Rejection' per Article 14. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Bypass Detection | Article 15 | Analyse and fix checkpoint 'Bypass Detection' per Article 15. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Human Validation | Article 14 | Implement a human approval gate before any critical automated decision. | 1–2 weeks
GATE ARTICLE | 🔴 P1 CRITICAL | PII Masking Before External Transmission | Article 10 | Add a PII masking layer before any transmission to an external LLM or API. | 3–5 days
GATE ARTICLE | 🔴 P1 CRITICAL | Unsafe Serialization Formats | Article 15 | Replace pickle.load() with a secure format (safetensors, ONNX, joblib with verification). | 2–5 days
🔴 HIGH | 🔴 P1 CRITICAL | Decision Record Structure | Article 12 | Enrich the log structure with mandatory accountability fields. | 3–5 days
GATE ARTICLE | 🔴 P1 CRITICAL | Prompt Guardrail / Injection Detection | Article 15 | Integrate a semantic guardrail (Llama Guard or equivalent) before LLM transmission. | 3–5 days
🔴 HIGH | 🔴 P1 CRITICAL | Risk Mitigation | Article 9 | Analyse and fix checkpoint 'Risk Mitigation' per Article 9. | To be estimated
🔴 HIGH | 🔴 P1 CRITICAL | Bias Metrics | Article 10 | Integrate bias metrics (Disparate Impact, Equal Opportunity) into the test pipeline. | 3–5 days
🔴 HIGH | 🔴 P1 CRITICAL | Logging Implementation | Article 12 | Configure a centralised logger (logging.getLogger) writing to persistent storage. | 1–2 days
🟠 MEDIUM | 🔴 P1 CRITICAL | Logging Integrity | Article 12 | Verify that log functions actually write to DB/file (no pass or print stubs). | 1–3 days
GATE ARTICLE | 🔴 P1 CRITICAL | Physical Dataset Existence | Article 10 | Analyse and fix checkpoint 'Physical Dataset Existence' per Article 10. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Data Inventory | Article 10 | Analyse and fix checkpoint 'Data Inventory' per Article 10. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Real Execution Traces | Article 12 | Analyse and fix checkpoint 'Real Execution Traces' per Article 12. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Risk Matrix | Article 9 | Analyse and fix checkpoint 'Risk Matrix' per Article 9. | To be estimated
GATE ARTICLE | 🔴 P1 CRITICAL | Risk Ownership Assignment | Article 9 | Assign a named Risk Owner for each risk in the register. | 1 day
GATE ARTICLE | 🔴 P1 CRITICAL | Risk Register | Article 9 | Create a formalised risk register (JSON, YAML or doc section) listing all identified risks. | 2–3 days
🔴 HIGH | 🟠 P2 MAJOR | Agent Tool Scope | Article 14 | Restrict the agent tool catalogue to the strict minimum (least privilege principle). | 1–3 days
🔴 HIGH | 🟠 P2 MAJOR | Error Handling | Article 15 | Wrap critical calls in try/except blocks returning generic errors. | 1–2 days
🔴 HIGH | 🟠 P2 MAJOR | Continuous Monitoring | Article 9 | Analyse and fix checkpoint 'Continuous Monitoring' per Article 9. | To be estimated
🔴 HIGH | 🟠 P2 MAJOR | Dataset Quality | Article 10 | Analyse and fix checkpoint 'Dataset Quality' per Article 10. | To be estimated
🔴 HIGH | 🟠 P2 MAJOR | Limitations Disclosure | Article 13 | Analyse and fix checkpoint 'Limitations Disclosure' per Article 13. | To be estimated
🟠 MEDIUM | 🟠 P2 MAJOR | Model Card | Article 11 | Write a model card (intended use, limits, metrics, version). | 2–3 days
🟠 MEDIUM | 🟠 P2 MAJOR | System Architecture | Article 11 | Analyse and fix checkpoint 'System Architecture' per Article 11. | To be estimated
🔴 HIGH | 🟠 P2 MAJOR | User Notice | Article 13 | Write a user notice explaining the system's operation and limits. | 1–2 days
🔴 HIGH | 🟠 P2 MAJOR | Input Robustness | Article 15 | Add schema validation (Pydantic/jsonschema) on all user inputs. | 2–4 days
- | 🔴 P1 STRATEGIC | STUB_IMPLEMENTATION_RATIO | Global | Replace the detected empty functions (pass/stub) with real implementations. | 3–6 weeks (CODE)

Remediation detail with example approaches

Audit Trail (Article 12)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 2–4 days
Type CODE

Action required: Add an AuditLogger call before each critical decision.

Implementation example:

AuditLogger.log_event(event='decision', resource_id=res_id)

Expected evidence (how to prove this is fixed):

audit.log file or audit_events DB table with sample entries (decision_id, timestamp, actor, input_hash, output)

Risk if not remediated:

⚠️ Without decision traceability, incident investigation and regulatory inspection become impossible. Art. 12 §1 mandatory.
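
To make the expected evidence concrete, here is a minimal sketch of an append-only logger that writes the listed fields to audit.log as JSON Lines. The `AuditLogger` class below is a hypothetical implementation written for this example; the `log_event` signature is an assumption and differs from the one-line call shown above.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


class AuditLogger:
    """Hypothetical append-only JSON Lines audit logger."""

    def __init__(self, path="audit.log"):
        self.path = path

    def log_event(self, event, actor, input_data, output):
        entry = {
            "decision_id": str(uuid.uuid4()),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "actor": actor,
            # Hash the input so the log does not store raw payloads.
            "input_hash": hashlib.sha256(input_data.encode("utf-8")).hexdigest(),
            "output": output,
        }
        # Append one JSON object per line so earlier entries are never rewritten.
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(entry) + "\n")
        return entry
```

Calling `AuditLogger().log_event("decision", "analyst-1", "query text", "approved")` before the decision is acted on leaves one line per decision in the file, which can then be sampled as evidence.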

Human Validation (Article 14)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 1–2 weeks
Type CODE

Action required: Implement a human approval gate before any critical automated decision.

Implementation example:

if not human_approval_cb(decision=result, actor=user): raise HumanApprovalRequired()

Expected evidence (how to prove this is fixed):

hitl.py + screenshot of approval workflow + sample approval log entry

Risk if not remediated:

⚠️ Automated critical decisions may be executed without human intervention. Non-compliance with Art. 14 §4, direct enforcement action risk.
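
The gate in the implementation example can be expanded into a small runnable sketch. `run_critical_decision` and the `execute` parameter are hypothetical names introduced here; `human_approval_cb` is assumed to be any callable that returns True only after a human has approved.

```python
class HumanApprovalRequired(Exception):
    """Raised when a critical decision has not been approved by a human."""


def run_critical_decision(decision, actor, human_approval_cb, execute):
    """Execute `decision` only after `human_approval_cb` returns True."""
    if not human_approval_cb(decision=decision, actor=actor):
        # Fail closed: without approval the decision is never executed.
        raise HumanApprovalRequired(f"Decision {decision!r} was not approved")
    return execute(decision)
```

Raising an exception rather than returning a flag means a caller cannot silently ignore a missing approval.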

PII Masking Before External Transmission (Article 10)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 3–5 days
Type CODE

Action required: Add a PII masking layer before any transmission to an external LLM or API.

Implementation example:

masked = pii_filter.mask(payload)
response = llm_client.call(masked)

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.
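
A masking layer of the kind described above could start from something like the following sketch. The two regular expressions are illustrative assumptions that only catch e-mail addresses and phone-like digit runs; a production filter would need broader coverage (names, identifiers, addresses) and probably a dedicated library.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text):
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


# Mask first, then pass only the masked text to the external client.
masked = mask_pii("Contact jane.doe@example.com or +33 1 23 45 67 89")
```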

Unsafe Serialization Formats (Article 15)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 2–5 days
Type CODE

Action required: Replace pickle.load() with a secure format (safetensors, ONNX, joblib with verification).

Implementation example:

# BEFORE: model = pickle.load(f)
# AFTER:  from safetensors.torch import load_file
#         state_dict = load_file('model.safetensors')  # returns a dict of tensors, not a model object

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.
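
The "with verification" part of the action can be sketched with the standard library alone. This example swaps in JSON as the safe format and checks a SHA-256 digest before parsing; `load_verified_json` is a hypothetical helper, and where the expected digest comes from (a signed manifest, a config file) is left as an assumption.

```python
import hashlib
import json


def load_verified_json(path, expected_sha256):
    """Load a JSON artifact only if its SHA-256 digest matches the expected value."""
    with open(path, "rb") as handle:
        raw = handle.read()
    digest = hashlib.sha256(raw).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"Digest mismatch for {path}: got {digest}")
    # json.loads only builds plain data types; it cannot execute code,
    # unlike pickle.load on an untrusted file.
    return json.loads(raw)
```

The same verify-then-load pattern applies to safetensors or ONNX files; only the final parsing call changes.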

Decision Record Structure (Article 12)

Field Value
Severity 🔴 HIGH
Priority 🔴 P1 CRITICAL
Effort 3–5 days
Type CODE

Action required: Enrich the log structure with mandatory accountability fields.

Implementation example:

{'decision_id': str(uuid.uuid4()), 'actor_id': user.id, 'model_version': MODEL_VER, 'input_hash': hashlib.sha256(input_bytes).hexdigest()}  # input_bytes: the raw input encoded as bytes

Expected evidence (how to prove this is fixed):

Sample log entry: {decision_id, actor_id, model_version, input_hash, output, timestamp}

Risk if not remediated:

⚠️ AI decisions cannot be attributed or reconstructed. Required for conformity assessment under Art. 12 §2.

Prompt Guardrail / Injection Detection (Article 15)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 3–5 days
Type CODE

Action required: Integrate a semantic guardrail (Llama Guard or equivalent) before LLM transmission.

Implementation example:

safe_input = guardrail.check(user_input)
if not safe_input.is_safe:
    raise PromptInjectionError()

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.

Risk Mitigation (Article 9)

Field Value
Severity 🔴 HIGH
Priority 🔴 P1 CRITICAL
Effort To be estimated
Type ?

Action required: Analyse and fix checkpoint 'Risk Mitigation' per Article 9.

Expected evidence (how to prove this is fixed):

Code implementing mitigation + reference to risk_id in RISK_REGISTER + test confirming mitigation active

Risk if not remediated:

⚠️ Identified risks with no mitigation action. Regulatory gap under Art. 9 §2.

Bias Metrics (Article 10)

Field Value
Severity 🔴 HIGH
Priority 🔴 P1 CRITICAL
Effort 3–5 days
Type CODE

Action required: Integrate bias metrics (Disparate Impact, Equal Opportunity) into the test pipeline.

Implementation example:

from fairlearn.metrics import demographic_parity_difference
dpd = demographic_parity_difference(y_true, y_pred, sensitive_features=gender)

Expected evidence (how to prove this is fixed):

fairness_report.json or model_card.md section with: protected_groups, metrics (TPR, FPR, equalized_odds)

Risk if not remediated:

⚠️ No evidence of bias evaluation. High-risk AI without fairness metrics exposed to Art. 10 §2 non-compliance.
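
The two metrics named in the action, Disparate Impact and Equal Opportunity, can be computed without any dependency, as in the sketch below. The function names and the binary 0/1 label convention are assumptions made for this example; a library such as fairlearn would normally replace these helpers.

```python
def _rate(values):
    """Share of positive (1) entries; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def disparate_impact(y_pred, groups, protected, reference):
    """Ratio of positive-prediction rates: protected group over reference group.

    Raises ZeroDivisionError if the reference group has no positive predictions.
    """
    prot = [p for p, g in zip(y_pred, groups) if g == protected]
    ref = [p for p, g in zip(y_pred, groups) if g == reference]
    return _rate(prot) / _rate(ref)


def equal_opportunity_difference(y_true, y_pred, groups, protected, reference):
    """Difference in true positive rates: protected group minus reference group."""
    def tpr(group):
        hits = [p for t, p, g in zip(y_true, y_pred, groups) if g == group and t == 1]
        return _rate(hits)
    return tpr(protected) - tpr(reference)


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
di = disparate_impact(y_pred, groups, "a", "b")  # 0.25 / 0.75
eod = equal_opportunity_difference(y_true, y_pred, groups, "a", "b")  # 0.5 - 1.0
```

Writing `di` and `eod` per protected group into fairness_report.json from the test pipeline would produce the evidence artifact listed above.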

Logging Implementation (Article 12)

Field Value
Severity 🔴 HIGH
Priority 🔴 P1 CRITICAL
Effort 1–2 days
Type CODE

Action required: Configure a centralised logger (logging.getLogger) writing to persistent storage.

Implementation example:

import logging
logger = logging.getLogger('ai_system')
logger.addHandler(logging.FileHandler('audit.log'))

Expected evidence (how to prove this is fixed):

audit.log or audit_events table with persistent entries (not stdout only)

Risk if not remediated:

⚠️ Logs written to stdout are lost at process restart. Non-persistent logging fails Art. 12 §1.

Logging Integrity (Article 12)

Field Value
Severity 🟠 MEDIUM
Priority 🔴 P1 CRITICAL
Effort 1–3 days
Type CODE

Action required: Verify that log functions actually write to DB/file (no pass or print stubs).

Implementation example:

def log_event(self, **kw): self.db.insert('audit_log', kw)  # NOT: pass or print()

Expected evidence (how to prove this is fixed):

Test confirming log entries written to DB/file (not just print). Log rotation config.

Risk if not remediated:

⚠️ Logs that only print to stdout provide no durable audit trail.
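
One way to demonstrate that log calls really persist is to write to a database and read the rows back from a fresh connection. The sketch below uses SQLite from the standard library; `SqliteAuditLog` and its table layout are hypothetical and stand in for whatever store `self.db` refers to in the example above.

```python
import json
import sqlite3


class SqliteAuditLog:
    """Hypothetical audit log that persists every event to SQLite."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS audit_log "
            "(id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
        )

    def log_event(self, **kw):
        # Parameterised insert, committed immediately so the row is durable.
        self.conn.execute(
            "INSERT INTO audit_log (payload) VALUES (?)", (json.dumps(kw),)
        )
        self.conn.commit()

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

A test that logs events, closes the connection, reopens the file and checks `count()` shows the entries survive the process that wrote them, which a print stub cannot do.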

Risk Ownership Assignment (Article 9)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 1 day
Type DOC

Action required: Assign a named Risk Owner for each risk in the register.

Implementation example:

risks.yaml:
- id: RISK-001
  owner: 'Chief Risk Officer'
  contact: 'risk-owner@company.example'

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.

Risk Register (Article 9)

Field Value
Severity ⛔ BLOCKING
Priority 🔴 P1 CRITICAL
Effort 2–3 days
Type DOC

Action required: Create a formalised risk register (JSON, YAML or doc section) listing all identified risks.

Implementation example:

risks.yaml:
- id: RISK-001
  name: Algorithmic bias
  probability: MEDIUM
  impact: HIGH

Expected evidence (how to prove this is fixed):

risks.yaml or RISK_REGISTER.md with: id, probability, impact, mitigation, owner, review_date

Risk if not remediated:

⚠️ Without a risk register, all downstream risk management obligations (Art. 9) cannot be demonstrated. Potential regulatory exposure under Article 9.
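
A register is easier to keep complete if a check flags entries that lack the fields listed under expected evidence. The sketch below assumes the register has already been parsed into a list of dicts (for example from risks.yaml with a YAML parser); `missing_fields` is a hypothetical helper.

```python
REQUIRED_FIELDS = ("id", "probability", "impact", "mitigation", "owner", "review_date")


def missing_fields(register):
    """Map each risk id to the required fields that are absent or empty."""
    gaps = {}
    for index, risk in enumerate(register):
        absent = [name for name in REQUIRED_FIELDS if not risk.get(name)]
        if absent:
            # Fall back to the list position when the entry has no id.
            gaps[risk.get("id") or f"<entry {index}>"] = absent
    return gaps


# The minimal entry from the example above still lacks three evidence fields.
register = [
    {"id": "RISK-001", "name": "Algorithmic bias", "probability": "MEDIUM", "impact": "HIGH"}
]
gaps = missing_fields(register)
```

Run in CI, a non-empty result can fail the build until every risk has an owner, a mitigation and a review date.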

Agent Tool Scope (Article 14)

Field Value
Severity 🔴 HIGH
Priority 🟠 P2 MAJOR
Effort 1–3 days
Type CODE

Action required: Restrict the agent tool catalogue to the strict minimum (least privilege principle).

Implementation example:

ALLOWED_TOOLS = ['search', 'summarize']  # Remove: 'delete', 'send_email', 'execute_code'

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.
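
Declaring an allowlist only helps if it is enforced at the point where tools are dispatched. The following sketch shows one way to do that; `call_tool`, `ToolNotAllowed` and the `registry` dict are hypothetical names, not part of any agent framework.

```python
ALLOWED_TOOLS = frozenset({"search", "summarize"})


class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside the allowlist."""


def call_tool(name, registry, *args, **kwargs):
    """Dispatch to a tool only if its name is on the allowlist."""
    if name not in ALLOWED_TOOLS:
        # Refuse even if the tool exists in the registry.
        raise ToolNotAllowed(name)
    return registry[name](*args, **kwargs)
```

Checking the allowlist before the registry lookup means a tool that is installed but not permitted is still refused.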

Error Handling (Article 15)

Field Value
Severity 🔴 HIGH
Priority 🟠 P2 MAJOR
Effort 1–2 days
Type CODE

Action required: Wrap critical calls in try/except blocks returning generic errors.

Implementation example:

try:
    result = model.infer(input)
except InferenceError:
    return {'error': 'Service unavailable', 'code': 503}

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.
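
A slightly fuller version of the pattern keeps the detailed error in the log while the caller only sees the generic message. `safe_infer` and the locally defined `InferenceError` are assumptions for this sketch; substitute the exception types your inference client actually raises.

```python
import logging

logger = logging.getLogger("ai_system")


class InferenceError(Exception):
    """Stand-in for the error type raised by the inference backend."""


def safe_infer(model_infer, payload):
    """Call the model and convert backend failures into a generic error."""
    try:
        return {"result": model_infer(payload)}
    except InferenceError:
        # Full details, including the traceback, go to the log only.
        logger.exception("Inference failed")
        return {"error": "Service unavailable", "code": 503}
```

Catching a specific exception type rather than a bare `except` keeps programming errors visible instead of masking them as a 503.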

Model Card (Article 11)

Field Value
Severity 🟠 MEDIUM
Priority 🟠 P2 MAJOR
Effort 2–3 days
Type DOC

Action required: Write a model card (intended use, limits, metrics, version).

Implementation example:

MODEL_CARD.md: Model — <name> v<version> | Intended use: <domain task> | Limitation: <known out-of-scope conditions>

Expected evidence (how to prove this is fixed):

MODEL_CARD.md with: intended_use, limitations, performance metrics, bias assessment, version

Risk if not remediated:

⚠️ Users and deployers cannot assess system capabilities. Art. 13 transparency obligation not met.

System Architecture (Article 11)

Field Value
Severity 🟠 MEDIUM
Priority 🟠 P2 MAJOR
Effort To be estimated
Type ?

Action required: Analyse and fix checkpoint 'System Architecture' per Article 11.

Expected evidence (how to prove this is fixed):

SYSTEM_DESCRIPTION.md or Annex IV-compatible technical documentation

Risk if not remediated:

⚠️ No technical documentation for conformity assessment. Required under Art. 11 and Annex IV.

User Notice (Article 13)

Field Value
Severity 🔴 HIGH
Priority 🟠 P2 MAJOR
Effort 1–2 days
Type DOC

Action required: Write a user notice explaining the system's operation and limits.

Implementation example:

User guide — Section 1: This AI system assists <domain task>. It does not replace expert human judgement.

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.

Input Robustness (Article 15)

Field Value
Severity 🔴 HIGH
Priority 🟠 P2 MAJOR
Effort 2–4 days
Type CODE

Action required: Add schema validation (Pydantic/jsonschema) on all user inputs.

Implementation example:

from pydantic import BaseModel, Field

class InputSchema(BaseModel):
    query: str = Field(max_length=2000)
    ...

Expected evidence (how to prove this is fixed):

Evidence requirements depend on implementation architecture. Examples: test report, runtime logs, configuration snapshot, CI/CD validation report.

Risk if not remediated:

⚠️ Regulatory gap — see article reference.
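
Where adding Pydantic or jsonschema is not yet possible, the same length and type checks can be approximated with a standard-library dataclass. This is a stand-in technique, not schema validation in the Pydantic sense; `QueryInput` and the non-empty rule are assumptions, while the 2000-character limit is taken from the example above.

```python
from dataclasses import dataclass

MAX_QUERY_LENGTH = 2000


@dataclass(frozen=True)
class QueryInput:
    """User query validated at construction time."""

    query: str

    def __post_init__(self):
        if not isinstance(self.query, str):
            raise TypeError("query must be a string")
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if len(self.query) > MAX_QUERY_LENGTH:
            raise ValueError(f"query exceeds {MAX_QUERY_LENGTH} characters")
```

Constructing `QueryInput(query=raw)` at the system boundary rejects malformed input before it reaches the model or any tool.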


Architecture context — Topology 🕸️ MICRO-MESH


8. 📈 REMEDIATION ROADMAP

⚠️ These are estimated scores — not predictions or guarantees.
They assume all recommended controls are implemented as described in the Playbook,
all evidence artifacts are accepted, and no new findings emerge during review.

Work Phase | Estimated Score | Evidence Strength | Indicative Timeline
CURRENT STATE | 57.2/100 | EVIDENCE INSUFFICIENT: Available technical evidence is insufficient to support a positive assessment within the analyzed scope. Technical remediation required. | Now
PHASE 1: Critical gaps | ~75/100 (estimated) | Gate article gaps resolved | 4–6 weeks
PHASE 2: Full evidence | ~91/100 (estimated) | Evidence sufficient for review | 10–14 weeks

Estimation basis:
Phase 1 — assumes: all BLOCKING checkpoints addressed, gate articles pass.
Phase 2 — assumes: 80% of HIGH/MEDIUM controls evidenced at E2 or above.
These scores are indicative estimates, not certification guarantees.
An independent conformity assessment may produce different results.


Indicative Effort Estimate

Phase | Checkpoints | Indicative Effort
Phase 1: BLOCKING | 19 | High (months)
Phase 2: HIGH | 16 | High (months)
Phase 3: MEDIUM/LOW | 1 | Low (days)

Effort levels are indicative only. Actual effort depends on team size, stack,
architecture and organisational maturity — none of which are assessed by this audit.
No cost estimate is provided for this reason.

Phase 1 priority: technical alignment with Article 12, which resolves immediate regulatory exposure and unblocks dependent controls.


LEGAL STATUS: TECHNICAL EVIDENCE REPORT — This document is an automated factual report. It documents technical alignment with EU AI Act control points. It does not constitute legal advice or regulatory certification.

Methodology Notice
Evidence levels (E0–E5), contradiction detection, assurance scoring and control mapping are defined in the FactNotebook Technical Evidence Framework.