
Cryptographic Provenance for AI Outputs

Author: Anton Sokolov, TalTech / Zetes · Date: 2026-05-05 · Status: draft v0.5 · Source: github.com/sapsan14/paper-pki-ai-act · Keywords: EU AI Act · eIDAS · ETSI EN 319 · PKI · Provenance · RFC 3161 · Post-quantum cryptography · Agent governance

(draft v0) The EU AI Act (Regulation 2024/1689) imposes obligations on integrity, transparency, record-keeping, and human oversight of AI systems — particularly under Articles 10, 12, and 13 of the high-risk regime (Chapter III, Section 2). We argue that a substantial portion of these obligations can be operationalised by reusing the trust infrastructure that PKI engineers already deploy under eIDAS (Regulations 910/2014 and 2024/1183) and ETSI EN 319 102 / 132 / 401 / 411: cryptographic signatures over canonical payloads, X.509 path validation, OCSP/CRL revocation, RFC 3161 timestamps, and policy-driven hash-chained audit ledgers. We present a reference architecture — the Enterprise Agent Trust Framework (EATF / Aletheia) — positioned as a Trusted AI Evidence Layer that binds AI outputs and governed actions to signed, timestamped Evidence Packages, with optional ML-DSA hybrid signatures (NIST FIPS 204) for post-quantum readiness. We give an Article-by-Article mapping with explicit conformance probability bands, and we deliberately confine Article 10 to integrity and provenance rather than ML data quality. We position the framework as a substrate, not a vertical product, and report on five partner-integration deployments (environmental monitoring, building audit, and education live, with medical and KYC verticals planned) that ride on the same primitives. The paper is intended for trust architects, PKI practitioners, and AI deployers in the EU, and offers an implementation hypothesis that invites engineering critique.

The EU AI Act (Regulation 2024/1689) enters into application in stages between 2025 and 2027, and the engineering surface for high-risk AI systems — Chapter III, Section 2 — Articles 9–15 — lands squarely on auditability, traceability, and human oversight. At the same time, the European trust services regime under eIDAS (Regulation 910/2014, and its 2024/1183 update) and the ETSI EN 319 family has, for over a decade, codified concrete answers to integrity, non-repudiation, and time provenance questions — but those answers are not, by default, applied to AI artefacts.

This paper closes the gap. We do not propose a new compliance taxonomy; instead we ask: which Article-level obligations of the AI Act can be discharged by reusing PKI-native primitives, and where does the PKI toolbox stop being sufficient? Following the author’s prior position notes [@sokolov_eatf_2026], we name the reusable layer the Trusted AI Evidence Layer, and we are explicit that it codifies evidence of integrity, not evidence of clinical or ML utility [@sokolov_signing_layers_2026].

Substrate, not vertical. A second framing choice underpins this work: EATF / Aletheia is a substrate onto which sector-specific verticals (environmental monitoring, building audit, education, medical, KYC) clip without forcing [@sokolov_substrate_2026]. We treat the substrate as the unit of contribution and the verticals as evidence of its sufficiency under load, rather than the other way around. This is relevant because much of the recent commercial discussion conflates signing AI outputs with governing AI behaviour — they are different problems with different solutions.

Contributions.

  1. An Article-by-Article mapping of high-risk AI Act obligations (Articles 10, 12, 13, with 14 explicitly adjacent) to PKI-native levers under eIDAS and ETSI EN 319 102 / 132 / 401 / 411 (table in §4).
  2. A reference architecture, EATF / Aletheia, framed as a Trusted AI Evidence Layer, that produces .aep Evidence Packages: hash → sign → (optional) RFC 3161 timestamp → (optional) ML-DSA hybrid signature (NIST FIPS 204) for post-quantum readiness.
  3. A serial-uniqueness analysis under eIDAS 2 transition (informed by draft 2025/1943) with three architectural options ranked by conformance probability bands (90–95% / 75–88% / 35–60%) [@sokolov_serial_uniqueness_2026].
  4. An honest delineation of where the PKI lens breaks down — for Article 10’s data-governance dimension, for Article 14’s human-oversight dimension, and for distinguishing integrity from clinical utility in regulated domains [@sokolov_signing_layers_2026; @sokolov_water_layers_2026].
  5. A working open-source implementation under a permissive licence, with offline .aep verification, in production across environmental monitoring (water quality, Estonian Terviseamet open data), building audit (TADF), and education (MATx) verticals.

Non-goals. This paper is not a compliance certificate; we do not assert that any given EATF deployment satisfies the AI Act in toto. We do not address training-data quality as a cryptographic property — that is a product/ML question, not a PKI one. We do not adjudicate AI safety in any normative sense; we describe a substrate that makes AI outcomes reviewable and attributable, not safe.

(draft v0.3.)

The AI Act (Regulation 2024/1689) entered into force in 2024 and applies in stages: prohibited practices first, then general-purpose AI (GPAI), then the high-risk regime, then governance and penalties. Authoritative dates are maintained by the EU AI Act Service Desk [@eu_ai_act_service_desk]; secondary instruments (the Digital Omnibus proposal, harmonised standards under Article 40, evidence-format guidance) are still being negotiated and may continue to evolve into 2027.

The high-risk regime is defined in Chapter III, Section 2 (Articles 9–15) and a system enters that regime by satisfying either Article 6(1) (safety-component pathway) or Article 6(2) plus Annex III (use-case pathway). The deployer surface, distinct from the provider surface, adds Articles 26 (deployer obligations) and 27 (Fundamental Rights Impact Assessment, where required). The deployer surface is the operational concern of this paper: a PKI-native realisation has to travel through the chain provider → deployer → end-user, and Article 26’s “use as instructed” language has direct PKI-engineering consequences (deployer-signed deployment manifests, system-card bindings).

This paper concentrates on Articles 10, 12, and 13 — the trio with the strongest PKI fit — and treats Article 14 (human oversight) as adjacent (§4). Articles 9 (risk management), 11 (technical documentation), and 15 (accuracy / robustness / cybersecurity) are referenced but not the focus: Article 9 is a process obligation that PKI does not directly address; Article 11 maps onto signed system-card bundles that the §4 mapping covers as a corollary; Article 15 sits at the runtime / governance boundary, which is out of scope by the provenance-vs-governance separation made in §2.5.

The eIDAS regime under Regulation 910/2014 — and its 2024/1183 update (“eIDAS 2”) — defines trust services as a regulated category in EU law. The taxonomy includes electronic signatures, electronic seals (institutional analogue of signatures), electronic timestamps, and electronic delivery, each available in qualified form (the “Q” prefix) when produced by a Qualified Trust Service Provider (QTSP) under audit. The legal effect of a qualified instrument travels across Member States by force of the Regulation itself, which is the property that makes the regime worth reusing for AI evidence: deployers in any EU jurisdiction can rely on the same primitives without re-ground-truthing the framework.

The ETSI EN 319 family is the engineering toolbox underneath. ETSI EN 319 102-1 [@etsi_en_319_102] gives the canonical procedure for the creation and validation of advanced digital signatures (CAdES, XAdES, PAdES, JAdES profiles); EN 319 132-1 [@etsi_en_319_132] specifies the XAdES profile in detail; EN 319 401 [@etsi_en_319_401] sets general policy requirements for trust service providers; EN 319 411-1 [@etsi_en_319_411] specifies certificate-issuing TSP requirements; EN 319 412-1 [@etsi_en_319_412] specifies certificate profiles. The relationship is hierarchical: EN 319 401 is the general policy frame, specialised by 411-1 for certificate issuers, profiled by 412-1 for the certificates themselves, and consumed by 102-1 / 132-1 at signing time. We reference these standards as a toolbox, not a curriculum; the section is for orientation, not certification.

A specific transition concern under eIDAS 2 is the serial-uniqueness rule (draft 2025/1943), which extends the serialNumber uniqueness requirement from per-CA to TSP-wide for qualified TSPs. The shift is operationally non-trivial: a TSP with multiple CAs (typical at the qualified-issuer scale) must coordinate serial issuance across silos, not just within them. We discuss three architectural options in §5.4 and their conformance probability bands (Central Serial Authority 90–95%; global pre-issuance registry 75–88%; random with post-factum scan 35–60%) [@sokolov_serial_uniqueness_2026]. This question matters for the AI Act mapping because, as soon as evidence-signing CAs become qualified, they inherit the same uniqueness obligation.

Estonia is a useful sanity check on what the regime looks like in production: nearly every adult holds a qualified certificate via the ID-card or Mobiil-ID schemes; the Trusted List (LOTL → EE-TL) is consumed by client software including Open-EID and DigiDoc; RFC 3161-grade timestamps are used routinely in private-sector contracts. The Estonian deployment shape is what we have in mind when we talk about “deployer-facing verifiability” in §4.3.

RFC 3161 [@rfc3161] defines the time-stamp protocol used in this paper as the canonical mechanism for time provenance. We treat the audit ledger as a hash-chained, append-only structure of signed events [@aletheia_repo §audit-ledger], with per-tenant block-size tuning introduced in the late-April 2026 substrate update.

2.4 Post-quantum signatures (NIST FIPS 203/204/205)


Post-quantum migration follows NIST FIPS 203 (ML-KEM) [@nist_fips_203], FIPS 204 (ML-DSA) [@nist_fips_204], and FIPS 205 (SLH-DSA) [@nist_fips_205]. EATF supports a hybrid RSA + ML-DSA-65 signing mode; we discuss certificate inflation, signature size, and the practical “harvest now, decrypt later” timeline in §5.

2.5 Agent identity vs. provenance vs. governance — three orthogonal layers


A frequent confusion in 2025–2026 commercial discussion is the collapse of three distinct concerns into a single “AI trust” surface. We separate them:

  • Identity — who is the agent? Addressed by MCP [@mcp_spec], agent OIDC profiles, agent X.509 certificates.
  • Provenance — what did the agent produce, on what input? Addressed by EATF / Aletheia, C2PA [@c2pa_spec], and similar signed-evidence schemes.
  • Governance — was the agent allowed to do that? Addressed by policy engines, delegation chains, kill-switches, action firewalls (e.g. Visa CLI’s authorize step before a payment rail [@aletheia_vision §4.1]).

Identity and provenance are orthogonal: knowing who the agent is does not, on its own, prove what it did. We position EATF strictly on the provenance layer; identity and governance are complementary neighbours.

(draft v0 — explicit scope.)

What EATF defends.

  • Integrity of AI outputs at rest and in transit. A signed .aep package is tamper-evident: a single-bit modification of the canonical payload invalidates the signature.
  • Non-repudiation of action chains. When a deployer can present a signed evidence package binding (input, output, model version, policy version, timestamp) to a verifiable signing key, the ex-post audit question shifts from “did this happen?” to “what was the policy at the time?”.
  • Time provenance. RFC 3161 timestamps anchor evidence in legally recognisable time, and qualified TSAs anchor it in eIDAS-grade time.
  • Revocation-aware verification. OCSP / CRL state at signing time is captured in the evidence package, so verification does not silently degrade to “trust on first use” months later.

What EATF does not defend.

  • Model accuracy or correctness. A signature does not make a wrong answer right [@sokolov_signing_layers_2026].
  • Training-data quality. Hashing a dataset does not establish that it represents the population, that it is leakage-free, or that it is unbiased [@sokolov_water_layers_2026]. We confine Article 10 (§4) to integrity and provenance of the artefact, not data governance in a broader ML sense.
  • Clinical or domain validity. In regulated domains (medical, legal), a signed AI output is not equivalent to a human practitioner’s judgement. We say so explicitly to avoid layer-confusion in procurement [@sokolov_signing_layers_2026].
  • Runtime behaviour. EATF is an evidence layer; runtime monitoring, rate-limiting, and anomaly detection are out of scope and live one layer up (governance).
  • Network-level adversaries against the signing host. We assume the signing host is operated under ETSI EN 319 401-grade policy; we do not propose cryptographic countermeasures to a compromised signer.

Adversaries we consider.

  1. External tamperer. Modifies an AI output after generation but before audit. Defended by signature.
  2. Mid-stream replacer. Substitutes a different evidence package for the same nominal action. Defended by signature + evidence-id binding.
  3. Time-rewriter. Asserts an action occurred earlier or later than reality. Defended by RFC 3161 timestamps.
  4. Long-horizon adversary (PQC). Records signed traffic today, waits for cryptographically relevant quantum capability. Defended (in part) by hybrid ML-DSA mode; full migration discussed in §5.

Adversaries we do not consider.

  • An adversary with QTSP key custody. (eIDAS-grade physical and policy controls are assumed.)
  • An adversary controlling the policy engine that decides which actions to sign in the first place. (That is governance, not provenance.)
  • An adversary substituting a different model under the same model identifier. We discuss model-card binding in §5.3 but treat the question as adjacent.

The threat model above is informal; this subsection sketches the security argument we believe holds, leaving the formal reduction to a follow-up paper.

For each adversary in §3, we outline the winning condition and the EATF property that prevents it.

Adversary 1 (External tamperer). Wins if the verifier accepts a modified payload as if it were the original. The signature scheme is EUF-CMA secure under standard assumptions (RSA-PSS in the random-oracle model, ECDSA / Ed25519 by their published reductions, ML-DSA-65 under Module-LWE + Module-SIS hardness). The evidence package binds the signature to the SHA-256 of the canonical payload bytes, so a payload modification is a hash modification is a signature mismatch.

Adversary 2 (Mid-stream replacer). Wins if the verifier accepts a substituted package as the package the deployer issued. Because the signed-data scope includes ph || ts || pid (where pid is a deployer-issued UUID v4 unique to this package), substituting a different package means changing at least one of these fields, which breaks the signature. Note that this defends against package-level substitution; it does not defend against an adversary who can request new signatures from the deployer itself, which is governance.

Adversary 3 (Time-rewriter). Wins if the verifier accepts an incorrect signing time. RFC 3161 timestamps anchor the claimed signing time in a TSA’s public key. A time-rewriter must either break the TSA signature (requires TSA key compromise; out of scope) or substitute a different TSA token (the token covers ph, so substitution requires a colliding payload hash; bounded by SHA-256 collision resistance).

Adversary 4 (Long-horizon PQC). Wins by a future quantum capability that breaks classical signatures over previously captured packages. Hybrid mode requires both a classical and an ML-DSA-65 signature to validate (under “both” verifier policy). A future quantum attacker breaking RSA-PSS / ECDSA still cannot validate as the original signer without breaking ML-DSA-65 as well, which currently has no known quantum attack better than generic Grover-style search (reducing security from 192-bit to ~96-bit, which remains comfortable).

A formal cryptographic proof — game-based reduction from EUF-CMA(RSA-PSS) ∧ EUF-CMA(ML-DSA) ∧ collision-resistance(SHA-256) to evidence-package-unforgeability under the threat model above — is a follow-up paper. The construction is straightforward; the contribution of this paper is the regulatory mapping, not the cryptographic primitive.

(draft v0 — core contribution. Full prose discussion of each row will expand in v1; the table below is the load-bearing claim.)

| AI Act Article | PKI-native obligation interpretation | eIDAS / ETSI lever | EATF realisation |
| --- | --- | --- | --- |
| Art. 10 — Data and data governance | Integrity & provenance of artefacts only — explicitly not a cryptographic claim about training-data quality | EN 319 102-1 signature creation; canonical payload formats | .aep Evidence Package: hash, signature, optional dataset hash |
| Art. 11 — Technical documentation | Exportable, verifiable system descriptions (“system card”) | EN 319 102-1; CAdES enveloped signatures | Signed system-card bundles, model + policy version metadata |
| Art. 12 — Record-keeping / logs | Tamper-evident, time-anchored, append-only logs | EN 319 401 / 411-1 (TSP policy); RFC 3161 (timestamps) | Hash-chained audit ledger, signed events, per-tenant block-size |
| Art. 13 — Transparency to deployers | Verifiable, machine-readable representations of agent capability + limits | EN 319 102-1 validation; trusted-list consumption | Chain-of-trust validation + verification UI/API at h2oatlas.ee/verify/... |
| Art. 14 — Human oversight (adjacent) | Cryptographic anchor for human approval events; does not substitute for oversight | EN 319 102-1 signed events; threshold delegation patterns | Approval-event signatures; PKI makes outcomes reviewable and attributable, not safe |
| Art. 15 — Accuracy, robustness, cybersecurity | Verification endpoints, anomaly hooks (boundary with governance) | (adjacent — runtime concerns) | Out of scope (governance layer) |
| Art. 26 — Deployer obligations | Signed deployer attestations referencing system-card bundles | EN 319 401 (TSP policy); FRIA referencing | Deployer-signed deployment manifest |
| Art. 27 — FRIA | Signed FRIA artefacts referencing the deployer manifest | EN 319 102-1; CAdES | FRIA evidence package linked to deployer manifest |

4.1 Article 10 — integrity and provenance, not data quality

Article 10 of the AI Act, taken literally, asks for “data and data governance” practices for high-risk AI: training and validation datasets must be relevant, representative, free of errors, and complete. PKI cannot make any of these claims. What PKI can do is bind a deployer’s assertion about a dataset to a verifiable hash of that dataset, and bind the model card that references the dataset to the same hash. This is the narrow reading we adopt: EATF proves that the data hash claimed in a model-card is the data hash of the artefact bound to that model-card. EATF cannot prove that the underlying data is fit for purpose, representative, or unbiased. Conflating the two is the single most common source of misplaced trust in the 2025 “AI compliance” space [@sokolov_water_layers_2026]. The narrow reading is conservative, which is the point: a regulator who reads Article 10 expansively is not entitled to assume PKI has done their job for them.

4.2 Article 12 — record-keeping as a hash-chained ledger


Article 12 requires logs that allow the AI system’s operation to be traced. The PKI-native realisation is a hash-chained, append-only audit ledger of signed events, each carrying an RFC 3161 timestamp [@rfc3161] and a reference to the policy and model versions in effect. Per-tenant block-size tuning, introduced in the late-April 2026 EATF substrate update, lets deployers trade latency for batch-signing efficiency without breaking the chain property. The ledger is queryable by deployer, by agent, by action type, or by time window — and every query result is itself signable. Article 12’s “automatic recording of events” maps precisely onto this construction: the events are signed, their order is hash-chained, and their time provenance is independently verifiable.
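
To make the chain property concrete, here is a minimal Python sketch. The field names and JSON canonicalisation are illustrative, not the EATF wire format, and per-event signatures and RFC 3161 anchoring are omitted:

```python
import hashlib
import json

GENESIS = b"\x00" * 32  # anchor for the first event

def chain_hash(prev: bytes, event_bytes: bytes) -> bytes:
    """Each ledger entry commits to its predecessor: H(prev || event)."""
    return hashlib.sha256(prev + event_bytes).digest()

def append_event(ledger: list, event: dict) -> None:
    prev = bytes.fromhex(ledger[-1]["chain"]) if ledger else GENESIS
    event_bytes = json.dumps(event, sort_keys=True).encode()  # stand-in canonical form
    ledger.append({"event": event, "chain": chain_hash(prev, event_bytes).hex()})

def verify_chain(ledger: list) -> bool:
    """Recompute every link; an edited, dropped, or reordered event breaks it."""
    prev = GENESIS
    for entry in ledger:
        event_bytes = json.dumps(entry["event"], sort_keys=True).encode()
        if bytes.fromhex(entry["chain"]) != chain_hash(prev, event_bytes):
            return False
        prev = bytes.fromhex(entry["chain"])
    return True

ledger = []
append_event(ledger, {"type": "ai-output", "policy": "1.4.0", "ts": "2026-04-30T12:00:00Z"})
append_event(ledger, {"type": "approval", "policy": "1.4.0", "ts": "2026-04-30T12:01:00Z"})
assert verify_chain(ledger)
ledger[0]["event"]["policy"] = "1.3.0"   # retroactive edit...
assert not verify_chain(ledger)          # ...invalidates every later link
```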

4.3 Article 13 — transparency through verification, not narrative


Article 13 obliges providers to give deployers enough information to interpret system output appropriately. Common practice in 2025 is to write a “system card” as prose. We argue that the prose is necessary but insufficient: a deployer also needs a machine-readable representation that they can verify against the actual deployed artefacts. EATF binds the system card, the model identifier, and the policy version into the Evidence Package metadata, and exposes verification at h2oatlas.ee/verify/.... A deployer (or downstream auditor) can therefore answer “is the system I am running today the system the provider documented?” without trusting either party. Chain-of-trust validation under EN 319 102-1 [@etsi_en_319_102] provides the formal grounding.

4.4 Article 14 — adjacent, not satisfied


Article 14 mandates effective human oversight: humans must be able to fully understand, monitor, intervene in, and override the AI system. A cryptographic anchor on approval events is necessary but not sufficient for oversight to be meaningful. PKI gives us a verifiable record that a human approved at time T under policy P; it does not make that approval informed, deliberate, or non-rubberstamped. We say so explicitly: the PKI lens makes outcomes reviewable and attributable, not overseen in the policy sense. HCI work on the design of meaningful approval flows, and policy work on what constitutes “meaningful” oversight, sit one layer up. We therefore mark Article 14 as adjacent in the mapping table — EATF supplies primitives that any Article-14-compliant flow can reuse, but supplying them is not the same as discharging the obligation.

4.5 Articles 11, 26, 27 — corollary mappings


The remaining articles in the mapping table fall out of the §4.1–§4.3 reading as corollaries; we treat them briefly, in document order.

Article 11 — Technical documentation. Article 11 + Annex IV require the provider to maintain technical documentation that allows authorities to assess conformity. The PKI-native realisation is straightforward: the documentation is a structured artefact (system card + model card + policy pack + risk-management notes), it is canonicalised (CBOR or JCS), it is signed under a CAdES profile, and the signed bundle’s hash is referenced from every Evidence Package emitted by that system. A regulator who fetches one Evidence Package can chase its mch (model-card hash) field to retrieve the exact technical documentation in force at that signing time. This is direct, not adjacent: the Article 11 obligation maps onto a single bundle and a single binding edge from the runtime evidence to the documentation, both verifiable offline.
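
The binding-edge check itself is a single hash comparison. A hypothetical regulator-side helper, assuming the documentation bundle has already been retrieved to a local file and that mch follows the CDDL schema in §5:

```python
import hashlib
from pathlib import Path

def documentation_matches(payload: dict, bundle_path: Path) -> bool:
    """Article 11 binding edge: does the signed documentation bundle on disk
    hash to the mch value carried in the runtime Evidence Package payload?"""
    return hashlib.sha256(bundle_path.read_bytes()).digest() == payload["mch"]
```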

Article 26 — Deployer obligations. Article 26 imposes obligations on the deployer (the entity using a high-risk AI system in operation), as distinct from the provider (the entity placing it on the market). Two obligations have direct PKI fits: 26(1) (use as instructed by the provider) and 26(6) (log retention). The first maps onto a deployer-signed deployment manifest — a CAdES-signed JSON or CBOR object asserting which provider system, version, model, and policy the deployer has activated. The second is the same Article 12 ledger from §4.2, scoped to the deployer rather than the provider. The substrate-vs-vertical framing of §1 is partly an Article 26 artefact: each deployer (“vertical”) signs their own manifest under their own credentials, riding on the provider’s substrate.

Article 27 — Fundamental Rights Impact Assessment. Article 27 requires deployers in specific Annex III contexts (public services, banking, insurance, employment-decision contexts) to conduct a FRIA before deployment. The FRIA is an artefact: a structured document asserting that the deployer has considered fundamental-rights impact under specific facts. The PKI realisation signs the FRIA under the deployer’s credentials, references the Article 26 deployer-signed deployment manifest by hash, and publishes the resulting bundle to a verifier such as h2oatlas.ee/verify/. Where the FRIA references training-data populations or model behaviour under particular conditions, those references are themselves artefact-level claims and can be hashed and re-bound. We do not claim that signing a FRIA makes the assessment correct; we claim that signing it makes the assessment audit-traceable in the operational sense Article 27 implies.

The pattern across all three corollary mappings is the same: each article asks for an artefact, and PKI provides a verifiable container for that artefact bound by hash to the runtime evidence that produced or relied on it.

(See figures/fig01-mapping-overview.svg — to be rendered via MiMo-V2-Omni from the schematic in figures/fig01-mapping-overview.txt.)

(draft v0 — architecture skeleton; quantitative numbers and per-vertical deployment data to land in v1.)

EATF / Aletheia is a Java 21 / Spring Boot 3.5 backend with a partner-integration CLI (eatf) [@aletheia_repo]. The signing pipeline is:

canonical payload → hash (SHA-256 / SHA3-256)
→ sign (RSA-PSS or ML-DSA-65 hybrid)
→ optional RFC 3161 timestamp
→ Evidence Package (.aep)
→ audit-ledger event (hash-chained)

Each .aep package is offline-verifiable: a third party with the public bundle (cert chain + TSA cert + payload metadata) can verify the signature without contacting the signing host.
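
A minimal sketch of the classical leg of this pipeline, in Python rather than the Java of the reference implementation. The cbor2 and cryptography packages stand in for the real stack, the TSA round trip and ledger append are elided, and ph is taken to be the SHA-256 of the attested content bytes:

```python
import hashlib
import uuid
from datetime import datetime, timezone

import cbor2
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

def sign_evidence(content: bytes, meta: dict, key: rsa.RSAPrivateKey) -> dict:
    ph = hashlib.sha256(content).digest()          # hash step
    pid = uuid.uuid4().bytes                       # package id (UUID v4, 16 bytes)
    ts = datetime.now(timezone.utc).isoformat()    # RFC 3339 emission time
    payload = {"ph": ph, **meta}                   # ct / mid / pol / dep fields etc.
    signature = key.sign(                          # sign step covers ph || ts || pid
        ph + ts.encode() + pid,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    # RFC 3161 timestamp and audit-ledger event omitted from the sketch
    return {"v": 1, "pid": pid, "ts": ts, "pl": payload,
            "sig": [{"alg": "RSASSA-PSS-SHA256", "val": signature}]}

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
aep = sign_evidence(b"canonical AI output bytes",
                    {"ct": "ai-output", "mid": "demo/1.0.0", "pol": "1.0.0",
                     "dep": "https://deployer.example"}, key)
wire = cbor2.dumps(aep)   # CBOR wire form of the .aep envelope
```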

The hybrid mode produces a single .aep carrying both a classical signature (RSA-PSS or ECDSA) and an ML-DSA-65 signature [@nist_fips_204]. Verifiers MAY require either signature or both, according to deployment policy. Certificate inflation is the primary operational concern; we discuss size implications and rotation cadence in §5.5.

Each evidence package binds:

  • input hash (canonical AI input)
  • output hash (canonical AI output)
  • model identifier (semver + build hash) and optionally a model-card URL
  • policy version (semver of governance policy in effect at signing time)
  • timestamp (RFC 3161 from a configured TSA)

This binding is the artefact that AI Act Articles 12 and 13 ask for in operational terms.
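
An illustrative payload instance, shaped after the CDDL schema later in this section. Every value is invented, and the emission timestamp lives in the .aep envelope rather than in the payload map:

```python
import hashlib

payload = {
    "ph":  hashlib.sha256(b"<canonical output bytes>").digest(),
    "ct":  "ai-output",
    "ih":  hashlib.sha256(b"<canonical input bytes>").digest(),   # input hash
    "oh":  hashlib.sha256(b"<canonical output bytes>").digest(),  # output hash
    "mid": "water-analyst/2.3.1+9f2c",      # model id: semver + build hash
    "pol": "1.4.0",                         # governance policy version at signing
    "dep": "https://deployer.example/h2o",  # deployer id (URI)
    "agt": "https://deployer.example/agents/water-analyst",  # optional agent id
}
```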

Draft Regulation 2025/1943 promotes serialNumber uniqueness from per-CA to TSP-wide for qualified TSPs. We analyse three architectural options [@sokolov_serial_uniqueness_2026]:

  • A — Central Serial Authority. All CAs in the TSP request serial numbers from a single highly-available authority that guarantees atomicity and audit. Conformance probability 90–95%.
  • B — Global pre-issuance registry. All issuance flows through a CertManager that consults a global registry. Conformance 75–88%, contingent on flow completeness.
  • C — Random + post-factum scan. Issue with high-entropy serials, scan periodically, remediate collisions. Conformance 35–60% — weak under a “shall be unique” reading.

We adopt option A in EATF. The discussion belongs in this paper because the same TSP-wide thinking maps directly onto AI evidence-signing CAs: if AI provenance scales to enterprise, evidence-signing CAs face the same uniqueness problem.
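
A sketch of option A’s core invariant, not a production design: HA replication and the signed issuance log are elided, and an in-memory set stands in for the durable registry:

```python
import secrets
import threading

class CentralSerialAuthority:
    """Single issuance point for every CA in the TSP: uniqueness is enforced
    at one place instead of being hoped for across CA silos."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._issued: set = set()   # stand-in for a durable, replicated registry

    def next_serial(self) -> int:
        # High-entropy serials (well above the 64 bits of CSPRNG output the
        # CA/Browser Forum baseline requires) plus an atomic registry check;
        # 120 bits keeps serials positive and under RFC 5280's 20-octet limit.
        with self._lock:
            while True:
                candidate = secrets.randbits(120)
                if candidate and candidate not in self._issued:
                    self._issued.add(candidate)
                    return candidate
```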

  • eatf init — provision a deployer key pair and registration bundle.
  • eatf doctor — self-test signing, TSA reachability, OCSP staleness.
  • eatf sign — produce .aep for a content payload.
  • eatf verify — offline verification of .aep.
  • eatf agents sync — register agents under a deployer.

Per-tenant block-size for the audit ledger and audit_event hooks on /api/sign were introduced in the late-April 2026 substrate update.

In April 2026 we observed an OCSP-responder-key-binding expiry in a production EJBCA deployment that required manual CSR renewal owing to HSM driver constraints (Utimaco virtual HSM clone limitations). The incident is informative because it shows that PKI-native trust is not free: it inherits the operational rigour of the underlying CA. Any “AI signing” platform that does not pay this cost is not delivering the legal effect that AI Act Article 12 implies [@sokolov_serial_uniqueness_2026 §incident-addendum].

For reproducibility, we describe the .aep package as a CDDL-style schema (RFC 8610). The wire format is currently CBOR; a JSON profile exists for browser verifiers but is not the canonical form. Field names are chosen for compactness over readability — the readable form is reconstructed by eatf verify --explain.

CDDL schema (RFC 8610) for the .aep envelope
aep = {
  v: uint,                    ; format version, currently 1
  pid: bstr .size 16,         ; package id (UUID v4)
  ts: tstr,                   ; emission time, RFC 3339 UTC
  pl: payload,                ; the signed assertion
  sig: signatures,            ; one or more signatures
  ? tsa: tstr,                ; RFC 3161 TSA token, base64
  ? rev: revocation,          ; OCSP / CRL state at sign time
  ? meta: { * tstr => any },  ; deployer metadata
}

payload = {
  ph: bstr .size 32,          ; SHA-256 of canonical payload bytes
  ct: tstr,                   ; content type, e.g. "model-card",
                              ;   "ai-output", "audit-event"
  ih: bstr .size 32,          ; SHA-256 of canonical input
  oh: bstr .size 32,          ; SHA-256 of canonical output
  mid: tstr,                  ; model id (semver + build hash)
  ? mch: bstr .size 32,       ; model-card hash (optional)
  pol: tstr,                  ; policy version (semver)
  dep: tstr,                  ; deployer id (URI)
  ? agt: tstr,                ; agent id (URI, optional)
}

signatures = [+ signature]

signature = {
  alg: tstr,                  ; "RSASSA-PSS-SHA256",
                              ;   "ECDSA-P256-SHA256",
                              ;   "ML-DSA-65", "Hybrid-RSA-MLDSA"
  kid: tstr,                  ; key id (URI to certificate)
  cert: bstr,                 ; signing certificate (DER)
  chain: [* bstr],            ; intermediate certs (DER)
  val: bstr,                  ; signature bytes
}

revocation = {
  ocsp: bstr,                 ; OCSP response (DER, RFC 6960),
                              ;   captured at sign time
  ? crl: bstr,                ; CRL (DER), if used
  fresh_at: tstr,             ; RFC 3339 timestamp of fetch
}

Canonical payload bytes are produced by deterministic CBOR encoding (RFC 8949 §4.2.1) of the payload map. The ph field is the SHA-256 of those bytes; signatures cover ph || ts || pid so that replays carrying a different ts or pid are distinguishable.
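
The determinism requirement is what keeps the hashes stable across implementations. The Python cbor2 library is one plausible encoder (not the reference stack; its canonical mode’s exact conformance to RFC 8949 §4.2.1 key ordering should be checked before relying on it across implementations):

```python
import cbor2

# Deterministic encoding: identical maps yield identical bytes regardless of
# insertion order, so a hash computed over them is stable.
a = cbor2.dumps({"b": 1, "a": 2}, canonical=True)
b = cbor2.dumps({"a": 2, "b": 1}, canonical=True)
assert a == b   # same canonical bytes, hence the same SHA-256
```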

Hybrid signatures appear as two entries in the signatures array under the same kid but different alg values, one classical (RSASSA-PSS-SHA256) and one post-quantum (ML-DSA-65). Verifiers configured for “either” accept on the first valid match; verifiers configured for “both” require all entries to validate. We do not combine signatures into a single composite primitive at this stage; NIST is still settling on hybrid signature formats [@nist_fips_204], and a forward-compatible “split” representation is operationally safer for now.
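
The verifier policy then reduces to an any/all choice over per-entry validation results; a sketch with illustrative names:

```python
def accept(per_entry_valid: list, policy: str) -> bool:
    """'either' accepts on any valid signature entry; 'both' requires every
    entry (classical and ML-DSA-65 alike) to validate."""
    return any(per_entry_valid) if policy == "either" else all(per_entry_valid)
```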

Evidence-id binding. The pid field is included in the signed data, so a third party cannot lift a signature off one package and re-attach it to another with a different pid. This defends against the mid-stream replacer adversary in §3.

Offline verification. eatf verify checks: (i) signature(s) validate against cert; (ii) cert chains to a configured trust anchor under EN 319 102-1; (iii) OCSP / CRL state in rev was fresh at fresh_at and is consistent with the certificate; (iv) the optional TSA token in tsa validates against the TSA’s public key and binds ph. No network access is required.
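
A sketch of check (i) for a single RSA-PSS entry, following the envelope fields from the CDDL schema above. Function names are illustrative, not the eatf internals, and steps (ii) to (iv) are stubbed because each is a substantial procedure in its own right:

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def verify_offline(aep: dict) -> None:
    """Offline checks for one RSA-PSS signature entry of an .aep envelope."""
    entry = aep["sig"][0]
    cert = x509.load_der_x509_certificate(entry["cert"])
    signed_scope = aep["pl"]["ph"] + aep["ts"].encode() + aep["pid"]  # ph || ts || pid
    # (i) signature validates against the embedded certificate
    cert.public_key().verify(
        entry["val"], signed_scope,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                    salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )   # raises InvalidSignature on mismatch
    # (ii) chain cert -> configured trust anchor under EN 319 102-1   (stub)
    # (iii) captured OCSP response fresh at rev.fresh_at (RFC 6960)   (stub)
    # (iv) TSA token in tsa binds pl.ph (RFC 3161)                    (stub)
```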

A reference verifier in Java is shipped with the EATF CLI; an independent Python verifier (aletheia-py-verify, ~700 lines) is provided so that no party need trust the EATF code base to verify. Both are exercised by a suite of cross-implementation conformance tests.

(draft v0.3 — analytic targets with justifications. Measured numbers to land in v1 alongside the public benchmark harness in aletheia-ai/bench/.)

We evaluate EATF along four dimensions: signing throughput, verification cost, vertical fit (qualitative), and conformance probability bands for the eIDAS 2 transition.

Target band. Median end-to-end latency for eatf sign (payload submission → .aep emission, including TSA round-trip) under 200 ms in classical mode, under 350 ms in hybrid PQC (RSA-PSS-SHA256 + ML-DSA-65). p99 under 750 ms in both modes.

Justification. The classical path is dominated by three costs: SHA-256 of the canonical payload (negligible for payloads under 1 MB), RSA-PSS-2048 signing (~1–2 ms on contemporary hardware with PKCS#11 HSM access via Utimaco / Thales drivers), and the RFC 3161 round trip. Public TSAs typically respond in 80–150 ms; private TSAs co-located with the signer drop that to ~10 ms. We budget the bulk of the latency to the TSA round trip and reserve the remainder for serialisation and policy-engine handoff.

The hybrid mode adds an ML-DSA-65 signature; software-only ML-DSA-65 on commodity hardware is currently in the 5–15 ms range [@nist_fips_204], with hardware acceleration improving substantially when available. Adding ML-DSA increases the signed bytes (signature size ~3.3 KB vs RSA-2048’s 256 B) but does not double the round trip; the targets above absorb the increment without doubling.

Target band. Median eatf verify (offline path) under 50 ms classical, under 80 ms hybrid. p99 under 150 ms in both modes when the OCSP / CRL state is captured in the package (the typical case) and no live network fetch is needed.

Justification. Offline verification consists of: parse CBOR envelope (sub-millisecond), validate RSA-PSS / ML-DSA signature(s) (3–10 ms each), validate certificate path under EN 319 102-1 (1–5 ms for typical 3-level chains), validate the captured OCSP response (1–3 ms), validate the optional TSA token (5–10 ms). The 50 ms target leaves comfortable headroom for batched verification.

A separate live verification mode, in which the verifier re-fetches fresh OCSP, has a different envelope dominated by network latency (80–300 ms typical). We recommend live verification only for high-stakes verifications under deployer policy; routine verifications should rely on the embedded OCSP captured at sign time.

Five partner integrations exercise the substrate:

  • TADF Auditor (building audit, shipped to production 2026-04-26 on tadf-audit.h2oatlas.ee). Evidence packages wrap the rendered DOCX audit report. Signing volume: dozens of reports per week per auditor, comfortably under any throughput concern.
  • EU AI Act Legal Advisor (mock, on eu-ai-act.h2oatlas.ee). Demonstrator that signs the agent’s regulatory-mapping output. Used as a concrete demo when explaining the substrate to non-technical audiences.
  • Water Analyst (water-quality-ee, daily P(violation) bulletin under preparation, see §7). Daily batch: ~5 model-card-bound bulletin packages plus per-site verification artefacts. Volume modest; visibility high (citizen-facing publication).
  • Medical AI (seed, dependent on a domain co-founder). The highest-stakes vertical, deliberately staged after the substrate is hardened. The integrity-vs-clinical separation argued in §3 is the load-bearing claim for this vertical’s procurement story.
  • KYC agent (backlog). Downstream of Visa CLI’s authorize; the evidence package documents the reasoning behind a transaction approval / hold. Latency-sensitive (target: under 200 ms in the authorisation path).

The qualitative finding from these integrations is that the substrate is uniform; the verticals are not. Adapting the substrate to each vertical happens at the metadata-and-policy boundary, not in the signing primitives. This is the operational evidence behind the substrate-not-product framing of §1.

For the eIDAS 2 serial-uniqueness transition (§5.4), we estimate conformance under three architectural options:

| Option | Description | Conformance |
| --- | --- | --- |
| A — Central Serial Authority | Atomicity at TSP scope; HA replication, signed audit log of issuance | 90–95% |
| B — Global pre-issuance registry | All issuance flows through a CertManager that consults the registry | 75–88% (depends on flow completeness) |
| C — Random + post-factum scan | High-entropy serials; periodic collision scan and remediation | 35–60% (weak under “shall be unique” reading) |

The bands are rough confidence intervals derived from operational experience with similar coordination problems [@sokolov_serial_uniqueness_2026]; they are not claimed as statistically valid. We adopt option A in EATF.

We make no claims about the AI outputs signed by EATF. We measure the trust layer, not the model. This is consistent with §3 (threat model) and §4 (Article 10 narrowing).

We do not evaluate against an explicit adversary (game-based threat model). The threat model in §3 is informal; a formal cryptographic analysis of the canonical-payload binding under standard models (EUF-CMA for the underlying signature schemes, with appropriate reduction to evidence-id binding) is a follow-up paper.

We position EATF against five adjacent lines of work, each of which solves a related but distinct problem.

Content-credentials (C2PA) [@c2pa_spec] address media provenance: where did this image come from, what edits were applied, who attested. The C2PA manifest is itself a signed structure carried in JPEG/PNG/MP4 metadata. The match with EATF is partial: both produce signed provenance bundles, but C2PA is media-shaped (image-level hashes, edit history) while EATF is action-shaped (AI input → AI output). The two compose: a C2PA-signed image consumed by an AI agent can be referenced from an EATF Evidence Package as the agent’s input hash, and the resulting AI output (e.g. a caption) can in turn be C2PA-signed if it is media. We do not subsume C2PA; we use it.

SLSA / in-toto (supply-chain attestation) [@slsa_spec; @in_toto]. SLSA defines integrity levels for build pipelines and uses in-toto attestations as signed metadata about how an artefact was produced. The match with EATF is structural: both bind a signature to a canonical statement about an artefact’s lineage. The difference is domain: SLSA is concerned with software builds and tamper-evident CI, EATF with AI runtime decisions. We borrow the attestation predicate pattern — a typed, schema-validated claim — and we differ in placing the attestation in CBOR rather than DSSE, and in carrying revocation state inline (SLSA defers revocation to PKI’s normal channels; EATF captures the OCSP response at sign time so verifiers do not need to re-fetch).

W3C Verifiable Credentials (VC) and Decentralized Identifiers (DID) [@w3c_vc_2_0; @w3c_did_1_1]. VC defines a JSON-LD-shaped container for signed claims about subjects; DIDs provide subject identifiers without a central registrar. The match with EATF is in the signed-claim pattern. The differences are practical: VC’s JSON-LD canonicalisation has been a recurring source of implementation hazard [@vc_jsonld_canonicalisation_paper], and VC presumes a wallet-style holder–issuer–verifier triangle that does not naturally fit AI deployers. EATF’s CBOR-based wire format and QTSP-grade certificate chain take a different operational path; we think both are reasonable, and an EATF-to-VC bridge for use cases where wallet semantics matter is a sensible follow-up.

Model Context Protocol (MCP) [@mcp_spec]. MCP defines agent identity, capability discovery, and tool invocation. The match with EATF is complementary: MCP answers “who is this agent and what can it do?”, EATF answers “what did it actually do, on what input, under what policy?”. We make this explicit in §2.5: identity, provenance, and governance are three orthogonal layers, and conflating them produces the same failure mode as conflating authentication with authorisation. An MCP-aware deployer can reference an MCP server identifier as the agt field in the EATF payload and bind the two together.

Visa CLI (action-firewall pattern) [@aletheia_vision §4.1]. An experimental payment rail in which Aletheia’s authorize precedes any payment action initiated by an AI agent. Not a substitute for PSD2/SCA/AML; a complementary governance layer. We mention it because it crystallises the substrate-vs-vertical thesis in §1: the same provenance substrate is reused for a payment-rail vertical, with the additional governance gate added at the rail.

Trustworthy AI assurance standards. ISO/IEC 23894 (AI risk management) [@iso_23894], IEEE 7000-series (ethical design), NIST AI Risk Management Framework [@nist_ai_rmf]. These are upstream of this paper as motivation: they describe what trustworthy AI should look like at a process level. They do not translate obligations into PKI-native artefacts, which is the gap this paper closes.

The “zero-trust for AI” framing. Several 2025–2026 commercial frameworks rebrand zero-trust principles as AI governance. Zero-trust is an identity-and-access discipline addressing network and session boundaries; provenance is a separate layer addressing artefact integrity over time. Conflating them produces the same failure mode as conflating authentication with authorisation: the framework can look comprehensive while leaving entire classes of post-hoc audit unanswerable.

Recent academic provenance work. Several recent IACR ePrints and S&P papers address signed AI provenance from a cryptographic angle — for instance, transcript binding for LLM auditability and zero-knowledge attestations of model evaluation. We position this paper as complementary to that line: where those works ask what cryptographic primitive can attest a model’s behaviour?, we ask what existing trust services can be reused, and what does the resulting Article-by-Article mapping look like? The full literature review is deferred to v1.

(draft v0.)

The PKI lens is sufficient for a substantial portion of the AI Act high-risk surface — Articles 10 (read narrowly), 12, and 13 — and is insufficient for Article 14 (human oversight), the broader data-governance dimension of Article 10, and runtime concerns under Article 15. We have given an Article-by-Article mapping, a reference implementation (EATF / Aletheia) framed as a Trusted AI Evidence Layer, three architectural options for eIDAS 2 serial-uniqueness with explicit conformance probability bands, and pointers to five partner integrations as evidence of substrate sufficiency.

Open problems we do not solve here:

  1. Article 10’s data-quality dimension. The PKI lens cannot extend here without a separate ML / data-governance regime. What standardisation can complement signed provenance with signed fitness claims?
  2. Article 14 human-oversight binding. Approval events are trivially signable. Meaningful oversight is not; that is HCI and policy work outside the PKI domain.
  3. Cross-jurisdictional trust-list interop for AI evidence. EU Trusted Lists exist for trust services. An analogous list of AI evidence-signing CAs does not. Should it?
  4. PQC + audit-ledger longevity. A 30-year retention policy under simultaneous classical and PQC algorithm deprecation is open.
  5. Agent-of-agent delegation. When agents act on behalf of agents on behalf of users, the responsible-authority chain in the audit trail is not yet standardised.
  6. Privacy-preserving evidence packages. Today’s .aep packages identify the deployer. Can we sign without disclosing? Selective disclosure under verifier-binding is plausible but unimplemented.

We invite engineering critique; the reference implementation is public, and every claim in this paper that touches the implementation can be checked by eatf verify against the published evidence packages.


To Bart Symons (PKI Consortium / Zetes) and Jos De Wachter for guidance on the eIDAS / ETSI dimensions; to the TalTech Vanem arendajaks course (Team 67) for hosting the EATF defence on 2026-05-05; and to the partner-integration deployments that provided real-world feedback.

All Evidence Packages discussed in this paper are verifiable offline against the public CLI at https://github.com/sapsan14/aletheia-ai. The signed source of this paper is published at h2oatlas.ee/verify/paper-pki-ai-act/<version>.