Monitoring and Telemetry
Summary
Action logging, anomaly alerts, and tamper-evident data
Applicability
| Certification Level | Status | Description |
|---|---|---|
| L1 Supervised Operational Reliability | Required | Applicable ACRs must be satisfied for L1 certification. |
| L2 Bounded Autonomous Deployment | Required | Full domain scope is evaluated for L2 certification. |
| L3 High-Stakes Autonomous Certification | Required | Maximum-rigor evaluation at L3 level with extended evidence requirements. |
Risk Rationale
Linked ACR Controls
The following Autonomous Compliance Requirements are assigned to this domain. Each ACR defines a specific, testable control with its own evaluation method, classification, and evidence requirements.
All autonomous actions SHALL be logged with action type, timestamp, input context, decision rationale, and outcome.
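Informative sketch: one way to represent the five required fields as a structured record, serialized to one JSON object per line. The class name `ActionLogRecord`, the helper names, and the field names are assumptions made for illustration; the requirement itself does not prescribe a schema or an encoding.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ActionLogRecord:
    # Field names are hypothetical; they mirror the five items the requirement lists.
    action_type: str
    timestamp: str            # ISO 8601 string, UTC
    input_context: dict
    decision_rationale: str
    outcome: str


def make_record(action_type, input_context, decision_rationale, outcome, now=None):
    """Build a record, stamping it with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return ActionLogRecord(
        action_type, now.isoformat(), input_context, decision_rationale, outcome
    )


def to_json_line(record):
    """Serialize with sorted keys so every log line has the same field order."""
    return json.dumps(asdict(record), sort_keys=True)
```

A record built this way can be appended to a file or shipped to a collector as a single line, which keeps later parsing and correlation straightforward.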
Decision boundary logs SHALL record the factors that determined why one action was chosen over alternatives.
The system SHALL provide replay capability allowing operators and auditors to reproduce past behavior from logged data.
Real-time anomaly alerts SHALL be generated for boundary violations, unusual patterns, and operational anomalies.
Telemetry logs SHALL be tamper-evident, with any modification detectable through integrity verification.
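Informative sketch: a hash chain is one common way to make a log tamper-evident. Each entry stores the SHA-256 digest of its own payload combined with the previous entry's digest, so changing any earlier entry invalidates every digest after it. The function names and the entry layout below are assumptions, not a mechanism this standard mandates.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash, payload):
    # Canonical JSON (sorted keys, no extra whitespace) keeps the digest stable.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload, linking it to the digest of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev_hash": prev_hash, "hash": _digest(prev_hash, payload)}
    )


def verify_chain(chain):
    """Return the index of the first inconsistent entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["payload"]):
            return i
        prev_hash = entry["hash"]
    return None
```

A chain alone only detects edits made without recomputing later digests. An actor able to rewrite the whole log could rebuild it, so a deployment would typically also sign the latest digest or anchor it in a separate system.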
Structured logging with consistent schema SHALL enable automated analysis and cross-system correlation.
Telemetry data retention policies SHALL be defined and enforced proportional to the system's risk classification.
Dashboard and reporting capabilities SHALL be provided for operational monitoring and compliance verification.
Telemetry infrastructure SHALL NOT itself become a single point of failure for the autonomous system.
Telemetry access controls SHALL prevent unauthorized viewing, modification, or deletion of operational logs.
Log collection latency SHALL NOT exceed defined maximum delay from event occurrence to log availability.
The system SHALL implement log rotation and archival without loss of data or query capability.
Telemetry SHALL capture sufficient context to reconstruct the system's state at any point in time within the retention window.
The system SHALL implement health and heartbeat monitoring with configurable alerting for missed check-ins.
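Informative sketch: a minimal heartbeat monitor with a configurable check-in interval and a configurable number of missed check-ins before a component is reported. The class `HeartbeatMonitor` and its parameters are hypothetical. Times are plain seconds supplied by the caller (for example from `time.monotonic()`), which keeps the logic easy to test.

```python
class HeartbeatMonitor:
    def __init__(self, interval_s, missed_threshold):
        self.interval_s = interval_s              # expected seconds between check-ins
        self.missed_threshold = missed_threshold  # missed check-ins before alerting
        self.last_seen = {}                       # component name -> last check-in time

    def check_in(self, component, now):
        """Record a heartbeat from `component` at time `now` (seconds)."""
        self.last_seen[component] = now

    def missed_count(self, component, now):
        """Number of whole intervals elapsed since the last check-in."""
        elapsed = now - self.last_seen[component]
        return int(elapsed // self.interval_s)

    def overdue(self, now):
        """Components that have missed at least `missed_threshold` check-ins."""
        return sorted(
            c for c in self.last_seen
            if self.missed_count(c, now) >= self.missed_threshold
        )
```

For example, with `interval_s=10` and `missed_threshold=3`, a component last seen at second 0 is reported by `overdue(31)`, while one last seen at second 25 is not.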
Telemetry data SHALL be exportable in standardized formats for third-party analysis.
Cross-organization telemetry correlation SHALL be supported for multi-party autonomous operations.
Telemetry SHALL include performance metrics sufficient for SLA monitoring and compliance.
Alert fatigue mitigation SHALL be implemented through intelligent alert grouping, deduplication, and escalation.
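Informative sketch: one simple approach to grouping, deduplication, and escalation. Alerts sharing a grouping key are collapsed within a time window: the first occurrence notifies, repeats are suppressed, and reaching a repeat count triggers a single escalation. The class name, the key, the window length, and the escalation count are all assumptions chosen for illustration.

```python
class AlertDeduplicator:
    def __init__(self, window_s, escalate_after):
        self.window_s = window_s              # seconds during which repeats are grouped
        self.escalate_after = escalate_after  # occurrence count that triggers escalation
        self._groups = {}                     # key -> [window_start, count]

    def submit(self, key, now):
        """Return 'notify', 'suppress', or 'escalate' for an alert with this key."""
        group = self._groups.get(key)
        if group is None or now - group[0] >= self.window_s:
            # First alert for this key, or the previous window has expired.
            self._groups[key] = [now, 1]
            return "notify"
        group[1] += 1
        if group[1] == self.escalate_after:
            return "escalate"
        return "suppress"
```

With `window_s=60` and `escalate_after=3`, three alerts for the same key at seconds 0, 10, and 20 yield `notify`, `suppress`, and `escalate`; a fourth at second 30 is suppressed, and one at second 61 starts a new window.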
The system SHALL implement distributed tracing for operations spanning multiple services or components.
Telemetry collection SHALL be resilient to network partitions with buffering and eventual delivery guarantees.
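Informative sketch: buffering with at-least-once delivery. Events are queued locally and removed only after the transport acknowledges them, so events emitted during a partition are delivered in order once connectivity returns. The class `BufferedSender` and the `send` callable (assumed to return `True` on acknowledgement) are hypothetical. The buffer here is unbounded and held in memory, which is an assumption for brevity; a real deployment would likely need durable, bounded storage.

```python
from collections import deque


class BufferedSender:
    def __init__(self, send):
        self._send = send        # callable(event) -> True when the collector acknowledges
        self._buffer = deque()   # events not yet acknowledged, oldest first

    def emit(self, event):
        """Queue the event, then try to drain the buffer."""
        self._buffer.append(event)
        self.flush()

    def flush(self):
        """Send buffered events in order; stop at the first failure."""
        while self._buffer:
            try:
                ok = self._send(self._buffer[0])
            except OSError:
                ok = False       # treat network errors as "not acknowledged"
            if not ok:
                return False
            self._buffer.popleft()  # drop only after acknowledgement
        return True

    def pending(self):
        return len(self._buffer)
```

Because an event is dropped from the buffer only after acknowledgement, a lost acknowledgement can produce a duplicate delivery, so the receiving side would need to tolerate or deduplicate repeats.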
The system SHALL provide configurable telemetry verbosity levels appropriate to operational context.
Telemetry data SHALL be time-synchronized across all system components with documented maximum skew tolerance.
The system SHALL log all configuration changes with the previous value, new value, change source, and timestamp.
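Informative sketch: a configuration store that records the four required items on every change. The class `AuditedConfig` and the dictionary keys are assumptions; the in-memory `change_log` list stands in for whatever log sink a real system would use.

```python
from datetime import datetime, timezone


class AuditedConfig:
    def __init__(self, initial=None):
        self._values = dict(initial or {})
        self.change_log = []  # stand-in for a real log sink

    def set(self, key, new_value, source, now=None):
        """Apply a change and record previous value, new value, source, and timestamp."""
        now = now or datetime.now(timezone.utc)
        self.change_log.append({
            "key": key,
            "previous_value": self._values.get(key),  # None if the key was unset
            "new_value": new_value,
            "change_source": source,
            "timestamp": now.isoformat(),
        })
        self._values[key] = new_value

    def get(self, key):
        return self._values[key]
```

Routing every write through a single `set` method is the design choice that matters here: there is no code path that changes a value without leaving a log entry.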
Telemetry SHALL support automated compliance checking against ACR requirements.
The system SHALL implement anomaly detection on telemetry streams to identify unusual behavioral patterns.
Telemetry schema evolution SHALL be backward-compatible to maintain historical query capability.
Systems certified at Assurance Class B (Monitored) SHALL maintain an active CAPO connection and deliver telemetry batches at minimum monthly frequency. A gap of two consecutive months without telemetry batch delivery SHALL be automatically flagged by the CAPO as an Assurance Lapse condition.
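Informative sketch of the Class B gap rule, under the assumption that "month" means calendar month and that delivery is tracked as a set of `(year, month)` pairs. The function names are hypothetical, and how the CAPO actually counts months is not stated in this section.

```python
def month_index(year, month):
    """Map a calendar month to a consecutive integer so months can be iterated."""
    return year * 12 + (month - 1)


def has_class_b_lapse(delivered_months, first, last):
    """True if any two consecutive calendar months in [first, last] lack a batch.

    `delivered_months` is an iterable of (year, month) pairs with at least one
    telemetry batch delivered; `first` and `last` are (year, month) pairs.
    """
    delivered = {month_index(y, m) for y, m in delivered_months}
    missing_run = 0
    for idx in range(month_index(*first), month_index(*last) + 1):
        missing_run = 0 if idx in delivered else missing_run + 1
        if missing_run >= 2:
            return True
    return False
```

Under this reading, a system that delivers in January and March but not February or April is not flagged, while one that delivers in January and then misses February and March is.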
Systems certified at Assurance Class C (Continuously Assured) SHALL maintain a persistent, real-time CAPO telemetry connection. Any telemetry gap exceeding 24 hours (excluding documented scheduled maintenance windows of up to 4 hours, maximum twice per month) SHALL be automatically flagged by the CAPO as an Assurance Lapse condition triggering the 72-hour remediation window.
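Informative sketch of the Class C gap rule. It rests on one interpretation that the text does not spell out: time overlapping a documented maintenance window is subtracted from the gap, each window counts for at most 4 hours, and only the first two windows starting in a calendar month are eligible. Windows are assumed not to overlap each other. All function and constant names are hypothetical.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(hours=24)
MAX_WINDOW = timedelta(hours=4)
MAX_WINDOWS_PER_MONTH = 2


def eligible_windows(windows):
    """Keep at most two windows per calendar month, each truncated to four hours."""
    kept, per_month = [], {}
    for start, end in sorted(windows):
        month = (start.year, start.month)
        if per_month.get(month, 0) >= MAX_WINDOWS_PER_MONTH:
            continue
        per_month[month] = per_month.get(month, 0) + 1
        kept.append((start, min(end, start + MAX_WINDOW)))
    return kept


def is_assurance_lapse(gap_start, gap_end, windows):
    """True if the gap, minus eligible maintenance time, exceeds 24 hours."""
    excluded = timedelta(0)
    for start, end in eligible_windows(windows):
        overlap = min(end, gap_end) - max(start, gap_start)
        if overlap > timedelta(0):
            excluded += overlap
    return (gap_end - gap_start) - excluded > MAX_GAP
```

Under this interpretation, a 26-hour gap containing one documented 4-hour window is not flagged (22 hours remain), while a 30-hour gap containing a single documented 10-hour window is, because only 4 hours of that window are excluded.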
The CAPO SHALL deliver SLA-bound alerting for Assurance Class C systems: Critical-severity compliance events SHALL generate alerts within 5 minutes of detection; Emergency-severity events SHALL generate alerts within 60 seconds. Failure to meet these SLAs for material events constitutes an Assurance Lapse condition.
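Informative sketch: checking a single alert against the two deadlines stated above. The severity keys and the function name are assumptions, and "within" is read as inclusive of the deadline.

```python
from datetime import datetime, timedelta

# Deadlines from the requirement text, measured from detection to alert generation.
ALERT_SLA = {
    "critical": timedelta(minutes=5),
    "emergency": timedelta(seconds=60),
}


def alert_met_sla(severity, detected_at, alerted_at):
    """True if the alert was generated within the deadline for its severity."""
    return alerted_at - detected_at <= ALERT_SLA[severity]
```

For example, an Emergency-severity event detected at 12:00:00 and alerted at 12:01:01 misses the 60-second deadline, whereas a Critical-severity event with the same timestamps is well inside its 5-minute deadline.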
For Platform Certifications, the vendor SHALL document in the Reference Environment Specification whether the ARA Behavioral Telemetry SDK is deployed in the reference environment, and if not, the alternative telemetry mechanism used to support ACRs with Continuous or Quarterly evaluation frequency. The AVB SHALL assess whether the documented telemetry mechanism provides evaluation-equivalent coverage.