Axiado’s Secure AI Copilot Overview
Axiado Corporation, a leading AI-driven, hardware-anchored platform security solutions company, provides the Secure AI Copilot, a system designed to make the management plane more efficient and effective for administrators and users. It provides recommendations, automates tasks, and enhances decision-making through a suite of sub-copilots, or modules. The functional domains covered by the copilot are depicted in Figure-1.
Figure-1: Copilot Functional Domains
Here is a brief summary of some of the copilot’s key capabilities:
- Boot Time Vulnerability Detection: This module detects attacks during boot-time power sequencing using a log-based AI-ML method, enhancing the security of the server control plane and BMC.
- Runtime Security and Anomaly Detection and Reporting: This module detects and reports attacks on the overall BMC ecosystem, including Redfish attacks, supply chain attacks, and network connectivity issues. It generates insightful LLM-based reports that administrators can easily understand.
- Vulnerability Discovery and Mitigation in Binaries: This module uses custom AI-ML models together with CVE or third-party vulnerability databases to discover vulnerabilities in binaries, an approach often used for zero-day vulnerability discovery. It also offers lifecycle management support for binaries.
- Firmware Vulnerability Lifecycle Management: Provides end-to-end firmware lifecycle management for all platform components (BIOS, FPGA, CPLD, etc.) of a server.
- Server Anomaly Diagnostics: Analyzes server failures and provides actionable insights to facilitate quick resolutions.
- Operational Awareness: Monitors system status and predicts anomalies before they occur, enhancing proactive maintenance and management.
- Dynamic Thermal Management (DTM): Axiado’s DTM system increases energy efficiency and reduces costs and environmental impact by using real-time data and predictive AI for optimal server fan control, outperforming traditional proportional integral derivative (PID) controllers.
- LLM-assistant UI: This AI-driven interface facilitates easy interaction with the BMC for security monitoring and administration tasks. The LLM provides easily understandable summaries and insights on complex issues, aiding in root cause analysis. (Currently in development.)
Axiado Platform Capabilities
Axiado’s single-chip SoC, the Trusted Control/Compute Unit (TCU), hosts the copilot modules outlined above. As depicted in Figure 2, the TCU features a quad-core application CPU subsystem designed for running critical applications such as the BMC, four programmable AI engines to accelerate machine learning models, and a Secure-NIC for enhanced network access and security. It also includes a secure vault with a Trusted Platform Module (TPM) for cryptographic key storage and platform integrity verification, as well as a hardware Root of Trust (HRoT) for secure platform booting. For more detailed information, refer to the Axiado AX3000-2000 Product Brief.
Figure 2: Axiado’s TCU is the world’s first purpose-built, fully integrated AI-driven hardware security platform.
Security Capabilities of the AI-ML Copilot
Detection and Mitigation Against Boot Time Power Sequencing Attacks: At boot time, a BMC is responsible for power sequencing other components in a server system. For instance, according to NVIDIA’s MGX server architecture specification, “The BMC is responsible for enabling standby power to the server board (FPGA/CPU/GPU complex) and bringing the FPGA out of reset. After the FPGA is enabled, the BMC can, upon a user command or configuration, enable Run Power, initiating the CPU Boot.”
In this complex multistep handshake among the control plane components, an interruption caused by a security attack could prevent the CPU from booting, resulting in a denial of service (DoS). Attacks could stem from compromised communication between control plane components (e.g., the FPGA) and the BMC, or from direct attacks on the BMC. Our TCU-resident ML model detects any alteration in this power sequence from the BMC logs, sends alerts, and produces detailed reports that pinpoint the precise root cause, such as an incorrectly set GPIO pin, with comprehensive contextual explanations generated by an LLM. This copilot also offers configurable policies for implementing necessary mitigations, such as auto-restarts.
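To make the idea of checking the power sequence against BMC logs concrete, here is a minimal Python sketch. It is a plain rule-based comparison, not Axiado’s ML model, and the log line format, the event names, and the function find_sequence_deviation are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical expected order of power-sequencing events (illustrative names only).
EXPECTED_SEQUENCE = [
    "STANDBY_POWER_ON",
    "FPGA_RESET_RELEASED",
    "RUN_POWER_ON",
    "CPU_BOOT_START",
]

# Hypothetical log line format: "<timestamp> PWRSEQ <EVENT>"
LOG_PATTERN = re.compile(r"^\S+\s+PWRSEQ\s+(\w+)$")


def extract_events(log_lines):
    """Return the power-sequencing event names found in the log, in order."""
    events = []
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            events.append(match.group(1))
    return events


def find_sequence_deviation(log_lines, expected=EXPECTED_SEQUENCE):
    """Return None if the observed events match the expected order,
    otherwise a dict describing the first step that differs."""
    observed = extract_events(log_lines)
    for index, expected_event in enumerate(expected):
        actual = observed[index] if index < len(observed) else None
        if actual != expected_event:
            return {"step": index, "expected": expected_event, "observed": actual}
    if len(observed) > len(expected):
        # Extra, unexpected events after the sequence should have finished.
        return {"step": len(expected), "expected": None,
                "observed": observed[len(expected)]}
    return None


good_log = [
    "00:00:01 PWRSEQ STANDBY_POWER_ON",
    "00:00:02 PWRSEQ FPGA_RESET_RELEASED",
    "00:00:03 PWRSEQ RUN_POWER_ON",
    "00:00:04 PWRSEQ CPU_BOOT_START",
]
# Run power is enabled before the FPGA is out of reset.
bad_log = good_log[:1] + ["00:00:02 PWRSEQ RUN_POWER_ON"]

print(find_sequence_deviation(good_log))
print(find_sequence_deviation(bad_log))
```

A learned model would presumably tolerate benign timing and ordering variation that a fixed list like this cannot, which is one reason to treat this only as a baseline for intuition.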
Runtime Security and Anomaly Detection: Numerous past BMC attacks documented in the CVE database (e.g., CVE-2022-40259 – Arbitrary Code Execution via Redfish API, CVE-2023-34329 – Authentication Bypass via HTTP Header Spoofing), together with insights from security researchers, show that BMC systems can be compromised through vulnerabilities related to Redfish (e.g., security misconfigurations) or through related supply chain attacks. Beyond Redfish vulnerabilities, attackers could also uncover and exploit other weaknesses within the BMC ecosystem if it is not thoroughly hardened.
Configurable policies allow BMC administrators to select which security events to detect, report, and defend against, such as exceeding the maximum number of failed authentication attempts over the Redfish API or SSH to BMC, which could trigger significant security alerts. Moreover, the copilot can detect complex attacks, such as attempts to maliciously inject and persist attack scripts, triggering security-sensitive operations. Some of the copilot’s attack detection strategies are based on patterns described in the MITRE ATT&CK® framework.
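As an illustration of the failed-authentication policy described above, the following Python sketch counts failures per source in a sliding time window. The class name FailedAuthPolicy, its parameters, and the sample IP address are assumptions made for this example; they are not part of any Axiado or Redfish API.

```python
from collections import defaultdict, deque


class FailedAuthPolicy:
    """Flag a source that exceeds max_failures within window_seconds."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # source -> timestamps (in seconds) of its recent failed attempts
        self._failures = defaultdict(deque)

    def record_failure(self, source, timestamp):
        """Record one failed attempt; return True if the policy is violated."""
        recent = self._failures[source]
        recent.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures


policy = FailedAuthPolicy(max_failures=3, window_seconds=30)
# Four quick failures from one source, then a single failure much later.
alerts = [policy.record_failure("10.0.0.7", t) for t in (0, 5, 10, 15, 100)]
print(alerts)
```

Keeping a separate window per source means one noisy client does not mask, or get blamed for, failures from another. Detecting the more complex attacks mentioned above would need richer signals than a counter.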
Firmware and Binaries Vulnerability Discovery and Mitigation: Complex systems like NVIDIA MGX have several firmware-updatable components. Axiado’s firmware vulnerability management copilot not only detects vulnerabilities but also provides clear pathways for mitigation through updates and active lifecycle management, helping firmware remain secure against potential exploits.
Upon the arrival of a new firmware image, this module generates a Software Bill of Materials (SBOM), if initial firmware information is available, and binary code vectors using an Edge Language Model. For vulnerability detection, it uses the SBOM together with a Retrieval Augmented Generation (RAG) mechanism that uses the binary code vectors to identify potential vulnerabilities. This AI-ML model leverages CVE or third-party vulnerability databases to discover vulnerabilities in binaries. If vulnerabilities are found, the copilot recommends firmware updates to patch the identified issues, which can be applied either manually or automatically. Figure-3 depicts the overall flow of detection and mitigation.
Figure-3: Vulnerability detection and management flow
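The retrieval step of a vector-based lookup like the one described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vectors are hand-written stand-ins for embeddings that a language model would produce, the index labels are placeholders rather than real CVE entries, and retrieve_candidates is a hypothetical helper name. It covers nearest-neighbor retrieval by cosine similarity only, not SBOM generation or the LLM stage.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(query_vector, index, top_k=2, min_score=0.8):
    """Return up to top_k (label, score) pairs whose similarity is at least min_score."""
    scored = [(cosine_similarity(query_vector, vec), label)
              for label, vec in index.items()]
    scored = [item for item in scored if item[0] >= min_score]
    scored.sort(reverse=True)  # highest similarity first
    return [(label, round(score, 3)) for score, label in scored[:top_k]]


# Toy index: placeholder labels mapped to made-up vectors (not real CVE data).
known_vulnerable = {
    "EXAMPLE-VULN-A": [0.9, 0.1, 0.0],
    "EXAMPLE-VULN-B": [0.0, 1.0, 0.0],
    "EXAMPLE-VULN-C": [0.1, 0.0, 0.9],
}

# Made-up vector standing in for a function extracted from a new firmware image.
query = [1.0, 0.1, 0.0]
print(retrieve_candidates(query, known_vulnerable))
```

In a RAG pipeline, the retrieved candidates would then be passed to the language model as context; the similarity threshold trades missed matches against false positives and would need tuning on real data.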
For more complex images, such as BMC images and OS kernel images, the large number of possible configurations and modules makes vulnerability detection more challenging. The security copilot leverages the SBOM and an LLM-driven RAG mechanism to identify vulnerabilities in kernel images under different configurations. Specifically, a study conducted in collaboration between UC Riverside and UC Berkeley (Wei Song et al. [1]) shows that kernel object restructuring causes a lack of robustness in existing rule-based methods. The AI-ML-driven security copilot, however, is not affected by this limitation.
Boot time configuration validation and DICE validation: The copilot validates all BMC configurations and the settings of other server board components against an administrator-defined policy. It alerts on and reports any deviations, using contextual data generated by an LLM. Such reports are extremely valuable for security teams and server administrators.
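Validating settings against an administrator-defined policy can be pictured with the Python sketch below. The setting names, the policy values, and the function find_policy_deviations are hypothetical, and the sketch covers only the comparison step, not DICE validation or LLM-generated context.

```python
def find_policy_deviations(policy, observed):
    """Compare observed settings with the policy and return the deviations,
    sorted by setting name. Settings absent from `observed` are reported
    with the placeholder value "<missing>"."""
    deviations = []
    for key, required in policy.items():
        actual = observed.get(key, "<missing>")
        if actual != required:
            deviations.append(
                {"setting": key, "required": required, "observed": actual}
            )
    return sorted(deviations, key=lambda d: d["setting"])


# Hypothetical administrator-defined policy.
policy = {
    "secure_boot": True,
    "ssh_enabled": False,
    "redfish_tls_min_version": "1.2",
}

# Hypothetical settings read back at boot time.
observed = {"secure_boot": True, "ssh_enabled": True}

for deviation in find_policy_deviations(policy, observed):
    print(deviation)
```

Reporting both the required and the observed value for each deviation gives a downstream report generator enough context to explain what changed.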
Anomaly Detection and Reporting: The BMC not only drives the power-on and boot sequences but also has extensive access to control and data plane components such as sensors, the CPLD, and system BIOS settings. The copilot combines the BMC’s deep knowledge of the underlying platform with ML to detect anomalies beyond what NVIDIA’s health monitoring engine, NVSM, provides, and it produces easy-to-understand reports using an LLM, with full context based on logs and/or real-time metrics.
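For intuition about anomaly detection on real-time metrics, here is a deliberately simple Python sketch that flags sensor readings far from a recent baseline using a z-score. This is a basic statistical stand-in, not the copilot’s ML model; the fan-speed values, window size, and threshold are made up for the example.

```python
import statistics


def find_anomalies(readings, baseline_size=5, threshold=3.0):
    """Return (index, value) pairs for readings whose z-score, measured against
    the preceding `baseline_size` readings, exceeds `threshold` in magnitude."""
    anomalies = []
    for i in range(baseline_size, len(readings)):
        window = readings[i - baseline_size:i]
        mean = statistics.mean(window)
        stdev = statistics.pstdev(window)
        if stdev == 0:
            continue  # a flat baseline gives no scale to judge against
        z_score = (readings[i] - mean) / stdev
        if abs(z_score) > threshold:
            anomalies.append((i, readings[i]))
    return anomalies


# Made-up fan speed samples in RPM with one obvious spike.
fan_rpm = [4000, 4010, 3990, 4005, 3995, 4002, 6500, 4001]
print(find_anomalies(fan_rpm))
```

One known weakness of this approach is that a spike inflates the baseline for the readings that follow it, so closely spaced anomalies can hide one another.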
Ransomware Detection Using ML: The ransomware copilot offers an easy-to-use sandbox portal where users can upload and run suspicious binaries. It detects the ransomware type and generates a detailed report on the ransomware’s signature and behavior as TTPs (Tactics, Techniques, and Procedures). It uses Xen VMI for agentless acquisition of data on the ransomware’s behavior and is integrated with leading Gen AI models and services (e.g., OpenAI, Google Vertex AI). The copilot maintains a large dataset of ransomware samples, which is periodically updated with data collected from open sources.
As shown in Fig-4 and Fig-5, after analyzing a binary, the copilot reports the ransomware’s identity, the applications it launched, memory forensics, accessed registry entries, and more.
Check out this video for a demo of uploading a suspicious file and of the analysis report produced by the ransomware copilot: Use-Case 5: Crowd-Source Ransomware Hunting.
Fig-4: Reported forensics
Fig-5: Post analysis report
Fig-6: Ransomware copilot architecture
_______________________
[1] DeepMem: Learning Graph Neural Network Models for Fast and Robust Memory Forensic Analysis
Availability
Axiado’s AX3000 and AX2000 TCUs, as well as the OCP DC-SCM 2.0-compliant Axiado SCM3002 and Axiado SCM3003, are available now for purchase. Please contact Axiado for samples and pricing.