Securing Your Data Center Servers at the Component Level

As the operator of a large server fleet, your responsibility is to ensure the infrastructure running business-critical application workloads is secure and available. To this end, there are a number of security control frameworks and best practices that you can follow, such as the NIST Cybersecurity Framework (CSF). In this blog post, we’ll introduce an aspect of data center security that few are thinking about—except for attackers—along with how Eclypsium can help you close this gap simply.

What’s Inside Your Server Fleet?

Each of your servers is actually a constellation of hardware and firmware components from various manufacturers. A typical data center server has up to 30 different components that have some sort of updatable microcode or firmware. 

Just a few of the major components that go into a typical data center server.

The U.S. Government Accountability Office (GAO) noted that a major OEM had 65 direct suppliers and more than 200 second-tier suppliers. Altogether, the components that went into that OEM’s computer products were manufactured in factories located in 39 countries! The point here is that the supply chain for your servers and other IT infrastructure is incredibly complex.

Each of these components is vulnerable to attack, and the attack surface of each one grows as functionality is added. For example, CPU microcode today is vastly larger and more complex than it was before multicore processing became common. That complexity inevitably leads to vulnerabilities in the handling of speculative execution, such as Spectre, Meltdown, Inception, and a long list of variants.

A related eye-opening factoid: The baseboard management controller (BMC) included in servers for lights-out management in data centers is often based on some type of embedded Linux operating system, and so will often contain open-source libraries like OpenSSH or the xz utils library that was targeted by nation-state attackers as a supply chain vector. This includes OpenBMC, iDRAC, iLO, and other BMCs. 

Attacking these server components does not require physical access. Many exploits can be executed over the local network, or even from the internet if management interfaces are not properly segmented. This is a serious threat to both traditional data centers and new AI infrastructure. For example, in January NVIDIA disclosed CVE-2023-31029 and CVE-2023-31030, two vulnerabilities with a CVSS score of 9.3 in the NVIDIA DGX A100 baseboard management controller that an unauthenticated attacker could use for arbitrary code execution, denial of service, information disclosure, and data tampering.
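One quick way to sanity-check segmentation is to test, from a network segment that should not have management access, whether a BMC's TCP services answer at all. The following is a minimal sketch under stated assumptions: the port list is illustrative rather than vendor-specific, the function name `reachable_tcp_ports` is hypothetical, and IPMI over UDP port 623 is not covered because this check only opens TCP connections.

```python
import socket

# Ports often exposed by BMC management interfaces. This list is an
# assumption for illustration; consult your vendor's documentation.
BMC_TCP_PORTS = {22: "ssh", 80: "http", 443: "https", 5900: "kvm/vnc"}


def reachable_tcp_ports(host, ports, timeout=1.0):
    """Return the sorted subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        try:
            # create_connection completes the TCP handshake or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            # Refused, filtered, or timed out: treat as not reachable.
            pass
    return sorted(open_ports)
```

Run from a user or application VLAN against a BMC address (for example, `reachable_tcp_ports("192.0.2.10", BMC_TCP_PORTS)`, where the address is a placeholder), a non-empty result suggests the management interface is reachable from somewhere it probably should not be. Only scan systems you are authorized to test.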

What’s the Worst That Could Happen?

An information sheet released by the U.S. NSA and CISA last year explains the risk of firmware threats that evade OS-level security controls. In a section titled “Malicious actors target overlooked firmware,” the paper says, “A vulnerable BMC broadens the attack vector by providing malicious actors the opportunity to employ tactics such as establishing a beachhead with preboot execution potential. Additionally, a malicious actor could disable security solutions such as the trusted platform module (TPM) or UEFI secure boot, manipulate data on any attached storage media, or propagate implants or disruptive instructions across a network infrastructure.” In other words, compromised BMCs help attackers establish persistence and spread laterally throughout the network.

The problem is not easily solved. The NSA and CISA information sheet continues: “Traditional tools and security features including endpoint detection and response (EDR) software, intrusion detection/prevention systems (IDS/IPS), anti-malware suites, kernel security enhancements, virtualization capabilities, and TPM attestation are ineffective at mitigating a compromised BMC.”

The U.S. National Security Agency and CISA warn server operators to secure baseboard management controllers.

Data Center Component-Level Security Controls

What security controls address this risk? The CIS Critical Security Controls are a tremendous and well-respected resource for helping to prioritize security controls. The top two, in order of importance, are “Inventory and Control of Enterprise Assets” and “Inventory and Control of Software Assets.” The components inside your servers fall under both of these primary controls, but how many organizations actually include components in their asset inventory and management processes? This is a serious gap in most data center operations.

The top two CIS Security Controls are for hardware asset management (left) and software asset management (right) but most data center operators have very little visibility into the hardware and firmware components of their servers.
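To make component-level inventory concrete, here is a minimal sketch of what tracking components per server might look like, and how it answers the question "which servers carry a given component on a known-bad firmware version?" All names here (`Component`, `affected_servers`, the hostnames, vendors, and version strings) are hypothetical and for illustration only; this is not a description of any particular product's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    kind: str      # e.g. "ssd", "bmc", "tpm", "nic"
    vendor: str
    model: str
    firmware: str  # firmware version string as reported by the device


def affected_servers(inventory, kind, model, bad_versions):
    """Return sorted hostnames that have a matching component on a bad firmware version."""
    hits = set()
    for hostname, components in inventory.items():
        for c in components:
            if c.kind == kind and c.model == model and c.firmware in bad_versions:
                hits.add(hostname)
    return sorted(hits)


# Hypothetical fleet data, keyed by hostname.
inventory = {
    "db-01": [
        Component("ssd", "ExampleCo", "EX-1000", "1.0.2"),
        Component("bmc", "ExampleCo", "BMC-A", "2.4"),
    ],
    "db-02": [Component("ssd", "ExampleCo", "EX-1000", "1.0.5")],
    "web-01": [Component("ssd", "OtherCo", "OT-200", "1.0.2")],
}

print(affected_servers(inventory, "ssd", "EX-1000", {"1.0.2"}))  # prints ['db-01']
```

Without a component-level inventory like this, the same question has to be answered by logging in to servers one at a time.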

The Eclypsium supply chain security platform fills the gap in data center cybersecurity programs, providing not only component-level asset management, but also vulnerability management, compliance monitoring, and threat detection. With one solution, you have the following component-level security capabilities in place:

  • Inventory (know what is in your environment) – You will be able to track which assets have which components. This is a prerequisite for the following two security controls, and it also helps when responding to supply chain incidents, such as an SSD firmware bug that renders the disk inoperable or a TPM chip that generates insecure private keys.
  • Harden (identify and fix risk) – Eclypsium not only identifies vulnerabilities, but also insecure configurations such as Intel ME left in manufacturing mode or servers where Secure Boot is not enabled. Eclypsium also helps to monitor compliance with NIST 800-53 and other standards. Finally, you can validate update binaries and schedule firmware updates through the Eclypsium sensor.
  • Detect (detect backdoors and implants) – Threats at the firmware level subvert OS-level security controls, which leaves a huge gap in most organizations’ detection strategies. Eclypsium’s Automata binary analysis system replicates the tools and techniques of security researchers to continuously monitor your server fleets for backdoors, implants, and other evasive threats. You can set baselines for devices and also send any alerts to a SIEM, SOAR, or other security tooling.

Eclypsium makes it easy to monitor and report on compliance with standards such as NIST 800-53 rev5 across your server fleets.
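As an example of the kind of configuration check the Harden step refers to, the sketch below reads the UEFI `SecureBoot` variable on a Linux host. It is a generic, single-host illustration and not how Eclypsium performs the check. It relies on the Linux efivarfs layout, in which each variable file begins with 4 attribute bytes followed by the variable's data; the function names are hypothetical.

```python
from pathlib import Path

# The SecureBoot variable lives under the EFI global variable GUID.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def parse_secure_boot(raw: bytes) -> bool:
    """Interpret efivarfs contents: 4 attribute bytes, then a 1-byte value (1 = enabled)."""
    if len(raw) < 5:
        raise ValueError("unexpected SecureBoot variable length")
    return raw[4] == 1


def secure_boot_enabled(path: Path = SECURE_BOOT_VAR):
    """Return True or False, or None when the variable cannot be read
    (for example, on a legacy BIOS boot or without sufficient permissions)."""
    try:
        return parse_secure_boot(path.read_bytes())
    except OSError:
        return None
```

A result of `None` deserves attention as well, since it usually means the host did not boot through UEFI at all. Doing this across a fleet, and for the many settings that have no OS-visible file, is where dedicated tooling comes in.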

Component-level threats are serious and the risk needs to be mitigated. Traditional OS-level security tools cannot address these types of threats. Eclypsium’s supply chain security platform provides the visibility and security mechanisms necessary to protect your data center’s soft underbelly. 

If you’re ready to chat, we’d love to show you a demo.
