Kubernetes Security is constantly evolving – keeping pace with enhanced functionality, usability and flexibility while also balancing the security needs of a wide and diverse set of use-cases.

Recently, the GKE Security team discovered a high severity vulnerability that allowed workloads to have access to parts of the host filesystem outside the mounted volumes boundaries. Although the vulnerability was patched back in September we thought it would be beneficial to write up a more in-depth analysis of the issue to share with the community.

We assessed the impact of the vulnerability as described in vulnerability management in open-source Kubernetes and worked closely with the GKE Storage team and the Kubernetes Security Response Committee to find a fix. In this post we’ll give some background on how the subpath storage system works, an overview of the vulnerability, the steps to find the root cause and the fix, and finally some recommendations for GKE and Anthos users.

Kubernetes Filesystems: Intro to Volume Subpath
The vulnerability, CVE-2021-25741, was caused by a race condition during the creation of a subpath bind mount inside a container, and allowed an attacker to gain unauthorized access to the underlying node filesystem and its sensitive files. We’ll describe how that system is supposed to work, and then talk about the vulnerability.

The volume subpath feature in Kubernetes enables sharing a volume in multiple containers inside a pod. For example, we could create a Pod with an InitContainer that creates directories with pre-populated data in a mounted filesystem volume. These directories can then be used by containers in the same Pod by mounting the same volume and optionally specifying a subpath field to limit what’s visible inside the container.

While there are some great use cases for this feature, it’s an area that has had vulnerabilities discovered in the past. The kubelet must be extra cautious when handling user-owned subpaths because it operates with privileges in the host. One vulnerability that has been previously discovered involved the creation of a malicious workload where an InitContainer would create a symlink pointing to any location in the host. For example, the InitContainer could mount a volume in /mnt and create a symlink /mnt/attack inside the container pointing to /etc. Later in the Pod lifecycle, another container would attempt to mount the same volume with subpath attack. While preparing the volumes for the container, the kubelet would end up following the symlink to the host’s /etc instead of the container’s /etc, unknowingly exposing the host filesystem to the container. A previous fix made sure that the subpath mount location is resolved and validated to point to a location inside the base volume and that it’s not changeable by the user in between the time the path was validated and when the container runtime bind mounts it. This race condition is known as time of check to time of use (TOCTOU) where the subject being validated changes after it has been validated.

These validations and others are summarized in the following container lifecycle sequence diagram.

Volume subpath validations before the container startup

A New TOCTOU Vulnerability: CVE-2021-25741
The latest vulnerability was discovered by performing a symlink attack similar to the one explained above, with the difference being that it constantly swapped the symlink with a directory in a tight loop, using the RENAME_EXCHANGE option with renameat(2). If the timing is just right, the kubelet will see the path as a directory and pass the validation check. Then the mount utility may find that the path is a symlink pointing to the host and follow it, exposing the host filesystem to the container. This is visualized in the following diagram:

The expectation and the attack outcome

The GKE Security and Storage teams worked closely to revise the fix done previously to find a solution. The previous fix takes several steps to ensure that the directory being mounted is safely opened and validated. After the file is opened and validated, the kubelet uses the magic-link path under /proc/[pid]/fd directory for all subsequent operations to ensure the file remains unchanged. However, we found out that all of the efforts were undone by the mount(8) linux utility which was dereferencing the procfs magic-link by default. Once the problem was understood, the fix involved making sure that the mount utility doesn’t dereference the magic-links by using the –no-canonicalize flag in the mount command.

The fix is in

Once the problem was well understood, we fixed it inside Kubernetes and quickly released the fix to GKE and Anthos. If GKE auto-upgrade is enabled in your clusters there’s no action on your part for this vulnerability, your nodes have already been patched. We strongly recommend that customers utilize auto-upgrades. Auto-upgrade gives peace of mind that your clusters are running with the latest patches.

GKE released a Google Kubernetes Engine security bulletin on this vulnerability, which detailed what customers can do to immediately remediate this issue across GKE and Anthos. We also provided guidance to customers who manually manage their node versions, ensuring that fixed releases were available in every region for our Static and Release Channels.

Moving forward
Google continues to invest heavily in the security of GKE and Kubernetes. We encourage users interested in finding vulnerabilities to participate in the Kubernetes bug bounty program and in the Google Vulnerability Rewards Program (VRP) which was recently expanded to cover GKE vulnerabilities. For the latest guidance on security issues, please follow our GKE Security Bulletins.

Press play for the first episode as host Aryeh Goretsky is joined by Zuzana Hromcová to discuss native IIS malware

The post Launching ESET Research Podcast: A peek behind the scenes of ESET discoveries appeared first on WeLiveSecurity

Press play for the first episode as host Aryeh Goretsky is joined by Zuzana Hromcová to discuss native IIS malware

The post Launching ESET Research Podcast: A peek behind the scenes of ESET discoveries appeared first on WeLiveSecurity

ESET researchers studied all the malicious frameworks ever reported publicly that have been used to attack air-gapped networks and are releasing a side-by-side comparison of their most important TTPs

The post Jumping the air gap: 15 years of nation‑state effort appeared first on WeLiveSecurity

The INTERPOL-led operation involved law enforcement from 20 countries and led to the seizure of millions of dollars in illicit gains

The post More than 1,000 arrested in global crackdown on online fraud appeared first on WeLiveSecurity

How scammers take advantage of supply chain shortages – Tips for safe online shopping this holiday season – Steps to take after receiving a data breach notice

The post Week in security with Tony Anscombe appeared first on WeLiveSecurity

‘Tis the season to avoid getting played by scammers hijacking Twitter accounts and promoting fake offers for PlayStation 5 consoles and other red-hot products

The post The triangle of holiday shopping: Scams, social media and supply chain woes appeared first on WeLiveSecurity

With the holiday shopping bonanza right around the corner, here’s how to make sure your online spending spree is hacker-free

The post Avoiding the shopping blues: How to shop online safely this holiday season appeared first on WeLiveSecurity

Threat actors have previously timed ransomware and other attacks to coincide with holidays and weekends

The post FBI, CISA urge organizations to be on guard for attacks during holidays appeared first on WeLiveSecurity

Receiving a breach notification doesn’t mean you’re doomed – here’s what you should consider doing in the hours and days after learning that your personal data has been exposed

The post What to do if you receive a data breach notice appeared first on WeLiveSecurity