Posted by Brandon Lum and Oliver Chang, Google Open Source Security Team
The past year has seen an industry-wide effort to embrace Software Bills of Materials (SBOMs)—a list of all the components, libraries, and modules that are required to build a piece of software. In the wake of the 2021 Executive Order on Cybersecurity, these ingredient labels for software became popular as a way to understand what’s in the software we all consume. The guiding idea is that it’s impossible to judge the risks of particular software without knowing all of its components—including those produced by others. This increased interest in SBOMs saw another boost after the National Institute of Standards and Technology (NIST) released its Secure Software Development Framework, which requires SBOM information to be available for software. But now that the industry is making progress on methods to generate and share SBOMs, what do we do with them?
Generating an SBOM is only one half of the story. Once an SBOM is available for a given piece of software, it needs to be mapped onto a list of known vulnerabilities to know which components could pose a threat. By connecting these two sources of information, consumers will know not just what’s in their software, but also its risks and whether they need to remediate any issues.
In this blog post, we demonstrate the process of taking an SBOM from a large and critical project—Kubernetes—and using an open source tool to identify the vulnerabilities it contains. Our example’s success shows that we don’t need to wait for SBOM generation to reach full maturity before we begin mapping SBOMs to common vulnerability databases. With just a few updates from SBOM creators to address current limitations in connecting the two sources of data, this process is poised to become easily within reach of the average software consumer.
OSV: Connecting SBOMs to vulnerabilities
The following example uses Kubernetes, a major project that makes its SBOM available using the Software Package Data Exchange (SPDX) format—an international open standard (ISO) for communicating SBOM information. The same idea should apply to any project that makes its SBOM available, and for projects that don’t, you can generate your own SBOM using the same bom tool Kubernetes created.
We have chosen to map the SBOM to the Open Source Vulnerabilities (OSV) database, which describes vulnerabilities in a format that was specifically designed to map to open source package versions or commit hashes. The OSV database excels here as it provides a standardized format and aggregates information across multiple ecosystems (e.g., Python, Golang, Rust) and databases (e.g., Github Advisory Database (GHSA), Global Security Database (GSD)).
To connect the SBOM to the database, we’ll use the SPDX spdx-to-osv tool. This open source tool takes in an SPDX SBOM document, queries the OSV database of vulnerabilities, and returns an enumeration of vulnerabilities present in the software’s declared components.
Example: Kubernetes’ SBOM
The first step is to download Kubernetes’ SBOM, which is publicly available and contains information on the project, dependencies, versions, and licenses. Anyone can download it with a simple curl command:
# Download the Kubernetes SPDX source document
$ curl -L https://sbom.k8s.io/v1.21.3/source > k8s-1.21.3-source.spdx
|
The next step is to use the SPDX spdx-to-osv tool to connect the Kubernetes’ SBOM to the OSV database:
# Run the spdx-to-osv tool, taking the information from the SPDX SBOM and mapping it to OSV vulnerabilities
$ java -jar ./target/spdx-to-osv-0.0.4-SNAPSHOT-jar-with-dependencies.jar -I k8s-1.21.3-source.spdx -O out-k8s.1.21.3.json
# Show the output OSV vulnerabilities of the spdx-to-osv tool
$ cat out-k8s.1.21.3.json
…
{
“id”: “GHSA-w73w-5m7g-f7qc”,
“published”: “2021-05-18T21:08:21Z”,
“modified”: “2021-06-28T21:32:34Z”,
“aliases”: [
“CVE-2020-26160”
],
“summary”: “Authorization bypass in github.com/dgrijalva/jwt-go”,
“details”: “jwt-go allows attackers to bypass intended access restrictions in situations with []string{} for m[\”aud\”] (which is allowed by the specification). Because the type assertion fails, \”\” is the value of aud. This is a security problem if the JWT token is presented to a service that lacks its own audience check. There is no patch available and users of jwt-go are advised to migrate to [golang-jwt](https://github.com/golang-jwt/jwt) at version 3.2.1″,
“affected”: [
{
“package”: {
“name”: “github.com/dgrijalva/jwt-go”,
“ecosystem”: “Go”,
“purl”: “pkg:golang/github.com/dgrijalva/jwt-go”
},
…
|
The output of the tool shows that v1.21.3 of Kubernetes contains the CVE-2020-26160 vulnerability. This information can be helpful to determine if any additional action is required to manage the risk of operating this software. For example, if an organization is using v1.21.3 of Kubernetes, measures can be taken to trigger company policy to update the deployment, which will protect the organization against attacks exploiting this vulnerability.
Suggestions for SBOM tooling improvements
To get the spdx-to-osv tool to work we had to make some minor changes to disambiguate the information provided in the SBOM:
- In the current implementation of the bom tool, the version was included as part of the package name (gopkg.in/square/go-jose.v2@v2.2.2). We needed to trim the suffix to match the SPDX format, which has a different field for version number.
- The SBOM created by the bom tool does not specify an ecosystem. Without an ecosystem, it’s impossible to reliably disambiguate which library or package is affected in an automated way. Vulnerability scanners could return false positives if one ecosystem was affected but not others. It would be more helpful if the SBOM differentiated between different library and package versions.
These are relatively minor hurdles, though, and we were able to successfully run the tool with only small manual adjustments. To make the process easier in the future, we have the following recommendation for improving SBOM generation tooling:
- SBOM tooling creators should add a reference using an identification scheme such as Purl for all packages included in the software. This type of identification scheme both specifies the ecosystem and also makes package identification easier, since the scheme is more resilient to small deviations in package descriptors like the suffix example above. SPDX supports this via external references to Purl and other package identification schemas.
SBOM in the future
It’s clear that we’re getting very close to achieving the original goal of SBOMs: using them to help manage the risk of vulnerabilities in software. Our example queried the OSV database, but we will soon see the same success in mapping SBOM data to other vulnerability databases and even using them with new standards like VEX, which provides additional context around whether vulnerabilities in software have been mitigated.
Continuing on this path of widespread SBOM adoption and tooling refinement, we will hopefully soon be able to not only request and download SBOMs for every piece of software, but also use them to understand the vulnerabilities affecting any software we consume. This example is a peek into a possible future of what SBOMs can offer when we bridge the gap to connect them with vulnerability databases: a new normal of worrying less about the risks in the software we use.
A special thanks to Gary O’Neall of Source Auditor for creating the spdx-to-osv tool and contributing to this blog post.