Introduction
Imagine you’re the captain of a starship exploring a distant, uncharted galaxy. Your ship is equipped with advanced technology, powerful engines, and a crew of experts ready to face any challenge. But as you venture deeper into the unknown, you start encountering strange anomalies—systems malfunctioning, unexpected power surges, and sudden threats from unseen forces. You quickly realize that something within your ship’s systems is not as secure as you thought.
Upon closer inspection, you discover that your ship’s components were built from various sources, each with its own hidden flaws. These flaws, while small on their own, create vulnerabilities that could jeopardize your entire mission. As the captain, you understand that to protect your crew and complete your journey, you need a comprehensive map of every component, every piece of code, and every potential risk.
This map, in the world of software development, is known as a Software Bill of Materials, or SBOM. Just as a starship captain needs to know every detail of their vessel to navigate the stars safely, developers and security teams need an SBOM to ensure that the software they build and use is secure and reliable. Without it, they’re flying blind in a universe filled with potential threats. The SBOM also helps prove that your vessel is roadworthy (spaceworthy?), in that it complies with all regulations, and every component has been accounted for.
How does it work?
Okay, enough of the space metaphors. Before we understand how the SBOM helps us do all this fancy work, we need to familiarize ourselves with the software supply chain.
Almost every piece of modern software leverages third-party libraries to provide some functionality or utility. After all, it seems very sensible because “why reinvent the wheel,” right? We surely don’t need to redo the authentication functionality that someone else has already implemented in a million different ways. Being able to reuse someone’s code from PyPI or npm enables us to rapidly ship new features in our software. If we dig deeper, we realize that it’s not just these libraries but also various system-level utilities that are often open source. The custom JavaScript loaded from a remote URL to quickly ship your front end, the web server that serves all your traffic, and even the operating system itself are all examples of components within a software supply chain.
With each added component in the software supply chain, we introduce additional attack vectors, and by extension, increase the overall attack surface.
This is where an SBOM can be useful. It acts as a manifest of sorts, providing machine-readable details of all third-party dependencies in use, which can then be processed by automated security scans. Once this ‘manifest’ is created, we can ingest it into tools that create the map of our software supply chain for us. However, there is an additional step to complete before we can create an SBOM: Software Composition Analysis (SCA).
SCA is where we actually determine the direct dependencies (usually specified within a requirements.txt or package.json file, or an equivalent in your language of choice) and transitive dependencies (dependencies that your direct dependencies rely on). Once these dependencies are identified, we can pass them on to various commercial or FOSS analysis tools that check package versions against vulnerability databases to determine whether they have been tagged as vulnerable, along with their CVE identifiers and severity scores. We can then use the results of our SCA to generate a proper SBOM. Some crowd favourites for SCA are Snyk, OWASP Dependency-Check, and Synopsys Black Duck.
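To make the direct versus transitive split a bit more concrete, here is a minimal sketch in Python. The dependency graph and every package name in it are made up for illustration; a real SCA tool would build this graph from lockfiles or registry metadata rather than a hard-coded dictionary.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it depends on.
# Every name here is invented purely for illustration.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "http-client"],
    "web-framework": ["template-engine", "http-client"],
    "http-client": ["tls-lib"],
    "template-engine": [],
    "tls-lib": [],
}


def resolve_dependencies(root, graph):
    """Return (direct, transitive) dependency sets for `root`."""
    direct = set(graph.get(root, []))
    seen = set(direct)
    queue = deque(direct)
    # Breadth-first walk: anything reachable beyond the direct set is transitive.
    while queue:
        package = queue.popleft()
        for child in graph.get(package, []):
            if child not in seen and child != root:
                seen.add(child)
                queue.append(child)
    return direct, seen - direct


direct, transitive = resolve_dependencies("my-app", DEPENDENCY_GRAPH)
print("direct:", sorted(direct))
print("transitive:", sorted(transitive))
```

Note that a package listed directly stays in the direct set even if another dependency also pulls it in, which is why http-client does not show up as transitive here.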
You will notice that SCA and SBOM Analysis seem veeeeeeery similar, but the key difference here is that SCA focuses solely on identifying and analyzing the components of your software to ensure there are no unintentional vulnerabilities, whereas SBOM Analysis focuses on bringing to light potential security risks and compliance issues across the entire software supply chain.
The simplest way to differentiate them would be that:
SCA helps you understand what’s in your software, while SBOM Analysis is a deeper dive into the risks associated with those components, helping you secure the entire software supply chain.
How do I generate an SBOM???
There are various options here, but my preference is always FOSS, so I will pick CycloneDX. There are various GitHub Actions you can deploy that generate the SBOM as part of your workflow. You can also run this manually for your project, for example:
You can find additional options on the CycloneDX website or their GitHub. The output is usually in JSON format, which is easily consumable by various analysis tools. I will attempt to demonstrate this practically in a future part of this series.
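To give a feel for what that JSON looks like, here is a hand-rolled Python sketch that turns pinned requirements into a minimal CycloneDX-shaped document. This is not how the official CycloneDX generators work, and it is no substitute for them: it only understands plain name==version lines, does not resolve transitive dependencies, and the requirements shown are hypothetical. The top-level bomFormat, specVersion, and components keys follow the CycloneDX JSON layout as I understand it.

```python
import json


def parse_pinned_requirements(text):
    """Parse 'name==version' lines; skip blanks, comments, and unpinned entries."""
    components = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue
        name, version = (part.strip() for part in line.split("==", 1))
        components.append({
            "type": "library",
            "name": name,
            "version": version,
            # Package URL (purl) identifying the package on PyPI.
            "purl": f"pkg:pypi/{name.lower().replace('_', '-')}@{version}",
        })
    return components


def build_sbom(components):
    """Wrap components in a minimal CycloneDX-style envelope."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


# Hypothetical requirements.txt contents, for illustration only.
requirements = """
# pinned dependencies
requests==2.31.0
flask==3.0.0
"""

sbom = build_sbom(parse_pinned_requirements(requirements))
print(json.dumps(sbom, indent=2))
```

A real generator adds much more (metadata, hashes, licenses, the dependency tree), which is exactly why you would use one instead of rolling your own.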
What does an SBOM contain?
The generated SBOM will usually contain various metadata about each package, but some of the most useful fields are:
- Version numbers — these help determine whether or not the package is vulnerable
- Vendor — this helps ensure the package is from who you expect it to be from
- License — useful for compliance against certain licenses
- Checksums/Hashes — prove that the integrity of a package has not been compromised since it was first released for that version
- Components — all sub-dependencies for this package and their metadata
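As a rough illustration of where these fields live, the sketch below pulls them out of a CycloneDX-style component in Python. The sample document is entirely made up (package names, vendor, and the placeholder hash), and I am assuming the vendor maps to the component's supplier name; depending on the generator, it may appear under a different field.

```python
# Hypothetical SBOM fragment in CycloneDX-style JSON, already parsed into a dict.
sample_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {
            "type": "library",
            "name": "example-lib",  # invented package
            "version": "1.2.3",
            "supplier": {"name": "Example Org"},
            "licenses": [{"license": {"id": "MIT"}}],
            "hashes": [{"alg": "SHA-256", "content": "0" * 64}],  # placeholder digest
            "components": [
                {"type": "library", "name": "example-sublib", "version": "0.4.0"}
            ],
        }
    ],
}


def summarize_component(component):
    """Collect the fields from the list above, tolerating missing keys."""
    licenses = [
        entry["license"].get("id") or entry["license"].get("name")
        for entry in component.get("licenses", [])
        if "license" in entry
    ]
    return {
        "name": component["name"],
        "version": component.get("version"),
        "vendor": component.get("supplier", {}).get("name"),
        "licenses": licenses,
        "hashes": {h["alg"]: h["content"] for h in component.get("hashes", [])},
        "subcomponents": [c["name"] for c in component.get("components", [])],
    }


for component in sample_sbom["components"]:
    print(summarize_component(component))
```

Most fields are optional in practice, so the lookups fall back to empty values instead of assuming every generator fills everything in.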
We can use the information from the SBOM to run further analysis and flag key metrics that help us decide whether we need to urgently upgrade packages, or even discontinue their use if they’ve been hijacked by malicious maintainers. A great tool for this is OWASP Dependency-Track, which lets you perform continuous monitoring of your SBOM. There are alternatives (or supplements) like Bomber that, unlike Dependency-Track, do not require hosting.
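Conceptually, that kind of analysis boils down to matching SBOM components against advisory data. Here is a deliberately tiny Python sketch of the idea. The advisory table, the advisory ID, and the package names are all invented, and this is not how Dependency-Track or Bomber are implemented; real tools query actual vulnerability databases and handle version ranges rather than exact matches.

```python
# Hypothetical advisory data: (package name, affected version) -> advisory ID.
ADVISORIES = {
    ("example-lib", "1.2.3"): "EXAMPLE-2024-0001",
}

# Components as they might be read from an SBOM (invented for illustration).
components = [
    {"name": "example-lib", "version": "1.2.3"},
    {"name": "other-lib", "version": "2.0.0"},
]


def find_vulnerable(components, advisories):
    """Return one finding per component whose exact version has an advisory."""
    findings = []
    for component in components:
        key = (component["name"], component.get("version"))
        if key in advisories:
            findings.append({
                "name": component["name"],
                "version": component.get("version"),
                "advisory": advisories[key],
            })
    return findings


findings = find_vulnerable(components, ADVISORIES)
for finding in findings:
    print(f"{finding['name']} {finding['version']}: {finding['advisory']}")
```

In a pipeline, a non-empty findings list is the sort of signal you could use to fail a build or open a ticket.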
What next?
Having a continuous DevSecOps pipeline takes the burden off developers and ensures there is a constant feedback loop that performs automated checks for any vulnerable dependencies in your software supply chain. Adding SBOM generation to your CI/CD pipeline is crucial: it keeps scans efficient and reduces the margin of error. In a future post, I will attempt to set up a GitHub repo with workflows that generate and scan SBOMs. Stay tuned x
As a thank you for reading this, please enjoy my favourite photo of the universe 🌌 :
Look again at that dot. That’s here. That’s home. That’s us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their lives.