Producing Software Bill of Materials a.k.a SBOMs for Atlassian

Atlassian’s SBOM platform generates detailed software inventories for compliance and security, using tools like Syft and cdxgen.

What is an SBOM?

An SBOM is a nested inventory, a list of ingredients that make up software components.

A Software Bill of Materials (SBOM) is a comprehensive inventory that documents all the software components, libraries, frameworks, modules, and dependencies used in a particular software application or system. Similar to a traditional bill of materials in manufacturing, an SBOM provides a detailed breakdown of the software components that make up a software product.

What does an SBOM contain?

An SBOM can contain a set of fields, along with other metadata, that describe the dependencies identified in a software component.

Example:

Why create an SBOM?

There are several compelling reasons to create Software Bill of Materials (SBOMs), which can be categorised into two primary domains:

Achieving Regulatory Compliance with Government Authorities

With the rise of regulations surrounding software security, SBOMs help companies meet compliance requirements established by government authorities and industry standards.

Enhancing Software Supply Chain within the Organisation

Building the centralised platform

The section above covers the need for SBOMs at a high level. In addition, a few use cases specific to Atlassian motivated us to build this platform.

Now that the “what” and “why” of SBOMs are clear, let’s dig into the “how”.

At Atlassian, we have built a platform that creates SBOMs for repositories based on incoming commits.

Here is a high-level diagram of the workflow, followed by a brief explanation of each component.

High Level Architecture

Scheduler

Going into more depth, the Scheduler is actually a combination of multiple microservices working together. Here is a workflow diagram showing how it works:

The Listener service subscribes to commit webhooks, parses the events, and forwards relevant information to the Jobs service. The Jobs service uses SQS to queue and process the jobs, scheduling the corresponding tasks in Kubernetes. Simultaneously, it sends job information to the Results service, which maintains a record and the state of all jobs in the local datastore. When a job completes, whether it succeeded or failed, the Results service is called again to update the state accordingly.
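The flow described above can be sketched in a few lines. This is a minimal illustration only, not the platform's actual code: the names `JobState`, `ResultsStore`, and `handle_commit_event`, the event fields, and the job ID scheme are all assumptions, and a plain Python list stands in for the SQS queue.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class JobState(Enum):
    QUEUED = "queued"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class ResultsStore:
    """In-memory stand-in for the Results service datastore (hypothetical)."""
    jobs: Dict[str, JobState] = field(default_factory=dict)

    def record(self, job_id: str) -> None:
        # Called when a job is first scheduled.
        self.jobs[job_id] = JobState.QUEUED

    def complete(self, job_id: str, ok: bool) -> None:
        # Called again when the job finishes, successfully or not.
        if job_id not in self.jobs:
            raise KeyError(f"unknown job: {job_id}")
        self.jobs[job_id] = JobState.SUCCEEDED if ok else JobState.FAILED


def handle_commit_event(event: dict, queue: List[dict],
                        results: ResultsStore) -> Optional[str]:
    """Parse a commit event, enqueue a job, and record its initial state."""
    repo = event.get("repository")
    commit = event.get("commit")
    if not repo or not commit:
        return None  # not a relevant event, nothing to schedule
    job_id = f"{repo}@{commit}"  # assumed ID scheme, for illustration only
    queue.append({"job_id": job_id, "repository": repo, "commit": commit})
    results.record(job_id)
    return job_id
```

In a real deployment the queue would be SQS and the state would live in a persistent datastore; the sketch only shows the order of calls: enqueue, record, then update on completion.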

SBOM Generator Jobs

The job is responsible for cloning the repository and running tooling to generate its SBOM, which is then uploaded to the corresponding S3 bucket.

Here is a detailed workflow of the SBOM generator:

We execute multiple jobs for each repository, using three open-source tools: syft, cdxgen, and cyclone-dx-plugin. Each job runs one specific tool to produce its corresponding SBOM, which is then uploaded to S3. This is why three jobs are represented in the workflow above.

We use three distinct tools because no single tool can effectively address all technology stacks. Consequently, we developed a strategy that capitalises on the unique strengths of each tool.

Syft has emerged as a leader in SBOM generation within the open-source community, offering extensive coverage across various technology stacks, which made it our primary choice for SBOM generation. However, it does not perform as well with Java projects (for example, it does not support non-lockfile use cases with Gradle). Therefore, we also employed additional tools tailored for Java: the cyclone-dx-plugin, which excels in Maven and Gradle Java projects, and cdxgen, which covers Kotlin-based projects.

The SBOMs are organised into designated folders within the S3 bucket, following the structure: s3://bucket-name/{{version}}/{{year}}/{{month}}/{{day}}/.

For instance, an example path would be s3://sbom-inventory/v1/2024/01/31/.
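As a small illustration, the date-partitioned prefix could be built like this. The function name `sbom_prefix` and its parameters are hypothetical; only the path layout comes from the structure described above.

```python
from datetime import date


def sbom_prefix(bucket: str, version: str, day: date) -> str:
    """Build the date-partitioned S3 prefix for a given bucket and version."""
    # %Y/%m/%d yields zero-padded year/month/day segments.
    return f"s3://{bucket}/{version}/{day:%Y/%m/%d}/"


print(sbom_prefix("sbom-inventory", "v1", date(2024, 1, 31)))
# s3://sbom-inventory/v1/2024/01/31/
```

Zero-padding the month and day keeps the prefixes in chronological order when listed lexicographically.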

We further aggregate the data during the final stage of processing (more details below).

SBOM Processor

The processor is an AWS Lambda function that is triggered for each incoming file. After downloading the file, it identifies the format (SPDX or CycloneDX) and processes it using the corresponding custom parser.
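A minimal sketch of the format detection and parser dispatch follows, assuming the SBOMs are JSON documents (both formats also have non-JSON serialisations, which this does not handle). It relies on the top-level `bomFormat` field of CycloneDX JSON and the top-level `spdxVersion` field of SPDX JSON. The function names and the flattened output shape are hypothetical, not Atlassian's custom parsers.

```python
import json
from typing import Dict, List


def detect_sbom_format(doc: dict) -> str:
    """Return 'cyclonedx' or 'spdx' based on top-level marker fields."""
    if doc.get("bomFormat") == "CycloneDX":
        return "cyclonedx"
    if "spdxVersion" in doc:
        return "spdx"
    raise ValueError("unrecognised SBOM format")


def parse_cyclonedx(doc: dict) -> List[Dict[str, str]]:
    # CycloneDX lists dependencies under "components".
    return [{"name": c.get("name"), "version": c.get("version")}
            for c in doc.get("components", [])]


def parse_spdx(doc: dict) -> List[Dict[str, str]]:
    # SPDX lists dependencies under "packages", with "versionInfo".
    return [{"name": p.get("name"), "version": p.get("versionInfo")}
            for p in doc.get("packages", [])]


PARSERS = {"cyclonedx": parse_cyclonedx, "spdx": parse_spdx}


def process(raw: str) -> List[Dict[str, str]]:
    """Detect the format of a raw JSON SBOM and run the matching parser."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("SBOM must be a JSON object")
    return PARSERS[detect_sbom_format(doc)](doc)
```

Normalising both formats into one record shape at this stage is what makes the later aggregation step straightforward.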

The raw data generated by the parser is transmitted directly to the Atlassian data lake. An Aggregation job then operates on this raw data, deduplicating it and gathering the maximum number of attributes from the three tools. Finally, the processed data is sent to the snapshot table, making it readily accessible for other services.
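The aggregation idea, deduplicating packages and keeping the richest set of attributes across tools, can be sketched as below. This is only an illustration under stated assumptions: the record shape, the `tool` field, the `(name, version)` deduplication key, and the "first non-empty value wins" rule are all hypothetical, and the real job runs against the data lake rather than in-memory lists.

```python
from typing import Dict, List, Tuple


def aggregate(records: List[dict]) -> List[dict]:
    """Merge per-tool records for the same package into one record.

    Records are keyed by (lower-cased name, version). For each attribute,
    the first non-empty value seen is kept, so a field missing from one
    tool's output can be filled in from another tool's output.
    """
    merged: Dict[Tuple[str, str], dict] = {}
    for rec in records:
        key = (rec["name"].lower(), rec["version"])
        target = merged.setdefault(key, {"sources": []})
        for attr, value in rec.items():
            if attr == "tool":
                # Track which tools reported this package.
                if value not in target["sources"]:
                    target["sources"].append(value)
            elif value not in (None, "") and attr not in target:
                target[attr] = value
    return list(merged.values())


# Sample data for illustration only.
records = [
    {"tool": "syft", "name": "requests", "version": "2.31.0",
     "license": None, "purl": "pkg:pypi/requests@2.31.0"},
    {"tool": "cdxgen", "name": "Requests", "version": "2.31.0",
     "license": "Apache-2.0", "purl": None},
]
print(aggregate(records))
```

Here the two records collapse into one entry that carries the `purl` from the first tool and the `license` from the second.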

How many SBOMs have been generated?

The project went live less than a year ago and has since been actively scanning repositories and generating Software Bill of Materials (SBOMs).

Common Challenges

Implementing the platform presents several challenges, particularly in selecting the appropriate tech stacks and tools for generating Software Bill of Materials (SBOMs). While open-source libraries can be adapted to meet all SBOM needs and requirements, some institutions may prefer to opt for enterprise solutions. Ultimately, as long as the chosen tools fulfil all necessary criteria, both open-source and enterprise options can effectively serve the purpose.

In addition, ensuring high data quality should be the top priority. This can be achieved by using supported tools to measure both the completeness of SBOMs (whether they contain all relevant fields) and the coverage of SBOMs (whether they are generated for all pertinent repositories and artefacts). Adding extra layers of enrichers for specific fields after an SBOM's initial generation can help close the gaps. For instance, license enrichers can be employed to accurately retrieve license information for third-party packages.
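The two quality measures can be expressed as simple ratios. The sketch below is one possible definition, not the metric used by any particular tool: the list of required fields and both function names are assumptions.

```python
from typing import Iterable, List

# Assumed set of fields a "complete" component record should carry.
REQUIRED_FIELDS = ("name", "version", "purl", "license")


def completeness(components: List[dict]) -> float:
    """Fraction of required fields that are populated across all components."""
    if not components:
        return 0.0
    filled = sum(1 for c in components for f in REQUIRED_FIELDS if c.get(f))
    return filled / (len(components) * len(REQUIRED_FIELDS))


def coverage(repos_with_sbom: Iterable[str], all_repos: Iterable[str]) -> float:
    """Fraction of in-scope repositories that have at least one SBOM."""
    scope = set(all_repos)
    if not scope:
        return 0.0
    return len(scope & set(repos_with_sbom)) / len(scope)
```

Tracking both numbers over time shows whether an enricher, such as a license enricher, actually improves completeness, and whether new repositories are being picked up by the scheduler.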

Conclusion

In today’s landscape, Software Bill of Materials (SBOMs) have emerged as an essential component, gaining recognition not only in the private sector but also among public institutions.

Implementing an SBOM platform within a large-scale company enhances internal operations by bolstering security, promoting transparency in supply chain management, increasing developer productivity, and improving overall software management.

Moreover, providing SBOMs for sellable products is not just a commendable practice; it is poised to become a standard expectation as customers grow increasingly aware of government regulations. We have already observed this trend, with customers beginning to request accompanying SBOMs for products shipped, a shift that has been noted within organisations like Atlassian.
