Standardising Crypto: SCANOSS’s Open Data Journey
- Julian Coccia


Open collaboration can transform isolated challenges into industry standards. That’s precisely what happened when SCANOSS applied its “open source virtuous circle” approach to cryptographic algorithms. What began as a customer request has evolved into a community-led effort to define, classify and detect algorithms consistently, benefiting developers, regulators, and security teams across the software ecosystem.
For a condensed overview, watch Matias D’aloia’s OSS EU 2025 talk, Know Your Crypto: Standardising and Detecting Crypto Algorithms the Open Source Way.
SCANOSS’s Open Data Journey Prior to crypto_algorithms_open_dataset
SCANOSS provides a subscription to the SCANOSS KB alongside fully open source tooling and open APIs:
GitHub tools: https://github.com/scanoss
Public API: https://github.com/scanoss/papi
Many customers integrate SCANOSS across the SDLC for compliance, security, and export control use cases. As a data company, SCANOSS treats openly shared datasets with care: releasing truly open datasets (e.g., CC0) creates clear benefits but must be balanced with intellectual-property stewardship.
In 2021, the Software Transparency Foundation (STF), in collaboration with SCANOSS as its Strategic Member, launched osskb.org. The service provides free, privacy-friendly access to a knowledge base and the same file and snippet detection mechanisms used in SCANOSS products, enriched with licence metadata. Integrated with tools such as FOSSology, ORT, FOSSLight, Theia IDE, and the SCANOSS open source tools, osskb.org enables developers and compliance experts to create accurate, standardised SBOMs faster, improving transparency across the software supply chain. It is intended for academic use and sole contributors only, and is not suitable for commercial use due to usage limitations.
The following year, SCANOSS released purl2cpe, its first CC0-licensed open dataset, which maps Package URLs (PURLs) to Common Platform Enumerations (CPEs). This dataset supports large-scale vulnerability management and is now used by several tools.
However, contribution volume was lower than expected, a useful signal that SCANOSS needed a clearer engagement model for contributors. Together with osskb.org, purl2cpe validated the business case for selective open datasets and set the stage for SCANOSS’s next focus area: cryptographic algorithms.
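For readers unfamiliar with the idea behind purl2cpe, the following is a minimal Python sketch of a PURL-to-CPE lookup. The in-memory dictionary and the helper names (`strip_version`, `cpes_for`) are assumptions made for illustration; they do not reflect how the purl2cpe repository actually stores or serves its data.

```python
# Illustrative mapping only; purl2cpe's real file layout and coverage differ.
PURL_TO_CPES = {
    "pkg:github/openssl/openssl": ["cpe:2.3:a:openssl:openssl:*:*:*:*:*:*:*:*"],
    "pkg:github/madler/zlib": ["cpe:2.3:a:zlib:zlib:*:*:*:*:*:*:*:*"],
}


def strip_version(purl: str) -> str:
    """Drop the optional #subpath, ?qualifiers and @version parts of a PURL.

    Assumes a canonical PURL, where a literal '@' inside the namespace
    is percent-encoded and therefore never confused with the version marker.
    """
    for separator in ("#", "?", "@"):
        purl = purl.split(separator, 1)[0]
    return purl


def cpes_for(purl: str) -> list[str]:
    """Return the CPEs associated with a PURL, ignoring its version."""
    return PURL_TO_CPES.get(strip_version(purl), [])


openssl_cpes = cpes_for("pkg:github/openssl/openssl@3.0.0")
unknown_cpes = cpes_for("pkg:github/example/unknown@1.0")
```

A vulnerability tool could then use the returned CPEs to query a vulnerability database for the component, which is the kind of workflow the dataset is meant to support.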
From customer demands to open source
In 2021, several SCANOSS customers were asking for a new capability: to detect cryptographic algorithms within their codebases and dependencies. At the time, there was no common way across the industry to identify, describe, or categorise algorithms, let alone detect them efficiently at scale.
When designing a solution, SCANOSS quickly realised that this wasn’t a problem any single vendor could solve sustainably. True progress required community alignment. Following the open source playbook, SCANOSS saw that the best path forward was open data collaboration. The characterisation and declaration of cryptographic algorithms needed to come from shared expertise rather than a single company’s catalogue. No vendor could possibly maintain complete coverage of all algorithms, metadata, and use cases alone.
Between 2022 and 2023, SCANOSS released cryptographic algorithm detection as part of its product offering (Encryption Dataset). The response was highly positive, validating both market demand and SCANOSS’s technical direction. Once the commercial value was proven, SCANOSS was ready to open up the commodity layer of that feature, its underlying dataset.

In early 2024, SCANOSS released crypto_algorithms_open_dataset under a CC0 licence, hosted publicly on GitHub. The dataset included:
A starting list of around 60 cryptographic algorithms
A simple taxonomy inspired by SPDX to standardise identification
Keyword associations for each algorithm to support open source tooling developers
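To make the role of keyword associations concrete, here is a minimal Python sketch of how a tool might use them to flag algorithms in source text. The record layout, identifiers, categories, and keywords are hypothetical illustrations, not entries copied from crypto_algorithms_open_dataset, and detection in SCANOSS products is more involved than simple keyword matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmEntry:
    """Hypothetical record: an identifier, a category, and its keywords."""
    algorithm_id: str
    category: str
    keywords: tuple


# Illustrative entries only; the real dataset defines its own fields and values.
ENTRIES = [
    AlgorithmEntry("aes", "symmetric-cipher", ("AES_encrypt", "EVP_aes_256_cbc")),
    AlgorithmEntry("sha256", "hash", ("SHA256_Init", "sha256_update")),
    AlgorithmEntry("rsa", "asymmetric-cipher", ("RSA_generate_key", "RSA_public_encrypt")),
]


def detect_algorithms(source_text, entries=ENTRIES):
    """Return {algorithm_id: [matched keywords]} for whole-word keyword hits."""
    hits = {}
    for entry in entries:
        for keyword in entry.keywords:
            # \b anchors avoid matching a keyword inside a longer identifier.
            if re.search(rf"\b{re.escape(keyword)}\b", source_text):
                hits.setdefault(entry.algorithm_id, []).append(keyword)
    return hits


SAMPLE = """
SHA256_Init(&ctx);
EVP_EncryptInit_ex(c, EVP_aes_256_cbc(), NULL, key, iv);
"""

result = detect_algorithms(SAMPLE)
```

In this sketch, `result` maps `aes` and `sha256` to the keywords that matched, while `rsa` is absent because none of its keywords appear in the sample. Keeping identifiers and keywords in shared data rather than in tool code is what lets several tools report the same algorithm under the same name.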

For SCANOSS, this was more than just another dataset: it was an invitation to collaborate. Customers, experts, and community contributors could now participate in shaping a shared foundation for cryptographic visibility.
Together with the STF, SCANOSS promoted the project through industry events such as FOSS North (2024 and 2025), OCX’24, OSPO Summit 2024, OSS Japan 2024, OCS 2024, FOSDEM 2025, and OSS EU 2025. Beyond conferences, SCANOSS and STF also engaged actively with online communities such as OpenChain, contributing to discussions and sharing progress updates to encourage broader participation. These combined efforts helped expand awareness and build early momentum around cryptographic algorithm transparency.
By late 2024, contributions to crypto_algorithms_open_dataset were coming directly from SCANOSS customers, experts in security, auditing, and export control whose input strengthened both the dataset and SCANOSS’s own detection capabilities.
SCANOSS’s outreach at events and within online communities also led to new research and commercial collaborations, including work on areas like quantum readiness.
Through these efforts, SCANOSS established closer collaboration with its customer base, built a healthy user community around the open dataset, and demonstrated its role as a credible open source contributor in a technically demanding field.
By early 2025, the dataset had grown to roughly 120 algorithms, thanks mainly to customer contributions. The taxonomy became more structured, the documentation more comprehensive, and the keyword mappings more accurate, reflecting real-world feedback from SCANOSS users.
Still, direct community contributions, particularly around taxonomy and algorithm declarations, fell short of expectations, highlighting the need for broader participation.
Laying the Groundwork for an Industry Initiative
SCANOSS took the next step in its open data journey: transferring the crypto_algorithms_open_dataset to a vendor-neutral environment to encourage broader collaboration. After evaluating several options, SCANOSS and the STF agreed that the STF would become the long-term steward of this and other SCANOSS open datasets.
To enable this transition, STF began designing the necessary collaboration framework, contribution processes, and tooling to host and grow open data projects under open source principles.
As a strategic member, SCANOSS committed additional technical and organisational support to help establish this foundation for sustainable community participation.
From SCANOSS Open Dataset to Industry Standard
At FOSS North 2024, Alexios Zavras, lead of the SPDX Outreach Working Group, approached SCANOSS with a proposal: to use the crypto_algorithms_open_dataset as a starting point for a new SPDX Cryptographic Algorithms List, similar to the well-known SPDX License List.
SCANOSS immediately recognised the value of this initiative.
SPDX is an ISO standard SBOM format already supported by SCANOSS open source tooling.
The proposal aligned SCANOSS’s work with community efforts to standardise cryptographic algorithms.
Standardising algorithm definitions and taxonomy would improve interoperability across the SCA industry and strengthen SCANOSS’s own product for customers.
It would also allow SCANOSS to focus its engineering effort on differentiation, specifically on advancing cryptographic algorithm detection.
This collaboration expanded as SCANOSS and STF agreed to contribute to the emerging SPDX Cryptographic Algorithms List, laying the foundation for an industry-wide standard.
SPDX Crypto Algorithms List
Work on the SPDX Cryptographic Algorithms List began in mid-2025. The initial version included 122 algorithms and their identifiers, derived directly from the SCANOSS crypto_algorithms_open_dataset (CC0).
The first draft of the SPDX Cryptographic Algorithms List is now available as a foundation for further contributions. SCANOSS has begun adopting the list as the upstream reference for crypto_algorithms_open_dataset, narrowing the focus of this open dataset to characterising, through keywords, the algorithms included in the list.
This integration ensures that community contributions flow seamlessly into SCANOSS products, turning open collaboration into continuous improvement.
For additional context on the SPDX Cryptographic Algorithms List, see Agustín Benito Bethencourt’s blog post. As an active contributor to the list, his perspective offers valuable insight into its goals and early progress.
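The upstream-reference relationship described in this section can be sketched in Python as a simple consistency check between the two data sources. Everything below is an assumption for illustration: the identifiers, the keyword map, and the `split_by_upstream` helper are hypothetical and are not taken from the SPDX list, the dataset, or any SCANOSS tooling.

```python
# Hypothetical identifiers standing in for the upstream algorithms list.
UPSTREAM_IDS = {"aes", "sha256", "rsa"}

# Hypothetical keyword associations maintained in the open dataset.
KEYWORD_MAP = {
    "aes": ["AES_encrypt"],
    "sha256": ["SHA256_Init"],
    "legacy-cipher": ["legacy_encrypt"],  # identifier not present upstream
}


def split_by_upstream(keyword_map, upstream_ids):
    """Compare keyword entries against the upstream identifier list.

    Returns a tuple of:
      - entries whose identifier exists upstream,
      - identifiers that have keywords but no upstream definition,
      - upstream identifiers that have no keywords yet.
    """
    covered = {aid: kws for aid, kws in keyword_map.items() if aid in upstream_ids}
    orphans = sorted(set(keyword_map) - set(upstream_ids))
    missing_keywords = sorted(set(upstream_ids) - set(keyword_map))
    return covered, orphans, missing_keywords


covered, orphans, missing_keywords = split_by_upstream(KEYWORD_MAP, UPSTREAM_IDS)
```

A check along these lines would surface identifiers that need to be proposed upstream (`orphans`) and listed algorithms that still lack keywords (`missing_keywords`), which is where contributions to each project would be directed.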
STF as crypto_algorithms_open_dataset host
In 2025, STF began preparing to host SCANOSS’s open datasets, providing a neutral and transparent environment for their long-term maintenance. The migration is expected to conclude by the end of the year, establishing a stable home for continued collaboration and community contributions.

Looking Ahead
SCANOSS will continue to:
Contribute to the SPDX Cryptographic Algorithms List, both directly (through the SPDX Cryptography Group) and by supporting STF contributions, to grow and mature the list.
Support crypto_algorithms_open_dataset, soon hosted at STF, with additional keywords, in collaboration with other players.
Collaborate with CycloneDX to ensure that the Crypto Algorithms List is compatible with both standards.
Bring all these innovations to SCANOSS partners and customers through its product.
By 2026, these efforts aim to consolidate and scale the open source virtuous circle.

The journey to standardise cryptographic algorithms has shown how open collaboration can drive both innovation and resilience. By working with partners, communities, and standards bodies, SCANOSS is helping transform fragmented efforts into shared, measurable progress.
As SCANOSS looks ahead, the goal remains the same: to make transparency practical and collaboration scalable.
From this journey, several lessons stand out. Each reflects what SCANOSS has learned by turning transparency into a working model.
Open collaboration reveals demand. Working in the open surfaces real use cases early and opens new markets.
Standards reduce cost and risk. Shared taxonomies and identifiers speed integration, limit lock-in, and support compliance.
Focus engineering where it adds value. By sharing foundational datasets like the SPDX Cryptographic Algorithms List, SCANOSS can direct its effort toward higher-value innovation in detection and workflow accuracy.
The virtuous circle compounds. Establish it, consolidate it, then scale it; be disciplined about where investment creates ROI.
Neutral governance unlocks scale. Trusted hosts (e.g., SPDX, STF, OpenChain) increase credible, sustained participation.
Crypto visibility is a must-have. Demand for cryptographic detection is rising across sectors; agility beats size.
Sustainability needs reciprocity. Benefits must flow back to contributors: measure and reinvest, rather than only contribute.
Cross-industry collaboration multiplies impact. Strategic open datasets create ecosystem value and tangible business outcomes.
For organisations seeking to strengthen their software visibility or contribute to the standardisation of cryptographic algorithms, visit https://www.scanoss.com.
About SCANOSS
SCANOSS is a data company specialising in open source software intelligence. It maintains a comprehensive knowledge base that includes all globally used open source software, enriched with metadata, advanced detection mechanisms at file and snippet levels, and an open API for flexible integration. The company’s main product is a subscription to that knowledge base, available as SaaS or on-premises. SCANOSS complements this offering with fully open source technologies and tools that allow customers to use data flexibly and efficiently while protecting privacy and IP. Learn more at https://www.scanoss.com.
About Software Transparency Foundation
STF promotes transparency, security, and compliance in software development across supply chains. STF provides osskb.org, the back-end service with which some of the most popular open source SCA tools integrate to enhance their open source software detection at both file and snippet level for licence compliance purposes. Learn more at softwaretransparency.org.
About SPDX
SPDX (Software Package Data Exchange, https://spdx.dev) is an open standard used worldwide to communicate software bill of materials (SBOM) information, including components, licenses, security references, and now cryptographic algorithms. Maintained by the Linux Foundation, SPDX enables transparency, interoperability, and trust across the software supply chain.


