An Open Source Approach for Cybersecurity Information Sharing

In our previous blog about the need for an open intelligence sharing platform with the government (or any other coordinating body), we illustrated why an open solution is the best possible approach to guaranteeing the highest degree of cybersecurity and compliance with standards, while giving the community the freedom to decide which cybersecurity monitoring platform(s) to adopt.

What Is Needed and the State of the Art

Intelligence sharing is a complex topic, but we can clearly identify two distinct areas:

  1. The sharing of already ‘cooked’ packages of Indicators of Attack (IoA) and Indicators of Compromise (IoC), which isolate a particular threat and provide investigators with the ‘where, what and how’ of an attack.
  2. The sharing of details about new, zero-day and emerging attacks, where visibility into dozens or hundreds of sources of reconnaissance information from across the globe can quickly head off malware proliferation.

The first area of intelligence sharing involves distributing a list of IoCs (simple hashes or more complex conditions), a description of what the malware does, and other relevant information. In principle, such packages have no identifying information, and contain little to no correlation with the original site(s) where the indicators were spotted. Thus, they can and should be shared with peers or with a central body. Information sharing can be useful to enrich the IoC list, or to add commentary and additional intelligence about the origin of the malware itself, new behaviors, mitigation tactics, etc.

The second area, which aims to detect emerging threats, requires a certain degree of detail to be shared, because at the end of the day understanding that a new threat is affecting multiple sites requires data to be compared in order to identify emerging patterns. On the other hand, the secrecy of that data is important to prevent attackers from leveraging the intelligence, and to protect the identities of the sharing participants. These look like conflicting requirements, but with a proper set of rules and APIs, it’s possible to share only a minimum amount of information and still observe common patterns among peers. Now we can ask ourselves: what does an open, cross-technology sharing architecture look like?
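
To make “minimum amount of information” concrete: one possible mechanism is for peers to exchange keyed digests of locally observed indicators rather than the raw values, so a coordinator can count how many distinct sites saw the same artifact without learning what it was or who it belongs to. The sketch below is purely illustrative; the shared key, field names, and matching threshold are assumptions, not part of any existing standard:

```python
import hmac
import hashlib
from collections import Counter

# Hypothetical community key distributed out-of-band to participating peers.
# Without the key, the digests cannot be linked back to the raw indicators.
COMMUNITY_KEY = b"example-shared-secret"

def blind_indicator(raw_indicator: str) -> str:
    """Replace a raw indicator (hash, domain, IP) with a keyed digest."""
    return hmac.new(COMMUNITY_KEY, raw_indicator.encode(), hashlib.sha256).hexdigest()

def find_common_patterns(peer_submissions: dict, threshold: int = 2) -> list:
    """Return digests reported by at least `threshold` distinct peers."""
    seen_by = Counter()
    for digests in peer_submissions.values():
        for digest in set(digests):  # count each peer at most once per digest
            seen_by[digest] += 1
    return [d for d, count in seen_by.items() if count >= threshold]

# Example: two peers independently observe the same suspicious domain.
submissions = {
    "peer-a": [blind_indicator("evil.example.net"), blind_indicator("192.0.2.17")],
    "peer-b": [blind_indicator("evil.example.net")],
}
print(find_common_patterns(submissions))
```

The design choice here is that the coordinator only ever sees opaque digests, and the raw details are requested from the reporting peers only after a cross-site match has been confirmed.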

A well-known and widely accepted model for integrating data shared by multiple sources, including cybersecurity tools, is the SIEM model. Two or more technologies, potentially from different vendors and covering different areas of the cybersecurity spectrum, send their logs and alerts to a SIEM in batch or real-time. In some situations, this data is normalized at the origin with an Information Model; in other situations it is normalized in the SIEM itself. Normalizing data into a common Information Model is important to allow generic, cross-technology correlation of events, which is made possible because the semantics of each field are well defined. For example, the widely deployed Common Event Format (CEF) defines that the “src” field must be the IPv4 address of the source of the attack or event. Such a simple semantic makes it straightforward to parse the data (a source IP, not a MAC address, not a hostname or anything else), identify correlations, and remediate as appropriate.
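
To make the value of a shared Information Model concrete, here is a minimal sketch that parses a CEF-style event and relies on the defined meaning of the “src” field. The sample event is invented for illustration, and a real CEF parser must also handle the format’s escaping rules, which are omitted here:

```python
# An invented example event in Common Event Format (CEF):
# header fields separated by '|', followed by key=value extensions.
sample_event = (
    "CEF:0|AcmeSecurity|AcmeIDS|1.0|4711|Port scan detected|5|"
    "src=192.0.2.17 dst=198.51.100.4 spt=53211 dpt=22"
)

def parse_cef(event: str) -> dict:
    """Split a CEF record into its header fields and key=value extensions."""
    parts = event.split("|", 7)
    record = {
        "device_vendor": parts[1],
        "device_product": parts[2],
        "signature_id": parts[4],
        "name": parts[5],
        "severity": parts[6],
    }
    # Extension: space-separated key=value pairs (escaping rules omitted in this sketch).
    for pair in parts[7].split():
        key, _, value = pair.partition("=")
        record[key] = value
    return record

event = parse_cef(sample_event)
# Because CEF defines "src" as the source IPv4 address, a cross-technology
# correlation rule can rely on that meaning without vendor-specific logic.
print(event["src"], event["name"], event["severity"])
```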

Let’s now imagine how this applies to information sharing with a government body acting as a central coordination center, like a “government SIEM”. This isn’t realistic today because this model requires sharing every possible detail, discovered by every technology, at the highest levels so that any kind of correlation can happen there. So, even though the SIEM model can be used within individual companies, or even with outside bodies like the government, it can’t be used ‘as-is’ to implement a privacy-preserving data sharing scheme to safely and collectively hunt for emerging threats.

A well-known pair of threat intelligence sharing standards is Structured Threat Information Expression (STIX), a data format, and Trusted Automated Exchange of Intelligence Information (TAXII), the API used to exchange it. They were designed to share intelligence in the form of Indicators of Compromise used in a particular attack. As such, they are ideal when a threat has already been identified by a security research team. However, they were not designed to preserve privacy, nor to spot emerging threats by observing data from multiple sites. Yes, they could potentially be extended to do so, for example by adding STIX namespaces and related rules/procedures, but the risk of breaking compatibility with existing tools and effectively creating a new standard is real.
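
For context, a STIX 2.1 indicator is essentially a JSON object carrying a detection pattern and some metadata. The sketch below builds one for a hypothetical malware hash; the identifier, timestamps, name, and hash value are invented for illustration:

```python
import json

# A hypothetical STIX 2.1 indicator describing a file hash tied to a known threat.
indicator = {
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--3f2c1a9e-1111-4eee-8aaa-222222222222",
    "created": "2021-11-01T09:00:00.000Z",
    "modified": "2021-11-01T09:00:00.000Z",
    "name": "Dropper observed in example campaign",
    "pattern": "[file:hashes.'SHA-256' = "
               "'aec070645fe53ee3b3763059376134f058cc337247c978add178b6ccdfb0019f']",
    "pattern_type": "stix",
    "valid_from": "2021-11-01T09:00:00Z",
}

# Indicators like this are typically bundled and exchanged over a TAXII collection.
# Note that nothing in the object says where, or at which site, it was observed,
# which is exactly why the format works well once a threat is already known.
print(json.dumps(indicator, indent=2))
```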

There are many other projects and formats dealing with intelligence sharing, such as OpenDXL (which leans mostly toward orchestration), CIF (which provides a framework and implementation for threat intel enrichment, sharing and management), MISP (which aims to build a collaborative community to detect and classify known threats) and OpenIOC (which, similar to STIX, aims to standardize the format of IoCs). All these projects are applicable and battle-tested for the first area of intel sharing, but they don’t really fit the requirement to collaboratively hunt for emerging threats while preserving privacy.

In addition to these intel sharing standards, there are also a number of government-led sharing programs that leverage them, including CyberSentry (which includes a good explanation of the data collected) and the Cybersecurity Risk Information Sharing Program (CRISP), among several others.

Anatomy of the Solution

Considering the widespread success of STIX/TAXII, as well as the other established formats and APIs mentioned above, it’s clear that a standardized, transparent, well-documented, and well-governed approach to collectively detecting threats is crucial to encouraging adoption and continued development efforts.

The information sharing solution should be able to fill the gaps in emerging threat detection, while integrating well with the aforementioned threat sharing ecosystem, so we can avoid reinventing the wheel. Additionally, enabling the user community and vendors to leverage their existing cybersecurity investments rather than rebuilding or adopting new ones, and then growing from there, provides the greatest possible support to the overall objectives.

An effective way to guarantee openness and transparency is an open-source implementation, which can be the focal point for a pragmatic approach: a practical tool used to test compatibility and compliance with the various standards and implementations. One which is not a toy, but is production ready. One where every involved party (vendors, customers) can study every detail and line of code.

In the spirit of supporting open-source transparency, the APIs can be documented in a standard manner, for example using the OpenAPI format, enabling faster evolution and validation through automated test suites. Additionally, sample clients could be provided that demonstrate how to consume the APIs from the most common programming languages, in order to accelerate adoption and lower development costs.
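
As a sketch of what such a sample client could look like, the snippet below submits a single blinded observation to a hypothetical sharing endpoint. The URL, path, payload fields, and authentication scheme are assumptions for illustration, not a published API:

```python
import requests

# Hypothetical endpoint and schema; a real client would follow the published
# OpenAPI document and its authentication scheme.
SHARING_API = "https://sharing.example.org/v1/observations"
API_TOKEN = "replace-with-a-real-token"

def submit_observation(indicator_digest: str, first_seen: str) -> dict:
    """Submit one blinded observation and return the server's JSON response."""
    response = requests.post(
        SHARING_API,
        json={"indicator_digest": indicator_digest, "first_seen": first_seen},
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(submit_observation(
        indicator_digest="9b74c9897bac770ffc029102a200c5de",
        first_seen="2021-11-01T09:00:00Z",
    ))
```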

A mechanism to share events or indicators without violating the privacy or secrecy of the data must be provided. Potentially, differential privacy can be considered to let contributors share more details, like what specific internal victims look like, without revealing their true identities. The APIs and the system should enable machines to interact automatically with each other, without fear of information leakage, and, when matches are detected, enable further investigations to be initiated upon the operator’s confirmation.
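
To illustrate how differential privacy could fit this model, a contributor might add calibrated Laplace noise to an aggregate count (for example, how many internal hosts matched a given digest) before sharing it, so the presence or absence of any single host cannot be inferred from the reported number. The metric, epsilon value, and function names below are assumptions used only to sketch the idea:

```python
import random

def laplace_noise(scale: float) -> float:
    """Draw from a Laplace(0, scale) distribution as the difference of two exponentials."""
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def privatized_count(true_count: int, epsilon: float = 0.5) -> int:
    """Report a count with epsilon-differential privacy for a counting query
    (sensitivity 1), clamped so a negative number of hosts is never reported."""
    noisy = true_count + laplace_noise(scale=1.0 / epsilon)
    return max(0, round(noisy))

# Example: 7 internal hosts matched an indicator; the value actually shared is perturbed.
print(privatized_count(7))
```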

The main goal of the proposed solution is to detect new or previously unknown threats by being able to observe them across different sites belonging to different peers / contributors / companies. But this is not just a theoretical exercise or wishful thinking. Nozomi Networks is committed to driving and delivering just such an open source project with these objectives in mind. More details will be coming in 2022, and we are now actively seeking vendor and community involvement in this open framework and open API initiative. Reach out to us if you are interested in being part of the process.

Conclusions

As we have seen here, intelligence and knowledge sharing is a complex topic, with established standards that solutions can use today to enable a central body to receive intelligence from a diverse set of contributors. Successful solutions in this area use open standards, with plenty of code and examples freely available for vendors to implement.

On the other hand, complex automatic or semi-automatic coordination between peers to detect emerging threats while preserving privacy is a less explored area with no real standards, exchange formats, or APIs, but we have illustrated a potential approach to fill those gaps.

Nozomi Networks will be contributing a major open source project along the lines of the approach described here, and we expect broad industry participation and adoption that can accelerate security information sharing and data normalization across vendors, for the benefit of everyone (except the attackers!).