Further thoughts on the evaluation of open source information

This is the fifth post in a series on information evaluation. In the previous posts we discussed the Admiralty Code as the origin of the information evaluation systems in use by military and intelligence services in NATO countries. We then showed how law enforcement has adapted that methodology to its needs. Subsequently we discussed publications on an apparent, though unconfirmed, Russian system for source and information evaluation, and lastly we reflected on the criticism voiced in relation to the Admiralty Code in its current use.

In this post we dive into the intricacies of evaluating open sources and the information obtained from them. The aim is to lay some groundwork for further thinking, so we mainly focus on the particular challenges that exist in relation to open sources. No solution is presented yet.

The challenge in general

In relation to open sources, there are some fundamental problems with the Admiralty Code methodology, which after all was devised for the evaluation of human sources. While dealing with human sources can be complex enough, the incredible variety of, and diversity between, open sources add another layer (or two) to the challenge. As Irwin and Mandel (2019: 507) note: “... source reliability may vary dramatically depending on the nature of the information provided, the characteristics of the source(s), and the circumstances of collection.”

Let’s start with the source evaluation of a simple newspaper article. For starters, an additional determinant is required compared to the evaluation of human sources: ‘authenticity’. The Admiralty Code does not include a check for authenticity because, when dealing with human sources, a handler generally debriefs the individual in person. The source is a known, real person, often recruited on purpose, and the evaluation can start with the reliability of that person. When dealing with open sources, however, there is no guarantee that the document or digital source is authentic. Various deception strategies can result in elaborate disinformation campaigns in which forgeries are only one of many tools. Hence, only when the authenticity of the source has been confirmed can the next step be taken.
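To make this concrete, below is a minimal Python sketch of how authenticity can act as a gate before the usual Admiralty-style ratings are assigned. The class, field names and the three authenticity labels are my own illustrative assumptions, not part of any official standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OpenSourceItem:
    url: str
    authenticity: str = "unconfirmed"   # "confirmed" | "unconfirmed" | "forged" (illustrative labels)
    reliability: Optional[str] = None   # Admiralty A-F, assigned only after authentication
    credibility: Optional[str] = None   # Admiralty 1-6, assigned only after authentication

def evaluate(item: OpenSourceItem) -> OpenSourceItem:
    # Authenticity acts as a gate: without it, further rating is not meaningful.
    if item.authenticity != "confirmed":
        item.reliability = "F"   # reliability cannot be judged
        item.credibility = "6"   # truth cannot be judged
        return item
    # ...only here would the usual source and information evaluation follow
    return item
```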

And that step, evaluating the credibility of the source, is less straightforward for open sources than it seems. First we would need to establish what exactly we consider to be the source: the article, the journalist or the newspaper? If we stick with the definition of a (primary) source as a document (in the broad sense) in which the information was recorded and relayed for the first time, we immediately run into trouble if we want to apply the concept of source evaluation according to the Admiralty Code. After all, trustworthiness and competence are human traits. For a human source we can establish whether he is competent, whether he observed the event to which the data relates with his own eyes (and whether that would indeed have been possible), and to what extent he is trustworthy or biased. That is not possible with a newspaper article.

An alternative approach could be to define the journalist who wrote the article as the source. That would make us deviate from the definition of a source, and even though there may be multiple arguments in favour of doing so, how would we then take into account the editor who may have changed the content? And should we look at the newspaper more broadly, to see how its editorial policies and control might influence the credibility of the newspaper as a whole?

The latter again is tricky; there are plenty of newspapers with a disputable overall reputation, which however does not affect the credibility of all their journalists. A case in point may be one of the largest tabloids in the Netherlands, which generally scores quite low on credibility, yet had one of the best-informed financial crime journalists. Personally I believe you cannot evaluate a newspaper article without including the journalist in that evaluation; I am less concerned about the editor and the newspaper.

Evaluating the information from a newspaper article along the lines of the Admiralty Code methodology is even trickier. More often than not, articles contain multiple distinct pieces of information. Take for example a recent OCCRP article on a proxy who assists a sanctioned Russian oligarch in hiding his assets. The article contains multiple pieces of information which – if we follow the Admiralty Code methodology correctly – should all be evaluated separately. With articles so rich in detail, it makes little sense to assign a score to the article as a whole.
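As an illustration, here is a short Python sketch of what per-claim evaluation could look like. The claims and ratings below are invented placeholders, not quotes or scores taken from the OCCRP article.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str
    source_reliability: str   # Admiralty A-F
    info_credibility: str     # Admiralty 1-6

# Invented placeholder claims: each gets its own rating, and no single
# score is assigned to the article as a whole.
article_claims = [
    Claim("Proxy X holds shares in company Y", "B", "2"),
    Claim("Company Y is the registered owner of yacht Z", "B", "3"),
    Claim("Yacht Z is ultimately controlled by the sanctioned oligarch", "C", "3"),
]
```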

In fact, if we apply the evaluation methodology strictly, we should treat the article as a secondary source, identify the primary sources for each piece of information mentioned, acquire the information from those primary sources, and only then apply the evaluation. That is of course cumbersome, and impossible when unnamed human sources are quoted by the journalist. Moreover, investigative journalism articles often combine information from different sources into a single revelation.

All in all, it is not impossible to use the principles of the Admiralty Code to evaluate newspaper articles; it does, however, require a bit more thinking and some choices to be made. To explore this further, in the next section we look at a different type of open source.

ADS-B Exchange

Nowadays the world of open sources is much richer than newspaper articles, and the evaluation of different sources may require different approaches. Let’s look at databases and use the ADS-B Exchange as an example.

The ADS-B Exchange advertises itself as the world’s largest cooperative of ADS-B feeders and the world’s largest public source of unfiltered flight data. Like similar flight data providers, such as Flightradar, the ADS-B Exchange collates ‘ADS-B Out’ data and MLAT data collected by volunteers all over the world. ADS-B Out data is broadcast by transponders in aircraft and includes the identification, position, heading, altitude and velocity of the aircraft. (NB: I will forgo an explanation of the MLAT data here, as for the purpose of this discussion there is no difference in the challenges of evaluation.)
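For readers less familiar with the data, the sketch below shows roughly what a single decoded ADS-B Out record contains. The field names and the example values are illustrative; they do not reflect the exact schema used by the ADS-B Exchange.

```python
from dataclasses import dataclass

@dataclass
class AdsbOutRecord:
    icao_address: str       # 24-bit aircraft address, e.g. "4B1814" (made-up example)
    callsign: str           # flight identification broadcast by the aircraft
    latitude: float         # position
    longitude: float
    heading_deg: float      # heading in degrees
    altitude_ft: int        # altitude in feet
    ground_speed_kt: float  # velocity in knots
    received_utc: str       # time the message was picked up by a volunteer feeder
```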

Flight safety is the primary function of the ADS-B system, which might eventually replace radar for air traffic control. By collecting and collating this open data worldwide, however, the ADS-B Exchange has also become an interesting open source for historical flight data. Technically the collection itself can be qualified as SIGINT, as these signals are captured from the air. In this case, because the data is collected and then collated into a database made publicly available on the ADS-B Exchange website, collecting data from that website could be qualified as part of OSINT activities.

Irwin and Mandel argue that current source reliability methods fail to distinguish between evaluating subjective and objective sources (e.g., human sources versus sensors), as well as between primary sources and secondary/relaying sources (2019: 507). And they did not even consider the additional challenges that open sources bring along, such as the fact that this particular source combines objective sources (i.e., the captured signals) with subjective sources (i.e., the human operators involved in capturing, collating and interpreting the data).

From an open source collection perspective, the ADS-B Exchange database, where the data was collated into information for the first time, is the primary source. However, the reliability of this source (the database) depends on the volunteers who captured the data, the database administrators and even those who wrote the scripts that automatically feed the data into the database. Are they all trustworthy? Are they all competent? Did they not make a mistake or – worse – filter something out? Trusting a database and other automated systems essentially means that we need to trust the people who build, maintain and operate these sources. And the question is whether the Admiralty Code methodology is still suitable for evaluating such sources.

We encounter similar complexity when we want to evaluate the information from the ADS-B Exchange according to the Admiralty Code methodology. The collected ADS-B data results in information about flights taken by planes, which can show – to use a recent example – the movements of Russian oligarchs. The data on which this information builds originates from transponders in the planes, which in turn use data from GPS (or another type of high-precision satellite navigation) and then broadcast it. In a rudimentary drawing, this leads to the following situation:

Rudimentary schematic of how ADS-B Exchange works

Planes can be followed in real time on the ADS-B Exchange website, and those with access can also obtain historical data. If we now want to evaluate the credibility of that information, we need to understand how this information was compiled and that there are multiple points in the process where data could have been altered.

For example, we know that GPS signals can be jammed, which Russia and possibly others are doing. The transponder(s) could also be tampered with or (temporarily) switched off, as for example frequently happens with the AIS transponders of ships involved in illegal oil trade. The latter seems less likely for transponders in planes, as the safety risks are much larger, especially in areas with dense air traffic. Nonetheless the possibility exists, and just as drug smugglers have been trying to avoid radar for decades, I would not be surprised if transponders are being switched off in areas with less dense air traffic.

Evaluation of the provenance is therefore an essential addition to the Admiralty Code methodology when evaluating the credibility of the information. Without understanding the distance between the source and the origin of the information, and via which path and nodes that information has reached the source from which it is collected, the credibility of the information cannot be evaluated properly.
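A minimal sketch of what recording such provenance could look like, using the ADS-B example. The node names, roles and ‘possible alteration’ labels are my own working assumptions, not an existing standard or the actual architecture of the ADS-B Exchange.

```python
from dataclasses import dataclass, field

@dataclass
class ProvenanceNode:
    name: str
    role: str
    possible_alteration: str

@dataclass
class DataPoint:
    value: str
    provenance: list[ProvenanceNode] = field(default_factory=list)

# Hypothetical provenance chain for a single recorded aircraft position.
flight_position = DataPoint(
    value="lat 52.31, lon 4.76 at 10:42 UTC",
    provenance=[
        ProvenanceNode("GPS constellation", "origin", "jamming or spoofing"),
        ProvenanceNode("aircraft transponder", "broadcaster", "switched off or tampered with"),
        ProvenanceNode("volunteer receiver", "feeder", "collection errors, local filtering"),
        ProvenanceNode("feeder scripts", "relay", "software bugs, transformation errors"),
        ProvenanceNode("ADS-B Exchange database", "collected source", "collation errors, removal"),
    ],
)
```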

In this particular example, potentially missing information – possibly as a result of denial – is also a relevant issue. Due to the way the system works, it is not very likely that incorrect flight paths are recorded: there are multiple feeders in different areas around the world, so there is sufficient redundancy in the actual collection. There are, however, quite a few actors trying to keep their movements private, and commercial databases generally allow information on flights to be removed for a fee. Therefore, when evaluating the credibility of this type of information, completeness is a very relevant determinant. In the Admiralty Code methodology, however, completeness is generally not one of the determinants used when evaluating the credibility of information.

In sum

The two examples discussed above give a first glimpse of the complexity of a proper evaluation of open sources and the information derived from them. Of course, in many cases the Admiralty Code methodology is still relevant for open sources too, if only to create the awareness that sources and information need to be evaluated and not taken at face value.

However, the above considerations suggest that supplementing the Admiralty Code with the determinant ‘authenticity’ for sources and ‘completeness’ for information, as well as an additional score for the provenance of the information, may be needed to obtain a thorough understanding of the value of information from open sources.
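As a working hypothesis, such an extended evaluation record could look roughly like the sketch below. The scales for the added determinants are placeholders of my own, not an established standard.

```python
from dataclasses import dataclass

@dataclass
class ExtendedEvaluation:
    # classic Admiralty Code determinants
    source_reliability: str   # A-F
    info_credibility: str     # 1-6
    # proposed additions for open sources (illustrative scales)
    source_authenticity: str  # e.g. "confirmed" / "unconfirmed" / "forged"
    info_completeness: str    # e.g. "complete" / "gaps suspected" / "known removals"
    provenance: str           # e.g. "traced to origin" / "partially traced" / "unknown"
```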

The above thoughts are part of a work in progress, so comments are more than welcome.

References

Irwin, D. and D. Mandel (2019) ‘Improving information evaluation for intelligence production’, Intelligence and National Security, Vol. 34(4): pp. 503-525

(Photo credit: @mikevanschoonderwalt via Pexels)