Voter data left exposed on open internet-facing system

A data analytics firm contracted by the Republican National Committee (RNC) left a compilation of sensitive personal information of more than 198 million US citizens exposed on an open internet-facing system.

Published: July 05, 2017

Introduction

At a time when electoral processes have been challenged and concerns have been raised regarding the integrity of the latest US elections, a data analytics firm contracted by the Republican National Committee (RNC) left a compilation of sensitive personal information of more than 198 million US citizens exposed on an open internet-facing system. The 1.1 TB of data included names, dates of birth, home addresses and phone numbers, as well as politically related data and data described as “modelled” voter ethnicities and religions. All data were compiled from various sources for large-scale analytics operations by the aforementioned data analytics firm and at least two more RNC contractors.

This note contextualises the voter data exposure in terms of known cyber-threat types as these have been assessed in various reports (e.g. ENISA Threat Landscape (ETL)). Moreover, it highlights the security risks introduced when private information gets publicly exposed on internet-facing systems. Finally, this note underlines the importance of the human factor as a security risk, notably for governments, companies and organisations: previous assessments show that the majority of breaches are due to non-deliberate human actions that open doors to various types of attacks.

What happened?

On 12 June 2017, Chris Vickery, a cyber security risk analyst, discovered the unsecured repository containing the voter data on a cloud service. Vickery deduced that the massive collection of data he encountered had been prepared for complex analysis. He notified federal authorities of his findings, and on 14 June 2017 the data analytics firm secured the repository and took responsibility for the issue (the statement has since been removed from its website). It is uncertain whether anyone managed to retrieve the data before the company secured the repository. According to the firm’s statement, an internal review is being conducted and has so far concluded that the data repository had been exposed on the internet since 1 June 2017, and that the firm’s systems have not been compromised.

This is not the first time that voter data has been exposed on the internet. Vickery has previously identified two more cases, but the latest case is one of the largest, if not the largest, known data exposures of its kind. Besides voter data, other critical, personal or corporate data has been left exposed on internet-facing systems in the past. In some cases, the information leakage eventually led to data breaches.

Information about the Threat

The exposure of the US voter data is a case of an “Information leakage”[1] threat. The exposure seems to have been caused by negligence or human error, i.e. misconfiguration of the database or a lack of protection measures to restrict public access to the data repository. Thus, it is most probably attributable to an “insider”[2] threat agent, irrespective of the intention behind it. The most prevalent risk that follows this case of information leakage is the “Data Breach” threat, that is, the malicious exposure of leaked data with the aim of performing criminal actions.

If the data falls (or has fallen) into the wrong hands, “Identity Theft” is another threat that poses significant risk and is closely related to “Information leakage”. Access to such data can allow different threat agents to perform more targeted attacks, e.g. profiling (for example, based on political beliefs), marketing campaigns, fake news campaigns, spear-phishing and spam. These can have various repercussions for an individual or group, such as discrimination, exclusion and manipulation of public opinion.

It is also worth mentioning the risk of legal repercussions that the aforementioned threats entail, and the risk of breaching data protection regulations, for both companies and organisations.

The following figure shows how the identified threat interconnects with other known cyber-threats and forms a dynamic threat landscape. Contextualising a given issue or incident in terms of known cyber-threats provides a broader view of the issue and allows recommendations to be drawn from different threat assessments. Furthermore, it provides insight into identified and potential threat agents.

Reduce the Security Risk

Data breaches already have a substantial financial cost to organisations and companies, although in several cases following best security practices is all that is required to avoid them. The following recommendations also draw on mitigation vectors proposed by the ETL:

Do not host any personal, sensitive or corporate information on internet-facing systems that have not been adequately secured. Internet-exposed services must be properly configured and firewall-protected in advance, as default configurations can potentially allow access without any authentication. Restricting data access rights according to the principle of least privilege is advised. Besides security researchers, various threat actors (e.g. cyber criminals, nation states, etc.) are actively looking for unprotected low-hanging fruit on the internet. Hence, hosting such data online without even the simplest protection measures is problematic, to say the least.
Perform data classification to assess and reflect the level of protection needed according to data categories. Every company or organisation needs to perform a security and risk assessment in order to identify its critical assets and apply the proper security mechanisms to protect them.
Outsource cloud services properly. Companies and organisations that choose to outsource cloud services to third parties need to make sure that the third party has the right policies, personnel, infrastructure and security mechanisms in place to effectively protect their data before entrusting the data to it.
Encrypt sensitive data, both in transit and at rest. Private data that is hosted online should be protected with strong encryption both while in transit and when at rest. In that way, the data should remain protected even in the case of a data breach.
Use best security practices. The use of strong unique passwords for user accounts, 2-factor authentication (2FA), secure password storage, as well as the monitoring of private networks for suspicious or malicious activity, are just some of the basic security practices to be followed in any environment.
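
As an illustration of the “secure password storage” practice mentioned above, the following is a minimal sketch in Python using only the standard library. It assumes PBKDF2-HMAC-SHA256 with a random per-user salt; the iteration count and the function names `hash_password` and `verify_password` are illustrative choices, not something prescribed by the ETL, and a real deployment should follow current guidance for its own environment.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed parameters for this sketch; tune them to current guidance and hardware.
PBKDF2_ITERATIONS = 600_000
SALT_BYTES = 16
KEY_BYTES = 32


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Derive a salted hash so the plaintext password is never stored."""
    salt = os.urandom(SALT_BYTES)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=KEY_BYTES
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=KEY_BYTES
    )
    return hmac.compare_digest(candidate, expected)
```

Only the salt and the derived digest would be stored. A leaked credential table then does not directly reveal user passwords, although weak passwords remain guessable.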

Be compliant with personal data protection regulations. The exposure of private data underscores the responsibilities owed by private corporations and organisations. Moreover, the processing of data collected from various sources for uses beyond their initial collection purpose and scope, and without the explicit consent of the individuals whom the data is about, is another serious issue. Companies and organisations need to pay attention to, and be compliant with, the underlying EU data protection legal framework, and in particular the new GDPR, which applies from May 2018.
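
The first recommendations above (no unauthenticated internet-facing access, least privilege, encryption at rest) can be sketched as a simple configuration audit. The example below is hypothetical: the `StorageConfig` fields and the `audit` function are assumptions made for illustration and do not correspond to any particular cloud provider's API or to the configuration involved in this incident.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StorageConfig:
    """Hypothetical description of one data repository (illustrative fields only)."""
    name: str
    internet_facing: bool
    requires_authentication: bool
    encrypted_at_rest: bool
    contains_personal_data: bool
    allowed_principals: List[str] = field(default_factory=list)


def audit(config: StorageConfig) -> List[str]:
    """Return human-readable findings; an empty list means no issue was detected."""
    findings = []
    if config.internet_facing and not config.requires_authentication:
        findings.append("reachable from the internet without authentication")
    if "*" in config.allowed_principals:
        # A wildcard grant contradicts the principle of least privilege.
        findings.append("access granted to everyone (wildcard principal)")
    if config.contains_personal_data and not config.encrypted_at_rest:
        findings.append("personal data stored without encryption at rest")
    return findings


if __name__ == "__main__":
    exposed = StorageConfig(
        name="analytics-repository",  # hypothetical name
        internet_facing=True,
        requires_authentication=False,
        encrypted_at_rest=False,
        contains_personal_data=True,
        allowed_principals=["*"],
    )
    for finding in audit(exposed):
        print(f"{exposed.name}: {finding}")
```

A check of this kind only catches the misconfigurations it is told to look for. It complements, rather than replaces, the security and risk assessment recommended above.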

[1] As described in ETL, “Information leakage is a category of cyber-threats abusing weaknesses of run-time systems, of components configuration, programming mistakes and user behaviour in order to leak important information”.

[2] According to ETL, “one of the main actors to threaten their organisations, both intentionally and unintentionally. Intention, negligence and error are the three sources of threats attributed to this group, intention is source of the fewer incidents. Most typical are violations of existing security policies through negligence and user errors”.