1.2 Billion Private Data Records Have Been Found on an Unsecured Server

1.2 billion people affected by a data leak

Those of you with an active interest in cybersecurity won't be surprised to learn that researchers have disclosed yet another wide-open server holding a massive amount of personal information. This time, the scale of the leak is mind-boggling, but what is even more disappointing is that, once you learn exactly what happened, you'll see just how inevitable this incident was.

Researchers discover 4TB of personal information on an unsecured Elasticsearch server

On October 16, Vinny Troia and Bob Diachenko stumbled upon an Elasticsearch server that was not protected by a password and was accessible to anyone who had a browser and knew where to look. The two are not new to this sort of thing. In fact, Bob Diachenko, in particular, is responsible for the disclosure of quite a few similar leaks. Even he was rather shocked by the size of the exposed data in this particular instance, though.

The database weighed in at a whopping 4TB, and it held a massive 4 billion accounts. There were quite a few duplicates, but even after removing them, the researchers were looking at the personal records of over 1.2 billion individuals. The indexes in the database weren't uniform, and the exposed data varied from record to record. After processing the information, the experts found out that the open Elasticsearch server held, among other things:

  • More than 1 billion personal email addresses.
  • More than 400 million phone numbers.
  • More than 420 million LinkedIn URLs.
  • More than 1 billion Facebook URLs and account IDs as well as other data related to users' social media presence.
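To give a rough sense of what removing duplicates and tallying non-uniform records can involve, here is a minimal Python sketch. The sample records, the field names (email, phone, linkedin_url), and the choice of a lowercased email address as the deduplication key are all assumptions made for illustration; they are not the researchers' actual data or method.

```python
from collections import Counter

# Hypothetical sample records; real entries varied in which fields they held.
records = [
    {"email": "ann@example.com", "phone": "555-0100"},
    {"email": "Ann@Example.com", "linkedin_url": "https://www.linkedin.com/in/ann"},
    {"email": "bob@example.com"},
    {"phone": "555-0199"},  # no email, so it cannot be merged by this key
]


def merge_by_email(rows):
    """Merge rows that share an email address (case-insensitive).

    Rows without an email are kept as separate entries.
    """
    merged = {}
    unkeyed = []
    for row in rows:
        email = row.get("email")
        if email is None:
            unkeyed.append(dict(row))
            continue
        # Assumed deduplication key: the lowercased email address.
        profile = merged.setdefault(email.lower(), {})
        # Later rows only add fields that are still missing for this person.
        for field, value in row.items():
            profile.setdefault(field, value)
    return list(merged.values()) + unkeyed


def field_counts(rows):
    """Count how many rows contain each field."""
    return Counter(field for row in rows for field in row)


unique_people = merge_by_email(records)
counts = field_counts(unique_people)
print(len(records), "raw records ->", len(unique_people), "unique entries")
print(dict(counts))
```

Because the records are not uniform, the per-field totals differ from the number of unique entries, which is why the figures in the list above do not add up to a single headline number.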

The database didn't contain any credit card details, Social Security numbers, or passwords, but affected individuals should still be on the lookout for signs of identity theft and fraud. Diachenko and Troia shared the leaked data with Troy Hunt, who loaded it into the Have I Been Pwned data breach alert service, which means that you can go there and check whether you have been affected by the leak.

Needless to say, as soon as they discovered the information, the security researchers took the necessary steps to bring it offline. The FBI was informed, but before the law enforcement agencies could take action, the database was brought down, presumably by its owner. It's impossible to say when the data appeared on the Elasticsearch server for the first time and who accessed it while it was exposed.

Who is to blame?

Every single one of the records in the database had a field labeled "source", and the value in it was either "PDL" or "Oxy". "PDL" stands for People Data Labs, and "Oxy" comes from Oxydata. People Data Labs and Oxydata are the two data enrichment companies that collected all these records.
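As an illustration of how a label like this lets an analyst attribute records to their origin, the short Python sketch below groups a handful of records by their "source" value. Only the field name and its two values come from the report; the sample records and the other field names are hypothetical.

```python
from collections import Counter

# Mapping from the labels reported in the database to the company names.
SOURCE_NAMES = {"PDL": "People Data Labs", "Oxy": "Oxydata"}

# Hypothetical records; only the "source" field and its values are from the report.
records = [
    {"source": "PDL", "email": "ann@example.com"},
    {"source": "Oxy", "linkedin_url": "https://www.linkedin.com/in/bob"},
    {"source": "PDL", "phone": "555-0100"},
]

# Tally how many records carry each source label.
by_source = Counter(record["source"] for record in records)

for label, total in by_source.items():
    print(f"{SOURCE_NAMES[label]}: {total} record(s)")
```

A tally like this shows where the data was originally collected, but it says nothing about who assembled the combined database or left it exposed.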

The business of a data enrichment company revolves around collecting as much publicly available information about you as possible and creating a detailed profile based on what it finds. This profile, along with millions of others, is then sold to anyone willing to pay a predetermined fee. People Data Labs and Oxydata did indeed collect the information. This doesn't mean that they leaked it, though.

After discovering the leak, Vinny Troia shared his findings with Wired's Lily Hay Newman, who reported the exposure and contacted People Data Labs and Oxydata to ask what they thought of it. The two companies admitted that they might be the ultimate source of the information in the database, but both insisted that they hadn't suffered a data breach.

In all likelihood, a customer of People Data Labs and Oxydata paid for all this information, put it in a single database, and left it on the misconfigured Elasticsearch server. Martynas Simanauskas, an Oxydata representative, told Wired that his company has agreements with its clients designed to ensure that the data is processed securely. Even he admitted, however, that once the client has the information, the options for preventing misuse are more or less non-existent.

This was the main talking point in Troy Hunt's blog post dedicated to the exposure. The unfortunate fact of the matter is that data enrichment companies like People Data Labs and Oxydata will continue to scrape our personal information from wherever they can find it. They'll also continue to sell it, and the people and organizations who pay for it will inevitably leave it exposed every now and then. Whether we like it or not, our data is collected, organized, and copied many times, and save for unplugging the Ethernet cable and living like it's 1960 again, there is more or less nothing we can do about it. Considering all this, the fact that this particular leak didn't happen sooner is actually quite surprising.

November 27, 2019
