Unprotected Database Leaks the Résumés of 200 Million People in China

Security experts keep urging users to take their online safety more seriously. For many people, however, following all that advice is too much work, and they stick to the same poor practices. This is good news for hackers, who have little trouble breaking into other people's online accounts and stealing data. You could even say that users are often asking for it, especially now that more and more easy-to-use tools can protect them. Sometimes, however, people are not to blame because things are beyond their control. Security researcher Bob Diachenko stumbled upon one such case at the end of last year.

A massive database left out in the open

Diachenko, who works for bug bounty platform HackenProof, was conducting threat intelligence research on December 28 when he noticed an unprotected MongoDB database. The first striking thing about it was its size: 854 GB. After he opened it, however, he was in for an even bigger shock.

The database contained more than 200 million CVs of Chinese job seekers. Each résumé exposed information on the victim's skills and work experience as well as personal details such as email address, phone number, weight, height, marital status, driver's license, literacy level, and salary expectations.

All this was stored on the internet and was accessible without any form of authentication.

The source of the data remains unknown

Diachenko immediately set out to find the people responsible and remind them why putting sensitive data behind a password is a good idea. This proved to be more difficult than he first imagined.

After asking his Twitter followers for help, he was tipped off about a data scraping tool that was available on GitHub at the time and organized the information in a very similar way. At first, it looked like at least some of the data had come from a website called bj.58.com, but the site's owners assured him that another portal had been the source.

We still don't know whether the data scraping tool that put the information in the database is legal or not, and we also have no idea who the database actually belongs to. What we do know is that both were taken offline shortly after Bob Diachenko first announced his findings on Twitter.

Unfortunately, it was too late

Before the database went down, Diachenko managed to examine the MongoDB logs and saw that "at least a dozen IPs" might have accessed the data. We can only hope that the owners of these IPs don't have any malicious intentions because if they do, the consequences could be quite severe for the people who had their data exposed.

The worst part is that it was all avoidable. A simple configuration setting and a strong password would have made the information inaccessible to the world. This incident really goes to show how easily things can go wrong on the Internet.
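To give a rough idea of what such a fix might look like, here is a minimal Python sketch. It is not taken from Diachenko's report, which does not say which setting was missing. We are assuming the usual MongoDB hardening options, `security.authorization: enabled` and a restricted `net.bindIp`, shown only as comments. The helper names `generate_password` and `build_uri` and the user name `resume_admin` are hypothetical and exist purely for illustration.

```python
import secrets
import string
from urllib.parse import quote_plus

# Assumed mongod.conf settings (not confirmed by the report):
#   security:
#     authorization: enabled   # require clients to authenticate
#   net:
#     bindIp: 127.0.0.1        # do not listen on public interfaces


def generate_password(length: int = 32) -> str:
    """Return a random password built from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets uses a cryptographically secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


def build_uri(user: str, password: str,
              host: str = "127.0.0.1", port: int = 27017) -> str:
    """Build a MongoDB connection URI with percent-encoded credentials."""
    # Characters such as '@', ':' and '/' must be escaped in the URI.
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource=admin"
    )


if __name__ == "__main__":
    # "resume_admin" is a hypothetical account name.
    print(build_uri("resume_admin", generate_password()))
```

With authentication switched on, a client that connects without valid credentials should no longer be able to read the collections, which is the protection the exposed database apparently lacked.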

By Duran

January 15, 2019
