A Server Was Not Protected with a Password, and Millions of Banking Documents Were Leaked
Another day, another exposed trove of sensitive data. This one, a database discovered on January 10 by security researcher Bob Diachenko, held 24 million records weighing in at over 51GB, and as soon as Diachenko saw what the records contained, he knew he had to act quickly.
The text had apparently been generated by an OCR solution. OCR stands for Optical Character Recognition, a piece of technology that is used to convert handwritten or printed documents into machine-readable text. In this particular instance, the OCR software had scanned mortgage and credit reports.
Because each record contained different parts of different documents, it's difficult to estimate the exact number of affected individuals. It's fair to say, however, that a lot of extremely sensitive information was left out in the open. The database contained names, addresses, phone numbers, credit history, Social Security numbers, etc. As Diachenko himself puts it, the database was "a gold mine for cyber criminals who would have everything they need to steal identities, file false tax returns, get loans or credit cards".
Who exposed the data?
It wasn't immediately apparent who the database belonged to, which is why Diachenko called Zack Whittaker from TechCrunch and asked for some help with the investigation. After a closer look, they found out that the documents dated as far back as 2008 and had been issued by several major financial institutions, including Wells Fargo, CapitalOne, and HSBC, as well as a couple of US federal departments.
Some more poking around later, they concluded that the information had been collected by Ascension – a Texas-based business that provides data analytics services to banks. Whittaker got in touch with Sandy Campbell, a representative of Ascension's parent company, who said that yes, the database did belong to the analytics business, but no, their systems hadn't been compromised.
Apparently, a vendor hired by Ascension made the blunder. Although Campbell refused to name the vendor, Zack Whittaker believes it's New York-based OpticsML. According to an archived version of its website (the live one is down for some reason), OpticsML offers OCR services "to some of the largest companies in the Mortgage space", so the scenario seems plausible. That said, nobody has officially confirmed the information.
The good news is that, after Citigroup, one of the financial institutions whose clients were affected, intervened, the database was taken down, and it has been inaccessible since January 15.
How was the data exposed?
Bob Diachenko didn't use a clever code injection, password brute-forcing, or any other type of hacking technique. Instead, he discovered the database with the help of publicly available search engines like Shodan and Censys, and when he got to it, there was no password to prevent him from looking at the data, downloading it, or possibly abusing it.
The data was left on an ElasticSearch cluster that was facing the internet without any form of protection whatsoever. It was accessible to anyone willing to look for it, and finding it was no more difficult than entering a query in a search engine.
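To make the point concrete, here is a minimal Python sketch of how an administrator might check whether their own cluster answers requests without credentials. It is an illustration under stated assumptions, not the method Diachenko used: the function names `classify_status` and `check_endpoint` are made up for this example, the address `http://localhost:9200` is a placeholder (9200 is ElasticSearch's default HTTP port), and the sketch assumes that a cluster with authentication enabled rejects anonymous requests with a 401 or 403 status. It should only be pointed at systems you own or are authorized to test.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough exposure verdict (hypothetical helper)."""
    if status == 200:
        return "open"        # the server answered without any credentials
    if status in (401, 403):
        return "protected"   # authentication or authorization is being enforced
    return "unknown"         # anything else needs a human to look at it


def check_endpoint(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to the given URL and classify the reply."""
    request = urllib.request.Request(base_url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx replies; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar network errors.
        return "unreachable"


if __name__ == "__main__":
    # Placeholder address: replace with a host you are authorized to test.
    print(check_endpoint("http://localhost:9200"))
```

A result of "open" means that anyone who can reach the address can read from the server, which is the situation described above. A result of "protected" only shows that anonymous requests are refused; it says nothing about how strong the credentials are.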
It's the latest in a very long line of data leaks that happen not because hackers are too smart or because software is too weak, but because vendors make simple and completely avoidable configuration errors when securing people's information. It actually comes just days after another unsecured ElasticSearch server exposed all sorts of sensitive data belonging to users of several online casinos.
Vendors must really up their game when it comes to keeping people safe. They have the tools needed to avoid at least some of these data-exposing incidents, and they have no excuse for not using them.