How much photo data does Facebook really have?

According to a post by a Facebook Photos engineer, they receive around 200 million photo uploads per day, or about 6 billion per month. A separate post says Facebook currently hosts 4% of all photos ever taken: specifically, 140 billion photos out of an estimated 3.5 trillion taken in history. We also read that “it is estimated that 2.5 billion people in the world today have a digital camera. If the average person snaps 150 photos this year that would be a staggering 375 billion photos. That might sound implausible but this year people will upload over 70 billion photos to Facebook, suggesting around 20% of all photos this year will end up there. Already Facebook’s photo collection has a staggering 140 billion photos, that’s over 10,000 times larger than the Library of Congress.”
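For the curious, the figures above are easy to sanity-check. The short Python sketch below simply redoes the arithmetic using the numbers as quoted; the variable names are my own, and the 30-day month and 365-day year are simplifying assumptions rather than anything stated in the posts.

```python
# Figures as quoted in the posts above (not independently verified).
uploads_per_day = 200_000_000
photos_on_facebook = 140_000_000_000
photos_ever_taken = 3_500_000_000_000
camera_owners = 2_500_000_000
photos_per_person = 150
facebook_uploads_this_year = 70_000_000_000

# Assumption: a 30-day month and a 365-day year.
uploads_per_month = uploads_per_day * 30
uploads_per_year = uploads_per_day * 365

# Share of all photos ever taken that Facebook hosts.
share_of_all_photos = photos_on_facebook / photos_ever_taken

# Estimated photos taken this year, and the share that lands on Facebook.
photos_this_year = camera_owners * photos_per_person
share_of_this_year = facebook_uploads_this_year / photos_this_year

print(f"Uploads per month: {uploads_per_month:,}")
print(f"Uploads per year: {uploads_per_year:,}")
print(f"Share of all photos ever taken: {share_of_all_photos:.1%}")
print(f"Photos taken this year: {photos_this_year:,}")
print(f"Share of this year's photos: {share_of_this_year:.1%}")
```

The numbers hang together reasonably well: the daily upload rate works out to roughly 73 billion photos a year, in line with the "over 70 billion" claim, and 70 billion out of 375 billion is a little under 19%, which the quote rounds to "around 20%".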

Whatever the exact number, there has never been a larger concentration of user-generated photo data in one spot, easily accessible and collected in the hope of making relationships between users more meaningful, simple and relevant. That makes it a boon for facial recognition training efforts, which can use it to refine the fidelity of their datasets and lower the number of false positives they generate. A related boon comes from correlating the increasingly accurate data with other data points in the set, which raises accuracy dramatically, especially over time.

So what does it mean? If other users tag you in photos, those tags combined with facial recognition make it very easy to reverse engineer who you are. Presumably this could lead to reduced fraud, but it could also enable nefarious third-party data miners to generate increasingly real-sounding identities that you would likely trust.

Social engineering has been a cornerstone of online scamming for decades. There were tricks to gain access to dial-up accounts, phone cards, etc., but it was mostly a hit-and-miss, targeted exercise. With the huge datasets now available online, automated social-engineering opportunities may start to hit the streets, offering criminals increasingly easy access (through you) to sensitive data as they spearphish their way into the deeper recesses of your organization’s data, definitely places they should not be prying.

As the social media boom continues, many similar photo-relational systems will be implemented by default, causing concern among users who would prefer an opt-in system instead. There will also be booms in online reputation management systems, efforts that attempt to alert you and show just how much information can be relatively easily gathered about you (like people aggregators), hopefully allowing you to manage a bit of the data sprawl. Either way, protecting your identity, and therefore guarding against social-engineering scams, will become increasingly tricky with time.

In the 1980s we had the A-Team dressing as janitors and infiltrating myriad organizations without any shots being fired (at least until the car chase scenes). We will see a flurry of similar attacks, this time by scammers with buckets of relevant information about a potential target, so you may want to think seriously about guarding your reputation now. It’s easy to envision blackmail attempts in the future, starting with a phone message that gives very specific details about you and threatens to expose more to your employer unless you send money now to xyz organization to keep them from spilling an alleged goldmine of personal data. Such messages will sound very convincing and very scary, and will become increasingly difficult to detect and stymie.

Cameron Camp
ESET Research Systems Manager
