Reasonably Accurate 🤖🧠 Transcript
Morning, everybody. How are you doing today? In this episode of the show, I wanted to talk to you about the Facebook 10 Year Challenge. The reason I wanted to discuss this issue a little more in depth is that it's popped up in normal conversation.
And I mean that quite literally: around town, I'm hearing it pop up, and people are discussing it. It's coming across in mainstream articles. It's not something that's just niche, because obviously it's a big social media phenomenon right now.
But the flip side, the question of whether this is something more sinister, is actually a talking point. And I think this is a wonderful, wonderful thing. Now, let's get a couple of things out of the gate real quick. What is this challenge?
Basically, real simple: one side of the photo that you post is you in 2009. The other is you this year, in 2019. The idea is to show the difference or, ideally, the lack of difference between the two photos.
So people post these photos and share them around. The undercurrent is that maybe this is a data set that's being collected for something a little more sinister, around helping computers identify how people age.
It seems like the genesis of this idea was a simple tweet from Kate O'Neill. She's an entrepreneur, a fantastic speaker, and a sometime contributor to Wired magazine. She fully admits it was an off-the-cuff, sort of sarcastic tweet, and it was, you know: 2009 Kate would be all over this; 2019 Kate is asking, is this really just an idea to get a tagged data set?
Now, the question I was asked a few times this week was: why is this important? Aren't computers really good at facial recognition? The answer is yes, they are very good at facial recognition. Where we're moving to now in the state of computer science is identifying whose face that is, and how that face changes in different situations and over time.
So, whose face that is: that's very simple. You need to identify it against another data source. This is what social media, things like Facebook, are really, really good at, right? It's a really strong point for them, because they know this is Mark's profile and there are pictures that he has said are of himself.
So, for those who aren't familiar (it actually just struck me as I noticed the monitor here), facial recognition technology has been widely deployed for a very, very long time, and it's not always bad.
This camera is actually automatically detecting my face to maintain autofocus, right? Really great and useful thing. I unlock my iPad and my phone with my face as well. It's not necessarily a bad technology, and we'll get to the implications in a minute here.
So this is the state of computer science: where we're at is that we can identify faces very, very easily. If you can link them to another data set, that means you can identify people. The challenge is in less-than-optimal situations.
So right now I'm lit by a key light. I'm very close to the camera. I'm the only subject here. I'm looking directly in. This is the ideal circumstance for facial recognition. But what if I'm in a busy crowd and you can only see part of my face?
It gets less accurate because fewer distinct features are visible. And where things really, really, really break down is over time. This was demonstrated by Microsoft in 2015. They deployed a tool that basically said, hey, upload your photo and we'll tell you how old you are.
And it had somewhat laughable results, but sometimes it got it right. That's because there was a lack of data at that point. It's not a clear data set. And this is sort of the second misconception: people assume machine learning is really, really easy to do.
And the models are not super difficult to construct in the grand scheme of things. It's obviously a relative level of difficulty. The challenge is getting really good data. So if you're asking a computer to figure out the difference between a cat and a dog, it needs samples. It needs to know what attributes make up a cat or a dog or, better yet, it needs a fully curated set of photos labeled "this is a cat, this is a dog, this is a cat, this is a dog" so that it can learn from that. And it's the same thing with faces.
Now, there are a lot of great, easily accessible data sets around facial recognition, two-dimensional and three-dimensional, but there are not a lot of great ones on age. And this was sort of the gist of Kate's humorous tweet: hey, what if this is just a big experiment to get us to do the work for them?
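To make the point about labeled data concrete, here is a minimal sketch in Python. It is not how any real photo product works: the two features (weight and "ear pointiness") and the sample values are invented for illustration, and a nearest-centroid rule stands in for a real model. The only thing it is meant to show is that the learning step depends entirely on the "this is a cat, this is a dog" labels.

```python
from collections import defaultdict
from math import dist

# Hypothetical, hand-made features: (weight in kg, ear pointiness from 0 to 1).
# Each sample carries the label a person assigned to it.
labeled_photos = [
    ((4.0, 0.9), "cat"),
    ((5.0, 0.8), "cat"),
    ((20.0, 0.3), "dog"),
    ((30.0, 0.2), "dog"),
]


def train(samples):
    """Average the feature vectors for each label (one centroid per label)."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new sample."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(labeled_photos)
print(predict(centroids, (4.5, 0.85)))   # cat
print(predict(centroids, (25.0, 0.25)))  # dog
```

Without the labels there is nothing to average per class, which is why a curated, tagged set of photos is the hard part rather than the model.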
That was vehemently denied by Facebook, and I 100% believe them. I don't think they started this at all. I don't think there's an ulterior motive. I think this is just a fun New Year's resolution type of thing: hey, look at me a decade ago, think of what I'll accomplish this year.
But the premise of crowdsourcing data sets is not unheard of. We've all taken the CAPTCHAs on websites: the "I am not a robot" checkbox, and it pops up with a bunch of images. It used to be text that you had to go through and verify.
And now it's images. These are training data sets. This is the reCAPTCHA service; I believe Google owns it now. They were verifying Google's scanning efforts: were these OCR mistakes? Was this actually the word "the", yes or no? And now we're doing the same with image recognition: we're tagging these things for them. So when they ask, hey, can you show us all the pictures that contain a fire hydrant, that's confirming results or providing new ones, and helping tag the data.
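As a rough illustration of how crowdsourced answers can turn into labels, here is a small Python sketch. This is an assumption about the general idea, not a description of how reCAPTCHA actually works: the image IDs, the votes, and the `min_votes` and `min_agreement` thresholds are all hypothetical.

```python
from collections import Counter

# Hypothetical answers: image id -> what each person clicked when asked
# "does this picture contain a fire hydrant?"
answers = {
    "img_001": [True, True, True, False],
    "img_002": [False, False, True],
    "img_003": [True, False],
}


def aggregate(votes, min_votes=3, min_agreement=0.66):
    """Return the majority answer, or None when there is too little agreement."""
    if len(votes) < min_votes:
        return None  # not enough people have looked at this image yet
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


labels = {image: aggregate(votes) for image, votes in answers.items()}
print(labels)  # {'img_001': True, 'img_002': False, 'img_003': None}
```

Images that come back as `None` would simply be shown to more people, which is the "helping tag the data" loop in miniature.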
So the whole premise that the 10 Year Challenge was somehow a collection for a data set is not crazy. It's pretty straightforward. But if I put on my cynical hat and try to go, OK, well, what's the advantage here?
It's one extra attribute in an already existing technology and use case, right? People are already using facial recognition; the horse has already left the barn. It's now a question of whether we can have discussions around proper use and acceptable use: under what circumstances it's not acceptable, at what scale, things like that.
And that needs to come out into the open. We need to have those conversations as a community: as a physical geography, as a local community, as a larger internet community.
But from the aging angle, I couldn't find any threats that were really there. I did find one extremely useful use case. If you have any collection of photos, you'll see that, over time, the photo tools really have a hard time detecting that this was, in fact, you.
You only really see that over decades, when you search for your face after you've tagged a bunch. So it will tag the most recent years, because it's got pretty good confidence that that's you.
And then when you start to go back a few years, it'll keep building that confidence to keep reaching back into the past. Having a data set that shows common aging patterns, some attributes, and how faces evolve over time would help with tools like that.
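The "reaching back into the past" behavior can be sketched in a few lines of Python. Everything here is assumed for illustration: the two-number "embeddings", the steady drift that stands in for aging, and the `THRESHOLD` value. Real photo tools use far richer features, and this is not a claim about how any of them are implemented.

```python
from math import dist

# Hypothetical face embeddings by year; a slow drift stands in for aging.
photos_by_year = {
    2019: (1.0, 1.0),
    2017: (0.9, 1.1),
    2015: (0.8, 1.2),
    2013: (0.7, 1.3),
    2011: (0.6, 1.4),
    2009: (0.5, 1.5),
}
THRESHOLD = 0.2  # assumed maximum distance for "confident this is the same person"


def tag_direct(photos, reference_year, threshold=THRESHOLD):
    """Tag only the photos that are close enough to one reference photo."""
    ref = photos[reference_year]
    return sorted(y for y, emb in photos.items() if dist(ref, emb) <= threshold)


def tag_chained(photos, reference_year, threshold=THRESHOLD):
    """Walk back in time, comparing each photo with the most recently tagged one."""
    tagged = [reference_year]
    latest = photos[reference_year]
    for year in sorted((y for y in photos if y < reference_year), reverse=True):
        if dist(latest, photos[year]) > threshold:
            break  # confidence is lost, so stop reaching further back
        tagged.append(year)
        latest = photos[year]
    return sorted(tagged)


print(tag_direct(photos_by_year, 2019))   # [2017, 2019]
print(tag_chained(photos_by_year, 2019))  # [2009, 2011, 2013, 2015, 2017, 2019]
```

Comparing everything against the 2019 photo only reaches 2017, while stepping back one confident match at a time reaches 2009. A data set of known aging patterns would, in principle, let a tool make the bigger jumps directly.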
A data set like that would also help with some special effects in Hollywood. A lot of the superhero blockbusters lately have had de-aged characters. This would all help with that. But from a threat-model perspective, I really couldn't detect much, because unless you can go completely off the grid and then come back on at some point, you're going to be generating a constant set of data, and this type of challenge is not really going to harm your profile, your privacy level, or anything like that.
So, an interesting thing to talk about. My biggest takeaway from this was that we're finally at a point where this is a mainstream conversation, and that's extremely exciting for someone like myself who works in security and privacy.
What do you think? Have you participated in it? Do you see a different impact? Did I miss something here? Let me know. Hit me up online @marknca; for those of you on the vlog, in the comments down below.
And, as always, for our podcast listeners and everybody else, by email at me@markn.ca. And instead of in 10 years, I will see you on the next show. I hope you're set up for a fantastic day. Talk to you soon.