*36C3 preroll music*

Herald: OK. So, inside the fake like factories. I'm going to date myself: I remember it was the Congress around 1990, 1991 or so, where I was sitting together with some people who came over from the States to visit the CCC Congress. And we were kind of riffing on how great the internet was going to make the world, you know, how it was going to bring world peace, and truth would rule, and everything like that. Boy, were we naive. Boy, were we totally wrong. And today I'm going to be schooled in how wrong I actually was, because we have Svea, Dennis and Philip to tell us all about the fake like factories around the world. And with that, could you please help me in welcoming them onto the stage? Svea, Dennis and Philip.

Philip: Thank you very much. Welcome to our talk "Inside the Fake Like Factories". My name is Philip. I'm an Internet activist against disinformation, and I'm also a student at the University of Bamberg.

Svea: Hi. Thank you for listening to us tonight. My name is Svea. I'm an investigative journalist, freelancing mostly for NDR and ARD, which are public broadcasters in Germany. I focus on tech issues. And I had the pleasure of working with these two guys on what was for me a journalistic project and for them a scientific project.

Dennis: Yeah, hi everyone. My name is Dennis. I'm a PhD student at Ruhr University Bochum, working as a research assistant at the Chair for System Security. My research focuses on network security topics and Internet measurements. And as Svea said, Philip and I are here for the scientific part, and Svea is here for the journalistic part.

Philip: So here's our outline for today. First, I'm going to briefly talk about our motivation for our descent into the fake like factories. Then we are going to show you how we got our hands on ninety thousand fake like campaigns of a major crowdworking platform. And we are also going to show you why we think that there are 10 billion registered Facebook users today. So first, I'm going to talk about the like button. The like button is the ultimate indicator of popularity on social media. It shows you how trustworthy someone is. It shows how popular someone is. It is an indicator of the economic success of brands, and it also influences the Facebook algorithm. And as we are going to show now, these kinds of likes can be easily forged and manipulated.
But the problem is that many users will still prefer this bad info on Facebook about the popularity of a product to no info at all. And so this is a real problem, and there is no real solution to it. So first, we are going to talk about the factories and the workers in the fake like factories.

Svea: That there are fake likes, and that you can buy likes everywhere, is well known. If you google "buying fake likes" or even "fake comments" for Instagram or Facebook, you will get hundreds of results, and you can buy them very cheap or very expensive; it doesn't matter, you can buy them from every country. But when you think of these bought likes, you may think of this: somebody sitting in China, Pakistan or India, and computers and machines doing all of this. You think that they are fake, that they can easily be detected, and that maybe they are not a big problem. But it's not always like this. It can also be like this. So, I want you to meet Maria, whom I met in Berlin, and Harald, who lives near Mönchengladbach. Maria is a retiree, a former police officer. And as money is always short, she is clicking Facebook likes for money. She earns between 2 and 6 cents per like. And Harald, who was once a baker and is now on social aid, is also clicking and liking and commenting the whole day. We met them during our research project and did some interviews about their likes. One platform they are clicking and working for is PaidLikes. It's only one platform out of a universe, out of a cosmos. PaidLikes is based just a couple of minutes from here, in Magdeburg, and they offer that you can earn money by liking on different platforms. And it looks like this: when you log into the platform with your Facebook account, then in the morning, in the afternoon and in the evening you get what we call campaigns. These are pages – Facebook fan pages or Instagram pages – or posts, or comments. You can work your way through them and click them. And I blurred the blue bars you see here, because we don't want to get sued by all the companies you can see there. To take you a little bit with me on the journey:
Harald was okay with us coming by for television, and he was okay with us doing a long interview with him. I want to show you a very small piece of his daily life: sitting there, doing the household, the washing and the cleaning, and clicking.

*Video clip plays*

Come on. It could be like that: you click and you earn some money. How did we meet him and all the others? Of course, because Philip and Dennis have a more scientific approach, it was important not only to talk to one or two, but to talk to many. So we created a Facebook fan page, which we called "Eine Linie unterm Strich" (a line under a line), because I thought: okay, nobody will like this freely. And then we did a post – this post – and we bought likes, and you won't believe it, it worked so well: 222 people, all people I paid for, liked it. And then we wrote to all of them and talked to many of them. Some of them only in writing, some of them we just called or had a phone chat with. But they gave us a lot of information about their life as click workers, which I will sum up. PaidLikes itself says that it has 30,000 registered users. And it's really interesting, because you might think that they are all registered with 10 or 15 accounts, but most of them are not. They are clicking with their real account, which makes it really hard to detect them. They even scan their ID so that the company knows they are real. Then they earn their money. And we met men, women, stay-at-home moms, low-income earners, retirees, people on social care. So, basically, anybody; there was no kind of bias. And many of them are clicking for two or more platforms. I didn't meet anybody who was clicking for only one platform. They all have a variety of platforms where they write comments or click likes. And you can make – this is what they told us – between 15 and 450 euros monthly, if you are a so-called power clicker and do this in a somewhat professional way. But these are only the workers, and maybe you are more interested in who the buyers are. Who benefits?

Dennis: Yeah, let's come to step two: who benefits from the campaigns? I think you all remember this page; this is the screen when you log into PaidLikes, and you see the campaigns you have to click in order to get a little bit of money. And by luck we noticed that if you hover over a campaign, you see in the bottom left corner of the browser a URL redirecting to the campaign.
You have to click, and you see that every campaign uses a unique ID. It is just a simple integer, and the good thing is, it is simply incremented. So now maybe some of you already notice what we can do with that. It is really easy to implement a crawler for data gathering with these constructed URLs, and our crawler simply requested all campaign IDs between 0 and 90,000. Maybe some of you ask: why 90,000? As I already said, we were also registered as click workers, and we saw that the highest campaign ID in use was about 88,000. So we thought, OK, 90,000 is a good value. For each of these 90,000 requests we checked whether it resolved or not, and if it resolved, the URL redirected us to the source that should be liked or followed. We did not save the page sources behind the resolved URLs; we only saved the resolved URLs in a list of campaigns, and this list was then the basis for further analysis. And here you see our list.
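A minimal sketch of the enumeration approach Dennis describes – probing sequential integer campaign IDs and recording where they redirect. The endpoint URL and the error handling here are assumptions for illustration, not PaidLikes' actual interface:

```python
import requests

# Hypothetical campaign endpoint; the real URL pattern is deliberately not reproduced.
BASE = "https://paidlikes.example/campaign/{}"

def crawl_campaigns(max_id=90_000):
    """Probe sequential campaign IDs and collect the URLs they redirect to."""
    campaigns = {}
    with requests.Session() as session:
        for cid in range(max_id):
            try:
                # Don't follow the redirect; the Location header is all we need.
                resp = session.get(BASE.format(cid),
                                   allow_redirects=False, timeout=10)
            except requests.RequestException:
                continue  # network error: treat the ID as unresolved
            if resp.is_redirect:
                # The redirect target is the page that should be liked or followed.
                campaigns[cid] = resp.headers["Location"]
    return campaigns
```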
Svea: Yes. This was the point when Dennis and Philip came to us and said: hey, we have a list, so what can you find? And of course "AfD" was one of the first search queries. And yeah, of course, AfD is also in that list – maybe not so surprisingly for some. And when you look, it is AfD Gelsenkirchen and their fan page. We asked AfD Gelsenkirchen: did you buy likes? And they said: we don't know how we got on that list, but we do not rule out an anonymous donation. Now you might think: OK, they found AfD, that is very much to be expected. But no: all political parties – mostly local and regional entities – showed up on that list. We had CDU/CSU, FDP, SPD, AfD, Die Grünen and Die Linke. But don't imagine that Angela Merkel or some very big Facebook fan pages showed up. No, no: very small entities with a couple of hundred or maybe 10,000 or 15,000 followers. And I think this makes perfect sense, because somebody who already has very many fans probably would not buy them at PaidLikes. We asked many of them, and mostly they could not explain it – they would never do something like that; they were completely at a loss. But you have to keep in mind that we only saw the campaigns, their Facebook fan pages; we could not see who bought the likes. And as you can imagine, anybody could have done it: the mother, the brother, the fan, you know, the dog. So this was a case where we would have needed a lot of luck to call somebody out of the blue and have them say: oh yes, I did this. But there were some politicians who admitted it, and one of them did so publicly and gave us an interview. It's Tanja Kühne. She is a regional politician from Walsrode, Niedersachsen. It was after an election, and she was not very happy with her fan page – that is what she told us. She wanted, you know, to push herself and boost the page a little bit, to get more friends and followers and reach. And so she bought 500 followers. And then we had a nice interview with her about that. I'll show you a small piece.

*Video clip plays*

Okay, so you see – the answers are pretty interesting. I think she was courageous to speak out to us. Many others did too, but only on the phone, and they didn't want to go on the record. She's not the only one who answered like this, because of course, if you call through a list of potential fake like buyers, they answer: no, it's not a scam. And I also think that from a legal point of view it's very hard to show that this is fraud or a scam. It's more an ethical problem that you see here: it's manipulative to buy likes. We also found a guy from the FDP, from the Bundestag. But yeah, he ran away and didn't want to be interviewed, so I can't show you. He was in our list around 40 times, for various Facebook posts and videos and also for his Instagram account, but we could not get him on record. So what did the others say? We, of course, confronted Facebook, Instagram and YouTube with this research, and they said: no, we don't want fake likes on our platform. PaidLikes has been active since 2012, you know, so they waited seven years. But after our report, at least, Facebook temporarily blocked PaidLikes. And of course we also asked PaidLikes in Magdeburg, spoke and wrote with them. And they said: of course it's not a scam, because the click workers are clicking on pages of their own free will. So, yeah, kind of nobody cares. But PaidLikes is only the tip of the iceberg.

Philip: So we also wanted to dive a little bit into this fake like universe outside of PaidLikes and see what else is out there. And so we did an analysis of account creation on Facebook.
What Facebook says about account creation is that they are very effective against fake accounts. They say they remove billions of accounts each year, and that most of these accounts never reach any real users; they remove them before they get reported. So what Facebook basically wants to tell you is that they have it under control. However, there are a number of reports that suggest otherwise. For example, the NATO Stratcom task force recently released a report for which they bought 54,000 social media interactions for just 300 euros. This is a very low price, and I think you wouldn't expect such a low price if it were hard to get that many interactions. They bought 3,500 comments, 25,000 likes, 20,000 views and 5,100 followers – everything for just 300 euros. So the one thing fake likes and fake interactions have in common: they are cheap. There was also another report from Vice Germany recently, reporting some interesting facts about automated fake accounts. Their findings suggest that people actually use hacked Internet of Things devices to create and manage these fake accounts. It's kind of interesting to think about it this way: maybe next election, your fridge is actually going to support the other candidate on Facebook. So we also wanted to look into this, and we wanted to go a step further and look at who these people are. Who are they, and what are they doing on Facebook? So we examined the profiles behind purchased likes. For this we created four comments under arbitrary posts, then we bought likes for these comments, and then we examined the resulting profiles of the fake likes. It was pretty cheap to buy these likes – comment likes are always a little more expensive than other likes. We found all these offerings on Google, and we paid with PayPal. We then used a pretty neat trick to estimate the age of these fake accounts. As you can see here, the Facebook user ID is incremented: Facebook started in 2009 to use incremented Facebook IDs, with this pattern of 1 0 0 0 followed by the incremented number. In 2009 this incremented number was very close to zero, and today it is close to 40 billion. And over this time period, you can fit a rather good line through all these points. You can see that the account IDs are in fact incremented over time, so we can use this fact in reverse to estimate the creation date of an account whose Facebook ID we know.
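This estimation trick boils down to linear interpolation between accounts whose creation dates are known. A sketch – the anchor points and the exact ID prefix below are made-up placeholders, not the calibration the team actually used:

```python
from datetime import date, timedelta

# Post-2009 IDs look like a constant "1 000..." prefix plus an increment.
PREFIX = 100_000_000_000_000  # assumed magnitude, for illustration only

# Hypothetical anchors: (known creation date, incremented part of the ID).
ANCHORS = [
    (date(2009, 6, 1), 0),
    (date(2015, 1, 1), 12_000_000_000),
    (date(2019, 7, 1), 40_000_000_000),
]

def estimate_creation_date(fb_id):
    """Linearly interpolate a creation date from an incremented Facebook ID."""
    n = fb_id - PREFIX  # strip the constant prefix, keep the increment
    for (d0, n0), (d1, n1) in zip(ANCHORS, ANCHORS[1:]):
        if n0 <= n <= n1:
            frac = (n - n0) / (n1 - n0)
            return d0 + timedelta(days=frac * (d1 - d0).days)
    raise ValueError("ID outside the calibrated range")
```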
And that's exactly what we did with these fake likes: we estimated the account creation dates. As you can see, we got quite different results from different services. PaidLikes, for example, had rather old accounts, which means they use very authentic accounts – and we already knew that, because we talked to them. Service A over here also uses very authentic accounts. Service B, on the other hand, uses very new accounts: they were all created in the last three years. Looking at the accounts and at these numbers, we think those accounts were bots. And for service C it's not clear – are these accounts bots, or are they click workers? Maybe it's a mixture of both; we don't know for sure. But this is an interesting metric: measuring the age of the accounts to determine whether some of them might be bots. And that's exactly what we did on this page. This is a page for garden furniture, and we found it in the list that we got from PaidLikes – so obviously they had bought likes on Facebook through PaidLikes. They caught our eye because they had one million likes, and that's rather unusual for a garden furniture shop in Germany. So we looked at this page further and noticed other interesting things. For example, their posts got thousands of likes all the time, which is also unusual for a garden furniture shop. So we looked into the likes, and as you can see, they all look like they come from Southeast Asia, and they don't look very authentic. We were able to estimate the creation dates of these accounts, and we found that most of the accounts used for liking the posts on this page were created in the last three years. So this is a page where everything, from the number of people who like the page to the number of people who like the posts, is complete fraud. Nothing about it is real. And it's obvious that this can happen on Facebook, and that this is a really, really big problem. I mean, this is a shop for garden furniture; they probably don't have huge sums of money.
So it was probably very cheap to buy this number of fake accounts. And it is really shocking to see how big the scale of these kinds of operations is. So what we have to say is: when Facebook claims they have it under control, we have to doubt that. Now we can look at the bigger picture. What we are going to do here is use the same graph that we used before to estimate the creation dates, but in a different way. We know the lowest and the highest points of Facebook IDs in this graph: we know the newest Facebook ID by creating a new account, and we know the lowest ID because it's zero. So we know that there are 40 billion Facebook IDs. In the next step, we took a random sample from these 40 billion Facebook IDs, and within the sample we checked whether each ID corresponds to an existing account. We do that because we obviously cannot check 40 billion IDs, but we can check a small sample of them and then estimate the total number of existing accounts on Facebook. For this, we repeatedly accessed the same sample of one million random IDs over the course of one year. And we also pulled a sample of 10 million random IDs for closer analysis this July. And now Dennis is going to tell you how we did it.
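The extrapolation itself is plain sampling arithmetic: check a uniform random sample of the ID space and scale the hit rate up. A sketch, with the actual existence check left abstract because it is Facebook-specific (see the scraper below):

```python
import random

ID_SPACE = 40_000_000_000  # roughly 40 billion possible incremented IDs

def estimate_population(sample_size, account_exists):
    """Estimate total accounts from a uniform random sample of the ID space.

    `account_exists` is a callable ID -> bool; here it is a stand-in for
    the scraper's public/nonpublic/invalid classification.
    """
    sample = random.sample(range(ID_SPACE), sample_size)
    hits = sum(1 for fid in sample if account_exists(fid))
    return (hits / sample_size) * ID_SPACE

# With one in four sampled IDs resolving to an account:
# 0.25 * 40_000_000_000 = 10_000_000_000, i.e. ten billion registered accounts.
```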
Dennis: Yeah – pretty interesting results so far, right? So we implemented a crawler a second time, now for gathering the public Facebook account data. And this was not as easy as in the first case. It's not surprising that Facebook uses a lot of measures to try to block automated crawling of its pages, for example IP blocking or CAPTCHAs. But we could solve this problem quite easily by using the Tor anonymity network: every time our IP got blocked while crawling, we just made a new Tor connection and changed the IP. The same goes for the CAPTCHAs. With this simple method, we were able to crawl all the public Facebook data. Let's look at two examples. The first example is facebook.com/4, so a very, very small Facebook ID. In this case, we are redirected, and when we check the response, we find a valid account page. Does anyone know whose account this is? Mark Zuckerberg? Yeah, that's correct. This is the public account of Mark Zuckerberg – number four. As we already saw, the other IDs are really high, but he got number four. The second example is facebook.com/3. In this case, we are not forwarded, which means that it is an invalid account. That was really easy to confirm with a quick Google search: it was a test account from the beginning of Facebook. So we do not get redirected, and we just see the Facebook login page. Starting from these examples, we did a lot more experiments, and in the end we were able to build this tree, which represents the high-level approach of our scraper.

Svea: Okay. Someone's sleeping. *Laughing*

Dennis: Yeah, we still have time, right? Okay, so everyone is waking up again. In the first step, we call the URL www.facebook.com/FID. If we get redirected, then we check whether the page is an account page. If it is an account page, then it is a public account – like the example with ID 4 – and we save the raw HTML source. If it is not an account page, then it is not a public account and we cannot save any data. If we do not get redirected in the first step, then we call the second URL, facebook.com/profile.php?id=FID, with a mobile user agent. If we get redirected there, then again it is a non-public profile and we cannot save anything. But if we do not get redirected, it is an invalid profile – most often a deleted account. Yeah, that's the high-level overview of our scraper. And Philip will now give some more information on interesting results.
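The decision tree can be written down almost verbatim; this sketch combines it with the Tor-circuit rotation Dennis mentions. It assumes a local Tor SOCKS proxy on port 9050 and the control port on 9051 (driven via the stem library, with requests[socks] installed), and the "is this an account page?" test is a simplified stand-in for whatever the team actually checked:

```python
import requests
from stem import Signal
from stem.control import Controller

TOR_PROXY = {"http": "socks5h://127.0.0.1:9050",
             "https": "socks5h://127.0.0.1:9050"}
MOBILE_UA = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_0 like Mac OS X)"}

def new_tor_circuit():
    """Ask Tor for a fresh circuit (new exit IP) after a block or CAPTCHA."""
    with Controller.from_port(port=9051) as ctl:
        ctl.authenticate()
        ctl.signal(Signal.NEWNYM)

def classify(fid):
    """Classify a Facebook ID as 'public', 'nonpublic' or 'invalid'."""
    # Step 1: desktop URL, redirects not followed automatically.
    r = requests.get(f"https://www.facebook.com/{fid}",
                     proxies=TOR_PROXY, allow_redirects=False, timeout=15)
    if r.is_redirect:
        page = requests.get(r.headers["Location"], proxies=TOR_PROXY, timeout=15)
        # Simplified stand-in for the real "is this an account page?" check.
        return "public" if "profile" in page.text.lower() else "nonpublic"
    # Step 2: profile.php endpoint with a mobile user agent.
    r = requests.get(f"https://www.facebook.com/profile.php?id={fid}",
                     headers=MOBILE_UA, proxies=TOR_PROXY,
                     allow_redirects=False, timeout=15)
    return "nonpublic" if r.is_redirect else "invalid"  # invalid: mostly deleted
```

On a block, the caller would invoke new_tor_circuit() and retry the same ID.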
Philip: So the most interesting result of scraping this sample of Facebook IDs was that one in four Facebook IDs corresponds to a valid account. And you can do the math: there are 40 billion Facebook IDs, so there must be 10 billion registered accounts on Facebook. This means that there are more registered users on Facebook than there are humans on Earth. And it's actually even worse than that, because not everybody on Earth can have a Facebook account – you need a smartphone for that, and many people don't have one. So this is a pretty high number, and it's very unexpected: in July 2019, there were more than ten billion registered Facebook accounts. We also did another analysis, covering the timeframe between October 2018 and this month. We found that in this timeframe there were 2 billion newly registered Facebook accounts – that's the timeframe of one year, more or less. And in a similar timeframe, the monthly active user base rose by only 187 million. Facebook deleted 150 million older accounts between October 2018 and July 2019. We know that because we pulled the same sample over a longer period of time and watched for accounts in the sample that got deleted. That enables us to estimate this number of 150 million deleted accounts that are older than our sample.
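The deletion estimate follows from re-checking the same fixed sample at two points in time and scaling the disappearance rate up to the whole ID space. A rough sketch under that reading:

```python
def estimate_deletions(sample_then, sample_now, id_space=40_000_000_000):
    """Scale the share of sampled accounts that vanished up to the ID space.

    `sample_then` / `sample_now`: dicts mapping the *same* sampled IDs to
    True/False (account existed), taken from two crawls months apart.
    """
    existed = [fid for fid, ok in sample_then.items() if ok]
    vanished = sum(1 for fid in existed if not sample_now.get(fid, False))
    deletion_rate = vanished / len(sample_then)
    return deletion_rate * id_space
```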
So I made some nice graphs for your viewing pleasure. Again: only 150 million of the older accounts – accounts older than last year – were deleted since October 2018. And Facebook claims that since then, about 7 billion accounts were deleted from their platform, which is vastly more than these older accounts. That's why we think that Facebook mostly deleted newer accounts; if an account is older than a certain age, it is very unlikely to get deleted. And I think you can also see it in the scales here. Of course, registered users are not the same thing as active users, but you can still see that there are many more registrations of new users than there are new active users during the last year. So what does this all mean? Does it mean that Facebook gets flooded by fake accounts? We don't really know; we only know these numbers. What Facebook is telling us is that they only count and publish active users – as I already said, there is a disconnect between registered users and active users, and Facebook only reports on the active users. They also say that users register accounts but don't verify or use them, and that's how this number gets so high. But I think that doesn't really explain these high numbers, because they are orders of magnitude larger than anything that could cause. They also say that they regularly delete fake accounts. But we have seen that it is mostly accounts deleted directly after their creation – if they survive long enough, they get through. So what does this all mean?

Svea: Okay, so you got the full load, which I had spread over two or three months. For me, one very big conclusion was that we have some kind of broken metric here: all the likes, all the hearts on Instagram, and the followers can be manipulated so easily. And in some cases it's so hard to tell whether they are real or not. This opens the gate for manipulation and, yes, untruth – and for economic losses, if you think of somebody who is investing money, or of an advertiser, for example. And in the very end, it is a case of eroding trust, which means that we cannot trust these numbers anymore. These numbers are so easily manipulated – why should we trust them? And this has severe consequences for all the social networks, if you are still in them. So what could be a solution? Philip, you thought about that.

Philip: So basically we have two problems: one is click workers, and one is fakes. Click workers are basically just hyperactive users selling their hyperactivity. So what social networks could do is make interactions scarce – lower the value of additional interactions. If you are a hyperactive user, then your interactions should count less than the interactions of a less active user.
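As a toy illustration of this "make interactions scarce" idea – our reading of the proposal, not anything the platforms implement – a like could be down-weighted by how many likes its author handed out recently, for example logarithmically:

```python
import math

def like_weight(likes_this_week):
    """Value of one like, discounted by the author's recent liking activity.

    An occasional user's like counts fully; a power clicker's
    thousandth like of the week is worth only a fraction.
    """
    return 1.0 / (1.0 + math.log1p(likes_this_week))

print(like_weight(0))     # 1.0    -> occasional user
print(like_weight(1000))  # ~0.13  -> hyperactive clicker
```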
*Mumbling*

Philip: That's kind of solvable, I think. The real problem is authenticity. If you get stopped from posting or liking hundreds of pages a day, then maybe you just create multiple accounts and operate them simultaneously. And this can only be solved through authenticity: you have to know that the person operating the account is just one person operating one account. This is really hard to do, because Facebook doesn't know who is clicking. Is it a bot? Is it a click worker? Or is it one click worker with ten accounts? How does this work? So this is really hard for the social media companies. You could say: OK, let people send in their passport or something like that to prove authenticity. But that's actually not a good idea, because nobody wants to send their passport to Facebook. So this is a really hard problem that has to be solved if we want to use social media in a meaningful way. And this is what companies could do. And now...

Svea: But what could you do? Okay: of course, you can delete your Facebook account or your Instagram account and stop.

*Slight applause, laughing*

Svea: Yeah! Stay away from social media. But maybe this is not a solution for all of us. So: be aware, of course. Spread the word, tell others. And if you like, and you get more intelligence about this, we are really happy to dig deeper into these networks, and we will go on investigating. So, last but not least: thank you. Thank you very much for listening.

*Applause*

Svea: And we did not do this alone. We are not just three people; there are many more standing behind us and doing this beautiful research. And we are now open for questions, please.

Herald: Yes. Please thank Svea, Phil and Dennis again.

*Applause*

Herald: And we have microphones out here in the room, about nine of them, actually. If you line up behind them to ask a question, remember that a question is a sentence with a question mark behind it. And I think I see somebody at number three, so let's start with that.

Question: Hi. I just have a little question. Wouldn't a dislike button – the concept of a dislike button – be a solution to all these problems?

Philip: So, we thought about recommending that Facebook ditch the like button altogether. I think that would be a better solution than a dislike button, because a dislike button could also be manipulated, and it would be even worse: you could actually manipulate the network into down-ranking posts, or into not showing posts to somebody. That, I think, would be even worse – imagine what dictators would do with that. So I think the best option would be to not show like counts anymore, so that people stop investing in these counts, because they become meaningless.

Herald: I think I see somebody at microphone 7, up there.

Question: Hello. One question I had: you assigned creation dates to IDs. How did you do this?

Philip: So, we actually knew the creation dates of some accounts, and then we interpolated between the creation dates and the IDs. You see this black line there? That's our interpolation. With this black line, we can then estimate the creation dates for IDs that we do not yet know, because it fills in the gaps.

Q: Follow-up question: do you know why there are some points outside of this graph?

Philip: No.

Q: No? Thank you.

Herald: So, there was a question from the Internet.
Question: Did you report your findings to Facebook? And did they do anything?

Svea: Because this research is very new, we only recently approached them and showed them the research, and we got an answer – I think we already showed it: that they only count and publish active users. They did not want to tell us how many registered users they have. They say: oh, sometimes users register accounts but don't use or verify them. And they say they regularly delete fake accounts. But we hope to get into a closer discussion with them about this soon.

Herald: Microphone two.

Question: When hunting down the buyers of the campaigns, did you dig up your own campaign, "Eine Linie unterm Strich"?

Svea: No, because the scraping stopped in August – you stopped scraping in August – and then the whole project started with them coming to us with the list. We thought: oh, this is very interesting, and then the whole journalistic research started. But I think if we did it again, of course we would find us. We also found another magazine that did a paid test a couple of years ago, and we found their campaign.

Philip: So, we actually did another test, and for that test I noted down the campaign ID, I think. And it worked to plug it into the URL; we got redirected to our own page. So that worked.

Q: Thank you.

Herald: Microphone three.

Question: Hi, I'm Farhan, I'm a Pakistani journalist. First of all, I would like to say that you were right when you said that there might be people sitting in Pakistan clicking on the likes – that does happen. But my question would be: Facebook has its own ad program that it aggressively pushes, and in that ad program there are also options whereby people can buy likes and comments and impressions and reactions. Would you also consider those fake? I mean, they're not fake per se, but they're still bought likes. So what's your view on those? Thank you.

Philip: So, when you buy ads on Facebook, what you actually want is fans for your page who are actually interested in your page.
So that's the difference, I think, to the PaidLikes system, where the people themselves get paid for liking stuff that they wouldn't normally like. I think that's the fundamental difference between the two programs, and that's why I think one is unethical and the other is not really that unethical.

Svea: The real problem is: if you buy these click workers, then you have many people on your fan page who are not interested in you. They don't care about you. They don't look at your products; they don't look at your political party. And then often these people additionally run Facebook ads, and those ads are shown, again, to the click workers, who don't look at them either. So, you know, people are burning money and money and money with this whole corrupt system.

Herald: So, microphone two.

Question: Hi, thanks. Thanks for the talk, and thanks for the effort of going through all of this project. From my understanding, this whole finding basically undermines the trust in Facebook likes in general. So I would expect the price of likes to drop now, and the pay for click workers to drop as well. Do you have any metrics on that?

Svea: The research went public just one week ago, I think. What we have seen as an effect is that Facebook blocked PaidLikes for the moment. So yes, one platform is down. But there are so many others out there. So many. So I think...

Q: I meant the phenomenon of paid likes, not the company itself. Like, the value of a like as a measure of credibility...

Philip: We didn't...

Q: ...is declining now. That's my, that's my...

Svea: Yes. That's why many people are buying Instagram hearts now. So yes, that's true: the like is not the fancy hot shit anymore. We also saw in the data that the likes for the fan pages went rapidly down, and the likes for the posts and the comments went up. So yes, there is a shift. And what we also saw in the data was that the Facebook likes have been declining rapidly since 2016, and what is growing and rising is YouTube and Instagram. Today, everything is about Instagram.

Q: Thanks.

Herald: So let's go to number one.

Question: Hello, and thank you very much for this fascinating talk, because I've been following this whole topic for a while.
And I was wondering whether you were also looking into the demographics – in terms of age groups and social class – not of the people doing the actual liking, but of the people buying these likes. Because I think what is changing is an entire social discourse on social capital – the Bourdieu kind of term – because it can now be quantified. As a teacher, I hear of kids who buy likes to be more popular than their schoolmates. So I'm wondering if you're looking into that, because I think that's a fascinating area to actually come up with numbers about.

Svea: It definitely is. And we were also fascinated by this data set of 90,000 data points. What we did – and this was very hard – was, first of all, to look at who is buying likes: automotive companies, you know, which branches are in there. That was doable. But to get more into demographics, you would have had to crawl and click every single page, and we did not do this. What we did, of course, was that we were a team of three to ten people manually looking into it. And what we saw, of course, was that on Instagram and on YouTube you have many of these very young people. Some of them I actually called, and they were like: yes, I bought likes – very bad idea. So I think, yes, there is a demographic shift away from companies and the automotive industry buying Facebook fan page likes, towards Instagram and YouTube wannabe influencers.

Q: Influencers – influencer culture is obviously...

Svea: Yes. And I have to admit here: we showed you the political side, but the political likes were only a tiny share of the numbers. The very, very vast majority of this data set is about wedding planners, photography, tattoo studios and influencers, influencers, influencers – and YouTubers, of course.

Q: Yes. Thank you so much.

Herald: So, we have a lot of questions in the room; I'm going to get to you as soon as we can. I'd like to go to the Internet first.

Signal Angel: Do you think this will get better or worse if people move to more decentralized platforms?

Philip: To more what?

Svea: Whether it will get better or worse.

Dennis: Can you repeat that, please?

Herald: Would this issue get better or worse if people moved to a more decentralized platform?

Philip: Decentralized...
Decentralized, okay. So, I mean, we can look at this slide, I think, and ask whether decentralized platforms would change either of these two points here. And I fear they wouldn't, because they cannot solve the interaction problem – that people can be hyperactive. Actually, that's kind of a normal thing with social media: a small portion of social media users is much more active than everybody else. You have that even without anybody paying for it. So even without paid likes, you have to consider whether social media is really representative of society. And the other thing is authenticity: also on a decentralized platform, you could have multiple accounts run by the same person.

Herald: So, microphone seven, all the way back there.

Question: Hi. Do you know whether Facebook even removes the likes when they delete fake accounts?

Svea: Do you know that?

Philip: No, we don't know that. We know they delete fake accounts, but we don't know if they also delete the likes.

Svea: I know from our research that for the people we approached, the click workers' likes were not deleted.

Herald: Microphone two.

Question: Yeah, hi. So I have a question with respect to this "one out of four Facebook IDs is valid" result of your test. Did you see any difference with respect to the age of the accounts? Is it always one out of four across the entire sample? Or does it maybe change over the range, going from ID zero to, well, 10 billion or 40 billion?

Philip: So you're talking about the density of accounts across the ID space?

Q: Kind of.

Philip: So, there are changes over time. I think for the newest IDs it's less than it was before; earlier it was more, and now it is less.

Q: But you don't see anything specific – like, in the new accounts only one out of 10 is active or valid, and before it was one out of two, or something like that?

Philip: It's not that extreme; it's less than that.

Dennis: We have to say we did not check this in detail, but there were no special cases.

Philip: But it did change over time. What we checked was whether an ID actually corresponds to an account, and this metric changed a little bit over time, but not much.
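The per-range question is answerable from the same random sample by bucketing the IDs – roughly like this, with the sample again assumed to be an ID-to-validity mapping:

```python
from collections import defaultdict

def validity_by_bucket(sample, bucket_size=4_000_000_000):
    """Fraction of valid accounts per ID bucket (ten buckets over 40 billion).

    `sample`: dict mapping sampled Facebook IDs to True/False (valid account).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for fid, valid in sample.items():
        bucket = fid // bucket_size
        totals[bucket] += 1
        hits[bucket] += valid
    return {b: hits[b] / totals[b] for b in sorted(totals)}
```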
Herald: So, number three, please.

Question: Yeah, thank you for a very interesting talk. At the end, you gave some recommendations on how to fix the metrics, right? And it's always nice to have some metrics, because we are the people who deal with the numbers, so we want the metrics. But I want to raise the issue of whether a quantitative measure is actually the right thing. Would you buy your furniture from store A with 300 likes rather than store B with 200 likes? Or would it not be better to have something more qualitative? And to what extent is a quantitative measure maybe also the source of a lot of the bad developments we see in social media to begin with – even without bot farms and anything, just people who go for the quick like and say "hooray for Trump", and all the Trumpists like that, and the others say "fuck Trump", and all the non-Trumpists like that, and you get all the polarization, right? Instagram, I think, just doesn't display its like equivalent anymore in order to prevent that. So could you maybe comment on that?

Svea: I think this is a good idea, to hide the likes. Yes. But, you know, we talked to many click workers, and they do a lot of stuff. What they also do is copy-pasting prepared comments into comment sections, or into Amazon reviews. So I think it's really hard to get them out of the system, because if the likes are not shown and the comments are what counts, then you will have people copy-pasting comments into the comment sections. So I really think the networks have a genuine issue here.

Herald: So let's try to squeeze in the last three questions now. First, number seven – really quick.

Question: Very quick. Thank you for the nice insights. I have a question about the location of the users. You made the point that you can tell from the metadata when an account was made. But how about the location of the followers? Is there any way to analyze that as well?

Philip: We can only analyze that if the users agreed to share it publicly, and not all of them do. A name check is often a very good way to check where somebody is from – for these fake likes, for example. But as I said, it always depends on what the users themselves are willing to share.

Herald: Internet?
Signal Angel: Isn't this just the western version of the Chinese social credit system? Where do we go from here? What is the future of all this?

Svea: Yeah, it's dystopian, right? After this research, you know – I had deleted my Facebook account one or two years ago, so that didn't matter to me so much. But I stayed on Instagram, and then I saw all these bought likes and subscribers and followers, and on YouTube all these views – because the click workers also watch YouTube videos; they have to stay on them for about 40 seconds. It's really funny, because they hate it: techno music, rap music, 40 seconds each, and then they move on. But after I sat next to Harald for two, three hours, I was so disillusioned about the whole social network business. And I thought: OK, don't count on anything. If you like the content, follow them and look at them. But don't believe anything. That was my personal takeaway from this research.

Herald: So, the very last question, microphone two.

Question: A couple of days ago, The Independent reported that the Facebook app was activating the camera while users were reading the news feed. Could this be in use in the context of detecting fake accounts?

Svea: I don't know.

Philip: I think that in this particular instance it was probably a bug. I don't know, but I mean, the people who work at Facebook – not all of them are crooks who would deliberately program this kind of stuff. They said it was a bug from an update they did. And the question is whether you could actually detect fake accounts with the camera. The problem is that I don't think current face recognition technology is good enough to establish that you are a unique person. There are so many people on the planet that probably somebody else has the same face. I think the new iPhones have a much more sophisticated version of this technology, and even they say: OK, there's a one-in-something chance – I don't know the number – that somebody else can unlock your phone. So I think it's really hard to prove with camera technology that somebody is just one person.

Herald: So, with that, would you please help me thank Svea, Dennis and Philip one more time for this fantastic presentation!
Very interesting and very, very disturbing. Thank you very much.

*Applause*

*Postroll music*

Subtitles created by c3subtitles.de in the year 2020. Join, and help us!