0 00:00:00,000 --> 00:00:30,000 Dear viewer, these subtitles were generated by a machine via the service Trint and therefore are (very) buggy. If you are capable, please help us to create good quality subtitles: https://c3subtitles.de/talk/335 Thanks! 1 00:00:09,200 --> 00:00:10,169 Can you hear OK? 2 00:00:10,170 --> 00:00:13,019 Yep, I can hear you OK. 3 00:00:13,020 --> 00:00:14,020 This is good. 4 00:00:14,780 --> 00:00:16,909 So we are here: 5 00:00:16,910 --> 00:00:20,149 Mike from the Tor Project, and I'm Seth from EFF. 6 00:00:20,150 --> 00:00:21,619 Mike is going to give most of the 7 00:00:21,620 --> 00:00:23,809 presentation, talking about 8 00:00:23,810 --> 00:00:25,699 a lot of the motivation for 9 00:00:25,700 --> 00:00:27,379 reproducible builds and a lot of the work 10 00:00:27,380 --> 00:00:29,059 that a number of projects have done in 11 00:00:29,060 --> 00:00:30,079 this direction. 12 00:00:30,080 --> 00:00:32,030 And I'm going to talk 13 00:00:33,200 --> 00:00:34,820 about a particular demo 14 00:00:36,830 --> 00:00:38,509 that I think really focuses on the 15 00:00:38,510 --> 00:00:40,009 motivation and why we care about this 16 00:00:40,010 --> 00:00:41,569 problem for security. 17 00:00:41,570 --> 00:00:43,399 So we have a number of people. 18 00:00:43,400 --> 00:00:44,899 Mike, would you like to describe who all 19 00:00:44,900 --> 00:00:45,799 the people are? 20 00:00:45,800 --> 00:00:46,069 Yeah. 21 00:00:46,070 --> 00:00:48,139 So obviously a lot of work has gone 22 00:00:48,140 --> 00:00:50,449 into this problem over the years, 23 00:00:50,450 --> 00:00:52,909 and especially even recently. 24 00:00:52,910 --> 00:00:54,379 We were supposed to have a whole party up 25 00:00:54,380 --> 00:00:55,399 here on stage today.
26 00:00:55,400 --> 00:00:57,469 Actually, we were supposed to have 27 00:00:57,470 --> 00:00:59,929 Hans-Christoph Steiner from the Guardian Project, 28 00:00:59,930 --> 00:01:02,149 NA from the Tor Project and 29 00:01:02,150 --> 00:01:03,079 from Debian. 30 00:01:03,080 --> 00:01:05,059 And we were trying to get somebody 31 00:01:05,060 --> 00:01:06,859 who worked on the Gitian system and 32 00:01:06,860 --> 00:01:09,289 Bitcoin up here, too. Unfortunately, 33 00:01:09,290 --> 00:01:10,189 for various reasons, 34 00:01:10,190 --> 00:01:12,139 several of those people couldn't 35 00:01:12,140 --> 00:01:13,140 make it. 36 00:01:13,940 --> 00:01:16,159 And Lunar especially 37 00:01:16,160 --> 00:01:18,139 is having some unfortunate circumstances, 38 00:01:18,140 --> 00:01:20,030 and we wish him a speedy recovery. 39 00:01:21,170 --> 00:01:23,349 But he has done a lot of this work for 40 00:01:23,350 --> 00:01:24,350 Debian. 41 00:01:26,000 --> 00:01:28,819 So for background, basically 42 00:01:28,820 --> 00:01:30,919 the idea behind reproducible builds is 43 00:01:30,920 --> 00:01:33,139 to sort of close the circle on 44 00:01:33,140 --> 00:01:35,329 this ethos, or the promise, of 45 00:01:35,330 --> 00:01:37,429 free software, the idea that the 46 00:01:37,430 --> 00:01:39,709 users should have all the source 47 00:01:39,710 --> 00:01:41,809 code corresponding to all the 48 00:01:41,810 --> 00:01:43,549 programs that run on their 49 00:01:43,550 --> 00:01:45,799 computer. And the original argument 50 00:01:45,800 --> 00:01:47,419 by the Free Software Foundation was this 51 00:01:47,420 --> 00:01:49,609 was for individual freedom, 52 00:01:49,610 --> 00:01:51,059 the freedom to modify, the freedom to 53 00:01:51,060 --> 00:01:53,209 understand, to work with the things 54 00:01:53,210 --> 00:01:54,199 on your computer.
55 00:01:54,200 --> 00:01:56,509 And the argument was extended to software 56 00:01:56,510 --> 00:01:58,909 security: that you can audit. 57 00:01:58,910 --> 00:02:01,279 And with many eyeballs auditing 58 00:02:01,280 --> 00:02:03,379 all of the source code that people 59 00:02:03,380 --> 00:02:05,449 use, we can better understand as a 60 00:02:05,450 --> 00:02:07,579 community the vulnerabilities and 61 00:02:07,580 --> 00:02:09,409 the privacy properties and other aspects 62 00:02:09,410 --> 00:02:11,299 of the software that we all use, both 63 00:02:11,300 --> 00:02:12,499 from the point of view of knowing that 64 00:02:12,500 --> 00:02:13,669 the software doesn't have 65 00:02:14,720 --> 00:02:16,159 vulnerabilities and from the point of 66 00:02:16,160 --> 00:02:17,299 view of knowing that the software doesn't 67 00:02:17,300 --> 00:02:19,939 have malicious functionality. 68 00:02:19,940 --> 00:02:22,249 Unfortunately, for 69 00:02:22,250 --> 00:02:24,619 the vast majority of free software 70 00:02:24,620 --> 00:02:26,749 projects today, the only proof that 71 00:02:26,750 --> 00:02:28,729 the binary packages that you 72 00:02:28,730 --> 00:02:30,769 download actually correspond to the 73 00:02:30,770 --> 00:02:32,539 published source code that you can also 74 00:02:32,540 --> 00:02:34,099 download is that somebody said so. 75 00:02:35,240 --> 00:02:37,789 Either this is a trusted institution, 76 00:02:37,790 --> 00:02:40,399 a trusted individual 77 00:02:40,400 --> 00:02:42,529 or an organization 78 00:02:42,530 --> 00:02:45,449 or a collection of several people. 79 00:02:45,450 --> 00:02:47,119 Ultimately, somebody compiles the 80 00:02:47,120 --> 00:02:48,709 source code and produces a binary that 81 00:02:48,710 --> 00:02:50,569 you download and you take their word for 82 00:02:50,570 --> 00:02:51,139 it.
83 00:02:51,140 --> 00:02:52,579 So you have a sort of social or 84 00:02:52,580 --> 00:02:54,619 institutional connection where you go to 85 00:02:54,620 --> 00:02:56,779 a website and the website says, 86 00:02:56,780 --> 00:02:58,189 here's the source package, here's the 87 00:02:58,190 --> 00:02:59,179 binary package. 88 00:02:59,180 --> 00:03:01,249 But you as the end user don't actually 89 00:03:01,250 --> 00:03:03,529 have any technical evidence connecting 90 00:03:03,530 --> 00:03:05,149 these two, just the fact that they appear 91 00:03:05,150 --> 00:03:06,150 on the same website. 92 00:03:07,220 --> 00:03:09,439 And without information about 93 00:03:09,440 --> 00:03:11,539 the build system, even with binary analysis, 94 00:03:11,540 --> 00:03:12,829 sophisticated reverse engineering 95 00:03:12,830 --> 00:03:14,240 techniques using 96 00:03:15,440 --> 00:03:17,509 either IDA Pro or even 97 00:03:17,510 --> 00:03:19,639 manual analysis of the machine code, 98 00:03:19,640 --> 00:03:21,979 this verification is almost impossible. 99 00:03:23,270 --> 00:03:25,069 There's a very large amount of code 100 00:03:25,070 --> 00:03:26,089 differences. 101 00:03:26,090 --> 00:03:28,279 And even if you try to reproduce 102 00:03:28,280 --> 00:03:30,529 the build system as closely as you can, 103 00:03:30,530 --> 00:03:32,059 you can still end up with a large amount 104 00:03:32,060 --> 00:03:34,639 of differences, which we'll explain in 105 00:03:34,640 --> 00:03:35,640 later material. 106 00:03:36,750 --> 00:03:38,869 So, as a result, source is sort of inadequate in 107 00:03:38,870 --> 00:03:40,519 really fulfilling this promise and 108 00:03:40,520 --> 00:03:42,799 fostering trust in the software's 109 00:03:42,800 --> 00:03:44,689 true functionality and security, and 110 00:03:44,690 --> 00:03:46,669 we'll try to make this really concrete in 111 00:03:46,670 --> 00:03:47,670 a few minutes.
112 00:03:48,470 --> 00:03:50,269 Now, the most common objection from 113 00:03:50,270 --> 00:03:52,579 hackers, such as probably 114 00:03:52,580 --> 00:03:54,979 many in the audience here, is: well, I'm 115 00:03:54,980 --> 00:03:57,469 the developer. I'm a developer. 116 00:03:57,470 --> 00:03:59,329 I know what's in the binary because I 117 00:03:59,330 --> 00:04:00,799 know the source code. I compile 118 00:04:00,800 --> 00:04:02,119 it myself. 119 00:04:02,120 --> 00:04:04,099 And moreover, I'm careful with my 120 00:04:04,100 --> 00:04:05,989 machine, with my operational security. 121 00:04:05,990 --> 00:04:07,099 I know what I'm doing. 122 00:04:07,100 --> 00:04:08,509 Nobody's going to own me. 123 00:04:08,510 --> 00:04:09,829 Or maybe nobody cares. 124 00:04:09,830 --> 00:04:11,329 Why should I have to worry about 125 00:04:11,330 --> 00:04:13,429 these hypothetical risks of somebody 126 00:04:13,430 --> 00:04:14,359 owning me some way? 127 00:04:14,360 --> 00:04:15,680 And why do they even care? 128 00:04:18,019 --> 00:04:20,088 So, you know, to 129 00:04:20,089 --> 00:04:22,129 try and bring this into perspective and 130 00:04:22,130 --> 00:04:24,169 to add a little bit more rational thought 131 00:04:24,170 --> 00:04:26,329 into this, it actually turns out that as 132 00:04:26,330 --> 00:04:28,609 a developer, even though you may think 133 00:04:28,610 --> 00:04:30,589 your actions are benign, they still are 134 00:04:30,590 --> 00:04:33,439 interesting to adversaries. 135 00:04:33,440 --> 00:04:35,899 In fact, you're a very attractive 136 00:04:35,900 --> 00:04:37,279 target for someone who wants to 137 00:04:37,280 --> 00:04:40,399 compromise large numbers of people.
138 00:04:40,400 --> 00:04:42,559 Halvar Flake actually this year 139 00:04:42,560 --> 00:04:44,899 presented a very interesting work 140 00:04:44,900 --> 00:04:47,569 on offensive work and addiction, 141 00:04:47,570 --> 00:04:49,879 where he drew parallels 142 00:04:49,880 --> 00:04:52,039 between addictive behavior and addictive 143 00:04:52,040 --> 00:04:54,289 activity and this desire 144 00:04:54,290 --> 00:04:56,539 to compromise machines and progressively 145 00:04:56,540 --> 00:04:58,549 use what you compromise to gain more 146 00:04:58,550 --> 00:05:00,679 access and more access and 147 00:05:00,680 --> 00:05:01,999 use them as stepping stones. 148 00:05:02,000 --> 00:05:04,069 And this has also been revealed to us 149 00:05:04,070 --> 00:05:06,229 through the Snowden 150 00:05:06,230 --> 00:05:08,239 leaks: that intelligence agencies and 151 00:05:08,240 --> 00:05:09,919 institutions engage in this behavior, 152 00:05:09,920 --> 00:05:11,989 too. They see a target: oh, we 153 00:05:11,990 --> 00:05:14,209 want to go after so-and-so, and so 154 00:05:14,210 --> 00:05:16,339 we're going to compromise anything 155 00:05:16,340 --> 00:05:17,419 that we need to get there. 156 00:05:17,420 --> 00:05:19,939 Google, Apple, any 157 00:05:19,940 --> 00:05:22,519 large infrastructure, any software 158 00:05:22,520 --> 00:05:24,769 distribution. And there is the temptation, because 159 00:05:24,770 --> 00:05:27,259 software is created with other software, 160 00:05:27,260 --> 00:05:28,909 to get the capability to compromise 161 00:05:28,910 --> 00:05:30,109 infrastructure. 162 00:05:30,110 --> 00:05:31,669 The natural thing to want to do is to get 163 00:05:31,670 --> 00:05:32,959 the capability to compromise other 164 00:05:32,960 --> 00:05:34,939 infrastructure and compromise other 165 00:05:34,940 --> 00:05:36,199 infrastructure.
And there's a lot of 166 00:05:36,200 --> 00:05:38,239 transitive trust, because all of you, when 167 00:05:38,240 --> 00:05:40,129 developing, when hosting software, when 168 00:05:40,130 --> 00:05:42,139 creating infrastructure, are relying on 169 00:05:42,140 --> 00:05:44,209 other infrastructure for that, relying on 170 00:05:44,210 --> 00:05:45,210 other programs. 171 00:05:46,460 --> 00:05:49,069 And there have been, even outside of 172 00:05:49,070 --> 00:05:51,319 the Snowden leaks, 173 00:05:51,320 --> 00:05:53,599 known successful attacks by just 174 00:05:53,600 --> 00:05:55,669 hackers against infrastructure used 175 00:05:55,670 --> 00:05:57,769 by several free software projects. In the 176 00:05:57,770 --> 00:05:59,359 Linux kernel, somebody tried to 177 00:05:59,360 --> 00:06:01,849 insert a root privilege escalation 178 00:06:01,850 --> 00:06:04,039 vulnerability that was thankfully caught 179 00:06:04,040 --> 00:06:06,740 by Linus's careful custodianship 180 00:06:08,030 --> 00:06:11,059 or guardianship of the canonical 181 00:06:11,060 --> 00:06:13,069 source code repository. 182 00:06:13,070 --> 00:06:15,199 Also, Red Hat and Apache had 183 00:06:15,200 --> 00:06:16,370 their 184 00:06:17,420 --> 00:06:18,829 distribution servers compromised to 185 00:06:18,830 --> 00:06:20,689 distribute malicious binaries and 186 00:06:20,690 --> 00:06:21,469 packages. 187 00:06:21,470 --> 00:06:23,569 And just yesterday, ISC had 188 00:06:23,570 --> 00:06:24,949 their website compromised 189 00:06:26,000 --> 00:06:29,029 to distribute malware to visitors. 190 00:06:29,030 --> 00:06:30,469 So this is something that isn't just a 191 00:06:30,470 --> 00:06:32,819 threat from state actors. 192 00:06:32,820 --> 00:06:34,229 This is.
193 00:06:34,230 --> 00:06:36,810 It's very tempting for any attacker. 194 00:06:37,860 --> 00:06:39,959 So for the rest of the talk, we're going to 195 00:06:39,960 --> 00:06:41,759 spend some time trying to thoroughly 196 00:06:41,760 --> 00:06:43,529 convince you that these sorts of 197 00:06:43,530 --> 00:06:45,119 attacks against software distribution are 198 00:06:45,120 --> 00:06:47,429 very hard to detect, very possible, 199 00:06:47,430 --> 00:06:49,709 and can be extremely harmful if 200 00:06:49,710 --> 00:06:51,509 the adversary has malicious 201 00:06:51,510 --> 00:06:52,510 intent. 202 00:06:53,220 --> 00:06:55,319 So to try and really drill 203 00:06:55,320 --> 00:06:57,389 this idea of 204 00:06:57,390 --> 00:06:58,739 this source of 205 00:06:58,740 --> 00:07:00,839 vulnerability home, imagine, if 206 00:07:00,840 --> 00:07:03,029 you will, the most 207 00:07:03,030 --> 00:07:05,369 secure computer that you could design 208 00:07:05,370 --> 00:07:07,799 or that you could use. 209 00:07:07,800 --> 00:07:09,539 Obviously, there's probably a lot of 210 00:07:09,540 --> 00:07:10,829 people who have a lot of opinions on 211 00:07:10,830 --> 00:07:12,929 this. People who like OpenBSD will 212 00:07:12,930 --> 00:07:14,879 want to run that, Linux people will run AppArmor, 213 00:07:14,880 --> 00:07:16,829 want to run 214 00:07:16,830 --> 00:07:19,439 hardened kernels, probably disable Wi-Fi, 215 00:07:19,440 --> 00:07:21,689 Bluetooth, maybe USB, maybe you can keep 216 00:07:21,690 --> 00:07:23,819 the thing off the network entirely. 217 00:07:23,820 --> 00:07:26,279 Just keep it in a bomb shelter, you know, 218 00:07:26,280 --> 00:07:28,409 like in a Faraday 219 00:07:28,410 --> 00:07:29,410 cage. 220 00:07:30,270 --> 00:07:32,399 So in order to bring this back to 221 00:07:32,400 --> 00:07:34,499 reality, imagine you're 222 00:07:34,500 --> 00:07:36,899 in the realm of software development.
223 00:07:36,900 --> 00:07:38,999 Can your most secure computer, 224 00:07:39,000 --> 00:07:41,399 in your ideal 225 00:07:41,400 --> 00:07:43,739 scenario, still be useful 226 00:07:43,740 --> 00:07:44,819 for software development? 227 00:07:44,820 --> 00:07:46,319 So we're thinking about a developer's 228 00:07:46,320 --> 00:07:47,639 laptop, or we're thinking about a build 229 00:07:47,640 --> 00:07:49,949 server used by a real-world mainstream 230 00:07:49,950 --> 00:07:50,759 software project. 231 00:07:50,760 --> 00:07:52,199 So in the case of many open source 232 00:07:52,200 --> 00:07:54,149 projects, this computer is often 233 00:07:54,150 --> 00:07:55,079 networked. 234 00:07:55,080 --> 00:07:57,119 It could be a set 235 00:07:57,120 --> 00:07:58,979 of build servers that several people have 236 00:07:58,980 --> 00:08:01,139 access to, to upload their packages. 237 00:08:01,140 --> 00:08:03,239 It could be the developer's laptop that is 238 00:08:03,240 --> 00:08:05,579 mobile, moves from hacker conference 239 00:08:05,580 --> 00:08:07,679 to hacker conference, is left in 240 00:08:07,680 --> 00:08:09,329 hotel rooms, is 241 00:08:11,640 --> 00:08:13,439 potentially vulnerable to any number of 242 00:08:13,440 --> 00:08:15,269 physical vectors. 243 00:08:15,270 --> 00:08:16,829 And even in the extreme scenario where 244 00:08:16,830 --> 00:08:19,019 you have an institution that's capable 245 00:08:19,020 --> 00:08:20,789 of maintaining physical security over 246 00:08:20,790 --> 00:08:22,919 this computer, you still have to 247 00:08:22,920 --> 00:08:25,979 ferry data to and from it, often 248 00:08:25,980 --> 00:08:28,349 via USB devices or 249 00:08:28,350 --> 00:08:29,609 cold storage. 250 00:08:29,610 --> 00:08:31,019 And there are several vectors. 251 00:08:31,020 --> 00:08:33,449 And just this year, Karsten Nohl showed this for 252 00:08:33,450 --> 00:08:35,788 USB, through the BadUSB talk.
253 00:08:35,789 --> 00:08:37,589 There are several vectors for USB devices 254 00:08:37,590 --> 00:08:40,349 to maintain persistence on 255 00:08:40,350 --> 00:08:42,719 air-gapped, so-called air-gapped, machines, 256 00:08:42,720 --> 00:08:44,879 even if you're doing things like 257 00:08:44,880 --> 00:08:47,009 reinstalling your OS periodically to 258 00:08:47,010 --> 00:08:49,139 try and wipe it clean. And then in 259 00:08:49,140 --> 00:08:50,609 more extreme scenarios, you'll have to 260 00:08:50,610 --> 00:08:53,399 run Windows on that 261 00:08:53,400 --> 00:08:54,929 machine to be able to build Windows 262 00:08:54,930 --> 00:08:56,459 packages. 263 00:08:56,460 --> 00:08:58,229 In the case of the browser vendors, they 264 00:08:58,230 --> 00:09:00,119 do what's called profile-guided 265 00:09:00,120 --> 00:09:02,129 optimization, where they actually crawl 266 00:09:02,130 --> 00:09:04,289 the Alexa top one million and 267 00:09:04,290 --> 00:09:06,389 get all that HTML, all that JavaScript, 268 00:09:06,390 --> 00:09:08,279 unauthenticated, and feed it through a 269 00:09:08,280 --> 00:09:10,349 profiler on a machine that outputs 270 00:09:10,350 --> 00:09:12,669 a profile 271 00:09:12,670 --> 00:09:14,849 optimization file that tells another 272 00:09:14,850 --> 00:09:17,069 machine, which may be disconnected, how 273 00:09:17,070 --> 00:09:19,109 to arbitrarily rewrite the resulting 274 00:09:19,110 --> 00:09:21,869 binary to make it faster. 275 00:09:21,870 --> 00:09:23,969 So you can imagine a 276 00:09:23,970 --> 00:09:25,469 piece of malware that targets that 277 00:09:25,470 --> 00:09:27,199 optimizer.
278 00:09:27,200 --> 00:09:29,299 This causes the profiling process, at a late 279 00:09:29,300 --> 00:09:30,829 stage in the build process, to generate a 280 00:09:30,830 --> 00:09:33,889 profile that causes 281 00:09:33,890 --> 00:09:36,079 arbitrary rewrites of the 282 00:09:36,080 --> 00:09:38,029 binary to introduce gadgets for 283 00:09:38,030 --> 00:09:39,319 exploitation and malware. 284 00:09:39,320 --> 00:09:40,849 So if you're thinking not just of the 285 00:09:40,850 --> 00:09:43,099 source code as a target of attack, but 286 00:09:43,100 --> 00:09:44,779 really the infrastructure that's used to 287 00:09:44,780 --> 00:09:46,699 produce the software that's used to 288 00:09:46,700 --> 00:09:48,319 produce the binaries as a target of 289 00:09:48,320 --> 00:09:50,329 attack, it's a big deal. 290 00:09:50,330 --> 00:09:51,679 So this gets even worse. 291 00:09:51,680 --> 00:09:53,809 So take your most secure computer 292 00:09:53,810 --> 00:09:55,939 that you've been thinking about and 293 00:09:55,940 --> 00:09:58,159 imagine that 294 00:09:58,160 --> 00:09:59,839 not only are all these attack vectors 295 00:09:59,840 --> 00:10:01,159 possible, not only does it have to be 296 00:10:01,160 --> 00:10:03,619 used in these risky 297 00:10:03,620 --> 00:10:05,779 use scenarios: what if 298 00:10:05,780 --> 00:10:07,579 it's extremely valuable to compromise? 299 00:10:07,580 --> 00:10:09,259 What if compromising that computer gets 300 00:10:09,260 --> 00:10:11,299 you access to hundreds of millions of 301 00:10:11,300 --> 00:10:12,349 other computers? 302 00:10:12,350 --> 00:10:13,729 What if it gets you, in the case of the 303 00:10:13,730 --> 00:10:15,829 browser vendors, what if it gets you 304 00:10:15,830 --> 00:10:17,719 access to every bank account in the 305 00:10:17,720 --> 00:10:20,359 world?
In the case of software that runs 306 00:10:20,360 --> 00:10:22,039 the financial system, what if it gets you 307 00:10:22,040 --> 00:10:23,659 access to every Windows computer in the 308 00:10:23,660 --> 00:10:25,159 world? And maybe the 309 00:10:25,160 --> 00:10:26,569 adversary is not even 310 00:10:26,570 --> 00:10:28,079 compromising Microsoft here. 311 00:10:28,080 --> 00:10:30,259 Maybe they're just compromising a popular 312 00:10:30,260 --> 00:10:32,539 dependency for Windows 313 00:10:32,540 --> 00:10:34,069 development, or something like Flash 314 00:10:34,070 --> 00:10:35,599 that's on ninety-five percent of Windows 315 00:10:35,600 --> 00:10:37,789 computers. Or, in the case of 316 00:10:37,790 --> 00:10:40,069 Debian, Red Hat and Ubuntu, every Linux 317 00:10:40,070 --> 00:10:41,210 server in the world. 318 00:10:42,320 --> 00:10:44,689 And then you can carry 319 00:10:44,690 --> 00:10:46,429 this thought experiment further. 320 00:10:46,430 --> 00:10:47,959 If you think about how much that computer 321 00:10:47,960 --> 00:10:49,759 is worth in monetary terms: 322 00:10:49,760 --> 00:10:52,189 if you're targeting servers, a remote 323 00:10:52,190 --> 00:10:53,269 0-day against 324 00:10:54,590 --> 00:10:56,929 hardened servers can go for 325 00:10:56,930 --> 00:10:58,549 one hundred thousand dollars 326 00:10:58,550 --> 00:11:00,409 or five hundred thousand dollars or more 327 00:11:00,410 --> 00:11:01,410 on the black market. 328 00:11:02,660 --> 00:11:04,039 If you're talking about, in the case of 329 00:11:04,040 --> 00:11:05,389 the Tor Project, censorship 330 00:11:05,390 --> 00:11:07,579 infrastructure, Iran and China 331 00:11:07,580 --> 00:11:09,439 spend hundreds of millions of dollars a 332 00:11:09,440 --> 00:11:10,949 year on their firewall.
333 00:11:10,950 --> 00:11:13,009 So disabling something like Tor can 334 00:11:13,010 --> 00:11:14,989 be potentially worth even more. 335 00:11:14,990 --> 00:11:17,269 And if you're talking about financial 336 00:11:17,270 --> 00:11:19,369 software, just 337 00:11:19,370 --> 00:11:22,489 a small example is the Bitcoin community: 338 00:11:22,490 --> 00:11:23,989 the current market cap of all the 339 00:11:23,990 --> 00:11:25,709 bitcoins in circulation is four billion 340 00:11:25,710 --> 00:11:26,839 dollars. 341 00:11:26,840 --> 00:11:28,609 Many of those are offline, obviously, but 342 00:11:28,610 --> 00:11:30,290 obviously still a very juicy target. 343 00:11:31,860 --> 00:11:32,879 So I don't know if you want to speak 344 00:11:32,880 --> 00:11:35,089 about your conversations with. 345 00:11:35,090 --> 00:11:37,159 Yeah, so I'm not directly involved in 346 00:11:37,160 --> 00:11:39,139 the Bitcoin development community, but 347 00:11:39,140 --> 00:11:40,819 from what I've heard, and I'm sorry that 348 00:11:40,820 --> 00:11:43,219 we don't have Bitcoin developers on stage 349 00:11:43,220 --> 00:11:44,989 to describe this firsthand, from what 350 00:11:44,990 --> 00:11:46,399 I've understood from talking to Bitcoin 351 00:11:46,400 --> 00:11:48,379 developers, this became a really concrete 352 00:11:48,380 --> 00:11:50,479 concern, certainly as 353 00:11:50,480 --> 00:11:52,549 the amount of money 354 00:11:52,550 --> 00:11:54,469 involved in the Bitcoin ecosystem has 355 00:11:54,470 --> 00:11:56,689 grown. The idea is basically that 356 00:11:56,690 --> 00:11:59,299 Bitcoin transfers are irrevocable. 357 00:11:59,300 --> 00:12:01,849 And so if someone can cause 358 00:12:01,850 --> 00:12:03,380 a Bitcoin client to make 359 00:12:04,850 --> 00:12:07,069 a transaction, they can steal someone's 360 00:12:07,070 --> 00:12:09,620 bitcoins anonymously and irreversibly.
361 00:12:11,570 --> 00:12:13,639 If you imagine malware in 362 00:12:13,640 --> 00:12:15,949 the Bitcoin client itself, the malware 363 00:12:15,950 --> 00:12:18,109 could sort of wait for a year and then do 364 00:12:18,110 --> 00:12:20,359 this at a certain point, and 365 00:12:20,360 --> 00:12:22,639 someone might say, oh, 366 00:12:22,640 --> 00:12:24,469 the developer put that malware in there 367 00:12:24,470 --> 00:12:26,029 to steal everyone's money, because the 368 00:12:26,030 --> 00:12:27,559 developer obviously had the necessary 369 00:12:27,560 --> 00:12:29,029 access to do that. 370 00:12:29,030 --> 00:12:30,439 And the developer might say, no, no, 371 00:12:30,440 --> 00:12:32,149 someone hacked into my machine. 372 00:12:32,150 --> 00:12:33,379 It wasn't me. 373 00:12:33,380 --> 00:12:35,059 It was a third party. 374 00:12:35,060 --> 00:12:36,709 Well, people might not believe the 375 00:12:36,710 --> 00:12:38,899 developer, or the developer has 376 00:12:38,900 --> 00:12:40,699 an incentive to lie about that. 377 00:12:40,700 --> 00:12:42,979 And so really, 378 00:12:42,980 --> 00:12:45,589 the developer's inability to 379 00:12:45,590 --> 00:12:47,779 sneak something into the code or to have 380 00:12:47,780 --> 00:12:49,129 someone else sneak something into the 381 00:12:49,130 --> 00:12:51,169 code late in the development process is a 382 00:12:51,170 --> 00:12:52,819 source of protection for the developer. 383 00:12:52,820 --> 00:12:54,649 And I think this really stimulated the 384 00:12:54,650 --> 00:12:56,419 Tor Project thinking about this in the 385 00:12:56,420 --> 00:12:57,439 analogous consideration. 386 00:12:57,440 --> 00:12:59,509 I was very concerned actually about 387 00:12:59,510 --> 00:13:01,759 our build engineers traveling through 388 00:13:01,760 --> 00:13:03,889 various jurisdictions, potentially 389 00:13:03,890 --> 00:13:04,999 with their build machines and 390 00:13:05,000 --> 00:13:05,899 cryptographic material.
391 00:13:05,900 --> 00:13:07,969 And the laptops have produced 392 00:13:07,970 --> 00:13:09,799 a lot of our packages, as we're a very 393 00:13:09,800 --> 00:13:11,149 distributed organization. 394 00:13:11,150 --> 00:13:12,529 We don't have a lot of centralized, 395 00:13:12,530 --> 00:13:15,649 secure, physically secure, networked 396 00:13:15,650 --> 00:13:17,689 or offline computers to be able 397 00:13:17,690 --> 00:13:18,679 to build. 398 00:13:18,680 --> 00:13:20,509 So we had the situation where our build 399 00:13:20,510 --> 00:13:22,849 engineers were traveling and might be subject 400 00:13:22,850 --> 00:13:24,529 to coercive demands to put back 401 00:13:24,530 --> 00:13:25,609 doors in software. 402 00:13:25,610 --> 00:13:27,259 And so I really wanted to eliminate that 403 00:13:27,260 --> 00:13:28,580 completely as a possibility. 404 00:13:30,550 --> 00:13:32,079 So I wanted to get people thinking very 405 00:13:32,080 --> 00:13:34,629 concretely about the 406 00:13:34,630 --> 00:13:36,849 relationship between source and binary as 407 00:13:36,850 --> 00:13:38,919 a vector for introducing bugs and 408 00:13:38,920 --> 00:13:39,920 vulnerabilities. 409 00:13:40,660 --> 00:13:42,309 So I have a couple of concrete examples 410 00:13:42,310 --> 00:13:44,019 just to sort of stimulate thinking in 411 00:13:44,020 --> 00:13:45,279 this area. 412 00:13:45,280 --> 00:13:47,269 This is a real bug from back in 2002. 413 00:13:47,270 --> 00:13:49,089 So this bug is now 12 years old. 414 00:13:49,090 --> 00:13:50,710 I mean, it was fixed in 2002. 415 00:13:51,880 --> 00:13:53,380 This is a very typical bug, 416 00:13:54,670 --> 00:13:56,799 a fencepost error in the 417 00:13:56,800 --> 00:13:57,929 OpenSSH server.
418 00:13:59,080 --> 00:14:00,759 And so the idea is that the programmer 419 00:14:00,760 --> 00:14:02,859 wrote greater than instead 420 00:14:02,860 --> 00:14:04,659 of writing greater than or equal to, 421 00:14:04,660 --> 00:14:05,859 because this is something that was 422 00:14:05,860 --> 00:14:06,999 counted starting from zero. 423 00:14:07,000 --> 00:14:08,409 And the programmer was thinking in terms 424 00:14:08,410 --> 00:14:09,429 of counting from one, 425 00:14:10,900 --> 00:14:12,639 a totally common fencepost error. 426 00:14:12,640 --> 00:14:15,279 And so this is the fix that was applied, 427 00:14:15,280 --> 00:14:16,959 where the condition should have been 428 00:14:16,960 --> 00:14:18,459 greater than or equal to instead of 429 00:14:18,460 --> 00:14:20,049 greater than. Right. 430 00:14:20,050 --> 00:14:21,309 So it got fixed. 431 00:14:21,310 --> 00:14:23,559 So what kind 432 00:14:23,560 --> 00:14:25,750 of change did this produce when the 433 00:14:26,920 --> 00:14:27,920 fix was applied? 434 00:14:29,450 --> 00:14:31,429 Well, I went through the assembly and 435 00:14:31,430 --> 00:14:32,989 compilation process to look at this 436 00:14:32,990 --> 00:14:34,699 because I had a hypothesis about how big 437 00:14:34,700 --> 00:14:36,529 it would be, and in fact the hypothesis 438 00:14:36,530 --> 00:14:38,989 was right. The difference in the binary 439 00:14:38,990 --> 00:14:41,479 is going to be a single bit, fixing 440 00:14:41,480 --> 00:14:43,510 this apparently remotely exploitable bug. 441 00:14:44,750 --> 00:14:46,879 So just looking through: the compiler, 442 00:14:46,880 --> 00:14:48,529 in one case, in the vulnerable case, 443 00:14:48,530 --> 00:14:50,869 generated an Intel jump-if-less-or- 444 00:14:50,870 --> 00:14:53,179 equal; in the fixed case, 445 00:14:53,180 --> 00:14:55,639 the compiler generated jump-if-less.
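Editor's sketch: the fencepost error described above can be illustrated with a toy bounds check. This is a minimal Python sketch, not the OpenSSH code; the function names and the table size are hypothetical and chosen only for illustration.

```python
def index_accepted_buggy(index, table_size):
    # Off-by-one: rejects only index > table_size, so index == table_size
    # is accepted even though valid slots are 0 .. table_size - 1.
    return not (index < 0 or index > table_size)


def index_accepted_fixed(index, table_size):
    # Corrected comparison: anything >= table_size is out of range.
    return not (index < 0 or index >= table_size)


TABLE_SIZE = 4  # hypothetical table with slots 0, 1, 2, 3

buggy = [i for i in range(-1, 6) if index_accepted_buggy(i, TABLE_SIZE)]
fixed = [i for i in range(-1, 6) if index_accepted_fixed(i, TABLE_SIZE)]
print("buggy check accepts:", buggy)
print("fixed check accepts:", fixed)
```

The only difference between the two checks is one comparison operator, which is why the compiled difference can be as small as the talk describes.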
446 00:14:55,640 --> 00:14:57,229 And if you look at how those opcodes are 447 00:14:57,230 --> 00:14:59,569 represented, it's 448 00:14:59,570 --> 00:15:01,129 a difference of a single bit. 449 00:15:01,130 --> 00:15:02,419 There's also a corresponding case for 450 00:15:02,420 --> 00:15:04,009 greater than or equal versus greater 451 00:15:04,010 --> 00:15:05,119 than, which is also a difference of a 452 00:15:05,120 --> 00:15:06,469 single bit. 453 00:15:06,470 --> 00:15:07,519 Right. 454 00:15:07,520 --> 00:15:09,379 So this is a small excerpt from the 455 00:15:09,380 --> 00:15:11,599 assembly code of the server from the 456 00:15:11,600 --> 00:15:13,399 version back in 2002. 457 00:15:13,400 --> 00:15:15,889 On the left is the vulnerable 458 00:15:15,890 --> 00:15:17,269 version. On the right is the fixed 459 00:15:17,270 --> 00:15:19,249 version. You might not notice the change. 460 00:15:20,990 --> 00:15:21,990 It's pretty small. 461 00:15:24,310 --> 00:15:25,760 That's the change there, right? 462 00:15:26,890 --> 00:15:29,169 And when we compile it, this is a short 463 00:15:29,170 --> 00:15:31,389 excerpt from a 500 kilobyte binary, 464 00:15:33,910 --> 00:15:36,039 and that's the only change in the binary 465 00:15:36,040 --> 00:15:37,599 as a result of fixing this remotely 466 00:15:37,600 --> 00:15:38,619 exploitable bug. 467 00:15:38,620 --> 00:15:40,329 So what that means is you have a 468 00:15:40,330 --> 00:15:42,459 particular concrete case where if you 469 00:15:42,460 --> 00:15:44,649 can flip a single bit in a binary, 470 00:15:44,650 --> 00:15:46,569 that makes the difference between 471 00:15:46,570 --> 00:15:48,159 a remotely exploitable bug 472 00:15:48,160 --> 00:15:49,720 being present or absent.
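Editor's sketch: the single-bit claim can be checked against the one-byte opcodes of the x86 short conditional jumps. The talk does not say which encoding the compiler emitted, so the short signed forms (`jl`, `jge`, `jle`, `jg`) are an assumption here; the helper name is hypothetical.

```python
# One-byte opcodes of the x86 short signed conditional jumps (Jcc rel8).
SHORT_JCC_OPCODES = {
    "jl": 0x7C,   # jump if less
    "jge": 0x7D,  # jump if greater or equal
    "jle": 0x7E,  # jump if less or equal
    "jg": 0x7F,   # jump if greater
}


def differing_bits(a, b):
    """Count the bit positions in which two opcode bytes differ."""
    return bin(a ^ b).count("1")


for left, right in (("jle", "jl"), ("jge", "jg")):
    a, b = SHORT_JCC_OPCODES[left], SHORT_JCC_OPCODES[right]
    print(f"{left} {a:08b} vs {right} {b:08b}: {differing_bits(a, b)} bit(s) differ")
```

In both pairs the XOR of the two opcode bytes is `0x02`, so exactly one bit separates the vulnerable comparison from the fixed one under this assumption.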
473 00:15:52,290 --> 00:15:54,149 Now, I also have a demo that I thought 474 00:15:54,150 --> 00:15:56,309 would also make the sort of 475 00:15:56,310 --> 00:15:58,139 trust in the laptop issue concrete for 476 00:15:58,140 --> 00:15:59,140 people. 477 00:16:00,330 --> 00:16:01,740 So I wrote a kernel-mode rootkit 478 00:16:03,570 --> 00:16:05,759 just to think about how 479 00:16:05,760 --> 00:16:07,169 much do you know about what your system 480 00:16:07,170 --> 00:16:08,549 is doing when you compile something? 481 00:16:11,040 --> 00:16:12,429 So I have here a program that does 482 00:16:12,430 --> 00:16:13,589 something. Is that big enough for people 483 00:16:13,590 --> 00:16:14,590 to read? 484 00:16:16,620 --> 00:16:18,329 I have here a little program that does 485 00:16:18,330 --> 00:16:20,309 something really important, which is that 486 00:16:20,310 --> 00:16:21,330 it adds two numbers. 487 00:16:23,340 --> 00:16:25,830 We could take a checksum of the program, 488 00:16:27,240 --> 00:16:29,129 maybe a more cryptographically sound 489 00:16:29,130 --> 00:16:30,059 checksum. 490 00:16:30,060 --> 00:16:31,109 Right. 491 00:16:31,110 --> 00:16:32,369 So this is the source code of this 492 00:16:32,370 --> 00:16:34,529 program. We could look at it. 493 00:16:34,530 --> 00:16:35,530 And add. 494 00:16:38,580 --> 00:16:40,049 Right. So it takes two command line 495 00:16:40,050 --> 00:16:41,489 arguments and it adds them and outputs 496 00:16:41,490 --> 00:16:42,490 the result. 497 00:16:43,590 --> 00:16:44,970 So I'm just going to compile that, 498 00:16:48,210 --> 00:16:50,009 and what numbers? I'll add seventeen and 499 00:16:50,010 --> 00:16:51,010 twenty three. 500 00:16:52,740 --> 00:16:54,809 OK. Seventeen plus twenty three is forty. 501 00:17:03,860 --> 00:17:05,660 So I'm just going to 502 00:17:06,980 --> 00:17:09,169 install a rootkit on my laptop. 503 00:17:10,690 --> 00:17:12,019 OK. 504 00:17:12,020 --> 00:17:13,999 Using the handy install rootkit command.
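Editor's sketch: the checksum step taken before the rootkit is installed amounts to hashing the file's bytes. A minimal Python version is below; the exact tools used on stage are not recoverable from the transcript, and the temporary file and its contents are hypothetical stand-ins for the demo program.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256"):
    """Hash a file's bytes in chunks and return the hex digest."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical stand-in for the demo's source file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"/* adds two numbers */\n")
    source_path = tmp.name

print("sha1  ", file_digest(source_path, "sha1"))
print("sha256", file_digest(source_path, "sha256"))
os.remove(source_path)
```

A digest like this only describes the bytes that the hashing process was given, which is the limitation the rest of the demo turns on.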
505 00:17:17,980 --> 00:17:20,169 So I hope you've memorized the checksums 506 00:17:20,170 --> 00:17:21,219 there, because I'm just going to go 507 00:17:21,220 --> 00:17:22,449 through this process again and take a 508 00:17:22,450 --> 00:17:24,669 look at the code, right? 509 00:17:24,670 --> 00:17:25,670 So there's the code. 510 00:17:33,470 --> 00:17:34,670 Checksums are still the same. 511 00:17:40,770 --> 00:17:41,999 OK, there's the source code. 512 00:17:46,500 --> 00:17:48,599 And let's add 17 and 23, having 513 00:17:48,600 --> 00:17:49,600 compiled this. 514 00:18:11,070 --> 00:18:12,929 And so just to point out quickly what 515 00:18:12,930 --> 00:18:14,400 this particular rootkit is doing: 516 00:18:17,280 --> 00:18:20,579 if I make a copy of the cat program 517 00:18:20,580 --> 00:18:22,889 and look at this code with it, it's 518 00:18:22,890 --> 00:18:23,890 totally normal. 519 00:18:24,990 --> 00:18:27,359 But if that copy of cat were called 520 00:18:27,360 --> 00:18:30,419 cc1, which is one of the internal 521 00:18:30,420 --> 00:18:32,609 subprocesses used by the C compiler, 522 00:18:35,520 --> 00:18:37,169 then the kernel would notice that it was 523 00:18:37,170 --> 00:18:38,939 the compiler trying to read the source 524 00:18:38,940 --> 00:18:41,309 code, as opposed to some other program, 525 00:18:41,310 --> 00:18:42,989 and it would actually cause it to open a 526 00:18:42,990 --> 00:18:45,419 different version of the source code. 527 00:18:45,420 --> 00:18:47,069 And so that means that what this rootkit is 528 00:18:47,070 --> 00:18:49,529 doing is introducing a difference 529 00:18:49,530 --> 00:18:51,659 in the content of files 530 00:18:51,660 --> 00:18:53,849 just at the last moment, as you're 531 00:18:53,850 --> 00:18:55,559 actually compiling them.
532 00:18:55,560 --> 00:18:57,720 So if you were to look at, 533 00:19:01,170 --> 00:19:02,579 if you were to look at the source code 534 00:19:02,580 --> 00:19:04,439 using any kind of tool on your system 535 00:19:04,440 --> 00:19:06,329 other than the compiler, you would say, 536 00:19:06,330 --> 00:19:08,309 oh, my source code is totally correct. 537 00:19:08,310 --> 00:19:10,529 If you look at its SHA-1 sum, 538 00:19:10,530 --> 00:19:12,369 you'd say my source code is totally correct 539 00:19:12,370 --> 00:19:14,579 and unmodified, and there are no changes 540 00:19:14,580 --> 00:19:16,169 on the disk. Right. 541 00:19:16,170 --> 00:19:17,849 The source code on the disk is correct. 542 00:19:17,850 --> 00:19:19,409 The compiler on the disk is correct. 543 00:19:19,410 --> 00:19:21,479 The kernel on the disk is correct. 544 00:19:21,480 --> 00:19:23,219 All of the modifications are happening in 545 00:19:23,220 --> 00:19:24,599 RAM, and then you're getting a modified 546 00:19:24,600 --> 00:19:26,699 binary at the last minute 547 00:19:26,700 --> 00:19:28,139 as you actually compile it. 548 00:19:28,140 --> 00:19:29,879 And the compiler is faithfully compiling 549 00:19:29,880 --> 00:19:30,779 what it's given. 550 00:19:30,780 --> 00:19:32,819 It's just that in this case, the kernel 551 00:19:32,820 --> 00:19:35,249 isn't allowing it to access 552 00:19:35,250 --> 00:19:36,269 the proper source code. 553 00:19:36,270 --> 00:19:38,339 It's accessing a slightly incorrect 554 00:19:38,340 --> 00:19:39,779 version of the source code. 555 00:19:39,780 --> 00:19:41,039 Now, of course, that was just a very 556 00:19:41,040 --> 00:19:42,569 basic proof-of-concept rootkit.
557 00:19:42,570 --> 00:19:44,519 I mean, you can't expect to defend 558 00:19:44,520 --> 00:19:46,859 against an arbitrary rootkit by renaming 559 00:19:46,860 --> 00:19:48,630 your compiler to something else, 560 00:19:49,680 --> 00:19:51,689 or renaming 561 00:19:51,690 --> 00:19:53,819 your editor to something else, and 562 00:19:53,820 --> 00:19:55,649 hoping you'll be able to actually see the 563 00:19:55,650 --> 00:19:56,619 true source code. 564 00:19:56,620 --> 00:19:58,409 A more sophisticated rootkit could, of 565 00:19:58,410 --> 00:20:00,449 course, inspect the virtual address space 566 00:20:00,450 --> 00:20:02,579 of what claims to be 567 00:20:02,580 --> 00:20:04,229 the compiler, make sure it actually is a 568 00:20:04,230 --> 00:20:06,809 compiler, and only act in those cases. 569 00:20:06,810 --> 00:20:08,519 Yeah, and so the basic idea of this kind 570 00:20:08,520 --> 00:20:10,799 of rootkit is to say we want to produce 571 00:20:10,800 --> 00:20:13,289 a specified change only in the binary 572 00:20:13,290 --> 00:20:15,569 of a very specific program, only 573 00:20:15,570 --> 00:20:16,889 at the moment that it's actually 574 00:20:16,890 --> 00:20:19,169 compiled, so that the developer will say, 575 00:20:19,170 --> 00:20:21,149 OK, I compiled it myself, I checked the 576 00:20:21,150 --> 00:20:22,869 integrity of everything myself. 577 00:20:22,870 --> 00:20:24,499 This binary is good, right?
578 00:20:26,520 --> 00:20:28,739 OK, so 579 00:20:28,740 --> 00:20:30,299 I think it's going to be very hard to 580 00:20:30,300 --> 00:20:31,679 defend against this if you allow the 581 00:20:31,680 --> 00:20:33,539 possibility that the server or the laptop 582 00:20:33,540 --> 00:20:34,919 that you use to actually make your 583 00:20:34,920 --> 00:20:36,809 binaries may have been compromised at 584 00:20:36,810 --> 00:20:38,249 some point or may be compromised at some 585 00:20:38,250 --> 00:20:39,299 point in the future. 586 00:20:39,300 --> 00:20:41,549 So, yeah. So the idea behind build 587 00:20:41,550 --> 00:20:44,099 reproducibility is eliminating that developer's 588 00:20:44,100 --> 00:20:46,169 laptop, or any build machine, as 589 00:20:46,170 --> 00:20:48,659 a single point of failure, basically 590 00:20:48,660 --> 00:20:50,579 bringing some science into software 591 00:20:50,580 --> 00:20:51,580 development for once. 592 00:20:52,980 --> 00:20:55,169 If you give anybody around 593 00:20:55,170 --> 00:20:57,269 the world the ability to take the source 594 00:20:57,270 --> 00:20:59,699 code and produce 595 00:20:59,700 --> 00:21:00,930 an identical binary 596 00:21:02,160 --> 00:21:03,929 to the one that's being distributed, you 597 00:21:03,930 --> 00:21:06,089 then require the adversary to basically 598 00:21:06,090 --> 00:21:07,869 have to compromise everybody. 599 00:21:07,870 --> 00:21:09,959 And this addresses a pretty wide range 600 00:21:09,960 --> 00:21:11,819 of threat models that relate to the 601 00:21:11,820 --> 00:21:13,170 compilation process itself, 602 00:21:14,400 --> 00:21:16,469 in the sense that it's not just about 603 00:21:16,470 --> 00:21:19,319 whether the developer is malicious or 604 00:21:19,320 --> 00:21:21,029 whether a third party is malicious or 605 00:21:21,030 --> 00:21:23,519 whether the developer is under coercion. 606 00:21:23,520 --> 00:21:25,979 Right.
It's a wide range of things, 607 00:21:25,980 --> 00:21:27,869 any kind of particular threat that might 608 00:21:27,870 --> 00:21:29,609 result in a discrepancy between the 609 00:21:29,610 --> 00:21:30,610 source code and the binary. 610 00:21:31,560 --> 00:21:34,139 And it also turns out that this sort of 611 00:21:34,140 --> 00:21:35,699 development practice is actually very 612 00:21:35,700 --> 00:21:37,229 useful as sort of a canary in the coal 613 00:21:37,230 --> 00:21:39,029 mine to help. 614 00:21:39,030 --> 00:21:40,889 You know, if there's something wrong with 615 00:21:40,890 --> 00:21:42,749 your build machines and the builds stop 616 00:21:42,750 --> 00:21:44,909 being reproduced faithfully 617 00:21:44,910 --> 00:21:46,379 there, something's wrong with the 618 00:21:46,380 --> 00:21:47,699 machine. Either it's having a transient 619 00:21:47,700 --> 00:21:49,769 hardware failure or some issue with the build 620 00:21:49,770 --> 00:21:51,899 process, or perhaps it has been 621 00:21:51,900 --> 00:21:52,949 compromised. 622 00:21:52,950 --> 00:21:55,019 So you get that external validation 623 00:21:55,020 --> 00:21:57,149 for free, basically, from your developer 624 00:21:57,150 --> 00:21:58,150 and enthusiast community. 625 00:22:00,940 --> 00:22:03,249 So how hard is this? 626 00:22:03,250 --> 00:22:05,319 Well, it depends on what you're 627 00:22:05,320 --> 00:22:06,429 trying to compile. 628 00:22:07,750 --> 00:22:10,029 For sophisticated, very 629 00:22:10,030 --> 00:22:11,829 large software projects, especially ones 630 00:22:11,830 --> 00:22:13,689 that have custom scripted portions of 631 00:22:13,690 --> 00:22:15,129 their build process, it can get quite 632 00:22:15,130 --> 00:22:16,130 involved. 633 00:22:16,660 --> 00:22:18,369 The most obvious differences are if the 634 00:22:18,370 --> 00:22:20,199 build machine is configured differently 635 00:22:20,200 --> 00:22:21,339 or has different software.
636 00:22:21,340 --> 00:22:23,349 You have either different compilers, 637 00:22:23,350 --> 00:22:24,879 different optimization flags, different 638 00:22:24,880 --> 00:22:27,579 header files, different library versions, 639 00:22:27,580 --> 00:22:30,129 but it can also extend to 640 00:22:30,130 --> 00:22:32,109 build processes that pull in metadata 641 00:22:32,110 --> 00:22:33,280 from the system. 642 00:22:34,660 --> 00:22:36,459 It's very common to include the build 643 00:22:36,460 --> 00:22:38,889 hostname, the kernel version, 644 00:22:38,890 --> 00:22:40,449 the file modification times. 645 00:22:40,450 --> 00:22:41,530 A lot of times, 646 00:22:42,740 --> 00:22:44,979 it turns out, the debugging formats 647 00:22:44,980 --> 00:22:47,319 for ELF actually often include 648 00:22:47,320 --> 00:22:49,449 the full path reference to where the 649 00:22:49,450 --> 00:22:50,789 source code is expected to be. 650 00:22:52,180 --> 00:22:54,639 The container formats, 651 00:22:54,640 --> 00:22:56,859 such as tar, zip and 652 00:22:56,860 --> 00:22:58,989 jar and APK, 653 00:22:58,990 --> 00:23:01,179 end up with metadata and file 654 00:23:01,180 --> 00:23:03,369 system source data. 655 00:23:03,370 --> 00:23:05,139 You end up with signatures. 656 00:23:05,140 --> 00:23:07,149 You have entropy that can be introduced 657 00:23:07,150 --> 00:23:08,919 by signatures. If you have any signing 658 00:23:08,920 --> 00:23:10,629 process as part of your distribution and 659 00:23:10,630 --> 00:23:12,429 built into the packages, you probably 660 00:23:12,430 --> 00:23:13,599 don't want to allow third parties to 661 00:23:13,600 --> 00:23:15,159 reproduce your signatures, right?
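The archive metadata problem mentioned here can be made concrete with a small sketch. This is not the tooling used by any of the projects in the talk; it is a minimal Python illustration, assuming the inputs are already in memory, of how an archive stops depending on the build machine once member order, timestamps and ownership are pinned. The function name `deterministic_tar` and the constant `FIXED_MTIME` are made up for this example.

```python
import io
import tarfile

# Hypothetical reference timestamp; any agreed-upon constant works.
FIXED_MTIME = 0


def deterministic_tar(files):
    """Pack {name: bytes} into an uncompressed tar whose bytes depend
    only on the names and contents, not on the machine that built it."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # sorted() on str compares code points, so the order does not
        # depend on directory listing order or on the current locale.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = FIXED_MTIME      # no wall-clock timestamps
            info.uid = info.gid = 0       # no build-user ids
            info.uname = info.gname = ""  # no build-user names
            info.mode = 0o644             # no umask-dependent permissions
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


first = deterministic_tar({"b.txt": b"2", "a.txt": b"1"})
second = deterministic_tar({"a.txt": b"1", "b.txt": b"2"})
print(first == second)
```

Two runs with the same contents give byte-identical archives regardless of insertion order. Compression layers such as gzip carry their own timestamp field and would need the same treatment.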
662 00:23:15,160 --> 00:23:16,599 Obviously, you want 663 00:23:16,600 --> 00:23:18,519 to keep some things 664 00:23:18,520 --> 00:23:20,109 secret, such as key material. 665 00:23:21,160 --> 00:23:23,799 Test-driven optimization is another 666 00:23:23,800 --> 00:23:25,839 problem that browser manufacturers or 667 00:23:25,840 --> 00:23:27,669 producers are grappling with, as I 668 00:23:27,670 --> 00:23:29,739 mentioned earlier, where they have to 669 00:23:29,740 --> 00:23:32,079 try and tune their machine code 670 00:23:32,080 --> 00:23:34,210 for the most popular websites. 671 00:23:35,820 --> 00:23:37,469 So who is doing this? 672 00:23:37,470 --> 00:23:39,659 I think the 673 00:23:39,660 --> 00:23:41,879 widest, 674 00:23:41,880 --> 00:23:44,519 most public institution 675 00:23:44,520 --> 00:23:46,679 that first made this a process 676 00:23:46,680 --> 00:23:48,059 for all of their builds was the Bitcoin 677 00:23:48,060 --> 00:23:50,699 community. However, back in the nineties, 678 00:23:50,700 --> 00:23:52,829 the Free Software Foundation was 679 00:23:52,830 --> 00:23:54,059 concerned with this property. 680 00:23:54,060 --> 00:23:55,769 And I mean, if you trawl through the 681 00:23:55,770 --> 00:23:58,309 ancient documentation, man pages 682 00:23:58,310 --> 00:24:00,599 from the pre-info era even, 683 00:24:00,600 --> 00:24:02,849 you see references to how 684 00:24:02,850 --> 00:24:05,309 to do deterministic linking with ar 685 00:24:05,310 --> 00:24:06,569 and ld. 686 00:24:06,570 --> 00:24:08,759 So the toolchain people have been thinking 687 00:24:08,760 --> 00:24:10,409 about this for a while, but it just 688 00:24:10,410 --> 00:24:12,179 hasn't been deployed in practice. 689 00:24:12,180 --> 00:24:13,859 A lot of the toolchain reproducibility 690 00:24:13,860 --> 00:24:16,109 work from the 90s was done by 691 00:24:16,110 --> 00:24:17,219 Cygnus.
692 00:24:17,220 --> 00:24:18,959 When Cygnus was maintaining the GNU 693 00:24:18,960 --> 00:24:21,419 toolchain, they actually had some tests 694 00:24:21,420 --> 00:24:23,669 that involved getting 695 00:24:23,670 --> 00:24:25,109 reproducibility within the toolchain 696 00:24:25,110 --> 00:24:27,329 itself, like compiling the compiler 697 00:24:27,330 --> 00:24:29,369 twice and making sure that the output was 698 00:24:29,370 --> 00:24:30,370 the same. 699 00:24:31,080 --> 00:24:32,099 John Gilmore said that 700 00:24:32,100 --> 00:24:33,100 that 701 00:24:33,470 --> 00:24:35,569 turned up dozens of bugs, which Cygnus 702 00:24:35,570 --> 00:24:37,639 managed to fix in various parts 703 00:24:37,640 --> 00:24:39,979 of the toolchain, so it's good for 704 00:24:39,980 --> 00:24:41,929 other testing aspects. 705 00:24:41,930 --> 00:24:43,579 But yeah, there was this concern in the 706 00:24:43,580 --> 00:24:45,409 toolchain in the 90s and unfortunately, 707 00:24:45,410 --> 00:24:47,389 it didn't really spread throughout the 708 00:24:47,390 --> 00:24:48,019 operating system 709 00:24:48,020 --> 00:24:49,020 and applications world. 710 00:24:50,210 --> 00:24:52,519 Thankfully, that's starting to change. 711 00:24:52,520 --> 00:24:54,499 The Tor Project, as I said, was inspired 712 00:24:54,500 --> 00:24:56,779 by Bitcoin to use 713 00:24:56,780 --> 00:24:58,879 the Gitian system, which I'll get into in 714 00:24:58,880 --> 00:25:00,019 a bit. 715 00:25:00,020 --> 00:25:01,699 The Guardian Project is also working on 716 00:25:01,700 --> 00:25:03,049 making sure their packages are 717 00:25:03,050 --> 00:25:05,179 reproducible. Debian has somewhere 718 00:25:05,180 --> 00:25:07,249 around two thirds of their packages 719 00:25:07,250 --> 00:25:09,499 currently already reproducible, 720 00:25:09,500 --> 00:25:11,209 at least in one of the branches, either 721 00:25:11,210 --> 00:25:12,889 unstable or testing.
722 00:25:12,890 --> 00:25:14,729 Red Hat is working on this in Fedora. 723 00:25:14,730 --> 00:25:17,029 F-Droid has a very neat build 724 00:25:17,030 --> 00:25:19,279 verification system in the works, 725 00:25:19,280 --> 00:25:20,539 which we'll describe in a bit. 726 00:25:20,540 --> 00:25:23,269 Mozilla is interested, and we hope that 727 00:25:23,270 --> 00:25:24,889 developers in the audience and those who view 728 00:25:24,890 --> 00:25:27,589 this talk online will be inspired 729 00:25:27,590 --> 00:25:29,149 to try and do this for their own software 730 00:25:29,150 --> 00:25:30,150 projects. 731 00:25:31,060 --> 00:25:32,859 So how does Tor Browser do this? 732 00:25:32,860 --> 00:25:35,289 Well, I'm the technical 733 00:25:35,290 --> 00:25:37,750 lead on the browser, 734 00:25:38,980 --> 00:25:41,229 and for those who aren't familiar, 735 00:25:41,230 --> 00:25:43,719 basically we have a branch of Firefox. 736 00:25:43,720 --> 00:25:45,789 We have about 40 to 50 patches on top 737 00:25:45,790 --> 00:25:48,039 of Firefox for third-party tracking 738 00:25:48,040 --> 00:25:49,599 and fingerprinting defenses and Tor 739 00:25:49,600 --> 00:25:51,129 integration. 740 00:25:51,130 --> 00:25:53,139 We have a Tor client and an add-on that 741 00:25:53,140 --> 00:25:55,689 helps make configuration of 742 00:25:55,690 --> 00:25:58,029 Tor very simple for novice 743 00:25:58,030 --> 00:25:59,259 users. 744 00:25:59,260 --> 00:26:00,429 And then we have these things called 745 00:26:00,430 --> 00:26:01,899 pluggable transports, which are actually 746 00:26:01,900 --> 00:26:03,969 separate binaries that obfuscate 747 00:26:03,970 --> 00:26:05,889 the traffic for the first hop to get 748 00:26:05,890 --> 00:26:07,899 around censorship firewalls in China and 749 00:26:07,900 --> 00:26:09,669 Iran and elsewhere. 750 00:26:09,670 --> 00:26:11,349 And we have a couple of add-ons as well.
751 00:26:11,350 --> 00:26:12,680 NoScript and HTTPS Everywhere. 752 00:26:13,870 --> 00:26:16,059 So the build system we now use is 753 00:26:16,060 --> 00:26:18,099 Gitian, which, as I said, 754 00:26:18,100 --> 00:26:20,079 was developed by the Bitcoin community, 755 00:26:20,080 --> 00:26:21,969 basically. Well, I'll get into what 756 00:26:21,970 --> 00:26:23,919 Gitian is in a bit. 757 00:26:23,920 --> 00:26:25,689 Gitian produces a full 758 00:26:25,690 --> 00:26:27,819 package set and incremental 759 00:26:27,820 --> 00:26:30,069 update files as used by the Firefox 760 00:26:30,070 --> 00:26:32,829 updater, for which 761 00:26:32,830 --> 00:26:35,709 a single sha256sums file 762 00:26:35,710 --> 00:26:37,299 lists all of the hashes of all the 763 00:26:37,300 --> 00:26:39,609 individual packages, and those sha256sums 764 00:26:39,610 --> 00:26:41,679 files are signed 765 00:26:41,680 --> 00:26:44,919 by all participating official developers. 766 00:26:44,920 --> 00:26:46,359 We also were very interested in 767 00:26:46,360 --> 00:26:47,889 supporting anonymous verifiers. 768 00:26:47,890 --> 00:26:50,049 So the inputs are actually 769 00:26:50,050 --> 00:26:52,029 downloaded by default over Tor. 770 00:26:52,030 --> 00:26:54,189 So it's very hard to tell who is 771 00:26:54,190 --> 00:26:55,959 a verifier.
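A verifier's side of the hash-list check described here could look roughly like the following. This is only a sketch under assumptions: it assumes the published list uses the usual `sha256sum` text layout of hex digest, whitespace, file name, and the helper names (`sha256_of`, `parse_sums`, `mismatches`) are invented for the example. Checking the developers' signatures on the list itself is a separate step that is not shown.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse lines shaped like '<hex digest>  <file name>'."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(maxsplit=1)
        # sha256sum marks binary-mode entries with a leading '*'.
        sums[name.lstrip("*")] = digest.lower()
    return sums


def mismatches(published, build_dir):
    """Return the names whose local build is missing or hashes differently."""
    bad = []
    for name, expected in sorted(published.items()):
        path = Path(build_dir) / name
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return bad
```

An empty result from `mismatches` means the local build reproduced every listed package bit for bit; any name it returns is a discrepancy worth reporting.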
We encourage anybody who 772 00:26:55,960 --> 00:26:57,549 finds discrepancies in our builds to 773 00:26:57,550 --> 00:26:59,139 report them pseudonymously in our bug 774 00:26:59,140 --> 00:27:01,869 tracker and not necessarily reveal 775 00:27:01,870 --> 00:27:04,059 who they are, so that we can 776 00:27:04,060 --> 00:27:06,069 make it even harder to compromise all of 777 00:27:06,070 --> 00:27:08,139 the extended community of 778 00:27:08,140 --> 00:27:10,479 verifiers, so no one can make a list 779 00:27:10,480 --> 00:27:12,579 of all the machines 780 00:27:12,580 --> 00:27:14,469 that are testing the reproducibility and 781 00:27:14,470 --> 00:27:15,940 correctness of the build. 782 00:27:17,350 --> 00:27:19,749 It was also very important for us not to 783 00:27:19,750 --> 00:27:21,219 have it require dedicated hardware, 784 00:27:21,220 --> 00:27:22,659 especially not everybody. 785 00:27:22,660 --> 00:27:24,429 We don't want people to have to purchase 786 00:27:24,430 --> 00:27:26,679 a Mac and a Windows machine 787 00:27:26,680 --> 00:27:28,059 just to be able to reproduce those 788 00:27:28,060 --> 00:27:29,229 builds. 789 00:27:29,230 --> 00:27:32,199 And we wanted all the build 790 00:27:32,200 --> 00:27:33,819 components and dependencies to at least 791 00:27:33,820 --> 00:27:35,319 be free as in beer, 792 00:27:35,320 --> 00:27:37,539 so at least interested individuals 793 00:27:37,540 --> 00:27:38,679 could download them, even for the 794 00:27:38,680 --> 00:27:40,479 proprietary platforms. 795 00:27:40,480 --> 00:27:42,909 For example, the Mac OS SDK is 796 00:27:42,910 --> 00:27:44,049 free as in beer. 797 00:27:44,050 --> 00:27:45,549 You get binaries, but you don't get 798 00:27:45,550 --> 00:27:47,649 source code. But we're still able to use 799 00:27:47,650 --> 00:27:49,479 that. But the build tools are, of course, 800 00:27:49,480 --> 00:27:50,480 all free as in freedom.
801 00:27:52,600 --> 00:27:54,429 And as far as what those components look 802 00:27:54,430 --> 00:27:57,489 like: on Windows, we cross-compile, 803 00:27:57,490 --> 00:28:00,129 and on Mac, we cross-compile, from 804 00:28:00,130 --> 00:28:01,209 a Linux host. 805 00:28:01,210 --> 00:28:03,489 We use MinGW-w64 806 00:28:03,490 --> 00:28:05,739 for the compiler on Windows, 807 00:28:05,740 --> 00:28:07,839 we use py2exe for the Python 808 00:28:07,840 --> 00:28:09,789 bits and NSIS for the installer. 809 00:28:09,790 --> 00:28:11,319 We have a couple of cross compilers for 810 00:28:11,320 --> 00:28:13,779 Mac that were graciously 811 00:28:13,780 --> 00:28:14,949 provided by Ray. 812 00:28:14,950 --> 00:28:16,359 He's very excellent. 813 00:28:16,360 --> 00:28:18,639 He worked in video gaming, 814 00:28:18,640 --> 00:28:20,139 the video game industry, but is very 815 00:28:20,140 --> 00:28:21,669 interested in making sure he can compile 816 00:28:21,670 --> 00:28:24,789 for iOS and Mac from his Linux machine. 817 00:28:24,790 --> 00:28:26,199 And to produce DMGs, 818 00:28:26,200 --> 00:28:28,989 we use mkisofs and dmg, 819 00:28:28,990 --> 00:28:30,579 and for Linux we use some newer versions 820 00:28:30,580 --> 00:28:31,749 of the toolchain which we compile 821 00:28:31,750 --> 00:28:33,189 ourselves for reasons I'll get into in a 822 00:28:33,190 --> 00:28:34,190 bit. 823 00:28:34,720 --> 00:28:36,099 So what is this Gitian beast? 824 00:28:37,150 --> 00:28:39,189 Basically, as I said, it was developed by 825 00:28:39,190 --> 00:28:40,149 the Bitcoin community. 826 00:28:40,150 --> 00:28:42,099 It's a thin wrapper around the Ubuntu 827 00:28:42,100 --> 00:28:44,169 virtualization tools, and that's 828 00:28:44,170 --> 00:28:46,509 QEMU-KVM and LXC, the 829 00:28:46,510 --> 00:28:47,559 Linux containers.
830 00:28:48,580 --> 00:28:50,709 The compilation stages are 831 00:28:50,710 --> 00:28:52,899 represented in what are called 832 00:28:52,900 --> 00:28:54,969 YAML descriptors, which are individual 833 00:28:54,970 --> 00:28:57,489 files that specify an Ubuntu version 834 00:28:57,490 --> 00:28:59,559 and architecture, a package list to 835 00:28:59,560 --> 00:29:01,839 install on that VM, a list 836 00:29:01,840 --> 00:29:04,199 of Git repos to clone, 837 00:29:04,200 --> 00:29:06,459 any additional input files 838 00:29:06,460 --> 00:29:08,979 that go in, and then 839 00:29:08,980 --> 00:29:11,229 an inline bash script that just gets run 840 00:29:11,230 --> 00:29:12,939 on that virtual machine as your build 841 00:29:12,940 --> 00:29:15,039 script. And then these descriptors 842 00:29:15,040 --> 00:29:17,589 can be chained. So the input files 843 00:29:18,730 --> 00:29:21,039 go into a compilation process 844 00:29:21,040 --> 00:29:23,259 to produce output files, and then those 845 00:29:23,260 --> 00:29:25,359 output files can go in as input files to 846 00:29:25,360 --> 00:29:26,360 the next stage. 847 00:29:27,710 --> 00:29:28,999 So what does this provide? 848 00:29:29,000 --> 00:29:30,619 Obviously, any sort of scripted 849 00:29:30,620 --> 00:29:32,809 virtualization or container 850 00:29:32,810 --> 00:29:35,419 system can help normalize hostname, 851 00:29:35,420 --> 00:29:37,879 username, your build paths, 852 00:29:37,880 --> 00:29:39,050 your tool versions, 853 00:29:40,220 --> 00:29:42,349 and, if it's a fully virtualized 854 00:29:42,350 --> 00:29:44,539 system, your kernel and your uname 855 00:29:44,540 --> 00:29:47,449 output, and then through faketime 856 00:29:47,450 --> 00:29:48,589 it can fix 857 00:29:48,590 --> 00:29:49,789 some timestamp issues. 858 00:29:51,950 --> 00:29:54,019 As I said, this is very 859 00:29:54,020 --> 00:29:55,639 important to us.
It doesn't require 860 00:29:55,640 --> 00:29:56,639 dedicated hardware. 861 00:29:56,640 --> 00:29:59,899 It can be run on any Linux machine 862 00:29:59,900 --> 00:30:01,909 that is capable of running Ubuntu in some 863 00:30:01,910 --> 00:30:03,949 form, and we're extending that to Debian 864 00:30:03,950 --> 00:30:04,950 very soon. 865 00:30:05,750 --> 00:30:08,059 It authenticates your Git-based inputs for 866 00:30:08,060 --> 00:30:11,029 you, and it integrates with the faketime 867 00:30:11,030 --> 00:30:12,949 Linux command-line utility for spoofing 868 00:30:12,950 --> 00:30:13,950 timestamps. 869 00:30:15,540 --> 00:30:17,159 So, some problems that we ran into with 870 00:30:17,160 --> 00:30:20,069 Gitian: to start, it's Ubuntu only. 871 00:30:20,070 --> 00:30:21,509 We've put some work into 872 00:30:21,510 --> 00:30:23,579 making it possible to be 873 00:30:23,580 --> 00:30:25,919 hosted from Debian, so you can launch your 874 00:30:25,920 --> 00:30:27,899 VMs from a Debian system. 875 00:30:27,900 --> 00:30:29,999 But right now they still can 876 00:30:30,000 --> 00:30:31,589 only be Ubuntu guests. 877 00:30:31,590 --> 00:30:32,640 So we're still working on that. 878 00:30:33,720 --> 00:30:35,789 If you're using non-Git inputs (obviously it's 879 00:30:35,790 --> 00:30:38,009 named Gitian after Git), 880 00:30:38,010 --> 00:30:39,389 you have to provide additional 881 00:30:39,390 --> 00:30:42,389 authentication for those packages 882 00:30:42,390 --> 00:30:44,519 and explain how you know what 883 00:30:44,520 --> 00:30:46,589 the input is, how you 884 00:30:46,590 --> 00:30:48,089 identify the right one. 885 00:30:49,290 --> 00:30:50,879 And partial compilation: 886 00:30:50,880 --> 00:30:51,880 it is a little tricky.
887 00:30:52,910 --> 00:30:55,079 Basically, it creates 888 00:30:55,080 --> 00:30:57,329 a base image, and then for every 889 00:30:57,330 --> 00:30:59,279 compilation run that you do, it creates a 890 00:30:59,280 --> 00:31:01,889 QCOW2 copy-on-write 891 00:31:01,890 --> 00:31:03,629 secondary image that gets written to for 892 00:31:03,630 --> 00:31:04,709 any modifications. 893 00:31:04,710 --> 00:31:06,929 And normally that 894 00:31:06,930 --> 00:31:09,149 secondary image is destroyed 895 00:31:09,150 --> 00:31:12,059 for each stage to keep the base clean. 896 00:31:12,060 --> 00:31:14,159 But you can hack around and 897 00:31:14,160 --> 00:31:15,869 sort of play with descriptor stages. 898 00:31:15,870 --> 00:31:17,429 And I believe there's a branch that 899 00:31:17,430 --> 00:31:19,529 allows you to use that secondary 900 00:31:19,530 --> 00:31:21,719 portion to do incremental 901 00:31:21,720 --> 00:31:23,729 builds where, say, you modify a source code 902 00:31:23,730 --> 00:31:26,039 file in Tor, you can then resume 903 00:31:26,040 --> 00:31:28,739 the compilation process of just Tor. 904 00:31:28,740 --> 00:31:30,569 But it is still kind of clunky and 905 00:31:30,570 --> 00:31:31,979 time-consuming to sit there and wait for 906 00:31:31,980 --> 00:31:33,449 these builds to end to see if they 907 00:31:33,450 --> 00:31:34,450 match. 908 00:31:35,080 --> 00:31:37,289 It took me about two months of work 909 00:31:37,290 --> 00:31:38,879 to get Tor Browser building for all 910 00:31:38,880 --> 00:31:41,399 three platforms reproducibly, 911 00:31:41,400 --> 00:31:43,709 and most of that was spent sitting around 912 00:31:43,710 --> 00:31:45,779 waiting, in wall-clock time, for 913 00:31:45,780 --> 00:31:47,009 the build to complete. 914 00:31:47,010 --> 00:31:48,299 Maybe you should have done the sword 915 00:31:48,300 --> 00:31:49,199 fighting. 916 00:31:49,200 --> 00:31:50,200 Oh yeah, I 917 00:31:51,570 --> 00:31:52,509 know. I did a lot of that.
918 00:31:52,510 --> 00:31:53,729 In various forms, don't worry. 919 00:31:57,660 --> 00:31:59,849 So yeah. So this means it's sort of kind 920 00:31:59,850 --> 00:32:02,279 of janky, you 921 00:32:02,280 --> 00:32:04,589 know, sometimes there are 922 00:32:04,590 --> 00:32:07,079 process management issues 923 00:32:07,080 --> 00:32:09,689 with QEMU-KVM: 924 00:32:09,690 --> 00:32:11,879 you can only have one QEMU slave 925 00:32:11,880 --> 00:32:14,039 or LXC slave at a time, and 926 00:32:14,040 --> 00:32:15,389 it doesn't solve everything. 927 00:32:15,390 --> 00:32:17,669 So in the case of Firefox, we 928 00:32:17,670 --> 00:32:19,709 have Python scripts pulling in things 929 00:32:19,710 --> 00:32:21,119 from all over the file system, assembling 930 00:32:21,120 --> 00:32:22,500 them into jars and zips. 931 00:32:23,670 --> 00:32:25,559 You have all sorts of file system 932 00:32:25,560 --> 00:32:27,599 ordering issues. So interestingly, with the 933 00:32:27,600 --> 00:32:29,579 readdir call in POSIX, 934 00:32:30,660 --> 00:32:32,729 the order of your directory listing 935 00:32:32,730 --> 00:32:34,649 is not specified. It basically comes out 936 00:32:34,650 --> 00:32:36,509 in whatever ordering the inodes were 937 00:32:36,510 --> 00:32:38,249 written to on the file system. 938 00:32:38,250 --> 00:32:40,589 So that means, on machines of different 939 00:32:40,590 --> 00:32:41,579 speeds, 940 00:32:41,580 --> 00:32:43,499 if you have a compilation process, 941 00:32:43,500 --> 00:32:45,149 especially running in parallel and things 942 00:32:45,150 --> 00:32:47,129 are happening at the same time, the order 943 00:32:47,130 --> 00:32:48,839 of those files getting written to that 944 00:32:48,840 --> 00:32:50,099 directory can be different.
945 00:32:50,100 --> 00:32:52,289 And then a script or zip or 946 00:32:52,290 --> 00:32:54,389 tar that pulls them all in is going to 947 00:32:54,390 --> 00:32:56,699 end up with ordering issues, 948 00:32:56,700 --> 00:32:57,659 and the binaries are going to be 949 00:32:57,660 --> 00:32:58,660 different. 950 00:32:59,460 --> 00:33:00,839 So the answer to that is, of course, you 951 00:33:00,840 --> 00:33:02,459 sort it, but then you run into the 952 00:33:02,460 --> 00:33:04,079 problem that different locales sort 953 00:33:04,080 --> 00:33:04,989 things differently. 954 00:33:04,990 --> 00:33:06,419 So you have to make sure you set the 955 00:33:06,420 --> 00:33:08,879 locale explicitly to specify 956 00:33:08,880 --> 00:33:09,880 your sorting order. 957 00:33:10,860 --> 00:33:12,749 We ran into uninitialized 958 00:33:12,750 --> 00:33:14,819 memory. So as I 959 00:33:14,820 --> 00:33:16,889 said, there was testing at Cygnus 960 00:33:16,890 --> 00:33:19,229 that made sure that the toolchain 961 00:33:19,230 --> 00:33:21,599 had no reproducibility issues of its own. 962 00:33:21,600 --> 00:33:23,189 But on some forks, like the cross 963 00:33:23,190 --> 00:33:25,379 compilers, these tests weren't being run. 964 00:33:25,380 --> 00:33:27,329 And MinGW-w64 965 00:33:27,330 --> 00:33:29,579 introduced a regression where 966 00:33:29,580 --> 00:33:31,619 a structure field, or the padding in a 967 00:33:31,620 --> 00:33:33,869 structure, wasn't being properly memset 968 00:33:33,870 --> 00:33:36,209 to zero. And so we were getting random 969 00:33:36,210 --> 00:33:38,449 bytes in parts of our Windows binaries, 970 00:33:38,450 --> 00:33:40,379 with a similar problem in the creation of 971 00:33:40,380 --> 00:33:42,429 the DMG on Mac OS. 972 00:33:42,430 --> 00:33:43,979 Some of those random bytes turned 973 00:33:43,980 --> 00:33:44,879 out to be meaningful. 974 00:33:44,880 --> 00:33:46,439 Oh yeah. And we sent it in terrible.
975 00:33:46,440 --> 00:33:47,999 We spent a long time trying to figure 976 00:33:48,000 --> 00:33:50,189 that out. I actually wrote a script, 977 00:33:50,190 --> 00:33:52,199 a little scriptlet, to look for the 978 00:33:52,200 --> 00:33:54,239 surrounding bytes and then just bang out 979 00:33:54,240 --> 00:33:55,559 those bits to, like, twenty-three, 980 00:33:55,560 --> 00:33:56,849 twenty-three, twenty-three. 981 00:33:56,850 --> 00:33:59,549 And people were shocked and horrified 982 00:33:59,550 --> 00:34:01,479 at that, enough for an anonymous contributor 983 00:34:01,480 --> 00:34:03,499 to show me, like, no, here's the problem, 984 00:34:03,500 --> 00:34:04,919 and then say, here's a patch, stop 985 00:34:04,920 --> 00:34:05,999 doing this. 986 00:34:06,000 --> 00:34:07,169 So that was kind of nice. Thank you. 987 00:34:07,170 --> 00:34:08,429 But not now. More of you are 988 00:34:09,630 --> 00:34:11,939 or scruffy round of applause 989 00:34:11,940 --> 00:34:12,940 for scruffy everyone. 990 00:34:14,070 --> 00:34:15,070 Well they pretty much. 991 00:34:19,350 --> 00:34:21,059 That's our anonymous Tor bug reporter 992 00:34:21,060 --> 00:34:23,158 who is possibly a collection 993 00:34:23,159 --> 00:34:25,289 of people that reports all manner of 994 00:34:25,290 --> 00:34:27,479 interesting bugs and patches 995 00:34:27,480 --> 00:34:28,480 as well. 996 00:34:30,330 --> 00:34:31,979 So time zone and umask obviously 997 00:34:31,980 --> 00:34:33,569 can leak, as can deliberately generated 998 00:34:33,570 --> 00:34:34,619 signature entropy. 999 00:34:36,120 --> 00:34:38,369 And LXC and probably Docker containers 1000 00:34:38,370 --> 00:34:39,419 will have this problem as well.
1001 00:34:39,420 --> 00:34:40,559 For people who are experimenting with 1002 00:34:40,560 --> 00:34:42,238 other ways to do this, they won't wrap the 1003 00:34:42,239 --> 00:34:44,638 kernel, where there's a 1004 00:34:44,639 --> 00:34:46,259 build issue with libgmp, where it 1005 00:34:46,260 --> 00:34:48,448 actually inspects your current CPU 1006 00:34:48,449 --> 00:34:49,678 and says, oh, I'll build with these 1007 00:34:49,679 --> 00:34:50,879 optimizations for you. 1008 00:34:50,880 --> 00:34:52,829 You know, we know you have this type of 1009 00:34:52,830 --> 00:34:54,149 CPU. We'll just build with that. 1010 00:34:54,150 --> 00:34:55,468 Unless you say no, build all the 1011 00:34:55,469 --> 00:34:57,839 optimizations. And 1012 00:34:57,840 --> 00:34:59,039 so you get differences on different 1013 00:34:59,040 --> 00:35:00,419 CPUs. 1014 00:35:00,420 --> 00:35:02,519 And he still has memory initialization 1015 00:35:02,520 --> 00:35:03,989 issues that we have not fully tracked 1016 00:35:03,990 --> 00:35:05,370 down yet on some platforms. 1017 00:35:06,530 --> 00:35:08,760 So since we're talking about we 1018 00:35:09,840 --> 00:35:12,119 wanted to also protect our developers 1019 00:35:12,120 --> 00:35:13,889 from targeted input delivery. 1020 00:35:13,890 --> 00:35:15,479 So we want to make sure that nobody could 1021 00:35:15,480 --> 00:35:18,119 compromise 1022 00:35:18,120 --> 00:35:20,159 the way we download inputs and say, oh, 1023 00:35:20,160 --> 00:35:21,929 that looks like the download for 1024 00:35:21,930 --> 00:35:23,939 Gitian, let's feed this malicious source 1025 00:35:23,940 --> 00:35:25,589 code. And now when they run the configure 1026 00:35:25,590 --> 00:35:26,999 script, we can compromise the build 1027 00:35:27,000 --> 00:35:28,109 process that way. 1028 00:35:28,110 --> 00:35:29,399 Or that looks like the download for 1029 00:35:29,400 --> 00:35:30,899 Gitian.
Let's try to compromise that 1030 00:35:30,900 --> 00:35:31,949 person's laptop. 1031 00:35:31,950 --> 00:35:34,049 Yeah. So that we can cause it to say yes, 1032 00:35:34,050 --> 00:35:36,719 everything compiled properly. 1033 00:35:36,720 --> 00:35:38,489 And some projects don't even provide 1034 00:35:38,490 --> 00:35:39,599 signatures for us to do this 1035 00:35:39,600 --> 00:35:40,139 verification. 1036 00:35:40,140 --> 00:35:42,359 So we have to actually hard code 1037 00:35:42,360 --> 00:35:44,269 hashes in the build process. 1038 00:35:44,270 --> 00:35:46,769 So in case their servers, their file servers, 1039 00:35:46,770 --> 00:35:49,139 FTP servers or HTTP servers, 1040 00:35:49,140 --> 00:35:51,689 are man-in-the-middled or compromised, 1041 00:35:51,690 --> 00:35:53,219 that can't be used to affect us. 1042 00:35:53,220 --> 00:35:55,079 Or at least we'll know and see that the 1043 00:35:55,080 --> 00:35:57,319 SHA is different before running 1044 00:35:57,320 --> 00:35:58,589 the configure script. 1045 00:35:58,590 --> 00:36:00,839 And an embarrassing list of projects 1046 00:36:00,840 --> 00:36:02,849 actually have very weak or no signatures. 1047 00:36:02,850 --> 00:36:05,009 OpenSSL for a long time was signing 1048 00:36:05,010 --> 00:36:06,059 with MD5. 1049 00:36:06,060 --> 00:36:07,559 Now they sign with 12 keys. 1050 00:36:07,560 --> 00:36:09,419 One of them is owned by Frodo Baggins of 1051 00:36:09,420 --> 00:36:11,459 the Shire. Or at least that's what the 1052 00:36:11,460 --> 00:36:12,809 email address is. 1053 00:36:12,810 --> 00:36:14,429 There's another name on it, too. The 1054 00:36:14,430 --> 00:36:15,389 software, 1055 00:36:15,390 --> 00:36:16,949 software authentication problem in 1056 00:36:16,950 --> 00:36:18,209 general is quite bad.
1057 00:36:20,490 --> 00:36:22,319 My former colleague from EFF, Chris 1058 00:36:22,320 --> 00:36:24,569 Palmer, wrote a piece about 1059 00:36:24,570 --> 00:36:26,909 trying to authenticate a PuTTY binary. 1060 00:36:28,650 --> 00:36:30,209 PuTTY is kind of important because you 1061 00:36:30,210 --> 00:36:32,399 might use it to log into all your servers, 1062 00:36:32,400 --> 00:36:34,649 and they have an HTTP download 1063 00:36:34,650 --> 00:36:37,019 and they have an 1064 00:36:37,020 --> 00:36:39,659 HTTP-delivered signature 1065 00:36:39,660 --> 00:36:40,949 that's on another domain. 1066 00:36:40,950 --> 00:36:43,709 And yeah, there's 1067 00:36:43,710 --> 00:36:45,269 software authentication in general. 1068 00:36:45,270 --> 00:36:47,249 It's pretty rough out there. 1069 00:36:47,250 --> 00:36:49,199 Yeah, especially in the Windows world, I 1070 00:36:49,200 --> 00:36:50,139 think. 1071 00:36:50,140 --> 00:36:52,649 Well, but luckily 1072 00:36:52,650 --> 00:36:54,029 I think Firefox is probably one of the 1073 00:36:54,030 --> 00:36:55,919 most complicated things in the world to 1074 00:36:55,920 --> 00:36:57,569 try and build this way. 1075 00:36:57,570 --> 00:36:59,369 As a result, our build process is massive 1076 00:36:59,370 --> 00:37:00,809 and scary. 1077 00:37:00,810 --> 00:37:02,459 Most things aren't that complicated, 1078 00:37:02,460 --> 00:37:04,589 especially, it turns out, Android. 1079 00:37:04,590 --> 00:37:06,719 Most of this material was prepared by 1080 00:37:06,720 --> 00:37:09,359 Hans Steiner of the Guardian Project, 1081 00:37:09,360 --> 00:37:11,909 who has been working on 1082 00:37:11,910 --> 00:37:13,679 reproducing Android packages.
1083 00:37:13,680 --> 00:37:15,539 And because they're mostly pure Java and 1084 00:37:15,540 --> 00:37:17,340 the JDK versions are standardized, 1085 00:37:18,360 --> 00:37:21,479 it's actually fairly, 1086 00:37:21,480 --> 00:37:23,399 fairly straightforward to produce a 1087 00:37:23,400 --> 00:37:25,649 reproducible Android package, at least 1088 00:37:25,650 --> 00:37:27,899 to the point where the signatures match. 1089 00:37:27,900 --> 00:37:30,299 So for APKs, I guess Google 1090 00:37:30,300 --> 00:37:32,639 realized that the JAR format, 1091 00:37:32,640 --> 00:37:34,589 based on ZIP, can pull in all sorts of 1092 00:37:34,590 --> 00:37:37,259 metadata and have ordering issues. 1093 00:37:37,260 --> 00:37:39,179 And as a result, the APK actually only 1094 00:37:39,180 --> 00:37:41,249 signs the contents and the manifest 1095 00:37:41,250 --> 00:37:44,429 that describes the package 1096 00:37:44,430 --> 00:37:46,949 and doesn't mess with the container. 1097 00:37:46,950 --> 00:37:48,989 Now the LangSec people in the audience 1098 00:37:48,990 --> 00:37:50,759 probably are all cringing like, oh, now 1099 00:37:50,760 --> 00:37:52,199 there's this discrepancy in 1100 00:37:52,200 --> 00:37:54,269 verification. Can exploits be 1101 00:37:54,270 --> 00:37:56,069 introduced by what was happening at 1102 00:37:56,070 --> 00:37:58,049 the zip layer versus what's happening at 1103 00:37:58,050 --> 00:37:59,669 the signature layer? 1104 00:37:59,670 --> 00:38:02,189 I think the answer is probably yes.
1105 00:38:02,190 --> 00:38:03,269 In fact, I believe there was a very 1106 00:38:03,270 --> 00:38:05,099 good talk last year on this possibility 1107 00:38:05,100 --> 00:38:06,749 where they demonstrated that those 1108 00:38:06,750 --> 00:38:08,249 sorts of discrepancies can introduce 1109 00:38:08,250 --> 00:38:10,379 Turing complete machines 1110 00:38:10,380 --> 00:38:12,479 that you can use to actually execute 1111 00:38:12,480 --> 00:38:13,480 what you want, 1112 00:38:14,730 --> 00:38:16,619 based on just the discrepancies 1113 00:38:16,620 --> 00:38:18,539 between the grammars of different 1114 00:38:18,540 --> 00:38:19,540 parsers. 1115 00:38:20,340 --> 00:38:21,649 That depends on how the parser is 1116 00:38:21,650 --> 00:38:22,839 used. Right. 1117 00:38:22,840 --> 00:38:25,019 Depends on the nature of how you're using 1118 00:38:25,020 --> 00:38:25,199 that. 1119 00:38:25,200 --> 00:38:27,869 But so 1120 00:38:27,870 --> 00:38:30,659 nonetheless, F-Droid is 1121 00:38:30,660 --> 00:38:33,569 moving towards using this property to 1122 00:38:33,570 --> 00:38:36,569 allow developers to build their packages 1123 00:38:36,570 --> 00:38:38,909 locally, submit the binaries 1124 00:38:38,910 --> 00:38:41,459 with their local key signature 1125 00:38:41,460 --> 00:38:43,409 to F-Droid, as well as the source 1126 00:38:43,410 --> 00:38:44,369 code URL. 1127 00:38:44,370 --> 00:38:47,009 And then the F-Droid verifier can 1128 00:38:47,010 --> 00:38:49,439 recompile the source code, run verify, 1129 00:38:49,440 --> 00:38:51,149 which takes the signature from the 1130 00:38:51,150 --> 00:38:53,339 developer's binary, attaches it to 1131 00:38:53,340 --> 00:38:55,349 the F-Droid-built binary and sees if 1132 00:38:55,350 --> 00:38:57,839 it verifies still, and if so, 1133 00:38:57,840 --> 00:38:59,969 then it can.
The verified binaries can 1134 00:38:59,970 --> 00:39:01,649 be published with the developer's 1135 00:39:01,650 --> 00:39:02,549 signature. 1136 00:39:02,550 --> 00:39:05,039 So now you solve the problem with F-Droid 1137 00:39:05,040 --> 00:39:07,229 that currently exists, where you're 1138 00:39:07,230 --> 00:39:08,999 implicitly trusting it. Whoever 1139 00:39:09,000 --> 00:39:10,859 uses F-Droid is implicitly trusting 1140 00:39:10,860 --> 00:39:13,649 F-Droid and their ability to maintain 1141 00:39:13,650 --> 00:39:15,689 control over their key. 1142 00:39:15,690 --> 00:39:17,129 There's a single F-Droid signing key 1143 00:39:17,130 --> 00:39:19,079 currently. But this reproducibility 1144 00:39:19,080 --> 00:39:21,509 system forces an attacker to compromise 1145 00:39:21,510 --> 00:39:23,969 both the F-Droid build system and 1146 00:39:23,970 --> 00:39:26,369 the developer's laptop. 1147 00:39:26,370 --> 00:39:28,019 So let's just think concretely again for 1148 00:39:28,020 --> 00:39:29,309 a minute about what the world would be 1149 00:39:29,310 --> 00:39:31,379 like without this property, or 1150 00:39:31,380 --> 00:39:32,939 what the world has been like without this 1151 00:39:32,940 --> 00:39:34,139 property. 1152 00:39:34,140 --> 00:39:35,879 So without this property, you have all 1153 00:39:35,880 --> 00:39:37,439 these developers who have their laptops 1154 00:39:37,440 --> 00:39:39,629 and they're making these APKs and 1155 00:39:39,630 --> 00:39:41,729 then they're uploading them into the 1156 00:39:41,730 --> 00:39:43,619 system. And then people are saying, 1157 00:39:43,620 --> 00:39:45,299 great, it's open source. 1158 00:39:45,300 --> 00:39:47,429 So everything's fine and we'll just 1159 00:39:47,430 --> 00:39:49,130 give the APKs to everyone in the world.
1160 00:39:50,850 --> 00:39:52,949 And that also, just to think 1161 00:39:52,950 --> 00:39:55,139 further concretely about something that 1162 00:39:55,140 --> 00:39:57,449 Mike is about to come to, that's 1163 00:39:57,450 --> 00:39:59,759 the way that Debian worked until just 1164 00:39:59,760 --> 00:40:02,039 a couple of years ago, that each 1165 00:40:02,040 --> 00:40:03,569 individual Debian developer had the 1166 00:40:03,570 --> 00:40:05,729 ability to make a binary package upload 1167 00:40:05,730 --> 00:40:07,199 that was created on that individual 1168 00:40:07,200 --> 00:40:08,369 developer's laptop. 1169 00:40:08,370 --> 00:40:10,949 And what you had was that developer's word 1170 00:40:10,950 --> 00:40:12,659 that that binary package matched the 1171 00:40:12,660 --> 00:40:14,579 source package, and you didn't have a 1172 00:40:14,580 --> 00:40:15,929 technical mechanism that was under the 1173 00:40:15,930 --> 00:40:17,729 control of the project. 1174 00:40:17,730 --> 00:40:20,159 You just had uploads of binary packages 1175 00:40:20,160 --> 00:40:21,959 by thousands of individual Debian 1176 00:40:21,960 --> 00:40:23,219 developers that they made on their own 1177 00:40:23,220 --> 00:40:24,149 laptops. 1178 00:40:24,150 --> 00:40:26,339 So having this for any kind 1179 00:40:26,340 --> 00:40:28,409 of package repository is really 1180 00:40:28,410 --> 00:40:30,569 a major shift in the kind of assurance 1181 00:40:30,570 --> 00:40:32,579 that end users can get about where their 1182 00:40:32,580 --> 00:40:34,379 binaries are coming from and how their 1183 00:40:34,380 --> 00:40:36,239 binaries were produced. It's really quite 1184 00:40:36,240 --> 00:40:37,240 meaningful.
1185 00:40:38,820 --> 00:40:41,159 So the effort was, I believe, 1186 00:40:41,160 --> 00:40:43,349 spearheaded by Lunar and 1187 00:40:43,350 --> 00:40:45,899 Holger, and 1188 00:40:45,900 --> 00:40:48,119 there's so many 1189 00:40:48,120 --> 00:40:49,979 different kinds of software, obviously, 1190 00:40:49,980 --> 00:40:51,869 packages in the universe, 1191 00:40:52,890 --> 00:40:55,169 that they had to patch several aspects 1192 00:40:55,170 --> 00:40:57,119 of their build system and supporting 1193 00:40:57,120 --> 00:40:58,319 utilities in order to make sure 1194 00:40:58,320 --> 00:40:59,579 everything was reproducible. 1195 00:41:00,810 --> 00:41:03,079 I mean, this list here: debhelper, 1196 00:41:03,080 --> 00:41:05,339 cdbs, dpkg, 1197 00:41:05,340 --> 00:41:08,309 the Python tools, Java tools. 1198 00:41:08,310 --> 00:41:09,869 Even Octave apparently had to be 1199 00:41:09,870 --> 00:41:11,939 patched and is involved in some packages' 1200 00:41:11,940 --> 00:41:13,109 build system. 1201 00:41:13,110 --> 00:41:14,579 They have a wiki page that describes all 1202 00:41:14,580 --> 00:41:16,679 these problems 1203 00:41:16,680 --> 00:41:19,229 in detail and is very informative 1204 00:41:19,230 --> 00:41:21,389 and invites anyone to 1205 00:41:21,390 --> 00:41:22,949 help out with that effort as well. 1206 00:41:22,950 --> 00:41:24,179 And as a result of this, I believe they 1207 00:41:24,180 --> 00:41:26,429 have somewhere around 60 percent or two 1208 00:41:26,430 --> 00:41:28,499 thirds of their packages currently 1209 00:41:28,500 --> 00:41:29,820 being built reproducibly.
1210 00:41:30,900 --> 00:41:32,099 And they have a cool little 1211 00:41:32,100 --> 00:41:34,499 Jenkins graph that shows that it slowly 1212 00:41:34,500 --> 00:41:36,689 has been going up and to the right for 1213 00:41:36,690 --> 00:41:38,919 the past six or nine months, 1214 00:41:38,920 --> 00:41:40,619 since people started taking this 1215 00:41:40,620 --> 00:41:42,149 effort seriously and all the individual 1216 00:41:42,150 --> 00:41:44,279 package maintainers have been working 1217 00:41:44,280 --> 00:41:46,319 to make their packages reproducible. 1218 00:41:46,320 --> 00:41:47,639 And that really protects Debian 1219 00:41:47,640 --> 00:41:49,679 developers in much the way that you 1220 00:41:49,680 --> 00:41:51,569 described that the reproducibility 1221 00:41:51,570 --> 00:41:54,299 protects Tor developers, and produces 1222 00:41:54,300 --> 00:41:57,059 a much diminished incentive to 1223 00:41:57,060 --> 00:41:59,219 compromise the developer's laptop 1224 00:41:59,220 --> 00:42:01,379 or to try to coerce a developer to 1225 00:42:01,380 --> 00:42:02,819 do something improper in a software 1226 00:42:02,820 --> 00:42:04,679 release, at least at the binary level. 1227 00:42:04,680 --> 00:42:06,299 Yeah, unfortunately, we have a situation 1228 00:42:06,300 --> 00:42:07,649 where you could find any one of a 1229 00:42:07,650 --> 00:42:09,569 thousand people who just left their 1230 00:42:09,570 --> 00:42:11,789 laptop at a hacker conference laying 1231 00:42:11,790 --> 00:42:13,139 on a table one day, wasn't paying 1232 00:42:13,140 --> 00:42:15,269 attention and allowed you to stick a USB 1233 00:42:15,270 --> 00:42:16,229 key or something in. 1234 00:42:16,230 --> 00:42:17,699 And then you could compromise the entire 1235 00:42:17,700 --> 00:42:19,529 Debian population just from one of those 1236 00:42:19,530 --> 00:42:21,119 thousand people or so that were building 1237 00:42:21,120 --> 00:42:22,379 packages.
1238 00:42:22,380 --> 00:42:24,419 And now that's no longer possible. 1239 00:42:24,420 --> 00:42:26,039 And it's getting to the case where you 1240 00:42:26,040 --> 00:42:27,679 won't even be able to compromise 1241 00:42:27,680 --> 00:42:30,689 their central build system anymore, 1242 00:42:30,690 --> 00:42:32,039 or at least that it would be detected, 1243 00:42:32,040 --> 00:42:32,969 right? 1244 00:42:32,970 --> 00:42:35,609 Yes. In a way that you can send packages. 1245 00:42:35,610 --> 00:42:36,929 Now, they took a slightly different 1246 00:42:36,930 --> 00:42:38,969 approach to expedite this process. 1247 00:42:38,970 --> 00:42:41,069 They strip out differences 1248 00:42:41,070 --> 00:42:42,899 that they found common to many packages. 1249 00:42:42,900 --> 00:42:45,149 There's actually a strip-nondeterminism 1250 00:42:45,150 --> 00:42:48,029 helper that removes 1251 00:42:48,030 --> 00:42:50,429 time stamps and handles common 1252 00:42:50,430 --> 00:42:52,229 issues of file formats and normalizes 1253 00:42:52,230 --> 00:42:54,539 these things and basically 1254 00:42:54,540 --> 00:42:55,540 alters those. 1255 00:42:56,700 --> 00:42:58,769 And its use, I 1256 00:42:58,770 --> 00:43:00,749 think, is a large part of what got Debian 1257 00:43:00,750 --> 00:43:02,879 so far in terms of all of their packages 1258 00:43:02,880 --> 00:43:03,880 being reproduced. 1259 00:43:04,710 --> 00:43:07,979 But because it modifies the binary, 1260 00:43:07,980 --> 00:43:09,929 it itself can be a target for 1261 00:43:09,930 --> 00:43:11,399 compromise, another single point of 1262 00:43:11,400 --> 00:43:12,599 failure.
1263 00:43:12,600 --> 00:43:14,669 And you introduce 1264 00:43:14,670 --> 00:43:16,230 this trusting trust 1265 00:43:17,370 --> 00:43:19,469 attack where if you can 1266 00:43:19,470 --> 00:43:22,049 compromise the things that build 1267 00:43:22,050 --> 00:43:24,449 determinism, that can be used to propagate 1268 00:43:24,450 --> 00:43:26,409 a back door, for those who are familiar. 1269 00:43:26,410 --> 00:43:29,069 The trusting trust attack was first 1270 00:43:29,070 --> 00:43:31,260 described by Ken Thompson in the 80s 1271 00:43:32,790 --> 00:43:35,039 in his famous paper 1272 00:43:35,040 --> 00:43:36,539 Reflections on Trusting Trust. 1273 00:43:36,540 --> 00:43:39,029 What he did was take the C compiler 1274 00:43:39,030 --> 00:43:41,429 and write a backdoor in source code 1275 00:43:41,430 --> 00:43:43,529 that went 1276 00:43:43,530 --> 00:43:45,179 into a binary version that went out in 1277 00:43:45,180 --> 00:43:47,339 one release, and then removed that 1278 00:43:47,340 --> 00:43:49,409 source code from the compiler source 1279 00:43:49,410 --> 00:43:51,719 code. But the back door was of such a nature 1280 00:43:51,720 --> 00:43:54,059 that, like what Seth demonstrated, 1281 00:43:54,060 --> 00:43:56,729 it could insert itself 1282 00:43:56,730 --> 00:43:58,919 into the compiler source code. 1283 00:43:58,920 --> 00:44:00,959 So it's a self-propagating vector that's 1284 00:44:00,960 --> 00:44:02,549 only visible in the binaries and never 1285 00:44:02,550 --> 00:44:04,349 visible in the source, but it can 1286 00:44:04,350 --> 00:44:05,729 propagate itself through multiple 1287 00:44:05,730 --> 00:44:07,679 generations of the compiler. 1288 00:44:09,600 --> 00:44:11,939 And that's still 1289 00:44:11,940 --> 00:44:14,069 a considerable concern about the 1290 00:44:14,070 --> 00:44:15,269 integrity of our infrastructure.
1291 00:44:16,410 --> 00:44:17,519 And there's research, 1292 00:44:18,540 --> 00:44:20,669 which you're about to mention, 1293 00:44:20,670 --> 00:44:23,009 by David Wheeler, describing one 1294 00:44:23,010 --> 00:44:23,969 means of addressing this. 1295 00:44:23,970 --> 00:44:26,309 And we actually found that the same 1296 00:44:26,310 --> 00:44:28,859 attack was actually described 1297 00:44:28,860 --> 00:44:30,719 by Air Force researchers in nineteen 1298 00:44:30,720 --> 00:44:32,819 seventy three, and 1299 00:44:32,820 --> 00:44:34,799 that they said, oh, someone could make a 1300 00:44:34,800 --> 00:44:37,439 self-propagating compromise 1301 00:44:37,440 --> 00:44:39,779 in the software development toolchain 1302 00:44:39,780 --> 00:44:41,099 and then we wouldn't notice. 1303 00:44:41,100 --> 00:44:42,539 And then all of the things that we built 1304 00:44:42,540 --> 00:44:44,519 could be compromised. 1305 00:44:44,520 --> 00:44:46,049 So this is a concern that people 1306 00:44:46,050 --> 00:44:47,609 identified back in nineteen seventy 1307 00:44:47,610 --> 00:44:49,649 three, and the Air Force researchers 1308 00:44:49,650 --> 00:44:51,059 suggested that this would be a powerful 1309 00:44:51,060 --> 00:44:52,649 tool for espionage that someone might be 1310 00:44:52,650 --> 00:44:53,650 tempted to do. 1311 00:44:54,540 --> 00:44:55,949 And I think that's still the case. 1312 00:44:55,950 --> 00:44:58,019 Right. So David Wheeler actually 1313 00:44:58,020 --> 00:45:00,239 found that what you can do is you take two 1314 00:45:00,240 --> 00:45:02,039 independently developed compilers, and you 1315 00:45:02,040 --> 00:45:03,599 only need the binaries for this. 1316 00:45:03,600 --> 00:45:04,889 One of them can be 1317 00:45:04,890 --> 00:45:06,419 a proprietary compiler.
1318 00:45:06,420 --> 00:45:08,579 And then you use those two compilers 1319 00:45:08,580 --> 00:45:10,049 to compile the source code of an open 1320 00:45:10,050 --> 00:45:12,119 source compiler and you produce two more 1321 00:45:12,120 --> 00:45:13,109 binaries. 1322 00:45:13,110 --> 00:45:15,239 Now, you have two binaries, one of 1323 00:45:15,240 --> 00:45:16,799 which may or may not have a back 1324 00:45:16,800 --> 00:45:19,499 door. And now you use those two compilers 1325 00:45:19,500 --> 00:45:20,849 that were compiled from the same source 1326 00:45:20,850 --> 00:45:22,709 code, but with different compilers, to 1327 00:45:22,710 --> 00:45:24,839 compile that same source code again 1328 00:45:24,840 --> 00:45:26,969 to produce two more binaries. 1329 00:45:26,970 --> 00:45:28,709 Now, those two binaries should be 1330 00:45:28,710 --> 00:45:30,839 identical if there is no back door. 1331 00:45:30,840 --> 00:45:32,939 If they are identical, either 1332 00:45:32,940 --> 00:45:34,259 there is the same back door in both of 1333 00:45:34,260 --> 00:45:36,119 the binary compilers or there is no 1334 00:45:36,120 --> 00:45:36,959 backdoor. 1335 00:45:36,960 --> 00:45:39,119 So, David... This is really great, if you 1336 00:45:39,120 --> 00:45:40,439 like saying the word compiler. 1337 00:45:40,440 --> 00:45:42,689 Yeah, so Bruce Schneier has a very 1338 00:45:42,690 --> 00:45:44,729 good description of this that describes 1339 00:45:44,730 --> 00:45:46,229 all the security properties. 1340 00:45:46,230 --> 00:45:48,569 But the really important property 1341 00:45:48,570 --> 00:45:50,969 is that now that Debian is working on 1342 00:45:50,970 --> 00:45:52,320 reproducible builds... 1343 00:45:53,340 --> 00:45:55,439 It turns out that, as demonstrated, 1344 00:45:55,440 --> 00:45:57,629 this sort of back door 1345 00:45:57,630 --> 00:45:59,099 can propagate.
It can propagate 1346 00:45:59,100 --> 00:46:01,079 between the kernel and the compiler or 1347 00:46:01,080 --> 00:46:03,179 the compiler and usually 1348 00:46:03,180 --> 00:46:04,949 the tools or the linker or anything on the 1349 00:46:04,950 --> 00:46:07,019 system. And David, 1350 00:46:07,020 --> 00:46:09,539 Dr. Wheeler, only proved that 1351 00:46:09,540 --> 00:46:11,639 his two compilers didn't have back doors. 1352 00:46:11,640 --> 00:46:13,469 He did not rebuild an entire build 1353 00:46:13,470 --> 00:46:14,249 environment. 1354 00:46:14,250 --> 00:46:15,779 So we also have to trust the operating 1355 00:46:15,780 --> 00:46:16,109 system. 1356 00:46:16,110 --> 00:46:18,239 Right. And he trusted the system 1357 00:46:18,240 --> 00:46:19,739 in that experiment and concluded there was 1358 00:46:19,740 --> 00:46:21,959 no back door in GCC, but 1359 00:46:21,960 --> 00:46:23,789 we still don't know if there might be a 1360 00:46:23,790 --> 00:46:26,009 back door that is capable of propagating 1361 00:46:26,010 --> 00:46:28,529 between the kernel and the compiler. 1362 00:46:28,530 --> 00:46:30,869 But with Debian reproducibility, 1363 00:46:30,870 --> 00:46:33,629 what we can do now is, if 1364 00:46:33,630 --> 00:46:35,309 we have a base system that can be cross 1365 00:46:35,310 --> 00:46:38,039 compiled to all these other 1366 00:46:38,040 --> 00:46:39,239 architectures and it's still 1367 00:46:39,240 --> 00:46:41,909 reproducible, you can take the ARM build, 1368 00:46:41,910 --> 00:46:44,249 the PowerPC build, the MIPS build, the Intel 1369 00:46:44,250 --> 00:46:46,439 build, and you can take the Hurd 1370 00:46:46,440 --> 00:46:49,259 kernel, the kFreeBSD kernel, 1371 00:46:49,260 --> 00:46:51,029 and you can have them all cross 1372 00:46:51,030 --> 00:46:53,339 compiling and verifying each other.
1373 00:46:53,340 --> 00:46:55,499 And now you force the adversary to 1374 00:46:55,500 --> 00:46:57,419 be very forward looking, in a way that 1375 00:46:57,420 --> 00:46:59,369 they would have to have anticipated: 1376 00:46:59,370 --> 00:47:01,409 oh, we need to have a 1377 00:47:01,410 --> 00:47:03,539 self-propagating back door that has so 1378 00:47:03,540 --> 00:47:05,639 many copies of itself that it's 1379 00:47:05,640 --> 00:47:07,889 able to infect every architecture, using 1380 00:47:07,890 --> 00:47:10,079 every tool, and survives by 1381 00:47:10,080 --> 00:47:11,669 infecting every architecture in a 1382 00:47:11,670 --> 00:47:12,899 compatible way. 1383 00:47:12,900 --> 00:47:14,339 And it turns out you can also throw in 1384 00:47:14,340 --> 00:47:16,379 proprietary compilers using Dr. Wheeler's 1385 00:47:16,380 --> 00:47:18,689 technique and then force the adversary 1386 00:47:18,690 --> 00:47:20,279 to have to compromise those as well. 1387 00:47:20,280 --> 00:47:21,809 So we're getting very close to being able 1388 00:47:21,810 --> 00:47:23,969 to actually do this kind of assurance for 1389 00:47:23,970 --> 00:47:25,599 essentially the first time ever. 1390 00:47:25,600 --> 00:47:27,659 Yes. So we're almost about to be able 1391 00:47:27,660 --> 00:47:30,119 to prove the integrity of software 1392 00:47:30,120 --> 00:47:32,399 to the level of the hardware, which 1393 00:47:32,400 --> 00:47:33,400 then. 1394 00:47:42,550 --> 00:47:44,589 Which, as you know, is then a whole new 1395 00:47:44,590 --> 00:47:46,119 set of turtles all the way down, but 1396 00:47:47,470 --> 00:47:48,369 we'll get to that, too. 1397 00:47:48,370 --> 00:47:50,289 I think Bunnie Huang is still working on 1398 00:47:50,290 --> 00:47:51,290 that one. 1399 00:47:52,050 --> 00:47:53,379 Do you want to talk about. 1400 00:47:53,380 --> 00:47:55,329 Oh, yes.
So then if you want to talk 1401 00:47:55,330 --> 00:47:56,409 about the software distribution problem, 1402 00:47:56,410 --> 00:47:57,259 I think you. 1403 00:47:57,260 --> 00:47:58,779 Oh, well, I think this could be a whole 1404 00:47:58,780 --> 00:47:59,889 other talk. But I mean, 1405 00:48:01,000 --> 00:48:02,919 there's also this problem about when you 1406 00:48:02,920 --> 00:48:05,499 distribute software updates to people, 1407 00:48:05,500 --> 00:48:06,939 how do you know that everyone is getting 1408 00:48:06,940 --> 00:48:07,940 the same update? 1409 00:48:09,280 --> 00:48:11,799 And I think that's a pretty significant 1410 00:48:11,800 --> 00:48:13,989 problem in its own right, because this is 1411 00:48:13,990 --> 00:48:16,179 talking about how people 1412 00:48:16,180 --> 00:48:18,429 who check can verify that 1413 00:48:18,430 --> 00:48:20,289 binaries were produced from a particular 1414 00:48:20,290 --> 00:48:21,489 source code version. 1415 00:48:21,490 --> 00:48:23,139 But we also have this other family of 1416 00:48:23,140 --> 00:48:24,429 problems about, well, is everyone 1417 00:48:24,430 --> 00:48:26,499 actually getting the same update, or 1418 00:48:26,500 --> 00:48:28,629 could, for example, the Tor Project put 1419 00:48:28,630 --> 00:48:31,299 out one version of the browser 1420 00:48:31,300 --> 00:48:33,069 and another version of the browser and 1421 00:48:33,070 --> 00:48:34,779 have a malicious back door in one of them, 1422 00:48:34,780 --> 00:48:37,029 and give that to a small subset of users, 1423 00:48:37,030 --> 00:48:38,469 perhaps in a reproducible way, 1424 00:48:39,640 --> 00:48:40,929 but still in a way that would compromise 1425 00:48:40,930 --> 00:48:43,030 those users.
So I think there's actually, 1426 00:48:44,050 --> 00:48:46,299 for the concept of software transparency 1427 00:48:46,300 --> 00:48:47,859 and software development transparency 1428 00:48:47,860 --> 00:48:49,749 generally, there are actually a few other 1429 00:48:49,750 --> 00:48:51,999 problems on top of this, not just making 1430 00:48:52,000 --> 00:48:53,769 sure that source code and binaries match, 1431 00:48:53,770 --> 00:48:56,139 but actually making sure that everyone 1432 00:48:56,140 --> 00:48:57,109 gets the same thing. 1433 00:48:57,110 --> 00:48:58,569 Now, the core problem here, in a 1434 00:48:58,570 --> 00:49:00,429 theoretical sense, is one of distributed 1435 00:49:00,430 --> 00:49:01,899 consensus. 1436 00:49:01,900 --> 00:49:03,579 Everybody has a potentially different 1437 00:49:03,580 --> 00:49:05,679 view of the Internet, and we're 1438 00:49:05,680 --> 00:49:06,969 seeing this more and more as there are more 1439 00:49:06,970 --> 00:49:09,099 censorship regimes and interception 1440 00:49:09,100 --> 00:49:10,149 hardware. 1441 00:49:10,150 --> 00:49:12,369 Just because somebody else can download 1442 00:49:12,370 --> 00:49:13,779 the canonical Tor Browser from one 1443 00:49:13,780 --> 00:49:16,029 location doesn't mean that 1444 00:49:16,030 --> 00:49:17,859 somebody in China, when they try to 1445 00:49:17,860 --> 00:49:18,969 download it, is going to necessarily 1446 00:49:18,970 --> 00:49:20,559 get the same packages. 1447 00:49:21,820 --> 00:49:24,039 So in order to try and 1448 00:49:24,040 --> 00:49:26,109 ensure this, one of the things that we're 1449 00:49:26,110 --> 00:49:28,209 planning on doing is having our 1450 00:49:28,210 --> 00:49:30,579 updates authenticated, in addition 1451 00:49:30,580 --> 00:49:33,099 to signatures, by the Tor consensus.
1452 00:49:33,100 --> 00:49:35,409 So there'll be a new URL with a hash 1453 00:49:35,410 --> 00:49:37,629 that itself contains the hashes 1454 00:49:37,630 --> 00:49:39,399 of all of the update files for the 1455 00:49:39,400 --> 00:49:40,400 browser. 1456 00:49:40,930 --> 00:49:43,029 And we can further strengthen that 1457 00:49:43,030 --> 00:49:45,189 by either storing records of the Tor 1458 00:49:45,190 --> 00:49:47,649 consensus in the Bitcoin block chain or 1459 00:49:47,650 --> 00:49:49,989 using the Bitcoin block chain directly, or 1460 00:49:49,990 --> 00:49:51,159 using something like certificate 1461 00:49:51,160 --> 00:49:53,769 transparency to record the Tor consensus 1462 00:49:53,770 --> 00:49:55,270 or the package archives. 1463 00:49:58,390 --> 00:50:00,609 And further ensure that everybody 1464 00:50:00,610 --> 00:50:02,529 sees essentially the same consensus view 1465 00:50:02,530 --> 00:50:04,329 of not just the Tor network, but also 1466 00:50:04,330 --> 00:50:05,330 our packages. 1467 00:50:06,680 --> 00:50:08,419 So we have some links here that are also 1468 00:50:08,420 --> 00:50:10,279 in the slides we've just uploaded about 1469 00:50:10,280 --> 00:50:11,389 an hour before the talk 1470 00:50:12,440 --> 00:50:14,779 to the events 1471 00:50:14,780 --> 00:50:16,069 page. 1472 00:50:16,070 --> 00:50:18,259 So if you're interested in working on 1473 00:50:18,260 --> 00:50:20,419 this problem for your own software 1474 00:50:20,420 --> 00:50:22,009 project, there are some useful things. 1475 00:50:22,010 --> 00:50:24,199 The Tor Browser design doc has 1476 00:50:24,200 --> 00:50:25,249 a section on build security that 1477 00:50:25,250 --> 00:50:27,349 describes Gitian. And again, the 1478 00:50:27,350 --> 00:50:29,269 F-Droid verification server. 1479 00:50:29,270 --> 00:50:30,649 Again, that's not complete yet.
1480 00:50:30,650 --> 00:50:31,819 So if you're interested in Android 1481 00:50:31,820 --> 00:50:33,049 development, I'm sure they'd love some 1482 00:50:33,050 --> 00:50:35,209 help making sure that all 1483 00:50:35,210 --> 00:50:37,259 the packages are ready to use the 1484 00:50:37,260 --> 00:50:38,599 system. 1485 00:50:38,600 --> 00:50:40,489 And Debian, I'm sure, would love your help 1486 00:50:40,490 --> 00:50:42,289 with any packages, making them 1487 00:50:42,290 --> 00:50:43,290 reproducible. 1488 00:50:43,970 --> 00:50:45,589 And then if you would like more 1489 00:50:45,590 --> 00:50:47,779 information on diverse 1490 00:50:47,780 --> 00:50:49,579 double compilation and how we can get to 1491 00:50:49,580 --> 00:50:51,729 true software 1492 00:50:51,730 --> 00:50:53,749 or binary integrity all the way down 1493 00:50:53,750 --> 00:50:55,579 to the hardware, these two articles are 1494 00:50:55,580 --> 00:50:57,499 excellent. Bruce Schneier gives a very succinct 1495 00:50:58,730 --> 00:51:00,859 description of Dr. Wheeler's 1496 00:51:00,860 --> 00:51:03,019 double compilation, and the LWN 1497 00:51:03,020 --> 00:51:05,089 article actually points out, oh, well, 1498 00:51:05,090 --> 00:51:06,939 you need to do not just the compiler, but 1499 00:51:06,940 --> 00:51:08,509 the whole operating 1500 00:51:08,510 --> 00:51:09,769 system. 1501 00:51:09,770 --> 00:51:12,439 I would also recommend: Mike wrote 1502 00:51:12,440 --> 00:51:14,869 some very readable and 1503 00:51:14,870 --> 00:51:17,029 very useful blog posts on the Tor 1504 00:51:17,030 --> 00:51:19,189 blog about the motivation 1505 00:51:19,190 --> 00:51:21,379 for all of this and about 1506 00:51:21,380 --> 00:51:23,959 how the project succeeded in doing it.
1507 00:51:23,960 --> 00:51:25,399 It's much the same material that you've 1508 00:51:25,400 --> 00:51:26,749 just heard, but if you want to see it in 1509 00:51:26,750 --> 00:51:28,909 written format, there 1510 00:51:28,910 --> 00:51:30,589 are a couple of great posts on the blog. 1511 00:51:30,590 --> 00:51:32,899 And again, Thomas Dullien's piece, 1512 00:51:32,900 --> 00:51:35,329 Offensive Work and Addiction, talking 1513 00:51:35,330 --> 00:51:37,039 about the incentive that an attacker 1514 00:51:37,040 --> 00:51:39,319 would have to 1515 00:51:39,320 --> 00:51:41,269 compromise infrastructure as a means of 1516 00:51:41,270 --> 00:51:43,009 compromising infrastructure, as a means 1517 00:51:43,010 --> 00:51:44,869 of compromising infrastructure, and the 1518 00:51:44,870 --> 00:51:46,909 sort of addictiveness of, hey, I can get 1519 00:51:46,910 --> 00:51:49,189 more and more power over information 1520 00:51:49,190 --> 00:51:51,259 technology as a whole because of 1521 00:51:51,260 --> 00:51:53,239 the trust relationships between software 1522 00:51:53,240 --> 00:51:54,240 projects. 1523 00:51:57,190 --> 00:51:59,199 That's our contact info if you want to 1524 00:51:59,200 --> 00:52:01,809 email or contact us. 1525 00:52:01,810 --> 00:52:02,919 Anyway, I guess we can open up for 1526 00:52:02,920 --> 00:52:04,989 questions, 1527 00:52:04,990 --> 00:52:06,579 if you want to line up at the microphones 1528 00:52:06,580 --> 00:52:07,989 here. If anybody has any questions for 1529 00:52:07,990 --> 00:52:09,850 us, we have about five minutes. 1530 00:52:18,980 --> 00:52:21,019 First, questions from the Internet. 1531 00:52:22,250 --> 00:52:24,139 Can you approach one microphone? 1532 00:52:26,260 --> 00:52:28,429 And if you leave,
1533 00:52:28,430 --> 00:52:30,529 please take all your trash with 1534 00:52:30,530 --> 00:52:32,539 you and also take other people's trash 1535 00:52:32,540 --> 00:52:35,089 with you if you see it with your own eyes, and 1536 00:52:36,320 --> 00:52:37,789 approach one of the microphones if you 1537 00:52:37,790 --> 00:52:38,790 have questions. 1538 00:52:47,950 --> 00:52:50,009 So, um, 1539 00:52:50,010 --> 00:52:52,139 I have the first question: 1540 00:52:52,140 --> 00:52:54,389 if a developer's computer is compromised 1541 00:52:54,390 --> 00:52:55,390 and 1542 00:52:57,810 --> 00:52:59,899 sticks and 1543 00:52:59,900 --> 00:53:02,099 an adversary 1544 00:53:02,100 --> 00:53:04,469 could change a small part of 1545 00:53:04,470 --> 00:53:07,649 the code to make the code vulnerable 1546 00:53:07,650 --> 00:53:08,589 to attack. 1547 00:53:08,590 --> 00:53:10,439 In this case, the code could be 1548 00:53:10,440 --> 00:53:12,539 compromised from the very beginning. 1549 00:53:12,540 --> 00:53:14,679 Why would reproducible builds 1550 00:53:14,680 --> 00:53:16,769 help to see this manipulation, as the 1551 00:53:16,770 --> 00:53:18,899 computer would have the compromised code? 1552 00:53:19,970 --> 00:53:21,829 I guess that's referring to source code. 1553 00:53:21,830 --> 00:53:23,229 I think so, yeah. 1554 00:53:25,520 --> 00:53:26,629 You want to try to address that?
1555 00:53:26,630 --> 00:53:28,609 I mean, I think one of the problems that 1556 00:53:28,610 --> 00:53:30,679 we really do have in 1557 00:53:30,680 --> 00:53:32,719 terms of people introducing 1558 00:53:32,720 --> 00:53:34,789 vulnerabilities into systems: if 1559 00:53:34,790 --> 00:53:37,489 you look at the Obfuscated 1560 00:53:37,490 --> 00:53:40,009 C contest or the Underhanded 1561 00:53:40,010 --> 00:53:42,079 C Contest, there's an annual 1562 00:53:42,080 --> 00:53:44,719 contest for writing malicious 1563 00:53:44,720 --> 00:53:46,789 source code that can pass an 1564 00:53:46,790 --> 00:53:47,839 audit. 1565 00:53:47,840 --> 00:53:49,969 And it's very scary because you look 1566 00:53:49,970 --> 00:53:52,489 at some of the results and it's like, OK, 1567 00:53:52,490 --> 00:53:55,129 this code here has a malicious 1568 00:53:55,130 --> 00:53:56,449 functionality 1569 00:53:56,450 --> 00:53:57,679 that's as follows. 1570 00:53:57,680 --> 00:53:58,819 And here's the source code. 1571 00:53:58,820 --> 00:54:00,349 Can you see it? 1572 00:54:00,350 --> 00:54:01,909 And people have managed to make some that 1573 00:54:01,910 --> 00:54:04,159 are very hard to see in the source code. 1574 00:54:04,160 --> 00:54:05,929 I mean, I think the best, simplest answer 1575 00:54:05,930 --> 00:54:07,579 is the one that the free software 1576 00:54:07,580 --> 00:54:08,929 community always gives, which is: 1577 00:54:10,070 --> 00:54:11,179 many eyes.
1578 00:54:11,180 --> 00:54:13,219 We hope that by having the software open 1579 00:54:13,220 --> 00:54:15,469 and having especially incentive programs, 1580 00:54:15,470 --> 00:54:17,749 vulnerability reward programs sponsored 1581 00:54:17,750 --> 00:54:20,029 by people who rely on that infrastructure 1582 00:54:20,030 --> 00:54:22,699 for critical things 1583 00:54:22,700 --> 00:54:25,909 to sponsor disclosure, 1584 00:54:25,910 --> 00:54:27,709 we hope that those sorts of mistakes can 1585 00:54:27,710 --> 00:54:29,779 be found. And also things like getting 1586 00:54:29,780 --> 00:54:31,729 better source code integrity policies. 1587 00:54:31,730 --> 00:54:33,049 There's a lot of projects that just throw 1588 00:54:33,050 --> 00:54:35,119 tarballs up on FTP sites and 1589 00:54:35,120 --> 00:54:37,369 don't have any authenticated revision 1590 00:54:37,370 --> 00:54:38,569 history. 1591 00:54:38,570 --> 00:54:39,919 That's very dangerous, I think, for this 1592 00:54:39,920 --> 00:54:41,899 sort of attack. So things like Git and 1593 00:54:41,900 --> 00:54:43,669 other distributed version control can 1594 00:54:43,670 --> 00:54:44,599 help prevent that. 1595 00:54:44,600 --> 00:54:46,429 You have to compromise all the Git repos 1596 00:54:46,430 --> 00:54:48,379 and get it in there, or there's at least a 1597 00:54:48,380 --> 00:54:50,539 commit record in all of them 1598 00:54:50,540 --> 00:54:52,329 that shows that things can produce. 1599 00:54:53,420 --> 00:54:55,070 One question from microphone number two. 1600 00:54:57,570 --> 00:54:59,209 Right. 1601 00:54:59,210 --> 00:55:01,399 So I had a question: where 1602 00:55:01,400 --> 00:55:04,369 you say that you hard-code SHA-256 1603 00:55:04,370 --> 00:55:05,869 hashes in the build process to sort of 1604 00:55:05,870 --> 00:55:08,359 verify that upstream is OK.
1605 00:55:08,360 --> 00:55:09,499 Have you considered the possibility of 1606 00:55:09,500 --> 00:55:11,899 actually adding in, like, 1607 00:55:11,900 --> 00:55:14,269 signed Git commits and then 1608 00:55:14,270 --> 00:55:15,709 having something in the build process 1609 00:55:15,710 --> 00:55:17,689 that will verify those based on hard 1610 00:55:17,690 --> 00:55:19,219 coded keys? 1611 00:55:19,220 --> 00:55:21,289 We do. We do both. 1612 00:55:21,290 --> 00:55:23,539 We have the key rings for 1613 00:55:23,540 --> 00:55:25,699 all of the Git repositories that actually 1614 00:55:25,700 --> 00:55:26,700 sign their commits 1615 00:55:27,830 --> 00:55:29,929 and packages that sign 1616 00:55:29,930 --> 00:55:33,169 their first releases. 1617 00:55:33,170 --> 00:55:35,809 So we use those signatures 1618 00:55:35,810 --> 00:55:37,369 to verify the inputs where we believe 1619 00:55:37,370 --> 00:55:39,139 they're strong. And if we think they're 1620 00:55:39,140 --> 00:55:41,029 sketchy because there's 12 developers who 1621 00:55:41,030 --> 00:55:43,039 can all sign the same thing or they sign 1622 00:55:43,040 --> 00:55:44,599 with MD5 or whatever, that's when we 1623 00:55:44,600 --> 00:55:46,969 introduce the SHA hash into our repo. 1624 00:55:46,970 --> 00:55:49,519 And we also sign the Git tags 1625 00:55:49,520 --> 00:55:51,469 that we build from, in the case of the 1626 00:55:51,470 --> 00:55:53,239 browser. So when you check out something 1627 00:55:53,240 --> 00:55:55,729 to build for an official release, it has 1628 00:55:55,730 --> 00:55:57,530 my signature or Georg's signature 1629 00:55:58,700 --> 00:56:00,529 on that actual commit. 1630 00:56:00,530 --> 00:56:02,539 OK, thank you. OK, another question from 1631 00:56:02,540 --> 00:56:04,699 the Internet from microphone three.
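As a rough illustration of the hash pinning described in the answer above, here is a minimal Python sketch. It is not the project's actual build tooling: the function names, the input file name, and the sample bytes are all hypothetical, and it only shows the general idea of refusing a build input whose SHA-256 does not match a value committed alongside the build scripts.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_input(name: str, data: bytes, pinned: dict) -> None:
    """Raise if a build input does not match its pinned hash.

    `pinned` maps input names to expected SHA-256 hex digests. In a real
    setup this table would live in the signed build repository
    (an assumption for this sketch).
    """
    expected = pinned.get(name)
    if expected is None:
        raise KeyError(f"no pinned hash for {name}")
    actual = sha256_hex(data)
    # compare_digest avoids leaking where the two strings first differ
    if not hmac.compare_digest(actual, expected):
        raise ValueError(f"{name}: expected {expected}, got {actual}")


# Hypothetical input and pin table, for illustration only.
good_bytes = b"pretend this is an upstream source tarball"
PINNED = {"example-input.tar.gz": sha256_hex(good_bytes)}

verify_input("example-input.tar.gz", good_bytes, PINNED)  # passes silently
```

A tampered download fails loudly: calling `verify_input` with any other bytes for the same name raises `ValueError`, so the build stops before the input is used.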
1632 00:56:04,700 --> 00:56:06,859 Um, question number two: what 1633 00:56:06,860 --> 00:56:09,170 procedures are there 1634 00:56:10,360 --> 00:56:12,229 when there is a failure: 1635 00:56:12,230 --> 00:56:14,329 notice of compromised 1636 00:56:14,330 --> 00:56:16,579 keys, block chain, or how to roll 1637 00:56:16,580 --> 00:56:17,750 back multiple steps? 1638 00:56:20,500 --> 00:56:22,599 I'm having trouble interpreting 1639 00:56:22,600 --> 00:56:24,639 the question. If there is a key 1640 00:56:24,640 --> 00:56:26,769 compromise, what procedures 1641 00:56:26,770 --> 00:56:29,319 are there? Is there a failure 1642 00:56:29,320 --> 00:56:31,809 notice for the compromised 1643 00:56:31,810 --> 00:56:32,810 key? 1644 00:56:35,080 --> 00:56:36,339 It depends on which key. 1645 00:56:36,340 --> 00:56:37,509 I mean, there's a lot of key material. 1646 00:56:37,510 --> 00:56:38,929 There's the keys for the inputs. 1647 00:56:40,380 --> 00:56:42,669 The question seems to go in the direction 1648 00:56:42,670 --> 00:56:44,679 of what happens if the key is 1649 00:56:44,680 --> 00:56:45,680 compromised. 1650 00:56:46,480 --> 00:56:49,179 So that is why we want to extend 1651 00:56:49,180 --> 00:56:51,079 the software update authentication to the 1652 00:56:51,080 --> 00:56:53,379 Tor consensus and the Bitcoin block chain. 1653 00:56:53,380 --> 00:56:54,939 The keys that sign the software that 1654 00:56:54,940 --> 00:56:57,039 you download: you have 1655 00:56:57,040 --> 00:56:58,509 to compromise all of the builders' keys 1656 00:56:58,510 --> 00:57:00,429 because it's now multi-signed.
1657 00:57:00,430 --> 00:57:02,859 But if you manage to do that, 1658 00:57:02,860 --> 00:57:04,929 we hope that the Tor consensus, which is 1659 00:57:04,930 --> 00:57:07,509 itself signed by a different set of keys 1660 00:57:07,510 --> 00:57:09,099 and can hopefully one day be 1661 00:57:09,100 --> 00:57:10,509 authenticated by another consensus 1662 00:57:10,510 --> 00:57:12,759 process like the Bitcoin block chain. 1663 00:57:12,760 --> 00:57:15,249 All of these systems provide 1664 00:57:15,250 --> 00:57:17,529 layers of security, so that 1665 00:57:17,530 --> 00:57:19,929 hopefully someone somewhere can say, hey, 1666 00:57:19,930 --> 00:57:22,119 this signature for this 1667 00:57:22,120 --> 00:57:24,729 package claims to be a valid signature, 1668 00:57:24,730 --> 00:57:26,859 but yet is signing something that is 1669 00:57:26,860 --> 00:57:29,679 not a hash that was published elsewhere. 1670 00:57:29,680 --> 00:57:31,089 So far that hasn't happened. 1671 00:57:31,090 --> 00:57:33,189 It's possible it has happened and nobody 1672 00:57:33,190 --> 00:57:35,379 has told us. But we haven't seen any 1673 00:57:35,380 --> 00:57:37,629 evidence of that with the Tor software 1674 00:57:37,630 --> 00:57:38,889 at least. 1675 00:57:38,890 --> 00:57:40,569 OK, one more question from microphone 1676 00:57:40,570 --> 00:57:41,570 number two. 1677 00:57:43,570 --> 00:57:45,399 Hi, great talk, to both of you. 1678 00:57:45,400 --> 00:57:47,919 My question is for Seth: 1679 00:57:47,920 --> 00:57:50,139 is the install rootkit 1680 00:57:50,140 --> 00:57:51,889 program open source? 1681 00:57:51,890 --> 00:57:53,049 Anyway, I knew we were going to get this 1682 00:57:53,050 --> 00:57:54,050 one. 1683 00:57:56,240 --> 00:57:57,240 So. 1684 00:58:00,100 --> 00:58:02,169 I wasn't really planning to publish that.
1685 00:58:05,020 --> 00:58:06,999 We did give a version of this talk at 1686 00:58:07,000 --> 00:58:09,249 Mozilla and one of the Mozilla developers 1687 00:58:09,250 --> 00:58:11,679 apparently actually implemented it, 1688 00:58:11,680 --> 00:58:14,079 so apparently there is a version from 1689 00:58:14,080 --> 00:58:15,610 one of the Mozilla developers. 1690 00:58:17,260 --> 00:58:19,359 I'm not sure what it's called, but it 1691 00:58:19,360 --> 00:58:21,099 is actually out there and has this 1692 00:58:21,100 --> 00:58:22,100 functionality. 1693 00:58:23,050 --> 00:58:24,279 But, yeah, this is basically, I think 1694 00:58:24,280 --> 00:58:25,959 there's a Phrack article from the 90s that 1695 00:58:25,960 --> 00:58:27,129 describes the techniques. 1696 00:58:27,130 --> 00:58:28,509 I mean, you're hooking the open system 1697 00:58:28,510 --> 00:58:30,639 call in the kernel 1698 00:58:30,640 --> 00:58:32,979 to see if the process that's opening 1699 00:58:32,980 --> 00:58:35,469 the file is named one. 1700 00:58:35,470 --> 00:58:37,479 So it's a really naive implementation of 1701 00:58:37,480 --> 00:58:39,339 this sort of thing. Again, a really 1702 00:58:39,340 --> 00:58:40,839 sophisticated rootkit would do things 1703 00:58:40,840 --> 00:58:43,269 like also inspect the current process's 1704 00:58:43,270 --> 00:58:44,889 address space to make sure it really is a 1705 00:58:44,890 --> 00:58:46,899 compiler, and other such things. 1706 00:58:46,900 --> 00:58:49,119 So, I mean, I know that people do want to 1707 00:58:49,120 --> 00:58:51,459 like referral their coworkers and so on. 1708 00:58:51,460 --> 00:58:53,469 So if someone can convince me that 1709 00:58:53,470 --> 00:58:55,000 there's a real gap, that 1710 00:58:56,200 --> 00:58:58,509 existing rootkits are just not 1711 00:58:58,510 --> 00:58:59,980 suitable, then 1712 00:59:01,120 --> 00:59:02,649 we can consider cleaning this up and 1713 00:59:02,650 --> 00:59:03,650 publishing it sometime.
1714 00:59:05,820 --> 00:59:07,130 So another question from over here. 1715 00:59:08,940 --> 00:59:11,009 Hello. What about 1716 00:59:11,010 --> 00:59:13,109 the role of the linker, and which one is 1717 00:59:13,110 --> 00:59:14,579 usually used of the two 1718 00:59:14,580 --> 00:59:16,859 linkers from binutils? 1719 00:59:16,860 --> 00:59:18,569 I believe we started, I didn't get to 1720 00:59:18,570 --> 00:59:19,769 actually describe it in detail. 1721 00:59:19,770 --> 00:59:22,409 I believe we switched to the gold linker, 1722 00:59:22,410 --> 00:59:24,809 which is, I think, the new linker. 1723 00:59:24,810 --> 00:59:26,219 That's the new one. The new one. 1724 00:59:26,220 --> 00:59:28,439 So it turned out the old one had an issue 1725 00:59:28,440 --> 00:59:30,569 with the implementation of SHA-1 1726 00:59:30,570 --> 00:59:32,789 on 32-bit machines where, 1727 00:59:32,790 --> 00:59:35,249 if it was taking the build ID 1728 00:59:35,250 --> 00:59:37,529 for debug link, to link the debugging 1729 00:59:37,530 --> 00:59:39,959 symbols to a stripped binary, you 1730 00:59:39,960 --> 00:59:41,759 can have these things in separate files. 1731 00:59:41,760 --> 00:59:43,439 And then there's this build ID that 1732 00:59:43,440 --> 00:59:45,639 contains the SHA-1 of 1733 00:59:45,640 --> 00:59:48,059 the build to associate 1734 00:59:48,060 --> 00:59:50,159 the two. That 1735 00:59:50,160 --> 00:59:52,289 SHA-1 implementation in the old linker 1736 00:59:52,290 --> 00:59:54,299 had some sort of issue with very large 1737 00:59:54,300 --> 00:59:56,249 files. We ran into it on libxul.
1738 00:59:56,250 --> 00:59:58,469 So possibly because with the debug symbols, 1739 00:59:58,470 --> 00:59:59,939 that beast ends up larger than four 1740 00:59:59,940 --> 01:00:02,549 gigabytes, and we ended up with 1741 01:00:02,550 --> 01:00:04,889 random values for that SHA hash, 1742 01:00:04,890 --> 01:00:06,869 not different values than the official 1743 01:00:06,870 --> 01:00:09,029 SHA hash, but random values for 1744 01:00:09,030 --> 01:00:11,219 that SHA. So we had to switch off of 1745 01:00:11,220 --> 01:00:13,409 the old linker and 1746 01:00:13,410 --> 01:00:14,729 start using the new linker for that 1747 01:00:14,730 --> 01:00:17,069 reason. I think the new linker also has 1748 01:00:17,070 --> 01:00:18,070 better memory 1749 01:00:19,440 --> 01:00:21,599 overhead properties for linking very 1750 01:00:21,600 --> 01:00:23,519 large things like libxul, the Firefox 1751 01:00:23,520 --> 01:00:24,509 library. 1752 01:00:24,510 --> 01:00:26,820 So one more question from the Internet. 1753 01:00:28,350 --> 01:00:29,459 default question. 1754 01:00:31,710 --> 01:00:33,899 How do you propagate a vector among 1755 01:00:33,900 --> 01:00:35,249 compilers? 1756 01:00:35,250 --> 01:00:37,619 You can, uh, 1757 01:00:37,620 --> 01:00:39,599 you can use fingerprints, but they are 1758 01:00:39,600 --> 01:00:41,729 not future-proof, 1759 01:00:41,730 --> 01:00:43,859 and detecting if a piece of 1760 01:00:43,860 --> 01:00:47,219 code is a compiler is quite 1761 01:00:47,220 --> 01:00:49,319 infeasible from the information- 1762 01:00:49,320 --> 01:00:51,119 theoretical point of view. 1763 01:00:51,120 --> 01:00:52,919 We, yeah.
1764 01:00:52,920 --> 01:00:55,169 So the basic point is 1765 01:00:55,170 --> 01:00:57,539 that it might be implausible 1766 01:00:57,540 --> 01:00:59,669 to have a real-world vector that 1767 01:00:59,670 --> 01:01:01,229 propagates itself through very 1768 01:01:01,230 --> 01:01:03,149 heterogeneous code bases, especially 1769 01:01:03,150 --> 01:01:03,959 compilers. 1770 01:01:03,960 --> 01:01:05,579 And I think that's absolutely correct. 1771 01:01:07,470 --> 01:01:09,899 I think that the 1772 01:01:09,900 --> 01:01:11,909 more realistic scenario, if you look at 1773 01:01:11,910 --> 01:01:13,319 something like Thomas Dullien's 1774 01:01:13,320 --> 01:01:15,419 presentation, is actually sort of 1775 01:01:15,420 --> 01:01:18,059 an actively maintained backdoor 1776 01:01:18,060 --> 01:01:20,039 where someone is compromising not only 1777 01:01:20,040 --> 01:01:22,109 code bases, but servers, 1778 01:01:22,110 --> 01:01:23,909 and is periodically logging into the 1779 01:01:23,910 --> 01:01:25,769 servers and manually intervening in the 1780 01:01:25,770 --> 01:01:27,089 development process. 1781 01:01:27,090 --> 01:01:29,129 So I think the real threat scenario would 1782 01:01:29,130 --> 01:01:31,619 be that someone has compromised 1783 01:01:31,620 --> 01:01:33,809 developer workstations or build servers 1784 01:01:33,810 --> 01:01:36,779 inside various organizations and projects 1785 01:01:36,780 --> 01:01:39,089 and periodically adjusts the back 1786 01:01:39,090 --> 01:01:41,609 door in order to maintain access. 1787 01:01:41,610 --> 01:01:43,439 That's the sort of, whereas you don't get, 1788 01:01:43,440 --> 01:01:45,119 and you can even get that ability against 1789 01:01:45,120 --> 01:01:46,949 air-gapped computers.
You just keep seeding 1790 01:01:46,950 --> 01:01:48,749 the parking lot with new USB keys that 1791 01:01:48,750 --> 01:01:50,939 have the malware on them and they propagate 1792 01:01:50,940 --> 01:01:52,499 through all the computers until somebody 1793 01:01:52,500 --> 01:01:54,749 plugs in a USB key that goes into the 1794 01:01:54,750 --> 01:01:56,219 build server. But you don't get that 1795 01:01:56,220 --> 01:01:58,349 property with a compiler because it 1796 01:01:58,350 --> 01:02:00,869 has to be baked into that compiler once. 1797 01:02:00,870 --> 01:02:04,229 And then all of the binary versions 1798 01:02:04,230 --> 01:02:06,959 of that rootkit have to be there. 1799 01:02:06,960 --> 01:02:08,699 And in diverse double 1800 01:02:08,700 --> 01:02:10,319 compilation, you actually get to use old 1801 01:02:10,320 --> 01:02:11,879 compilers, so you can go back into your 1802 01:02:11,880 --> 01:02:14,309 archives and pull out a compiler binary 1803 01:02:14,310 --> 01:02:16,589 from 1983 and 1804 01:02:16,590 --> 01:02:18,359 use that. As long as it implements the 1805 01:02:18,360 --> 01:02:20,519 subset of C that GCC uses, so it 1806 01:02:20,520 --> 01:02:22,379 can compile GCC, you can use that 1807 01:02:22,380 --> 01:02:24,149 one. So I think the adversary would have had 1808 01:02:24,150 --> 01:02:25,859 to have gone back in time to 1809 01:02:25,860 --> 01:02:27,479 1983 to get their multi- 1810 01:02:27,480 --> 01:02:29,669 architecture support 1811 01:02:29,670 --> 01:02:31,379 introduced for an architecture that was 1812 01:02:31,380 --> 01:02:32,219 developed later. 1813 01:02:32,220 --> 01:02:34,679 Like I don't know what would be 1814 01:02:34,680 --> 01:02:36,579 armed was Armony one.
1815 01:02:36,580 --> 01:02:38,099 So I think the intuition of the person 1816 01:02:38,100 --> 01:02:39,509 asking the question is that it's very 1817 01:02:39,510 --> 01:02:41,399 difficult for a vector to propagate 1818 01:02:41,400 --> 01:02:42,959 without human intervention through 1819 01:02:42,960 --> 01:02:44,309 heterogeneous compilers. 1820 01:02:44,310 --> 01:02:45,749 And I think we completely agree with 1821 01:02:45,750 --> 01:02:47,789 that. And in fact, that difficulty 1822 01:02:47,790 --> 01:02:50,609 becomes a part of the safety strategy, 1823 01:02:50,610 --> 01:02:53,009 and the real attack strategy is 1824 01:02:53,010 --> 01:02:55,199 probably more manual on the part of the adversary. As 1825 01:02:55,200 --> 01:02:56,069 we are running out of time, 1826 01:02:56,070 --> 01:02:57,999 one final question from you, and please, 1827 01:02:58,000 --> 01:02:59,129 a short answer. 1828 01:02:59,130 --> 01:03:01,499 Yeah, I would like to know how well do, 1829 01:03:01,500 --> 01:03:02,999 uh, multiple threads work with 1830 01:03:03,000 --> 01:03:05,549 reproducible builds, so if you use more 1831 01:03:05,550 --> 01:03:06,689 than one core to build 1832 01:03:06,690 --> 01:03:09,059 your binaries, is the build 1833 01:03:09,060 --> 01:03:11,579 in general still reproducible? 1834 01:03:11,580 --> 01:03:13,139 That is where you run into things like 1835 01:03:13,140 --> 01:03:15,029 the file system ordering problem, where 1836 01:03:15,030 --> 01:03:17,129 the different make threads 1837 01:03:17,130 --> 01:03:19,259 or processes will be writing files into 1838 01:03:19,260 --> 01:03:21,149 the directory in different orders between 1839 01:03:21,150 --> 01:03:23,159 different machines, depending on how many 1840 01:03:23,160 --> 01:03:24,659 processes are running. 1841 01:03:24,660 --> 01:03:27,569 So we do use multiple 1842 01:03:27,570 --> 01:03:29,669 cores in the browser build process 1843 01:03:29,670 --> 01:03:31,079 and we solve that through sorting.
1844 01:03:31,080 --> 01:03:32,819 We sort everything before we make any 1845 01:03:32,820 --> 01:03:35,039 archive, tar or zip or 1846 01:03:35,040 --> 01:03:37,169 jar, during the build process. 1847 01:03:37,170 --> 01:03:38,759 And that takes care of that. 1848 01:03:38,760 --> 01:03:39,809 Thanks. 1849 01:03:39,810 --> 01:03:41,009 OK, thank you very much. 1850 01:03:41,010 --> 01:03:42,389 Please give another round of applause to the.
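The sorting approach from the last answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the browser's real build scripts: `deterministic_tar` is a hypothetical helper, it handles regular files only, and zeroing the timestamps and ownership fields is an extra normalization added here because those fields would otherwise also vary between machines.

```python
import hashlib
import io
import os
import tarfile
import tempfile


def deterministic_tar(src_dir: str) -> bytes:
    """Build a tar archive of src_dir whose bytes do not depend on the
    order in which the files were created (regular files only)."""
    paths = []
    for root, dirs, files in os.walk(src_dir):
        dirs.sort()  # visit subdirectories in a fixed order
        for name in files:
            paths.append(os.path.join(root, name))
    paths.sort()  # the key step: never rely on directory listing order

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for path in paths:
            info = tar.gettarinfo(path, arcname=os.path.relpath(path, src_dir))
            # Normalize metadata that differs between build machines.
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644  # note: this drops executable bits
            with open(path, "rb") as fh:
                tar.addfile(info, fh)
    return buf.getvalue()


def make_tree(order):
    """Create a temp directory, writing the files in the given order."""
    d = tempfile.mkdtemp()
    for name in order:
        with open(os.path.join(d, name), "w") as f:
            f.write("content of " + name)
    return d


if __name__ == "__main__":
    a = make_tree(["b.txt", "a.txt", "c.txt"])
    b = make_tree(["c.txt", "b.txt", "a.txt"])
    print(hashlib.sha256(deterministic_tar(a)).hexdigest())
    print(hashlib.sha256(deterministic_tar(b)).hexdigest())
```

The two printed digests should be identical even though the files were written in different orders, which is the property a parallel build needs before its archives can be compared bit for bit.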