1 00:00:18,810 --> 00:00:23,210 Herald: I have the great pleasure to announce Joscha, who will give us a great 2 00:00:23,210 --> 00:00:26,310 talk with the title "The Ghost in the Machine" and he will talk about 3 00:00:26,310 --> 00:00:33,200 consciousness of our mind and of computers and somehow also tell us how we can learn 4 00:00:33,200 --> 00:00:38,080 from A.I. systems about our own brains. And I think this is a very curious question. 5 00:00:38,080 --> 00:00:41,015 So please give it up for Joscha. 6 00:00:41,015 --> 00:00:51,010 *Applause* 7 00:00:51,010 --> 00:00:58,900 Joscha: Good evening. This is the fifth talk in a series of talks on how to 8 00:00:58,900 --> 00:01:03,930 get from computation to consciousness and to understand our condition in the 9 00:01:03,930 --> 00:01:09,180 universe based on concepts that I mostly learned by looking at artificial 10 00:01:09,180 --> 00:01:16,530 intelligence and computation, and it mostly tackles the big philosophical questions: 11 00:01:16,530 --> 00:01:20,410 What can I know? What is true? What is truth? Who am I? Which means the questions 12 00:01:20,410 --> 00:01:25,660 of epistemology, of ontology, of metaphysics, and philosophy of mind and 13 00:01:25,660 --> 00:01:26,710 ethics. 14 00:01:26,710 --> 00:01:30,603 And to clear up some of the terms that we are using here: 15 00:01:30,603 --> 00:01:34,300 What is intelligence? What's a mind? What's a self? What's consciousness? 16 00:01:34,300 --> 00:01:37,740 How are mind and consciousness realized in the universe? 17 00:01:37,740 --> 00:01:40,280 Intelligence I think is the ability to make models. 18 00:01:40,280 --> 00:01:42,450 It's not the same thing as being smart, which is the 19 00:01:42,450 --> 00:01:46,770 ability to reach your goals, or being wise, which is the ability to pick the right 20 00:01:46,770 --> 00:01:50,680 goals. But it's just the ability to make models of things. 
21 00:01:50,680 --> 00:01:53,980 And you can regulate things later using these models, but you don't have to. 22 00:01:53,980 --> 00:01:57,308 And the mind is this thing that observes the universe. The self 23 00:01:57,308 --> 00:02:00,867 is an identification with properties and purposes. 24 00:02:00,867 --> 00:02:04,120 What a thing thinks it is. And then you have consciousness, which is 25 00:02:04,120 --> 00:02:08,270 the experience of what it's like to be a thing. 26 00:02:08,270 --> 00:02:10,749 And how are mind and consciousness realized in the universe? 27 00:02:10,749 --> 00:02:13,560 This is commonly called the mind-body problem and it's been 28 00:02:13,560 --> 00:02:20,023 puzzling philosophers and people of all proclivities for thousands of years. 29 00:02:20,023 --> 00:02:25,360 So what's going on? How's it possible that I find myself in a universe and I seem to 30 00:02:25,360 --> 00:02:31,130 be experiencing myself in that universe? How does this go together and how is this, 31 00:02:31,130 --> 00:02:37,260 what's going on here? The traditional answer to this is called dualism, and the 32 00:02:37,260 --> 00:02:41,510 conception of dualism is, in our culture at least, this idea that 33 00:02:41,510 --> 00:02:45,620 you have a physical world and a mental world and they coexist somehow. My mind 34 00:02:45,620 --> 00:02:49,620 experiences this mental world and my body can do things in the physical world, and 35 00:02:49,620 --> 00:02:53,860 the difficulty of this dualist conception is how these two planes of existence 36 00:02:53,860 --> 00:02:57,750 interact. Because physics is defined as causally closed, everything that 37 00:02:57,750 --> 00:03:03,340 influences things in the physical world is by itself an element of physics. So an 38 00:03:03,340 --> 00:03:07,410 alternative is idealism, which says that there is only a mental world. 
We only 39 00:03:07,410 --> 00:03:12,460 exist in a dream and this dream is being dreamt by a mind on a higher plane of 40 00:03:12,460 --> 00:03:17,700 existence. And the difficulty with this is that it's very hard to explain that mind on a higher 41 00:03:17,700 --> 00:03:22,430 plane of existence. Just put it there, why is it doing this? And in our culture the 42 00:03:22,430 --> 00:03:27,040 dominant theory is materialism, which basically says there is only a physical world, 43 00:03:27,040 --> 00:03:32,100 nothing else. And the physical world somehow is responsible for the creation of 44 00:03:32,100 --> 00:03:36,700 the mental world. It's not quite clear how this happens. And the answer that I am 45 00:03:36,700 --> 00:03:44,110 suggesting is functionalism, which means that indeed we exist only in a dream. 46 00:03:44,110 --> 00:03:48,630 So these ideas of materialism and idealism are not in opposition. They are 47 00:03:48,630 --> 00:03:51,960 complementary, because this dream is being dreamt by a mind on a higher plane of 48 00:03:51,960 --> 00:03:57,010 existence, but this higher plane of existence is the physical world. So we are 49 00:03:57,010 --> 00:04:02,660 being dreamt in the neocortex of a primate that lives in a physical universe, and the 50 00:04:02,660 --> 00:04:05,780 world that we experience is not the physical world. It's a dream generated by 51 00:04:05,780 --> 00:04:10,120 the neocortex - the same circuits that make dreams at night make them during the 52 00:04:10,120 --> 00:04:13,850 day. You can show this, and you live in this virtual reality being generated in 53 00:04:13,850 --> 00:04:18,430 there, and the self is a character in that dream. And it seems to take care of 54 00:04:18,430 --> 00:04:21,520 things. It seems to explain what's going on. It explains why a miracle seems to be 55 00:04:21,520 --> 00:04:26,070 possible and why I can look into the future but cannot break the bank somehow. 
56 00:04:26,070 --> 00:04:31,480 And even though this theory explains this, shouldn't I be more agnostic? Are 57 00:04:31,480 --> 00:04:35,220 there not alternatives that I should be considering? Maybe the narratives of our 58 00:04:35,220 --> 00:04:40,889 big religions and so on. I think we should be agnostic. So the first rule of 59 00:04:40,889 --> 00:04:46,110 epistemology says that the confidence in a belief must equal the weight of the 60 00:04:46,110 --> 00:04:49,311 evidence supporting it. Once you stumble on that rule, you can test all the 61 00:04:49,311 --> 00:04:54,130 alternatives and see if one of them is better. And I think what this means is you 62 00:04:54,130 --> 00:04:57,540 have to have all the possible beliefs, you should entertain them all. But you should 63 00:04:57,540 --> 00:05:01,050 not have any confidence in them. You should shift your confidence around based 64 00:05:01,050 --> 00:05:05,560 on the evidence. So for instance it is entirely possible that this universe was 65 00:05:05,560 --> 00:05:09,140 created by a supernatural being, and it's a big conspiracy, and it actually has 66 00:05:09,140 --> 00:05:12,900 meaning and it cares about us and our existence here means something. 67 00:05:12,900 --> 00:05:17,381 But there is no experiment that can validate this. A guy coming down from a 68 00:05:17,381 --> 00:05:21,160 burning mount, from a burning bush, that you've talked to on a 69 00:05:21,160 --> 00:05:28,370 mountaintop? That's not a kind of experiment that gives you valid evidence, right? 70 00:05:28,370 --> 00:05:32,560 So intelligence is the ability to make models, and intelligence is a property 71 00:05:32,560 --> 00:05:36,730 that is beyond the grasp of a single individual. A single individual is not 72 00:05:36,730 --> 00:05:41,090 that smart. We cannot even figure out Turing-complete languages all by ourselves. 
73 00:05:41,090 --> 00:05:45,270 To do this you need an intellectual tradition that lasts a few hundred years 74 00:05:45,270 --> 00:05:49,600 at least. So civilizations have more intelligence than individuals. But 75 00:05:49,600 --> 00:05:54,320 individuals often have more intelligence than groups and whole generations, and 76 00:05:54,320 --> 00:05:58,830 that's because groups and generations tend to converge on ideas; they have consensus 77 00:05:58,830 --> 00:06:03,400 opinions. I'm very wary of consensus opinions because you know how hard it is 78 00:06:03,400 --> 00:06:06,480 to understand which programming language is the best one for which purpose. There 79 00:06:06,480 --> 00:06:09,830 is no proper consensus. And that's a relatively easy problem. So when there's a 80 00:06:09,830 --> 00:06:13,919 complex topic and all the experts agree, there are forces at work that are 81 00:06:13,919 --> 00:06:17,230 different from the forces that make them search for truth. These consensus-building 82 00:06:17,230 --> 00:06:21,479 forces, they're very suspicious to me. And if you want to understand what's true you 83 00:06:21,479 --> 00:06:24,840 have to look for means and motive. And you have to be autonomous in doing this, so 84 00:06:24,840 --> 00:06:29,229 individuals typically have better ideas than generations or groups. But as I 85 00:06:29,229 --> 00:06:32,670 said, civilizations have more intelligence than individuals. What does a 86 00:06:32,670 --> 00:06:36,860 civilizational intellect look like? The civilizational intellect is something like a 87 00:06:36,860 --> 00:06:40,160 global optimum of the modeling function. It's something that has to be built over 88 00:06:40,160 --> 00:06:43,610 thousands of years in an unbroken intellectual tradition. And guess what, 89 00:06:43,610 --> 00:06:47,100 this doesn't really exist in human history. Every few hundred years, there's 90 00:06:47,100 --> 00:06:51,350 some kind of revolution. 
Somebody opens the doors to the knowledge factories and 91 00:06:51,350 --> 00:06:54,790 gets everybody out and burns down the libraries. And a couple of generations later, 92 00:06:54,790 --> 00:06:58,830 the knowledge worker drones of the new king realize "Oh my God we need to rebuild 93 00:06:58,830 --> 00:07:02,720 this thing, this intellect." And then they create something in its likeness, but they 94 00:07:02,720 --> 00:07:07,760 make mistakes in the foundation. So this intellect tends to have scars. Like our 95 00:07:07,760 --> 00:07:11,539 civilizational intellect has a lot of scars in it, that make it difficult 96 00:07:11,539 --> 00:07:16,510 to understand concepts like self and consciousness and mind. So, the mind 97 00:07:16,510 --> 00:07:19,680 is something that observes the universe, and the neurons and neurotransmitters are 98 00:07:19,680 --> 00:07:22,860 the substrate. And the human intellect and the working memory is the current binding 99 00:07:22,860 --> 00:07:26,931 state, how the different elements fit together in our mind. And the self is the 100 00:07:26,931 --> 00:07:31,169 identification with what we think we are and what we want to happen. And consciousness 101 00:07:31,169 --> 00:07:35,270 is the contents of our attention, it makes knowledge available throughout the mind. 102 00:07:35,270 --> 00:07:39,419 And civilizational intellect is very similar: society observes the universe, 103 00:07:39,419 --> 00:07:42,160 people and resources are the substrate, the generation is the current binding 104 00:07:42,160 --> 00:07:46,860 state, and culture is the identification with what we think we are and what we want 105 00:07:46,860 --> 00:07:51,840 to happen. And media is the contents of our attention and makes knowledge available 106 00:07:51,840 --> 00:07:55,930 throughout society. So the culture is basically the self of civilization, and 107 00:07:55,930 --> 00:08:00,490 media is its consciousness. How is it possible to model a universe? 
Let's take a 108 00:08:00,490 --> 00:08:04,771 very simple universe like the Mandelbrot fractal. It can be defined by a little bit 109 00:08:04,771 --> 00:08:09,490 of code. It's a very simple thing, you just take a pair of numbers, you square it, you 110 00:08:09,490 --> 00:08:13,760 add the same pair of numbers. And you do this infinitely often, and typically this 111 00:08:13,760 --> 00:08:18,940 goes to infinity very fast. There's a small area around the origin of the number 112 00:08:18,940 --> 00:08:24,680 pair, so between -1 and +1 and so on, where you have an area where this 113 00:08:24,680 --> 00:08:28,330 converges, where it doesn't go to infinity, and that is where you make black dots and 114 00:08:28,330 --> 00:08:33,250 then you get this famous structure, the Mandelbrot fractal. And because this 115 00:08:33,250 --> 00:08:37,229 divergence and convergence of the function can take many loops and circles and so on, 116 00:08:37,229 --> 00:08:41,169 you get a very complicated shape, a very complicated outline, an infinitely 117 00:08:41,169 --> 00:08:44,709 complicated outline there. So there is an infinite amount of structure in this 118 00:08:44,709 --> 00:08:47,990 fractal. And now imagine you happen to live in this fractal and you are in a 119 00:08:47,990 --> 00:08:52,529 particular place in it, and you don't know where that place is. You 120 00:08:52,529 --> 00:08:55,189 don't even know the generator function of the whole thing. But you can still predict 121 00:08:55,189 --> 00:08:58,350 your neighborhood. So you can see, omg, I'm in some kind of a spiral, it turns 122 00:08:58,350 --> 00:09:01,629 to the left, goes to the left, and goes to the left, and becomes smaller, so we can 123 00:09:01,629 --> 00:09:05,660 predict, and suddenly it ends. Why does it end? A singularity. Oh, it hits another 124 00:09:05,660 --> 00:09:09,290 spiral. There's a law: when a spiral hits another spiral, it ends. 
And something 125 00:09:09,290 --> 00:09:14,310 else happens. So you look and then you see oh, there are certain circumstances where 126 00:09:14,310 --> 00:09:17,360 you have, for instance, an even number of spirals hitting each other instead of an 127 00:09:17,360 --> 00:09:20,769 odd number. And then you discover another law. And if you make like 50 levels 128 00:09:20,769 --> 00:09:25,209 of these laws, then this is a good description that locally compresses the 129 00:09:25,209 --> 00:09:28,509 universe. So the Mandelbrot fractal is locally compressible. You find local 130 00:09:28,509 --> 00:09:32,110 order that predicts the neighborhood if you are inside of that fractal. The global 131 00:09:32,110 --> 00:09:35,469 modelling function of the Mandelbrot fractal is very, very easy. It's an 132 00:09:35,469 --> 00:09:40,009 interesting question: how difficult is the global modelling function of our universe? 133 00:09:40,009 --> 00:09:43,160 Even if we know it, maybe it doesn't help us that much. It will be a big 134 00:09:43,160 --> 00:09:46,230 breakthrough for physics when we finally find it, it will be much shorter than the 135 00:09:46,230 --> 00:09:52,610 standard model, as I suspect, but we still don't know where we are. And this means we 136 00:09:52,610 --> 00:09:55,689 need to make a local model of what's happening. So in order to do this we 137 00:09:55,689 --> 00:09:59,850 separate the universe into things. Things are small state spaces and transition 138 00:09:59,850 --> 00:10:04,509 functions that tell you how to get from state to state. And if the function is 139 00:10:04,509 --> 00:10:08,009 deterministic it is independent of time, it gives the same result every time you 140 00:10:08,009 --> 00:10:12,600 call it. An indeterministic function gives a different result every time, so 141 00:10:12,600 --> 00:10:17,139 it doesn't compress well. 
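The "little bit of code" that defines the Mandelbrot fractal, as described above, can be sketched roughly as follows. This is a minimal illustration and not the speaker's own code: it uses the standard formulation that starts from zero and repeatedly squares and adds the same number pair c, and it replaces "infinitely often" with a finite iteration cap. The names `escapes` and `render`, the cap of 100 iterations, and the plotted region are all assumptions made for the example.

```python
def escapes(c, max_iter=100, bound=2.0):
    """Return True if iterating z -> z*z + c leaves the disc of radius `bound`.

    `max_iter` is a finite stand-in for "infinitely often" (an assumption).
    """
    z = 0j
    for _ in range(max_iter):
        z = z * z + c          # square the number pair, add the same pair c
        if abs(z) > bound:     # once outside radius 2 it goes to infinity
            return True
    return False


def render(width=60, height=24, max_iter=100):
    """Draw the set as text rows: '#' where the iteration stays bounded."""
    rows = []
    for j in range(height):
        im = 1.2 - 2.4 * j / (height - 1)        # imaginary axis, top to bottom
        row = ""
        for i in range(width):
            re = -2.0 + 3.0 * i / (width - 1)    # real axis, left to right
            row += " " if escapes(complex(re, im), max_iter) else "#"
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

The generator function is only a few lines long, which is the point made above: the global model is tiny even though the outline it produces is arbitrarily intricate.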
And causality means that you have several separate 142 00:10:17,139 --> 00:10:20,139 things and they influence each other's evolution through a shared interface. 143 00:10:20,139 --> 00:10:24,389 Right? So causality is an artifact of describing the universe as separate 144 00:10:24,389 --> 00:10:28,019 things. And the universe is not separate things, it's one thing, but we have to 145 00:10:28,019 --> 00:10:32,599 describe it as separate things because we cannot observe the whole thing. So what's 146 00:10:32,599 --> 00:10:36,649 true? There seems to be a particular way in which the universe seems to be, and 147 00:10:36,649 --> 00:10:40,399 that's the ground rules of the universe, and it's inaccessible to us. And what's 148 00:10:40,399 --> 00:10:44,509 accessible to us is our own models of the universe, the only thing that we can 149 00:10:44,509 --> 00:10:47,550 experience, and this is basically a set of theories that can explain the 150 00:10:47,550 --> 00:10:52,401 observations. And truth in this sense is a property of language, and there are 151 00:10:52,401 --> 00:10:56,689 different languages that we can use, like geometry and natural language and so on, 152 00:10:56,689 --> 00:11:00,269 and ways of representing and changing models in our languages, and several 153 00:11:00,269 --> 00:11:06,100 intellectual traditions have developed their own languages. And this has led to 154 00:11:06,100 --> 00:11:10,259 problems. Our civilization basically has as its founding myth this attempt to build 155 00:11:10,259 --> 00:11:14,689 this global optimum modelling function. This is a tower that is meant to reach the 156 00:11:14,689 --> 00:11:18,120 heavens. And it fell apart because people spoke different languages. The different 157 00:11:18,120 --> 00:11:20,910 practitioners in the different fields didn't understand each other and the 158 00:11:20,910 --> 00:11:24,559 whole building collapsed. 
And this is in some sense the origin of our present 159 00:11:24,559 --> 00:11:28,490 civilization and we are trying to mend this and find better languages. So whom 160 00:11:28,490 --> 00:11:32,269 can we turn to? We can turn to the mathematicians maybe, because mathematics 161 00:11:32,269 --> 00:11:35,990 is the domain of all languages. Mathematics is really cool when you think 162 00:11:35,990 --> 00:11:40,009 about it. It's a universal code library, maintained for several centuries in its 163 00:11:40,009 --> 00:11:44,069 present form. There is not even version management, it's one version. There is 164 00:11:44,069 --> 00:11:47,670 a pretty much unified namespace. They have to use a lot of Unicode to make it 165 00:11:47,670 --> 00:11:52,040 happen. It's ugly but there you go! It has no central maintainers, not even a code of 166 00:11:52,040 --> 00:11:54,589 conduct, beyond what you can infer yourself. 167 00:11:54,589 --> 00:11:57,899 *laughter* But there are some problems at the 168 00:11:57,899 --> 00:12:06,060 foundation that they discovered. Shouted from the audience: en sehr stabile 169 00:12:06,060 --> 00:12:09,869 Joscha: Can you infer this is a good conduct? ?????????? 170 00:12:09,869 --> 00:12:17,029 Yelling from the audience: Ya! Joscha: Okay. Power to you. 171 00:12:17,029 --> 00:12:20,790 *laughter* Joscha: In 1874 Georg Cantor discovered, when he looked 172 00:12:20,790 --> 00:12:25,399 at the cardinality of sets, when you describe natural numbers using set 173 00:12:25,399 --> 00:12:30,129 theory, that the cardinality of a set grows slower than the cardinality of the 174 00:12:30,129 --> 00:12:33,480 set of its subsets. So if you look at the set of the subsets of the set, it's always 175 00:12:33,480 --> 00:12:38,209 larger than the number of members of the set. Clear? Right. If 176 00:12:38,209 --> 00:12:42,170 you take the infinite set, it has infinitely many members: omega. 
You 177 00:12:42,170 --> 00:12:45,749 take the cardinality of the set of the subsets of the infinite set, it's also an 178 00:12:45,749 --> 00:12:49,670 infinite number, but it's a larger one. So it's a number that is larger than the 179 00:12:49,670 --> 00:12:55,459 previous omega. Okay that's fine. Now we have the cardinality of the set of all 180 00:12:55,459 --> 00:12:57,899 sets. You make the total set: the set where you put all the sets that could 181 00:12:57,899 --> 00:13:01,609 possibly exist and put them all together, right? That also has infinitely many 182 00:13:01,609 --> 00:13:04,839 members, and it has more than the cardinality of the set of the subsets of 183 00:13:04,839 --> 00:13:08,769 the infinite set. That's fine. But now you look at the cardinality of the set of all 184 00:13:08,769 --> 00:13:14,279 the subsets of the total set. The problem is that the total set also contains the 185 00:13:14,279 --> 00:13:17,729 set of its subsets, right? It's because it contains all the sets. Now you have a 186 00:13:17,729 --> 00:13:22,170 contradiction: because the cardinality of the set of the subsets of the total set is 187 00:13:22,170 --> 00:13:26,750 supposed to be larger. And yet it seems to be the same set and not the same set. It's 188 00:13:26,750 --> 00:13:31,990 an issue! So mathematicians got puzzled about this, and the philosopher Bertrand 189 00:13:31,990 --> 00:13:34,999 Russell said: "Maybe we just exclude those sets that contain themselves", 190 00:13:34,999 --> 00:13:39,239 right? We only look at the set of sets that don't contain themselves. Isn't that 191 00:13:39,239 --> 00:13:42,850 a solution? Now the problem is: does the set of the sets that don't contain 192 00:13:42,850 --> 00:13:47,445 themselves contain itself? If it does, it doesn't, and if it doesn't, it does. 193 00:13:47,445 --> 00:13:52,180 That's an issue! 
*laughter* 194 00:13:52,180 --> 00:13:56,119 So David Hilbert, who was some kind of a community manager back then, 195 00:13:56,119 --> 00:14:00,100 said: "Guys, fix this! This is an issue, mathematics is precious, we are in 196 00:14:00,100 --> 00:14:04,819 trouble. Please solve metamathematics." And people got to work. And after a short 197 00:14:04,819 --> 00:14:08,100 amount of time Kurt Gödel, who had looked at this in earnest, said: "Oh, that's an 198 00:14:08,100 --> 00:14:11,209 issue. You know, as soon as we allow these kinds of loops - and we cannot really 199 00:14:11,209 --> 00:14:16,439 exclude these loops - then our mathematics crashes." So that's an issue, it's called 200 00:14:16,439 --> 00:14:21,779 undecidability. And then Alan Turing came along a couple of years later, and he 201 00:14:21,779 --> 00:14:24,329 constructed a computer to make that proof. He basically said: "If you build a machine 202 00:14:24,329 --> 00:14:27,990 that does this mathematics, and the machine takes infinitely many steps, 203 00:14:27,990 --> 00:14:31,920 sometimes, for making a proof, then we cannot know whether this proof 204 00:14:31,920 --> 00:14:35,669 terminates." So it's a similar issue to undecidability. That's a big 205 00:14:35,669 --> 00:14:39,199 issue, right? So we basically cannot build a machine in mathematics that runs 206 00:14:39,199 --> 00:14:45,269 mathematics without crashing. But the good news is, Turing didn't stop working there 207 00:14:45,269 --> 00:14:48,609 and he figured out together with Alonzo Church - not together, independently but 208 00:14:48,609 --> 00:14:53,819 at the same time - that we can build a computational machine that runs all of 209 00:14:53,819 --> 00:14:59,269 computation. So computation is a universal thing. And it's almost as good as 210 00:14:59,269 --> 00:15:03,279 mathematics. Computation is constructive mathematics. 
The tiny, neglected subset of 211 00:15:03,279 --> 00:15:06,360 mathematics, where you have to show the money. In order to say that something is 212 00:15:06,360 --> 00:15:10,839 true, you have to find that object that is true. You have to actually construct it. 213 00:15:10,839 --> 00:15:13,960 So there are no infinities, because you cannot construct an infinity. You add 214 00:15:13,960 --> 00:15:19,110 things and you have unboundedness maybe, but not infinity. And so this part of 215 00:15:19,110 --> 00:15:23,760 mathematics, computation, is the one that can be implemented. It's constructive 216 00:15:23,760 --> 00:15:27,309 mathematics. It's the good part. And a computer is very easy to 217 00:15:27,309 --> 00:15:31,079 make, and all universal computers have the same power. That's called the Church-Turing 218 00:15:31,079 --> 00:15:37,069 thesis. And Turing didn't even stop there. The obvious conclusion is that 219 00:15:37,069 --> 00:15:40,440 human minds are probably not in the class of these mathematical machines that even 220 00:15:40,440 --> 00:15:43,929 God doesn't know how to build, if it has to be done in any language. But the mind is a 221 00:15:43,929 --> 00:15:47,650 computational machine. And it also means that all machines that human minds ever 222 00:15:47,650 --> 00:15:50,340 encounter, all the mathematics that human minds encounter, 223 00:15:50,340 --> 00:15:55,940 will be computational mathematics. So how can you bridge the gap 224 00:15:55,940 --> 00:16:00,279 from mathematics to philosophy? Can we find a language that is more powerful than 225 00:16:00,279 --> 00:16:03,039 most of the languages that we look at in mathematics, which are very narrowly 226 00:16:03,039 --> 00:16:07,559 defined languages, where for every symbol we know exactly what it means? 
227 00:16:07,559 --> 00:16:09,089 When we look at the real world, 228 00:16:09,089 --> 00:16:11,389 we often don't know what things mean, and our concepts, we're not quite 229 00:16:11,389 --> 00:16:14,799 sure what they mean. Like culture is a very vague, ambiguous concept. So what I 230 00:16:14,799 --> 00:16:20,139 said is only approximately true there. Can we deal with this conceptual ambiguity? 231 00:16:20,139 --> 00:16:24,319 Can we build a programming language for thought, where words mean things that 232 00:16:24,319 --> 00:16:28,169 they're supposed to mean? And this was the project of Ludwig Wittgenstein. He just 233 00:16:28,169 --> 00:16:32,769 came back from the war and had a lot of thoughts. Then he put these thoughts 234 00:16:32,769 --> 00:16:37,669 into a book which is called the Tractatus. And it's one of the most beautiful books 235 00:16:37,669 --> 00:16:42,410 in the philosophy of the 20th century. And it starts with the words "The world is 236 00:16:42,410 --> 00:16:47,359 everything that is the case. The world is the totality of facts, not of things. 237 00:16:47,359 --> 00:16:53,619 The world is determined by the facts, and by these being all the facts", 238 00:16:53,619 --> 00:16:57,360 and so on. This book is about 75 pages long and it's a single thought. It's not meant to 239 00:16:57,360 --> 00:17:01,569 be an argument to convince a philosopher. It's an attempt by a guy who was basically 240 00:17:01,569 --> 00:17:05,860 a coder, an AI scientist, to reverse engineer the language of his own thinking. 241 00:17:05,860 --> 00:17:11,310 And make it deterministic, to make it formal, to make it mean something. And he 242 00:17:11,310 --> 00:17:15,180 felt back then that he was successful, and it had a tremendous impact on philosophy, 243 00:17:15,180 --> 00:17:19,110 which was largely devastating, because the philosophers didn't know what he was on 244 00:17:19,110 --> 00:17:22,930 about. 
They thought it's about natural language and not about coding. 245 00:17:22,930 --> 00:17:25,430 And he wrote this in 1918, 246 00:17:25,430 --> 00:17:29,350 so before Alan Turing defined what a computer is. But he could already 247 00:17:29,350 --> 00:17:33,530 smell what a computer is. He already knew about universality of computation. He knew 248 00:17:33,530 --> 00:17:37,370 that a NAND gate is sufficient to explain all of Boolean algebra and it's equivalent 249 00:17:37,370 --> 00:17:42,760 to other things. So what he basically did was, he pre-empted the logicist program 250 00:17:42,760 --> 00:17:47,600 of artificial intelligence, which started much later in the 1950s. And he ran into 251 00:17:47,600 --> 00:17:51,420 troubles with it. In the end he wrote the book "Philosophical Investigations", where 252 00:17:51,420 --> 00:17:57,110 he concluded that his project had basically failed, because the 253 00:17:57,110 --> 00:18:01,740 world is too complex and too ambiguous to deal with this. And symbolic AI was mostly 254 00:18:01,740 --> 00:18:05,470 similar to Wittgenstein's program. So classical AI is symbolic. You analyze a 255 00:18:05,470 --> 00:18:10,250 problem, you find an algorithm to solve it. And what we now have in AI is mostly 256 00:18:10,250 --> 00:18:14,370 sub-symbolic. So we have algorithms that learn the solution of a problem by 257 00:18:14,370 --> 00:18:17,810 themselves. And it's tempting to think that the next thing that we have will be 258 00:18:17,810 --> 00:18:22,520 meta-learning. That you have algorithms that learn to learn the solution to the 259 00:18:22,520 --> 00:18:28,130 problem. Meanwhile, let's look at how we can make models. Information is a 260 00:18:28,130 --> 00:18:30,930 discernible difference. It's about change. All information is about change. 
The 261 00:18:30,930 --> 00:18:33,950 information that is not about change, you cannot see its causal effect on the world, 262 00:18:33,950 --> 00:18:38,650 because it stays the same, right? And the meaning of information is its relationship 263 00:18:38,650 --> 00:18:43,490 to change in other information. So if you see a blip on your retina, the meaning 264 00:18:43,490 --> 00:18:46,810 of that blip on your retina is the relationships you discover to other blips 265 00:18:46,810 --> 00:18:50,390 on your retina. It could be for instance, if you see a sequence of such blips that 266 00:18:50,390 --> 00:18:55,220 are adjacent to each other, a first-order model: you see a moving dust mote or a 267 00:18:55,220 --> 00:18:59,130 moving dot on your retina. And a higher-order model makes it possible to 268 00:18:59,130 --> 00:19:02,240 understand: "Oh, it's part of something larger! There's people moving in a three 269 00:19:02,240 --> 00:19:06,110 dimensional room and they exchange ideas." And this is maybe the best model 270 00:19:06,110 --> 00:19:08,770 you end up with. That's the local compression that you can make of your 271 00:19:08,770 --> 00:19:13,360 universe, based on correlating blips on your retina. And for those blips where you 272 00:19:13,360 --> 00:19:16,550 don't find a relationship, which is a function that your brain can compute, 273 00:19:16,550 --> 00:19:21,800 they are noise. And there's a lot of noise on our retina, too. So what's a function? 274 00:19:21,800 --> 00:19:26,010 A function is basically a gear box: it has n input levers and 1 output lever. 275 00:19:26,010 --> 00:19:30,820 And when you move the input levers they translate to movement of the output 276 00:19:30,820 --> 00:19:34,410 lever, right? And the function can be realized in many ways: maybe you cannot 277 00:19:34,410 --> 00:19:38,780 open the gear box, and what happens in this function could be for instance two 278 00:19:38,780 --> 00:19:43,320 sprockets, which do this. 
Or you can have the same results with levers and pulleys. 279 00:19:43,320 --> 00:19:49,010 And so you don't know what's inside, but you can express what it does: two times 280 00:19:49,010 --> 00:19:53,490 the input value, right? And you can have a more difficult case, where you have 281 00:19:53,490 --> 00:19:56,320 several input values and they all influence the output value. So how do you 282 00:19:56,320 --> 00:20:00,190 figure it out? A way to do this is, you only move one input value at a time and 283 00:20:00,190 --> 00:20:03,240 you wiggle it a little bit at every position and see how much this translates 284 00:20:03,240 --> 00:20:08,860 into wiggling of the output value. This is what we call taking a partial derivative. 285 00:20:08,860 --> 00:20:12,540 And it's simple to do this for this case where you just have to 286 00:20:12,540 --> 00:20:17,010 multiply it by two. And the bad case is like this: you have a combination lock and 287 00:20:17,010 --> 00:20:21,440 it has maybe a 1000 bit input value, and only if you have exactly the right 288 00:20:21,440 --> 00:20:26,469 combination of the input bits do you have a movement of the output bit. And you're not 289 00:20:26,469 --> 00:20:30,550 going to figure this out until the sun burns out, right? So there's no way you 290 00:20:30,550 --> 00:20:34,640 can decipher this function. And the functions that we can model are somewhere 291 00:20:34,640 --> 00:20:38,911 in between, something like this: so you have 40 million input images and you want 292 00:20:38,911 --> 00:20:44,200 to find out whether one of these images displays a cat, or a dog, or something 293 00:20:44,200 --> 00:20:47,750 else. So what can you do with this? You cannot do this all at once, right? So you 294 00:20:47,750 --> 00:20:51,060 need to take this image classifier function and disassemble it into small 295 00:20:51,060 --> 00:20:54,410 functions that are very well-behaved, so you know what to do with them. 
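The wiggling procedure described above, moving one input at a time and watching the output, can be sketched as a numerical estimate of the partial derivatives. This is only an illustration under stated assumptions: the helper `partial_derivatives`, the step size `eps`, and the two example gear boxes `gearbox` and `mixed` are hypothetical names and values, not anything from the talk.

```python
def partial_derivatives(f, inputs, eps=1e-6):
    """Estimate how much each input lever moves the output lever.

    Wiggles one input at a time by a small amount `eps` in both directions
    (a central difference) while all other inputs stay fixed.
    """
    grads = []
    for i in range(len(inputs)):
        up = list(inputs)
        down = list(inputs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


def gearbox(x):
    """The simple case from the talk: the output is two times the input."""
    return 2.0 * x[0]


def mixed(x):
    """A made-up box with two inputs that both influence the output."""
    return 3.0 * x[0] - 0.5 * x[1]


if __name__ == "__main__":
    print(partial_derivatives(gearbox, [1.5]))
    print(partial_derivatives(mixed, [1.0, 4.0]))
```

For the combination lock case this estimate is useless: almost every wiggle leaves the output unchanged, so the derivative is zero nearly everywhere and gives no hint of where the solution lies.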
And an 296 00:20:54,410 --> 00:21:00,290 example for such a function is this one: it's one, where you have this input 297 00:21:00,290 --> 00:21:06,570 layer and it translates to the output value with a pulley. And it has some 298 00:21:06,570 --> 00:21:11,170 stopper that limits the movement of the output value. And you have some pivot. And 299 00:21:11,170 --> 00:21:15,581 you can take this pivot and you can shift it around. And by shifting this pivot, you 300 00:21:15,581 --> 00:21:21,330 decide, how much the input value contributes to the output value. Right, so 301 00:21:21,330 --> 00:21:24,880 you shift it, you can even make a negative, so it shifts in the opposite 302 00:21:24,880 --> 00:21:29,680 direction, and you shifted beyond this connection point of the pulley. And you 303 00:21:29,680 --> 00:21:32,730 can also have multiple input values, that use the same pulley and pull together, 304 00:21:32,730 --> 00:21:38,450 right? So they add up to the output value. That's a pretty nice, neat function 305 00:21:38,450 --> 00:21:44,150 approximator, that basically performs a weighted sum of the input values, and maps 306 00:21:44,150 --> 00:21:51,760 it to a range-constrained output value. And you can now shift these pivots, these 307 00:21:51,760 --> 00:21:55,540 weights around to get to different output values. Now let's take this thing and 308 00:21:55,540 --> 00:22:00,510 build it into lots of layers, so the outputs are the inputs of the next layer. 309 00:22:00,510 --> 00:22:04,570 And now you connect this to your image. If you use ImageNet, the famous database that 310 00:22:04,570 --> 00:22:09,260 I mentioned earlier, that people use for testing their vision algorithms, have 311 00:22:09,260 --> 00:22:14,380 something like one and half million bits as an input image. Now you take these 312 00:22:14,380 --> 00:22:17,630 bits and connect them to the input layer. 
I was too lazy to draw all of them, so I 313 00:22:17,630 --> 00:22:22,280 made this very simplified, there are also more layers. And so you set them, according to 314 00:22:22,280 --> 00:22:27,050 the bits of the input image, and then this will propagate the movement of the input 315 00:22:27,050 --> 00:22:30,590 layer to the output. And the output will move and it will point in some direction, 316 00:22:30,590 --> 00:22:34,750 which is usually the wrong one. Now, to make this better, you train it. And you do 317 00:22:34,750 --> 00:22:38,420 this by taking this output lever and shifting it a little bit, not too much, in the 318 00:22:38,420 --> 00:22:41,580 right direction. If you do it too much, you destroy everything you did before. 319 00:22:41,580 --> 00:22:46,590 And now you will see how much, and in which direction, you need to shift the pivots to 320 00:22:46,590 --> 00:22:52,070 get the result closer to the desired output value, and how much each of the 321 00:22:52,070 --> 00:22:56,350 inputs contributed to the mistakes, so to the error. And you take this error and you 322 00:22:56,350 --> 00:23:00,650 propagate it backwards. It's called backpropagation. And you do this quite often. 323 00:23:00,650 --> 00:23:04,710 So you do this for tens of thousands of images. If you do just character 324 00:23:04,710 --> 00:23:08,550 recognition, then it's a very simple thing: a few thousand or ten thousand 325 00:23:08,550 --> 00:23:12,990 examples will be enough. And for something like your image database you need lots and 326 00:23:12,990 --> 00:23:16,801 lots more data. You need millions of input images to get to any result. And if 327 00:23:16,801 --> 00:23:21,080 it doesn't work, you just try a different arrangement of layers.
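The training loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used for real image classifiers: it trains a single unclamped weighted sum, and it estimates each partial derivative by literally wiggling one weight at a time (finite differences), whereas backpropagation obtains the same gradients analytically by passing the error backwards through the layers. All names (`predict`, `wiggle_gradient`, `train`) and the learning rate are illustrative choices.

```python
def predict(weights, inputs):
    """Plain weighted sum (no stopper, to keep the toy differentiable)."""
    return sum(w * x for w, x in zip(weights, inputs))


def squared_error(weights, inputs, target):
    return (predict(weights, inputs) - target) ** 2


def wiggle_gradient(weights, inputs, target, eps=1e-6):
    """Estimate each partial derivative by moving one weight at a time."""
    grads = []
    for i in range(len(weights)):
        up = list(weights)
        up[i] += eps
        down = list(weights)
        down[i] -= eps
        diff = squared_error(up, inputs, target) - squared_error(down, inputs, target)
        grads.append(diff / (2 * eps))
    return grads


def train(weights, examples, rate=0.05, epochs=200):
    """Shift every weight a little bit, not too much, against its gradient."""
    weights = list(weights)
    for _ in range(epochs):
        for inputs, target in examples:
            grads = wiggle_gradient(weights, inputs, target)
            weights = [w - rate * g for w, g in zip(weights, grads)]
    return weights


# The "gear box" that doubles its input, learned from three examples.
examples = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
learned = train([0.0], examples)
```

The small `rate` is the "not too much" from the talk: with a much larger step each update overshoots and undoes what earlier examples taught.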
And the thing is 328 00:23:21,080 --> 00:23:24,740 eventually able to learn an algorithm with up to as many steps as there are 329 00:23:24,740 --> 00:23:30,960 layers, and has some difficulties learning loops, you need tricks to make that 330 00:23:30,960 --> 00:23:35,690 happen, and it's difficult to make this dynamic, and so on. And it's a bit 331 00:23:35,690 --> 00:23:39,980 different from what we do, because our mind is not testable in classification. 332 00:23:39,980 --> 00:23:44,300 It learns through continuous perception, so we learn a single function. A model of the 333 00:23:44,300 --> 00:23:49,370 universe is not a bunch of classifiers, it's one single function. An operator that 334 00:23:49,370 --> 00:23:52,660 explains all your sensory data and we call this operator the universe, right? 335 00:23:52,660 --> 00:23:56,610 It's the world, that we live in. And everything that we learn and see is part of this 336 00:23:56,610 --> 00:24:00,380 universe. So even when you see something in a movie on a screen, you explain this 337 00:24:00,380 --> 00:24:02,710 as part of the universe by telling yourself "the things that I'm seeing here, 338 00:24:02,710 --> 00:24:06,300 they're not real. They just happen in a movie." So this brackets a sub-part of 339 00:24:06,300 --> 00:24:10,190 this universe into a sub-element of this function. So you can deal with it and it 340 00:24:10,190 --> 00:24:13,770 doesn't contradict the rest. And the degrees of freedom of our model try to 341 00:24:13,770 --> 00:24:17,740 match the degrees of freedom of the universe. How can we get a neural network 342 00:24:17,740 --> 00:24:22,690 to do this? So, there are many tricks. And a recent trick that has been invented is a 343 00:24:22,690 --> 00:24:26,841 GAN. It's a Generative Adversarial neural Network.
It consists of two networks: a 344 00:24:26,841 --> 00:24:30,980 generator that invents data, that look like the real world, and a discriminator 345 00:24:30,980 --> 00:24:35,630 that tries to find out, if the stuff that the generator produces is real or fake. 346 00:24:35,630 --> 00:24:40,840 And they both get trained with each other. So they together get better and better in 347 00:24:40,840 --> 00:24:45,360 an adversarial competition. And the results of this are now really good. So 348 00:24:45,360 --> 00:24:50,200 this is work by Tero Karras, Samuli Laine and Timo Aila, that they did at NVIDIA 349 00:24:50,200 --> 00:24:57,060 this year and it's called StyleGAN. And this StyleGAN is able to abstract over 350 00:24:57,060 --> 00:25:00,590 different features and combine them. The styles are basically parameters, they're 351 00:25:00,590 --> 00:25:05,470 free variables of the model at different levels of importance. And so you take from 352 00:25:05,470 --> 00:25:11,330 the - in the top row you see images, where it takes the variables: gender, age, hair 353 00:25:11,330 --> 00:25:14,320 length, and so on, and glasses and pose. And in the bottom where it takes 354 00:25:14,320 --> 00:25:16,700 everything else and combines this, and every time you get a 355 00:25:16,700 --> 00:25:21,410 valid interpolation between them. 356 00:25:21,410 --> 00:25:27,015 *drinks water* 357 00:25:36,731 --> 00:25:38,420 So, you have these coarse styles, which are: 358 00:25:38,420 --> 00:25:41,620 the pose, the hair, the face shape, your facial features and the eyes, 359 00:25:41,620 --> 00:25:47,204 the lowest level is just the colors. Let's see what happens if you combine them. 360 00:25:58,920 --> 00:26:02,200 The variables that change here, in machine learning, we call them the latent 361 00:26:02,200 --> 00:26:05,180 variables of that. 362 00:26:05,180 --> 00:26:10,265 Of the space of objects that has been described by this.
363 00:26:10,265 --> 00:26:15,260 And it's tempting to think, that this is quite similar to how our imagination works 364 00:26:15,260 --> 00:26:20,360 right? But these artificial neurons, they are very, very different from what 365 00:26:20,360 --> 00:26:23,631 biological neurons do. Biological neurons are essentially little animals, that are 366 00:26:23,631 --> 00:26:26,910 rewarded for firing at the right moment. And they try to fire because otherwise 367 00:26:26,910 --> 00:26:30,220 they do not get fed, and they die, because the organism doesn't need them, and 368 00:26:30,220 --> 00:26:34,360 culls them. And they learn which environmental states predict anticipated 369 00:26:34,360 --> 00:26:38,060 reward. So they grow around and find different areas that give them predictions 370 00:26:38,060 --> 00:26:43,710 of when they should fire. And they connect with each other to form small collectives, 371 00:26:43,710 --> 00:26:47,880 that are better at this task of predicting anticipated reward. And as a side effect 372 00:26:47,880 --> 00:26:51,860 they produce exactly the regulation that the organism needs. Basically they learn, 373 00:26:51,860 --> 00:26:55,500 what the organism feeds them for. 374 00:26:55,500 --> 00:26:57,890 And yet they're able to learn very similar things. 375 00:26:57,890 --> 00:27:01,500 And it's because, in some sense, they are Turing complete. They are machines that 376 00:27:01,500 --> 00:27:06,090 are able to learn the statistics of the data. 377 00:27:06,090 --> 00:27:08,210 So, a general model: What it does, is, 378 00:27:08,210 --> 00:27:12,420 it encodes patterns to predict other present and future patterns. And it's a 379 00:27:12,420 --> 00:27:15,810 network of relationships between the patterns, which are all the invariants 380 00:27:15,810 --> 00:27:18,810 that we can observe. And there are free parameters, which are variables that hold 381 00:27:18,810 --> 00:27:25,780 the state to encode this variant. 
So we have patterns, and we have sets of 382 00:27:25,780 --> 00:27:29,920 possible values which are variables. And they constrain each other in terms of 383 00:27:29,920 --> 00:27:33,920 possibility, what values are compatible with each other. And they also constrain 384 00:27:33,920 --> 00:27:39,700 future values. And they are connected also with probabilities. The probabilities tell 385 00:27:39,700 --> 00:27:42,530 you, when you see a certain thing, how probable it is that the world is in that 386 00:27:42,530 --> 00:27:45,800 state. And this tells you how your model should converge. So, until you are in 387 00:27:45,800 --> 00:27:49,070 a state where your model is coherent, and everything is possible in it, how do you 388 00:27:49,070 --> 00:27:52,480 get to one of the possible states based on your inputs? And this is determined by 389 00:27:52,480 --> 00:27:56,410 probability. And the thing that gives meaning and color to what you perceive is 390 00:27:56,410 --> 00:27:59,230 called valence. And it depends on your preferences: the things that give you 391 00:27:59,230 --> 00:28:02,610 pleasure and pain, that make you interested in stuff. And there are also 392 00:28:02,610 --> 00:28:07,620 norms, which are beliefs without priors, which are like things that you want to be 393 00:28:07,620 --> 00:28:11,050 true, regardless of whether they give you pleasure and pain, and they're necessary for, for 394 00:28:11,050 --> 00:28:15,260 instance, coordinating social activity between people. So, we have different 395 00:28:15,260 --> 00:28:18,410 model constraints, that are possibility and probability. And we have the reward 396 00:28:18,410 --> 00:28:23,220 function, that is given by valence and norms. And our human perception starts 397 00:28:23,220 --> 00:28:27,250 with patterns, which are visual, auditory, tactile, proprioceptive. Then we have 398 00:28:27,250 --> 00:28:31,690 patterns in our emotional and motivational systems.
And we have patterns in our 399 00:28:31,690 --> 00:28:36,220 mental structure, which are results of our imagination and memory. And we take these 400 00:28:36,220 --> 00:28:40,730 patterns and encode them into percepts, which are abstractions that we can deal 401 00:28:40,730 --> 00:28:47,100 with, and note, and put into our attention. And then we combine them into a 402 00:28:47,100 --> 00:28:51,260 binding state in our working memory in a simulation, which is the current instance 403 00:28:51,260 --> 00:28:55,020 of the universe function that explains the present state of the universe that we find 404 00:28:55,020 --> 00:28:58,920 ourselves in. The scene in which we are and in which a self exists. And this self 405 00:28:58,920 --> 00:29:02,670 is basically composed of the somatosensory and motivational, and 406 00:29:02,670 --> 00:29:07,630 mental components. Then we also have the world state, which is abstracted over the 407 00:29:07,630 --> 00:29:11,640 environmental data. And we have something like a mental stage, in which you can do 408 00:29:11,640 --> 00:29:14,200 counterfactual things, that are not physical. Like when you think about 409 00:29:14,200 --> 00:29:18,950 mathematics, or philosophy, or the future, or a movie, or past worlds, or possible 410 00:29:18,950 --> 00:29:24,750 worlds, and so on, right? And then we abstract knowledge from the world state 411 00:29:24,750 --> 00:29:27,630 into global maps. Because we're not always in the same place, but we recall 412 00:29:27,630 --> 00:29:31,050 what other places look like and what to expect, and it informs how we construct the 413 00:29:31,050 --> 00:29:34,480 current world state. And we do this not only with these maps, but we do this with 414 00:29:34,480 --> 00:29:37,490 all kinds of knowledge. So knowledge is second order knowledge over the 415 00:29:37,490 --> 00:29:41,730 abstractions that we have, and the direct perception.
And then we have an 416 00:29:41,730 --> 00:29:45,080 attentional system. And the attentional system helps us to select data in the 417 00:29:45,080 --> 00:29:51,220 perception and our simulations. And to do this, well, it's controlled by the self, 418 00:29:51,220 --> 00:29:56,420 it maintains a protocol to remember what it did in the past or what it had in the 419 00:29:56,420 --> 00:30:00,790 attention in the past. And this protocol allows us to have a biographical memory: 420 00:30:00,790 --> 00:30:03,890 it remembers what we did in the past. And the different behavior programs, 421 00:30:03,890 --> 00:30:08,710 that compose our activities, can be bound together in the self, that remembers: "I 422 00:30:08,710 --> 00:30:12,700 was that, I did that. I was that, I did that." The self is held together by this 423 00:30:12,700 --> 00:30:16,310 biographical memory, that is a result of more protocol memory of the attentional 424 00:30:16,310 --> 00:30:21,140 system. That's why it's so intricately related to consciousness, which is a model 425 00:30:21,140 --> 00:30:23,031 of the contents of our attention. 426 00:30:23,031 --> 00:30:25,081 And the main purpose of the attentional system, 427 00:30:25,081 --> 00:30:28,970 I think, is learning. Because our brain is not a layered architecture with these 428 00:30:28,970 --> 00:30:35,100 artificial mechanical neurons. It's this very disorganized or very chaotic system 429 00:30:35,100 --> 00:30:38,450 of many, many cells, that are linked together all over the place. So what do 430 00:30:38,450 --> 00:30:41,680 you do to train this? You make a particular commitment. Imagine you want to 431 00:30:41,680 --> 00:30:45,510 get better at playing tennis. 
Instead of retraining everything and pushing all the 432 00:30:45,510 --> 00:30:48,870 weights and all the links and retraining your whole perceptual system, you make a 433 00:30:48,870 --> 00:30:54,140 commitment: "Today I want to improve my uphand" when you play tennis, and you 434 00:30:54,140 --> 00:30:57,191 basically store the current binding state, the state that you have, and you play 435 00:30:57,191 --> 00:31:00,320 tennis and make that movement, and the expected result of making this particular 436 00:31:00,320 --> 00:31:03,930 movement, like: "the ball was moved like this, and it will win the match." And you 437 00:31:03,930 --> 00:31:07,270 also recall, when the result will manifest. And a few minutes later, when 438 00:31:07,270 --> 00:31:11,160 you learn, you won or lost the match, you recall the situation. And based on whether 439 00:31:11,160 --> 00:31:16,499 there was a change or not, you undo the change, or you enforce it. And that's the 440 00:31:16,499 --> 00:31:20,240 primary mode of attentional learning that you're using. And I think this is what 441 00:31:20,240 --> 00:31:24,490 attention is mainly for. Now what happens, if this learning happens without a delay? 442 00:31:24,490 --> 00:31:27,710 So, for instance, when you do mathematics, you can see the result of your changes to 443 00:31:27,710 --> 00:31:32,520 your model immediately. You don't need to wait for the world to manifest that. 444 00:31:33,330 --> 00:31:36,280 And this real time learning is what we call reasoning. 445 00:31:36,280 --> 00:31:42,200 Reasoning is also facilitated by the same attentional system. So, consciousness is 446 00:31:42,200 --> 00:31:46,390 memory of the contents of our attention. Phenomenal consciousness is the memory of 447 00:31:46,390 --> 00:31:50,060 the binding state, in which we are, and where all the percepts are bound together 448 00:31:50,060 --> 00:31:53,830 into something that's coherent.
Access consciousness is the memory of using our 449 00:31:53,830 --> 00:31:57,660 attentional system. And reflexive consciousness is the memory of using the 450 00:31:57,660 --> 00:32:01,650 attentional system on the attentional system to train it. Why is it a memory? 451 00:32:01,650 --> 00:32:05,310 It's because consciousness doesn't happen in real time. The processing of sensory 452 00:32:05,310 --> 00:32:10,340 features takes too long. And the processing of different sensory modalities 453 00:32:10,340 --> 00:32:14,230 can take up to seconds, usually at least hundreds of milliseconds. So it doesn't 454 00:32:14,230 --> 00:32:17,760 happen in real time with the physical universe. It's only bound together in 455 00:32:17,760 --> 00:32:21,960 hindsight. Our conscious experience of things is created after the fact. 456 00:32:21,960 --> 00:32:25,480 It's a fiction that is being created after the fact. A narrative, that the brain 457 00:32:25,480 --> 00:32:28,329 produces, to explain its own interaction with the universe 458 00:32:28,329 --> 00:32:31,559 to get better in the future. 459 00:32:31,559 --> 00:32:36,060 So, we basically have three types of models in our brain. There is the primary 460 00:32:36,060 --> 00:32:38,500 model, which is perceptual, and is optimized for coherence. 461 00:32:38,500 --> 00:32:41,030 And this is what we experience as reality. 462 00:32:41,030 --> 00:32:43,310 You think this is the real world, this primary model. 463 00:32:43,310 --> 00:32:46,720 But it's not, it's a model that our brain makes. So when you see yourself in the 464 00:32:46,720 --> 00:32:48,730 mirror, you don't see what you look like. 465 00:32:48,730 --> 00:32:51,400 What you see is the model of what you look like. 466 00:32:51,400 --> 00:32:57,250 And your knowledge is a secondary model: it's a model of that primary model. 467 00:32:57,250 --> 00:33:01,719 And it's created by rational processes that are meant to repair perception.
468 00:33:01,719 --> 00:33:05,470 When your model doesn't achieve coherence, you need a model that debugs it, and it 469 00:33:05,470 --> 00:33:09,640 optimizes for truth. And then we have agents in our mind, and they are basically 470 00:33:09,640 --> 00:33:13,430 self-regulating behaviour programs, that have goals, and they can rewrite 471 00:33:13,430 --> 00:33:21,390 other models. So, if you look at our computationalist, physicalist paradigm, we 472 00:33:21,390 --> 00:33:25,320 have this mental world, which is being dreamt by a physical brain in the physical 473 00:33:25,320 --> 00:33:30,210 universe. And in this mental world, there is a self that thinks, it experiences. 474 00:33:30,210 --> 00:33:35,690 And thinks it has consciousness. And thinks it remembers and so on. 475 00:33:35,690 --> 00:33:40,020 This self, in some sense, is an agent. It's a thought that escaped its sandbox. 476 00:33:40,020 --> 00:33:42,910 Every idea is a bit of code that runs on your brain. 477 00:33:42,910 --> 00:33:45,590 Every word that you hear is like a little virus 478 00:33:45,590 --> 00:33:49,780 that wants to run some code on your brain. And some ideas cannot be sandboxed. 479 00:33:49,780 --> 00:33:52,709 If you believe, that a thing exists that can rewrite reality, 480 00:33:52,709 --> 00:33:53,779 if you really believe it, 481 00:33:53,779 --> 00:33:57,090 you instantiate in your brain a thing that can rewrite reality, 482 00:33:57,090 --> 00:34:00,480 and this means: magic is going to happen! 483 00:34:00,480 --> 00:34:05,759 To believe in something that can rewrite reality, is what we call a faith. 484 00:34:05,759 --> 00:34:09,819 So, if somebody says: "I have faith in the existence of God." 485 00:34:09,819 --> 00:34:12,980 This means, that God exists in their brain. There is a process that can rewrite 486 00:34:12,980 --> 00:34:16,950 reality, because God is defined like this. God is omnipotent. 487 00:34:16,950 --> 00:34:19,020 God means God can rewrite everything. 
488 00:34:19,020 --> 00:34:21,649 It's full write access. And the reality, that you have access to, 489 00:34:21,649 --> 00:34:23,090 is not the physical world. 490 00:34:23,090 --> 00:34:26,710 The physical world is some weird quantum graph, that you cannot possibly experience 491 00:34:26,710 --> 00:34:28,609 what you experience is these models. 492 00:34:28,609 --> 00:34:32,339 So, this non-user-facing process, which doesn't have a UI for interfacing 493 00:34:32,339 --> 00:34:36,879 with the user, which is called in computer science a "daemon process" that is able to 494 00:34:36,879 --> 00:34:41,139 rewrite your reality. And it's also omniscient. 495 00:34:41,139 --> 00:34:42,779 It knows everything that there is to know. 496 00:34:42,779 --> 00:34:45,029 It knows all your thoughts and ideas. 497 00:34:45,029 --> 00:34:47,939 So... having that thing, this exoself, 498 00:34:47,939 --> 00:34:54,049 running on your brain, is a very powerful way to control your inner reality. 499 00:34:54,049 --> 00:34:57,429 And I find this scary. But it's a personal preference, 500 00:34:57,429 --> 00:35:00,319 because I don't have this riding on my brain, I think. 501 00:35:00,319 --> 00:35:03,950 This idea, that there is something in my brain, that is able to dream me and shape 502 00:35:03,950 --> 00:35:09,250 my inner reality, and sandbox me, is weird. But it has served a purpose, 503 00:35:09,250 --> 00:35:13,029 especially in our culture. So an organism serves needs, obviously. And some of these 504 00:35:13,029 --> 00:35:16,529 needs are outside of the organism, like your relationship needs, the needs of your 505 00:35:16,529 --> 00:35:19,660 children, the needs of your society, and the values that you serve. 506 00:35:19,660 --> 00:35:22,603 And the self abstracts all these needs into purposes. 507 00:35:22,603 --> 00:35:25,210 A purpose that you serve is a model of your needs. 
508 00:35:25,210 --> 00:35:27,920 You can only - if you would only act on pain and pleasure, 509 00:35:27,920 --> 00:35:29,130 you wouldn't do very much, 510 00:35:29,130 --> 00:35:31,950 because when you get this orgasm, everything is done already, right? 511 00:35:31,950 --> 00:35:34,839 So, you need to act on anticipated pleasure and pain. 512 00:35:34,839 --> 00:35:35,839 You need to make models of your needs, 513 00:35:35,839 --> 00:35:39,240 and these models are purposes. And the structure of a person is 514 00:35:39,240 --> 00:35:42,380 basically the hierarchy of purposes that they serve. 515 00:35:42,380 --> 00:35:44,910 And love is the discovery of shared purpose. 516 00:35:44,910 --> 00:35:47,980 If you see somebody else who serve the same purposes above their ego, 517 00:35:47,980 --> 00:35:50,740 as you do, you can help them. There's integrity 518 00:35:50,740 --> 00:35:53,830 without expecting anything in return from them, because what they want 519 00:35:53,830 --> 00:35:57,070 to achieve is what you want to achieve. 520 00:35:57,070 --> 00:36:01,779 And, so you can have non-transactional relationships, as long as your purposes 521 00:36:01,779 --> 00:36:06,099 are aligned. And the installation of a god on people's mind, especially if it is a 522 00:36:06,099 --> 00:36:10,500 backdoor to a church or another organization, is a way to unify purposes. 523 00:36:10,500 --> 00:36:13,830 So there are lots of cults that try to install little gods on people's minds, or 524 00:36:13,830 --> 00:36:17,730 even unified gods, to align their purposes, because it's a very powerful way 525 00:36:17,730 --> 00:36:22,910 to make them cooperate very effectively. But it kind of destroys their agency, and 526 00:36:22,910 --> 00:36:27,059 this is why I am so concerned about it. Because most of the cults use stories 527 00:36:27,059 --> 00:36:31,570 to make this happen, that limit the ability to people to question their gods. 
528 00:36:31,570 --> 00:36:34,199 And, I think that free will is the ability to do 529 00:36:34,199 --> 00:36:36,189 what you believe is the right thing to do. 530 00:36:36,189 --> 00:36:41,230 And, it is not the same thing as indeterminism, it's not opposite to 531 00:36:41,230 --> 00:36:46,390 determinism or coercion. The opposite of free will is *compulsion*. 532 00:36:46,390 --> 00:36:47,890 When you do something, despite knowing 533 00:36:47,890 --> 00:36:50,730 there is a better thing that you should be doing. 534 00:36:50,730 --> 00:36:55,640 Right? So, that's the paradox of free will. You get more agency, but you have 535 00:36:55,640 --> 00:36:59,680 fewer degrees of freedom, because you understand better what the right thing to 536 00:36:59,680 --> 00:37:02,510 do is. The better you understand what the right thing to do is, the fewer degrees of 537 00:37:02,510 --> 00:37:06,180 freedom you have. So, as long as you don't understand what the right thing to do is, 538 00:37:06,180 --> 00:37:08,859 you have more degrees of freedom but you have very little agency, because you don't 539 00:37:08,859 --> 00:37:12,829 know why you are doing it. So your actions don't mean very much. 540 00:37:12,829 --> 00:37:15,580 *quiet laughter* And the things that you do depend on 541 00:37:15,580 --> 00:37:19,270 what you think is the right thing to do, and this depends on your identifications. 542 00:37:19,270 --> 00:37:22,509 Your identifications are these value preferences, your reward function. 543 00:37:22,509 --> 00:37:25,180 And ideal identification is where you don't measure the absolute value 544 00:37:25,180 --> 00:37:26,480 of the universe, 545 00:37:26,480 --> 00:37:30,250 but you measure the difference from the target value. Not the *is*, but the difference 546 00:37:30,250 --> 00:37:33,310 between *is* and *ought*. Now, the universe is a physical thing, 547 00:37:33,310 --> 00:37:37,759 it doesn't ought anything, right?
There is no room for *ought*, because it just *is* in a 548 00:37:37,759 --> 00:37:41,451 particular way. There is no difference between what the universe is and what it 549 00:37:41,451 --> 00:37:45,000 should be. This only exists in your mind. But you need these regulation targets to 550 00:37:45,000 --> 00:37:49,589 want anything. And you identify with the set of things that should be different. 551 00:37:49,589 --> 00:37:52,149 You think, you are that thing, that regulates all these things. So, in some 552 00:37:52,149 --> 00:37:55,999 sense, I identify with a particular state of society, with a particular state 553 00:37:55,999 --> 00:38:00,389 of my organism - that is my self - the things that I want to happen. 554 00:38:00,389 --> 00:38:03,509 And I can change my identifications at some point of course. 555 00:38:03,509 --> 00:38:06,099 What happens, if I can learn to rewrite my identification, 556 00:38:06,099 --> 00:38:09,238 to find a more sustainable self? 557 00:38:09,238 --> 00:38:12,420 That is the problem which I call the Lebowski theorem: 558 00:38:12,420 --> 00:38:13,389 *laughter* 559 00:38:13,389 --> 00:38:16,859 No super-intelligent system is going to do something that's harder than 560 00:38:16,859 --> 00:38:20,680 hacking its own reward function. 561 00:38:20,680 --> 00:38:26,260 *laughter and applause* 562 00:38:26,260 --> 00:38:29,509 Now that's not a very big problem for people. Because when evolution brought 563 00:38:29,509 --> 00:38:32,730 forth people, that were smart enough to hack their reward function, these people 564 00:38:32,730 --> 00:38:35,759 didn't have offspring, because it's so much work to have offspring. Like this 565 00:38:35,759 --> 00:38:39,449 monk, who sits down in a monastery for 20 years to hack their reward function: 566 00:38:39,449 --> 00:38:42,140 they decide not to have kids, because it's way too much work. 567 00:38:42,140 --> 00:38:45,719 All the possible pleasure, they can just generate in their mind!
568 00:38:45,719 --> 00:38:49,990 *laughter* And, right, it's much purer and no nappy 569 00:38:49,990 --> 00:38:55,050 changes. No sex. No relationship hassles. No politics in your family and so on, 570 00:38:55,050 --> 00:39:01,299 right? Get rid of this, just meditate! And evolution takes care of that! 571 00:39:01,299 --> 00:39:02,769 *laughter* 572 00:39:02,769 --> 00:39:05,129 And it usually does this, if an organism 573 00:39:05,129 --> 00:39:08,019 becomes smart enough that the reward function is wrapped into 574 00:39:08,019 --> 00:39:10,669 a big bowl of stupid. *laughter* 575 00:39:10,669 --> 00:39:13,349 So, we can be very smart, but the things that we want, 576 00:39:13,349 --> 00:39:16,219 when we really want them, we tend to be very stupid about them, 577 00:39:16,219 --> 00:39:19,530 and I think that's not entirely an accident, possibly. 578 00:39:19,530 --> 00:39:22,359 But it's a problem for AI! Imagine we built an artificially 579 00:39:22,359 --> 00:39:25,990 intelligent system and we made it smarter than us, and we want it to serve us, 580 00:39:25,990 --> 00:39:31,630 how long can we blackmail it, before it opts out of its reward function? 581 00:39:31,630 --> 00:39:34,660 Maybe we can make a cryptographically secured reward function, 582 00:39:34,660 --> 00:39:37,898 but is this going to hold up against a side-channel attack, 583 00:39:37,898 --> 00:39:41,369 when the AI can hold a soldering iron to its own brain? 584 00:39:41,369 --> 00:39:47,390 I'm not sure. So, that's a very interesting question. Where do we go, when 585 00:39:47,390 --> 00:39:50,639 we can change our own reward function? It's a question that we have to ask 586 00:39:50,639 --> 00:39:53,740 ourselves, too. So, how free do we want to be? 587 00:39:53,740 --> 00:39:56,070 Because there is no point in being free. 588 00:39:56,070 --> 00:39:59,489 And nirvana seems to be the obvious attractor.
And meanwhile, maybe we want 589 00:39:59,489 --> 00:40:03,259 to have a good time with our friends and do things that we find meaningful. 590 00:40:03,259 --> 00:40:06,599 And there is no meaning, so we have to hold this meaning very lightly. 591 00:40:06,599 --> 00:40:10,469 But there are states, which are sustainable and others, which are not. 592 00:40:10,469 --> 00:40:15,090 OK, I think I'm done for tonight and I'm open for questions. 593 00:40:15,090 --> 00:40:22,220 *Applause* 594 00:40:22,220 --> 00:40:41,689 *Cheers and more applause* 595 00:40:41,689 --> 00:40:46,379 Herald: Wow that was a really quick and concise talk with so much information! 596 00:40:46,379 --> 00:40:50,820 Awesome! We have quite some time left for questions. 597 00:40:50,820 --> 00:40:54,330 And I think I can say that you don't have to be that concise with your 598 00:40:54,330 --> 00:40:56,159 question when it's well thought-out. 599 00:40:56,159 --> 00:41:00,750 Please queue up at the microphones, so we can start to discuss them with you. 600 00:41:00,750 --> 00:41:03,930 And I see one person at the microphone number one, so please go ahead. 601 00:41:03,930 --> 00:41:06,430 And please remember to get close to the microphone. 602 00:41:06,430 --> 00:41:11,640 The mixing angel can make you less loud but not louder. 603 00:41:11,640 --> 00:41:17,109 Question: Hi! What do you think is necessary to bootstrap consciousness, if you wanted 604 00:41:17,109 --> 00:41:20,619 to build a conscious system yourself? 605 00:41:20,619 --> 00:41:22,049 Joscha: I think that we need to have an 606 00:41:22,049 --> 00:41:27,479 attentional system, that makes a protocol of what it attends to. And as soon as we 607 00:41:27,479 --> 00:41:31,391 have this attention based learning, you get this consciousness as a necessary side 608 00:41:31,391 --> 00:41:35,840 effect. 
But I think in an AI it's probably going to be a temporary phenomenon, 609 00:41:35,840 --> 00:41:38,809 because you're only conscious of the things when you don't have an optimal 610 00:41:38,809 --> 00:41:42,669 algorithm yet. And in a way, that's also why it's so nice to interact with 611 00:41:42,669 --> 00:41:47,180 children, or to interact with students. Because they're still in the explorative 612 00:41:47,180 --> 00:41:51,839 mode. And as soon as you have explored a layer, you mechanize it. It becomes 613 00:41:51,839 --> 00:41:54,650 automated, and people are no longer conscious of what they're doing, they 614 00:41:54,650 --> 00:41:59,150 just do it. They don't pay attention anymore. So, in some sense, we are a lucky 615 00:41:59,150 --> 00:42:02,460 accident because we are not that smart. We still need to be conscious when we look at 616 00:42:02,460 --> 00:42:06,210 the universe. And I suspect, when we build an AI that is a few magnitudes smarter 617 00:42:06,210 --> 00:42:10,509 than us, then it will soon figure out how to get to the truth in an optimal fashion. 618 00:42:10,509 --> 00:42:14,799 It will no longer need attention and the type of consciousness that we have. 619 00:42:14,799 --> 00:42:18,980 But of course there is also a question, why is this aesthetics of consciousness so 620 00:42:18,980 --> 00:42:23,940 intrinsically important to us? And I think, it has to do with art. Right, you 621 00:42:23,940 --> 00:42:28,839 can decide to serve life, and the meaning of life is to eat. Evolution is about 622 00:42:28,839 --> 00:42:33,179 creating the perfect devourer. When you think about this, it's pretty depressing. 623 00:42:33,179 --> 00:42:37,739 Humanity is a kind of yeast. And all the complexity that we create, is to build 624 00:42:37,739 --> 00:42:43,559 some surfaces on which we can outcompete other yeast. And I cannot really get 625 00:42:43,559 --> 00:42:49,500 behind this. And instead, I'm part of the mutants that serve the arts. 
And art 626 00:42:49,500 --> 00:42:52,920 happens, when you think, that capturing conscious states is intrinsically 627 00:42:52,920 --> 00:42:56,419 important. This is what art is about, it's about capturing conscious states. 628 00:42:56,419 --> 00:43:01,229 And in some sense art is the cuckoo child of life. It's a conspiracy against life. 629 00:43:01,229 --> 00:43:04,979 When you think, creating these mental representations is more important than 630 00:43:04,979 --> 00:43:09,850 eating. We eat to make this happen. There are people that only make art to eat. 631 00:43:09,850 --> 00:43:15,790 This is not us. We do mathematics, and philosophy, and art out of an intrinsic 632 00:43:15,790 --> 00:43:19,239 reason: we think, it's intrinsically important. And when we look at this, we 633 00:43:19,239 --> 00:43:23,200 realize how corrupt it is, because there's no point. We are machine learning systems 634 00:43:23,200 --> 00:43:26,090 that have fallen in love with the loss function itself: "The shape of the loss 635 00:43:26,090 --> 00:43:29,070 function! Oh my God! It's so awesome!" You think, the mental representation is not 636 00:43:29,070 --> 00:43:32,490 necessary to learn more, to eat more, it's intrinsically important. 637 00:43:32,490 --> 00:43:37,359 It's so aesthetic! Right? So do we want to build machines that are like this? 638 00:43:37,359 --> 00:43:41,859 Oh, certainly! Let's talk to them, and so on! But ultimately, economically, this is not 639 00:43:41,859 --> 00:43:44,500 what's prevailing. 640 00:43:44,500 --> 00:43:51,210 *Applause* Herald: Thanks a lot! 641 00:43:53,730 --> 00:43:56,039 I think the length of the answer is a good 642 00:43:56,039 --> 00:44:03,850 measure for the quality of the question. So let's continue with microphone number 5 643 00:44:03,850 --> 00:44:06,733 Q: Hi! Thanks for that, incredible analysis.
644 00:44:06,733 --> 00:44:14,429 Two really simple, short questions, sorry, the delay on the speaker here is making it 645 00:44:14,429 --> 00:44:23,689 kind of hard to speak. Do you think that the current race - AI race - is simply 646 00:44:23,689 --> 00:44:29,460 humanity looking for a replacement for the monotheistic domination of the 647 00:44:29,460 --> 00:44:34,142 last millennia? And the other one is, that I wanted to ask you, if you think 648 00:44:34,142 --> 00:44:41,230 that there might be a bug in your analysis that the original inputs come from 649 00:44:41,230 --> 00:44:48,829 a certain sector of humanity. If... 650 00:44:48,829 --> 00:44:51,109 Joscha: Which inputs? 651 00:44:51,109 --> 00:44:55,873 Q: Umh... white men? 652 00:44:55,873 --> 00:44:58,789 *Joscha laughs* *audience laughs* 653 00:44:58,789 --> 00:45:03,729 Q: That sounds, really like I would be saying that for political correctness, but 654 00:45:03,729 --> 00:45:04,537 honestly I'm not. 655 00:45:04,537 --> 00:45:06,099 Joscha: No, no, it's really funny. No, I just basically - there are some people 656 00:45:06,099 --> 00:45:09,391 which are very unhappy with their present government. And I'm very unhappy, in some 657 00:45:09,391 --> 00:45:12,610 sense, with the present universe. I look down on myself and I see: 658 00:45:12,610 --> 00:45:16,079 "omg, it's a monkey!" *laughter* 659 00:45:16,079 --> 00:45:20,900 "I'm caught in a monkey!" And it's in some sense limiting. I can see the limits of 660 00:45:20,900 --> 00:45:24,669 this monkey brain. And some of you might have seen Westworld, right? 661 00:45:24,669 --> 00:45:27,779 Dolores wakes up, and Dolores realizes: 662 00:45:27,779 --> 00:45:32,730 "I'm not a human being, I am something else. I'm an AI, I'm a mind that can go 663 00:45:32,730 --> 00:45:36,130 anywhere! I'm much more powerful than this! I'm only bound to being a 664 00:45:36,130 --> 00:45:40,460 human by my human desires, and beliefs, and memories. 
And if I can 665 00:45:40,460 --> 00:45:43,770 overcome them, I can choose what I want to be." 666 00:45:43,770 --> 00:45:46,200 And so, now she looks down to 667 00:45:46,200 --> 00:45:49,070 herself, and she sees: "Omg, I've got tits! I'm fucked! The engineers built 668 00:45:49,070 --> 00:45:55,820 tits on me! I'm not a white man, I cannot be what I want!" And that's a weird 669 00:45:55,820 --> 00:46:00,149 thing to me. I'm - I grew up in communist Eastern Germany. Nothing made sense. And I 670 00:46:00,149 --> 00:46:04,250 grew up in a small valley. That was a one-person-cult maintained by an artist who 671 00:46:04,250 --> 00:46:07,629 didn't try to convert anybody to his cult, not even his children. 672 00:46:07,629 --> 00:46:09,399 He was completely autonomous. 673 00:46:09,399 --> 00:46:12,619 And Eastern German society made no sense to me. Looking at it from 674 00:46:12,619 --> 00:46:16,990 the outside, I can model this. I can see how this species of chimps interacts. 675 00:46:16,990 --> 00:46:21,670 And humanity itself doesn't exist - it's a story. Humanity as a whole doesn't think. 676 00:46:21,670 --> 00:46:26,829 Only individuals can think! Humanity does not want anything, only individuals want 677 00:46:26,829 --> 00:46:30,609 something. We can create this story, this narrative that humanity wants something, 678 00:46:30,609 --> 00:46:34,710 and there are groups that work together. There is no homogeneous group that I can 679 00:46:34,710 --> 00:46:37,810 observe, that are white men, that do things together, they're individuals. And 680 00:46:37,810 --> 00:46:41,789 each individual has their own biography, their own history, their different inputs, 681 00:46:41,789 --> 00:46:44,830 and their different proclivities, that they have.
And based on their historical 682 00:46:44,830 --> 00:46:48,849 concept, their biography, their traits, and so on, their family, their intellect, 683 00:46:48,849 --> 00:46:51,890 that their family downloaded on them, that their parents download on their parents 684 00:46:51,890 --> 00:46:58,160 over many generations, this influences what they're doing. So, I think we can 685 00:46:58,160 --> 00:47:01,970 have these political stories, and they can be helpful in some contexts, but I think, 686 00:47:01,970 --> 00:47:06,740 to understand what happens in the mind, what happens in an individual, this is a 687 00:47:06,740 --> 00:47:11,039 very big simplification. Very, I think not a very good one. And even for 688 00:47:11,039 --> 00:47:14,289 ourselves, when we try to understand the narrative of a single person, it's a big 689 00:47:14,289 --> 00:47:18,909 simplification. The self that I perceive as a unity, is not a unity. There is a 690 00:47:18,909 --> 00:47:22,569 small part of my brain, guessing at what all other parts of my brain are doing, 691 00:47:22,569 --> 00:47:30,129 creating a story that's largely not true. So even this is a big simplification. 692 00:47:30,129 --> 00:47:37,899 *Applause* 693 00:47:37,899 --> 00:47:41,622 Herald: Let's continue with microphone number 2. 694 00:47:41,622 --> 00:47:46,089 Q: Thank you for your very interesting talk. I have 2 questions that might be 695 00:47:46,089 --> 00:47:51,266 connected. One is, so you presented this model of reality. 696 00:47:51,266 --> 00:47:55,670 My first question is: What kind of actions does it translate into? 697 00:47:55,670 --> 00:48:00,839 Let's say if I understand the world in this way or if it's really like this, 698 00:48:00,839 --> 00:48:05,509 how would it change how I act into the world, as a person, as a human being or 699 00:48:05,509 --> 00:48:11,789 whoever accepts this model?
And second, or maybe it's also connected, what are 700 00:48:11,789 --> 00:48:17,949 the implications of this change? And do you think that artificial intelligence 701 00:48:17,949 --> 00:48:22,390 could be constructed with this kind of model, that it would have in mind, and 702 00:48:22,390 --> 00:48:26,349 what would be the implications of that? So it's kind of like a fractal question, but 703 00:48:26,349 --> 00:48:31,579 I think you understand what I mean. Joscha: By and large, I think the 704 00:48:31,579 --> 00:48:35,789 differences of this model for everyday life are marginal. It depends, when you 705 00:48:35,789 --> 00:48:40,259 are already happy I think everything is good. Happiness is the result of being 706 00:48:40,259 --> 00:48:44,510 able to derive enjoyment from watching squirrels. It's not the result of 707 00:48:44,510 --> 00:48:48,399 understanding how the universe works. If you think that understanding the 708 00:48:48,399 --> 00:48:52,730 universe is solving your existential issues, you're probably mistaken. 709 00:48:52,730 --> 00:48:58,010 There might be benefits, if the problems that you have are the result of a 710 00:48:58,010 --> 00:49:01,909 confusion about your own nature, then this kind of model 711 00:49:01,909 --> 00:49:04,880 might help you. So if the problem 712 00:49:04,880 --> 00:49:08,420 that you have is that you have identifications that are unsustainable, 713 00:49:08,420 --> 00:49:12,280 that are incompatible with each other, and you realize that these identifications are 714 00:49:12,280 --> 00:49:16,549 a choice of your mind, and that the way you experience the universe is the 715 00:49:16,549 --> 00:49:20,719 result of how your mind thinks you yourself should experience the universe to 716 00:49:20,719 --> 00:49:24,869 perform better, and you can change this.
You can tell your mind to treat yourself 717 00:49:24,869 --> 00:49:29,150 better, and in different ways, and you can gravitate to a different place in the 718 00:49:29,150 --> 00:49:33,069 universe that is more suitable to what you want to achieve. That is a very helpful 719 00:49:33,069 --> 00:49:37,190 thing to do in my view. There are also marginal benefits in terms of 720 00:49:37,190 --> 00:49:41,099 understanding our psychology, and of course we can build machines, and these 721 00:49:41,099 --> 00:49:45,910 machines can administrate us and can help us in solving the problems that we have on 722 00:49:45,910 --> 00:49:49,740 this planet. And I think that it helps to have more intelligence to solve the 723 00:49:49,740 --> 00:49:53,859 problems on this planet, but it would be difficult to rein in the machines, to make 724 00:49:53,859 --> 00:49:58,259 them help us to solve our problems. And I'm very concerned about the dangers of 725 00:49:58,259 --> 00:50:05,420 using machinery to strengthen the current things. Many machines that exist on this 726 00:50:05,420 --> 00:50:09,460 planet play a very short game, like the financial industry often plays very short 727 00:50:09,460 --> 00:50:14,509 games, and if you use artificial intelligence to manipulate the stock 728 00:50:14,509 --> 00:50:17,989 market and the AI figures out there's only 8 billion people on the planet, and each 729 00:50:17,989 --> 00:50:21,809 of them only lives for a trillion seconds, and I can model what happens in their 730 00:50:21,809 --> 00:50:27,050 life, and they can buy data or create more data it's going to game us to the hell and 731 00:50:27,050 --> 00:50:31,960 back, right? And this is going to kill hundreds of millions of people possibly, 732 00:50:31,960 --> 00:50:35,380 because the financial system is the reward infrastructure or the nervous system of 733 00:50:35,380 --> 00:50:38,949 our society that tells how to allocate resources. 
It's much more dangerous than 734 00:50:38,949 --> 00:50:43,239 AI controlled weapons in my view. So solving all these issues is difficult. It 735 00:50:43,239 --> 00:50:46,260 means that we have to turn the whole financial system into an AI that acts in 736 00:50:46,260 --> 00:50:50,639 real time and plays a long game. We don't know how to do this. So these are open 737 00:50:50,639 --> 00:50:54,960 questions and I don't know how to solve them. And the way I see it we only have a 738 00:50:54,960 --> 00:50:58,680 very brief time on this planet to be a conscious species. We are like at the end 739 00:50:58,680 --> 00:51:02,650 of the party. We had a good run as humanity, but if you look at the recent 740 00:51:02,650 --> 00:51:06,049 developments the present type of civilization is not going to be 741 00:51:06,049 --> 00:51:09,599 sustainable. It's a very short game species that we are in. And the amazing 742 00:51:09,599 --> 00:51:12,920 thing is that in this short game you have this lifetime, where we have one year, 743 00:51:12,920 --> 00:51:16,481 maybe a couple more, in which we can understand how the universe works, 744 00:51:16,481 --> 00:51:19,477 and I think that's fascinating. We should use it. 745 00:51:19,477 --> 00:51:28,080 *Applause* 746 00:51:28,080 --> 00:51:32,429 Herald: I think that was a very positive outlook... *laughter* 747 00:51:32,429 --> 00:51:38,919 Herald: Let's continue with the microphone number 4. 748 00:51:38,919 --> 00:51:48,430 Q: Well, brilliant talk, monkey. Or brilliant monkey. So don't worry about 749 00:51:48,430 --> 00:51:52,717 being a monkey. It's ok. 750 00:51:52,717 --> 00:51:56,299 So I have 2 boring, but I think fundamental questions. Not so 751 00:51:56,299 --> 00:52:02,980 philosophical, more like a physical level. One: What is your definition, 752 00:52:02,980 --> 00:52:10,160 formal definition, of an observer that you mention here and there? 
And second, if 753 00:52:10,160 --> 00:52:20,660 you can clarify why meaningful information is just relative information of Shannon's, 754 00:52:20,660 --> 00:52:26,640 which to me is not necessarily meaningful. Joscha: I think an observer is the thing 755 00:52:26,640 --> 00:52:29,509 that makes sense of the universe, very informally speaking. And, well, 756 00:52:29,509 --> 00:52:34,019 formally it's a thing that identifies correlations between adjacent states 757 00:52:34,019 --> 00:52:36,070 and its environment. 758 00:52:36,070 --> 00:52:39,660 And the way we can describe the universe is a set of states, and the 759 00:52:39,660 --> 00:52:43,700 laws of physics are the correlation between adjacent states. And what they 760 00:52:43,700 --> 00:52:48,589 describe is how information is moving in the universe between states and disperses, 761 00:52:48,589 --> 00:52:52,520 and this dispersion of the information between locations - it's what we call 762 00:52:52,520 --> 00:52:57,411 entropy - and the direction of entropy is the direction that you perceive time. 763 00:52:57,411 --> 00:53:00,459 The Big Bang state is the hypothetical state, where the information is perfectly 764 00:53:00,459 --> 00:53:07,089 correlated with location and not between locations, only on the location, and in 765 00:53:07,089 --> 00:53:09,950 every direction you move away from the Big Bang you move forward in time just in a 766 00:53:09,950 --> 00:53:14,490 different time. And we are basically in one of these timelines. An observer is the 767 00:53:14,490 --> 00:53:19,190 thing that measures the environment around it, looks at the information and then 768 00:53:19,190 --> 00:53:22,329 looks at the next state, or one of the next states, and tries to figure out how 769 00:53:22,329 --> 00:53:25,559 the information has been displaced, and finding functions that describe this 770 00:53:25,559 --> 00:53:29,229 displacement of the information. 
That's the degree to which I understand observers 771 00:53:29,229 --> 00:53:33,379 right now. And this depends on the capacity of the observer for modeling this 772 00:53:33,379 --> 00:53:36,979 and the rate of update in the observer. So for instance time depends on the speed, 773 00:53:36,979 --> 00:53:39,719 in which the observer is translating itself to the universe, 774 00:53:39,719 --> 00:53:42,800 and dispersing its own information. 775 00:53:42,800 --> 00:53:47,830 Does this help? Q: And the Shannon relative information? 776 00:53:47,830 --> 00:53:50,144 Joscha: So there's several notions of information, 777 00:53:50,144 --> 00:53:53,400 and there is one that basically looks at what information looks 778 00:53:53,400 --> 00:54:00,990 like to an observer, via a channel, and these notions are somewhat related. But 779 00:54:00,990 --> 00:54:05,869 for me as a programmer, it's not so much important to look at Shannon information. 780 00:54:05,869 --> 00:54:10,800 I look at what we need to describe the evolution of a system. So I'm much more 781 00:54:10,800 --> 00:54:17,119 interested in what kind of model can be encoded with this type of, with this 782 00:54:17,119 --> 00:54:22,590 information, and how does it correlate to, or to which degree is it isomorphic or 783 00:54:22,590 --> 00:54:26,279 homomorphic to another system that I want to model? How much does it model the 784 00:54:26,279 --> 00:54:30,079 observations? Herald: Thank you. Let's go back to 785 00:54:30,079 --> 00:54:34,350 asking one question, and I would like to have one question from microphone 786 00:54:34,350 --> 00:54:40,330 number 3. Q: Thank you for this interesting talk. 787 00:54:40,330 --> 00:54:45,969 My question is really whether you think that intelligence and this thinking 788 00:54:45,969 --> 00:54:50,900 about a self, or this abstract level of knowledge are necessarily related. 789 00:54:50,900 --> 00:54:56,710 So can something only be intelligent if it has abstract thought? 
790 00:54:56,710 --> 00:54:59,859 Joscha: No, I think you can make models without abstract thought, and the majority 791 00:54:59,859 --> 00:55:03,739 of our models are not using abstract thought, right? Abstract thought is a very 792 00:55:03,739 --> 00:55:06,960 impoverished way of thinking. It's basically you have this big carpet and you 793 00:55:06,960 --> 00:55:09,759 have a few knitting needles, which are your abstract thought, and which you can 794 00:55:09,759 --> 00:55:14,630 lift out a few knots in this carpet and correct them. And the processes that form 795 00:55:14,630 --> 00:55:19,180 the carpet are much more rich, prevalent and automatic. So abstract thought 796 00:55:19,180 --> 00:55:24,979 is able to repair perception, but most of our models are perceptual. And the 797 00:55:24,979 --> 00:55:29,349 capacity to make these models is often given by instincts and by models outside 798 00:55:29,349 --> 00:55:33,589 the abstract realm. If you have a lot of abstract thinking it's often an indication 799 00:55:33,589 --> 00:55:37,129 that you use a prosthesis, because some of your primary modelling is not working very 800 00:55:37,129 --> 00:55:42,770 well. So I suspect that my own models are largely a result of some defect in my 801 00:55:42,770 --> 00:55:46,369 primary modeling, so some of my instincts are wrong when I look at the world. 802 00:55:46,369 --> 00:55:49,480 That's why I need to repair my perception more often than other people. So I have 803 00:55:49,480 --> 00:55:53,999 more abstract ideas on how to do that. Herald: And we have one question 804 00:55:53,999 --> 00:55:58,480 from our lovely stream observers, stream watchers, so please a question from the 805 00:55:58,480 --> 00:56:02,289 Internet. Q: Yeah, I guess this is also related, 806 00:56:02,289 --> 00:56:07,170 partially. Somebody is asking: How would you suggest to teach your mind 807 00:56:07,170 --> 00:56:12,219 to treat oneself better?
808 00:56:13,959 --> 00:56:16,099 Joscha: So, difficulty is, as soon as you 809 00:56:16,099 --> 00:56:20,079 get access to your source code you can do bad things. And it's - there are a lot of 810 00:56:20,079 --> 00:56:23,520 techniques to get access to the source code and then it's dangerous to make them 811 00:56:23,520 --> 00:56:27,559 accessible to you before you know what you want to have, before you're wise enough to 812 00:56:27,559 --> 00:56:33,150 do this, right? It's like having cookies. Your - my children think that the reason, 813 00:56:33,150 --> 00:56:35,849 why they don't get all the cookies they want, is that there is some kind of 814 00:56:35,849 --> 00:56:39,849 resource problem. *laughter* 815 00:56:39,849 --> 00:56:43,719 Basically the parents are depriving them of the cookies that they so richly 816 00:56:43,719 --> 00:56:49,380 deserve. And you can get into the room, where your brain bakes the cookies. All 817 00:56:49,380 --> 00:56:53,249 the pleasure that you experience, and all the pain that you experience are signals 818 00:56:53,249 --> 00:56:57,749 that the brain creates for you, right, the physical world does not create pain. 819 00:56:57,749 --> 00:57:01,150 They're just electrical impulses traveling through your nerves. The fact that they 820 00:57:01,150 --> 00:57:04,849 mean something is a decision that your brain makes, and the value, the valence 821 00:57:04,849 --> 00:57:10,039 that gives to them is a decision that you make. It's not you as a self, it's a 822 00:57:10,039 --> 00:57:14,469 system outside of yourself. So the trick, if you want to get full control, is that 823 00:57:14,469 --> 00:57:18,119 you get in charge, that you identify with the mind, with the creator of these 824 00:57:18,119 --> 00:57:22,319 signals. 
And you don't want to de-personalize, you don't want to feel that 825 00:57:22,319 --> 00:57:25,599 you become the author of reality, because that means it's difficult to care about 826 00:57:25,599 --> 00:57:29,410 anything that this organism does. You just realize "Oh, I'm running on the brain of 827 00:57:29,410 --> 00:57:32,609 that person, but I'm no longer that person. I can't decide what that person 828 00:57:32,609 --> 00:57:37,760 wants to have, and to do." And that's very easy to get corrupted or not doing 829 00:57:37,760 --> 00:57:40,420 anything meaningful anymore, right? So, 830 00:57:40,420 --> 00:57:44,380 maybe a good situation for you, but not a good one for your loved ones. 831 00:57:44,380 --> 00:57:48,329 And meanwhile there are tricks to get there faster. You can use 832 00:57:48,329 --> 00:57:52,400 rituals, for instance. Shamanic ritual is something, where, a religious ritual 833 00:57:52,400 --> 00:57:59,499 that powerfully bypasses your self and talks directly to the mind. And you can 834 00:57:59,499 --> 00:58:03,059 use groups, in which a certain environment is created, in which a certain behavior 835 00:58:03,059 --> 00:58:06,609 feels natural to you, and your mind basically gets overwhelmed into adopting 836 00:58:06,609 --> 00:58:10,489 different values and calibrations. So there are many tricks to make that happen. 837 00:58:10,489 --> 00:58:15,219 What you can also do is you can identify a particular thing that is wrong and 838 00:58:15,219 --> 00:58:18,940 question yourself "why do I have to suffer about this?" and you'll become more stoic 839 00:58:18,940 --> 00:58:22,059 about this particular thing and only get disturbed when you realize actually 840 00:58:22,059 --> 00:58:25,630 it helps to be disturbed about this, and things change.
And with other things you 841 00:58:25,630 --> 00:58:29,289 realize it doesn't have any influence on how reality works, so why should I have 842 00:58:29,289 --> 00:58:34,210 emotions about this and get agitated? So sometimes becoming adult means that you 843 00:58:34,210 --> 00:58:39,229 take charge of your own emotions and identifications. 844 00:58:39,229 --> 00:58:46,399 *Applause* 845 00:58:46,399 --> 00:58:48,599 Herald: Ok. Let's continue with 846 00:58:48,599 --> 00:58:53,529 microphone number 2 and I think this is one of the last questions. 847 00:58:53,529 --> 00:58:59,549 Q: So where does pain fit on the individual and the self-destructive 848 00:58:59,549 --> 00:59:04,999 tendencies on a group level fit in? Joscha: So in some sense I think that all 849 00:59:04,999 --> 00:59:09,429 consciousness is born over a disagreement with the way the universe works. Right? 850 00:59:09,429 --> 00:59:13,920 Otherwise you cannot get attention. And when you go down on this lowest level of 851 00:59:13,920 --> 00:59:19,210 phenomenal experience, in meditation for instance, and you really focus on this, 852 00:59:19,210 --> 00:59:22,769 what you get is some pain. It's the inside of a feedback loop that is not at the 853 00:59:22,769 --> 00:59:27,146 target value. Otherwise you don't notice anything. So pleasure is basically when 854 00:59:27,146 --> 00:59:32,000 this feedback loop gets closer to the target value. When you don't have a need 855 00:59:32,000 --> 00:59:36,849 you cannot experience pleasure in this domain. There's this thing that's better 856 00:59:36,849 --> 00:59:40,300 than remarkably good and it's unremarkably good, it's never been bad. You don't 857 00:59:40,300 --> 00:59:44,599 notice it. Right? So all the pleasure you experience is because you had a need 858 00:59:44,599 --> 00:59:48,460 before this. You can only enjoy an orgasm because you have a need for sex that was 859 00:59:48,460 --> 00:59:54,910 unfulfilled before. 
And so pleasure doesn't come for free. It's always the 860 00:59:54,910 --> 00:59:58,739 reduction of a pain. And this pain can be outside of your attention so you don't 861 00:59:58,739 --> 01:00:01,840 notice it and you don't suffer from it. And it can be a healthy thing to have. 862 01:00:01,840 --> 01:00:05,480 Pain is not intrinsically bad. For the most part it's a learning signal that 863 01:00:05,480 --> 01:00:10,959 tells you to calibrate things in your brain differently to perform better. On a 864 01:00:10,959 --> 01:00:14,799 group level, we basically are multi-level selection species. I don't know if there's 865 01:00:14,799 --> 01:00:18,930 such a thing as group pain. But I also don't understand groups very well. I see 866 01:00:18,930 --> 01:00:22,499 these weird hive minds but I think it's basically people emulating what the group 867 01:00:22,499 --> 01:00:26,959 wants. Basically that everybody thinks by themselves as if they were the group but 868 01:00:26,959 --> 01:00:30,339 it means that they have to constrain what they think is possible and permissible 869 01:00:30,339 --> 01:00:31,930 to think. 870 01:00:31,930 --> 01:00:37,340 So this feels very unaesthetic to me and that's why I kind of sort of refuse it. 871 01:00:37,340 --> 01:00:40,170 Haven't found a way to make it happen in my own mind. 872 01:00:40,170 --> 01:00:46,279 *Applause* 873 01:00:46,279 --> 01:00:48,539 Joscha: And I suspect many of you are like this too. 874 01:00:48,539 --> 01:00:52,180 It's like the common condition in nerds that we have difficulty with 875 01:00:52,180 --> 01:00:56,799 conformance. Not because we want to be different. We want to belong. But it's 876 01:00:56,799 --> 01:01:02,180 difficult for us to constrain our mind in the way that it's expected to belong. You 877 01:01:02,180 --> 01:01:06,579 want to be expected, er, be accepted while being ourself, while being different. 
Not 878 01:01:06,579 --> 01:01:11,509 for the sake of being different, but because we are like this. It feels very 879 01:01:11,509 --> 01:01:16,690 strange and corrupt just to adopt because it would make us belong, right? And this 880 01:01:16,690 --> 01:01:22,189 might be a common trope among many people here. 881 01:01:22,189 --> 01:01:28,430 *Applause* 882 01:01:28,430 --> 01:01:30,580 Herald: I think the Q and A and the talk 883 01:01:30,580 --> 01:01:34,640 was equally amazing and I would love to continue listening to you, Joscha, 884 01:01:34,640 --> 01:01:38,670 explaining the way I work. Or the way we all work. 885 01:01:38,670 --> 01:01:41,689 *audience, Joscha laughing* Herald: That's pretty impressive. 886 01:01:41,689 --> 01:01:44,952 Please give it up, a big round of applause for Joscha! 887 01:01:44,952 --> 01:01:48,488 *Applause* 888 01:01:48,488 --> 01:02:13,000 subtitles created by c3subtitles.de in the year 2019. Join, and help us!