1 00:00:00,000 --> 00:00:18,220 *35C3 preroll music* 2 00:00:18,220 --> 00:00:23,829 Herald Angel: What happens if you mix Shannon's information theory and 3 00:00:23,829 --> 00:00:31,029 biological systems? A dish better served hot. Please welcome our computational 4 00:00:31,029 --> 00:00:37,680 systems biology chef, who will guide you through investigating the information flow 5 00:00:37,680 --> 00:00:43,300 in living systems. Please welcome with a very warm round of applause Jürgen Pahle. 6 00:00:43,300 --> 00:00:51,530 *applause* 7 00:00:51,530 --> 00:00:56,670 Jürgen Pahle: Thanks a lot and thanks for having me. It's great that so many of you 8 00:00:56,670 --> 00:01:01,460 are interested in that topic, which is not about technical systems but actually 9 00:01:01,460 --> 00:01:07,229 biological cells. So, I am leading a group in Heidelberg at the university 10 00:01:07,229 --> 00:01:16,790 there and we are mostly interested in how information is processed, sensed, stored, 11 00:01:16,790 --> 00:01:25,189 communicated between biological cells. And we are interested in that because it's not 12 00:01:25,189 --> 00:01:30,620 obvious that they actually manage to do that in a reliable fashion. They don't 13 00:01:30,620 --> 00:01:35,299 have transistors. They only can use their molecules mostly proteins, big molecules 14 00:01:35,299 --> 00:01:44,670 that are little engines or little motors in the cell that allow them to fulfill 15 00:01:44,670 --> 00:01:51,880 their biological functions. If information processing fails in cells, you get diseases 16 00:01:51,880 --> 00:02:00,350 like epilepsy, cancer and of course others. Now, cellular signaling pathways 17 00:02:00,350 --> 00:02:07,200 have been studied in some detail - mostly single pathways. More and more also 18 00:02:07,200 --> 00:02:14,580 networks of pathways but surprisingly little conceptual work has been 19 00:02:14,580 --> 00:02:19,190 done on them. 
So we know the molecules that are involved, we know how they 20 00:02:19,190 --> 00:02:27,980 react, how they combine to build these pathways. But we don't know how, actually, 21 00:02:27,980 --> 00:02:35,690 information is transferred or communicated across these pathways and we intend to 22 00:02:35,690 --> 00:02:42,780 fill that gap in our group. And, of course, first we have to we have to model 23 00:02:42,780 --> 00:02:51,730 these networks, we have to model these biochemical pathways. And this is how we 24 00:02:51,730 --> 00:02:57,480 proceed. So you have a you have a cell - you can't see that here - but on the upper 25 00:02:57,480 --> 00:03:02,200 left corner you have that scheme of a cell with all the different components. You 26 00:03:02,200 --> 00:03:08,910 have volumes in the cell where chemical reactions happen. So chemical 27 00:03:08,910 --> 00:03:15,120 reactions take biochemical species: ions, proteins, what have you, and they convert 28 00:03:15,120 --> 00:03:20,400 them into other chemical species, and these reactions happen in the different 29 00:03:20,400 --> 00:03:26,850 compartments. Now it's very important to assign speeds or velocities to these 30 00:03:26,850 --> 00:03:33,340 reactions because these speeds determine how fast the reactions happen and how the 31 00:03:33,340 --> 00:03:39,040 dynamic behavior then results. And once you have done that, you can translate all 32 00:03:39,040 --> 00:03:44,930 of that into a mathematical model like the one shown here on the right. This is an 33 00:03:44,930 --> 00:03:49,310 ordinary differential equation system, I don't want to go into detail. I only have 34 00:03:49,310 --> 00:03:55,510 like two or three formulas that might be interesting for you. So this is just 35 00:03:55,510 --> 00:04:00,870 any mathematical model you have of these systems and then you can start 36 00:04:00,870 --> 00:04:05,560 analyzing them. 
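[Editor's sketch] The step from reactions with assigned speeds to an ordinary differential equation system can be illustrated with a toy example. This is only a minimal sketch, not the model shown on the slide: the reversible reaction A <-> B, the rate constants, and the function name `simulate_reversible` are all assumptions made for illustration, and the integration uses plain forward Euler rather than the solvers a tool such as COPASI provides.

```python
def simulate_reversible(a0, b0, k_forward, k_backward, t_end, dt=0.001):
    """Integrate dA/dt = -k_f*A + k_b*B and dB/dt = +k_f*A - k_b*B.

    Mass-action kinetics for a hypothetical reaction A <-> B,
    stepped with forward Euler. Returns a list of (time, A, B).
    """
    a, b = a0, b0
    steps = int(round(t_end / dt))
    trajectory = [(0.0, a, b)]
    for i in range(1, steps + 1):
        flux = k_forward * a - k_backward * b  # net rate of A -> B
        a -= flux * dt
        b += flux * dt
        trajectory.append((i * dt, a, b))
    return trajectory


if __name__ == "__main__":
    # Hypothetical rate constants; the system relaxes to the steady state
    # A = k_b / (k_f + k_b) and B = k_f / (k_f + k_b) of the total amount.
    t, a, b = simulate_reversible(1.0, 0.0, 2.0, 1.0, t_end=10.0)[-1]
    print(f"t={t:.1f}  A={a:.3f}  B={b:.3f}")
```

The questions raised in the talk (time course, steady states) correspond to reading the returned trajectory and checking where the net flux vanishes.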
You can ask questions like: "How does the system change over 37 00:04:05,560 --> 00:04:10,490 time?" That's simulation. "Which parts influence the behavior most?" "What are the 38 00:04:10,490 --> 00:04:16,810 stable states? Do you have oscillations, do you have a steady state?" and so on. 39 00:04:16,810 --> 00:04:22,310 Now, you don't have to do that by hand, because we are actually also developing 40 00:04:22,310 --> 00:04:27,710 software - that's just another thing. I guess you know that all models are wrong. 41 00:04:27,710 --> 00:04:33,810 We try to build useful ones. So I said you don't have to do this by hand because we 42 00:04:33,810 --> 00:04:40,620 are also into method development and we are building scientific software. One of 43 00:04:40,620 --> 00:04:44,620 the software packages we build is called COPASI: COmplex PAthway SImulator. It's free and 44 00:04:44,620 --> 00:04:50,260 open source, you can all go to that website, download it, play around with it 45 00:04:50,260 --> 00:04:58,530 if you want. Because we also use more demanding computations which we send to 46 00:04:58,530 --> 00:05:03,117 compute clusters, we also developed a scripting interface for COPASI, which is 47 00:05:03,117 --> 00:05:09,640 called CoRC, the COPASI R connector. And this allows you to use the COPASI backend 48 00:05:09,640 --> 00:05:15,560 with all the different tools that are in COPASI from your R programming environment 49 00:05:15,560 --> 00:05:21,140 and then you can build workflows and send them to a compute cluster. We think it's 50 00:05:21,140 --> 00:05:28,280 easy to use. If you play around with it and you get stuck, then just let me know. 51 00:05:28,280 --> 00:05:31,420 So this is software you can use, you can play around with. And where do we get the 52 00:05:31,420 --> 00:05:37,340 models? Well, there is a model database that is called Biomodels.net, also free to 53 00:05:37,340 --> 00:05:41,770 use, you can go there and download models. 
At the moment they have almost 800 54 00:05:41,770 --> 00:05:47,530 different manually curated models, and almost ten times as many that are built 55 00:05:47,530 --> 00:05:52,760 automatically. You can just download them in the so-called SBML format, which is the 56 00:05:52,760 --> 00:05:59,560 Systems Biology Markup Language, then import them into COPASI or other software and 57 00:05:59,560 --> 00:06:01,860 play around with them. 58 00:06:01,860 --> 00:06:08,290 OK, so coming back to biology, one of our favorite systems is 59 00:06:08,290 --> 00:06:13,650 calcium signaling. Calcium signaling works roughly like this: You have these 60 00:06:13,650 --> 00:06:20,860 little - I mean the oval thing is a cell - then you have these red cones, that are 61 00:06:20,860 --> 00:06:26,490 hormones, and other substances that you have in your bloodstream or somewhere 62 00:06:26,490 --> 00:06:32,120 outside the cell. They bind to these black things, which are receptors on the cell 63 00:06:32,120 --> 00:06:37,260 membrane. And then a cascade of processes happens that in the end leads to 64 00:06:37,260 --> 00:06:43,960 an in-stream of calcium ions, these blue balls, from the ER - which is not 65 00:06:43,960 --> 00:06:47,620 emergency room, but endoplasmic reticulum, which is one of the 66 00:06:47,620 --> 00:06:52,979 compartments in the cell - into the main compartment, the cytosol of the cell. 67 00:06:52,979 --> 00:06:58,460 And also calcium streams into the cell from outside the cell. And this leads to a 68 00:06:58,460 --> 00:07:04,589 sharp increase of the concentration of calcium, until it's pumped out again. There 69 00:07:04,589 --> 00:07:10,010 are pumps that take calcium ions and remove them from the cytosol, and pump 70 00:07:10,010 --> 00:07:16,010 them out of the cell and back into the ER. This is very important because calcium is 71 00:07:16,010 --> 00:07:21,300 a very versatile second messenger. That's what they call it. 
It regulates 72 00:07:21,300 --> 00:07:25,861 a number of very important cellular processes. If you move your muscles, your 73 00:07:25,861 --> 00:07:31,210 muscle contraction is regulated by calcium, learning, secretion of 74 00:07:31,210 --> 00:07:37,170 neurotransmitters, transmitters in your brain, fertilization. A lot of different 75 00:07:37,170 --> 00:07:45,780 things are regulated by calcium and, if you simulate the dynamic processes, you get 76 00:07:45,780 --> 00:07:51,100 behavior like that. Here you can see it oscillates, it shows these regular spikes. 77 00:07:51,100 --> 00:07:59,229 So this is the calcium concentration over time. Now, if you actually measure this in 78 00:07:59,229 --> 00:08:06,379 real cells, and this is data measured by collaboration partners of mine in England, 79 00:08:06,379 --> 00:08:12,040 you see it's not that smooth. You get these differences in amplitude of the 80 00:08:12,040 --> 00:08:17,610 peaks, you get secondary spikes, you get fluctuations around the basal level, and 81 00:08:17,610 --> 00:08:22,850 this is because you have random fluctuations in your system. Intrinsic 82 00:08:22,850 --> 00:08:27,639 random fluctuations that are just due to random fluctuations in the timings of 83 00:08:27,639 --> 00:08:33,440 single reactive events. Single reactions, biochemical reactions that happen. And 84 00:08:33,440 --> 00:08:37,419 in order to capture this behavior, because this behavior is 85 00:08:37,419 --> 00:08:42,179 important, that can hamper reliable information transfer, we have to resort to 86 00:08:42,179 --> 00:08:47,760 special simulation algorithms, for example the so-called Gillespie algorithm. And if 87 00:08:47,760 --> 00:08:51,640 you do that and apply it to the calcium system, you can see you can actually 88 00:08:51,640 --> 00:08:57,000 capture these secondary peaks and all the different other fluctuations you have in 89 00:08:57,000 --> 00:09:03,560 there. Now, this is just a Monte Carlo simulation. 
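[Editor's sketch] The idea behind the Gillespie algorithm mentioned above can be shown on a much smaller system than the calcium model. The following is a minimal sketch of the direct method for a hypothetical birth-death process (constant production, first-order degradation); the reaction scheme, the parameter values, and the function name `gillespie_birth_death` are assumptions for illustration and are not taken from the talk.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Direct-method stochastic simulation of
        0 -> X   with propensity k_prod
        X -> 0   with propensity k_deg * n
    Every single reaction event is drawn explicitly, which is why this
    kind of simulation becomes expensive for large particle numbers.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod
        a_deg = k_deg * n
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next event is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


if __name__ == "__main__":
    rng = random.Random(42)
    times, counts = gillespie_birth_death(10.0, 0.1, 0, 200.0, rng)
    print(f"{len(times) - 1} reaction events, final copy number {counts[-1]}")
```

For these hypothetical parameters the copy number fluctuates around k_prod / k_deg = 100; a deterministic ODE would show only the smooth approach to that value.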
I say "just". It's really time 90 00:09:03,560 --> 00:09:07,380 consuming and demanding, because you have to calculate each and every single 91 00:09:07,380 --> 00:09:12,130 reactive event in the cell. And that takes a lot of time. That's why we do that on a 92 00:09:12,130 --> 00:09:16,760 compute cluster. I told you already, that calcium is a very versatile second 93 00:09:16,760 --> 00:09:21,870 messenger. So you have very many different triggers of a calcium response in the 94 00:09:21,870 --> 00:09:27,580 cell, things that lead to a certain calcium dynamics. And on the other hand, 95 00:09:27,580 --> 00:09:33,240 downstream, calcium regulates many different things. And so you have this 96 00:09:33,240 --> 00:09:37,940 hourglass or bow tie structure, and that's why people have speculated about the 97 00:09:37,940 --> 00:09:46,440 calcium code: How can it be, that the proteins - I should go back - that actually do 98 00:09:46,440 --> 00:09:53,550 all these cellular functions - [Softly] sorry - these green cylinders that bind 99 00:09:53,550 --> 00:09:58,860 calcium and are then activated or inhibited by it, how can it be that they 100 00:09:58,860 --> 00:10:06,910 know, which stimulus or which hormone is outside of the cell? They don't see them, 101 00:10:06,910 --> 00:10:12,790 because there is a cell membrane around the cell, around the cytosol. So people 102 00:10:12,790 --> 00:10:19,930 have speculated: Is there information encoded in the specific calcium waveform? 103 00:10:19,930 --> 00:10:28,039 Is there calcium code? And how can it be that the proteins actually decode that code? 104 00:10:28,039 --> 00:10:35,340 It's fairly established, that calcium shows amplitude modulation. So the 105 00:10:35,340 --> 00:10:40,750 higher the amplitude of calcium, the more active get some proteins. 
It also shows 106 00:10:40,750 --> 00:10:45,870 frequency modulation, meaning the higher the frequency of the calcium oscillations, 107 00:10:45,870 --> 00:10:50,260 the more active get some proteins. But, maybe, there are other information carrying 108 00:10:50,260 --> 00:10:57,470 features in the waveform, like duration, waveform timing and so on. Now a doctoral 109 00:10:57,470 --> 00:11:02,060 student in my group, Arne Schoch, has looked into frequency modulation and he 110 00:11:02,060 --> 00:11:06,660 actually showed that there are proteins, in that case NFAT, which is the nuclear 111 00:11:06,660 --> 00:11:12,500 factor of activated T-cells, which are important in your immune system. They only 112 00:11:12,500 --> 00:11:17,791 react to calcium oscillations of a certain frequency. So they get activated in a 113 00:11:17,791 --> 00:11:25,020 very narrow frequency band, and that's why we call it band-pass activation. 114 00:11:25,020 --> 00:11:32,589 Okay, so I guess you all know signaling speeds of technical systems, they are fairly fast by 115 00:11:32,589 --> 00:11:37,170 now. One of our results, because we quantify actually information transfer, is 116 00:11:37,170 --> 00:11:42,210 that calcium signalling operates at roughly point four bits per second. If you 117 00:11:42,210 --> 00:11:47,060 compare that to technical systems, that seems very low, but maybe that's enough 118 00:11:47,060 --> 00:11:52,780 for all the functions that a cell has to fulfill. So how did we arrive at this result? 119 00:11:52,780 --> 00:11:58,670 Well, we used information theory, classical information theory, pioneered by 120 00:11:58,670 --> 00:12:05,600 people like Claude Shannon in the 40s, also by Hartley, Tukey and a few other people. 
121 00:12:05,600 --> 00:12:09,310 So, they looked at technical systems, and they have this prototypical 122 00:12:09,310 --> 00:12:14,370 communication system, where there is an information source on the left side, 123 00:12:14,370 --> 00:12:19,230 then this information is somehow encoded. It's transmitted over a noisy channel 124 00:12:19,230 --> 00:12:24,580 where the message is scrambled. Then it's received by a receiver, decoded, and then 125 00:12:24,580 --> 00:12:30,080 hopefully you get the same message at the destination, that was chosen at the 126 00:12:30,080 --> 00:12:37,540 information source. And in our case we look at calcium as an information source 127 00:12:37,540 --> 00:12:45,700 and we study how much information is actually transferred to downstream proteins. 128 00:12:45,700 --> 00:12:53,250 How do you do that? Well, information theory 101. Information theory primer. 129 00:12:53,250 --> 00:12:59,060 In statistical information theory of the Shannon type, you look at random 130 00:12:59,060 --> 00:13:04,170 variables. You look at events that have a certain probability of happening. So let's 131 00:13:04,170 --> 00:13:12,769 say you have an event that has a probability of happening, and then Shannon 132 00:13:12,769 --> 00:13:19,800 said that the information content of this event should be the negative logarithm - 133 00:13:19,800 --> 00:13:24,800 which is shown here, the curve on the right hand side - should be the 134 00:13:24,800 --> 00:13:29,800 negative logarithm of the probability, meaning that if an event happens all the 135 00:13:29,800 --> 00:13:34,700 time - and I will show you an example later - there is no information content. 136 00:13:34,700 --> 00:13:39,300 The information content is zero. There is no surprise, if that event happens, because 137 00:13:39,300 --> 00:13:45,249 it happens all the time, it's like there's a sunny day somewhere in the desert. 
138 00:13:45,249 --> 00:13:51,941 However, if you go to lower probabilities, then the surprisal becomes bigger and the 139 00:13:51,941 --> 00:13:58,690 information content rises. Now, in a system you have several events that are possible. 140 00:13:58,690 --> 00:14:02,470 And if you take the average uncertainty of all possible events you get something that 141 00:14:02,470 --> 00:14:08,079 Shannon called entropy. This is still not information, because information is a 142 00:14:08,079 --> 00:14:12,460 difference in entropy. So you have to calculate the entropy of a system, and 143 00:14:12,460 --> 00:14:17,970 then you calculate the entropy that is remaining after an observation, say. And 144 00:14:17,970 --> 00:14:24,010 this difference is the information gained by the observation. Now, coming to a 145 00:14:24,010 --> 00:14:28,450 simple example, let's say we have a very simple weather system where you can only 146 00:14:28,450 --> 00:14:33,970 have rainy and sunny days. And let's say they are equally likely. So you have a 147 00:14:33,970 --> 00:14:46,230 probability of 50%, the average of the negative logarithm is 1. So, when you 148 00:14:46,230 --> 00:14:51,579 observe the weather in the system, you gain one bit per day. You can also think of 149 00:14:51,579 --> 00:14:57,470 bits as the information you need, or a cell needs, to answer or decide on one yes 150 00:14:57,470 --> 00:15:07,640 or no question. Now, if it's always sunny and no rain, then you get zero information 151 00:15:07,640 --> 00:15:13,180 content or uncertainty. The average is zero. So you don't get any information if 152 00:15:13,180 --> 00:15:20,430 you observe the weather in the desert, say. 80/20: You get a certain bit number per 153 00:15:20,430 --> 00:15:29,930 day, in that case .64 per day, and you can do that for Leipzig. 
In that case, 154 00:15:29,930 --> 00:15:34,430 Leipzig has ninety nine rainy days per year, according to the Deutsche 155 00:15:34,430 --> 00:15:39,760 Wetterdienst. This gives you an information of .84 bit per day. You can do 156 00:15:39,760 --> 00:15:44,310 it in a general way. So let's say you have one event with a probability of p and 157 00:15:44,310 --> 00:15:49,579 another event with a probability of 1 minus p and then you get this curve, which 158 00:15:49,579 --> 00:15:56,570 shows you that the information content is actually maximal if you have maximal 159 00:15:56,570 --> 00:16:02,149 uncertainty, if you have equally likely events. If you have more possible events - 160 00:16:02,149 --> 00:16:07,740 in that case four different ones: sunny, cloudy, rainy, and thunderstorm - you get 161 00:16:07,740 --> 00:16:12,060 two bit and this is because of the logarithm. So if you have double the 162 00:16:12,060 --> 00:16:18,800 amount of events and they are equally likely you get one bit more. Hope I didn't lose 163 00:16:18,800 --> 00:16:25,720 anyone? Now we are always looking at processes, dynamic things, things that 164 00:16:25,720 --> 00:16:30,350 change over time, and if we look at processes we have to look at transition 165 00:16:30,350 --> 00:16:34,690 probabilities. So we have to change probabilities to transition probabilities 166 00:16:34,690 --> 00:16:42,630 and you can summarize them in a matrix. So let's say, if we have a sunny day today, 167 00:16:42,630 --> 00:16:47,730 it's more likely that it's also sunny tomorrow and less likely that it's 168 00:16:47,730 --> 00:16:52,360 raining, maybe only 25 percent. And, if it's rainy today, you can't tell, it's 169 00:16:52,360 --> 00:17:01,750 equally likely. These processes are also called Markov process. Markov was a 170 00:17:01,750 --> 00:17:07,609 Russian mathematician and you have them everywhere. 
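[Editor's sketch] The surprisal and entropy figures from the weather examples can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative, and the Leipzig case assumes a 365-day year so that 99 rainy days correspond to a probability of 99/365.

```python
import math


def surprisal_bits(p):
    """Information content of a single event with probability p."""
    return -math.log2(p)


def entropy_bits(probabilities):
    """Average surprisal over all possible events (Shannon entropy)."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    cases = {
        "rainy/sunny, equally likely": [0.5, 0.5],
        "always sunny (desert)": [1.0, 0.0],
        "Leipzig, 99 rainy days per year": [99 / 365, 266 / 365],
        "four equally likely kinds of weather": [0.25] * 4,
    }
    for name, probs in cases.items():
        print(f"{name}: {entropy_bits(probs):.2f} bit per day")
```

The equally likely two-event case gives 1 bit, the certain case gives 0, the Leipzig case rounds to 0.84 bit, and doubling the number of equally likely events from two to four adds exactly one bit.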
These Markovian processes are 171 00:17:07,609 --> 00:17:13,130 used in your cell phones, in your hard drives, they're used for error correction, 172 00:17:13,130 --> 00:17:20,050 the page rank algorithm of Google is one big Markov process. So, you're using them 173 00:17:20,050 --> 00:17:29,260 all the time, nothing technological would work nowadays without them. Because we 174 00:17:29,260 --> 00:17:36,770 have knowledge about today's weather, the uncertainty about tomorrow's weather decreases. 175 00:17:36,770 --> 00:17:45,910 So now we have an entropy rate, instead of an entropy. The difference is, 176 00:17:45,910 --> 00:17:51,030 again, the information you gain by today's weather. You can do the maths in our 177 00:17:51,030 --> 00:17:58,720 example. The entropy would be .92 bit per day and the entropy rate, given that you 178 00:17:58,720 --> 00:18:05,580 know today's weather, is less. It's .87 bit per day. Now, to complicate things a 179 00:18:05,580 --> 00:18:11,490 bit more, maybe, we also look at a second process in that case air pressure and you 180 00:18:11,490 --> 00:18:17,030 can measure air pressure with these little devices, the barometers and maybe, if it's 181 00:18:17,030 --> 00:18:22,100 sunny today and the air pressure is high, in 90 percent you get a sunny day 182 00:18:22,100 --> 00:18:26,441 tomorrow. Normally in 10 percent of the cases you get a rainy day and so on you 183 00:18:26,441 --> 00:18:32,390 can go through the table. In our case, I looked it up yesterday. We had a high air 184 00:18:32,390 --> 00:18:39,310 pressure and it was raining. So in our little model system it would mean, that 185 00:18:39,310 --> 00:18:47,500 it's sunny today. Now, I told you information is a decrease in uncertainty. 186 00:18:47,500 --> 00:18:51,910 How much information do we get by the barometer, by knowing the air pressure? 
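[Editor's sketch] The entropy and entropy rate quoted for the sunny/rainy Markov process can be reproduced from the transition probabilities given in the talk (sunny to rainy 25 percent, rainy to either state 50 percent). This is a minimal sketch; the dictionary layout and the function names are illustrative choices, and the stationary distribution is found by simple repeated application of the transition matrix.

```python
import math

# Transition probabilities P[today][tomorrow] as described in the talk.
P = {
    "sunny": {"sunny": 0.75, "rainy": 0.25},
    "rainy": {"sunny": 0.50, "rainy": 0.50},
}


def entropy_bits(probabilities):
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def stationary(P, iterations=200):
    """Long-run state probabilities, found by iterating the chain."""
    states = list(P)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        pi = {t: sum(pi[s] * P[s][t] for s in states) for t in states}
    return pi


def entropy_rate_bits(P, pi):
    """Uncertainty about tomorrow, averaged over what today can be."""
    return sum(pi[s] * entropy_bits(P[s].values()) for s in P)


if __name__ == "__main__":
    pi = stationary(P)
    h = entropy_bits(pi.values())
    rate = entropy_rate_bits(P, pi)
    print(f"entropy      {h:.2f} bit per day")
    print(f"entropy rate {rate:.2f} bit per day")
    print(f"gain from knowing today's weather {h - rate:.3f} bit per day")
```

The chain spends two thirds of its days sunny, which gives an entropy of about 0.92 bit per day and an entropy rate of about 0.87 bit per day, in line with the numbers in the talk.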
187 00:18:51,910 --> 00:18:56,410 This is the difference in uncertainty without barometer and with the barometer 188 00:18:56,410 --> 00:19:00,970 and in our case we have to assume that the probability of high and low air pressure 189 00:19:00,970 --> 00:19:08,160 is the same. And we get .39 bit per day, that we gain by looking at the air 190 00:19:08,160 --> 00:19:12,640 pressure. Now, what does that have to do with biological systems? Well we have two 191 00:19:12,640 --> 00:19:17,169 processes. We have a calcium process that shows some dynamics and we have the 192 00:19:17,169 --> 00:19:22,610 process of an activated protein that does something in the cell. So we can look at 193 00:19:22,610 --> 00:19:28,280 both of these and then calculate how much information is actually transferred from 194 00:19:28,280 --> 00:19:32,740 calcium to the protein. How much uncertainty do we lose about the 195 00:19:32,740 --> 00:19:36,929 protein dynamics, if we know the calcium dynamics? This is mathematically exactly 196 00:19:36,929 --> 00:19:43,010 what we are doing and this is called transfer entropy. It's an information- 197 00:19:43,010 --> 00:19:49,360 theoretic measure developed by Thomas Schreiber in 2000. There are some 198 00:19:49,360 --> 00:19:55,600 practical complications, that we are working on, and this is what we are using 199 00:19:55,600 --> 00:20:00,900 actually for the calculations. So in our case we have data from experiments or we 200 00:20:00,900 --> 00:20:06,770 use models of calcium oscillations and then we couple a model of a protein to 201 00:20:06,770 --> 00:20:13,760 these calcium dynamics. This gives us time courses, both of calcium and protein, 202 00:20:13,760 --> 00:20:20,260 stochastic time courses, including the random fluctuations. And then we use the 203 00:20:20,260 --> 00:20:25,920 information-theoretic machinery to study them. And some of our results I want to show 204 00:20:25,920 --> 00:20:29,650 you. 
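[Editor's sketch] Transfer entropy from a source process X to a target process Y measures how much knowing the current value of X reduces the uncertainty about the next value of Y beyond what the current value of Y already tells you. The following is a minimal plug-in estimator for discrete time series with a history length of one step. It is only a sketch under those assumptions: the function name is illustrative, and it ignores the practical complications mentioned in the talk (continuous data, longer histories, estimation bias on sparse and noisy data).

```python
import math
import random
from collections import Counter


def transfer_entropy_bits(source, target):
    """Plug-in estimate of T(source -> target), history length 1.

    T = sum p(y_next, y_now, x_now)
            * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    n = len(target) - 1
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    pairs_yx = Counter(zip(target[:-1], source[:-1]))
    pairs_yy = Counter(zip(target[1:], target[:-1]))
    singles_y = Counter(target[:-1])
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_with_source = count / pairs_yx[(y_now, x_now)]
        p_without_source = pairs_yy[(y_next, y_now)] / singles_y[y_now]
        te += p_joint * math.log2(p_with_source / p_without_source)
    return te


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: x is a fair coin, y simply repeats x one step later.
    x = [rng.randint(0, 1) for _ in range(20000)]
    y = [0] + x[:-1]
    print(f"T(x -> y) = {transfer_entropy_bits(x, y):.3f} bit per step")
    print(f"T(y -> x) = {transfer_entropy_bits(y, x):.3f} bit per step")
```

In this toy case the estimate is close to one bit per step from x to y and close to zero in the reverse direction, which shows the directionality that distinguishes transfer entropy from symmetric measures.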
For example, if you increase the system size, if you increase the particle 205 00:20:29,650 --> 00:20:35,480 numbers, if you make the cell bigger, then the information that you can transfer is 206 00:20:35,480 --> 00:20:40,720 higher. Meaning, if the cell invests more energy and produces more proteins, it can 207 00:20:40,720 --> 00:20:44,830 actually achieve a more reliable information transfer, which comes of course 208 00:20:44,830 --> 00:20:52,120 with costs for the cell. Also, it seems, that if you use more complicated dynamics 209 00:20:52,120 --> 00:20:56,330 - meaning not only spiking, but maybe bursting behavior where you have secondary 210 00:20:56,330 --> 00:21:00,160 spikes - then you can transmit more information because the input signal 211 00:21:00,160 --> 00:21:07,500 carries more information or can carry more information in its different features. 212 00:21:07,500 --> 00:21:12,280 Another result is that proteins - a very interesting result I think - is that 213 00:21:12,280 --> 00:21:17,720 proteins can actually be tuned to certain characteristics of the calcium input. 214 00:21:17,720 --> 00:21:22,260 Meaning, with all the different calcium sensitive proteins in the cell they are 215 00:21:22,260 --> 00:21:27,559 tuned to a specific signal. So they only get activated or these pathways only 216 00:21:27,559 --> 00:21:33,270 allow information transmission, if a certain signal is observed in the cell by 217 00:21:33,270 --> 00:21:39,440 these proteins. So, in a way the 3D structure of the protein defines how it 218 00:21:39,440 --> 00:21:46,580 behaves dynamically, how quickly it binds and so on, how many binding sites it has, 219 00:21:46,580 --> 00:21:54,620 and then this dynamic behavior determines to what input signals that protein is 220 00:21:54,620 --> 00:21:59,690 actually sensitive. On the right hand side you can see some calculations we did. 
The 221 00:21:59,690 --> 00:22:05,621 peaks actually show where this specific protein, which is a calmodulin-like protein 222 00:22:05,621 --> 00:22:09,780 - you don't have to memorize that, it's a very important calcium sensitive protein - 223 00:22:09,780 --> 00:22:15,730 where these differently parameterized models actually get activated and allow 224 00:22:15,730 --> 00:22:20,490 information transfer. And this allows differential regulation because you have 225 00:22:20,490 --> 00:22:25,660 all the different proteins. You have only one calcium concentration and only the 226 00:22:25,660 --> 00:22:31,680 proteins that are sensitive to a specific input get activated or do their things in 227 00:22:31,680 --> 00:22:36,210 the cell. Now if you look at more complicated proteins - so Calmodulin, the 228 00:22:36,210 --> 00:22:41,240 one I just showed you, was only activated by calcium - more complicated proteins, 229 00:22:41,240 --> 00:22:47,460 like protein kinase C, for example, they are both activated and inhibited. So they show 230 00:22:47,460 --> 00:22:52,040 biphasic behavior, where in an intermediate range of calcium 231 00:22:52,040 --> 00:22:55,940 concentration they get activated, with very high or very low concentrations they 232 00:22:55,940 --> 00:23:02,030 are inactivated. You can actually see that these more complicated proteins allow a 233 00:23:02,030 --> 00:23:07,650 higher information transfer and again producing these more complicated proteins 234 00:23:07,650 --> 00:23:13,470 might be more costly for the cell, but it can be valuable, because they allow more 235 00:23:13,470 --> 00:23:18,179 information to be transferred. And this you can see in this plot where we actually 236 00:23:18,179 --> 00:23:23,090 scanned over the activation and the inhibition constant of these model 237 00:23:23,090 --> 00:23:26,929 proteins and you can see that you have these sweet spots where you get a very 238 00:23:26,929 --> 00:23:32,100 high information transfer. 
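[Editor's sketch] The biphasic behavior described for proteins that are both activated and inhibited by calcium can be pictured with a simple steady-state response curve. This is a hedged sketch only: the product of two Hill-type terms, the parameter names `k_act` and `k_inh`, the Hill coefficient, and all numerical values are assumptions chosen for illustration, not the protein models or constants used in the talk.

```python
def biphasic_activity(calcium, k_act, k_inh, hill=2):
    """Hypothetical steady-state activity in the range 0..1.

    The first factor switches the protein on above roughly k_act,
    the second factor switches it off again above roughly k_inh.
    """
    activation = calcium**hill / (k_act**hill + calcium**hill)
    inhibition = k_inh**hill / (k_inh**hill + calcium**hill)
    return activation * inhibition


if __name__ == "__main__":
    # Arbitrary concentration units and constants, for illustration only.
    for calcium in (0.01, 0.1, 0.6, 5.0, 50.0):
        activity = biphasic_activity(calcium, k_act=0.2, k_inh=2.0)
        print(f"calcium={calcium:6.2f}  activity={activity:.3f}")
```

Shifting `k_act` and `k_inh` moves and narrows the window of calcium concentrations in which such a model protein is active, which is the kind of parameter scan the plot in the talk refers to.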
So color coded is transfer entropy. Now, coming to a 239 00:23:32,100 --> 00:23:37,630 different system: Just quickly, we also looked at other systems of course. Calcium 240 00:23:37,630 --> 00:23:42,560 signaling is just one of our favorite ones. We also looked at bacteria and this is 241 00:23:42,560 --> 00:23:50,580 E. coli, a very famous model system for biologists. These are cells that can 242 00:23:50,580 --> 00:23:58,620 actually move around because they have little propellers at their end. They want to 243 00:23:58,620 --> 00:24:05,100 find sources of nutrients, for example, to get food. So they swim into a direction 244 00:24:05,100 --> 00:24:10,909 and then they decide whether to keep swimming in that direction 245 00:24:10,909 --> 00:24:17,340 or whether to tumble, reorient randomly, and swim in some other direction. The 246 00:24:17,340 --> 00:24:24,110 problem for them is they are too small. They can't detect a concentration gradient 247 00:24:24,110 --> 00:24:30,039 of nutrients, of food between their front and the back of the cell. So they have to 248 00:24:30,039 --> 00:24:35,480 swim in one direction and then they have to remember some nutrient concentration of 249 00:24:35,480 --> 00:24:40,541 some time back and then they have to compare: Is the nutrient 250 00:24:40,541 --> 00:24:44,720 concentration actually increasing? Then I should continue swimming. If it's 251 00:24:44,720 --> 00:24:49,669 decreasing, I should reorient and swim in some other direction. This allows them to, 252 00:24:49,669 --> 00:24:58,280 on average, swim towards sources of food. In order to compare over time the nutrient 253 00:24:58,280 --> 00:25:04,750 concentrations they have to memorize, they have to know how much nutrients were 254 00:25:04,750 --> 00:25:11,730 there some time ago. 
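[Editor's sketch] The swim-or-tumble strategy described above, comparing the current nutrient concentration with a remembered one, can be sketched as a one-dimensional toy simulation. Everything specific here is an assumption made for illustration: the linear nutrient profile, the step size, the baseline tumble probability, and the function names. It is not the E. coli model studied in the talk.

```python
import random


def nutrient(x):
    """Hypothetical nutrient profile: concentration rises linearly with x."""
    return x


def run_and_tumble(steps, rng, tumble_prob=0.1):
    """Walker that keeps its direction while the concentration increases
    and picks a new random direction when it does not."""
    x = 0.0
    direction = rng.choice((-1, 1))
    remembered = nutrient(x)  # the cell's memory of the recent past
    for _ in range(steps):
        x += direction
        current = nutrient(x)
        improving = current > remembered
        # Tumble if things got worse, or occasionally even when improving.
        if not improving or rng.random() < tumble_prob:
            direction = rng.choice((-1, 1))
        remembered = current
    return x


if __name__ == "__main__":
    rng = random.Random(1)
    finals = [run_and_tumble(200, rng) for _ in range(200)]
    print(f"mean final position: {sum(finals) / len(finals):.1f}")
```

Although each walker only ever compares two concentration values over time, the population drifts towards higher nutrient concentration on average.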
For that they have a little memory and the memory is actually 255 00:25:11,730 --> 00:25:16,740 in the - you can see on the left hand side the receptor that actually senses these 256 00:25:16,740 --> 00:25:21,960 nutrients. They can be modified, these receptors, we call that methylated. So they 257 00:25:21,960 --> 00:25:27,130 get a methylation group attached. They have different states of methylation, five 258 00:25:27,130 --> 00:25:33,840 different ones in that model we are looking at. This builds a memory. And we 259 00:25:33,840 --> 00:25:38,400 looked into that, we quantified that with information theory. This is a measure, 260 00:25:38,400 --> 00:25:42,800 this is called mutual information. It's not transfer entropy, it's another measure 261 00:25:42,800 --> 00:25:50,059 of, in that case, static information. You can see, this is the amount of 262 00:25:50,059 --> 00:25:55,630 information that is actually stored about the nutrient concentration that is outside 263 00:25:55,630 --> 00:26:01,990 of the cell. This is in nats, it's not in bits. It's just a different - you can 264 00:26:01,990 --> 00:26:06,940 translate them - it's just a different unit for information. You can also see how the 265 00:26:06,940 --> 00:26:14,230 different methylation states - so these are the colored curves - how they go 266 00:26:14,230 --> 00:26:22,230 through or how they are active with different nutrient concentrations. This is 267 00:26:22,230 --> 00:26:26,330 ongoing research. So, maybe, next time, hopefully, next time, I can show you much 268 00:26:26,330 --> 00:26:32,290 more. Just to finish this, we also look at timescales, because the timescales have to 269 00:26:32,290 --> 00:26:38,760 be right. The system adapts. So if you keep that cell in a certain nutrient 270 00:26:38,760 --> 00:26:42,740 concentration, it adapts to that nutrient concentration and goes back to its normal 271 00:26:42,740 --> 00:26:48,430 operating level. 
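[Editor's sketch] Mutual information and the conversion between nats and bits mentioned above can be written down compactly. This is a minimal sketch for a discrete joint distribution; the function names and the example tables are illustrative assumptions, not data from the methylation model.

```python
import math


def mutual_information_nats(joint):
    """Mutual information of a joint table, joint[i][j] = p(x=i, y=j).

    Uses the natural logarithm, so the result is in nats.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (px[i] * py[j]))
    return mi


def nats_to_bits(nats):
    """One nat equals 1 / ln(2) bits."""
    return nats / math.log(2)


if __name__ == "__main__":
    # Hypothetical joint distribution of an outside signal and a memory state.
    joint = [[0.4, 0.1],
             [0.1, 0.4]]
    mi = mutual_information_nats(joint)
    print(f"{mi:.3f} nats = {nats_to_bits(mi):.3f} bits")
```

A memory that tracks the signal perfectly stores ln 2 nats, which is one bit, for a binary equally likely signal; a memory that is independent of the signal stores nothing.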
Now, if you increase the nutrient concentration again, it shows some 272 00:26:48,430 --> 00:26:53,950 swimming behavior. So it adapts, but it also has to decide, it also has to compare 273 00:26:53,950 --> 00:26:59,620 the different nutrients at different positions. That's how they have to manage 274 00:26:59,620 --> 00:27:04,679 the different timescales of decision making and memory or adaptation and we are 275 00:27:04,679 --> 00:27:10,120 looking into that as well. Coming to the conclusions, I hope I could convince you 276 00:27:10,120 --> 00:27:14,150 that information theory can be applied to biology, that it's a very interesting 277 00:27:14,150 --> 00:27:22,720 topic, it's a fascinating area and we are just at the beginning to do that. I also 278 00:27:22,720 --> 00:27:28,970 showed you that it's such that in signaling pathways the components can be 279 00:27:28,970 --> 00:27:34,290 tuned to their input, which allows differential regulation. So even though 280 00:27:34,290 --> 00:27:40,230 you don't have wires you can still specifically activate different proteins 281 00:27:40,230 --> 00:27:49,820 with one signal or multiplex, if you want. We are of course in the process of 282 00:27:49,820 --> 00:27:55,970 studying what features of the input signal are actually information-carrying. So we 283 00:27:55,970 --> 00:28:02,990 are looking into things like waveform and timing. And we want to look into how these 284 00:28:02,990 --> 00:28:08,730 things change in the diseased case. So, if you have things like cancer where certain 285 00:28:08,730 --> 00:28:15,610 signalling pathways are perturbed or fail, we want to exactly find out what does that 286 00:28:15,610 --> 00:28:21,270 do to the information processing capabilities of the cell. We also found 287 00:28:21,270 --> 00:28:27,320 out that estimating these information theoretical quantities can be a very 288 00:28:27,320 --> 00:28:33,390 tricky business. 
Another project we are doing at the moment is actually only on 289 00:28:33,390 --> 00:28:39,929 how to interpret these in a reliable manner, how to estimate these from sparse 290 00:28:39,929 --> 00:28:45,279 and noisy data. So that's also ongoing work. I would like to thank some of my 291 00:28:45,279 --> 00:28:50,960 collaborators, of course, my own group, but also some others, in particular the Copasi 292 00:28:50,960 --> 00:28:57,400 team, that is spread all over the world. And with that I would like to thank you 293 00:28:57,400 --> 00:29:00,940 for your attention and I would be happy to answer any question you might have. 294 00:29:00,940 --> 00:29:02,000 Thank you. 295 00:29:02,000 --> 00:29:05,210 *applause* 296 00:29:05,210 --> 00:29:13,049 Herald Angel: ... a very warm applause for Jürgen. If you have questions, there 297 00:29:13,049 --> 00:29:16,380 are two microphones, microphone number one, microphone number two and please 298 00:29:16,380 --> 00:29:23,180 speak loudly into the microphone. And, I think the first one is microphone number two. 299 00:29:23,180 --> 00:29:25,059 Your question please. Microphone 2: Has there been any work done 300 00:29:25,059 --> 00:29:29,529 on computational modelling of the G-protein coupled receptors and the second messenger 301 00:29:29,529 --> 00:29:32,450 cascades there. Jürgen: Can you repeat that, sorry. 302 00:29:32,450 --> 00:29:36,179 Microphone 2: Has there any work been done on computational modelling of G protein- 303 00:29:36,179 --> 00:29:37,910 coupled receptors Jürgen: G protein? 304 00:29:37,910 --> 00:29:40,110 Microphone 2: Yeah. Jürgen: Oh yes, I mean we are doing that 305 00:29:40,110 --> 00:29:44,460 because calcium is actually... I mean the calcium signal is actually triggered by a 306 00:29:44,460 --> 00:29:49,580 cascade that includes the G protein. Most of these receptors are actually G coupled 307 00:29:49,580 --> 00:29:54,410 or G protein coupled receptors. So that's what we are doing. 
308 00:29:54,410 --> 00:29:57,499 Angel: Thank you. Microphone number two again. 309 00:29:57,499 --> 00:30:01,461 Microphone 2: First of all thanks for the talk. I want to ask you talked a 310 00:30:01,461 --> 00:30:07,710 little bit about how different proteins get activated by different signals and 311 00:30:07,710 --> 00:30:15,190 could you go a bit into detail about what kind of signal qualities the proteins can 312 00:30:15,190 --> 00:30:22,309 detect? So are they triggered by specific frequencies or specific decays, like which 313 00:30:22,309 --> 00:30:27,999 characteristics of the signals can be picked up by the different proteins? 314 00:30:27,999 --> 00:30:32,429 Jürgen: Well, that's actually what we study. I mean we have another package that 315 00:30:32,429 --> 00:30:36,870 is linked here, the last one, the oscillator generator. This is a package in 316 00:30:36,870 --> 00:30:43,180 R that allows you to create artificial inputs, where you have complete control of 317 00:30:43,180 --> 00:30:48,809 all the parameters like amplitude and duration of the peak, duration of the 318 00:30:48,809 --> 00:30:55,230 secondary peak, frequencies of the primary peaks of the secondary peaks, refraction 319 00:30:55,230 --> 00:30:59,549 period and so on. You have complete control and at the moment we are also 320 00:30:59,549 --> 00:31:04,679 running scans and want to find out what proteins are actually sensitive to what 321 00:31:04,679 --> 00:31:09,900 parameters in the input signal. What we know from calcium is that, for example, 322 00:31:09,900 --> 00:31:18,600 calcium calmodulin kinase 2, also a very important protein in the nervous system, 323 00:31:18,600 --> 00:31:25,769 that shows frequency modulation. 
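[Editor's sketch: the idea of an artificial input with full control over its parameters can be illustrated with a much simpler stand-in. This is not the R package mentioned in the answer and does not reproduce its interface. The function `pulse_train` and all of its parameters are hypothetical, and it covers only a rectangular pulse with a period, a pulse duration, an amplitude and a baseline, with no secondary peaks or refractory period.]

```python
def pulse_train(t_end, dt, period, duration, amplitude, baseline=0.0):
    """Sampled rectangular pulse train as a list of concentration values.

    t_end     total length of the signal (time units)
    dt        sampling interval
    period    time between pulse onsets (1 / frequency)
    duration  length of each pulse
    amplitude height of each pulse above the baseline
    """
    n = int(round(t_end / dt))
    period_samples = int(round(period / dt))
    duration_samples = int(round(duration / dt))
    if period_samples < 1:
        raise ValueError("period must be at least one sampling interval")
    signal = []
    for k in range(n):
        # Work in whole samples to avoid floating-point modulo artefacts.
        in_pulse = (k % period_samples) < duration_samples
        signal.append(baseline + amplitude if in_pulse else baseline)
    return signal


# Two hypothetical inputs that differ only in frequency.
slow = pulse_train(t_end=10, dt=0.5, period=5, duration=1, amplitude=2.0, baseline=0.1)
fast = pulse_train(t_end=10, dt=0.5, period=2.5, duration=1, amplitude=2.0, baseline=0.1)
print(sum(v > 1 for v in slow), "high samples vs", sum(v > 1 for v in fast))
```

[Varying one parameter at a time, here the period, while keeping the others fixed is the kind of scan described in the answer. Feeding such inputs into a pathway model is outside the scope of this sketch.]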
It has also been shown experimentally where they 324 00:31:25,769 --> 00:31:29,700 put that protein on a surface, they immobilized it on a surface, and then they 325 00:31:29,700 --> 00:31:34,489 superfused it with calcium concentrations or with solutions of different calcium 326 00:31:34,489 --> 00:31:39,149 concentration in a pulsed manner and they measured the activity of that protein and 327 00:31:39,149 --> 00:31:44,720 they showed that, with increasing frequency, the activation gets bigger. At the same 328 00:31:44,720 --> 00:31:48,320 time it also shows amplitude modulation, okay? It's also sensitive to the 329 00:31:48,320 --> 00:31:55,250 amplitude, meaning the absolute height of the concentration of calcium. 330 00:31:55,250 --> 00:31:57,110 Microphone 2: Thanks. Jürgen: Thank you. 331 00:31:57,110 --> 00:32:00,720 Angel: And again number two please. Microphone 2: Hey. So you talked about a 332 00:32:00,720 --> 00:32:07,170 lot of on and off kinetics and I wonder, if you think about neurons, which are not only 333 00:32:07,170 --> 00:32:14,159 having on and off, but also many amplitudes that take a big role in development of 334 00:32:14,159 --> 00:32:20,990 cells and synapses. How do you measure that, so how do you measure like baseline, 335 00:32:20,990 --> 00:32:26,039 sporadic activity of calcium? Jürgen: Well, in our case there are 336 00:32:26,039 --> 00:32:28,990 different ways of measuring calcium. That's not what we are doing... 337 00:32:28,990 --> 00:32:32,460 Microphone 2: ... not really measuring, sorry, but more like how do you integrate it 338 00:32:32,460 --> 00:32:37,419 in your system? Because it's not really an on/off reaction but it's more like a 339 00:32:37,419 --> 00:32:43,450 sporadic miniature. Jürgen: Yeah, I mean in the case of 340 00:32:43,450 --> 00:32:48,920 calcium you have these time courses, okay? And we look at the complete time 341 00:32:48,920 --> 00:32:53,179 course. 
So we have the calcium concentration sampled at every second or 342 00:32:53,179 --> 00:32:58,620 half second in the cell by different methods. So our collaboration partners 343 00:32:58,620 --> 00:33:04,920 they use different dyes that show fluorescence, say, when they bind calcium. 344 00:33:04,920 --> 00:33:10,250 Some others show bioluminescence. And then we use these time courses. In the neural 345 00:33:10,250 --> 00:33:17,820 system it's a bit different. There you also get the analog mode, where neurons are 346 00:33:17,820 --> 00:33:23,600 directly connected and they exchange substances, but in most cases you have 347 00:33:23,600 --> 00:33:28,730 action potentials and I didn't go into neural systems at all because things there 348 00:33:28,730 --> 00:33:34,440 are totally different. You get these action potentials that are uniform mostly, 349 00:33:34,440 --> 00:33:38,100 so they all have the same duration, they all have the same amplitude. And then 350 00:33:38,100 --> 00:33:44,061 people in neuroscience or computational neuroscience mostly they boil the 351 00:33:44,061 --> 00:33:49,970 information down to just the timings of these peaks and they use this information 352 00:33:49,970 --> 00:33:53,940 and mathematically this is a point process and you can use different mathematical 353 00:33:53,940 --> 00:33:59,830 tools to study that. We are not really looking into neurons. We are mostly 354 00:33:59,830 --> 00:34:06,559 interested in non-excitable cells, like liver cells, pancreatic cells and so on, 355 00:34:06,559 --> 00:34:12,319 cells that are not activated, they don't show massive depolarization, like 356 00:34:12,319 --> 00:34:18,079 neurons. Thank you. Angel: Thank you. And obviously again 357 00:34:18,079 --> 00:34:22,228 number two. Microphone 2: Hi. So, you mentioned CaM 358 00:34:22,228 --> 00:34:28,699 kinases 2.
I got that you don't work on neuroscience specifically, but I'm 359 00:34:28,699 --> 00:34:33,458 pretty sure you have a quite extensive knowledge in the subject. What do you 360 00:34:33,458 --> 00:34:42,071 think about this, I would say, hypotheses that were quite popular a few years ago, I 361 00:34:42,071 --> 00:34:48,659 think in the US mainly, about the fact that the cytoskeleton of neurons can 362 00:34:48,659 --> 00:34:58,079 actually encode and decode through kinases in the cytoskeleton memories like bits in 363 00:34:58,079 --> 00:35:03,209 - you know - in a hard drive. What's your feeling? 364 00:35:03,209 --> 00:35:07,209 Jürgen: Well, I'm not going to speculate on that specific hypothesis because I'm 365 00:35:07,209 --> 00:35:12,319 not really into that, but I know that many people are also looking into spatial 366 00:35:12,319 --> 00:35:16,709 effects which I didn't mention here. I mean the model I showed you is a spatially 367 00:35:16,709 --> 00:35:22,259 homogeneous model. We don't look at concentration gradients within the cell, 368 00:35:22,259 --> 00:35:27,140 our cells are homogeneous at the moment, but people do that. And of course then you 369 00:35:27,140 --> 00:35:33,500 can look into things, for example, like a new topic is morphological computation, 370 00:35:33,500 --> 00:35:39,079 meaning that spatially you can also perform computations. But, if you're 371 00:35:39,079 --> 00:35:41,119 interested in that, I mean, we can talk offline... 372 00:35:41,119 --> 00:35:43,559 Microphone 2: ... do you buy into this theory... 373 00:35:43,559 --> 00:35:45,390 Jürgen: ... I can give you some pointers there.. 374 00:35:45,390 --> 00:35:49,529 Microphone 2: ... but do you have a good feeling about these theories or you think 375 00:35:49,529 --> 00:35:52,019 they're clueless. Jürgen: Well, I think that the spatial 376 00:35:52,019 --> 00:35:56,059 aspect is a very important thing. 
And that's also something we should 377 00:35:56,059 --> 00:36:01,609 look at. I mean, to me random fluctuations are very important, intrinsic fluctuations 378 00:36:01,609 --> 00:36:05,819 because you can't separate them from the dynamics of the system. They are always 379 00:36:05,819 --> 00:36:11,589 there, at least some of the fluctuations. And also the spatial effects are very 380 00:36:11,589 --> 00:36:15,230 important, because you not only have these different compartments, 381 00:36:15,230 --> 00:36:19,979 where the reactions happen, but you also have concentration gradients across the 382 00:36:19,979 --> 00:36:24,549 cell. Especially with calcium, people have looked into calcium puffs and calcium 383 00:36:24,549 --> 00:36:29,769 waves because, when you have a channel, that allows calcium to enter, of course directly 384 00:36:29,769 --> 00:36:34,059 at that channel you get a much higher calcium concentration and then in some 385 00:36:34,059 --> 00:36:39,710 cases you get waves that are travelling across the cell. And to me it sounds 386 00:36:39,710 --> 00:36:43,990 plausible that this also has a major impact on the information processing. 387 00:36:43,990 --> 00:36:49,389 Yeah. Thank you. Angel: Thank you. In this case, Jürgen, 388 00:36:49,389 --> 00:36:54,540 thank you for the talk. And please give a very warm applause to him. 389 00:36:54,540 --> 00:36:56,140 *applause* 390 00:36:56,140 --> 00:36:58,612 Jürgen: Thank you. 391 00:36:58,612 --> 00:37:03,432 *applause* 392 00:37:03,432 --> 00:37:08,315 *postroll music* 393 00:37:08,315 --> 00:37:26,000 subtitles created by c3subtitles.de in the year 2019. Join, and help us!