Max Perutz Lecture 2021: The coming of age of de novo protein design - David Baker

Captions
Oh, hi. Yeah, thanks, Mark. So today it's my really great pleasure to introduce Professor David Baker for this year's Perutz Lecture. David did his PhD with Randy Schekman at UC Berkeley, and then he took a BART ride to go to UCSF for a postdoc in David Agard's lab. David is now the director of the Institute for Protein Design at the University of Washington and an HHMI investigator.

David is really a pioneer in the field of protein design and protein structure prediction. In particular, he contributed the software Rosetta, which has now blossomed into an amazing suite of software for protein structure prediction and protein design, and I'm sure that many of us today have used this software. I think it's really a tribute to David's work over the years that he and the field are now at the point where they can essentially design proteins that will spontaneously fold into an amazing range of different shapes. Interacting with David, I have the feeling there is absolutely no limit to what those guys can do. They are even at the point where they can make proteins that use energy to do stuff, meaning they can make molecular machines; they can design binders from scratch for a protein of interest; or even design new materials that interact with biological systems in novel ways. I think this is really transformative for biology. It has definitely changed the way I'm doing biology, because now we can use protein design to investigate biological systems in completely orthogonal ways, leading to new insights.

I think that David is really a visionary, and for me he's a perfect example of what thinking big means. Every single project of David's is a moonshot, and amazingly he lands nearly all the time; as far as I can say, probably all the time. I'm sure his lecture today will convince you of this, because his work on the coronavirus system has been absolutely out of this world. But to me
David is not only about that; he's about elegance. What I'm always amazed by is that, because he's not restricted by what biology has to offer, he can always design the simplest and most elegant structure and protein to address his problems. That's really the art of a true protein engineer, and I think this is really remarkable.

David's seminal contributions to the field have been recognized by many prizes. For instance, in 2012 the Biochemical Society awarded him the Centenary Award, and more recently he got the Breakthrough Prize in Life Sciences. David is also a member of the National Academy of Sciences and the American Academy of Arts and Sciences.

Now, before we start, I just have to say that, as you have probably seen, this lecture and the session that follows are being recorded. Talking about the Q&A: if you have a question, you should keep it for the end of the talk. The easiest is that you just raise your hand and we will unmute you so you can ask it to David. If you don't have a microphone, you can also type it in the chat or in the Q&A, and I will ask it for you. So, in a nutshell: thank you so much, David, for accepting our invitation today. We really look forward to it, and without further ado, the stage is yours.

Oops, does it work? Sorry, we tested it just a moment ago, but let's see, I just have to find the right window. Shoot. It will work. Sorry about that. Where did it go? Here it is.

Well, thanks. It's wonderful to be here. I wish I was visiting you in person. That was a very nice introduction from Manu, but we have actually found from experience that Manu can solve any problem in cell biology. We have this wonderful collaboration, and he and his student Joe Watson have just performed miracles with our designed proteins. It's been
absolutely amazing to work with Manu and Joe. I tried to put in my talk examples of the things that we're doing together, but I realized there were so many that I couldn't possibly fit them all in. That's been one of the really fun things for me in science over the last year or two.

OK, so today I'm going to tell you about what we can now do with de novo protein design, and I'm going to start by framing it in the context of the coronavirus problem. The basic principle underlying protein design, well, up until very recently, is that proteins fold to their lowest free energy states. This isn't always true, but it's a good first principle for organizing thinking. If you want to predict the structure of a protein from its sequence, you search through the different possible states of the protein and evaluate their energies, and the native structure will be the lowest-energy structure. If you want to design a protein, you make up a brand new structure, and then you need to find a sequence whose lowest-energy structure is that state. We and our colleagues have developed a computer program called Rosetta over the years that does the sampling in sequence space and in structure space and evaluates energies, and we've been progressively improving the Rosetta energy function, trying to describe van der Waals interactions, hydrogen bonding interactions, and solvation interactions as accurately as possible. Now, very recently this whole picture has gotten a bit upended, and we're using deep learning networks to hallucinate new proteins. If there's time at the end, I'll say a few words about that.

What I'm going to focus on throughout my talk today is the problem of de novo protein design. This is a very interesting problem because the number of possible sequences, even for a 100-residue protein, is astronomical, and nature and the evolutionary process have only sampled a much, much smaller
fraction, a very, very tiny fraction, of this. If we think about all of sequence space as being represented by this gray bar, then naturally occurring proteins fall into these clusters, the protein families. What I'm going to be telling you about today is designing new proteins from scratch, from first principles, that are completely unrelated to any naturally occurring protein. I'm going to tell you about a lot of proteins today; none of them have any sequence relationship to any protein of known structure. In everything I'm going to tell you about today, after we do the design calculations we obtain synthetic genes that encode these new amino acid sequences, we put the synthetic genes into bacteria or yeast, they produce the proteins, and we purify them and then characterize them. We're also doing a lot of work now going beyond the 20 naturally occurring amino acids and making synthetic molecules by chemical synthesis, but I won't be talking about that today.

I'm going to organize this around COVID. When the pandemic broke out, as you know, the sequence was published in January of a year ago, and from the sequence of the spike protein we were able to build actually quite a reasonable model of the spike glycoprotein. Then, as the cryo-EM structures and crystal structures were determined, we switched over to using those. I'm going to tell you about what we have been doing as far as antivirals, diagnostics, and vaccines.

For antivirals, we have been focusing on blocking the interaction between the spike protein and ACE2, the human host cell receptor. The timing was interesting, because we've been making a lot of progress recently in designing proteins to bind to protein structures of interest. Basically, without using any information other than the target structure,
we can now design binding proteins, and that's depicted on the right here. We make up very large libraries of hundreds of thousands of different protein backbones, and then we dock those against the target; in this case, this is the receptor binding domain (RBD) of the virus. We identify those docks which are shape complementary and from which we can make chemically complementary interactions with the virus. So we first dock the scaffolds, then we design the interface, and we pick out those where the interactions really look optimal.

The other approach we used here was to take a piece of ACE2, a helix. This is using kind of privileged information, because we have the structure of the complex between the RBD and ACE2, and then we built small proteins around it. So this one is an ACE2 mimic. These designs cover different binding sites on the surface of the virus, and here are examples of proteins designed using these two different methods. This is one that was designed completely from scratch: it's 55 amino acids, hyperstable, and it binds with about 100 picomolar affinity. Now, interestingly, the one where we used the ACE2 helix is, first of all, a bigger protein, because we need to fit the ACE2 helix in, and it's considerably lower affinity because it's based on ACE2.
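For a sense of scale, the two affinities quoted around this point in the talk (about 100 picomolar for the from-scratch design and roughly 10 nanomolar for the ACE2-helix design) can be compared with the textbook relation between dissociation constant and binding free energy, ΔG = RT ln(Kd). The sketch below is not from the talk: it assumes simple 1:1 binding at 25 °C with the binder in excess, and the function and variable names are made up for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy in kcal/mol (1 M reference state), 1:1 binding assumed."""
    return R_KCAL * temp_k * math.log(kd_molar)

def fraction_bound(binder_molar, kd_molar):
    """Equilibrium fraction of target occupied, assuming free binder ~ total binder."""
    return binder_molar / (binder_molar + kd_molar)

# Affinities as quoted in the talk (approximate values).
kd_de_novo = 100e-12    # ~100 pM, designed from scratch
kd_ace2_helix = 10e-9   # ~10 nM, built around the ACE2 helix

# Free energy gap between the two binders (hypothetical comparison, not talk data).
ddg = binding_free_energy(kd_ace2_helix) - binding_free_energy(kd_de_novo)
print(f"Free energy gap: {ddg:.2f} kcal/mol")

# Occupancy of the target at 1 nM binder for each design.
for name, kd in [("de novo", kd_de_novo), ("ACE2 helix", kd_ace2_helix)]:
    print(f"{name}: {fraction_bound(1e-9, kd):.2f} of target bound at 1 nM")
```

Under these assumptions the hundred-fold difference in Kd corresponds to about 2.7 kcal/mol, and at 1 nM binder the tighter design occupies roughly 91% of the target versus about 9% for the weaker one.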
It's closer to 10 nanomolar. So we can actually do better in terms of binding an arbitrary new target by building from scratch.

OK, so this is now the cryo-EM structure of this 55-residue de novo designed protein bound to the spike trimer. This is from David Veesler's lab. You can see one of the RBDs is in the down state and two are in the up state, and you can see the protein in ribbon here. What's really exciting from the computational design point of view is that if we compare the experimental structure shown here with the computational model, which came out of the docking and design process I described, you can see that the designed protein is exactly where it was supposed to be. The backbone is in exactly the right place, not only internally in the fold but also relative to the RBD. The computational model and the experimental structure are superimposed on the RBD, and you can see the designed protein is in exactly the right place. If we zoom in and look at the side-chain interactions, we see that those are nearly identical to the design model as well. So this was one of our first indications that we could really pick an arbitrary site on an arbitrary protein and design very high-affinity binders to bind there.

These proteins are very potent neutralizers of the coronavirus. In this experiment, live virus is being added to cells and then different designs are being titrated in, and the one that I showed you is this magenta line here. At about 15 picomolar, this protein is blocking entry; that's the IC50. It's a small protein, so this corresponds to just 0.15 nanograms per mL. It's perhaps the most potent compound known today to block the ACE2-coronavirus interaction. Now, in
collaboration, we've been doing a lot of animal experiments. This is with Syrian hamsters, which are one of the classic models for coronavirus infection. Hamsters that get the virus get very sick; it's a very acute infection. But if this protein is administered intranasally from 48 hours before to 24 hours after, there's almost no effect; you can see that we're getting nearly complete recovery by seven or eight days here.

So far, I showed you that the small protein can bind to the three different RBDs on the coronavirus spike. We've obviously been very concerned about escape in the last six months, so even though we're at 15 picomolar binding for the individual domain, we thought we could obtain even higher affinity, through avidity, by making constructs which engage multiple RBDs simultaneously. What we've done here is to design a trimeric version that holds three of these binding domains in exactly the right orientation to engage three of the RBDs. Here's some negative-stain EM data showing that this designed trimer in fact sits on top of the spike, and we actually have a pretty nice cryo-EM structure of this now. So we can take binding domains and position them quite precisely, and I'll be alluding to that in a moment in the context of designing biased agonists and superagonists for cellular receptors.

With these multivalent versions, both trimerized versions like the one that I just showed you and versions where we take three different domains and connect them by linkers, we get very potent protection against all the escape variants of the virus that we've seen. This is the South Africa and the UK variants, for example, and these are the different multivalent versions. We're again getting IC50s which are better than about 100 picomolar. We're seeing similar
protection in animals as well. Another nice thing about these small proteins is that they're not like antibodies. The problem with antibodies, particularly for something like the coronavirus, is that there's just not enough manufacturing capacity to produce them for everybody who would need them. But these small proteins express very well in E. coli. This is just a whole-cell lysate expressing that de novo designed protein; it's down at the bottom here, 55 amino acids. You just heat the cells, and most of the E. coli protein denatures; this is heating at about 80 degrees. The only thing that's left soluble after spinning that heated lysate is the designed protein. There are some small levels of contaminants, which can be cleaned up quickly. So we think this is a way to really drop the cost of drugs, and coronavirus antivirals in particular.

We're going through the steps to get these to clinical trials, and that's been very interesting. One of the tricky things has been that there isn't a lot of precedent for this. The idea is a nasal spray. You don't have to refrigerate this stuff, so you could just go to the drugstore, get it, and spray it in your nose if you're going to get on a plane, or if you're in India or just in a dangerous place, you could take it when you needed it. And again, it's super stable, so you don't have to keep it refrigerated, and distributing it should be much easier.

The other thing that we're working on in this context is that it did take us a while, and there was a fair amount of experimental screening that we had to do, to find these really, really good designs. So we're now trying to improve this whole computational pipeline so we can go from identification of a new pandemic threat to very high-affinity binding proteins in just
a few weeks, which we think is possible.

We've taken the same methodology, or actually, in parallel, Longxing Cao and Brian Coventry really made this enormous breakthrough in the last several years in being able to, like I said, take a protein of known structure, pick a region, and then design a very small protein, less than 60 amino acids, that binds to it. They've been making binders to the major cell surface receptors, like the FGF receptor, PDGF receptor, and EGF receptor, and to many cytokine receptors, and I'm just showing you a few of them here. We have crystal structures now of many of them, and again they're binding very much like they were supposed to. It's kind of interesting: they're all very small proteins, and this just shows surface views, but their shapes and chemical properties are different enough to make them very specific. This is the different designed binders in binding experiments against the different targets, and you can see we're getting really good specificity. In the case of the EGF receptor, we made binders to different receptor subunits.

What we're now very excited about doing is trying to manipulate cell fate by making superagonists and different types of agonists which hold different receptor subunits together in different orientations to get new signaling outputs. There's a first example of this from a year or two ago, with Chris Garcia, where basically we made scaffolds which held the erythropoietin receptor at different angles and distances from each other, and what we found is that we got different signaling outputs. Now, with these small binding domains, we think we can really generalize this to signaling in general, and there are all sorts of exciting applications, like transdifferentiating cells in the body for regenerative medicine. So we're very excited
about that area.

OK, the second area in our response to COVID is diagnostics: developing sensors. So far I've talked about designing these really rigid, rock-like binding modules, but for this we wanted to see if we could build molecular devices. Here we have a molecular device which has two states: a closed state, which is dark, and an open state, which emits light. This opening-closing transition is coupled to the free energy of binding a target. We've got a binding domain that's enclosed here, and when the target is present, the binding energy drives opening. What we've done here is simply take that small coronavirus binder that I described and embed it in the closed state of the system, so that when the coronavirus is there, it opens up and light is emitted; basically, luciferase gets reconstituted. This works really well. The sensor is dark, and then when the spike or the RBD is added, we get this very rapid and large increase in luminescence.

More recently, we've been exploring this for monitoring responses to vaccination. Since this is in thermodynamic equilibrium, we can have the sensor, add the RBD, and it opens. Now, if we add serum samples that contain antibodies against the RBD, they compete with the sensor and we get a loss in signal. We actually have nicer data than this now, but in looking at samples from patients who have been vaccinated and have different levels of response to the virus, we can assess their antibody response by looking at the neutralizing antibody titer, and we can also look at the change in signal. You can see they're quite well correlated. So this is a very quick test you could run at home and read on a cell phone, and we're hoping this will be useful.

We can take the same concept further, and as I showed you, we can
design binders to many other types of proteins, so we can convert these into sensors in the same way, by caging them in the off state of the protein device so that when the target is present, the thermodynamics drive opening. For example, this is a troponin sensor. In these experiments we've got many different sensors, and we're adding each of the targets, and this is similar to the specificity plot I showed you before: only when we add troponin to the troponin sensor do we get light emitted. And here, for example, is one against botulinum toxin, where only the botulinum toxin sensor is responding. So we think this is a very general way of making molecular sensors.

Again, I wanted to give you an example of how we can use this molecular device concept for problems beyond coronavirus, and in this case beyond sensing. This is a rather different problem, at least on the face of it: the problem of doing calculations in the body. Antibodies just work by going after specific targets, but sometimes you want to do more subtle things, like recognizing cells that have two markers but not just one alone. You may have a tumor that doesn't differ from healthy cells in any single marker but differs in the combination. We can take the same type of switchable system and direct the different components to different cell surface proteins, so that only when we reconstitute the full system do we get switching. In this case, rather than emitting light, what we're doing is exposing a motif that recruits an effector, for example a CAR T cell.

That example is shown here, and it's a little bit more complicated. Imagine a tumor cell which has two markers that only occur together on the surface of the tumor, and then there's also a healthy cell marker, and we want to make sure that we don't activate CAR T cell
signaling against any cells that have that healthy cell marker. What we do is take the two components of the switchable system that I described earlier and direct them to the two markers which occur on the tumor cell. Then we take a third design, which sequesters one of the two components required for switching, and direct it to the healthy cell marker. If we do that, then we only get CAR T cell recruitment and activation in the presence of cells that have the two tumor cell markers but not the healthy cell marker. That's shown here.

The third area that I wanted to tell you about in the context of COVID is vaccines. For a number of years now, we've been developing methods for designing self-assembling proteins, and much of the rest of my talk is going to be about self-assembly. In particular, we have been developing a wide variety of polyhedral nanoparticles. The basic principle is to match the symmetry axes of a polyhedron, for example, in this case, an icosahedron, which has five-fold and three-fold symmetry axes. We place protein homo-oligomers with five-fold and three-fold symmetry on those symmetry axes, we can move them in and out and rotate them, and then we design interfaces between them. Using this procedure, we've been able to design a wide variety of very homogeneous protein nanoparticles. You just express the two components in E. coli and they self-assemble. These are just EM fields of what comes out; they're very homogeneous and robust.

My colleague Neil King at the Institute for Protein Design has taken the receptor binding domain of the spike and displayed it on the surface of one of these nanoparticle systems, and again he gets very homogeneous nanoparticles. You can see the RBDs, which are relatively small, on the surface here. What's very exciting is that these induce very, very strong responses;
they're much stronger than the response to the trimer alone. This is avidity, we think: the avidity is really stimulating B cells that have the corresponding B cell receptors. The current mRNA and other subunit vaccines basically use the trimer, and with these nanoparticle vaccines there's considerably more neutralizing antibody elicited and better protection. Neil's team has had these in clinical trials for the last six weeks, and we're about to get feedback on them. These are among the most promising of what are known as the second-generation coronavirus vaccines.

Now, again going beyond COVID, we've been extending this concept of building nanoparticles from proteins with symmetry. What we realized is that we could actually combine form and function by embedding in these nanoparticles symmetric proteins that have functions in addition to symmetry. Here we're backsliding a little bit, because one of the proteins we're using is actually a naturally occurring protein; I think that's the first time in my talk where that's happened. These yellow things are antibodies. What we've done is to design, in this case, a pentameric protein which binds the Fc region of an antibody such that the five-fold axis of the pentamer and the two-fold axis of the antibody lie on the five-fold and two-fold symmetry axes of an icosahedron. What that means is that if you take this designed pentamer and add it to any antibody, it will assemble it into a perfect icosahedral nanoparticle. We've also made C4 proteins, tetrameric proteins, that hold antibodies so that they're placed on the two-fold axes of octahedra, and here we have a trimeric one that holds antibodies so they're on the two-fold axes of tetrahedra. So you can take any antibody and rapidly format it in these different
oligomeric states, and then you can look at the effects of the antibody, for example on signaling. This is an example with an anti-death-receptor antibody. This antibody alone doesn't do anything to tumor cells, but changing the presentation of the antibody changes the engagement of the death receptor, and we get killing of the cells. So if you have any antibodies you're interested in and you'd like to potentiate them, or explore how clustering of their receptors alters signaling, now you can do it very easily. There's no covalent modification or anything else required here; it's just non-covalent interaction.

Beyond biology, I just want to show you the kinds of things that we're thinking about, and this is happening a lot in the group now. Nate Ennist designed a C2 protein that binds two chlorophylls positioned as in the special pair in photosynthesis, and those have very interesting optical properties on their own. Then Changi Wang built these C2 structures into octahedra, so now we have octahedra with a C2 chlorophyll pair on each axis. Again, this is a case where the design is very close to the computational model. We're now thinking about putting electron acceptors, like nanoparticles, in the middle to try to start creating artificial light-harvesting complexes.

OK, so now this is where I start talking about all the fantastic things we've been doing with Manu and his student Joe. This is work by Ariel Ben-Sasson. Ariel took a D2 protein and a D3 protein and arranged them so they would come together to form a hexagonal lattice. This is what happens when you produce the proteins in E. coli: you get this very nice lattice. Here he put a GFP in at the trimer vertex, and you can see the GFP is
there. Then Ariel made the proteins separately, and when he mixed them together, they spontaneously formed this really beautiful and regular hexagonal lattice. This is a superposition with Ariel's design model; there are the D3 and the D2 components, and they're exactly where they're supposed to be. By great good fortune for my group, Ariel met Manu at a meeting a few years ago, and we've had this really exciting collaboration where Joe and Manu have been functionalizing these components and getting really spectacular results, which I'm hoping you've all heard about, so I'm not going to go over them again: being able to demonstrate assembly on cells, and, very interestingly, these arrays shut down endocytosis, and they have really profound effects on cell polarity. If you haven't heard Manu speak about it, you definitely should; it's absolutely amazing.

We've also been thinking about the following. In nature, we have proteins that interact with inorganic minerals and cause mineralization, as in tooth and bone and seashells. What if we could design proteins that could template inorganic mineralization, but with semiconductors and other materials that are of interest to modern humans? It would be really cool. So we've been starting to look at this. This is work from Harley Pyles. He designed proteins which are perfectly epitaxially matched to an inorganic surface: they're repeat proteins that have a repeat spacing that matches, in this case, the mica lattice. He made these proteins long enough that they were visible by atomic force microscopy; you can see them binding here. Then what he did was design interfaces between these to direct the formation of different types of assemblies. This is now a hexagonal lattice again. It's different from the one that Ariel designed because this is only forming on mica, so
these individual proteins are epitaxially matched, and then they have interfaces which direct formation of, basically, this trimer on the mica. Since these are perfectly straight proteins, Harley could make a series of variants just by making them shorter and shorter and shorter. That doesn't really work with naturally occurring proteins, but you can see the experimental results down here: the hexagonal lattice gets finer and finer. These are averages down here. What we're now trying to do is to flip this around and use the designed proteins to actually template inorganic mineral growth.

OK, so everything I've told you about so far is soluble proteins, but we can take the same principles and apply them to membrane proteins. This is work of Chunfu Xu. He designed these alpha-helical, channel-like proteins which have two rings of alpha helices surrounding a central pore. These are soluble protein designs, and these are comparisons of the design models to the crystal structures. He and Peilong Lu then took these and resurfaced the outside to make them membrane compatible and expressed them in mammalian cells. What was very exciting, in collaboration with Bill Catterall, is that these proteins turned out to be very selective potassium channels. This is looking at conductance; this is a patch clamp on a mammalian cell expressing this protein. Looking at conductance for potassium, you see it's much more selective for potassium than sodium. This is kind of interesting: there's been so much written about what makes potassium channels selective for potassium, and here we can just design a protein with a pore and get potassium selectivity pretty much right off the bat.

What we can do now, and are doing, is start exploring the effect of the exact composition of the lining on the ion conductance. In this experiment, Chunfu took a glutamate that was near the entrance of the channel and
turned it into a cysteine. What he found, and what you're seeing here, is that this channel still conducted ions just fine, but if he chemically modified that cysteine, the conductance went away, demonstrating that the ions are actually flowing through the central channel. Peilong and Chunfu also made those larger channels, the ones with 16 helices. This is a cryo-EM structure Peilong solved, and there's a really big channel here which actually allows small organic dyes to go through.

Those were alpha-helical channels. Anastassia Vorobieva took on the challenge of designing all-beta, transmembrane beta-sheet proteins, and these have some very interesting properties. This was a very challenging design problem. It turned out that the key, which was very unintuitive at the beginning, was to make the beta strands have extremely low beta-sheet propensity, because the designs, since they're very hydrophobic, would otherwise tend to form amyloid. She had to really increase the cooperativity by reducing the strength of the interactions. This shows a comparison of her design model to the crystal structure, and they're very, very close. What's exciting about this is that, as many of you know, naturally occurring transmembrane beta barrels have been used for things like single-molecule DNA sequencing and other approaches. Now we can pretty much design them to order. Here are some of the ones we're currently designing, with much larger pores, for example, and we're very excited about using these as filters, maybe for various types of sequencing, and so forth.

OK, so then in the very last part of my talk, I just want to come back to this idea of responsive designs that aren't just rocks, that actually sense the environment. I'm going to focus on pH sensitivity for the next couple of minutes. This is a design of
Scott Boyken's. This is a C3 trimer with these different types of layers. The red layers look like this: looking down from the top, they have buried networks that involve three buried histidine residues that are fully making hydrogen bonds. These black layers are hydrophobic layers. This thing is a very stable protein at pH 7, but when the pH drops these histidines become protonated and the whole thing blows apart. Scott could control the pH at which this happens and the steepness of the transition by changing the number of these histidine layers and by changing the overall stability of the protein. So the number of histidine layers determines the cooperativity, and the overall stability, which can be tuned, basically determines the set point. And this is a crystal structure showing what the hydrophobic layers and these hydrogen-bonding layers look like, so again we can design these very accurately. So this is again a repeated theme now in my group: we have these really cool building blocks, this incredibly pH-responsive protein, and Hao Shen took this protein and built it into one-dimensional fibers, basically by docking it against itself and then designing the interface to drive assembly into fibers. These are just cryo-EM pictures of the different types of fibers he was able to make. I won't belabor this, but Justin Kollman and Eric Lynch solved the structure of this fiber, and like many of the fibers that Hao has designed it's nearly identical to the design model. So here we've designed a monomer to be pH sensitive and then Hao has very precisely arrayed it into a fiber. So this is another example of a really cool collaboration with Manu and Joe. First of all, what Hao did was just look at different pHs: you have the fiber at pH 8, you drop to pH 3 and it comes apart, and when you raise the pH
then it goes back. But we had no idea how fast this happened. This is a really cool experiment of Joe's: this is with fluorescently labeled fiber, so you can change the pH and follow what happens over time, and you can see that in a very small amount of time the fiber basically explodes. It's almost like a 1D phase transition because this thing is so cooperative: each unit has nine histidines in it and then the whole thing is made out of these units. So we're very interested in zooming in to see what's happening here. And it's very, very sensitive: in this case, being at pH 3.4 has no effect on the fiber, but at 3.1 it comes apart, and Hao can change the set point by changing the number of histidines inside the monomer. And then the final area, I think some of you heard this from Alexis Courbet, is that now we're starting to think about designing rotary machines. What Alexis has done is design proteins which are axles and other proteins which are rings, and he's figured out how to get the rings to assemble on the axles. These are examples: here's an axle and a ring, and then he's got the ring assembled on the axle. He's done this both in cases where the symmetry is matched, so he has a C3 axle and a C3 rotor, as well as other cases where there's no match at all in the symmetry, for example a D3 axle and a C5 rotor. And when you do this you get very different types of rotational energy landscapes. When you have a match in the symmetry of the axle and the rotor you get a simple three-fold symmetric rotational landscape, as evaluated by Rosetta, but when there's a symmetry mismatch it's a much more rugged landscape. So what Alexis has been doing is using cryo-EM to map the rotational landscape of these, and I'll focus in on this system. So here's Alexis's computed landscape and
here are different rotational minima that he observes experimentally. This is a case where there are two rings on the rotor: you can see here they are staggered relative to each other, and here they're eclipsed. And we have speculations about what the differences between the symmetry-matched and the symmetry-mismatched ones will be that are so far holding up: that we have fewer, more pronounced minima in the symmetry-matched case. The next thing that Alexis has done is to put catalytic sites at the interface between the ring and the axle. I'm not talking about it today, but we've been doing a fair amount of enzyme design, so he can actually build active sites at the interface, and now what we're doing is using single-molecule measurements to try and look at the relationship between fuel hydrolysis and rotational motion, and we have some tantalizing but rather preliminary results on that at this point. So just a couple of other things. We've been trying to engage the general public in protein design, and so we created this game called Foldit, where Foldit players start with a fully extended chain and then they fold it up, and they can change the sequence, and they've been making all sorts of really cool-looking proteins. We've been making them in the lab and solving structures, and a number of them actually fold up into very interesting new structures that are very close to what they designed. I'm probably way over time, but if I have a few minutes I'll say a few words about how deep learning is affecting this. So let me just conclude. I think I've shown you that we understand the fundamentals of protein folding and assembly to the point where we can actually design brand new molecules that do a
wide variety of different things. It's just a super exciting time now because, like Manu alluded to, there are just so many different things that one can design, so I'm really excited about collaborations; if you have ideas, let me know. I've realized that really the limitation now is more the network of people interested in testing things or exploring things together, and again the collaboration with Manu has been so wonderful. So okay, let me just quickly go through acknowledgments. The antiviral work was done by Longxing and Brian and Inna, and we've had really fabulous collaborators on the in vivo side. The diagnostic, the switch, the invention was that of Alfredo and Andy. The antibody nanocages, that was the work of Robby Divine, a grad student in the lab, and David Veesler solved many of the structures I showed you. And then, I didn't actually get a chance to talk about extensible materials, the self-assembling 2D lattice work of Ariel Ben-Sasson; again it's been a fantastic collaboration with Joe and Manu. The pH-dependent fibers: the pH-dependent building block was built by Scott Boyken and then Hao built it into the fibers, and again another great collaboration with Joe and Manu. The de novo designed membrane proteins: Chunfu and Peilong designed the alpha-helical ones, and Anastassia and Samuel are designing the beta-sheet ones. The calculations on the surface of cells for CAR T cell targeting were the work of Marc and Scott and Jilliane; it's been a great collaboration with Stan Riddell. Foldit is the work of Brian Koepnick and a team of people now around the world. And Alexis and Jakov; really Alexis has been the driving force behind the whole rotary motor work. And do I have two minutes, Manu? Okay, all right. So everything I've told you about is based on the principles of physics: we're trying to design proteins that fold to the lowest energy states. But deep
learning is changing this a little bit, so let me describe the basic concept. We've been developing deep neural networks for predicting structure from amino acid sequence, and basically the way they work is you put in a sequence and they predict the set of distances between the amino acids, from which we can generate a three-dimensional structure. Now what we realized is that we could actually feed in completely random sequences. Of course those are not predicted to fold to any structure at all, so you get a very blurry distance map, but then we could optimize the sequence, basically optimize how sharp and contrasty the distance map was, to the point where the network actually had an opinion about what it was going to fold to, and then we could generate structures. You're probably familiar with the idea that if you have, say, something that can recognize images of cats on the internet, you can turn it around and generate random images and then tune them until the network thinks they look like cats, and those are called hallucinations. In the same way, these are hallucinated proteins, because these aren't any particular protein; these are sequences that the network thinks should fold. We didn't put any constraint on what that structure actually was; we were just trying to optimize the sequence so that it will fold to something. So this is just showing what the process looks like. We start off with a random sequence, and then we basically are doing Monte Carlo optimization of the sequence, just scoring how contrasty the distance map is. You can see it starts off very fuzzy and blurry, and by the end of the optimization we have these very strong contact patterns. This is just residue by residue, and a dark spot means that the residues are close together. And when you generate models from these you get
these different types of structures, and if you do this many, many times you get a really wide variety of sequences and structures. What's interesting is if you take these sequences and fold them up using Rosetta, they're actually predicted to fold to their designed structures. This is completely orthogonal, because Rosetta has this physical model, whereas the network really doesn't know anything about physics except what's in its disembodied millions of parameters. So we've made quite a number of these proteins in the lab. They fold, they have the target CD spectrum, they're very stable, and we've now solved structures of three of these by X-ray crystallography and NMR, and the structures are very close to the design models, showing that the network really can hallucinate brand new proteins that fold as designed. So that's showing that this hallucination process can generate new proteins. There are some very nice properties here even when you think about it from a more physical point of view. One of the problems with traditional Rosetta protein design is that it's kind of myopic: it just sees the structure you're trying to design, and it basically searches for a sequence which is very low in energy in that structure. But as you do that there's the possibility that there's some other state which is getting very low in energy as well, and you only find that out after the fact, when you have the sequence and you do a structure prediction calculation to see if your sequence actually folds to that structure. Now with the network we can actually compute the probability that the sequence will fold to the structure of interest, and if we have a structure we want we can then explicitly maximize that probability, and in that case it's taking into account not only the structure we want but also the whole landscape, so we can basically do landscape-aware sequence design. And the
final example I'll give you is this. I told you about free hallucination, just letting this process run, and I told you about the very focused problem where you have a structure and you want a sequence which has a very high probability of forming that structure. We can now make a hybrid of those two. We can say, look, we want this part of the structure to be in this state, for example to bind a target, but the rest of it we just want to be as simple as possible, and we want it to be such that the protein will fold. So we're keeping part of it, a functional site, fixed and then hallucinating the rest of the sequence, maximizing the probability that the protein folds to the state. So we've got basically a hybrid loss function: one part is just rewarding folding and the other part is rewarding recapitulation of that functional site. So here's an example. This is C3d, it's part of the complement cascade, and the two helices that bind complement are down here, and this just shows examples of what happens if you keep those parts fixed and hallucinate the rest: you get little proteins where the back side is just buttressing the binding site, essentially. We can do this with insulin, and we can do the same thing with beta-sheet proteins. Anyway, sorry, I've probably gone on way too long, but the free hallucinations were the work of Ivan, Sergey, Sam and Tamuka; Chris and Basile really explored the idea of designing in the full energy landscape; and Doug and Sidney and Jue have been doing this constrained hallucination. So yeah, thanks for your attention, and I'm sorry for going on and on.
Wow, thanks a lot, that was amazing. Actually we already have many, many questions; there are a lot of people that raised their hands. So I think the first one was Lucas Carrey. So Lucas, I'm allowing you to talk, so you can ask your
question. Oh sorry, that was a mistake. All right, well then, sorry about that; then I will move to Mutum. I'm sorry if I'm butchering your name.
Hi, can you hear me?
Yep, we can.
Yes, very nice talk, Dr. David. I just wanted to come to your slides on the transmembrane helix design and the beta barrels. So I'll just get on with the question: how do you tune the polarity, in the sense of how these transmembrane helices or beta barrels orient with respect to the cell membrane, their orientation in the plasma membrane? How do you tune that? And my second question is whether they have a preference for the lipid environment that they are in.
Yeah, that's a very good question. For the alpha-helical proteins it's been more straightforward than for the beta barrels. For the alpha-helical proteins it's basically making the outside part that's in the membrane hydrophobic, and then we put snorkeling amino acids at the membrane boundaries. But the beauty of de novo design is we can design for membranes of any thickness; they can be twice as thick or half as thick, we just change the lengths of the helices or the beta strands. So we're very excited now about designing proteins for synthetic membranes. But yeah, I think you do have to keep in mind the thickness and the properties of the bilayer that these proteins are ultimately going to go into.
All right. So we also have a question on the Q&A by Mathias Hohner, who asks: are these proteins immunogenic, and is it possible to create proteins in a way that is non- or low-immunogenic?
Yeah, so that's a really good question. What I can say is that the binding proteins that I described, that are 60 amino acids or less, they're very, very stable and they're small and they're very soluble. So far we have not seen much sign of
immunogenicity in animals. So for example in the coronavirus therapeutic case, our collaborators did an experiment where over a three-week interval they gave the protein every three days, and then after three weeks they gave the protein and then exposed the animals to the virus, and there was no reduction in efficacy, so there were no neutralizing antibodies against the protein that had been created. But of course, we're going to find out actually in the next year, because we've started a couple of companies that are putting de novo designed proteins into people. In general our hypothesis is that small, stable and soluble proteins are not presented very well by dendritic cells. But one could go further, obviously removing T cell epitopes and B cell epitopes; the fact that they're smaller will make it easier. On the other hand, these nanoparticles are incredibly immunogenic; they are big, they look like viruses, and that's really part of the reason why these nanoparticle vaccines are so effective.
Jason Shane had a question; I will allow him to talk. Jason, can you hear us? Oh, all right, well, so I will move to the next question. I'm allowing you to talk.
Oh hello, yeah. That was a totally amazing talk, David. I was just going to ask whether the fact that you can design all these proteins with specific structures means we learn something about native proteins. So can you predict, say for the COVID spike protein, why it evolved the mutations it did, and can you predict its path of mutation in coming years, for example?
Yeah, that's a really interesting question. I think we can certainly have a try at that, but one of the things that perhaps I should have emphasized a little bit, one of the things I should also say, is that my talk was incredibly misleading, because I showed you all these
beautiful examples of designs that work, but of course most designs either don't get expressed in bacteria or they end up as gunk at the bottom of your tube, so there was definitely some selection bias in what I told you about. But what we've also found is that when we're doing these design calculations we're designing things to be in really deep energy minima, and that's how we're getting the accuracy. The Rosetta force field is certainly far from perfect, so there are probably errors on the order of a kcal per mole, and what that does is make it very hard to actually point to a native structure and say, well, this amino acid is contributing this. So in terms of predicting the future course of coronavirus evolution, I think we could certainly say that substitutions that were likely to cause great destabilizations are unlikely to occur, but in terms of the more subtle things that happen and the reconfigurations that are likely to happen as new strains develop, I think models that incorporate past sequence information would certainly be necessary to supplement the physical model. Another example of that is directed evolution, which has been really powerful in terms of improving the activities of naturally occurring enzymes, more powerful than computational design, because you have to have really high accuracy: the changes you need to predict for, say, five-fold increases in activity are pretty small. Sorry, that was a long-winded answer to your question.
Thank you, thank you.
But we're good at making completely new stuff. I mean, that's kind of the unintuitive thing: making completely new stuff is actually easier than making statements about the really complicated, delicate proteins
that have come through natural evolution.
All right, so the next question is from Shuhan. I'm allowing you to talk.
Okay, hi, thanks David for the nice talk. So I just wonder, when you screen through all these protein scaffolds, what library size do you screen through, and how many candidates do you have to screen to find a sub-nanomolar affinity binder? And a second question: what do you think about the rapidly developing phage display nanobody libraries? Do you think in silico design of affinity binders can replace binders selected by phage display?
Yeah, well, so let's see. The numbers of designs we actually tested in the different parts of my talk were very, very different. It has a lot to do with the size of the protein. One of the reasons we focus on proteins with small sizes is that we can encode each computational design in a single long oligonucleotide, and then we can test very large numbers of designs, because we can get oligonucleotide chips from companies like Agilent or Twist that can encode up to a hundred thousand different oligonucleotides, each encoding a different brand new protein design. So for the coronavirus that's what we did: on the computer we designed 100,000 different proteins, we encoded them all on chips, and then we could use yeast display to identify the ones that worked the best. On the other hand, for things like the transmembrane proteins we were just testing small handfuls of designs, because those are bigger proteins and we couldn't order lots of genes, and also the screening was much more difficult. So in the process, we're going to now try and improve the computational design methods so that we don't need to screen so many. We have a lot of data now on the designs that work and the designs that don't work, and
substitutions that change affinity, and we're trying to start doing machine learning on that to try and improve our methods. As far as phage display: phage display is a very powerful method, but you can't specify what site you're going to bind to, and you can't really specify the properties; you know it's going to be a nanobody, but you can't specify the stability. So I think that these computational methods will take the place of random library selection methods. It's like building a building: the phage display approach is that you throw a big pile of bricks in the air and let it fall, and you keep doing it until it forms a nice building, but that's obviously kind of inefficient if you can just build the building from first principles in the first place.
Nice. All right, so the next question is from Nina.
Can you hear me?
Yeah, yeah.
Oh, fantastic talk, David, it's really mind-blowing. I'm really interested to know if you can design an intracellular thermometer, and I'm particularly keen on doing this for long-lived cells like brain cells.
Well, that's a really interesting problem. Yeah, just send me an email. That's exactly the kind of problem that we like to hear about; we just don't know what the problems are. So let's discuss that; that sounds fun.
Brilliant, great, thank you.
So we have a question on the Q&A; it's Zooto You that asks: how about exploration of cellular delivery? Because obviously you are doing all those particles; can you deliver them?
Right, right. Well, we're working hard on that. I have a student who's figured out how to package guide RNAs inside the particles, and we actually have particles now that are built from those pH
dependent building blocks I told you about, so the particles come apart at low pH. The other thing is that when these trimers come apart they actually permeabilize liposomes and cause their contents to be released. So we're now trying to put all the pieces together. The idea is that the particles get endocytosed; that's one of the formats. In another format we've made pH-dependent plugs for those antibody cages. So anyway, the cage gets taken up, it comes apart at low pH, and then we have a guide RNA to do gene correction. It hasn't quite worked yet, because if it had I would have told you, but we're working hard on that. So the idea is to get around viral vectors for delivery of biologics.
Wow. There's a question as well from Perez, who asks whether it's possible to design self-catalytic reactions like the one that creates the chromophore of GFP.
Oh yes. Well, in fact Chunfu Xu, who's the genius who designed the transmembrane proteins I told you about; this is another collaboration with Manu and Joe. He's designed proteins which bind to chromophores which fluoresce in the far infrared, for deep tissue imaging, but he has also designed proteins which do the GFP chemistry, so he actually has completely synthetic all-helical GFPs now.
And do you think you could then do completely different chromophores from the ones that exist?
Yeah, I think there it becomes a chemistry question. We need collaborators who can say; it also has to be cases where the side chains are reactive. But yeah, if you can tell us that if you bring these three side chains together in this geometry this reaction will likely take place, and that will be fluorogenic, I think
that's totally doable; we just need marching orders.
All right, sorry, let me just go back to that. So Henry Wood has a question. Henry, I'm allowing you to talk.
Oh okay, yeah, can you hear me?
Yeah.
Great talk, thank you David. Just a bit of a broader question: you've obviously got a lot of programming experience. What do you think the main obstacles are to molecular biology reaching a point where you can easily program it from a computer, or is that maybe the wrong way of thinking about it?
No, I think that's a reasonable way to think about it. I mean, biology is really complicated. Everything I talked about was with single-protein systems, and already we're really stretching the limits. The amount of computing power that went into a lot of the calculations is very large. So for example, one of the things that we do before we order designs is we take the sequences and send them out to a distributed computing project that we started a number of years ago called Rosetta@home, and basically people on their home computers predict the structure and send us those predicted structures back, so we can check in that way that the sequence actually is predicted to fold to the designed structure. That's a big calculation. So if you're now going to be programming molecular biology, it's going to be hard to do it at that level of detail, because you'll need to be simulating not just individual proteins but whole networks of proteins and maybe their cellular membranes as well. So you have to go to higher levels of abstraction. The key for any kind of computational model is that you need to be at high enough resolution that your calculations are accurate, but they still have to be computationally tractable, so you have to be able to do enough sampling that you're really figuring out how your system works, at least according to your computational model. I think for
higher-level questions in biology that's just very hard right now. I think we'll get there, but it's going to require much more powerful, larger-scale computations and probably new types of models.
All right, so in the interest of time I will ask the last question. It is from Himami Tendon from the LMB. He asks: while designing small proteins de novo, how do you computationally filter out those proteins which can be very good binders but can have non-specific binding to other targets?
Yeah, well, we do that by what we call heuristic negative design. You could imagine taking each design and then docking it against every possible host protein and throwing out those that are predicted to bind other targets, but obviously that's an example of something that's not really feasible. So instead we try and build in properties that are likely to give rise to specificity. The most obvious one is we avoid big hydrophobic patches, which are likely to lead to non-specific binding, so we're restricting the number of hydrophobic residues that are close together, and then always trying to make sure we have a lot of polar interactions with the target as well. And when we do that, we have a number of proteins where we've done wholesale pull-downs and we only pull down the desired target. So I guess the answer is by getting very high stability and then by going for more polar interfaces, and I showed briefly what the surface views look like. That's how we aim for specificity.
All right. And do you think avidity helps?
What's that?
Avidity, sometimes?
Oh yeah, so the avidity is certainly a way of getting there. In the case of the coronavirus, we didn't need better than picomolar binding to neutralize any given virus, but once it starts to mutate, if you're at 10 femtomolar you can lose three logs of binding affinity and still
neutralize. So yeah, that definitely gives more.
All right, so in the interest of time I will stop the discussion here, because you have a very tight schedule now. So we'd like to thank you again very much, David; that was an absolutely fantastic talk. Thank you to all the attendees; I think we maxed out the Zoom account of the LMB, so great. And then, yeah, so David, now you should...
Yeah, thanks for all the great questions. I wish we had more time to talk, but if people would like to collaborate, please send me an email; we're really excited to explore new areas.
Thank you very much.
So Manu, what do I do now?
So now you just go to the other link, which is
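The hallucination loop described near the end of the talk (start from a random sequence, make Monte Carlo changes, and keep the ones that make the predicted distance map sharper) can be sketched roughly as follows. This is only a minimal illustration under loud assumptions: `make_toy_predictor` is a hypothetical stand-in built from random numbers, not the trained structure-prediction network used in the lab, and the `sharpness` score (how far each pairwise distance distribution is from uniform) is an assumed proxy for "how contrasty the distance map is". None of these names come from the talk.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # number of distance bins in the toy "distance map"


def make_toy_predictor(seed=0):
    """Return a hypothetical stand-in for a structure-prediction network."""
    rng = random.Random(seed)
    # Random logits for every residue-type pair and distance bin (illustrative only).
    logits = {
        (a, b): [rng.gauss(0.0, 1.0) for _ in range(N_BINS)]
        for a in AMINO_ACIDS
        for b in AMINO_ACIDS
    }

    def predict(seq):
        """Map a sequence to {(i, j): probabilities over distance bins}."""
        dist_map = {}
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                row = logits[(seq[i], seq[j])]
                top = max(row)
                exps = [math.exp(x - top) for x in row]  # stable softmax
                total = sum(exps)
                dist_map[(i, j)] = [e / total for e in exps]
        return dist_map

    return predict


def sharpness(dist_map):
    """Mean (log N_BINS - entropy) over pairs; 0 means a uniform, blurry map."""
    total = 0.0
    for probs in dist_map.values():
        entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
        total += math.log(N_BINS) - entropy
    return total / len(dist_map)


def hallucinate(predict, length=15, steps=500, temperature=0.01, seed=1):
    """Metropolis Monte Carlo over sequences, maximizing distance-map sharpness."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = sharpness(predict(seq))
    initial_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a single-residue change
        new_score = sharpness(predict(seq))
        # Accept improvements; accept worse moves with Boltzmann probability.
        if new_score >= score or rng.random() < math.exp((new_score - score) / temperature):
            score = new_score
            if score > best_score:
                best_seq, best_score = list(seq), score
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), initial_score, best_score
```

Calling `hallucinate(make_toy_predictor())` returns the best sequence found together with the starting and best sharpness scores. Because the toy predictor knows nothing about protein structure, the resulting sequence is meaningless biologically; the sketch is only meant to show the shape of the optimization loop.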
Info
Channel: MRC Laboratory of Molecular Biology
Views: 4,082
Rating: 4.9238095 out of 5
Keywords: protein engineering, protein structure prediction, structural biology, MRC LMB, LMB, Laboratory of Molecular Biology, Medical Research Council
Id: JK3eiLxu5es
Length: 68min 10sec (4090 seconds)
Published: Thu Apr 29 2021