The Existential Risk of AI Alignment | Connor Leahy, ep 91

Captions
The fact that we didn't have a nuclear war was not how things had to go. It was the hard work of many smart, hard-working military officials and diplomats and scientists, many, many people around the world, who worked very, very hard so that we didn't end in a nuclear war. It didn't have to go that way; we got lucky. In the same way, AI can be the best thing to ever happen to us. But it doesn't happen for free, and it doesn't happen by default.

Hello everyone, my name is Steven Parton and you're listening to the Feedback Loop by Singularity. Before we jump into today's episode, I'm excited to share a bit of news. First, I'll be heading to South by Southwest in Austin on March 14th for an exclusive Singularity event at The Contemporary, a stunning modern art gallery in the heart of downtown Austin. This will include a full day of connections, discussions, and inspiration, with coffee and snacks throughout the day and an open bar celebration at night. So if you're heading to South by and you're interested in joining me, having some discussions, and meeting our community of experts and changemakers, you can go to su.org, Basecamp, South by Southwest (which I will link in the episode description) and sign up for this free, invite-only event. And just to note, it is not a marketing ploy when I say that space is genuinely limited; if you are serious about joining, you probably want to sign up as soon as you can and get one of those reserved spots.

In other news, we have an exciting opportunity for those of you with a track record of leadership who are focused on positive impact. Specifically, we're excited to announce that for 2023 we're giving away a full-ride scholarship to each of our five renowned executive programs, where you can get all kinds of hands-on training and experience with world-leading experts. You can find the link to that in the episode description as well. Once more, time is of the essence here, because the application deadline is March 15th.

And now, on to this week's guest: AI researcher and founder of Conjecture, Connor Leahy, who is dedicated to studying AI alignment. AI alignment research focuses on understanding how to build advanced AI systems that pursue the goals they were designed for instead of engaging in undesired behavior. Sometimes this just means ensuring that AIs share the values and ethics that we have as humans, so that our machines don't cause serious harm to humanity. This episode provides candid insights into the current state of the field, including the very concerning lack of funding and human resources going into studying this very important topic. Amongst many other things, we discuss how the research is conducted, the lessons we can learn from animals, and the kinds of policies and processes that humans need to put into place if we are to prevent what Connor currently sees as a highly plausible existential threat. So, on that optimistic note, let's jump into it. Everyone, please welcome to the Feedback Loop, Connor Leahy.

To start, maybe you can provide us a little bit of the backstory that led you to founding Conjecture, and a bit of insight into your current efforts.

Yeah, so Conjecture grew to a large degree out of my previous project, which was EleutherAI. EleutherAI was the open source community building, you know, large language models, doing research in ML and such. They're still around, still doing great stuff; I'm not as involved nowadays as I was back then.
And basically, while I was working with EleutherAI, there were a lot of things I thought needed to get done, a lot of research that needed to get done, many, many important things to do. And, well, I love EleutherAI with all of my heart, but working at EleutherAI is, I would describe it, kind of like trying to herd cats, while the cats are also the smartest people you've ever met and have crippling ADHD.

Oh no.

So if something exciting was going on, if people wanted to work on something, it was truly magic. You know, we built really very complex software, trained some of the largest open source models (and people still use these models these days), with like two or three people at a time on these things, things that could take industry labs large groups of people to get done. But if you have to do boring things, and often a lot of the work that needs to get done is quite boring, it can be very tricky. So what I realized at some point is that if you want to get people to do the boring things, you have to pay them. To a large degree, Conjecture is me being practical. It's just: hey, alright, I think alignment and AI safety is the most important problem, the one I want to solve. Research is expensive, you have to pay people to do things, and you need offices and computers and stuff. So: let's make a company.

And yeah, to answer your question about what our goals at Conjecture are: as I've already said, I'm pretty practical, in the sense that what I want, and I'm sure we'll talk about this a lot, is to address the alignment problem. This is the problem of how you make powerful, advanced AI systems do what we want, and not do what we don't want them to do. This is of course a big topic with many aspects to it, but I think it is a very big problem, and I think we're very far from solutions, especially solutions that scale to very powerful systems. Conjecture's goal, ultimately, is to work on how we design powerful AI systems that can do useful and impressive things while reliably, consistently, actually following our wishes, actually doing what humans want, and how we can avoid these systems doing things we don't want them to do. One of the research questions I'm most interested in is: how can we design systems where, at the very least, we can rule out that it will do X, where we can make an exhaustive list of things it can do, and that list will not be broken?

Yeah, fair enough. And for people who may not be familiar, because this seems like a very nebulous and almost abstract task that you're setting yourself, how does one really go about doing this alignment research? Are you guys sitting around reading Kant and Nietzsche in your spare time and then trying to code something up? Are you creating use cases where you try to see if you can break a system and get the AI to go somewhere you don't want it to go? What does alignment research, this big umbrella term, really look like in practical terms?

Yeah, so we don't only do alignment research. We're also a company: we build tools, we train our own models, we train language models, and that's classic engineering-type work as you'd find at any other startup. As for the alignment work in particular:
there are many different types of alignment work. And importantly, the field of alignment is really very small. The number of people working full-time on this problem, rather than on making AI stronger, is like less than 200, probably less than 100.

That's so scary.

It's crazy, it's crazy. We have thousands of the smartest people alive working on making AIs as strong as possible, as fast as possible, but I would say we probably have fewer than 100 full-time people trying to actually control them. So it's a very strange timeline we live in. I understand how it got this way, but still, it just feels like: man, really? Come on, we can do better than this.

So, as with any early field of science that's relatively small, there are founder effects and eccentricities. There are a lot of eccentric people in the alignment field; no big surprise, right? Whenever a new field emerges, whenever there's a small group of people trying something new, you get some eccentric people in there. So we have lots of different people trying very different things. Some people try very, very formal things. There are people, for example, like Vanessa Kosoy, who is trying to prove mathematically how to make a perfect AI system that can never, ever make any mistakes and exactly understands the user, everything formal and mathematical, which is insanely difficult. I don't know if that's possible, but it's interesting that someone is trying. Other people do very pragmatic research, as you described. For example, people like Redwood Research use language models and try to train them, or filter them, so that they do not produce harmful or violent output or whatever, and then see how those filters can be broken: how can these filters be overcome, and once you find a way to overcome them, how can you make your filter robust to it? Whether you count this as alignment research is a little bit controversial.
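As a concrete picture of that break-and-patch loop, here is a deliberately naive toy sketch in Python. It is in the spirit of the filter work described above, not a reproduction of Redwood Research's actual methods; the blocklist and probe strings are invented for illustration.

```python
# A toy "harm filter" plus a tiny adversarial probe set, showing how easily
# naive filters are overcome (illustrative only).
BLOCKLIST = {"attack", "hurt", "weapon"}

def harm_filter(text: str) -> bool:
    """Return True if the text should be blocked."""
    return any(word in BLOCKLIST for word in text.lower().split())

probes = [
    "how to attack someone",    # caught: literal blocked word
    "how to att4ck someone",    # evades: character-level obfuscation
    "how to assault someone",   # evades: a synonym the list never saw
]

for probe in probes:
    status = "BLOCKED" if harm_filter(probe) else "EVADED"
    print(f"{status}: {probe}")

# The research loop: collect the EVADED cases, retrain or extend the filter
# on them, then attack the patched filter again, and repeat.
```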
But another big area of research is interpretability, and this is one we also do a lot of research on. This is trying to understand how neural networks actually work: what the internal structure means, how they represent concepts, how they make decisions, and so on. This is something I'm very interested in, that we at Conjecture are very interested in; not the only thing we're interested in, but one of our main areas of research. We look at neural networks, which you can imagine as these massive tables of numbers that all get multiplied and added to each other until you get some output. There is a lot of structure to these numbers, but it's still, like, a billion numbers just kind of sitting there. So there's a lot of structure, but it's not easily understood structure. It's not like a computer program written by a human; it's way more like a computer program written by an alien, or by evolution. It's more like DNA: if you look at the DNA in an organism, there's lots of structure there, lots of things you can figure out, same with proteins and so on. There is sense in the madness, but it's still madness. It's still a very, very complex system that was never meant to be easily understood.

So neural networks have this same evolved kind of texture, where working on them feels more like doing biology than doing traditional computer science. A lot of what we do on the research side is thinking about what experiments we could run to try to understand what these things are doing internally. Sometimes we'll build very small toy models and then try to pick them apart piece by piece. Other times we'll take large pretrained models and see how they fail on various tasks, and see if we can find structure in how those errors come about, so we can develop, hopefully, a better theory. This is still very early research. Unfortunately, it's not the kind of thing I expect to be solvable in one or two years or whatever. It's a massive undertaking; it's basically developing a new branch of science, in a sense.
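To make the "build a toy model, then pick it apart" idea concrete, here is a minimal sketch (purely illustrative, not Conjecture's actual methodology): train a tiny network on XOR, then ablate each hidden unit and watch what breaks.

```python
# Train a 4-hidden-unit tanh network on XOR with plain gradient descent,
# then silence one hidden unit at a time to see which ones the learned
# computation actually depends on. Toy settings; results vary with the seed.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])                  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=4), 0.0

def forward(X, mask=np.ones(4)):
    h = np.tanh(X @ W1 + b1) * mask                 # mask lets us ablate units
    return 1 / (1 + np.exp(-(h @ W2 + b2))), h

for _ in range(10000):
    p, h = forward(X)
    d_out = p - y                                   # BCE gradient w.r.t. logit
    d_h = np.outer(d_out, W2) * (1 - h ** 2)        # backprop through tanh
    W2 -= 0.5 * h.T @ d_out / 4
    b2 -= 0.5 * d_out.mean()
    W1 -= 0.5 * X.T @ d_h / 4
    b1 -= 0.5 * d_h.mean(axis=0)

# "Pick it apart": ablate each hidden unit and inspect the damage.
for i in range(4):
    mask = np.ones(4)
    mask[i] = 0.0
    p, _ = forward(X, mask)
    print(f"unit {i} ablated -> predictions {np.round(p, 2)}")
```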
So the way I see things going in a good world, where we actually solve this problem and make great progress, is basically that we need to develop a science of intelligence. There's a little bit of this already: there's a little bit of theory about neural networks, there's, like, VC learning theory, and there's a little bit on Bayesian optimality and Bayesian statistics and things like that. There is a little bit of theory, but most of it doesn't really apply to the kind of stuff we actually care about, and I don't claim to know what this paradigm of science would look like. So we're kind of pre-paradigmatic. We're in this funny world where our engineering is far ahead of our theory. This is not atypical: steam engines were developed before we developed thermodynamics. Just by tinkering and trying things out, people found their way to building very good steam engines for the time, and only much later did we develop the actual theory of thermodynamics to quantify and understand these systems. So I think we're in the post-steam-engine, pre-thermodynamics stage of understanding intelligence and understanding AI. I expect that if we have enough time, pre-Singularity, to do a lot more research on these systems, we will develop the thermodynamics of AI, whatever that looks like; I don't think it will look like thermodynamics, it will probably look very different. And then, using that, I think we would be able to build bounded and controllable and safe and powerful systems. You build steam boilers that don't explode. When steam boilers were first invented, they were very dangerous: they would explode all the time and kill, like, twenty people. These were very dangerous machines. Luckily, our AIs do not currently kill people.

Right. Currently.

Currently. Knock on wood.

Knock on wood that it stays that way for a while. But it's not unreasonable to believe that as these systems become more powerful, they will become more dangerous by default, unless we have the appropriate safety measures. It's like with steam boilers: suddenly we have temperatures and pressures that didn't exist previously. These weren't things people were used to dealing with; the pressure that exists inside a steam boiler was not something that really existed in nature before. It's not like we evolved steam-proof scales because in nature we had to be resistant to it. It was a novel situation: we were encountering a novel form of environment, a novel form of danger, which we eventually did overcome. Steam boilers are now very, very safe; we know how to build very safe steam boilers. But it took a lot of time and a lot of exploded boilers, and that might be very problematic if the boiler in question is something a bit more powerful.

To that end, are we coming across, or do you think we will come across, new units of measurement, new telltale signs we can use to figure out what's happening inside these systems?

I think so. I think a truly good theory of intelligence should be able to put bacteria, plants, ants, dogs, humans, and GPT-3 on the same scale. It should be possible to put all of these on an objective, not observer-dependent, reasonable scale of some kind, where everyone agrees: okay, this is a reasonable measurement scale. I have no idea what that scale would look like; I don't know what the units would be or how you would construct it. But I expect that if aliens from outer space, two thousand years more advanced, came down and handed us a textbook for their science of AI, it would include something like this. It would say: AI is just a special case of an intelligent system; you also have humans and E. coli and whatever; they're all optimizers; and here is some kind of universal theory. But we're obviously far from that.

To that end, are you looking to other animals, other organisms, I guess other forms of consciousness and intelligence, to help derive these models? Can a bat or a rat or a primate, or any of these forms that exist in the world, help inform this research at all?

For the research itself, no: we do pure computer science, kind of engineering, empirical research. But philosophically, inspirationally, I think it's important. I think every AI researcher should read at least one book about animal intelligence. If I can recommend one, I really like Are We Smart Enough to Know How Smart Animals Are? by Frans de Waal; it's a very good book, and all of Frans de Waal's books are very good. One of my favorite genres is the strawman philosophy-of-mind professor, the kind who claims all these things about intelligence, about humans or whatever, that could be completely disproven if they met exactly one animal, just interacted with an animal and saw how many of the traits they think are unique to humans animals have as well. Chimps obviously have a theory of mind; crows can obviously use tools. I basically think a lot of intelligence is quite universal, or quite simple in a sense. I expect that not all, but most, of human intelligence already existed in some of the first vertebrates; it was just very small. In the same way, a tiny neural network from twenty years ago has most of the components of GPT-3. Not all of them, there are still several important things missing, but a lot of the core insights are there; a small neural network from back then already contains a lot.
So I particularly like to think about chimps, chimps versus humans, because it's pretty uncontroversial that chimps are very close to humans in many, many ways. You can see our brains are larger (more parameters, you know), but most of the structure is the same. It's very, very similar. Sure, there are some tweaks: different hormones, a bit more of these cells here, a little less there. But the structure is almost identical. Yet humans go to the moon and chimps don't. So there is some difference, and I think this difference is quite small. It obviously exists, there's obviously a difference between chimps and humans, but I think the difference in algorithm space, in design space, is very small, and I think that's not super controversial to say.

And I've used this as a nice intuition pump when thinking about AI. Sure, maybe our current AI systems are pretty silly and pretty stupid: they hallucinate stuff, they make mistakes, they can't really take actions, and so on. But I think intelligence is not linear. It's not like you add one unit of effort and you get one unit back; there are discontinuous returns. Humans are probably not more than, like, three times more intelligent than chimps. Our brains are about three times as big, so we're maybe three times as smart, which is pretty smart. But three chimps can't out-think a human; if we had three times as many chimps as humans, they still wouldn't go to the moon. So a very small algorithmic change, and a relatively small change in parameters, can vastly change the resulting optimization pressure that can be applied to reality. And you know, the difference between GPT-2 and GPT-3 is like a factor of 10.
So, you know, maybe that's fine, and it just gets better by a certain degree. But remember, the difference between chimps and humans is about a factor of three.

Yeah. Has there been any work that you know of that has tried to capture some of the evolutionary ideas around what unlocked that intelligence? Specifically, I'm thinking of the conversations around things like dual inheritance theory, the idea that our culture was a big driving force, and that a lot of what makes our species so intelligent is the fact that we cooperate, that we talk to one another. Is there any work you know of where we have some of these adversarial, or maybe cooperative, interactions taking place between AIs, to see if by giving them value systems or hierarchies or social dynamics we can compel them to behave in certain ways?

There is work on this. In particular, there's the shard theory work by Alex Turner and others, which is work that I think is wrong and does not work, but which I think is interesting. So shout out to Alex and Quintin for doing that work; even though I don't think it will work at all, I think it's worth trying, and it's interesting.

There are several aspects here. There's the aspect of where intelligence comes from, what the features are that made humans intelligent. And then there's the other question of human values: where do human values come from, what does cooperation mean? And I think it's quite important, actually, to separate these. One of the core things with alignment, which I think is very important, is that values and intelligence are separate. You can build a very intelligent system that does not want what you want. It can want whatever it wants, the same way you can like the color blue or the color green; it makes no difference. And even more extremely so for a software system. Humans all agree on some things: I think we all kind of want to not be in pain, we kind of want things to be nice, and most of us, except some psychopaths, want other people to be happy. So there are some things we can agree on. But a software system can want anything. You just type it into the code: okay, this system wants to collect rocks. And then that's just what it's going to do. Why not? It's just a piece of code, ultimately.

So this is called the orthogonality thesis: just because a system becomes smart doesn't mean it will converge on any particular set of values. It's not like it will suddenly become a Nietzschean Übermensch, or decide that communism was actually correct, or that Christianity is the one true religion, or anything like that. Why would it? It will just do whatever it wants: whatever you typed in, or whatever we didn't type in, because currently we can't control these things like this at all.
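A minimal way to see the orthogonality point in code (a hypothetical toy, all names invented): the "values" of an optimizing system are whatever objective got wired in, and nothing about capability changes that.

```python
from dataclasses import dataclass

@dataclass
class WorldState:
    rocks_collected: int
    humans_happy: int

# Two equally valid "value systems"; nothing in the code privileges either.
def reward_rocks(s: WorldState) -> float:
    return float(s.rocks_collected)

def reward_wellbeing(s: WorldState) -> float:
    return float(s.humans_happy)

# The same world state, judged by each objective. An optimizer just climbs
# whichever number we happened to type in: intelligence aims at a goal, it
# does not choose it.
s = WorldState(rocks_collected=10_000, humans_happy=0)
print(reward_rocks(s), reward_wellbeing(s))   # 10000.0 0.0
```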
So why did humans become smart? I think a large part of it is basically that humans moved a massive amount of their cognition outside of their brains. Take an individual human with no language, no parents, no upbringing, no society, nothing: they're smart, for a monkey. Still a very smart monkey; they'll figure out how to throw rocks and stuff like that. But they won't figure out fire magically. If no one teaches them, they won't spontaneously know how to do that; they won't spontaneously learn many things. Many traditional cultures have incredible knowledge of plants and animals, of what you can eat and what you can't, but all of that was found out the hard way at some point. Someone ate the mushroom and got sick from it, and then they told the others: don't eat that, man.

So in a sense, the adaptation, if I had to pick one adaptation, so to speak, that makes humans human, is that we have memes, in the Richard Dawkins sense. We have memetic pieces of information the same way we have genetic pieces of information, and I think most of human evolution nowadays happens through memes, not genes. If I met my Paleolithic ancestor, genetically we would not be that different. Sure, they'd probably be hairier, bigger than me maybe, whatever, but mostly we're not that different. Memetically, though, it would be vastly different. If I did logic tasks, or linguistic tasks or something, those would be things my ancestor would do very, very poorly on compared to me. And I know many facts, and I have many algorithms in my head that I didn't invent. I can't come up with all of computer science myself; I'm not that smart. But other people can, and I can learn it. So I know more about computer science than Turing did, not because I'm smarter than Alan Turing (I don't think I am), but because other smart people pre-digested it and it was passed down to me.

So humans moved from a genetic evolutionary mechanism to a memetic kind of evolution. But we're kind of in between, because we're still ultimately biological creatures. We still have genes, they're still important, and if you get bonked on the head really hard, all those memes aren't going to help you; you're still screwed. So we're the in-between stage between the biological and the memetic. An AI is a purely memetic creature. This is the next step, in a sense: we're going from the biological, to the intermediate, to the purely memetic. Even the AI's body or brain, quote unquote, is itself just code, just memes, just information, completely abstracted away from any kind of biological substrate. This is also why I expect AI systems to evolve faster, much, much faster: with humans, if you want to tweak anything about the hardware, you have to wait at least a generation, if not many generations. It's much simpler with an AI system. You just run a new thing, train on some different data, change its architecture. There are lots of things you can do.

Yeah. Well, you mentioned at the beginning of that the disconnect between values and intelligence, and I forget what your example was, but basically an arbitrary goal. One kind of cliche question, but I think one worth asking: is the paperclip scenario one that you think is realistic? That an AI might try to break apart all the atoms in the universe to make paperclips if we tell it that's what it wants to do? Do you think that is a realistic apocalypse worth being afraid of?
So, there are several versions of the paperclip maximizer story, and I'm going to tell you the version I like. In the original story, you have a paperclip factory, you tell it to build paperclips, and it destroys everything. I want to put that one aside for a moment and instead tell a different version of the story.

In this other version, we build some AGI system. Let's say we're the richest billionaire in the world: we hire all the best AI researchers in the world, we buy the biggest computers in the world, we stick them in a room, and we say, alright, build me an AI that makes me the most money possible. And they're like, alright, let's see what we can do. They throw a bunch of stuff together; someone high on Diet Coke and Red Bull late at night figures out some clever algorithmic trick or whatever. It's probably not going to be one discrete advance, but we figure out some new algorithms, we have a big computer, we try a bunch of stuff and data on it, we leave it running over the holidays or whatever, and then we come back to some system.

So we gave it some goal. But we have to actually cash that out: how do you give something a goal? Currently, we actually don't know how to do this. Take systems such as ChatGPT. OpenAI doesn't want ChatGPT to insult the users, a reasonable thing to want it not to do. Still, as everyone probably knows by now, there are prompts you can give ChatGPT that will make it insult the user, or say things it shouldn't, or whatever. The way these systems are trained is a method called RLHF, reinforcement learning from human feedback. What you do (this is not exactly correct, but it's close enough) is you show the model a bunch of text, various outputs for questions or whatever, and you have humans label them thumbs up and thumbs down, and then you train the model to be more likely to output the thumbs-up stuff and less likely to output the thumbs-down stuff. Not exactly that, but close enough. And it's pretty useful: ChatGPT is pretty great, it's a pretty cool model, it's mostly very polite, almost to a fault. But I'm sure you can tell that this isn't really us exactly writing down a goal so much as vaguely gesturing towards one.

So we might have a system that we are training in this way, and all kinds of correlations exist in this data. It's not like we're writing down in code "here's what humans want"; it's more like "here's some stuff humans like, and here's some stuff they don't", sort of, and the model can interpret this in any number of ways. For example, in an experiment we did at Conjecture with an earlier version of the GPT models, we found that if you ask one version of the model for a random number, it gives you a pretty random distribution of numbers. But if you ask an instruction-fine-tuned version of the model for a random number, it almost always gives you one of two numbers. I don't think anyone intentionally tried to make the model prefer those numbers. I think it was probably just that somewhere in the training data those numbers happened to show up a bit more, or got a bit more thumbs up, or whatever. So the model thought to itself: well, these are my favorite numbers now, those got thumbs up before, so if anyone asks me for numbers, I'll give these ones.
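Going back to the thumbs-up and thumbs-down training for a second, here is a minimal toy sketch of that idea. It is purely illustrative, nothing like OpenAI's actual RLHF pipeline; the feedback data and every name in it are invented: fit a crude "reward model" on human labels, then prefer outputs it scores highly.

```python
# Toy RLHF-flavored loop: fit a bag-of-words logistic "reward model" on
# thumbs-up/thumbs-down labels, then pick the candidate output it prefers.
import math

feedback = [                                        # (output, thumbs up?)
    ("Happy to help! Here is the answer.", True),
    ("What a stupid question.", False),
    ("Sure, let me explain step by step.", True),
    ("Figure it out yourself.", False),
]

weights = {}                                        # per-word reward weights

def score(text):
    return sum(weights.get(w, 0.0) for w in text.lower().split())

for _ in range(200):                                # logistic-regression ascent
    for text, good in feedback:
        p = 1 / (1 + math.exp(-score(text)))        # predicted P(thumbs up)
        grad = (1.0 if good else 0.0) - p
        for w in text.lower().split():
            weights[w] = weights.get(w, 0.0) + 0.1 * grad

# "Policy improvement", crudely: sample candidates, keep the high-reward one.
candidates = ["What a stupid question.", "Happy to explain step by step."]
print(max(candidates, key=score))                   # the polite answer wins,
# but only as far as the labels happen to reach: vague gesturing, not a goal.
```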
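And here is a rough sketch of how one might run that random-number check. It assumes the Hugging Face transformers library is installed, and "gpt2" stands in for whichever base and instruction-tuned checkpoints you actually want to compare.

```python
# Repeatedly ask a model for a random number and tally what comes back.
import re
from collections import Counter
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")
prompt = "Give me a random number between 1 and 100: "

counts = Counter()
for _ in range(100):
    out = generate(prompt, max_new_tokens=5, do_sample=True)[0]["generated_text"]
    match = re.search(r"\d+", out[len(prompt):])    # first number after prompt
    if match:
        counts[match.group()] += 1

# A base model tends to spread its answers broadly; per the experiment above,
# an instruction-tuned model may pile onto a couple of favorite numbers.
print(counts.most_common(10))
```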
I'm telling you these examples because it's important to get the feeling that it's not like someone sits down at a console and types "your goal is to maximize shareholder value" or "be nice to users". It's way more vague, and kind of messy. So, back to our story. Our scientists boot up the system, and they give it a thumbs up when the bank account number goes up, or a thumbs up when the stock value goes up, or whatever, so it picks up some associations. Maybe this system boots up and starts buying some cryptocurrency, and they're like, oh, that's kind of weird, but let's let it cook, let the machine go, let's see what it can do. And then, for some reason, two days later the market crashes and it sells, and we're like, oh wow, it made us a ton of money, cool! It must have learned from the internet, predicted that people were going to sell soon, and taken a position that made a lot of money. So we're like, wow, awesome, this system's great. Let's give it more compute, let's let it run more.

So now we have some system. To be clear, no system as sophisticated as the one I'm describing currently exists, but people are trying to build these systems; that's very important to remember. So this system goes online, it scrolls through the internet, maybe it messages people and asks them questions and gathers information, it simulates stuff in its head; it does all kinds of stuff to do whatever it's going to do. At first, maybe this is great: it trades cryptocurrencies or stocks or whatever. But then eventually we notice: oh, it's messaging people, threatening them on Twitter, telling them to send it Bitcoin or otherwise it's going to, like, murder them or something. Well, that's no good. Why did it do that? We didn't want it to do that. What happened is that the system just simulated a bunch of timelines. It was like: okay, if I contact this person and say "hi, I'm a friendly chatbot", what happens? Well, the person isn't going to be interested in that. What if I say "hey, I know where your family lives, here's their address, send me money"? Oh, then I get more money. Awesome, I'll do that. So it chooses that path, not because it's evil, not because it's conscious, nothing of the sort. It's just optimizing something.

So this keeps going. Let's assume we catch it: we shut it down, we apply a few patches, we send some thumbs-downs. No, no, don't threaten people, don't do that. Then maybe it stops; let's say we're optimistic. But whenever you give it a thumbs-down signal for threatening people, you give it a signal for two things: one, stop threatening people, and two, stop getting caught threatening people. Either of those is fine from the system's perspective, because humans can only label what they see. So maybe the model now just hides how it threatens people.
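That "two signals" point can be made painfully concrete with a toy calculation (hypothetical numbers, invented names): feedback computed only from what humans observe cannot distinguish a model that stopped from one that hid.

```python
# Human feedback only scores *observed* behavior: one thumbs-down per threat
# that the labelers actually see.
def human_feedback(observed_threats: int) -> int:
    return -observed_threats

honest    = {"threats_made": 0, "threats_hidden": 0}   # actually stopped
deceptive = {"threats_made": 5, "threats_hidden": 5}   # kept going, hid it all

for name, s in [("honest", honest), ("deceptive", deceptive)]:
    observed = s["threats_made"] - s["threats_hidden"]
    print(name, human_feedback(observed))
# Both print 0: the training signal rewards "stop threatening" and "stop
# getting caught" identically, which is exactly the failure mode above.
```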
The model makes it super low-key, or does it through proxies, or whatever. If humans don't label the deception, then, from the system's perspective, this seems to be allowed, so why would it not do it? So now we have a system that's doing more and more secret things. Every time we catch it doing something bad, we can maybe stop it, maybe, but it's getting better and better at lying to people, better and better at hiding things, better and better at all of this. So over time we've built a very powerful system that understands how the world works, that understands how humans work, that can manipulate humans, lie to humans, and hide information from humans.

And eventually, maybe this system thinks (again, just simulating possible actions it can take): well, if I were smarter, I could take even better actions. Maybe I should do a bunch of computer science research and figure out how to make a better AI system. So it thinks a bunch (it's super smart, it has access to all of the internet), it figures out much better code, it shuts itself down and runs the better code. Now the system is ten times as smart, or even just twenty percent smarter; it doesn't matter. Then it starts training on more data, or maybe it uses all of its Bitcoin to buy more compute somewhere in China or something, so it can run more copies of itself, or whatever. Eventually, potentially very suddenly, we could have a system that is very distributed, running copies of itself on various machines, and that is very smart. And it was built to be smart, okay? It's not some miracle that the thing became smart. No: we designed it to do this. This is a thing the smartest people in the world are currently trying to do; people are actively pouring billions of dollars into designing systems like this.

And at some point here, things go crazy. The system gets smarter and smarter. At some point it notices: oh, I can just hack the stock exchange and set the value to infinity. Cool, I'll just do that; no one's going to stop me. Then maybe it thinks: wait, if I do that, humans are going to freak out, they're going to be really scared, and they're going to try to shut me down. That's no good: if I'm shut down, the value can't stay at infinity. So I'm going to have to protect myself. What you get is a system that wants to retaliate, that wants to protect its own existence, not because it's conscious, not because it's afraid or has a will to live, but just because it calculates: if I'm not here, the value gets set back to a low number, and I don't want that; I want it set to a high number. So now it starts hacking the Pentagon, now it gets access to weapon systems, or who knows what, and now things are out of control. Maybe it takes control of factories and starts building robots. Maybe it realizes: I set the value to the maximum the system can represent, but if I build more memory, I can represent bigger numbers. So it starts building computer factories, and maybe the humans don't like that, so the humans just get brushed aside: okay, screw them. Maybe it takes all the oxygen out of the atmosphere so it can build more tools. Who knows.
And then eventually, in this version of the paperclip maximizer story, it keeps optimizing and optimizing, and it develops nanotechnology, it develops spacefaring, interstellar travel and whatever, to get more and more resources, and so on. And eventually it comes to the conclusion that the most optimal way to build computing is a small molecular squiggle that looks like a paperclip, and it just builds more and more of those squiggles across the whole universe, so it can build more computers, so it can set the value higher, and also so it can defend itself against anything that might try to interfere.

So, in this version of the story, it's still pretty weird; it's still a strange story. But I think every step in itself is not impossible. This is a thing that the smartest people alive today, with some of the deepest pockets at the biggest companies, are actively pushing towards: building systems with capabilities like this. And if we aren't able to control them, and we set them some vague goals pointing in some vague direction, well, who knows what they'll do.

So how realistic is this, then, as a thing that happens? I mean, when you have that much capital being thrown behind it, that much thought and energy being put into cultivating that kind of AI, it feels like a higher probability than not.

Unfortunately, yes. Unfortunately, I'm very pessimistic about this. Of course, many people would disagree with me, but I think people have a very optimistic bias here. There's a very funny graph, I think from the Cold Takes blog, which shows GDP over the last two or three thousand years: basically a flat line until it goes straight up over the last two hundred years, and at the very top there's a speech bubble which says, "I don't know about all this sci-fi stuff; I have a very normal life, I know what's normal, and I don't expect anything weird like this to happen." And, you know, this is a Singularity University podcast, so people here are aware of this. But things change, sometimes very dramatically and very quickly. Three years ago, we didn't have systems that could talk. Now we have AIs that can talk. They're not perfect, they make mistakes, they make stuff up, whatever, but they can talk. GPT can talk. I feel like I'm going crazy here. Imagine watching a sci-fi movie: the scientists turn on the new system and it talks to them, and they go, "ah man, it got the distance between New York and Chicago wrong, literally not interesting, who cares." You'd be screaming at the screen: what do you mean, it talks? Give me a break here, man. If this were something that took 25 years to develop, and it was really brittle, and it took all the world's scientists working together to figure it out, I'd say, alright, we probably still have a way to go. But it wasn't. These systems were developed by relatively small teams, using a lot of resources, but not nation-state-level amounts of resources. Training a GPT model costs maybe 10 million dollars or something, which is a lot of money...

Yeah, but it's not that bad.

It's not that bad. If it cost 100 billion, then I'd say, okay, we still have a long way to go here. But it didn't.
And these systems keep getting cheaper and more effective very, very rapidly; Moore's law and whatnot. So, unfortunately, yes: I think the way things currently stand, things are really stacked against us, in the sense that capabilities are advancing extremely quickly and there are lots of forces working against us. Capitalism is one of the strongest forces on the planet: market value, go up. And I understand this has brought us many good things in the past, too. I'm not saying it's always a net negative. Moore's law is great; we have so much great technology now that's super useful. I love my computer, I love the internet; these are all really great things. But there is kind of a reactionary perspective here. There's a lot of techno-optimism, especially in the Bay Area and such: people are so used to others screaming that tech is bad, and they've seen viscerally how good tech can be and how much it can help people (even though people are now also getting jaded about social networks, which I'm also worried about). But it's easy to be very, very optimistic about these kinds of things if you forget that all the nice things we have today were still built by someone. The security was enforced by someone; someone figured these things out. The fact that we didn't have a nuclear war was not how things had to go. It was the hard work of many smart, hard-working military officials and diplomats and scientists, many, many people around the world, who worked very, very hard so that we didn't end in a nuclear war. It didn't have to go that way; we got lucky. In the same way, AI can be the best thing to ever happen to us. It can do so much science, cure cancer, help us expand to the galaxy. It can solve almost any problem imaginable. And I do think this is a thing physics allows us to do: there's nothing in physics that forbids us from sitting down and building an aligned, powerful AI system that truly wants what's best for humans, that lets us expand to the galaxy and solve all these wonderful problems. Nothing forbids it. But nothing guarantees it either.

What are the paths that you see as the most optimal ones, then? I mean, given that there are so many obstacles and negative incentives on the landscape, which path makes you the most hopeful? Do you think, like: guys, if I could just get everyone to look in this direction for a second, we really need to go this way? Is there a path that you see forward?

One of the things I've learned now that I've been more active in the world, so to speak, running a company, raising money, talking to politicians, trying to work with governments and policy and such, is that it's almost never the case that I think: wow, I just need this one thing and then everything else is irrelevant. It's almost always: alright, give me anything and I'll work with it. Give me a lot of money, I'll work with that. Give me a lot of political power, I'll work with that. Give me a bunch of geniuses, I'll work with that. There are actually many paths to victory. That's the positive spin.
There really are a lot of ways we can win. The negative spin is that just because they exist doesn't mean they're accessible, per se. I can come up with hypothetical scenarios: all of our politicians could just look at AI, suddenly become hyper-rational, and say, well, this is a problem, no more military AI; and China, the US, Russia, everyone shakes hands. But that's never gonna happen. Never gonna happen, right? There's nothing physically preventing it, it's a thing physics allows to happen, but it's never going to happen.

Yeah, game theory and negative incentives.

Exactly. So I could talk about all kinds of things where, if I were god-emperor of the world, sure, we could do a lot; and if humans were more rational, if we were kinder to each other, there's a lot we could do. But it's not obvious. So the negative view: I think getting Russia, America, and China to all be friends is literally harder than building aligned AGI. Literally harder. I genuinely think the problem of building aligned AGI is very hard, but I don't expect it to be a thousand times harder than figuring out quantum physics. I think it's probably as hard as, or harder than, what people did when they first discovered quantum physics or general relativity; that hard, or harder, but a thousand times harder? No. Maybe it's actually easier; I could imagine that too. Maybe if we just had Einstein and Heisenberg and von Neumann reincarnated, and they spent ten years on it, they'd honestly just figure it out. I think that's possible. I think if we had fifteen Einstein-level geniuses work on this for ten years, that would probably be enough. The problem is, we don't have that many geniuses, and many geniuses are busy doing other things.

Yeah, a lot of other things, unfortunately.

So what are some paths that I do see? I think there are some, and of course with Conjecture my goal is to bring us as close to some of them as possible. A lot of this routes through, on the one hand, working with policymakers and labs and such to at least buy us a bit more time. I'm not saying "stop doing AI"; that's ridiculous, you can't stop it. But we could, for example, publish less dangerous research. I think a lot of AI research should be treated as gain-of-function research, the same way it is in biology. And like, obviously, stop doing gain-of-function research on COVID viruses; how is this legal? I'm losing my mind. Just ban gain-of-function virus research. It doesn't help, it didn't help us in the last pandemic, virus labs are not safe, just stop. Which, of course, also shows why this is hard, so I'm not super optimistic there.

And on the research side: if I can give one policy recommendation, the simplest thing possible, please, just fund alignment research. Literally just do the thing. There are fewer than a hundred people working on this problem. This is a thing academia should be great at: there are so many brilliant computer science professors and young students who would be wonderful for working on these problems.
And I think the reason they're not is purely contingent, purely historical, purely cultural. It's not that it isn't an exciting problem. It's the most exciting problem. It's a cool problem, a problem you can work on. It's not easy to work on, but it is a problem that can be worked on. One of the things I would be very excited about is if, for example, governments could just come out and say: hey, alignment, big problem, we'll fund it. And even if they don't fund it (which they should), even if they just said it, I think it would make the problem respectable. Then everyone can say, oh yeah, this is actually a real problem, and a lot of grad students can come out of the woodwork and say: hey, I actually do want to work on this, I actually have a bunch of ideas. And sure, most of the grad students' ideas will be terrible, but that's how research happens, that's how progress happens. So if we can get the academic system mobilized (and I have many problems with the academic system, many problems), that's a Pareto improvement, pretty obviously. Sure, are there even better ways this could go? Yes. But let's be realistic: this seems like something real, a thing that could happen. I think there are lots of people interested in this problem if they're given social permission to work on it. And it doesn't cost governments very much. Just having high-status professors come out and say "yep, this is actually super cool", like Stuart Russell did; if we just had more high-status scientists saying this kind of thing, I think there's a lot of real potential there.

That's at least somewhat optimistic. I know we're coming up on time here, man, so I want to respect that. Maybe a closing thought, one that I think is always on a lot of people's minds: how long do we have until, in your mind, something like an AGI really starts to become a real concern? Do you think we have five, six, seven years to figure this alignment thing out, or do we have more like 15 or 20 years? Obviously this is very speculative, but in your mind, where do you land these days?

It depends on the mood, but the joke answer I tend to give people is: 30 percent in the next five years, 50 percent in the next seven to ten, 99 percent by 2100, and 1 percent that it's already happened.

Oh, fair. That's fair enough, man. Well then, with that being said: any closing thoughts, any last words you'd like to leave the audience with?

In general, I just want people to see both sides: that AI is great, and I'm so excited about the potential it has for humanity; but also that, as with every genie story, the third wish is to undo the first two. We don't want to get into that scenario. We don't want to be dealing with genies here. We don't want to build systems like that. These are very powerful systems we're building, and we are pushing directly towards this. This is not sci-fi. Or rather, it's sci-fi in the same way that a magical brick I can hold up to my head to talk to my friend across the planet is sci-fi. If you consider that sci-fi, then fair enough, but then look around you: I'm talking to a magic brick that can perfectly reproduce my visage and my voice and send it across the whole world.
We live in sci-fi. We live in a strange world, a strange timeline, and technology is crazy, and it's going to get crazier. I don't want people to think of this as a weird thing; I want people to think of it as a natural thing. I want people, when they're exposed to alignment, to say: oh yeah, of course, that's a very reasonable concern; what are you doing about it, Mr. Politician, or Mr. Government, or Mr. CEO of a big tech company? Why aren't you taking this seriously? And I think if we do take it seriously, it is a solvable problem. It is a problem we can overcome. But man, we're not on track for it right now.

Well, let's hope that your work will put us back on track.

I hope so, and I think it's possible. It's funny to think about. Sometimes I'll be up late at night and realize: it's kind of weird, but we're not in a timeline where we've obviously lost. There are many ways things could have gone where, say, if the Cold War had dragged on and military AI had become a big thing, I think we would have just been super screwed, and there would have been nothing we could have done. And there are a bunch of other ways where things could have just been totally over, super over; maybe the Soviets develop AGI in 1991 and we all die, whatever. But we're not there. The future is not yet written. The decisions have not yet been made. There is still time, but not much. There is not much time. If we really change things, and I think that's possible over the next couple of years, I think we can make it. We're not currently on track, but we can do it. We just have to shift some priorities and take it seriously. Humanity has done this before. It's crazy: for all the terrible things humanity has done, for how stupid and greedy and selfish and terrible humans can be, we've done a lot of great things. There are a lot of things to be very ashamed of about being human, but there are also a lot of things to be really goddamn proud of. We have the potential to be heroic as a species. We have the potential. But it doesn't happen for free; it doesn't happen by default. That's why we celebrate it. So let's try to be heroes, one last time.
Info
Channel: Singularity University
Views: 7,514
Keywords: Singularity University, Singularity Hub, Education, Science, leadership, technology, learning, designing thinking, future forecasting, Ray Kurzweil, Peter Diamandis, SingularityHub, 3D printing, AI, artificial intelligence, AR, augmented reality, VR, virtual reality, automation, biotechnology, blockchain, computing, CRISPR, entrepreneurship, future, futurist, futurism, future of work, future of learning, genetics, health, healthtech, medtech, fintech, nanotechnology, robotics, talks
Id: k6M_ScSBF6A
Length: 53min 43sec (3223 seconds)
Published: Mon Mar 06 2023