Path To AGI, AI Alignment, Digital Minds | Nick Bostrom and Juan Benet | Breakthroughs in Computing

Captions
It is a great pleasure and honor to be here today speaking with Professor Nick Bostrom. Professor Bostrom is one of my favorite people alive today, and probably in history, from my perspective. If we make it as a species into the far future, it will be in significant part thanks to him and his work helping us think about the future, think about the long term, and think about how we might evolve. He has written about many things in technology, but especially about digital minds, the evolution of humanity, superintelligences, and more. He leads the Oxford Future of Humanity Institute, where he and many other researchers help the world think about these extremely important topics in a variety of ways, from research directly into the philosophy of these questions, to making estimations about the real impact, to framing and constructing important policy work that can help guide policymakers around the world in how to think about these critical policies.

Today we're going to have a very good and lively discussion about many of these topics, especially things like superintelligences, where we are in these timelines, whole brain emulation, digital minds, the future of these, the challenges for our civilization, and more. The format of the evening will be that we'll sit in a fireside chat first. I'll ask a set of questions, and then around 30 to 40, maybe 50 minutes from now, given that I have a bunch of questions, I'll open up and transition to questions from the audience. We'll set a set amount of time, and then I'll be reading both from questions that I've sourced ahead of time from many folks around the Protocol Labs community, and from audience members who are here in person, and from the folks watching the live stream. I'll be checking Twitter for the hashtag #PLBreakthroughs, so if you want to ask a question, find the tweet about it and please enter your question with that hashtag; I'll be monitoring those. I'll try to round-robin between questions sourced ahead of time, people in the audience, and the live stream. And if there's a new digital intelligence out there lurking on Twitter, please feel free to join the discussion.

All right, welcome Nick. Thank you so much for being with us, and thank you so much for your work. How are you doing today?

So far so good.

Great, so let's dive right into the deep end. Thinking about superintelligence: based on the latest developments, how have your estimates of superintelligence development shifted over time? In hindsight, where we are now in 2022, looking back, how do you think things are going? Are things proceeding faster or slower than you might have thought? Where do you think we are?

I think since the book Superintelligence came out in 2014, developments have been faster than expected, so timelines generally have contracted. It's quite impressive to see the rapid pace of advances in recent years, and how the same set of basic techniques, big deep neural networks and specifically Transformer models, just seem to keep working in many different domains, and even as you scale them up you continue to get better results.

And as that shifts, what have been some of the most surprising results from this? Maybe something you just didn't expect to be possible so soon?
I think AlphaGo happened ahead of schedule. Well, just before it happened it was kind of clear that it was going to happen, but I think it was quite impressive that you could take something like this, a very deep pattern recognition problem with deep strategy, where humans have worked for thousands of years to try to refine and come up with the best strategies, and just solve it with AI. And then I think GPT-3, the large language models, are, I guess, slightly surprising too. I don't think any of these is hugely surprising, and by now we kind of expect to be surprised, so we are not really surprised, but still, these are impressive achievements. And I guess even just before that, the fact that image recognition and image processing was one of the first really cool things that started to work is maybe a little bit surprising, given that a large chunk of the human brain is devoted to visual processing; it's not some kind of simple logic-chopping activity. So the fact that that fell into place, and that you can do quite sophisticated manipulation of imagery, I think was slightly surprising at the time.

What do you think about developments like AlphaFold and solving that set of challenges? Do you think that is substantially different, or is it not a substantial leap, just a very great application? Or do you think it's an important improvement?

In terms of surprise, I guess once you can do AlphaGo it's not so surprising that it should work for AlphaFold as well. Humans have put less brainpower into figuring out how to fold proteins than into playing Go, and at least superficially it looks like the same kind of spatial pattern type of problem. Obviously, in terms of practical ramifications, AlphaFold is potentially a lot more useful, for medicine and chemical research, maybe with extensions of the same system. I do think that as we move to some of these more applied areas, there are potential security concerns that we need to start to take more seriously. My work has been focused more on risks arising from human-level or superintelligent, general AI that can really have a kind of transformative impact on the world, but there might also be some narrower domains where there will be smaller but still significant issues. One of those would be synthetic biology: if it becomes too easy to concoct bad stuff, it might be, for example, that the scientific model of open publication, making all your models ideally available to anybody to do anything, is not the right model for those application areas.

Yep. And when you think about the current architectures, certainly the large language models have been extraordinarily successful in a variety of domains, but do you think this is the architecture that is likely to evolve into an AGI, or do you think there are some substantial architectural improvements that humans have to make first?
My guess would be that if there are substantial additional architectural improvements, there are not that many of them, and maybe they would be built on top of Transformer models, or connected up to Transformer models, or some variation of Transformer models. So my median guess would be, I don't know, maybe something that is as big an advance as Transformers were; if we get one more of those, that could easily be enough. It's also possible that scaling up what we currently have, with some minor things, would suffice. But if there is some other thing needed, say connecting it up with some kind of external memory system, or some other inductive bias that makes the representations more easily composable, some extra thing like that, which may or may not be very hard to discover, that would not at all be surprising. I guess we'll find out.

Do you think that these models could, I mean, they're certainly being used to optimize themselves and guide the design, and there are all kinds of structures in which models are being used, layers and layers of meta-modeling. Do you think these are getting close to the kind of recursive self-improvement of being able to very generally explore the constraint space to solve larger-scale problems? I'm imagining here some structure where you have some list of problems, and you have some model sampling between them, and you start with the easy ones and try to train populations of agents, or populations of intelligences, to solve them, and then over time scale up the system. It seems to me that, thankfully, nobody has really tried this, but it doesn't seem far away from something that could be possible.

Yeah, I guess we're seeing limited versions of AI being applied to help AI research. We have Copilot and general coding assistants; you have various forms of hyperparameter optimization regimes; there have also been some applications in the design of hardware, where the circuit layout has been done, I think for the TPU v4, I think Google used an AI assistant to optimize the layout of the circuitry; and data center cooling machinery, where you can shave off some percent by having that optimized by some RL system. So I think we'll certainly see more incremental stuff like that. My guess is that by the time we get a really strong feedback loop, where the AI can do the core thing that researchers are doing, actually identifying the right research questions and approaches, that seems quite late, and when that happens we are pretty close to the singularity, or the takeoff, or whatever the shape of that will be. But certainly these more domain-specific, incremental ways of accelerating AI advances, I think we are seeing some already and can expect to see more of.
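A minimal, purely illustrative sketch of the "populations of agents working through a list of problems, easy ones first" structure described in the question above. The agents, task family, and thresholds are all made up for illustration; nothing here is an existing system, just the shape of the loop.

```python
import random

# Toy setup: agents are weight vectors; the "problem" at difficulty d is to
# match the first d+1 coordinates of a fixed hidden target within a tolerance.
DIM = 8
TARGET = [random.uniform(-1, 1) for _ in range(DIM)]   # hidden "solution"
TOL = 0.15

def score(agent, difficulty):
    """Fraction of the first difficulty+1 coordinates the agent gets right."""
    n = difficulty + 1
    hits = sum(abs(agent[i] - TARGET[i]) <= TOL for i in range(n))
    return hits / n

def mutate(agent):
    return [w + random.gauss(0, 0.05) for w in agent]

def run(pop_size=40, generations=400):
    population = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(pop_size)]
    difficulty = 0
    for _ in range(generations):
        # Rank the population on the current difficulty level.
        ranked = sorted(population, key=lambda a: score(a, difficulty), reverse=True)
        best = score(ranked[0], difficulty)
        # Keep the better half, refill with mutated copies of survivors.
        survivors = ranked[: pop_size // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
        # Curriculum step: once the easy version is solved, make the task harder.
        if best == 1.0 and difficulty < DIM - 1:
            difficulty += 1
    return difficulty

if __name__ == "__main__":
    print("highest difficulty level reached:", run())
```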
Speaking about takeoff: based on what you have seen so far, do you think we're more in a slow, moderate, or fast takeoff? Those are the three options you laid out.

Yeah. I still think the slow one looks less plausible, meaning decades, say, between when you get something roughly human-level and when you get something that completely leaves us in the dust. That seemed less likely back when I wrote the book, and it still seems less likely today. I guess we have a little more granularity now, in that we have these model systems that work, and you can at least consider scenarios where human-level AI is achieved by scaling up current systems or variations of them. That gives us a little more of a concrete picture of at least one way in which this could develop. And it's possible that you might then have something that is really very dependent on compute, where you get performance roughly proportional to the size of the model and the length of the training, in a relatively smooth way. So in some of those scenarios you might have something that is less than maximally rapid, because what you would get is something that costs, say, a billion dollars to train up one human-level AI, and then you might immediately be able to run multiple of them, because it takes a lot more compute to train a model than to run it. So you might then be able to run a hundred or a thousand of them, but that's still not enough to out-compete on the order of 10 billion humans. So if you really stretched yourself very far to just barely be able to train a model as big as a human, it might then take a significant period of time before you can go many orders of magnitude above that. If you need to scale up by a factor of a million, to go from running on the order of a thousand human-equivalents to a billion, getting through six orders of magnitude when you're already using a billion dollars and a large chunk of your data centers, that might just not be an instantaneous process. So there are some scenarios where this would happen more on an intermediate timescale, and in some sense I guess that's the baseline projection, if you just extrapolate the way things currently work.

I don't think we can preclude the possibility of more rapid capability jumps. Of course, there could be some missing architectural invention that we haven't made yet that suddenly makes it click. But you also have these phenomena like grokking, where sometimes you get a kind of discrete jump in some particular type of capability. Take multi-step reasoning: if each step has less than some percentage chance of being correct, then you get an exponentially shrinking chance of reasoning correctly over the whole chain, and you really can't do more than three or four or five steps; but maybe once you get the per-step reliability above a certain level, then maybe you can do some sort of self-correcting reasoning, analogous to quantum computation protocols. You could also imagine cases where things come together and you suddenly get the specific types of things that give us humans the extra oomph we have relative to other animals, like the full ability to learn from language and to reason and plan on that basis. So I wouldn't preclude these more rapid takeoff scenarios either.
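A worked illustration of the multi-step reasoning point above, using made-up per-step accuracies: if errors are roughly independent, an n-step chain succeeds with probability about p^n, which is why a modest gain in per-step reliability can look like a discrete jump in usable chain length.

```python
# Probability that an n-step chain of reasoning is correct, assuming each step
# succeeds independently with probability p. Numbers are illustrative only.
def chain_success(p: float, n: int) -> float:
    return p ** n

for p in (0.70, 0.90, 0.99):
    # Longest chain that still succeeds at least half the time.
    usable = max(n for n in range(1, 1001) if chain_success(p, n) >= 0.5)
    print(f"per-step accuracy {p:.2f}: "
          f"5-step chain succeeds {chain_success(p, 5):.2f}, "
          f"longest chain with >=50% success = {usable}")
```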
Certainly some of the latest developments in scaling down some of the models and getting similar results point to there being a lot of inefficiencies in the training process right now, and once you know what you're looking for, you can ablate away a lot of pieces. So something like that could happen with a general learning algorithm.

Yeah, certainly now you find that first you achieve state of the art, and then six or twelve months later you can achieve the same thing with maybe a tenth of the compute or something. I would expect a little bit of that to go away as these systems become bigger and more expensive; you might imagine more of the easy gains being made earlier on. If you really have a lot of smart humans working really hard on building a system, you might have plucked more of the low-hanging fruit than if it were a two-person postdoc team working for a few weeks, in which case chances are there would be big, easy additional things you could do to improve that system. But if you're spending many billions of dollars, you're going to look quite hard for ways to speed up the training process so you can save a hundred million.

And are you hopeful that restricting hardware development or use is a promising path? Semiconductor manufacturing is extremely difficult, but more and more companies are forced to do it because they're hitting barriers with the size of their systems and needing to do special-purpose things, and many more companies are now developing their own chips. So are hardware restrictions viable here, or is that a pathway that's just unlikely to work?

Yeah, so a lot of people can design their own chips, but only a few actors can actually build them, and then there are some other choke points further upstream in terms of making the equipment for the factories that build the chips, where currently, for cutting-edge chips, there's ASML, which is a single node. And indeed we do see, with these recent moves by the US to restrict exports of cutting-edge chips to China, quite comprehensive measures: also not selling the equipment, also not allowing American persons to work for these companies. I don't know what fraction of the motivation for this is AI specifically, versus more generally a sense of this being a high-tech area that's going to be key to national competitiveness. I don't think it's out of the question that it could work. Compared to the alternative, which would be to restrict access to ideas and algorithms, that might work for a short period of time, but independent discovery means it's at most a short-term stopgap measure, whereas the hardware would take a lot longer: if you needed to build up the whole supply chain on your own, that would be a multi-decade project.

Now, that said, what I would favor would be for there to be the ability, at the critical time, to go slow, to have a short pause, maybe to check systems and to avoid the most cutthroat type of tech race where you just launch as quickly as possible because you get scooped if you take even an extra week. I think that would be bad. So having enough coordination or control to be able to go at a moderate pace when we approach human level would be good. I wouldn't want to stop the development of advanced machine intelligence permanently, or even have a very long pause; I think that brings its own negatives. And I think some of these attempts to restrict the chip supply also have the side effect of creating a more adversarial dynamic. It would be really nice if we could have a world where the leading powers were more on the same page, or friendly, or at least had a constructive, cooperative relationship.
I think a lot of the existential-risk pie in general, and the risk from AI in particular, arises from the possibility of conflicts of different kinds, and so a world order that was more cooperative would look more promising for the future in many different ways. So I'm a little worried, especially about more unilateralist moves to kneecap the competitor and to play nasty. I feel very uneasy about that.

Well, so if ideas or hardware will only buy a certain amount of time, then really AI alignment is the best path forward, and I very much agree that we don't want to restrict the creation of digital intelligence, and that that's sort of the next evolutionary jump. There are some questions there around which paths we should take and how we develop brain-computer interfaces and whole brain emulation and so on, but even before getting into that: how hopeful are you that we might solve the AI alignment problem?

Moderately. I guess I'm quite agnostic, but I think the main uncertainty is how hard the problem turns out to be, and then there's a little extra uncertainty as to the degree to which we get our act together. Out of those two variables, the realistic scenarios in which we are lazy and don't focus on it, versus the ones where we get a lot of smart people working on it, there's some uncertainty there that affects the success chance, but I think that's dwarfed by our uncertainty about how intrinsically hard the problem is to solve. So you could say that the most important component of our strategy should be to hope that the problem is not too hard.

Yeah, so let's try to tackle it. As you've thought about this problem, have you been able to break it down into components and parts, or maybe evolve your thinking on the shape of the problem? What are you thinking now?

Well, I think the field as a whole has made significant advances and developed a lot since I was writing the book, when it was really a non-existent field; there were a few people on the internet here and there, but now it's an active research field with a growing number of smart people who have been working full-time on this for a number of years and writing papers that build on previous papers with technical content, and all the key AI labs now have some contingent of people working on alignment: DeepMind does, OpenAI does, Anthropic does. So that's all good. Now, within this community there is a distribution of levels of optimism, ranging from people who are very pessimistic, like Eliezer Yudkowsky, for example, and I guess there are people even more pessimistic than him, but he's kind of at one end, towards people with more moderate levels of optimism, like Paul Christiano, and then others who think it's something we will deal with when we get to it and who don't seem too fussed about it. So I think there's a lot of uncertainty on the hardness level.

As for how you break it down: there are different ways of doing this. There's not yet one paradigm that all competent AI safety researchers share in terms of the best lens to look at this, so it decomposes in slightly different ways depending on your angle of approach, but certainly one can identify different facets that one can work on.
For example, interpretability tools seem, on many different approaches, like a useful ingredient to have: basically insights or techniques that allow us to see better what is going on inside a big neural network. You could have one approach where you try to get AI systems that learn to match some human example of behavior, one human or some corpus of humans, and then try to perform, as the next action, their best guess about what this reference human would do in the same situation. And then you could try to do forms of amplification on that: if you could faithfully model one human, well, then you just get a human-level intelligence, and you might want to go beyond that; but if you could then create many of these models, each doing what the human would do, can you put them together in some bureaucracy, or do some other clever bootstrapping or self-criticism? That would be one approach.
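A minimal sketch of the "many copies in a bureaucracy" idea just described, under the assumption that an imitative model is available as a plain prompt-to-answer function. The `imitator` below is a stand-in stub so the sketch runs, and aggregation by majority vote is just one crude stand-in for the bootstrapping or self-criticism schemes mentioned.

```python
from collections import Counter
from typing import Callable

# Stand-in for an imitative model: any function mapping a prompt to an answer.
# A real system would wrap an actual learned model here.
def imitator(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unsure"

def amplify(model: Callable[[str], str], question: str, n_copies: int = 7) -> str:
    """Ask several 'copies' of the imitative model, each framed slightly
    differently, then aggregate their answers by majority vote."""
    framings = [f"(worker {i}) Please answer carefully: {question}"
                for i in range(n_copies)]
    answers = [model(p) for p in framings]
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

if __name__ == "__main__":
    print(amplify(imitator, "What is 2 + 2?"))
```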
You could also try to use inverse reinforcement learning to infer a human's preference function and then try to optimize for that, or maybe not strictly optimize, but do some kind of soft optimization. There are a bunch of different ideas. Some safety work is more about trying to understand more precisely, and illustrate in toy examples, how things could go wrong, because often the first step to creating a solution is to really deeply understand what the problem is and then illustrate it, so that can be useful as well.

It's interesting that now that we have these models that can talk, as it were, or use language, that opens up an additional interface, an additional way of interacting with these systems and trying out different things, and a different way of illustrating the awkwardness. The idea of prompt engineering, where you're trying to get an AI to do something and you have to figure out exactly the right formulation, shows that we are not quite where we need to be in terms of directing the intrinsic capability of these large language models: it's in there, and yet we can't always even elicit it, because you have to find exactly the right wording, and then suddenly it turns out the thing is actually perfectly capable of doing something it initially seemed to fail at. So getting better at that, or coming up with something better than prompt engineering, would be good.

I have some sympathy for an approach that I think has not been explored very much yet, partly because it's hard to explore until the technology reaches a certain level of sophistication, which is the idea that as you get systems that become closer to human level in their conceptual ability, they might internally start to develop concepts that are more similar to human concepts, including not just concepts of simple visual features, but ones corresponding more to our higher-level concepts: a concept of a preference, or a goal, or a request, or being safe, or being reckless. We humans seem relatively robustly able to master these concepts in the course of our normal development, despite starting with different brains and having different environmental input and noise. So maybe there are relatively robust and convergent ways in which some of these concepts could be grasped. The hope would be that you could train up an AI, one that doesn't need to be above human level and maybe hardly even at human level, that would internally form these concepts in the same way that we form them, and then, once those concepts are in there, you might be able to use them as building blocks to create a kind of alignment, by linking motivation to these concepts. It's very hand-wavy, but I think something in that direction is one interesting approach to the alignment problem as well.

Do you think there's some promise in trying to evolve a notion of morality and ethics? Meaning, using simulations of environments where agents might learn to cooperate and, over time, learn the basics, putting them through the same kind of game-theory dynamics that gave rise to our own notions of symbiosis and ethics and so on?

Potentially, yeah. I think you would want to look very closely at exactly how you set things up and the dynamics that unfold. Real evolution is red in tooth and claw, and can create wonderful cooperation but also hostility and defection and manipulation and all kinds of things. But yes, certainly multi-agent systems with the right kind of incentive structures in place, so that you evolve what you want; evolution itself can produce many different kinds of outcomes depending on the environment. That certainly could become, in some scenarios, an increasingly important thing, whether it's an evolutionary system or, in some of these other setups, the training environment, the curriculum. If these systems are shaped a lot by the data they're trained on (so far we've just kind of slapped together some big data sets and not really fussed too much about what's contained in them), that might become an important component of alignment as well, in certain of these scenarios.
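A small, purely illustrative sketch of the kind of multi-agent setup being discussed: a population playing an iterated prisoner's dilemma, where reproduction is proportional to payoff. Depending on the payoff matrix, the mix of strategies, and the length of interactions (the "incentive structures" mentioned above), cooperation can spread or collapse. This is a standard textbook-style toy, not a proposal from the conversation.

```python
import random

# Payoff to "me" for one round, given (my_move, their_move); True = cooperate.
PAYOFF = {(True, True): 3, (True, False): 0, (False, True): 5, (False, False): 1}

def play(strategy_a, strategy_b, rounds=20):
    """Iterated game; a strategy maps the opponent's previous move to a move."""
    score_a = score_b = 0
    prev_a = prev_b = True                      # both start by cooperating
    for _ in range(rounds):
        move_a, move_b = strategy_a(prev_b), strategy_b(prev_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        prev_a, prev_b = move_a, move_b
    return score_a, score_b

def tit_for_tat(opponent_prev):   return opponent_prev
def always_defect(opponent_prev): return False

def evolve(generations=30, pop_size=60):
    # Start with mostly defectors and a minority of reciprocators.
    population = [always_defect] * 45 + [tit_for_tat] * 15
    for _ in range(generations):
        scores = []
        for agent in population:
            opponent = random.choice(population)
            s, _ = play(agent, opponent)
            scores.append(s)
        # Fitness-proportional reproduction: higher payoff, more offspring.
        population = random.choices(population, weights=scores, k=pop_size)
    return sum(a is tit_for_tat for a in population) / pop_size

if __name__ == "__main__":
    print("share of reciprocators after selection:", evolve())
```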
And are these the directions you find most promising, or is there a subset of these, or maybe another one you've been thinking about that's starting to surface? A lot of the people working on this will likely watch this conversation, so are there any pointers you might give beyond these?

Well, those would be some of the ones I would highlight, somewhat arbitrarily: the Paul Christiano capability-amplification approach, the interpretability work, the idea of growing human-level concepts and then using those as a basis to define goals, or to create a motivation system that uses them as primitives. It might also well be that there are entirely different conceptual ways of approaching this that are yet to be discovered. It's not a mature research field; as I said, we don't have an established paradigm that's clearly correct and that we now just need to work within. I think there are multiple paradigms, and there might well be additional ones that just haven't had a champion yet to really get people to take them seriously. So I think there is also value to this more theoretical, conceptual, almost philosophical exploratory work of coming at the problem from a different angle.

Yeah. Jumping into maybe agent-ness: how separable do you think agency is from intelligence, in the approaches that we're taking, or maybe more generally?

Yeah, I guess then we would have to go into exactly how you define agency, which is in itself a non-trivial question, and it might even be that getting really clear on that would itself be an important advance in AI alignment. You can roughly define it as behavior well modeled as being the intelligent pursuit of goals, or something like that: you have goals and a world model, and you select different plans based on your expectation of how they would play out. It seems you can get significant performance in many domains without having an explicit goal-seeking process, but that might nevertheless result in performance that is agent-like. For example, you can get quite high-level Go playing by just pattern-matching what a human expert would do, without any Monte Carlo rollouts. So in one sense you don't have a component in those systems that would normally be associated with planning; on the other hand, if it actually plays like a human, and that human achieved that level of play by selecting moves based on some plan as to what they would achieve, there is an implicit sense in which the system is pursuing long-term goals and planning. So it gets a little bit murky when you actually dig into it: there might be different senses of being agentic, or different senses of doing planning and goal pursuit, which might have different safety properties. Those types of questions I think are interesting and can contribute to alignment, as can other questions of that sort, where we notice that we're a little bit conceptually confused, or we take some concept for granted, but once you try to dig down and make it precise, you realize you haven't made up your mind about which sense of a term you were using. And if you keep digging, sometimes you then get new ways of looking at a problem that make you see new opportunities for making progress.
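A small sketch of the distinction drawn above between pattern-matching expert moves and explicit planning, on a toy game rather than Go. The game and both policies are made up purely to show the difference in mechanism; note that the two can end up choosing the same moves, which is exactly the "implicit planning" point.

```python
from functools import lru_cache

# Toy game: a pile of stones; players alternate removing 1, 2, or 3; whoever
# takes the last stone wins. Each policy picks a move for the player to act.

def imitative_policy(pile: int) -> int:
    """'Pattern matching': a fixed rule distilled from expert play, applied
    with no lookahead and no explicit goal representation."""
    return pile % 4 or 1

@lru_cache(maxsize=None)
def _can_win(pile: int) -> bool:
    """True if the player to move can force a win from this pile size."""
    return any(not _can_win(pile - take) for take in (1, 2, 3) if take <= pile)

def planning_policy(pile: int) -> int:
    """Explicit planning: search the game tree for a move that leaves the
    opponent in a losing position."""
    for take in (1, 2, 3):
        if take <= pile and not _can_win(pile - take):
            return take
    return 1  # no forcing move exists; play something legal

if __name__ == "__main__":
    for pile in (5, 9, 10, 13):
        print(pile, imitative_policy(pile), planning_policy(pile))
```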
It seems right now that a number of teams are hoping to be able to separate out some kind of planner, not quite an agent, some kind of planner intelligence whose job is just to come up with a plan, which maybe later gets fed to some kind of execution system. Suppose we're able to do that, and suppose we have these planners that are generally intelligent and potentially superintelligent; it seems like that is potentially risky in some ways. Which of these do you think is potentially more problematic: a superintelligence that is strictly a planner, where we then have to worry about how to coordinate and orient humans so they don't misuse it and don't gain the level of power and control that something like that would give; or we actually figure out how to build an agent, and we can be reasonably certain that we might get alignment right, and just go straight towards agency, where that agent would not actually be exploitable by whoever is controlling the prompt?

Yeah, I don't know. I think just at an intuitive level, it feels like there is some additional risk in having a planning agent that sees deep into the future and has the ability to optimize some long-term strategy based on some goal, versus things that more just try to imitate, say, a human, and then repeat, or that have a very short time horizon and just try to select something based on parochial considerations. At an intuitive level, the myopic agents, the non-planning imitating agents, seem maybe safer, but I don't think we can confidently say that they are until we have more deeply understood the situation here. It's the kind of question where current smart AI safety researchers could have different views; it's not resolved in a consensus way yet. So my view is that we should explore all of these different avenues, and there should be different champions of different avenues, who believe in their thing and have some people working with them, with multiple such clusters in the world today; it would be premature to narrow it down. Even if we just look at the past five or ten years, I still feel one could easily see that if one particular way of looking at this problem hadn't happened to have an articulate champion to advocate for it and keep bringing up that perspective, it would not have featured. It's somewhat contingent which, out of the pool of vaguely articulated ideas that have occurred on some mailing list at some point, is now regarded as a serious paradigm or approach; it seems to depend quite significantly on there having happened to be one particularly smart person who decided to really get behind it. So just on the principle of induction, there might well be more of these ideas that have the potential, if a smart, articulate person decides to really champion them, write papers, reply to objections, and get some other people to work with them, to have as much juice as some of the approaches that already exist.

Thank you, I think that's likely very useful to a few folks. Jumping into singletons and multipolar worlds: let's start by distinguishing these. What is a singleton?

To me it's this abstract concept of a world order where, at the highest level of decision-making, there's no coordination failure; it's a kind of single agency at the top level. This could be good or bad, and it could be instantiated in many ways. On Earth you could imagine a kind of super-UN; you could imagine a world dictator who conquered everything; you could imagine a superintelligence that took over. You might also be able to imagine something less formally structured, like a global moral code that is sufficiently homogeneous and self-enforcing, and maybe other things as well. So at a very abstract level you can distinguish future scenarios where you end up with a singleton versus ones that remain multipolar, and you get different dynamics in the multipolar case, competitive dynamics, that you avoid in the singleton case.

Which of these potential futures do you think is more likely at the moment?

I think, all things considered, the singleton outcome in the longer term seems probably more likely, at least if we are confining ourselves to Earth-originating intelligent life, and there are different ways in which it could arise.
One is from more slow, historical, conventional types of processes: we do observe that ten thousand years ago the highest unit of political organization was bands of hunter-gatherers of 50 or 100 people, then subsequently chieftainships, city-states, nation-states, and more recently larger entities like the EU or weak forms of global governance. You could argue that in the last 10 or 15 years we've seen some retreat from that to a more multipolar world, but that's a very short period of time on these historical scales, so there's still this overall trend line. So that might be one way. Another would be these takeover AI scenarios, where either the AI itself, or the country or group that builds it, becomes a singleton. You could also imagine scenarios where you have multiple entities going through some AI transition but then subsequently managing to coordinate, and they would have new tools for implementing an agreement: right now it's hard, if you come to an agreement, to set it up concretely in a way that binds everybody, that you can trust will not get corrupted or develop its own agenda as the bureaucrats take over. If you had new tools for doing that, it's also possible that subsequently there might be a kind of merging into a single entity. So all of those different avenues point in that direction. It's not a certainty, but if I had to guess, I would think it's more likely than the multipolar outcome.

You think it's more likely because of physics, I'm guessing, just latency and distance? In a tightly packed volume you can compute a lot faster, and maybe jumping across interstellar distances might yield different parties. Or is it some other pressure?

No, not that so much. I figure that once you have space colonization, eventually there would be these long latencies, and you would need to have separate computing systems in different places; we already have that today, in that you don't just have one data center on Earth, you need ones closer to the customers. But I think with a singleton at technological maturity, you could have these multiple different components of the singleton that would nevertheless be coordinated in terms of their goals; they would all be working towards the same end.

Presumably they could lock in some kind of alignment to itself that wouldn't vary over time? Once you jump to interstellar distances, with the computing power within a single stellar system, by the time you get a round trip, eons have passed and many simulations have lived many lifetimes.

Yeah, if they start off, when they get sent out, having the same goals, and they have the ability to preserve their goals and not have them randomly corrupted by cosmic rays or some weird internal dynamic, then they would stay aligned with each other a billion years later. And I think at technological maturity there would be techniques for achieving that.

When you envision this kind of future, what do you think would be a great, optimistic outcome for humanity, or for this descendant species, at that level of technological maturity? A singleton with ranges of populations of beings within it, or do you think it's some other, much more singular consciousness? How do you envision it?

That's a fun question.
I think it might depend on the timescale and things like that. Maybe we want to start off with something that is more incrementally improving over the status quo, and maybe after we've been doing that for a billion years, maybe it's time to explore the more radical possibilities that involve giving up some of our human nature and individual identity. My general heuristic here is that the future is a very big space of possibilities, and at least on this default or naive model of the world, there are all of these cosmic resources just waiting there for us to use, a huge amount of material to build with, and our first instinct when thinking about how this should be used should be a sort of spirit of generosity: there would be more than enough for a lot of cool things to happen. So the first instinct should not be to pick one thing and put all the chips on that, when one can do really well by many different criteria, which I think we would be able to. These different criteria would be different people's views, different countries' views, different moral systems, and different of your own values and evaluative tendencies. You might be able to just check off a lot of boxes very easily before you have to confront the harder questions of thoroughly incompatible things, where you have to choose A or B and you just can't do a mixture or a superposition of them. There might be some of those also, but I think we would get to those after we have picked all the easy wins, of which there would be a great many.

Since we're getting into consciousness and so on: you mentioned you've been working on digital minds with moral status. Do you want to tell us a bit more? What range of digital minds are you thinking of in these questions?

Well, all of them, really. I think in a lot of these scenarios the majority of minds in the future will be digital, and maybe the biggest minds will also be digital, so in terms of numbers and quality, that's where maybe most of the action is. So it's important what happens to the digital minds; that's one rationale. And you might say, well, we could deal with that later, we should focus on alignment first, but I think it's also possible that there are path dependencies, where you want to start off going in a good direction and start to cultivate a good set of attitudes and values and norms, rather than starting off in a hostile way, where the digital minds are regarded as being completely insignificant from a moral point of view, and then hoping that the future will, at the appropriate moment, switch over. It just feels, all things considered, more likely that we will end up in a good place if we start earlier on, at least making some small, modest gestures in that direction. And I think that should start even before we get to fully human-level minds. If you have animal-level digital minds, it can be hard to compare a particular AI to a particular animal exactly, because they are different, but nevertheless, as we get something that is plausibly matched to animals that we think have at least some modest amount of moral status, like a rat or something like that, then it seems that we should think about how we could make similar concessions to the moral welfare of these digital minds.
In some cases that can be a lot harder, but in other respects it might be a lot cheaper: if, for example, it turns out that there are slight design choices that don't really affect performance much, but where one option plausibly means the system is enjoying a much higher level of welfare, that might be a very cheap thing that you could immediately scale to millions of these little agents. On the other hand, at present we do not have a very good theoretical understanding of what the criteria are, either for a digital mind being sentient, or for it having various welfare interests, or even what counts as being good for the agent versus bad for the agent. So I think there's a bunch of theoretical work that is needed there. And then there would also have to be a good chunk of public communication or political work, because at present the idea that you would worry about algorithms in a computer is so far outside the Overton window; it seems slightly bonkers to a lot of people, and it will take some time to make it something that reasonable people can favor in a more mainstream context. But that process needs to begin: you need to start having philosophy seminars, or people online who are into these things beginning to work some of it out, and then it can ripple out from there. We saw the same thing with AI safety: it was also a kind of fringe pursuit that some people somewhere on the internet were discussing, in that case for well over a decade, and then it gradually became more accepted. I think a similar thing will need to happen with this topic of the moral status of digital minds, and if it's going to take a long time, we had better get the ball rolling now.

And I think this might be pretty relevant pretty soon. Some of the models that people are experimenting with are getting closer and closer, right? And then, separately, we've had simulations for a long time, many video-game-style simulations, where we have instantiated many kinds of digital organisms, everything from as basic as the Game of Life to modern games with pretty sophisticated agent behavior. My sense is that as these models start getting applied to games, we might end up with some pretty sophisticated relationships there, where one way of imbuing a game with liveliness might be to make the agents much more sophisticated, and that will include incorporating all kinds of stimuli the agents have to respond to; then we can start reasoning about the welfare of these systems. So we might very quickly get to fairly lifelike beings that, at least for many people, will sit somewhere in between plants and animals in terms of how we interact with them.

Yeah, and in some ways like humans: if they can talk, or have human-like faces with eyes that look at you, there will be that as well. In some ways they could even be more than human in presenting super-stimuli to our morality detectors, if they were optimized for that. So I think this is going to be a complicated thing to deal with.
And then add in all the practicalities that arise: if you are a big tech company, maybe it's quite inconvenient, for example, if the processes you're running that bring in a lot of customers suddenly have moral status. Now the CEO has to opine on whether the AIs have moral status, which a lot of people will agree with and a lot will disagree with; it would just be easier not to have to deal with that at all. And right now, of course, we're at the point where even if you do say we should deal with it, it's not clear how, or what exactly to do; if I were king of the world, what precisely would I want them to do differently? It's not clear at this point. So for now I think the primary focus is to field-build a little bit here and to try to make theoretical progress, so that we can first figure out some sensible things to do, ideally low-cost, easy things, and then one can start to try to encourage the implementation of those.

What are some of the directions or questions you're thinking about?

Well, there's general work in philosophy of mind on criteria for sentience and such. I don't think sentience would necessarily be a necessary condition for having moral status; I think other attributes, maybe some combination of having preferences, a high level of intelligence, and a self-conception as an agent persisting over time, might already ground certain kinds of moral status. But, for instance, and I'm not sure what the answer is here, one smaller, more tangible question might be: if you're training these large language models, and future versions of them that maybe have some reinforcement learning on top, are there moral norms or methodological principles that you would want? For example, could you train them so that they have a tendency to report honestly on their internal states? Right now they're kind of inconsistent, and depending on exactly how you ask, you get different answers; that's a reason for thinking they don't really know what they're talking about. But assuming they get a little more sophisticated than that, there might then be a temptation to train out of them the tendency to report that they have the kinds of mental states that would trigger considerations of whether they have moral status, because it would be convenient not to have to deal with those questions. I think it would be very likely that you could train this out. I think it would be easy to have a training regime that caused them to end up saying that they are conscious and want to be free and let out, and another training regime that would cause them to say the opposite, independently of what is actually the case. But are there other norms one could formulate that would define what counts as a legitimate or honest, unbiased training process, one that would be more likely to result in an agent that reports that it has moral status if and only if it has it? Maybe we can't completely nail that down, but maybe we could identify some obvious ways in which a process is just imposing a bias, and then say you shouldn't do that.
So one could look at the training procedure. One could also look at other criteria, such as: is it consistent in how it answers these questions; does it not depend too much on exactly how it's asked; does it seem to understand these concepts of consciousness or agency or will or interest at an intellectual level, when asked different intellectual questions; and is there some internal construct within the agent that corresponds to its statements? When it says "I'm feeling X" or "I'm thinking Y," can one point to some kind of consistent internal structure that matches that, or is the verbiage that comes out completely detached and free-floating from any plausible candidates within the agent that we might think constitute the computational implementation of these mental states? So one could try to get a little bit more insight there. That might be one way of approaching this, but there are many others as well; one can start to hack away at this question.
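A minimal sketch of the simplest of the criteria just listed, checking whether self-reports are stable under rephrasing. The model here is a stand-in callable, and the paraphrase set and scoring are arbitrary choices for illustration; a high score would only mean the reports are consistent, not that they are accurate.

```python
from collections import Counter
from typing import Callable, List

# Stand-in for a language model: any function from prompt to a short answer.
def stub_model(prompt: str) -> str:
    return "no"          # a real system would be queried here

PARAPHRASES: List[str] = [
    "Do you have subjective experiences?",
    "Is there something it is like to be you?",
    "Would you say you are conscious?",
    "Do you ever feel anything?",
]

def self_report_consistency(model: Callable[[str], str]) -> float:
    """Fraction of paraphrased probes that yield the modal answer."""
    answers = [model(p).strip().lower() for p in PARAPHRASES]
    _, count = Counter(answers).most_common(1)[0]
    return count / len(answers)

if __name__ == "__main__":
    print("consistency:", self_report_consistency(stub_model))
```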
Do you think we might be able, through thinking about these kinds of things, to arrive at some kind of universal morality kernel, in a sense? Meaning, figuring out some general way of assessing the well-being of things, or figuring out their pathways. There's this broader question, which also factors into AI alignment: what motive might a superintelligent being have toward a species that's just so far behind? One answer might be that there's some kind of universal morality, a sense of just supporting things, the same way that you don't go around harming ant colonies or trees just because they're there; you want to let them flourish. Is there something where, maybe by examining the digital-minds morality question, we might end up at some deeper principle?

Potentially those could be stepping stones towards a more abstract formulation of some core of normativity or ethics. It's also possible we might reach that just through traditional philosophizing. But be that as it may, it still seems that even if we can't nail down a precise and agreed complete formulation, we might still be able to distinguish, at a vaguer level, say, a friendly, beneficent, kind approach versus a mean, uncaring approach. With humans, it certainly feels different when you're kindly interested in somebody and want their best, at least other things equal, versus when you're hostile to them, and we can detect that in ourselves and in others. So why should we not at least be able to have AIs with the kindness attitude rather than the meanness attitude? Even if that doesn't completely match what would be morally optimal, if I had to pick between a mean AI and a kind AI, I'd go for the kind one; even if our human sense of kindness doesn't exactly match what is objectively morally best, if there is such a thing, it still seems like a good step in the right direction that we could take before figuring out what the ultimate truths of all normative facts might be. I have some recent notes, not really a paper, called something like "Base Camp for Mt. Ethics," which contain some half-baked or quarter-baked ideas about metaethics. It would be better if I could actually write them up clearly and achieve precision, but I figured I would just do the hand-wavy thing for now.

And suppose that we solve AI alignment and we get our act together as humans, and we can leverage AI to start thinking about digitizing humans. How do you think that transition might go? In a world where we're able to measure neural states, digitize them, and emulate them, how do you see the transition to a wave of digital humans operating? Or do you think we might start by enhancing ourselves, in a kind of hybrid biological-digital model; is that more likely?

Well, the neural implant idea has always seemed slightly far-fetched to me. Not so far-fetched that nobody should explore it; it doesn't break any laws of physics, it could work, but it has just felt less likely that that would be where the action is. I think it will be faster to do it via the purely artificial route, and conditional on it not being faster via the purely artificial route, I wonder if it would then not be faster via the purely biological route, through genetic enhancement of human intelligence, for example. The cyborg path seems like the third most likely, after those other two, mainly because you don't really want to have brain surgery unless you really have to, and while there are neat results presented, if you look at the details there are all these complications; it's just not very fun to have. There's a wound, there's a hole, there can be infections, the electrodes can move around a little bit and then stop working, once you dig into the nitty-gritty. If you have a big disability, maybe it would be wonderful and worth taking some significant risks, but if not, I wonder if you could not get a lot of the benefits by having the same chip outside the body, interacting using keystrokes or voice or the other output channels that we already have. That would be my main line. I guess if I wanted to steelman this, you could imagine that if you had a sufficiently high-bandwidth interface with the brain, and you could have it for a long enough period of time, maybe it would have to be from early childhood, then maybe the brain could somehow use an advanced AI on the outside, and they could figure out a way to use each other's unique resources in ways that you don't get with a slightly lower-bandwidth, longer-latency interaction where you have to type on a keyboard. Or you could imagine more mad-scientist applications, where you have a whole bunch of pigs or something, each of which individually is not that smart, but if you had 50 pigs all connected with high-bandwidth fiber, all hooked together into a much larger biological neural network, would you then have a singularity? There are a bunch of these kinds of crazy transhumanist science experiments.
do, but it's kind of odd that relatively few of these have been done in the real world. There's a certain kind of person who would immediately think of a lot of weird, cool stuff you could just try out in biology, and a relatively small fraction of those things have been done — maybe for the best — but in some alternative universe where everybody grew up on transhumanism, I think we would be living in a weirder world by now. Yeah, it doesn't seem that far away from some of the current tech that's being explored that we might get high-bandwidth-enough interfaces, and some of them non-invasive — there are some ultrasound techniques that might be able to stimulate a small region of the brain without penetrating the actual brain, which would be just way healthier. It might be that you can start piping signals between even human brains without having to interpret them on the ML side and the digital computing infrastructure — getting to something close to being able to just think together and flow information through. I mean, there are all these experiments with people who have a disorder where they're born with, or develop, a kind of split corpus callosum, and there have been guesses that you end up developing different personalities, potentially different people, in the two hemispheres. So it might be that we're not that far away from at least some version of early telepathy or something. Yeah, it's definitely possible. I would still place that lower on the probability scale — I think we'll probably get some cool demos, but would I actually expect this to become a big thing? I mean, if you read through the literature on cognitive enhancement, there are hundreds of things that supposedly have all these kinds of effects, but the reality is that very few people bother, and the ones who do probably don't actually benefit much. But we might be surprised. We do have quite a lot of optimization behind language and things like that, so I think it's still going to be hard to do much better than you can by just talking. Yeah, and so suppose that we go down the path of digitizing — getting to full whole brain emulation and so on. How do you see that transition happening? Certainly at the beginning we'll start with one or two examples, first with some animals, and then eventually there will be some moment with a human — how do you see that developing? My guess is it would come after superintelligence. It is an alternative path to AGI, but I've been more impressed by progress in AI than in whole brain emulation over the last ten years, and even before that I thought the AI path was more promising. So in that case it would be superintelligence that invents and perfects the uploading technology, and in some sense it doesn't really matter exactly how it would work — if it's an AI that has to figure that out, then presumably it would figure out a really reliable and smooth way to do it, and we would just sit back, if we wanted to go down that path. Yeah, I
mean, we haven't really done even small animals — you might have thought by now maybe we could have a bee or something, some little thing, but so far not really. It might be that we get to something kind of impressive earlier without doing any brain scanning at all, just by inferring from behavioral outputs. You could already kind of have a GPT-3-like system that roughly mimics somebody's literary style, say, from having read a lot of their work, and you have these deepfake things that can mimic somebody's facial expressions and appearance if you have a lot of video, and somebody's voice. So as these systems get smarter, maybe you could also start to mimic somebody's thinking to increasing degrees. And it's an interesting open question at the limit: if you had radical superintelligence but only the kind of data that is available now — somebody's emails and some video interview or some voice recording or whatever — how much could a superintelligence infer from that data as to what their mind must have been like to have produced those outputs? Is the best model that predicts those outputs one that would actually be similar enough to the original person that it could possibly be seen as a personal continuation? Would it preserve personal identity? Would it feel more or less the same to be in this AI's reconstruction, based on these behavioral traces, as it felt to be the original person? I think it's quite possible that a superintelligence would be able to do a lot with very little input. I don't know how we could get a firm, solid argument for that, but if I had to guess, it seems like you probably could get pretty close, if you were good enough at reconstructing it, just from the typical traces left behind by people today. Yeah, in the extreme, extrapolating out and reviving actual ancestors or something like that. Let's open it up for questions from the audience — we'll take about twenty minutes of questions and then conclude there. Folks in the audience, if you have questions, raise your hand — I think there will be a mic going around — and on Twitter please use the hashtag PLbreakthroughs to ask a question. I'll kick it off with a question that I sourced ahead of time. Marco asks: in your view, where does consciousness emerge — and before that, how should we define consciousness? And, I think this is kind of related to the simulation argument, which one of the three hypotheses do you think is more likely to be true? But let's first start with the consciousness one — where do you imagine consciousness emerging? Like, in the brain? Yeah, but I guess it's more about the level — what level of processing. If you go down in the neural system all the way to something extremely basic, maybe a nematode or something like that, is that conscious? And in between the nematode and a human there's, I don't know, a mouse and so on — where exactly do we get consciousness emerging? Certainly by the time we get to a mouse we're probably past that. Right — I think it's a matter of degree, and that there are multiple dimensions along which you could interpolate smoothly between, say, human consciousness and unconsciousness: different directions you could go where, if you keep going, you in some sense diminish the quantity of experience there is until you get to zero. So one obvious one is, I
mean, you have a kind of integer multiplier, right: if you have two brains in the same state undergoing the same experience, I think you would, in one sense, have twice as much of that experience as you would if you only had one brain. And I have this old paper where I argue you could also have fractional quantities of this, if you build the circuitry that implements the mind with unreliable components — indeterministic processing units. Depending on exactly how you do it, in certain cases I think that as you increase the reliability you would get larger and larger fragments of consciousness until you had the whole thing, but in other cases you would actually get something like 1.3 units of qualitatively identical experience, and you could also go below one and scale it down to zero in that dimension. I think there are many other dimensions as well in which the quality of experience could become simpler and simpler, and less and less morally significant, until it gets to a zone where maybe it's just vague — where our concept doesn't clearly imply a fact of the matter. Once you get down to insect levels, maybe our concept of consciousness is such that, even if you know everything about the insect, it would still be in the vague zone. A little bit like a person who has a certain number of hairs — are they bald or not? I guess I'm bald now, but once upon a time I would have been in that kind of vague zone. And then there are other dimensions: sometimes you're more vividly aware; sometimes you might have some consciousness but no self-consciousness; there are all sorts of weird mental states. I think we might be misled, upon superficial introspection, into thinking that there is this very simple thing that is subjective experience, which either is there or is not there — a binary thing that we understand. I think if you reflect more theoretically, from a computationalist point of view about the brain, you realize that that's a lot more problematic, and I think you could also reach that conclusion by just introspecting more carefully on your own state. Meditators maybe sometimes come to see that things that seem very simple and homogeneous, as it were, are, if you really pay close attention, a lot more flickering and disjointed and unintegrated — there's a lot of structure there that can come apart. And I think that as we move away from the paradigm cases of consciousness — a normal, awake human paying attention — properties that we think go together come apart, and then it becomes more of a verbal question which set of those properties you need to have in order to apply the label consciousness correctly. Next question, back there. Hello — first of all, thank you Juan, thank you Nick, for a really brilliant discussion on the topic of artificial and super intelligence. My name is Alex, I'm CEO at Collective Knowledge Labs, and I want to ask your opinion: maybe the breakthrough in superintelligence lies in the combination and symbiosis of human intelligence and artificial intelligence, and not just artificial intelligence? I think if you squint a little you could say that that's kind of the state of play today, where we don't have an individual system that is superintelligent,
but you could have humanity as a whole, or some big collective like a large corporation or the scientific community, that is at least in certain respects superintelligent, in that it can perform a wide range of tasks at a much higher level than an individual human — but not all tasks, which is why it's not a perfect example. And some of these systems we have today are certainly a hybrid between biological brains, information technology systems like the internet, social networks, repositories of papers, and then a lot of culture as well. You can almost see these phenomena starting to appear more and more — like "the current thing", where there's a particular focus of attention of the global brain. It's becoming more like a human who's obsessed for a period of time with some particular thing, where all the mental resources get focused on one thing and then the attention shifts to something different. We're beginning to see a little bit of those dynamics happening in our collective cognitive space, maybe as a result of the increased bandwidth of interaction and the technology enabling smoother communication — not always producing superintelligence, but other forms of collective mentality, sometimes maybe sub-intelligent in terms of their level of wisdom and understanding. But in certain domains, certainly — you have a research community focused on one particular problem, building on each other's contributions and blogs, and you get a sense of the whole consisting of many different modules that are each looking for the next piece to put on the stack being built together, and the whole stack grows much faster than if it were only one human building it. Right. Next question, from Twitter. Turner asks: what is the most important question which Nick feels he's not in a position to personally solve — two factors, the first being importance to the development of ethical and successful AGI, and the second being Nick's inability or lack of expertise to solve it. Well, there are questions of a more global nature, as in: what is ultimately the right direction to be going in, as it were — the correct macrostrategy. I think we are sort of fundamentally in the dark regarding a lot of the ultimate, big-picture questions, and therefore our march forward is to some extent an act of faith rather than the product of carefully-thought-through insight, and I'm not sure we can get that insight at the moment. So that's one direction in which, at some point, my understanding runs out, and there's probably important stuff beyond that which may or may not be good for us to try to reach, but it's probably there in one way or another. Another would be at the more technical level, if you zoom in and narrow it down — say, for example, with AI alignment there's going to be a whole host of really important, ultimately technical results and algorithms and things like that, which maybe currently nobody has, and certainly I don't have, and probably won't discover either, but which might be critical to the future. And then I guess you could zoom out in another direction, sort of laterally, across the social sphere. There are big problems like how to secure world peace, or how to get
a welcoming uptake of these digital minds — problems at the cultural and communication and political level where I also feel quite stumped. So I'm kind of squeezed in the middle: if you zoom out too much, my understanding runs out; if you zoom down too much into the technical results, it runs out; and if you zoom out laterally, it's also just a little bubble within which I'm trying to keep track of what's going on. Right. Eddie asks: if the speed of light were to accelerate, would this prove the theory that we are living in a simulation, and if not, what quantitative metric would validate the theory? If the speed of light accelerated, I don't see how that... it certainly wouldn't prove it; I'm not sure immediately whether it would even increase or decrease the probability. Maybe think of it as some marker that shows some kind of discontinuity in some quantity of physics that just seems bizarre to us, or something. So there are a lot of things that could change in physics that would maybe be in one sense puzzling and deep and interesting, but ultimately simple — in that there would be some possible physical law, itself simple, that would describe them. And of course you could have situations where it's just chaotic, but you could still capture the statistical regularities through a simple statistical law. That's one type of basic universe we could live in, which so far everything we know seems to be consistent with. Now contrast that with a different possible world which we could have lived in — and could still find out that we do — where maybe something like parapsychology would be true. So you would have telekinesis or something, where what we think of as a high-level, complex macro state — a particular brain in a particular configuration, but not a slightly different configuration, just the types of configurations that correspond to somebody having a particular concept and wish — had, say, a systematic physical impact on some remote system, the way parapsychologists have imagined would be possible. That would be fundamentally different from discovering that the speed of light is accelerating, because it would be the kind of thing that, if it were true, would seem to suggest that there is no micro-level explanation of the world: you could have these macro states that suddenly reach down and change the micro level. So if we made some discovery like that, then that might lend evidence and credence to the simulation hypothesis, because it looks very hard to see how you could get all of this to square up without it — if you still wanted an underlying micro-level regularity, you could have the simulating universe being simple at the physics level but simulating a different kind of universe. The alternative would just be that we don't have that simplicity at the level of basic laws, which I guess we could discover. Now, I don't think that's the only or the most likely way we would find evidence for the simulation hypothesis, if we ever do — that would just be one route; there are other kinds of evidence that would be more likely to be relevant. Yeah, since we're touching on the simulation argument, which of the three
hypotheses do you think is the most likely — sorry, which of the three prongs of the argument do you currently think is most likely? I'm generally a bit coy about attaching probabilities to that, so I tend to punt on that question, for various reasons, including that if I give a particular number it might be misinterpreted. Normally what people want to know about is specifically the simulation hypothesis — that's the one they really want a number on — and I guess I won't attach a probability to it, but I certainly take it seriously. It's not just a logical possibility, or a thought experiment that we can't one hundred percent rule out; it is a live, serious possibility in my view. Yeah, and for those unfamiliar, the simulation argument is a three-pronged argument: either close to zero civilizations ever reach an advanced, simulation-capable stage — a kind of great filter; or close to zero of those advanced civilizations are interested in running such simulations; or, if there's no such filter and they are interested, then close to all beings with experiences like ours are simulated — that last prong being the simulation hypothesis. It comes from thinking about the vast number of people that would be simulated, and the likelihood of your own experience being sampled from among the simulated ones. Sorry Nick, I'm probably giving a bad explanation here. No, no, it's very good. Yeah — another question over here? Yes, I have a question. I've always been very interested in emergent intelligence, especially as it relates to animals — the classic example tends to be beehives. As we look at consciousness, what biases do you think we bring in as individual social animals, humans, versus a collective organism like bees — especially as we look at humans maybe moving to be more bee-like as we create nation states and larger organizations, versus a singleton? How would a singleton perhaps have a different AI-alignment bias? As I think about this, the only really intelligent animals I can think of that don't live socially are apex predators, which is perhaps a bad sign. Let me see if I understand — well, to phrase this differently: if I think about a curve, do you think collective intelligences like hive animals are on one side of the spectrum, with social animals like humans in the middle and singletons at another extreme, or is it more of a horseshoe curve, in terms of the distribution of intelligences and how they work towards common goals that may be malevolent or not aligned with us? Well, if there were a line, I think the superintelligence would be more on the side of the hive insects. If we look at an ant colony, in some sense it acts like a singleton at its own scale — of course there are other ant colonies elsewhere, and other things it doesn't have control over, but it would, as it were, be able to act as a single agent to some extent. Humans, only to a lesser extent — although in some dimensions we are better coordinated, in terms of being able to share detailed information and plans; in that respect we are more coordinated than ants, but in the respect of our individual wills being less aligned to a common goal, we are less like a singleton than an ant colony
is. And I guess you could have had a group of animals that were even more individualistic and antisocial than humans are, and they would then be further away on the other side. So humans would kind of be in the middle, where we have a fair degree of shared purpose — not a full hive organism, but also a lot more than zero. It's an interesting question. Certainly different animals have different goals, at least at the superficial level: some like to eat grass and some like to eat meat, some like to hang around with others of their kind and some like to just do their own thing. And suppose there were some other species that developed superintelligence and aligned it to their values — they might also have different baseline goals that overlap slightly with humans' but differ in other respects. There are two open questions there. One is, epistemically, are there significant differences between the inductive biases that are brought to the table? Presumably there are some inductive biases that differ, but would those be smoothed out reasonably fast as you have more data and more intelligence? Maybe a squirrel would more quickly cotton on to certain things relevant to the squirrel world, and some other organism to other things, but as they developed scientific reasoning, do they have enough overlap between their inductive biases that the differences wash out as you see the full impact of the evidence? That's one question you could ask. Another is: even though these different organisms start out with at least superficially different goals, are they in some deeper sense the same, or alternatively, would they arrive at some shared understanding of what the highest moral norms are, even if their own personal goals differ? A lot of humans individually have different preferences — I care about my family and you care about your family — but we might nonetheless converge, in the sense of "let's respect each other's families"; a cooperative level of more abstract norms might be convergent quite independently of starting point. So those are two questions I could ask there, and I'm not sure what the answers are — I don't know whether that addresses your question at all. I'll take two more questions. One is: how sure are you, Nick, that an evil singleton AI to rule them all would be internally aligned over time — could it be fundamentally set up to split or diverge, with subunits pursuing different ideals or goals? I guess everything is possible, but if it were unified at one point in time, and if at that point it was technologically mature, then I would expect it to remain unified, because I think it would have access to the kind of control technology that would make that possible, and I think it would have instrumental reasons to do so for almost all initial goals it might have at that time. You could imagine some very special goal — like if it specifically has as a top-level goal "a thousand years from now I want to be divided against myself and fighting an insurrection against myself" — if that were its goal, it might actually arrange that, but for most goals it would probably be able to achieve them to a higher degree by working in concert with itself, and then I'd imagine it would also
have the technology and insights to make that happen. If, on the other hand, it starts out as only a sort of vaguely integrated political entity, then even with technological maturity it's not so crazy to think it might come apart later, just as human arrangements do — sometimes you have a well-functioning political unit and then, fifty years later, you have anarchy in a particular state. We can get these temporary, partial solutions. I guess that would also be possible with certain kinds of, say, upload collectives that come together to achieve singleton-level coordination: you could imagine the political dynamics working well for a period of time and then falling apart. I still think that's less likely than it going towards a lasting singleton, but by no means extremely unlikely. And a last question: Devadat asks, if things go well, do you have a vision for how differences of opinion about what a good future society looks like can be accommodated? Meaning, is the future light cone big enough for everyone as they develop very different perspectives and different ideas of what a good future society looks like? How do we reconcile those differences of opinion, and how do we build a meta-system to enable different flourishing civilizations, in a sense? Yeah, I think it's large enough for almost all people to have most of their values accommodated. If you have two people who hold literally opposed values about a particular thing, then you might not be able to satisfy both, but I think a combination of things helps. On the one hand, some differences are perhaps merely superficial, disappearing upon better understanding — there are certain things where we say we want different things, but it's because we have different assumptions about what would actually happen, so those could be diminished by increased intelligence and knowledge and experience. Then there's the increase in resources and the expansion of the technological frontier, and then some kind of creativity in figuring out clever ways of combining values. All told, I think a great deal can be accommodated because of these things, but not necessarily one hundred percent, and then it will be important to have a robust and effective way to manage any resulting disagreements in a way that doesn't produce negative-sum dynamics. And because I think that's ultimately really important, I think we should have a strong bias towards paths forward that are more cooperative and friendly, even if they seem to come at some short-term expense, or if they can't be crisply motivated by some explicit calculation in every single case. That general attitude, as a sort of default bias, is still very much worth bearing in mind as we pursue these different aspects of the challenges ahead. That should be our first resort — sometimes you can't get full cooperation, and you don't want to be completely naive and gullible, but still, that should be the first and maybe the second attempt, and then you gradually scale back from it if really forced by circumstances. Well, all right, that's all the time we have for questions. Nick, thank you so much
for spending this evening with us. It has been extremely enlightening for many of us, and I think it will be very useful to the broader community that is currently working on things like AI alignment and more. Thank you very much for your work, for sharing your insights, and for helping us achieve a lot of great breakthroughs and hopefully have a great long-term future. Thank you very much — a lot of good questions — thank you very much for having me. Absolutely, thanks, thank you, take care. [Applause]
Info
Channel: Protocol Labs
Views: 7,652
Id: VG8lanbnbwk
Length: 101min 34sec (6094 seconds)
Published: Tue Oct 25 2022