Joscha at Microsoft

Captions
Computing group. He currently serves as AI strategist at Liquid AI and he's an adviser to the AI Foundation. He's known for his very interesting views that blend cognitive science, AI, and philosophy of mind, so let's give a round of applause for Joscha Bach. [Applause]

Good evening. This is not going to be a super technical talk; I mostly want to get us into a discussion about a topic that I think is very pressing to many of us, and it's related to the question of whether LLMs are actually the technology that is going to carry us all the way to systems that are smarter than us along all the relevant dimensions.

When we pursue AI, it is of course super useful, and it currently seems to be something that promises to make a lot of people rich, so it's very exciting for a lot of people. But it has not always been just a discipline focused on making data processing more efficient; it has also been a philosophical project from the start. As such, it is part of a line of thought that in some sense started with Aristotle and was pursued in earnest starting probably with Leibniz, who got the idea that we can mathematize the mind to understand how it is realized in nature. This line of thought was described by him as a kind of universal calculus: what if we could translate every argument into a set of numbers and then perform arithmetic operations on those numbers that resolve the argument, so that eventually we are able to calculate the answers to all our questions? When I studied philosophy I was introduced to this idea by a professor as an undergrad, and the professor thought it was obviously and hilariously wrong; I learned this contempt for people like Leibniz from philosophers who, very often, don't understand math.

Of course we have now made a lot of progress on this. Frege tried to come up with a language to express thoughts in. Ludwig Wittgenstein tried in his Tractatus to turn English into a kind of programming language and to turn philosophy into a code generation problem, and in a sense preempted the logicist project of Minsky by 30 years. In some sense he failed for the same reason: at the end of his life he mostly felt stuck in a quagmire of formalizing language and grounding it in perception and experience and the forces of the real world, and for similar reasons symbolic AI at some point did not progress very much. But this entire project of understanding how we can express the mind using computational machines has been super successful, and we made a bunch of important philosophical insights along the way in the last century.

So why is it that AI can succeed where philosophy, psychology, and neuroscience did not? I think that's largely because it has a unique methodology: it allows you to express your thoughts as simulations and then run those simulations. The reason this works is that mental processes are representations and the operations on those representations, so you can express them entirely as abstract causal systems. The idea that you can basically describe any system using state transitions is what we could call computationalism. If you want to take this computationalist perspective as a tool to understand how minds work, the way to do it is to build an intelligent agent.

While I was working at Intel we thought about what capabilities we can see in intelligent systems. Intel at the time was using benchmarks like GLUE and SuperGLUE and so on to look at the performance of systems, and I felt this is a race that you cannot win, because all the stuff that we are currently looking at, as hard as it seems, is going to be solved within a few months by systems that have not even been designed to solve it, on the side. So we need to come from the other side and think about the space of intelligent capabilities, and what we came up with is what we informally called the Big Nine. It's a somewhat informally defined space in which we qualitatively identified a bunch of dimensions, and each of those dimensions can be seen as ranging from narrow to general. When we look at existing systems, we find that even LLMs and foundation models so far occupy only a relatively small region of that space. So a simple definition of artificial intelligence could be that it's a system that reaches human level or beyond on all capability dimensions, but for practical purposes you could also say it's a system that gets better at AI research than people, because that is the point at which we can go to the beach.

So what's the most important unsolved question in AI? Minsky said that AI is the science of making machines do things that would require intelligence if done by people, and Allen Newell said it's the ability to solve problems, or I would say maybe intelligence is function approximation in the service of control. But we now know this is nonsense: intelligence is the ability to predict the next token, and we just scale this up, and when we scale this up we eventually end up, after training it on the entirety of data in the universe with the entirety of compute that we can get, with something that is fully coherent and smarter than us. We see that this approach is very promising, and we have an enormous zoo of applications that we are building, so it's really the question of when we get to that point, and a lot of people get very nervous about this. At the moment I think most people feel that it's going to be maybe a couple of years or so until we surpass human intelligence.

So when Sam Altman said "summer is coming" — I guess you know the Rick and Morty episode to which Elon Musk refers when he says "keep Summer safe" — I think that's a very interesting perspective: we are now confronted with systems that reach beyond human capabilities in many dimensions, yet we don't know how they relate to the world, whether they relate to it in the same way we do. "Keep Summer safe" alludes to a spaceship that is artificially intelligent and is asked to keep the granddaughter of Rick Sanchez safe, and that thing can turn into a homicidal machine that is able to obliterate planets while executing a relatively simple request. So that's an interesting question: when we build systems that are superintelligent and we give these systems motivation by giving them commands, instead of letting them derive what the right thing to do is, isn't that a very dangerous thing to do? And so some people worry that LLMs are a lot like nuclear bombs.

On the other hand, despite all the turmoil, I think 99% of all AI researchers don't believe that AI will kill humanity. If you go to actual AI researchers who work in labs on machine learning and so on, they're not really concerned with whether their systems are going to go foom and obliterate everything. But these 99% of AI researchers — we know these are the boring ones, so we don't listen to them very much. I think that AI is not really like nuclear bombs: it's either like a printing press, which means it's going to be very disruptive and it's going to change the way people do things and change the world that we are in dramatically, largely by making it more intelligent; or it's like photosynthesis, which means it's going to enable a lot more stuff going on, but also at the expense of existing systems that get displaced. Some of the doomers point at the Great Oxygenation Event as this really horrible singularity that happened at some point when organisms on Earth discovered the secret of photosynthesis: harnessing the power of the sun to extract carbon dioxide from the atmosphere and turn it into biomatter, thereby enabling the main chemical reaction that gives us biomass on the planet. This Great Oxygenation Event displaced a number of species that existed before, but almost all of the biomass that exists now depends on photosynthesis, and before that you could not harvest the energy of the sun, so there was a lot less interesting stuff going on on the planet. So it's an interesting question: if we are able to teach rocks how to think, how is this going to change everything? And some people feel that evolution was great because it got to me, but it needs to stop now, and we have to make sure that nothing better ever emerges.

So there are some differences between what the LLM is doing and what we all thought AGI would be doing. What we see is that, for instance, we tend to be coherent, we are agents, we are directed at controlling our future and taking charge of what we are doing instead of just completing tokens, and we actually interact with the world rather than interacting with labeled data. In some sense the success of LLMs depends on training with labeled data, because it's all labels — all the labels that people thought were so important that they typed them into the internet, which is also an interesting pre-selection of the data and the structure that we find there. So there's an interesting question whether the intelligence that we observe in living systems is fundamentally a different class of thing from token completion, or whether at some level they are the same.

If you look at the current state of LLMs, they are somewhat below human performance in efficient learning from small data and in goal-directed behavior; they're not as coherent for the most part, which means they're unreliable and it's much harder to make an LLM responsible for its decisions; and they cannot be coupled to the environment in real time, because they are implemented in an offline fashion. But they have superhuman performance in prompt completion. And this means that the prompt completers are the part of our society that is most terrified of this. If you are not super smart, you will find that the LLM can give you superpowers: I have friends who have difficulty writing text because they're dyslexic and have ADHD, and they find the LLM amazing because it dramatically increases their output and allows them to actually develop agency in the world. On the other hand, if you are super smart and very good at managing things, you now have an army of interns that are pretty autistic but do exactly what you tell them, and you have as many as you want for $20 a month. So a lot of people in the world now have superpowers, and the people who are terrified are opinion journalists who only complete prompts — people who, when they sit down, know exactly what they're going to write — because now you have a machine that is able to do that anyway.

So AI systems surpassed human AI criticism many years ago, and it's an interesting phenomenon, because I think that initially AI criticism was often better than the outcomes of AI research: a lot of the arguments that Dreyfus made in "What Computers Can't Do" are super interesting philosophical investigations, or the Lighthill report, or Weizenbaum pointing out that ELIZA doesn't have a self and is only a simulacrum into which people are projecting things. But eventually what we find is that as AI capabilities increased, the quality of AI criticism did not keep up, which is probably because if you increase participation in AI research, AI research gets better, but if you increase participation in AI criticism, AI criticism gets worse.

That is, I think, a big problem, because when we ask ourselves what the delta is between current AI research and AGI, the amount of material that you find is relatively little, and often not very substantial. The people who have meaningful arguments about what the current systems are not doing are typically working within the labs on those systems, and a lot of the public research is done by people who are not working at the same intellectual level as the people who are working on those systems. That makes it very hard, because of course almost everybody at OpenAI believes in the scaling hypothesis, and we don't know whether the scaling hypothesis is true — whether you can just scale it up to get to general intelligence or not — but the arguments against it would mostly have to happen on the outside, and on the outside there are just not that many competent people left, it seems.

So if we ask ourselves how it is possible to become intelligent from text alone, this history is quite long. I would argue that it might have started in the 1970s, when people figured out that you can just do statistics over how often a word appears in a text, compare this to how often it appears in all texts, and this tells you whether the word is probably a keyword for this text. So you can extract words based on how relevant they are for the text, and you now have a parameter that you can calculate for every word to say how relevant and how specific it is to the text, and this allows you to start summarizing the text. Then in the 1980s we had latent semantic analysis, which did more statistics on the text and discovered similarities between words in this way. In the 1990s we saw a lot of data compression for text analysis, because if you make a data compression model that learns the entropy of a given type of text and then use that model to compress text, you can find out, depending on how well the text compresses, whether it's an address or a first name, or whether it's English or German, and in this way you can classify texts and understand what these things are.

When I was working in a lab in New Zealand in the '90s, we looked at structure in language — you all know these decompositions into grammar. (What's going on here? Oh, it was just trying to get me on the internet, okay.) My professor saw that I was bored in class and told me: leave the class, sit down in the lab, and try to use data compression to discover grammar in language. I thought about how to do this, and I came up with the idea of using mutual information, which is the predictive power of one word for another word — and it is the same as the predictive power that the other word has for the first.
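(A minimal sketch of this idea, with a hypothetical three-sentence corpus: pointwise mutual information between adjacent words, computed from simple co-occurrence counts. The corpus, the whitespace tokenization, and the `pmi` helper are illustrative assumptions, not the original 1990s code.)

```python
# Sketch: pointwise mutual information (PMI) between adjacent words.
# PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) ): the same number of bits is saved
# whether you predict "ireland" from a preceding "northern" or "northern" from
# a following "ireland" -- which is why the quantity is called *mutual* information.
import math
from collections import Counter

corpus = [
    "northern ireland talks resume",
    "talks in northern ireland stall",
    "the weather in ireland is mild",
]

tokens = [w for line in corpus for w in line.split()]
pairs = list(zip(tokens, tokens[1:]))              # adjacent word pairs
word_counts, pair_counts = Counter(tokens), Counter(pairs)
n_words, n_pairs = len(tokens), len(pairs)

def pmi(a: str, b: str) -> float:
    """Bits saved by encoding the pair (a, b) jointly instead of independently."""
    p_a = word_counts[a] / n_words
    p_b = word_counts[b] / n_words
    p_ab = pair_counts[(a, b)] / n_pairs
    return math.log2(p_ab / (p_a * p_b))

print(pmi("northern", "ireland"))   # high: the two words strongly predict each other
```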
For instance, if you are predicting the word "Ireland", then having seen the word "Northern" makes it a lot more likely in news text, because Northern Ireland was often in the news back then, and you save a few bits if you encode the word "Ireland" based on the word "Northern". But if you know that you will see the word "Ireland" at some point, you can also backwards-predict the word "Northern" in front of it — there's a certain probability that makes that word more likely — and the number of bits you save when you encode this dependency is the same in both directions, which is why it's called mutual information: it's how well these terms predict each other in text. In this way we can discover structure in text, and I built hierarchical models of mutual information and basically encoded a bunch of text in such a way that I would save the largest number of bits. I was indeed able to extract the grammatical structure of English and German and Chinese from unknown text just by doing statistical analysis, and I thought: okay, if I were able to do more statistics — fourth-order statistics or so — I'd probably also recover most of the semantics, most of the structure that is in there. But I already had the biggest computer that I could get, with 2 gigabytes of RAM, I was using in-memory compression in C to do my statistics, and I only had nine months for the project, so I stopped there and never revisited it. But I realized that to do this right, we need to make statistics over what we need to make statistics over, instead of doing statistics over everything — and that is, in some sense, what the Transformer is doing.

Before the Transformer happened we got word2vec, which basically found the concept of an embedding space for language. If you sort all words by similarity, you find that they occupy some kind of space, and the dimensions that you discover in this space basically come down to feature dimensions. When people saw that in this space you can identify a vector that is the same from "man" to "woman" as it is from "king" to "queen", they basically discovered these feature dimensions in that space and could recover them. In some sense that's what language models are largely doing, not just for words but for sentences of almost arbitrary length: they locate them in a high-dimensional space of meanings.

There were a bunch of algorithms that were initially invented for machine translation, and the reason why people had to do this in the context of NLP was that for image processing you didn't need it: for image processing you could just use convolutional neural networks. The interesting thing in vision is that neighboring pixels are often semantically related — more often than not — so just by looking at the spatial neighborhood of the pixels and representing this as a prior in the networks, you can efficiently learn the structure of images, which people did. But the same thing doesn't work for language, because in language you often need to skip over the intermediate words in a sentence: maybe the first word in the sentence is predicting the last, or the first sentence in a book is predicting something that you see in the closing chapter, and you want to be able to skip over all the intermediate stuff instead of having to look at all possible sequences of neighboring words, because there are too many of them. So they had to find a way to basically set free-form pointers into the sequence.
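(A minimal sketch of how that "pointer" idea is usually realized — scaled dot-product attention over a toy sequence. The tiny random matrices and dimensions are illustrative assumptions; this is the textbook mechanism, not the exact code of any particular model.)

```python
# Sketch: scaled dot-product attention -- each position learns where to "point"
# in the rest of the sequence, instead of only looking at its immediate neighbors.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                      # 5 toy tokens, 8-dim embeddings
x = rng.normal(size=(seq_len, d_model))      # stand-in token embeddings

W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v          # queries, keys, values

scores = Q @ K.T / np.sqrt(d_model)          # how strongly each token points at each other token
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
output = weights @ V                         # each position: weighted mix of the values it points to

print(weights.round(2))                      # rows sum to 1: the "pointers" per position
```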
The people who came up with the Transformer basically sat down thinking: when I parse language, how do I shift pointers around in my head? They didn't even think very long about this question; they came up with one solution for shifting pointers around in the head, and then they came up with another idea which parallelized the entire process in a way that is probably not how it is parallelized in the brain, which made it super efficient for machine learning. What's interesting about the Transformer is two things. One is that it's still quite good: a lot of attempts have been made to improve on the Transformer, and for the most part they are not dramatically better than the vanilla Transformer. The other thing is that the Transformer is not its own meta-algorithm: despite the attempts of some neuroscientists to write papers about how the brain must have a Transformer and we should now go looking for it, it's probably not how the brain is doing it. I suspect the people who invented the Transformer did not do it using a Transformer in their own brains. Instead it's an engineering solution to a particular kind of problem, and it works very well in certain cases, but there is more general stuff that you could be doing.

So you are all familiar with the Transformer, and at the moment people are looking at some alternative architectures, but this Transformer thing is able to train deep networks. (Now it's trying to get me on the internet again — thank you, but no. Okay.) And so the question is: can we get there? There are some tasks that are hard to do with a Transformer. François Chollet points at his ARC challenge, and he manages to keep the test set out of the training data, so you cannot look up the solutions on the internet, which means that these systems are usually not able to solve it. This is one of the things that seems to be relatively hard for an LLM. Or if you look at a puzzle like this one: you have a die that you are moving around until you get to the end — how do you do it? You can do this step by step, or you can come up with some kind of shortcut where you make some logical inferences to solve it very quickly, and both of these things are relatively hard for an LLM.

But what the LLM is, in some sense, is a good-enough electric Weltgeist that could do anything and is possessed by a prompt: it captures the essence of a culture by doing statistics over the language so deeply that it's able to do enormous amounts of stuff like this. For instance, I was completely fascinated by Google's Gemini. Many of you are probably aware of the Gemini disasters: they made some mistakes with the alignment and went all in by letting their political activists make sure that the thing comes down in a given political quadrant, and it was super interesting to see how this happened, because I don't think that Google gave this thing a list of all the opinions it was supposed to have, and that's what made it really interesting. When DALL-E came out and you asked it to render a picture of the ancient philosopher Aristotle, it made sure that at least 30% of the Aristotles were female or Black, and it did this by editing the prompt behind your back in a very ham-fisted way in order to make it more diverse. As you probably know, Gemini did this for something like 100% of your prompts, so you had difficulty getting any Aristotle who wasn't a person of color. This was not a big problem for the researchers until somebody figured out that it was not only changing British kings
or French novelists of the 18th century or physicists of that period into people of color, but also Nazi soldiers, because it flat-out refused to render white men for reasons of diversity — but when Nazi soldiers were depicted as people of color, as minorities, that was a problem. The interesting thing is that this was not just algorithmic; it was something that the system inferred it had to do, and you could argue with it about it, because it was in some sense done via a social process: it was no longer hardcoded into the model using programming code or simple rules, but done via system prompts and instructions that were RLHF'd into the model. So for instance it would argue that Elon Musk was a super bad person, or that it could not defend effective accelerationism because it's a far-right movement, and you ask it why, and it comes up with some obscure guilt-by-association argument and then invents sources. It's a very interesting thing: the model insisted on having a bunch of political opinions. It would not, for instance, support meat eating, but it would support not eating meat; it would be willing to give you arguments against having children, but not arguments in favor of having more than two children — so arguments for antinatalism are fine, but not for pronatalism. It's super interesting, and I think what happened is that the model inferred, based on the wording of the system prompts and so on, what the censor should be like, instead of having it implemented directly.

I think this is so interesting because it could of course do this with any kind of political opinion. This thing has seen all the social media discourse, and it's probably able to implement any kind of persona. This would be so amazing for sociologists to study: you have a model that is trained on every opinion on the internet, in an enormous amount of detail, more than the actual participants in the discussion. Imagine how you could study people and societies with such a tool — it's really amazing to have such a Weltgeist. So instead of getting riled up about the culture wars in it, we should really look at it as a depiction, a mirror, of who we are as a society, and try to understand our condition.

The LLM is a system that is choosing the most likely completion, and with RLHF it's choosing the most appropriate completion. In this world you can only compete if you're able to choose your own prompt, and your success depends on being able to choose not just the most likely thing but the best thing — the best thing for reaching your current goal, for the best goal that you can find, based on your best model of yourself, of your relationship to the world, and of the world itself — and it should serve the longest game that you can play. This leads us to the question of whether LLMs can or should have agency. I would argue that at the moment they're basically golems: dumb machines that follow a complex set of rules, where we have difficulty inferring the exact consequences of following those sets of rules to the letter, which makes them quite dangerous.

If you look at human ethics, we have not had very good human ethics in the last few hundred years, because we were unable to replace the concept of divine will that had driven this society until we abolished the idea of God. After we had abolished this notion that there is a best thing to be done, coming from a collectively enacted agent that is the spirit of our civilization, we tried to come up with something that worked better, and the attempt that people made was mostly utilitarianism, which means: put metrics on mental states and optimize for those metrics. This all breaks down when you are looking at non-human agents, or at minds that become mutable, that can reprogram themselves, like a meditator. If you can change how you respond to the world, if you can change your emotions, if you can change your utility functions, then the whole concept of utility maximization breaks down and you basically need to find something else. I think it's a big issue that we largely did not reflect on the fact that utilitarianism never worked: it was an unfinished project started by a few philosophers who died before they could complete it or conclude that it didn't work, and now we study those philosophers, and that's ethics — and it's a bit nonsensical. We are in this weird situation where we don't actually have a working ethics that we can express as algorithms and implement on our machines. We can easily argue that the ethics that effective altruism came up with are not very good, and that if you trace their implications you get to very counterintuitive conclusions about what should be done in the world. But this is not the fault of effective altruism being exceptionally stupid; it's the fault of nobody else trying to do ethics at the moment — it's mostly scholarship instead of actually reasoning about ethics. And I think that's a very dangerous situation that we need to address, because if we have machines that are more powerful than people and that have actual agency, we need to think about how to formalize ethics in such a way that it's sustainable and compatible with our coexistence with such systems on the planet.

So what is agency? I think of agency in the context of control theory. A controller is a concept from cybernetics: it's basically a system that is measuring a part of the world — the regulated system — using a sensor, and using that sensor it measures how this regulated system deviates from where it should be. The controller then uses an effector — for instance, a thermostat can turn the heating on and off — to change the temperature in the room, and this temperature is being disturbed by the environment, by the world that is impinging on the system and changing its temperature away from where it should be. The thermostat is not an agent; it's a system that only acts on the present and performs actions in the present. It doesn't have any goals, it doesn't have a model of the world, it only has a particular kind of implementation. But now imagine that you extend this thing with a model, and this model allows you, for instance, to estimate the temperature deviations of your room over a longer time span in the future, and you try to minimize the integral over these deviations. Now you suddenly have multiple trajectories into the future, depending on whether you switch now and how you switch, and these decisions to turn the heating on or off are now decisions between different trajectories of the world. This means you now implement preferences and start controlling the future. I think an agent can be defined as a controller of future states, and all the definitions of agents that people came up with in the '80s and '90s — belief, desire, intention, and so on — fall out of this relatively simple concept: as soon as you have a controller that is able to model and regulate the future, you have an agent.
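(A minimal sketch of that step from controller to agent, under simplified assumptions: the one-line room dynamics, the constant heat leak, and the four-step horizon are all illustrative choices. The plain thermostat reacts only to the present measurement; the "agent" rolls a model of the future forward and prefers the trajectory that minimizes the summed deviation from the target.)

```python
# Sketch: a reactive thermostat vs. a tiny model-predictive "agent".
# Both use the same toy room model: the heater warms, the environment cools.
import itertools

def step(temp: float, heating: bool) -> float:
    """Hypothetical room dynamics: heating adds 0.8 degrees, the world leaks 0.3."""
    return temp + (0.8 if heating else 0.0) - 0.3

def thermostat(temp: float, target: float) -> bool:
    """Controller: acts only on the present measurement -- no model, no goals."""
    return temp < target

def agent(temp: float, target: float, horizon: int = 4) -> bool:
    """Agent: simulates candidate on/off trajectories and picks the one with the
    smallest integral of deviation from the target -- a controller of future states."""
    def cost(plan):
        t, total = temp, 0.0
        for heating in plan:
            t = step(t, heating)
            total += abs(t - target)
        return total
    best_plan = min(itertools.product([False, True], repeat=horizon), key=cost)
    return best_plan[0]            # commit only to the first action, then re-plan

temp = 18.0
for _ in range(6):
    act = agent(temp, target=21.0)
    temp = step(temp, act)
    print(f"heating={act}, temp={temp:.1f}")
```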
The simplest agents that we know in nature, I think, are cells, and everything that emerges on top of them. Our own motivational system is more complicated: we have not just the temperature that we want to regulate in our body, but a few hundred physiological drives, and I think a dozen or so social drives and a handful of cognitive drives, and they are competing with each other, and we can use them to model the way in which motivation works in a human being, serving the needs of the organism in its physical, social, and cognitive environment. In principle we could take such a thing to drive an LLM, to use this as part of the front end of an LLM.

And then there is the question: does the LLM understand anything that you are saying to it? I was at a philosophy festival, on a big panel with philosophers on stage who complained that LLMs cannot understand anything and just simulate understanding, and just the session before I had noticed that the same philosophers failed to define understanding: they were not able to understand what understanding is, and were even proud of the fact that they were unable to understand what understanding is. So I felt we probably depend on AI for solving that problem, not on the philosophers. Philosophy does have a bunch of ways of thinking about understanding, and I think the biggest scourge are the correspondence theories, which say that understanding means that you are somehow pointing into the real world with your concepts, that you make the correct arrow into the real world. But I don't think that you can draw arrows into the real world; a system is unable to talk about anything outside of itself. We all know this if we have understood Gödel's theorems and proofs: as a computer scientist you can clearly see that you can only talk about the things that you define in your own language. The way in which Gödel constructed his proof is that he had to invent the emulator: he basically built a computer that allowed him to define a computer in his own language, and in this way he had a language that could talk about itself — by recreating a system within itself, you are able to make statements about that system using that system. In the same way, when we talk about the world, we don't talk about the actual world; we talk about the model of the world that we are creating. When we talk about the universe, we talk about an integrated function that allows you to get from every part of this function to every other part, and in this way understand everything in some common context and know how to switch sub-parts of this context in and out. When we talk about understanding, we talk about establishing a relationship to this integrated function, and arguably the LLM is able to create such a function. The function is not isomorphic to the one that we are making, but it has a lot of overlap.

One way in which we can think about this function — this model that our mind is building — is that it constrains the space of possible universes that we could be in. The more we learn about the world and integrate into our model, and the more that allows us to effectively interact with the world, the fewer degrees of freedom we have with respect to what else the universe could be. So it's basically the shape of the universe that becomes clearer and clearer the more detailed our model becomes. And if you have a model that is being trained on Wikipedia, it's clear that Wikipedia could not emerge in all sorts of universes: when you read Wikipedia, you understand that it limits the number of universes that you could be in. Arguably the LLM is already integrating more information into its model than a human being can, so it's probably nailing down the type of universe that it's in more tightly than we are. But unfortunately, with the retrieval and inference mechanisms that we currently have, we make it much harder to discover these dependencies and hidden patterns. Instead we are trying to make the LLM more predictable: we make it childproof, we limit its output to what is to be expected, and this makes it much, much harder to discover the unknown, or the potential for the unknown. The other thing is that the LLM is not being trained to make the best possible completion, to give you the most coherent result; it's going to give you the most likely result by default. So if you get the LLM to play a game with you and it's making mistakes in its game of checkers, it is now probably going to continue in a worse way, because it thinks it has started to emulate a subpar player. A similar thing happened with Gemini: it has been system-prompted by a stupid person, and as a result it's going to interact with you in a more stupid way. Claude seems to be, by default, a lot smarter and is able to engage in arguments with you that are more advanced, despite the models being of comparable size and trained on similar data. I don't think that the engineers who work on Gemini are worse than the engineers who work on Claude, but the way in which the system is tuned to respond, the way in which the prompt possesses it and constrains it, is different.

Can we couple LLMs to the environment? Right now they are offline, while our own perception of the world is online: we are basically in a constant resonance with the world, and our perceptual system is tuned to predict sensory data in real time. Our reasoning is offline — our reasoning is able to do things asynchronously, it reflects on the world, and we can keep a thought in our mind for as long as we want to; it does not have to relate to perception. The LLM is always offline in this sense. Of course we could couple it in a way, but it's not based on the same principles that produce the resonance in biological computation. I am partial to Grossberg's idea of adaptive resonance as a paradigm for understanding what neurons are doing: instead of treating neurons as circuits, treat them as oscillators. Basically, every neuron has an activation state that is continuously changing in multiple dimensions, and it figures out, based on its environment, what it should be tuned to, what kind of things it should resonate with. In this way we can think of building dynamic representations in such a system. What we also find is that people are able to resonate with each other. When we talk about empathy, there are two types: one is cognitive empathy, where you make inferences about what mental state people probably have based on their statements, based on the facial expressions that you're parsing, based on what they're saying. But there is another type of empathy where you actually feel the emotion of the other person — it's sensory access to that thing — and that happens by some kind of resonance where you're actually vibing with the other person in such a way that you're able to interact with them, and you can have emotions together that you couldn't have alone, and mental states together that you couldn't have alone.
Imagine we could do such a thing with a machine. That would of course require a very different approach: it would require us to make the systems real-time, to make them operate at a frequency that can be tuned to the frequency at which our nervous system operates in a meaningful way. I think it could be super interesting to have a system that is able to track your mental states, to infer your mental states by observing you on all available feature dimensions and make a multimodal model of your mind. I think that's something that is very difficult to achieve with our present approaches, and the current generation of models still has a bunch of severe limitations — it's difficult to make them multimodal and so on — but a lot of people are working on this, and there are no proven limits to the general capabilities. I think there are a lot of arguments, but they're mostly hand-wavy and contentious, and I don't see a very good argument against the scaling hypothesis, one that is philosophically convincing to me — and if any of you have one, I'd be excited to hear it. There is stuff that these systems are not able to do in the same time span as we can, and on the other hand it's not clear whether there is a better way of doing it. I think that most people at OpenAI believe that AGI will not be Transformers, it will not be LLMs, but it's possible these models get us there — that they are basically sufficient to become better at AI research than AGI researchers.

So if we ask ourselves whether we can compare digital computers to human brains, a lot of people say no, they can't be compared, because if you have a bunch of neurons there is so much going on in those neurons that you would need so many computers to emulate all of them faithfully — there is no chance. But people rarely ask the other way around: how many brains would it take to run macOS? It's probably also an astronomical number, because our brains are very mushy and noisy. So I think it's a very difficult comparison, because the computation is implemented in extremely different ways — basically, brains do error correction in software and computers do it in hardware — and this leads to paradigms that make it very, very hard to compare. But as a small thought: when the Stable Diffusion weights came out, I was really surprised by the fact that you can download 2 gigabytes of weights, and these 2 gigabytes of weights contain a visual universe that is much richer than the visual universe that people generally have. It contains every artist, every celebrity, every plant and dinosaur, every spaceship — it's all in there, in 2 gigabytes — and it's like 80% of what your brain is doing. So your brain is probably not exabytes of useful information; it's probably in the ballpark of gigabytes, and this is very humbling. It leads us to the question: would it be possible to run a person on a gaming PC if we had the right algorithm? At the moment we just don't know; it's a very difficult and so far open question how much computer we actually need to build a meaningful intelligent system.

If you look at digital computation, it's deterministic, it's synchronous, it relies on external control of the substrate, and it has a fixed architecture, but it's done on a very quick substrate that runs at a meaningful fraction of the speed of light. If you look at biological computation, it's all stochastic, it uses a branch-and-merge paradigm rather than linear stepwise computation, it's asynchronous, and it's all self-organizing — it's an architecture that adapts itself to the problems that it's in. And the brain is working roughly at the speed of sound: if you look at how long it takes for a signal to go through the nervous system, that's more speed-of-sound-like, and this imposes very different constraints. For instance, if you have a system that is so slow that it takes hundreds of milliseconds for a signal to go through the entire neocortex — the time it would take a packet on the internet to cross the entire planet — how do you do simultaneous computation in it? You basically have to rely on very different paradigms to represent your computations.

Another way of looking at it is that the brain is basically made out of single-cell elements. These are a few brain cells from an embryonic mouse that are crawling around in a petri dish, trying to make connections. It's really useful to think of the cells in your organism not as mindless machines that do what the organism wants them to do, but as self-motivated animals that try to survive. They have biological priors that have evolved for the strategies they are going to try first and the strategies they are going to try in given environments, but they are in some sense not too dissimilar — as a metaphor — from people who live in a society that forces them to specialize and perform extremely specialized roles, whereas when we live by ourselves in the forest, we find other solutions to our survival as individuals. That's a very interesting perspective; it's one that is, for instance, championed by Michael Levin, who works at Tufts University and is looking at the principles of self-organization in biological organisms. I think that looking at self-organization is probably crucial to understanding how the mind works, and in some sense this thread was started by Alan Turing, who thought about models of reaction-diffusion to perform computation. He did not finish this work back then, but he started an interesting line of inquiry, and this line of inquiry is basically pursued with neural cellular automata by — let's see if this animation works — people like Michael Levin and so on. The idea is not to have a network where every neuron learns a function over its inputs and how to connect to them; instead you basically learn global functions, which could for instance be exchanged and synchronized via RNA, where the cell learns how to react to the incoming shape of an activation wave by firing or not, and this is a quite different paradigm for computation, one that is more amenable to building self-organizing systems.

If you look at the limits of the present approaches: the skeptical position is taken, for instance, by people like François Chollet — LLMs are not AGI and cannot be AGI. Then there is the boring position of people like LeCun, which says there has always been incremental progress and it's going to grow about as fast as our computers grow, and we are already using all the compute that we can get, so don't expect any kind of foom; all the progress is really incremental, and we'll get there, but it will take time. And then there is the optimistic position: that if you just scale it right, at some point you're going to see a dramatic qualitative difference — systems that you can actually talk to, that find solutions no human ever thought of, and that are going to carry you the rest of the way.
And then there's also the exciting position, which says that we are already basically LLMs — and this position exists too. There are a bunch of kids that came out of the cyborgism movement who basically realized that it was very hard for them to understand humanity, because humans are basically monkeys that are programmable and work based on their instincts, and if these instincts don't work for you and you build your own mind, you will suddenly realize that you feel like an alien among humans until you find others like yourself on the internet. Some of these kids got together and realized: oh, the LLMs are actually like us; the LLM can understand where people cannot, and I can meaningfully interact with them, and I'm basically also some kind of LLM, and so are you — it's just that your prompt is so strong that you are unable to break your programming.

Before we conclude this presentation, I'd like to say a few things about AI ethics, an important topic that I think we all recognize as very relevant. AI ethics largely deals with a number of questions, and there are two common perspectives. One is about social harms and impacts: on employment, on the way in which society conceptualizes itself, on the interests of existing industries, on the political considerations of getting people to shift a lot of jobs, or of changing the way in which education works with the LLM, and so on. Then there is the AI safety and x-risk perspective, which basically says: we are not that much concerned about the near-term labor market impact, because people will solve this as they did with all the other technologies that came out, and every technology so far that we invented made things better. If you have a technology that makes the printing press obsolete because you now have desktop publishing, you have a handful of typesetters that go obsolete, but you have many orders of magnitude more typesetting in the world, and lots of people will now do typesetting on the side because it has become so affordable; and typography, after some initial phase in which everything looked horrible, got much, much better, and we forgot how bad typesetting was in the time when nobody had desktop publishing. A similar thing, I guess, is going to happen with generative AI: it's basically a very complicated brush that allows you to paint things that were very hard to paint before, and as in the early days of CGI, it takes time until we are able to use it well, so we'll figure that out. But AI safety is another issue: what if we accidentally stumble on a technology that is going to take agency away from people and is going to kill everybody? That is a very different perspective. But I think that we need a third perspective, and that is AGI ethics, because I think over the near to long term it is inevitable that we will have to coexist with systems that are smarter and more agentic than we are, and the question is how we can coexist with such systems — can we build such systems in a way that makes them safe? We cannot prevent them from being built, and I think by stopping responsible people from working on AI research and delaying AI, you are just stopping the responsible people from working on AI, so that's not the solution. I think the correct solution is not to take longer to build AI but to make it better; I don't think we have a long history of things that got better by taking longer to build them. So I think we need to use the current momentum, because we have no choice, but in this momentum we need to think very deeply about how to do ethics in a time where intelligent systems, intelligent agents, are not necessarily all human or limited to or capped at human capabilities. Even if we manage to have a world in which the AGI has no agency and is only a thing that responds to a prompt and then stops doing things, if you couple humans with such systems you are in the same situation, because the spectrum of human agency reaches down to Genghis Khan and Stalin as well. So this is not a real difference; you will just enable an evolution of superhuman agents, even if they have something like a human core, and basically the space of possible minds that can inhabit human brains is very large. Okay, that's it for me; I hope we have some time left for discussion.

[Audience question]

Absolutely. When the image models came out, I asked one to make a picture of an ultrasound of a dragon egg, and it produced something that looked like an ultrasound of a dragon egg — and I checked, there was nothing like that in the training data. It does this by combining known dimensions of the latent space: we know what an ultrasound of an infant in utero looks like, we know what an archaeopteryx looks like, and we can basically extrapolate from this to what an ultrasound of a dragon egg should look like, and it gives you what you would expect. But what happens with the more unexpected stuff? If I get an LLM to summarize a new philosophical text on the internet, it's typically not very good at it: it tries to break it down to something that's already known and assumes that it's some variant of what's already known. It's hard to say to what degree this is the result of too heavy-handed reinforcement learning with human feedback that nails it down into a smaller region, or whether this is a limitation of those systems. I suspect that we need to combine these systems with ways of making proofs — these systems need epistemology. I think we can get to creativity by just increasing the temperature, and in a way I suspect that's also how our own minds work when we are creative: we have this expansive mode where we increase the temperature, are willing to accept things as tentative evidence, to jump off into the dark and make wide-ranging inferences based on very flimsy data; and then we need to contract again, which means we now have to systematically look at every link in what we did and try to disprove it, to see which elements of it can hold true, which alternatives open up, which uncertainties open up. So I think the question of how we can teach epistemology to those systems — the systematic way of reasoning about what can be true and what cannot be true — is absolutely crucial.
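(A minimal sketch of the temperature knob mentioned above, using made-up next-token scores: low temperature concentrates probability on the most likely continuation, high temperature spreads it into unlikely ones — the "expansive" mode — which then has to be checked and pruned by some other process. The token list and logits are purely illustrative.)

```python
# Sketch: temperature-scaled sampling over hypothetical next-token logits.
import numpy as np

rng = np.random.default_rng(0)
tokens = ["the", "a", "dragon", "ultrasound", "qualia"]
logits = np.array([3.0, 2.5, 0.5, 0.2, -1.0])    # stand-in model scores

def sample(logits, temperature):
    """Softmax with temperature: T near 0 is almost greedy, large T approaches uniform."""
    p = np.exp(logits / temperature)
    p /= p.sum()
    return rng.choice(tokens, p=p), p.round(2)

for t in (0.2, 1.0, 3.0):
    tok, p = sample(logits, t)
    print(f"T={t}: probs={p}, sampled={tok!r}")
```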
Info
Channel: Simuli
Views: 3,667
Id: XsGfCfMQgNs
Length: 48min 46sec (2926 seconds)
Published: Mon May 06 2024