Can We Build an Artificial Hippocampus?

Video Statistics and Information

Captions
It is humanity's longtime dream to create machines that can think. But what exactly does that mean? One particular characteristic of intelligence is the ability to generalize knowledge and flexibly adapt it to new situations, and such generalization is indeed one of the cornerstone problems in modern machine learning. In this video we are going to see how we can take the biological organization of the hippocampus, a brain structure involved in memory and navigation, as an inspiration to construct a computational model that can learn to build abstractions and generalizations, as well as the insights we can draw from this model, both about our own brains and about the field of artificial intelligence.

Before we begin, I'd like to warn you that this video is a continuation of the previous video in the series on cognitive maps. Last time we explored the neurobiological background of hippocampal computations and introduced some general principles, so if you haven't seen it, I highly recommend checking it out before watching this one, since we are going to build up from there. If you're interested, stay tuned.

Imagine you're an agent that walks around the world and whose only goal is to find rewards. From an evolutionary perspective, you can think of such an agent as an early organism that needs to look for food or mates. Now, as an agent, you have a certain repertoire of actions you can take, for example activating a sequence of muscles to move in a particular direction. To choose the most rewarding actions, you need to be able to predict their outcomes, and that effectively requires a mental model of the surrounding environment. The existence of such a model allows you to run mental simulations in your head to weigh the actions: what would happen if I go straight, or is it better to turn right?

Over the course of your lifetime, as you encounter a variety of different environments, you might initially build an entangled, indivisible model for each, without necessarily linking different models to each other. However, if you are being optimal in your representations, at some point you realize: wait a minute, all these models I have built so far actually have an awful lot in common. Indeed, walls that block your way, doors that allow you to go through the walls, and even just the structure of an open 2D space itself all work similarly in every environment, so these common elements can easily be reused. In other words, it makes sense to break up, or factor, each model into its building blocks: for example, building blocks of space, of boundaries, of reward, etc. Once these building blocks are learned, we can rearrange and mix them in different configurations to build new models of the world on the fly and thus generate flexible behavior. As you might remember from part 1, this is exactly what the mammalian hippocampus does, and we can find neurobiological evidence for this process in the responses of individual cells.

Now the question is: can we teach a machine to do the same? To make the task easier for an artificial system, let's formulate it as a prediction problem. Namely, the model will receive a sequence of observations along with the sequence of actions that led to them, and learn to correctly predict the next observation in the sequence. This actually makes a lot of sense biologically: there is a great deal of data suggesting that the main purpose of the brain may be to predict incoming stimuli and try to minimize the prediction error, a theory called predictive coding.
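To make the objective concrete, here is a minimal sketch of this prediction setup in Python. It assumes integer-coded observations and actions and a hypothetical `model.step(obs, action)` interface that updates an internal state and returns a probability distribution over the next observation; this is only an illustration of the training signal, not the actual TEM implementation.

```python
import numpy as np

# A toy dataset: each trajectory is a list of (observation, action) pairs,
# where observations and actions are integer codes.
# The model's only job: given o_1, a_1, ..., o_t, a_t, predict o_{t+1}.

def cross_entropy(pred_probs, target_index):
    """Negative log-likelihood of the true next observation."""
    return -np.log(pred_probs[target_index] + 1e-12)

def sequence_loss(model, trajectory):
    """Total prediction error along one walk through an environment.

    `model.step(obs, action)` is an assumed interface: it updates the
    model's internal state and returns a probability distribution
    over the next observation.
    """
    loss = 0.0
    for (obs, action), (next_obs, _) in zip(trajectory[:-1], trajectory[1:]):
        pred = model.step(obs, action)      # p(o_{t+1} | o_1..t, a_1..t)
        loss += cross_entropy(pred, next_obs)
    return loss
```

Minimizing this loss over many trajectories is the only signal the model receives; everything else is expected to emerge from the optimization.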
For example, consider the following sequence of observations and actions. Can you tell me what the next element in the sequence should be? Seems impossible, right? However, what if I told you that those actions one through four actually stand for the directions north, west, south and east? Now the task becomes much easier, because you know the rules for how these actions chain together: you can predict the next observation to be the same as the first one, since you know you have essentially closed a loop. In other words, knowing the structure of space significantly simplifies the prediction problem.

The model, of course, is not told this underlying structure, since that would be no fun. Instead, it needs to extract repeating patterns in order to somehow infer the structure of the underlying world from sequences of observations and actions. For example, after seeing a large number of sequences like these, it should infer the rules for how different actions relate to each other, which is equivalent to constructing the structure of space.

It's important to point out that although I'm saying things like "the model will learn the underlying structure of the world", it is not told to do that exactly. The model has no other goal, so to speak, than predicting the next observation in the sequence. In essence, it is just a fancy mathematical expression with a large number of parameters that takes a set of numbers encoding the observations and actions, performs computations on them, and spits out another set of numbers corresponding to the next predicted observation. But because we train it to minimize this prediction error, and since these observations are not random but come from some structured world, the optimal solution to this prediction problem is to construct a structural representation of this world, which underlies the regularities in the observations. So we simply expect the knowledge about the structure to emerge as a result of optimization.

But what should the model look like? Well, because we are free to choose whichever architecture we want, it is reasonable to draw inspiration from an existing biological machine that solves this problem on a daily basis: the hippocampal formation. In the last video we saw how the hippocampus receives two streams of inputs: sensory, the "what am I seeing" information coming from the lateral entorhinal cortex, and structural, the "where am I" information from the medial entorhinal cortex. They are then combined in the hippocampus into a conjoined representation.

Similarly, our model will have an analog of the medial entorhinal area responsible for keeping track of the current location in the world; let's call it a position module. At every point in time it will receive an action and use it to compute an estimate of the current location, the best guess of where it is in space. You can think of this positional information as being encoded in the pattern of neuron activations inside the module. Note that the position module operates purely on actions and doesn't receive any information about the sensory observations, similar to how, if you close your eyes and walk around the room, you have a rough idea of where you are located even though you don't see anything. This is because your brain is able to accumulate self-movement vectors and estimate the position, a process known as path integration. So once the model is trained, we expect our position module to be able to do the same.
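As a toy illustration of path integration, the sketch below tracks a location estimate from actions alone. The wrap-around grid and the explicit coordinates are purely for brevity; in TEM the position code is a learned activity vector rather than a coordinate, so everything here is illustrative.

```python
# Toy position module: the internal state is a location on a small grid that
# wraps around. It never sees observations, only actions, yet a closed loop
# of actions brings the state back to where it started.
ACTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

class PositionModule:
    def __init__(self, size=5):
        self.size = size
        self.state = (0, 0)                 # best guess of current location

    def step(self, action):
        """Update the location estimate from the action alone (no sensory input)."""
        dx, dy = ACTIONS[action]
        x, y = self.state
        self.state = ((x + dx) % self.size, (y + dy) % self.size)
        return self.state

pm = PositionModule()
start = pm.state
for a in ["N", "E", "S", "W"]:              # a closed loop in 2D space
    pm.step(a)
assert pm.state == start                    # path integration closes the loop
```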
Another crucial component is the hippocampus itself, which binds the "where" information with the "what"; this binding effectively forms an association between the two inputs. So we need to add a memory module that receives the positional information provided by the position module together with the stream of sensory inputs and stores each encountered combination in memory. Essentially, it memorizes the associations between position and observation: I was at X when I saw Y.

But storing memories would be useless if we couldn't retrieve them. Importantly, since this is an associative memory module, it should be able to reconstruct the full memory from partial information. For example, we could provide it with just the position, and it would search all of the stored memories for the observations that accompanied this position, essentially answering the question "what did I see last time I was here?". Similarly, we could provide it with just the sensory observation, and it would retrieve the position: where was I last time I saw this?

Now we have all the necessary components to solve the prediction problem. Let's walk step by step through what the trained model does to come up with a successful prediction when walking, for example, on a family tree (remember, it should be capable of learning any type of structure, not just four-connected grids). We start on John, transition to Mary via a "sister" action, and then to Kate via a "daughter" action. Finally, we give the model the action labeled "uncle" and ask it to make a prediction.

What's happening under the hood is the following. At first, the position module has some initial belief about the current location, which is combined with John, and this combination is stored in the memory module. Next, the "sister" action is fed into the position module, which comes up with a new belief about location; that is combined with Mary, and the corresponding conjunction is stored in memory. Similarly, the "daughter" action is used to update the internal state of the position module, which is combined with Kate and sent to the memory module. Finally, the "uncle" action is fed into the position module. Importantly, the resulting positional information, the pattern of neuron activations, is the same as the one we started with. This is because, after the model is trained on many family trees underlain by the same rules, the position module is configured to always return to the same position when we make loops like these. In other words, the general laws governing the transition logic on the world graph become embedded into the rules of how the position module updates its state. After performing path integration correctly, we return to the starting position, but there is no corresponding sensory observation to memorize. Instead, because the model has reached the end of the sequence, it tries to predict the next observation, and it has the path-integrated position to guide this prediction. So it queries the memory module with the positional information and retrieves the sensory observation corresponding to this particular position, which in our case is John. Awesome, right?
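Here is a minimal sketch of such an associative memory: it stores position and observation vectors and completes one from the other with a soft nearest-neighbour (Hopfield/attention-style) readout. The vector encodings and the `beta` sharpness parameter are illustrative choices, not the paper's exact mechanism.

```python
import numpy as np

class AssociativeMemory:
    """Stores (position, observation) pairs and completes one from the other."""
    def __init__(self):
        self.positions, self.observations = [], []

    def store(self, pos_vec, obs_vec):
        self.positions.append(pos_vec)
        self.observations.append(obs_vec)

    def recall_obs(self, pos_query, beta=5.0):
        """'What did I see last time I was here?': a soft nearest-neighbour
        lookup over stored positions, i.e. a Hopfield/attention-style readout."""
        P = np.stack(self.positions)       # (n_memories, d_pos)
        O = np.stack(self.observations)    # (n_memories, d_obs)
        sims = P @ pos_query               # similarity to every stored position
        weights = np.exp(beta * sims)
        weights /= weights.sum()
        return weights @ O                 # blend dominated by the best match

mem = AssociativeMemory()
mem.store(np.array([1.0, 0.0]), np.array([0.0, 1.0]))   # "at position X I saw John"
mem.store(np.array([0.0, 1.0]), np.array([1.0, 0.0]))   # "at position Z I saw Mary"
print(mem.recall_obs(np.array([1.0, 0.0])))             # ~[0, 1]: recalls "John"
```

This attention-like retrieval is also a convenient mental hook for the Transformer connection mentioned at the end of the video.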
So far we have just been theorizing about this spherical model in a vacuum, but does it actually, well, work? And if so, what does it tell us about our own navigational systems? The most direct way to assess how well the model is performing is to look at its accuracy, which is just the percentage of predictions it made correctly, and, importantly, at how quickly that accuracy grows. Here is what I mean. Imagine for a moment that instead of this fancy machine we had just a good old lookup table, which simply memorizes all the transitions as pairs: previous observation plus action equals new observation. So it would store memories like "John plus sister equals Mary", "Mary plus daughter equals Kate", etc., and to predict the next observation it would simply scan the lookup table and search for a particular combination. In the case of our family tree example, on the first try it would not be able to predict that Kate's uncle is John, because it hadn't encountered this particular combination before. In other words, to reach 100% accuracy it would need to first encounter all possible combinations of observations and actions, which means that its performance depends on the number of edges of the graph it has visited.

In contrast, the Tolman-Eichenbaum Machine, or TEM, doesn't need to be explicitly told the outcome of every action from every node, because it has a notion of structure. For example, if I tell you that Kate is Mary's daughter, that is enough for you to infer the rest of the relationships automatically. This essentially means that to reach 100% accuracy it is enough for TEM to visit all the nodes instead of all possible edges, and hence its performance depends on the proportion of nodes visited, which grows much faster than the proportion of edges.
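A quick back-of-the-envelope simulation makes this node-versus-edge argument tangible: during a random walk, the fraction of distinct states visited grows much faster than the fraction of distinct (state, action) transitions. The toroidal 10x10 grid and the step counts below are arbitrary choices for illustration, not the environments used in the paper.

```python
import numpy as np

# Random walk on a 10x10 four-connected grid (wrapping at the edges for
# simplicity): compare coverage of nodes vs. (node, action) transitions.
rng = np.random.default_rng(0)
n = 10
moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]

pos = (0, 0)
seen_nodes, seen_edges = {pos}, set()
n_nodes = n * n
n_edges = 4 * n * n                      # one (node, action) pair per direction

for step in range(1, 2001):
    a = int(rng.integers(4))
    dx, dy = moves[a]
    seen_edges.add((pos, a))
    pos = ((pos[0] + dx) % n, (pos[1] + dy) % n)
    seen_nodes.add(pos)
    if step in (200, 500, 1000, 2000):
        print(f"step {step}: nodes {len(seen_nodes)/n_nodes:.0%}, "
              f"edges {len(seen_edges)/n_edges:.0%}")
```

A lookup table needs edge coverage to be accurate, while a model that has inferred the structure only needs node coverage, which is why its accuracy climbs so much faster.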
So our machine does seem to construct a representation of the world, hooray! But what's going on inside its brain, so to speak? Let's look inside the position module first. Remember, the belief about the current location is encoded by the pattern of collective activation of neurons, but we can also interrogate individual neurons and look at what each one of them is doing as the agent randomly walks around. Here, for the sake of visualization, I'm going to show results after the model was trained on regular four-connected grids, analogues of physical 2D space, rather than social hierarchies. Remarkably, we see that individual units in the position module develop periodic activity patterns as a function of position: they tile the space with regular hexagonal grids of different sizes, or with periodic stripes, exactly like the grid cells and band cells of the entorhinal cortex that encode position in mammalian brains. And the selectivity of individual units is preserved across environments, suggesting that they can indeed generalize.

Neurons in the memory module do something different. Since they form a conjunction between positional and sensory information, each neuron is active when both of its two upstream inputs are active. Indeed, units in the memory module resemble hippocampal place cells of various sizes, which fire in a particular patch of space. Importantly, just like hippocampal representations in real brains, their firing patterns differ across environments, since the incoming observations are different; this is known as hippocampal remapping. I'd like to emphasize that such grid-like and place-like representations were never hard-coded into the model. We started with essentially a random set of parameters, let the model optimize itself to come up with the best solution to the prediction problem, and those responses just emerged naturally.

So far we've trained the model on sequences generated from random walks in a given environment, which means that all observations were equally likely. But in real life, animals don't really move by diffusion: they are biased towards rewards and towards exploring objects, they like being near walls because it feels safe, and they avoid open spaces. So the question is: if we change the statistics of the sensory observations so that some stimuli are more common than others, would it affect the representations that emerge in our model as the optimal solution to the prediction problem? For example, let's train TEM on sequences of observations that mimic the behavior of a real mouse, which prefers to spend time near boundaries and approaches objects. In this case, the representations that emerge in the position module now include boundary cells, which are selective to the borders of the world, and object vector cells, which activate whenever the animal is at a certain distance and direction away from any object. Both of these types of responses, which by the way also generalize across contexts, are observed experimentally when recording from the entorhinal cortex, while some neurons in the memory module develop selectivity to particular objects, resembling landmark cells of the hippocampus. If we take a more complex sequence, such as one mimicking an animal performing an alternation task, the model successfully learns the rule that the reward alternates between the sides. Importantly, the representations of some neurons in the memory module resemble the splitter cells found experimentally, which are modulated by both the position and the direction of the future turn. This suggests that TEM has the capacity to learn and map latent spaces which are not directly given to it in the observations. Another example of how TEM maps latent spaces is available as a bonus clip for my Patreon supporters; more details at the end of this video.

Terrific, now we have a model that can generalize and naturally develops the same representations of space as the hippocampal formation. So what insights can we draw from it? Recall that place cells remap, which means they change their preferred firing locations in different environments. This process has long been thought to be random, since there is no immediate logic in how these representations drift around, but having a model of the hippocampal formation at hand, we can start to address this question on a whole other level. Notice that neurons in our memory module, the ones that resemble place cells, are actually conjunctions between sensory and structural information. This means that the firing of a particular place cell is partially controlled by grid cells, which provide the structural information. So, for example, if in one environment the location of a given place cell coincides with the hexagonal activity pattern of a particular grid cell, then after we change the surroundings and the place cell remaps, its place field will shift to another location which also lies on this grid. In other words, remapping is not completely random but rather is controlled by grid cells, preserving some structural information. This relationship between place and grid fields implies that there should be a correlation in the degree to which the firing locations of place and grid cells coincide across two environments. This is the case in the model, and remarkably, when the authors tested this prediction on experimental data, they found it to be true in real brains as well.
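The kind of analysis behind that comparison can be sketched as follows: build a binned rate map for a unit in each environment and correlate the maps, or the place-to-grid alignment, across environments. Bin counts, arena size, and function names here are illustrative and not the authors' actual analysis code.

```python
import numpy as np

def rate_map(positions, activity, n_bins=20, size=1.0):
    """Average activity of one unit in each spatial bin (a standard rate map).

    `positions` is a sequence of (x, y) coordinates in [0, size);
    `activity` is the unit's activity at each of those positions.
    """
    occupancy = np.zeros((n_bins, n_bins))
    summed = np.zeros((n_bins, n_bins))
    for (x, y), a in zip(positions, activity):
        i = min(int(x / size * n_bins), n_bins - 1)
        j = min(int(y / size * n_bins), n_bins - 1)
        occupancy[i, j] += 1
        summed[i, j] += a
    return summed / np.maximum(occupancy, 1)

def map_correlation(map_a, map_b):
    """Pearson correlation between two rate maps: how much a firing pattern
    (or the overlap between a place cell and a grid cell) is preserved
    across two environments."""
    return np.corrcoef(map_a.ravel(), map_b.ravel())[0, 1]
```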
Well, I know this was a ton of information to process, so let's try to tie everything together. The problem of constructing internal models of the world is a cornerstone for both biological and artificial intelligence. Factorizing the surroundings into building blocks and combining them with particular sensory contexts to generate new models on the go allows for rapid generalization. This factorization and composition can be demonstrated in a computational model which, when tasked with predicting the next observation in a sequence, learns the underlying relational structure of the world. The representations that naturally emerge in this model resemble real neurons found in the hippocampal formation, suggesting a unified framework for the interactions between the entorhinal cortex and the hippocampus.

I want to take this opportunity to give huge thanks to Dr. James Whittington, the first author of the original TEM paper, and Gus, my friend and fellow patron with expertise in machine learning, who both helped me immensely with preparing the script for this video.

As a final note, I will mention that the Tolman-Eichenbaum Machine we've seen today is actually very similar to the Transformer architecture, a type of neural network that is at the core of modern machine learning. In fact, with one little modification we can turn this similarity into a precise mathematical equivalence, and this modified version, called the Tolman-Eichenbaum Machine Transformer, learns much faster and performs better while still resembling biological representations for the most part. This potentially provides a very promising link between neuroscience and modern machine learning, which makes both fields even more exciting than ever. Now, I know this was a very simplified description, but fully exploring this equivalence would require going over the Transformer and Hopfield networks in detail, so let me know down in the comment section if you would like to see a more technical video of this kind.

In the meantime, if you're interested in machine learning and don't want to wait any longer, let me tell you about something that can take your understanding to the next level: brilliant.org. Brilliant is a revolutionary platform for engaging and interactive learning. Gone are the days of passive textbook reading: with Brilliant you'll engage with the material in a hands-on way, solving problems, answering questions and participating in stunning interactive visualizations which help you develop an intuitive understanding of the material. One course that you might find particularly interesting after watching this video is titled "Artificial Neural Networks"; it offers an accessible introduction to the world of artificial intelligence and how it is inspired by the human brain. You will learn how neural networks work, how to build your own, and even how to train them to recognize patterns. But that's just the tip of the iceberg: with over 80 courses to choose from, Brilliant has something for everyone, and with its personalized approach you can learn at your own pace in bite-sized chunks. Take your curiosity to the next level today: go to brilliant.org/ArtemKirsanov to get a 30-day free trial of everything Brilliant has to offer, and the first 200 people to use this link will get 20% off the premium subscription.

If you enjoyed this video, press the like button, share it with your friends and colleagues, subscribe to the channel if you haven't already, and consider supporting me on Patreon to suggest video topics and enjoy the bonus content. Stay tuned for more interesting topics coming up. Goodbye, and thank you for your interest in the brain.
Info
Channel: Artem Kirsanov
Views: 192,686
Keywords: Neuroscience, Brain, Machine Learning, Hippocampus, Generalization, Tolman-Eichenbaum Machine, Transformer, Transformer architecture, Spatial navigation, Memory, Computational modeling, Computational neuroscience, Deep learning, Neural Networks
Id: cufOEzoVMVA
Length: 23min 51sec (1431 seconds)
Published: Sun Apr 30 2023