NVIDIA Unveils "NIMS" Digital Humans, Robots, Earth 2.0, and AI Factories

Captions
Jensen Huang, the CEO of NVIDIA, took the stage in Taiwan over the weekend and made some major announcements, including his vision of the future of artificial intelligence, and a lot of it is completely mind-blowing: AI factories, robots, digital humans. This is the stuff of the future, so I'm going to show you the most interesting parts of his keynote. Let's watch.

Please welcome to the stage NVIDIA founder and CEO Jensen Huang. [Music] I am very happy to be back.

In this intro clip he shows off everything he's going to be talking about for the rest of the video, and he says that none of it is pre-generated: it is all simulated with AI, and it includes everything from digital humans to weather prediction to completely automated factory robots. Let's take a look.

Everything that I show you today is simulation. It's math, it's science, it's computer science, it's amazing computer architecture. None of it's animated, and it's all homemade. This is NVIDIA's soul, and we put it all into this virtual world we call Omniverse. Please enjoy. [Music]

Next he shows how NVIDIA has the biggest moat possible with their CUDA framework and all of the flavors of CUDA. If you're not familiar, CUDA is basically the library that runs these GPUs and does this hard math at scale, and that's why it is so difficult to break into this hardware market: that CUDA software is a huge moat, and everybody has adopted it at this point.

Accelerated computing does deliver extraordinary results, but it is not easy. Why is it that it saves so much money, but people haven't done it for so long? The reason is that it's incredibly hard. There is no such thing as software that you can just run through a C compiler and all of a sudden that application runs a hundred times faster. That is not even logical; if it were possible to do that, they would have just changed the CPU to do that. You in fact have to rewrite the software. That's the hard part. The
software has to be completely rewritten so that you can refactor and re-express the algorithms that were written for a CPU, so that they can be offloaded, accelerated, and run in parallel. That computer science exercise is insanely hard. Well, we've made it easy for the world over the last 20 years. There is of course the very famous cuDNN, the deep learning library that processes neural networks. We have a library for AI physics that you can use for fluid dynamics and many other applications where the neural network has to obey the laws of physics. We have a great new library called Aerial, a CUDA-accelerated 5G radio, so that we can software-define and accelerate the telecommunications networks the way we've software-defined the world's networking and the internet; the ability to accelerate that lets us turn all of telecom into essentially the same type of computing platform we have in the cloud. cuLitho is a computational lithography platform that lets us process the most computationally intensive part of chip manufacturing: making the mask. TSMC is going to production with cuLitho, saving enormous amounts of energy and enormous amounts of money, but the goal for TSMC is to accelerate their stack so that they're prepared for even further advances in algorithms and more computation for deeper and narrower transistors. Parabricks is our gene sequencing library, the highest-throughput library in the world for gene sequencing. cuOpt is an incredible library for combinatorial optimization: route planning, the traveling salesman problem, incredibly complicated problems. Scientists had largely concluded that you needed a quantum computer to do that; we created an algorithm that runs on accelerated computing, and it runs lightning fast: 23 world records, and we hold every single major world record today.
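For a sense of why route optimization like this is considered so hard, here is a minimal brute-force solver for the traveling salesman problem. This is purely illustrative (cuOpt's actual GPU heuristics are far more sophisticated, and the distance matrix below is made up); the factorial blow-up of the search space is exactly why exhaustive search stops working beyond a few dozen cities.

```python
from itertools import permutations

def tour_length(order, dist):
    """Total length of a closed tour visiting cities in the given order."""
    total = 0
    for i in range(len(order)):
        total += dist[order[i]][order[(i + 1) % len(order)]]
    return total

def brute_force_tsp(dist):
    """Exact TSP by enumerating all (n-1)! tours that start at city 0."""
    n = len(dist)
    best_order, best_cost = None, float("inf")
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        cost = tour_length(order, dist)
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost

# Symmetric 4-city distance matrix (hypothetical numbers).
dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
order, cost = brute_force_tsp(dist)
print(order, cost)  # → (0, 1, 3, 2) 80
```

At 4 cities this enumerates 6 tours; at 30 cities it would need about 8.8 × 10³⁰, which is why practical solvers rely on heuristics and massive parallelism instead.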
cuQuantum is an emulation system for a quantum computer. If you want to design a quantum computer, you need a simulator to do so; if you want to design quantum algorithms, you need a quantum emulator to do so. How would you design these quantum computers and create these quantum algorithms if the quantum computer doesn't exist? You use the fastest computer in the world that exists today, which we of course call NVIDIA CUDA, and on that we have an emulator that simulates quantum computers. It is used by several hundred thousand researchers around the world, it is integrated into all the leading frameworks for quantum computing, and it is used in scientific supercomputing centers all over the world. cuDF is an unbelievable library for data processing. Data processing consumes the vast majority of cloud spend today, and all of it should be accelerated. cuDF accelerates the major libraries used in the world: Spark (many of you probably use Spark in your companies), pandas, a new one called Polars, and of course NetworkX, a graph processing library. These are just some examples; there are so many more. We had to create each one of them so that we could enable the ecosystem to take advantage of accelerated computing. If we hadn't created cuDNN, CUDA alone wouldn't have made it possible for all of the deep learning scientists around the world, because the separation between CUDA and the deep learning algorithms used in TensorFlow and PyTorch is too great. It's almost like trying to do computer graphics without OpenGL, almost like doing data processing without SQL. These domain-specific libraries are really the treasure of our company; we have 350 of them. These libraries are what it takes, and they are what has made it possible for us to open so many markets.

Next he's going to talk about what he is dubbing Earth-2, a completely simulated digital twin of the Earth, and it's going to be used to do a lot of
things: of course, predict weather patterns and try to prevent weather disasters. Then he plays a demo, and keep in mind that everything you're looking at is completely simulated: not a movie, not CGI.

This is Earth-2, the idea that we would create a digital twin of the Earth, that we would go and simulate the Earth so that we could predict the future of our planet, to better avert disasters, to better understand the impact of climate change so that we can adapt better, so that we can change our habits. Now, this digital twin of Earth is probably one of the most ambitious projects the world has ever undertaken. We're taking large steps every single year, and I'll show you results every single year, but this year we made some great breakthroughs. Let's take a look. On Monday, the storm will veer north again and approach Taiwan. There are big uncertainties regarding its path; different paths will have different levels of impact on Taiwan. [Earth-2 forecast demo plays; narration partially inaudible]

And now he's going to talk about what he is also calling the Big Bang of AI. Everything prior to ChatGPT was all about vision and understanding, but no one really understood the capabilities in terms of generative AI, and now that is the explosion we're seeing in the world today. He talks about the new AI industrial revolution, which is really interesting, and he references delivering software the old way, which Microsoft did really well: packaged software, delivered, and that was incredible. But now, rather than delivering software, we are going to deliver AI; we're going to deliver models. I think he's getting at the fact that developers are not really going to be building software ten years from now; AI is going to be building software, and we are going to be building AI.

Here, let me show you something. This slide: the fundamental difference is this. Until ChatGPT revealed it to the world, AI was all about perception: natural language
understanding, computer vision, speech recognition. It was all about perception and detection. This was the first time the world saw a generative AI: it produced tokens, one token at a time, and those tokens were words. Some of the tokens, of course, could now be images or charts or tables, songs, words, speech, videos. Those tokens could be anything, anything that you can learn the meaning of: it could be tokens of chemicals, tokens of proteins, genes. You saw earlier, in Earth-2, we were generating tokens of the weather. We can learn physics: if you can learn physics, you could teach an AI model physics, the AI model can learn the meaning of physics, and it can generate physics. We were scaling down to one kilometer not by using filtering; it was generating. And so we can use this method to generate tokens for almost anything, almost anything of value. We can generate steering wheel control for a car, we can generate articulation for a robotic arm. Everything that we can learn, we can now generate. We have now arrived not at the AI era but at a generative AI era. But what's really important is this: this computer that started out as a supercomputer has now evolved into a data center, and it produces one thing: tokens. It's an AI factory. This AI factory is generating, creating, producing something of great value, a new commodity. In the late 1890s, Nikola Tesla invented an AC generator; we invented an AI generator. The AC generator generated electrons; NVIDIA's AI generator generates tokens. Both of these things have large market opportunities. It's completely fungible in almost every industry, and that's why it's a new industrial revolution. We now have a new factory producing a new commodity for every industry that is of extraordinary value, and the methodology for doing this is quite scalable, and the methodology for doing this is quite repeatable. Notice how quickly so many different generative AI models are being invented, literally daily. Every single industry is now piling on.
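The "one token at a time" loop he describes can be sketched in a few lines. Everything here is a toy stand-in: a tiny bigram lookup table plays the role of a model with billions of parameters, but the control flow (append a token, condition on what came before, repeat) has the same shape as real LLM generation.

```python
import random

def generate(model, prompt, n_tokens, seed=0):
    """Autoregressive loop: repeatedly sample a next token conditioned
    on what has been generated so far, then append it."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(n_tokens):
        candidates = model.get(tokens[-1], ["<eos>"])
        tokens.append(rng.choice(candidates))
    return tokens

# Toy "model": a bigram table standing in for a trained network.
bigrams = {
    "the": ["factory", "token"],
    "factory": ["produces"],
    "produces": ["tokens"],
    "token": ["stream"],
}
print(generate(bigrams, ["the"], 3, seed=1))
```

A real model replaces the table lookup with a forward pass that outputs a probability distribution over the whole vocabulary, but the factory still emits exactly one token per iteration.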
For the very first time, the IT industry, a three-trillion-dollar industry, is about to create something that can directly serve a hundred trillion dollars' worth of industry: no longer just an instrument for information storage or data processing, but a factory for generating intelligence for every industry. This is going to be a manufacturing industry, not a manufacturing industry of computers, but one that uses computers in manufacturing. This has never happened before. Quite an extraordinary thing. What started with accelerated computing led to AI, led to generative AI, and now to an industrial revolution. Now, the impact on our industry is also quite significant. Of course we can create a new commodity, a new product we call tokens, for many industries, but the impact on ours is also quite profound. For the very first time, as I was saying earlier, in 60 years, every single layer of computing has been changed: from CPUs and general-purpose computing to accelerated GPU computing. Where the computer used to need instructions, now computers process LLMs, large language models, AI models. Whereas the computing model of the past was retrieval-based (almost every time you touch your phone, some pre-recorded text or pre-recorded image or pre-recorded video is retrieved for you and recomposed based on a recommender system to present it to you based on your habits), in the future your computer will generate as much as possible and retrieve only what's necessary. The reason is that generated data requires less energy than going to fetch information. Generated data is also more contextually relevant; it will encode knowledge, it will encode its understanding of you. Instead of "get that information for me" or "get that file for me," you just ask for an answer. And instead of your computer being a tool that we use, the computer will now generate skills; it performs tasks. And instead of an industry producing software, which was a revolutionary idea in the early '90s,
remember, the idea that Microsoft created of packaging software revolutionized the PC industry. Without packaged software, what would we have used the PC for? It drove this industry. Now we have a new factory, a new computer, and what we will run on top of it is a new type of software, and we call it NIMs: NVIDIA Inference Microservices. What happens is, the NIM runs inside this factory, and this NIM is a pre-trained model; it's an AI. Well, this AI is of course quite complex in itself, but the computing stack that runs AIs is insanely complex. When you go and use ChatGPT, underneath that prompt is a whole bunch of software, a ton of software, and it's incredibly complex, because the models are large, billions to trillions of parameters. It doesn't run on just one computer; it runs on multiple computers. It has to distribute the workload across multiple GPUs: tensor parallelism, pipeline parallelism, data parallelism, expert parallelism, all kinds of parallelism, distributing the workload across multiple GPUs and processing it as fast as possible. Because if you run a factory, your throughput directly correlates with your revenues, your throughput directly correlates with quality of service, and your throughput directly correlates with the number of people who can use your service. We are now in a world where data center throughput utilization is vitally important. It was important in the past, but not vitally important, and people didn't measure it. Today, every parameter is measured: start time, uptime, utilization, throughput, idle time, you name it, because it's a factory, and when something is a factory, its operations directly correlate with the financial performance of the company. And so we realized that this is incredibly complex for most companies to do, so what we did was create this AI-in-a-box, and the container holds incredible amounts of software. Inside this container is CUDA, cuDNN, TensorRT, and the Triton inference server.
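Of the parallelism flavors he lists, tensor parallelism is the easiest to see concretely: a weight matrix is sharded across devices, each device computes a partial result from only its shard, and the pieces are stitched back together. A toy sketch, with plain Python lists standing in for GPU tensors and a list comprehension standing in for the all-gather:

```python
def matmul(x, W):
    """Naive vector-matrix product: x (length k) times W (k x n)."""
    k, n = len(W), len(W[0])
    return [sum(x[i] * W[i][j] for i in range(k)) for j in range(n)]

def split_columns(W, shards):
    """Column-shard W across `shards` hypothetical devices."""
    n = len(W[0])
    step = n // shards
    return [[row[s * step:(s + 1) * step] for row in W] for s in range(shards)]

x = [1.0, 2.0]
W = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]

# Each "device" holds only its column shard and computes a partial output...
shard_outputs = [matmul(x, Ws) for Ws in split_columns(W, 2)]
# ...then the partial outputs are concatenated (an all-gather in a real system).
y_parallel = [v for out in shard_outputs for v in out]
assert y_parallel == matmul(x, W)
print(y_parallel)  # → [11.0, 14.0, 17.0, 20.0]
```

Column sharding needs a concatenation at the end; row sharding would instead produce partial sums that must be added (an all-reduce), which is the traffic pattern he returns to later in the networking section.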
It is cloud-native, so you can auto-scale in a Kubernetes environment. It has management services and hooks so you can monitor your AIs. It has common APIs, standard APIs, so you can literally chat with this box. You download this NIM, and you can talk to it, so long as you have CUDA on your computer, which is now of course everywhere: it's in every cloud, available from every computer maker, and it is in hundreds of millions of PCs. When you download this, you have an AI, and you can chat with it like ChatGPT. All of the software is now integrated: 400 dependencies, all integrated into one. We tested each one of these pre-trained NIMs against our entire install base in the cloud, all the different versions of Pascal and Ampere and Hopper, all kinds of different versions; I even forget some. NIMs are an incredible invention; this is one of my favorites. And of course, as you know, we now have the ability to create large language models and pre-trained models of all kinds, and we have all of these various versions, whether they're language-based or vision-based or imaging-based, versions that are available for healthcare and digital biology, and versions that are digital humans, which I'll talk to you about. The way you use this: just come to ai.nvidia.com. Today we just posted up on Hugging Face the Llama 3 NIM, fully optimized. It's available there for you to try, and you can even take it with you; it's available to you for free. You can run it in any cloud, or you can download this container, put it into your own data center, and host it, making it available for your customers.

And now he's going to talk about digital humans, and he specifically says none of it is animated, meaning pre-rendered on a computer. Not at all: it is all AI generated in real time, the voice, the face, the dialogue, absolutely everything. He believes this is the future of how humans and computers are going to interact, and it's a little bit scary. Watch this.

We can interact with these AI services with text prompts and speech prompts; however, there are many applications where we would like to interact with what is otherwise a humanlike form. We call them digital humans. NVIDIA has been working on digital human technology for some time. Let me show it to you. Well, before I do that, hang on a second. Digital humans have the potential to be great interactive agents with you. They are much more engaging, and they could be much more empathetic. Of course, we have to cross this incredible chasm, this uncanny chasm of realism, so that digital humans appear much more natural. This is of course our vision, a vision of where we'd love to go, but let me show you where we are. Great to be in Taiwan. Before I head out to the night market, let's dive into some exciting frontiers of digital humans. Imagine a future where computers interact with us just like humans can. Hi, my name is Sophie, and I am a digital human brand ambassador for UneeQ. This is the incredible reality of digital humans. Digital humans will revolutionize industries from customer service to advertising and gaming. The possibilities for digital humans are endless.
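Since a NIM fronts the model with a standard chat-style HTTP API, "chatting with the box" he describes above amounts to POSTing a JSON body to an endpoint. A sketch of assembling that request; the URL and model name here are illustrative placeholders, and the exact schema of a given deployment is worth checking against NVIDIA's documentation:

```python
import json

# Hypothetical local deployment; adjust host, port, and model for your setup.
NIM_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "meta/llama3-8b-instruct",   # example model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what a NIM is."},
    ],
    "max_tokens": 128,
    "temperature": 0.2,
}
body = json.dumps(payload)
print(body[:40])
```

From here an HTTP client (e.g. `urllib.request`) would POST `body` with a `Content-Type: application/json` header to `NIM_URL`; the response follows the familiar chat-completions shape, which is what makes the container feel interchangeable with a hosted API.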
Using the scans you took of your current kitchen with your phone, they will be AI interior designers, helping generate beautiful photorealistic suggestions and sourcing the materials and furniture. We have generated several design options for you to choose from. They'll also be AI customer service agents, making interactions more engaging and personalized, or digital healthcare workers who will check on patients, providing timely, personalized care. I did forget to mention to the doctor that I am allergic to penicillin; is it still okay to take the medications? The antibiotics you've been prescribed, ciprofloxacin and metronidazole, don't contain penicillin, so it's perfectly safe for you to take them. And they'll even be AI brand ambassadors, setting the next marketing and advertising trends. Hi, I'm Imma, Japan's first virtual model. New breakthroughs in generative AI and computer graphics let digital humans see, understand, and interact with us in humanlike ways. From what I can see, it looks like you're in some kind of recording or production setup. The foundation of digital humans is AI models built on multilingual speech recognition and synthesis, and LLMs that understand and generate conversation. These AIs connect to another generative AI that dynamically animates a lifelike 3D mesh of a face, and finally to AI models that reproduce lifelike appearances, enabling real-time path-traced subsurface scattering to simulate the way light penetrates the skin, scatters, and exits at various points, giving skin its soft and translucent appearance. NVIDIA ACE is a suite of digital human technologies packaged as easy-to-deploy, fully optimized microservices, or NIMs. Developers can integrate ACE NIMs into their existing frameworks, engines, and digital human experiences: Nemotron SLM and LLM NIMs to understand our intent and orchestrate other models, Riva Speech NIMs for interactive speech and translation, Audio2Face and gesture NIMs for facial and body animation, and Omniverse RTX with DLSS for neural rendering of skin and hair. ACE NIMs run on
NVIDIA GDN, a global network of NVIDIA-accelerated infrastructure that delivers low-latency digital human processing to over 100 regions.

Next he's going to talk about the evolution of AI architecture and infrastructure and where we go from here (he's literally going to say that), and he's talking specifically about the capability growth of this hardware and how that growth is accelerating. So not only is the capability accelerating; the acceleration is accelerating. Look at this.

And so, where do we go from here? I spoke earlier about the scaling of our data centers, and every single time we scaled, we found a new phase change. When we scaled from DGX into large AI supercomputers, we enabled Transformers to train on enormously large data sets. In the beginning, the data was human-supervised; it required human labeling to train AIs. Unfortunately, there is only so much you can human-label. Transformers made it possible for unsupervised learning to happen: now Transformers just look at an enormous amount of data, or an enormous amount of video, or an enormous amount of images, and they can learn from studying it, finding the patterns and relationships themselves. The next generation of AI needs to be physically based. Most of the AIs today don't understand the laws of physics; they're not grounded in the physical world. In order for us to generate images and videos and 3D graphics and many physics phenomena, we need AIs that are physically based and understand the laws of physics. One way to do that is, of course, learning from video. Another way is synthetic data, simulation data. And another way is using computers to learn with each other. This is really no different from AlphaGo: having AlphaGo play itself, self-play, where two of the same capabilities play each other for a very long period of time and emerge even smarter. And so you're going to start to
see this type of AI emerging. Well, if the AI data is synthetically generated and uses reinforcement learning, it stands to reason that the rate of data generation will continue to advance, and every single time data generation grows, the amount of computation that we have to offer needs to grow with it. We are about to enter a phase where AIs can learn the laws of physics and understand and be grounded in physical-world data.

He's now going to introduce Blackwell, their new GPU architecture, the future of NVIDIA, and it's pretty darn cool. Let's watch.

And so we expect that models will continue to grow, and we need larger GPUs. Well, Blackwell was designed for this generation. This is Blackwell, and it has several very important technologies. One, of course, is just the size of the chip: we took two of the largest dies you can make at TSMC and connected them together with a 10-terabytes-per-second link. We then put two of them on a computer node, connected with a Grace CPU. The Grace CPU can be used for several things: in the training situation, it can be used for fast checkpoint and restart; in the case of inference and generation, it can be used for storing context memory, so that the AI has memory and understands the context of the conversation you would like to have. It has our second-generation Transformer Engine, which allows us to adapt dynamically to a lower precision based on the precision and range necessary for each layer of computation. This is our second-generation GPU with secure AI, so that you can ask your service providers to protect your AI from theft or tampering. This is our fifth-generation NVLink, which allows us to connect multiple GPUs together, and I'll show you more of that in a second. And this is also our first generation with a reliability and availability
engine. This RAS system allows us to test every single transistor, flip-flop, memory on chip, and memory off chip, so that in the field we can determine whether a particular chip is failing. The MTBF, the mean time between failures, of a supercomputer with 10,000 GPUs is measured in hours; the mean time between failures of a supercomputer with 100,000 GPUs is measured in minutes. So the ability of a supercomputer to run for a long period of time and train a model that could take several months is practically impossible if we don't invent technologies to enhance its reliability. Reliability, of course, enhances uptime, which directly affects the cost. And lastly, the decompression engine: data processing is one of the most important things we have to do, and we added a decompression engine so that we can pull data out of storage 20 times faster than what's possible today. Well, all of this represents Blackwell, and I think we have one here that's in production. During GTC I showed you Blackwell in a prototype state. The other side... this is why we practice. [Laughter] Ladies and gentlemen, this is Blackwell. Blackwell is in production: incredible amounts of technology. This is our production board. This is the most complex, highest-performance computer the world has ever made. This is the Grace CPU, and you can see each of these Blackwell dies, two of them connected together. It is the largest die, the largest chip the world makes, and we connect two of them together with a 10-terabytes-per-second link, and that makes the Blackwell computer. And the performance is incredible. Take a look at this: the computational flops, the AI flops, for each generation have increased a thousand times in eight years. Moore's law in eight years is something along the lines of, oh, I don't know, maybe 40 or 60 times, and in the last eight years Moore's law has delivered a lot less than that.
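The jump from "hours" to "minutes" of mean time between failures is fleet-size arithmetic: with independent failures, a system of N parts fails roughly N times as often as one part. A sketch, with a made-up per-GPU MTBF purely for illustration:

```python
def fleet_mtbf_hours(component_mtbf_hours, n_components):
    """With independent failures, the fleet fails n times as often
    as a single component, so its MTBF shrinks by a factor of n."""
    return component_mtbf_hours / n_components

# Assume (hypothetically) one GPU fails once every 5 years on average.
gpu_mtbf = 5 * 365 * 24  # 43,800 hours

print(fleet_mtbf_hours(gpu_mtbf, 10_000))        # → 4.38 hours between failures
print(fleet_mtbf_hours(gpu_mtbf, 100_000) * 60)  # → 26.28 minutes
```

Even with a very optimistic per-GPU failure rate, a 100,000-GPU cluster interrupts a months-long training run many times a day, which is why checkpoint/restart and the RAS engine matter so much.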
So, even compared to Moore's law at its best of times, what Blackwell can do in terms of the amount of computation is incredible. And whenever we push computation higher, the cost goes down. I'll show you: through its computational capability, the energy used to train a GPT-4-scale model, two trillion parameters on eight trillion tokens, has gone down by 350 times. Pascal would have taken 1,000 gigawatt-hours. 1,000 gigawatt-hours means it would take a one-gigawatt data center, and the world doesn't have a one-gigawatt data center, but if you had one, it would take a month. If you had a 100-megawatt data center, it would take about a year. And so nobody would, of course, create such a thing, and that's the reason why these large language models, why ChatGPT, weren't possible only eight years ago. By driving up performance while keeping and improving energy efficiency along the way, we've now taken, with Blackwell, what used to be 1,000 gigawatt-hours down to three: an incredible advance. Three gigawatt-hours; with 10,000 GPUs, for example, it would only take a few days, ten days or so. So the amount of advance in just eight years is incredible. And this is for inference, for token generation: our token generation performance has made it possible for us to drive the energy down by 45,000 times. 17,000 joules per token, that was Pascal. 17,000 joules is kind of like two light bulbs running for two days; it would take 200 watts running for two days' worth of energy to generate one token of GPT-4, and it takes about three tokens to generate one word. So the amount of energy necessary for Pascal to generate GPT-4 tokens and give you a ChatGPT experience was practically impossible. But now we use only 0.4 joules per token, and we can generate tokens at incredible rates with very little energy.
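The energy claims in this passage reduce to straightforward arithmetic using only the figures quoted on stage (joules per token, gigawatt-hours, three tokens per word); a quick check in Python:

```python
# Figures quoted in the keynote.
pascal_j_per_token = 17_000
blackwell_j_per_token = 0.4
tokens_per_word = 3

# Per-token energy reduction (close to the ~45,000x quoted on stage).
print(pascal_j_per_token / blackwell_j_per_token)   # → 42500.0

# Energy per generated word on each architecture.
print(pascal_j_per_token * tokens_per_word)         # → 51000 (joules)
print(blackwell_j_per_token * tokens_per_word)      # ≈ 1.2 joules

# Training-side numbers: delivering 1,000 GWh of energy takes...
hours_at_1_gw = 1_000 / 1      # 1,000 hours at a 1 GW facility
hours_at_100_mw = 1_000 / 0.1  # 10,000 hours at a 100 MW facility
print(hours_at_1_gw / 24, hours_at_100_mw / 24)  # ≈ 41.7 and 416.7 days
```

The numbers line up with the quoted round figures: roughly a month at one gigawatt, roughly a year at 100 megawatts, and a per-token reduction in the tens of thousands.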
OK, so Blackwell is just an enormous leap. Even so, it's not big enough, so we have to build even larger machines. The way we build them is called DGX: these Blackwell chips go into DGX systems. (That's why we practice!) This is a DGX Blackwell. This one is air-cooled and has eight of these GPUs inside; look at the size of the heat sinks on them. About 15 kilowatts, 15,000 watts, and completely air-cooled. This version supports x86, and it goes into the infrastructure we've been shipping Hoppers into. However, if you would like liquid cooling, we have a new system based on this board, which we call MGX, for modular. This modular system... you won't be able to see this. Can you see this? You can? Are you okay? So this is the MGX system, and here are the two Blackwell boards, so this one node has four Blackwell chips. These four Blackwell chips are liquid-cooled. Nine of them, well, 72 of these GPUs, are then connected together with a new NVLink: this is the fifth-generation NVLink switch, and the NVLink switch is a technology miracle. It is the most advanced switch the world has ever made; the data rate is insane. These switches connect every single one of these Blackwells to each other, so that we have one giant 72-GPU Blackwell. The benefit of this is that in one domain, one GPU domain, this now looks like one GPU. This one GPU has 72 versus the last generation's eight, so we increased it by nine times; the amount of bandwidth we've increased by 18 times, the AI flops by 45 times, and yet the amount of power is only 10 times: that one is 100 kilowatts, and this one is 10 kilowatts. You can always connect more of these together, and I'll show you how in a second, but the real miracle is this chip, the NVLink switch chip.
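The NVL72 comparison he rattles off hides a useful derived number: if AI flops go up 45x while rack power goes up only 10x, performance per watt improves 4.5x. A quick check, using the ratios exactly as quoted:

```python
gpus = 72 / 8        # GPUs per NVLink domain vs the last generation
bandwidth = 18       # bandwidth ratio, as quoted
flops = 45           # AI flops ratio, as quoted
power = 100 / 10     # 100 kW rack vs 10 kW

print(gpus)           # → 9.0
print(flops / power)  # → 4.5 (AI flops per watt improvement)
```

That perf-per-watt factor is the number that matters for the energy-per-token economics discussed above, since data centers are ultimately power-limited.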
People are starting to awaken to the importance of this NVLink chip, because the large language models are so large that they don't fit on just one GPU, or even just one node. It takes an entire rack of GPUs, like this new DGX I was just standing next to, to hold a large language model that is tens of trillions of parameters large. The NVLink switch is in itself a technology miracle: 50 billion transistors, 74 ports at 400 gigabits each, a cross-sectional bandwidth of 7.2 terabytes per second. But one of the important things is that it has mathematics inside the switch, so that we can do reductions, which is really important in deep learning, right on the chip. So this is what a DGX looks like now. A lot of people ask us, and there's some confusion about what NVIDIA does and how it's possible that NVIDIA became so big building GPUs, so there's an impression that this is what a GPU looks like. This is a GPU, one of the most advanced GPUs in the world, but it's a gamer GPU. You and I know that this is what a GPU looks like now. This is one GPU, ladies and gentlemen: the DGX GPU. The back of this GPU is the NVLink spine: 5,000 wires, two miles of them, and it's right here. This is an NVLink spine, and it connects 72 GPUs to each other. It is an electrical and mechanical miracle. The transceivers make it possible for us to drive the entire length in copper, and as a result, this NVLink switch, driving the NVLink spine in copper, makes it possible for us to save 20 kilowatts in one rack. 20 kilowatts can now be used for processing: just an incredible achievement. So this is the NVLink spine. [Applause] Wow, that went down today. And even this is not big enough for an AI factory, so we have to connect it all together with very high-speed
networking. We have two types of networking. There is InfiniBand, which has been used in supercomputing and AI factories all over the world and is growing incredibly fast for us. However, not every data center can handle InfiniBand, because they've already invested in an Ethernet ecosystem for too long, and it does take some specialty and expertise to manage InfiniBand switches and InfiniBand networks. So what we've done is bring the capabilities of InfiniBand to the Ethernet architecture, which is incredibly hard. The reason is this: Ethernet was designed for high average throughput, because every single node, every single computer, is connected to a different person on the internet, and most of the communication is between the data center and somebody on the other side of the internet. However, in deep learning and AI factories, the GPUs are not communicating with people on the internet; mostly they're communicating with each other. They're communicating with each other because they're collecting partial products, and they have to reduce them and then redistribute them: chunks of partial products, reduction, redistribution. That traffic is incredibly bursty, and it is not the average throughput that matters; it's the last arrival that matters, because if you're reducing, collecting partial products from everybody, it's not the average throughput, it's whoever gives me the answer last. Ethernet has no provision for that, so there were several things we had to create. We created an end-to-end architecture so that the NIC and the switch can communicate, and we applied four different technologies to make this possible. Number one, NVIDIA has the world's most advanced RDMA, so now we have the ability to do network-level RDMA for Ethernet incredibly well. Number two, we have congestion control: the switch does telemetry at all times, incredibly fast.
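The "collect partial products, reduce, redistribute" pattern he describes is the all-reduce at the heart of distributed training. Here is a toy ring all-reduce in pure Python: n workers each hold a partial-result vector, and after 2(n-1) chunk exchanges around the ring, every worker holds the full sum. Each step moves a burst of chunks simultaneously and cannot complete until the last chunk arrives, which is exactly why tail latency, not average throughput, governs performance. (Real systems do this in NCCL over NVLink or InfiniBand; this sketch only shows the communication pattern.)

```python
def ring_allreduce(workers):
    """Ring all-reduce: n workers, each holding a length-n vector
    (one chunk per worker). Phase 1 (reduce-scatter): after n-1 steps,
    worker i holds the fully reduced chunk (i+1) mod n. Phase 2
    (all-gather): n-1 more steps circulate the reduced chunks until
    every worker holds the complete elementwise sum."""
    n = len(workers)
    data = [list(w) for w in workers]

    # Reduce-scatter: at step t, worker i sends chunk (i - t) mod n
    # to its right neighbor, which adds it to its own copy.
    for t in range(n - 1):
        sends = [(i, (i - t) % n, data[i][(i - t) % n]) for i in range(n)]
        for src, chunk, val in sends:
            data[(src + 1) % n][chunk] += val

    # All-gather: at step t, worker i forwards the fully reduced chunk
    # (i + 1 - t) mod n to its right neighbor, which overwrites its copy.
    for t in range(n - 1):
        sends = [(i, (i + 1 - t) % n, data[i][(i + 1 - t) % n]) for i in range(n)]
        for src, chunk, val in sends:
            data[(src + 1) % n][chunk] = val
    return data

# Three workers, each with a partial-gradient vector of length three.
result = ring_allreduce([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)  # → [[12, 15, 18], [12, 15, 18], [12, 15, 18]]
```

Note that every step is a synchronized burst across all links at once: one slow or jittery link stalls the whole ring, which is the failure mode the congestion-control and noise-isolation features below are built to prevent.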
are sending too much information, we can tell them to back off so that it doesn't create hotspots. Number three, adaptive routing: Ethernet needs to transmit and receive in order, but if we see congestion, or we see ports that are not currently being used, then irrespective of the ordering we will send to the available ports, and BlueField on the other end reorders it so that it comes back in order. That adaptive routing is incredibly powerful. And then lastly, noise isolation: there's more than one model being trained, or something else happening, in the data center at all times, and their noise and their traffic could get into each other and cause jitter. When the noise of one model training causes the last arrival to end up too late, it really slows down the training. Well, overall, remember: you have built a $5 billion data center, and you're using it for training. If the network utilization was 40% lower, and as a result the training time was 20% longer, the $5 billion data center is effectively like a $6 billion data center, so the cost impact is quite high. Ethernet with Spectrum-X basically allows us to improve the performance so much that the network is basically free, and so this is really quite an achievement. We have a whole pipeline of Ethernet products behind us. This is Spectrum X800: it is 51.2 terabits per second and 256 radix. The next one, coming one year from now, is 512 radix, and that's called Spectrum X800 Ultra, and the one after that is X1600. But the important idea is this: X800 is designed for tens of thousands of GPUs, X800 Ultra is designed for hundreds of thousands of GPUs, and X1600 is designed for millions of GPUs. The days of million-GPU data centers are coming, and the reason for that is very simple: of course we want to train much larger models, but very importantly, in the future almost every interaction you have with the internet or with a
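The cost arithmetic in the $5 billion example can be checked directly: if poor network utilization stretches training time by 20%, the same work consumes 20% more of the data center's effective capital. A quick sketch (the numbers come from the talk; the function name is ours):

```python
def effective_cost(capex_billions, extra_training_time):
    """Effective cost of a data center whose jobs run `extra_training_time`
    (a fraction, e.g. 0.20 for 20% longer) slower than they should:
    the same training work ties up that much more capital."""
    return capex_billions * (1.0 + extra_training_time)

# 40% lower network utilization -> 20% longer training in the example
cost = effective_cost(5.0, 0.20)   # a $5B data center behaves like ~$6B
```

This is also why a networking upgrade that recovers utilization can "pay for itself": the recovered fraction of capex dwarfs the switch cost.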
computer will likely have a generative AI running in the cloud somewhere, and that generative AI is working with you, interacting with you, generating videos or images or text or maybe a digital human. So you're interacting with your computer almost all the time, and there's always a generative AI connected to that. Some of it is on-prem, some of it is on your device, and a lot of it could be in the cloud. These generative AIs will also do a lot of reasoning: instead of just one-shot answers, they might iterate on answers to improve the quality of the answer before they give it to you, and so the amount of generation we're going to do in the future is going to be extraordinary. Let's take a look at all of this put together. Now, tonight, this is our first nighttime keynote. I want to thank all of you for coming out tonight at 7 o'clock, and so what I'm about to show you has a new vibe, okay? There's a new vibe; this is kind of the nighttime keynote vibe, so enjoy this. [Music] Now, you can't do that on a morning keynote. Blackwell, of course, is the first generation of NVIDIA platforms that launched right as the world realized the generative AI era is here, just as the world realized the importance of AI factories, just at the beginning of this new industrial revolution. We have so much support: nearly every OEM, every computer maker, every CSP, every GPU cloud, sovereign clouds, even telecommunication companies, enterprises all over the world. The amount of success, the amount of adoption, the amount of enthusiasm for Blackwell is just really off the charts, and I want to thank everybody for that. We're not stopping there. During this time of incredible growth, we want to make sure that we continue to enhance performance, continue to drive down cost, the cost of training and the cost of inference, and continue to scale out AI capabilities for every company to embrace. The further
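The "iterate on answers instead of one-shot answers" idea can be sketched as a generate-critique-refine loop. Everything below is a hypothetical stand-in: `generate`, `critique`, and `refine` are placeholders for calls to a hosted generative model, not any real API:

```python
# Hypothetical stand-ins for calls to a hosted generative model.
def generate(prompt):
    return "draft answer for: " + prompt          # first one-shot attempt

def critique(answer):
    # placeholder quality check: here, just demand a terminating period
    return answer.endswith(".")

def refine(answer):
    return answer + "."                           # placeholder improvement

def answer_with_reasoning(prompt, max_rounds=3):
    """Iterate on a draft until the critique passes (or rounds run out),
    instead of returning the first one-shot answer."""
    answer = generate(prompt)
    for _ in range(max_rounds):
        if critique(answer):
            break
        answer = refine(answer)
    return answer
```

The point of the loop is the extra generation it implies: every refinement round is another inference pass, which is the demand growth the keynote is pointing at.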
performance we drive up, the greater the cost declines. The Hopper platform, of course, was the most successful data center processor probably in history, and this is just an incredible success story. However, Blackwell is here, and every single platform, as you'll notice, has several things: you've got the CPU, you have the GPU, you have NVLink, you have the NIC, and you have the switch. The NVLink switch connects all of the GPUs together into as large a domain as we can, and whatever we can connect, we connect with very large and very high-speed switches. Every single generation, as you'll see, is not just a GPU; it's an entire platform. We build the entire platform, we integrate the entire platform into an AI factory supercomputer, but then we disaggregate it and offer it to the world, and the reason for that is so that all of you can create interesting and innovative configurations, all kinds of different styles, and fit different data centers, different customers, and different places: some of it for edge, some of it for telco. All of the different innovation is possible only if we make the systems open and make it possible for you to innovate. And so we design it integrated, but we offer it to you disintegrated so that you can create modular systems. The Blackwell platform is here. Our company is on a one-year rhythm, and our basic philosophy is very simple: build at the entire data center scale, disaggregate it, and sell it to you in parts on a one-year rhythm, and we push everything to the limits of technology. Whatever TSMC process technology, we'll push it to the absolute limits; whatever packaging technology, push it to the absolute limits; whatever memory technology, push it to the absolute limits; SerDes technology, optics technology, everything is pushed to the limit. And then, after that, we do everything in such a way that all of our software runs on this entire installed base. Software inertia is the single most important thing in computers:
when a computer is backwards compatible and architecturally compatible with all the software that has already been created, your ability to go to market is so much faster. And so the velocity is incredible when we can take advantage of the entire installed base of software that's already been created. While Blackwell is here, next year is Blackwell Ultra. Just as we had H100 and H200, you'll probably see some pretty exciting new generations from us for Blackwell Ultra, again pushed to the limits, and the next-generation Spectrum switches I mentioned. Well, this is the very first time that this next click has been made, and I'm not sure yet whether I'm going to regret this or not. [Applause] We have code names in our company, and we try to keep them very secret; oftentimes most of the employees don't even know. But our next-generation platform is called Rubin, the Rubin platform. I'm not going to spend much time on it. I know what's going to happen: you're going to take pictures of it and go look at the fine print, and feel free to do that. So we have the Rubin platform, and one year later we'll have the Rubin Ultra platform. All of these chips that I'm showing you here are in full development, 100% of them, and the rhythm is one year, at the limits of technology, all 100% architecturally compatible. So this is basically what NVIDIA is building, with all of the riches of software on top of it. So in a lot of ways, the last 12 years, from that moment of ImageNet and us realizing that the future of computing was going to radically change, to today, is really exactly as I was holding up earlier: GeForce pre-2012, and NVIDIA today. The company has really transformed tremendously, and I want to thank all of our partners here for supporting us every step along the way. This is the NVIDIA Blackwell platform. Next he's going to talk about physical AI, real-world AI, embodied AI, something that Dr. Jim Fan from the NVIDIA team
has been releasing research papers about (I've made a bunch of videos about them), and now he's going to introduce some new tech and some really cool demo videos, so let's watch. The next wave of AI is physical AI: AI that understands the laws of physics, AI that can work among us. And so they have to understand the world model so that they understand how to interpret the world, how to perceive the world. They have to, of course, have excellent cognitive capabilities so they can understand us, understand what we ask, and perform the tasks. In the future, robotics is a much more pervasive idea. Of course, when I say robotics, there's humanoid robotics; that's usually the representation of it, but that's not at all true. Everything is going to be robotic. All of the factories will be robotic; the factories will orchestrate robots, and those robots will be building products that are robotic: robots interacting with robots, building products that are robotic. Well, in order for us to do that, we need to make some breakthroughs, so let me show you the video. The era of robotics has arrived. One day, everything that moves will be autonomous. Researchers and companies around the world are developing robots powered by physical AI. Physical AIs are models that can understand instructions and autonomously perform complex tasks in the real world. Multimodal LLMs are breakthroughs that enable robots to learn, perceive, and understand the world around them and plan how they'll act, and from human demonstrations, robots can now learn the skills required to interact with the world using gross and fine motor skills. One of the integral technologies for advancing robotics is reinforcement learning: just as LLMs need RLHF, reinforcement learning from human feedback, to learn particular skills, generative physical AI can learn skills using reinforcement learning from physics feedback in a simulated world. These simulation environments are where robots learn to make decisions by performing actions in a virtual world that obeys
the laws of physics. In these robot gyms, a robot can learn to perform complex and dynamic tasks safely and quickly, refining its skills through millions of acts of trial and error. We built NVIDIA Omniverse as the operating system where physical AIs can be created. Omniverse is a development platform for virtual-world simulation, combining real-time physically based rendering, physics simulation, and generative AI technologies. In Omniverse, robots can learn how to be robots: they learn how to autonomously manipulate objects with precision, such as grasping and handling objects, or to navigate environments autonomously, finding optimal paths while avoiding obstacles and hazards. Learning in Omniverse minimizes the sim-to-real gap and maximizes the transfer of learned behavior. Building robots with generative physical AI requires three computers: NVIDIA AI supercomputers to train the models; NVIDIA Jetson Orin and the next-generation Jetson Thor robotic supercomputer to run the models; and NVIDIA Omniverse, where robots can learn and refine their skills in simulated worlds. We build the platforms, acceleration libraries, and AI models needed by developers and companies, and allow them to use any or all of the stack that suits them best. The next wave of AI is here: robotics, powered by physical AI, will revolutionize industries. And now he's going to show off the Omniverse digital twin of a factory and all the benefits of actually simulating a factory, so let's watch. Now let's talk about factories. Factories have a completely different ecosystem, and Foxconn is building some of the world's most advanced factories. Their ecosystem again includes edge computers, robotics software for designing the factories and the workflows and programming the robots, and of course PLC computers that orchestrate the digital factories and the AI factories. We have SDKs that are connected into each one of these ecosystems as well. This is happening all over Taiwan. Foxconn is building digital twins of their factories; Delta is
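The "robot gym" loop described above, where a policy improves through millions of simulated trials scored by physics, can be reduced to a toy example. This epsilon-greedy bandit is a stand-in for real pipelines like Isaac Lab, not NVIDIA's actual stack; `physics_reward` and the 0.7 "ideal grasp force" are invented for the demo:

```python
import random

def physics_reward(force, ideal=0.7):
    # Hypothetical simulator: reward is higher the closer the applied
    # grasp force is to the physically ideal force.
    return -abs(force - ideal)

def train(actions, episodes=2000, eps=0.1, seed=0):
    """Learn which action the simulated physics rewards most,
    via cheap trial and error (epsilon-greedy value estimation)."""
    rng = random.Random(seed)
    values = {a: 0.0 for a in actions}   # running mean reward per action
    counts = {a: 0 for a in actions}
    for _ in range(episodes):
        # explore occasionally, otherwise exploit the best-known action
        if rng.random() < eps:
            a = rng.choice(actions)
        else:
            a = max(values, key=values.get)
        r = physics_reward(a)            # one simulated trial
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]   # incremental mean update
    return max(values, key=values.get)

best_force = train([0.1, 0.4, 0.7, 1.0])   # converges on the ideal action
```

Because the trials are simulated, they cost compute rather than hardware wear, which is the whole argument for training in a virtual world before deploying to a physical robot.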
building digital twins of their factories (by the way, half is real, half is digital, half is Omniverse); Pegatron is building digital twins of their robotic factories; Wistron is building digital twins of their robotic factories. And this is really cool: this is a video of Foxconn's new factory, so let's take a look. Demand for NVIDIA accelerated computing is skyrocketing as the world modernizes traditional data centers into generative AI factories. Foxconn, the world's largest electronics manufacturer, is gearing up to meet this demand by building robotic factories with NVIDIA Omniverse and AI. Factory planners use Omniverse to integrate facility and equipment data from leading industry applications like Siemens Teamcenter X and Autodesk Revit. In the digital twin, they optimize floor layout and line configurations and locate optimal camera placements to monitor future operations with NVIDIA Metropolis powered vision AI. Virtual integration saves planners the enormous cost of physical change orders during construction. The Foxconn teams use the digital twin as the source of truth to communicate and validate accurate equipment layout. The Omniverse digital twin is also the robot gym where Foxconn developers train and test NVIDIA Isaac AI applications for robotic perception and manipulation, and Metropolis AI applications for sensor fusion. In Omniverse, Foxconn simulates two robot AIs before deploying runtimes to Jetson computers on the assembly line. They simulate Isaac Manipulator libraries and AI models for automated optical inspection: object identification, defect detection, and trajectory planning. To transfer HGX systems to the test pods, they simulate Isaac Perceptor powered Foxbot AMRs as they perceive and move about their environment with 3D mapping and reconstruction. With Omniverse, Foxconn builds robotic factories that orchestrate robots running on NVIDIA Isaac to build NVIDIA AI supercomputers, which in turn train Foxconn robots. A robotic factory is designed with three computers:
you train the AI on NVIDIA AI supercomputers, you have the robot running on the PLC systems for orchestrating the factories, and then of course you simulate everything inside Omniverse. Well, the robotic arm and the robotic AMRs work the same way: three computer systems. The difference is that the two Omniverses will come together, so they'll share one virtual space. When they share one virtual space, that robotic arm will be inside the robotic factory, and again, three computers, and we provide the computer, the acceleration layers, and pre-trained AI models. We've connected NVIDIA Manipulator and NVIDIA Omniverse with Siemens, the world's leading industrial automation software and systems company. This is really a fantastic partnership, and they're working on factories all over the world. SIMATIC Pick AI now integrates Isaac Manipulator, and SIMATIC Pick AI runs and operates ABB, KUKA, Yaskawa, FANUC, Universal Robots, and Techman. So Siemens is a fantastic integration, and we have all kinds of other integrations; let's take a look. ArcBest is integrating Isaac Perceptor into Vaux smart autonomy robots for enhanced object recognition and human motion tracking in material handling. BYD Electronics is integrating Isaac Manipulator and Perceptor into their AI robots to enhance manufacturing efficiencies for global customers. Idealworks is building Isaac Perceptor into their iw software for AI robots in factory logistics. Intrinsic, an Alphabet company, is adopting Isaac Manipulator into their Flowstate platform to advance robot grasping. Gideon is integrating Isaac Perceptor into Trey AI-powered forklifts to advance AI-enabled logistics. Argo Robotics is adopting Isaac Perceptor into their perception engine for advanced vision-based AMRs. Solomon is using Isaac Manipulator AI models in their AccuPick 3D software for industrial manipulation. Techman Robot is adopting Isaac Sim and Manipulator into TM Flow, accelerating automated optical inspection. Teradyne Robotics is integrating
Isaac Manipulator into PolyScope X for cobots, and Isaac Perceptor into MiR AMRs. Vention is integrating Isaac Manipulator into MachineLogic for AI manipulation robots. Robotics is here. Physical AI is here. This is not science fiction, and it's being used all over Taiwan; it's just really, really exciting. And that's the factory, the robots inside, and of course all the products are going to be robotic. There are two very high-volume robotics products. One, of course, is the self-driving car, or cars that have a great deal of autonomous capability. NVIDIA again builds the entire stack. Next year we're going to go to production with the Mercedes fleet, and after that, in 2026, the JLR fleet. We offer the full stack to the world; however, you're welcome to take whichever parts, whichever layer, of our stack, just as the entire Drive stack is open. The next high-volume robotics product that's going to be manufactured by robotic factories, with robots inside, will likely be humanoid robots. This has made great progress in recent years, in both the cognitive capability, because of the foundation models, and also the world-understanding capability that we're in the process of developing. I'm really excited about this area, because obviously the easiest robots to adapt into the world are humanoid robots, because we built the world for us. We also have the most data to train these robots compared with other types of robots, because we have the same physique, and so the amount of training data we can provide through demonstration capabilities and video capabilities is going to be really great. So we're going to see a lot of progress in this area. Well, I think we have some robots that we'd like to welcome. Here we go, about my size. [Applause] And we have some friends to join us. So the future of robotics is here, the next wave of AI. And of course, you know, Taiwan builds computers with keyboards, you build computers for your pocket, you build
computers for data centers in the cloud. In the future, you're going to build computers that walk and computers that roll around, and so these are all just computers. As it turns out, the technology is very similar to the technology of building all of the other computers that you already build today, so this is going to be a really extraordinary journey for us. So, really cool stuff. I'm going to continue covering everything that NVIDIA does, because it is such an amazing company. If you liked this video, please consider giving it a like and subscribing, and I'll see you in the next one. Bye.
Info
Channel: Matthew Berman
Views: 667,085
Keywords: ai, jensen, nvidia, chips, ai chips, llm, nvidia ai, openai, chatgpt, blackwell, 4090
Id: IurALhiB6Ko
Length: 73min 59sec (4439 seconds)
Published: Wed Jun 05 2024