NVIDIA CEO Jensen Huang Reveals Breakthrough AI Chip at COMPUTEX 2024 (Supercut)

Video Statistics and Information

Captions
In the last 60 years we saw several tectonic shifts in computing where everything changed, and we're about to see that happen again. The further we drive performance up, the greater the cost decline. The Hopper platform was of course the most successful data-center processor, probably in history. However, Blackwell is here, and every single platform, as you'll notice, has several things: you've got the CPU, you have the GPU, you have NVLink, you have the NIC, and you have the switch. Every single generation, as you'll see, is not just a GPU but an entire platform. We build the entire platform, we integrate the entire platform into an AI-factory supercomputer, but then we disaggregate it and offer it to the world. Our basic philosophy is very simple: one, build the entire data-center scale, disaggregate it, and sell it to you in parts, on a one-year rhythm, and we push everything to the technology limits. Whatever TSMC process technology, we'll push it to the absolute limits; whatever packaging technology, push it to the absolute limits; whatever memory technology, push it to the absolute limits; SerDes technology, optics technology, everything is pushed to the limit. While Blackwell is here, next year is Blackwell Ultra. Just as we had H100 and H200, you'll probably see some pretty exciting new generations from us for Blackwell Ultra. Well, this is the very first time, and I'm not sure yet whether I'm going to regret this or not. We have code names in our company, and we try to keep them very secret; most of the employees don't even know. But our next-generation platform is called Rubin. So we have the Rubin platform, and one year later we have the Rubin Ultra platform. All of these chips that I'm showing you here are in full development, 100% of them, and the rhythm is one year, at the limits of technology, all 100% architecturally compatible. So this is basically what NVIDIA is building, and all of the riches of software on top of it. So in a lot of ways, the last 12 years the company has really transformed
tremendously, and I want to thank all of our partners here for supporting us every step along the way. This is the NVIDIA Blackwell platform. Ladies and gentlemen, this is Blackwell. This is our production board. This is the most complex, highest-performance computer the world has ever made. This is the Grace CPU, and these, as you can see, are the Blackwell dies, two of them connected together. You see that? It is the largest die, the largest chip the world makes, and we connect two of them together with a 10-terabytes-per-second link, and the performance is incredible.

Take a look at this: the AI FLOPS for each generation have increased by 1,000 times in eight years. Just compare even Moore's Law at its best of times to what Blackwell can do. The amount of computation is incredible, and whenever we bring the computation up, the thing that happens is the cost goes down. The amount of energy used has gone down by 350 times. Pascal would have taken 1,000 gigawatt-hours. 1,000 gigawatt-hours means it would take a gigawatt data center; the world doesn't have a gigawatt data center, but if you had one, it would take about a month. If you had a 100-megawatt data center, it would take about a year. And that's the reason these large language models, ChatGPT, weren't possible only eight years ago. With Blackwell, what used to be 1,000 gigawatt-hours is down to three, an incredible advance. Our token-generation performance has made it possible for us to drive the energy down by 45,000 times.

Okay, so Blackwell is just an enormous leap. Even so, it's not big enough, and so we have to build even larger machines. The way that we build it is called DGX. This is a DGX Blackwell. This is air-cooled and has eight of these GPUs inside; look at the size of the heat sinks on these GPUs. About 15 kilowatts, 15,000 watts, and completely air-cooled. This version supports x86, and it goes into the infrastructure that we've been shipping Hoppers into.
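The gigawatt-hours figures quoted here can be checked with a few lines of arithmetic. A minimal sketch (the function name and the facility sizes are mine, taken from the round numbers in the talk):

```python
def train_time_days(energy_gwh, facility_power_mw):
    """Days a facility of the given power draw needs to deliver
    the given training energy (1 GWh = 1,000 MWh)."""
    hours = energy_gwh * 1000 / facility_power_mw
    return hours / 24

# 1,000 GWh at a 1-gigawatt (1,000 MW) facility: ~42 days, i.e. "a month"
assert round(train_time_days(1000, 1000)) == 42
# The same job at a 100 MW facility: ~417 days, i.e. "about a year"
assert round(train_time_days(1000, 100)) == 417
# The quoted Blackwell figure of 3 GWh at 100 MW: a bit over a day
print(train_time_days(3, 100))
```

The two asserts match the keynote's "a month" and "about a year" framing, which suggests the quoted 1,000 GWh is meant as a round order-of-magnitude figure.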
However, if you would like liquid cooling, we have a new system. It's based on this board, and we call it MGX, for modular. This one node has four Blackwell chips, and these switches connect every single one of these Blackwells to each other, so that we have one giant 72-GPU Blackwell. This now looks like one GPU. This one GPU has 72, versus the last generation's eight, so we increased it by nine times. The amount of bandwidth we've increased by 18 times, the AI FLOPS we've increased by 45 times, and yet the amount of power is only 10 times: this is 100 kilowatts, and that is 10 kilowatts. And that's for one; you could always connect more of these together, and I'll show you how to do that in a second.

There's this confusion about what NVIDIA does. How is it possible that NVIDIA became so big building GPUs? And so there's an impression that this is what a GPU looks like. Now, this is a GPU, one of the most advanced GPUs in the world, but this is a gaming GPU. You and I know that this is what a GPU looks like. This is one GPU, ladies and gentlemen: the DGX GPU. The back of this GPU is the NVLink spine, and it's right here. This is an NVLink spine, and it connects 72 GPUs to each other. This is an electrical and mechanical miracle. The transceivers make it possible for us to drive the entire length in copper, and as a result this NVLink switch, driving the NVLink spine in copper, makes it possible for us to save 20 kilowatts in one rack. 20 kilowatts can now be used for processing, just an incredible achievement.

So this is the NVLink spine, and even this is not big enough, even this is not big enough for an AI factory, so we have to connect it all together with very high-speed networking. Well, we have two types of networking. We have InfiniBand, which has been used in supercomputing and AI factories all over the world, and it is growing incredibly fast for us. However, not every data center can handle InfiniBand, because they've already invested in their Ethernet ecosystem for too long. And so what
we've done is we've brought the capabilities of InfiniBand to the Ethernet architecture, which is incredibly hard. Ethernet was designed for high average throughput, because every single node, every single computer, is connected to a different person on the internet, and most of the communication is the data center talking with somebody on the other side of the internet. However, in deep learning and AI factories, the GPUs are not communicating with people on the internet; they're communicating with each other, because they're all collecting partial products, and they have to reduce them and then redistribute them: chunks of partial products, reduction, redistribution. That traffic is incredibly bursty, and it is not the average throughput that matters; it's the last arrival that matters, whoever gives me the answer last. Ethernet has no provision for that, and so there are several things we had to create. We created an end-to-end architecture so that the NIC and the switch can communicate, and we applied four different technologies to make this possible.

Number one, NVIDIA has the world's most advanced RDMA, and so now we have the ability to do network-level RDMA for Ethernet, and that is incredibly great. Number two, we have congestion control: the switch does telemetry at all times, incredibly fast, and whenever the NICs are sending too much information, we can tell them to back off so that it doesn't create hot spots. Number three, adaptive routing: Ethernet needs to transmit and receive in order, but when we see congestion, or we see ports that are not currently being used, we send packets to the available ports irrespective of the ordering, and BlueField on the other end reorders them so that they come back in order. That adaptive routing is incredibly powerful. And then lastly, noise isolation: there's more than one model being trained, and if something causes the last arrival to end up too late, it really slows down the training. Overall, remember, you have built a $5 billion data center.
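The reduce/redistribute traffic pattern and the "last arrival" economics described here can be sketched in a few lines. This is a minimal single-process illustration, not NVIDIA's implementation; the function names and the latency figures are invented:

```python
def allreduce_sum(partials):
    """Every GPU contributes a partial gradient; every GPU must end up
    with the elementwise sum. On real hardware this runs as a ring or
    tree of reduce/redistribute exchanges over NVLink or the network,
    which is what makes the traffic bursty."""
    total = [sum(vals) for vals in zip(*partials)]
    return [list(total) for _ in partials]

def step_time_ms(flow_arrivals_ms):
    """A synchronous training step finishes only when the slowest flow
    lands: the last arrival matters, not the average throughput."""
    return max(flow_arrivals_ms)

grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # three "GPUs", two params each
assert allreduce_sum(grads) == [[9.0, 12.0]] * 3

# Average latency is low, but one congested flow bounds the whole step:
assert step_time_ms([1.2, 1.1, 9.7]) == 9.7

# Hence the cost argument that follows: a 20% slowdown makes a
# $5 billion data center deliver the work of a $6 billion one.
assert abs(5e9 * 1.20 - 6e9) < 1
```

The `step_time_ms` line is the whole point of the four Spectrum-X technologies: congestion control, adaptive routing, and noise isolation all exist to pull in that worst-case arrival rather than the average.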
You're using it for training. If the training time were 20% longer, the $5 billion data center is effectively like a $6 billion data center, so the cost impact is quite high. Ethernet with Spectrum-X basically allows us to improve the performance so much that the network is basically free, and so this is really quite an achievement. We have a whole pipeline of Ethernet products behind us. This is Spectrum-X800; it is 51.2 terabits per second. The next one, coming one year from now, is 512-radix, and that's called Spectrum-X800 Ultra, and the one after that is X1600. But the important idea is this: X800 is designed for tens of thousands of GPUs, X800 Ultra is designed for hundreds of thousands of GPUs, and X1600 is designed for millions of GPUs. The days of millions-of-GPU data centers are coming, and the reason for that is very simple. Of course we want to train much larger models, but very importantly, in the future almost every interaction you have with the internet or with a computer will likely have a generative AI running in the cloud somewhere, and that generative AI is working with you, interacting with you, generating videos or images or text or maybe a digital human. So you're interacting with your computer almost all the time, and there's always a generative AI connected to that. Some of it is on-prem, some of it is on your device, and a lot of it could be in the cloud. These generative AIs will also have a lot of reasoning capability: instead of just one-shot answers, they might iterate on answers to improve the quality before they give them to you. And so the amount of generation we're going to do in the future is going to be extraordinary.

Let me talk about what's next. The next wave of AI is physical AI: AI that understands the laws of physics. They have to, of course, have excellent cognitive capabilities so they can understand us, understand what we ask, and perform the tasks. Of course, when I say robotics, humanoid robotics is
usually the representation that comes to mind, but that's not at all true. Everything is going to be robotic. All of the factories will be robotic; the factories will orchestrate robots, and those robots will be building products that are robotic: robots interacting with robots, building products that are robotic. Well, in order for us to do that, we need to make some breakthroughs. First, we're going to create platforms for each type of robotic system: one for robotic factories and warehouses, one for robots that manipulate things, one for robots that move, and one for robots that are humanoid. Each one of these robotics platforms is, like almost everything else we do, a computer, acceleration libraries, and pre-trained models. Computers, acceleration libraries, pre-trained models, and we test everything, we train everything, and we integrate everything inside Omniverse, where robots learn how to be robots.

Now of course the ecosystem of robotic warehouses is really, really complex; it takes a lot of companies, a lot of tools, a lot of technology to build a modern warehouse, and warehouses are increasingly robotic. One of these days they will be fully robotic. Now let's talk about factories. Factories have a completely different ecosystem. A robotic factory is designed with three computers: you train the AI on NVIDIA AI, you have the robot running on the PLC systems for orchestrating the factories, and then of course you simulate everything inside Omniverse. Well, the robotic arm and the robotic AMRs are also the same way, three computer systems, and we provide the computer, the acceleration layers, and the pre-trained AI models. We've connected NVIDIA manipulator and NVIDIA Omniverse with Siemens, the world's leading industrial automation software and systems company. This is really a fantastic partnership, and they're working on factories all over the world. So that's the factory, the robots inside, and of course all the products are going to be robotic. There are two very-high-volume robotics
products. One, of course, is the self-driving car, or cars that have a great deal of autonomous capability. NVIDIA again builds the entire stack. Next year we're going to go to production with the Mercedes fleet, and after that, in 2026, the JLR fleet. The next high-volume robotics product that's going to be manufactured by robotic factories, with robots inside, will likely be humanoid robots. This has seen great progress in recent years, both in the cognitive capability, because of foundation models, and in the world-understanding capability that we're in the process of developing. I'm really excited about this area, because obviously the easiest robots to adapt into the world are humanoid robots, because we built the world for us. We also have the most data to train these robots, compared with other types of robots, because we have the same physique, and so the amount of training data we can provide through demonstration capabilities and video capabilities is going to be really great. And so we're going to see a lot of progress in this area.

Well, I think we have some robots that we'd like to welcome. Here we go, about my size, and we have some friends to join us. So the future of robotics is here, the next wave of AI. And of course, you know, Taiwan builds computers with keyboards, you build computers for your pocket, you build computers for data centers in the cloud. In the future, you're going to build computers that walk, and as it turns out, the technology is very similar to the technology of building all the other computers that you already build today. So this is going to be a really extraordinary journey for us. Thank you all for coming. Have a great COMPUTEX.
Info
Channel: Ticker Symbol: YOU
Views: 106,733
Keywords: nvidia computex 2024, nvidia keynote, jensen huang keynote, nvidia blackwell gpu, nvidia rubin gpu, nvidia new chip, nvidia new gpu, nvda, nvda stock, nvidia stock, jensen huang, openai, chatgpt, gpt-4o, sora, msft, msft stock, ai stocks, semiconductor stocks, artificial intelligence stocks, nvidia stock news, nvidia news, omniverse, best ai stocks, top ai stocks, best stocks to buy now, top stocks to buy now, buy nvidia stock, buy nvda stock
Id: JyYJey4qonQ
Length: 16min 35sec (995 seconds)
Published: Sun Jun 02 2024