AMD vs. Nvidia: Battle of the AI Chips

Video Statistics and Information

Captions
I am super excited to show you, for the very first time, MI300X. Now, for those of you who are paying attention, you might see it looks very, very similar to MI300A, because basically we took three chiplets off and put two chiplets on, and we stacked more HBM3 memory.

This is Grace Hopper. This processor is really quite amazing. There are several characteristics about it. This is the world's first accelerated computing processor that also has a giant memory.

But what you see with MI300X is that we truly designed this product for generative AI. It combines CDNA 3 with an industry-leading 192 gigabytes of HBM3 that delivers 5.2 terabytes per second of memory bandwidth, and it has 153 billion transistors across twelve 5-nanometer and 6-nanometer chiplets.

It has almost 600 gigabytes of memory that's coherent between the CPU and the GPU, so the GPU can reference the memory, the CPU can reference the memory, and any unnecessary copying back and forth can be avoided. The amazing amount of high-speed memory lets the GPU work on very, very large data sets. This is a computer; this is not a chip.

When you compare MI300X to the competition, MI300X offers 2.4 times more memory and 1.6 times more memory bandwidth. And with all of that additional memory capacity, we actually have an advantage for large language models, because we can run larger models directly in memory. For the largest models, that reduces the number of GPUs you need, significantly speeding up the performance, especially for inference, as well as reducing total cost of ownership.

Now let me show you some of its performance. I'm comparing here on three different applications. A vector database is a database that has tokenized, that has vectorized, the data that you're trying to store; this is incredibly important for knowledge augmentation of large language models, to avoid hallucination. The second is deep learning recommender systems; this is how we get the news and music and all the text that you see on your devices. The recommender system is the engine of the digital economy; this is probably the single most valuable piece of software that any company in the world runs. The last one is large language model inference.

So let's watch, for the first time ever, MI300X running Falcon on a single GPU accelerator. All right, let's start. We have to give the model a prompt. We are here in San Francisco, so let's say: write a poem about San Francisco. Here we go, the poem's coming; you can see it's responding in real time. I'm not a great poet, I don't know about you guys: "the city of dreams that always keeps you yearning for more." I would say that poem's pretty good, huh? But what I want to emphasize that's special about this demo is that it's the first time a large language model of this size can be run entirely in memory on a single GPU.

However, 600 gigabytes is still not enough; we need a lot more. So let me show you what we're going to do. The first thing, of course, is we have the Grace Hopper Superchip; put that into a computer. The second thing we're going to do is connect eight of these together using NVLink. This is an NVLink switch. Eight of these connect through three switch trays into a Grace Hopper pod. In each of these eight-Grace-Hopper pods, each Grace Hopper is connected to every other Grace Hopper at 900 gigabytes per second. Eight of them connected together as a pod, and then we connect 32 of those together with another layer of switches, in order to build this: 256 Grace Hopper Superchips connected into one exaflop. You know that countries and nations have been working on exaflop computing and just recently achieved it; 256 Grace Hoppers, for deep learning, is one exaflop of Transformer Engine performance. And it gives us 144 terabytes of memory that every GPU can see. This is not 144 terabytes distributed; this is 144 terabytes connected.

Which is why I'm also excited to announce the AMD Instinct platform. What we're doing with this platform is, again, all about open infrastructure: we're putting eight MI300Xs in the industry-standard OCP infrastructure. For customers, what that means is they can use all of this AI compute capability and memory of MI300X in an industry-standard platform that drops right into their existing infrastructure with actually very minimal changes. By leveraging the OCP platform specification, we're accelerating customers' time to market and reducing overall development costs, while making it really easy to deploy MI300X into their existing AI rack and server infrastructure.

Why don't we take a look at what it really looks like? Play, please. This is 150 miles of cables, fiber-optic cables. Two thousand fans, seventy thousand cubic feet per minute; it probably recycles the air in this entire room in a couple of minutes. Forty thousand pounds: four elephants. One GPU. If I can get up on here, this is actual size. I wonder if this can play Crysis. Only gamers know that joke.

The key takeaways: first of all, we're incredibly excited about AI; we see AI everywhere in our portfolio. With MI300X, what we offer is leadership TCO for AI workloads. We're really, really focused on making it easy for our customers and partners to deploy, so the Instinct platform really lowers the barriers to adoption. And frankly, the enterprise-ready software stack: we know that this is so important. We've made tremendous progress through our work with the frameworks and models with our partners, and there's going to be a lot more going on over the next many years in this area. Now let me talk about availability. I'm happy to say that MI300A began sampling to our lead HPC and AI customers earlier this quarter, and we're on track to begin sampling MI300X and the eight-GPU Instinct platform beginning in the third quarter.

This is our brand-new Grace Hopper AI supercomputer. It is one giant GPU, utterly incredible. We're building it now; every component is in production. And we're so excited that Google Cloud, Meta, and Microsoft will be the first companies in the world to have access, and they will be doing exploratory research on the pioneering front, the boundaries of artificial intelligence, with us. We will of course build these systems as products, so if you would like to have an AI supercomputer, we would of course come and install it in your company. We also share the blueprints of this supercomputer with all of our cloud partners so that they can integrate it into their networks and into their infrastructure, and we will also build it inside our company for us to do research and development ourselves. So this is the DGX GH200. It is one giant GPU. Okay, thank you.
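The memory-capacity argument in the captions above ("we can run larger models directly in memory... reduces the number of GPUs you need") can be made concrete with back-of-the-envelope arithmetic. A minimal sketch, assuming FP16/BF16 weights at 2 bytes per parameter and ignoring KV cache and framework overhead, so these are lower bounds, not deployment guidance:

```python
import math

def gpus_needed(params_billion: float, mem_per_gpu_gb: float,
                bytes_per_param: int = 2) -> int:
    """Minimum GPUs needed just to hold the model weights.

    Assumes FP16/BF16 weights (2 bytes/parameter); real deployments
    need extra headroom for activations and the KV cache.
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return math.ceil(weights_gb / mem_per_gpu_gb)

# A 40B-parameter model (e.g. Falcon-40B) is ~80 GB of FP16 weights,
# which fits within a single 192 GB accelerator.
print(gpus_needed(40, 192))   # 1
# A hypothetical 175B-parameter model is ~350 GB of weights:
print(gpus_needed(175, 192))  # 2
print(gpus_needed(175, 80))   # 5
```

This is the sense in which larger per-GPU memory "reduces the number of GPUs you need" for inference: fewer devices must be sharded across before the weights fit.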
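The "vector database" mentioned in the demo stores embeddings of documents and, at query time, retrieves the most similar ones to ground a language model's answer (the knowledge augmentation that helps avoid hallucination). A minimal sketch of that retrieval step using cosine similarity; the toy 4-dimensional vectors stand in for a real embedding model:

```python
import numpy as np

def cosine_top_k(query: np.ndarray, vectors: np.ndarray, k: int = 2) -> np.ndarray:
    """Return indices of the k stored vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                       # cosine similarity per document
    return np.argsort(scores)[::-1][:k]  # highest-scoring first

# Toy "embeddings" standing in for a real encoder's output.
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # doc 0
    [0.0, 1.0, 0.1, 0.0],   # doc 1
    [0.8, 0.2, 0.1, 0.0],   # doc 2
])
query = np.array([1.0, 0.0, 0.0, 0.0])
top = cosine_top_k(query, docs)
print(top)  # the retrieved documents would be prepended to the LLM prompt
```

Production vector databases replace the brute-force scan with approximate nearest-neighbor indexes, but the retrieval contract is the same.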
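The cluster figures quoted for the 256-superchip system follow from simple multiplication. A sketch using rounded per-chip assumptions (~576 GB of coherent CPU+GPU memory and ~4 petaflops of FP8 Transformer Engine throughput per superchip; these are illustrative approximations, not official specifications):

```python
# Back-of-the-envelope for the 256-superchip system described above.
PODS = 32
CHIPS_PER_POD = 8
MEM_PER_CHIP_GB = 576       # the "almost 600 gigabytes" per superchip
FP8_PFLOPS_PER_CHIP = 4     # assumed rough Transformer Engine throughput

chips = PODS * CHIPS_PER_POD
total_mem_tb = chips * MEM_PER_CHIP_GB / 1024    # binary terabytes
total_exaflops = chips * FP8_PFLOPS_PER_CHIP / 1000

print(chips)           # 256 superchips
print(total_mem_tb)    # 144.0 TB visible to every GPU
print(total_exaflops)  # ~1 exaflop for deep learning
```

Note the memory total only "counts" as one pool because NVLink makes it coherent; 256 isolated nodes would have the same capacity but not the single shared address space the presenter emphasizes.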
Info
Channel: CNET Highlights
Views: 21,121
Keywords: event, livestream, live, 2022, CNET Highlights
Id: wMwH-gf0bg4
Length: 8min 41sec (521 seconds)
Published: Thu Jun 22 2023