The AI War Rages On! Every AMD AI Announcement (Supercut)

Captions
So let's go ahead and get started. There's incredible interest in AI across the industry, and when we look at it, AI is really the defining technology that's shaping the next generation of computing. Frankly, it's AMD's largest and most strategic long-term growth opportunity.

In AI, we're focused on three key areas. First, delivering a broad portfolio of high-performance GPUs, CPUs, and adaptive computing solutions for AI training and inference, spanning data center, edge, and intelligent endpoints. Second, developing an open and proven software platform to enable our AI hardware to be deployed broadly and easily. And third, working with the industry and expanding the deep, collaborative partnerships we have established to enable the ecosystem to accelerate AI solutions at scale. In this space, there's no question that AI will be the key driver of silicon consumption for the foreseeable future, but the largest opportunity is in the data center, and over the last six months or so, the broad adoption of generative AI with large language models has taken this growth to a different level.

People keep asking me, "What is the opportunity, Lisa?" What I'd like to say is, look, we are still very, very early in the life cycle of AI; there's so much opportunity for us. But when we try to size it, we think about the data center AI accelerator TAM growing from something like $30 billion this year, at over a 50 percent compound annual growth rate, to over $150 billion in 2027. It may be higher, it may be lower, but what I can say for sure is that it's going to be a lot, because there's just tremendous, tremendous demand.
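[Those figures are internally consistent. A quick arithmetic check, using only the dollar amounts and growth rate quoted above:]

```python
# Sanity check on the TAM math quoted above: $30B compounding at ~50%/year.
base_tam_usd_b = 30          # 2023 data center AI accelerator TAM, in $B
cagr = 0.50                  # "over 50 percent compound annual growth rate"
years = 2027 - 2023          # four compounding periods

print(base_tam_usd_b * (1 + cagr) ** years)  # ~151.9 -> "over $150 billion"
```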
Now, AMD has been investing in the data center accelerator market for many, many years, and today we power many of the fastest supercomputers in the world that are using AI to solve some of the world's biggest challenges. For example, at Oak Ridge National Laboratory, Frontier, the number one supercomputer in the world and the industry's first exascale supercomputer, is running large AI models on AMD Instinct GPUs to accelerate cancer research. In Finland, the LUMI supercomputer uses AMD Instinct GPUs to power the largest Finnish large language models, with 13 billion parameters. And we're collaborating with researchers at the Allen Institute, who are also using LUMI to create a state-of-the-art, fully open LLM with 70 billion parameters that will be used by the global scientific community. So let's take a deeper look at how LUMI is using EPYC CPUs and Instinct accelerators for AI.

[Video plays] "Cancer is one of the major problems for human health. It takes a long time for a pathologist to go through one sample. The aim of CombatAI is to build a decision support tool for pathologists to help them make their diagnosis. We show millions and millions of tissue samples taken from patients to this neural network; the more data you feed it, the better the model becomes. It takes a lot of compute to crunch that data and develop insights that we can use to advance humanity. The major challenge in analyzing these tissue images is the technical variation that arises from preparing the samples, like fixation, cutting, and staining, which of course creates challenges in training these neural networks. With this AI-based decision support tool, we're able to give pathologists a tool on which to base their diagnosis. The supercomputer we built is going to help people build more accurate and better models and get better outcomes. We're impacting the lives of billions of people." [Applause]

There are many stories of how people are using AI to accelerate next-generation systems. So, turning to AI hardware: generative AI and large language models have changed the landscape. The need for more compute is growing exponentially, whether you're talking about training or, frankly, inference. Larger models give you better accuracy, and there's a tremendous amount of experimentation and development happening across the industry. At the center of this are GPUs; GPUs are enabling generative AI.

So now let me turn to our Instinct GPU roadmap. CDNA is the underlying architecture for our Instinct accelerators, designed specifically for AI and HPC workloads. CDNA 3 is our brand-new architecture that uses a new compute engine, the latest data formats, 5-nanometer and 6-nanometer process technology, and the most advanced chiplet packaging technologies. At CES earlier this year we previewed MI300A, the world's first data center APU. It combines our CDNA 3 GPU architecture with 24 high-performance Zen 4 CPU cores and 128 gigabytes of HBM3 memory, all in a single package, with unified memory across the CPU and GPU, which is frankly very effective, particularly for some HPC workloads. This results in eight times more performance and five times better efficiency compared to the MI250X accelerator that is in the largest supercomputers today. MI300A has already been designed into supercomputers; it's slated for the two-plus-exaflop El Capitan system at Lawrence Livermore National Laboratory. It's the most complex chip we've ever built, with more than 146 billion transistors across 13 chiplets.

Now, you know we've led the industry in the use of chiplets, and our use of chiplets in this product is very strategic: we created a family of products. In addition to MI300A, with our chiplet construction we can replace the three Zen 4 CPU chiplets with two additional CDNA 3 chiplets to create a GPU-only version of MI300, optimized for large language models and AI. And to address the larger memory requirements of large language models, we added an additional 64 gigabytes of HBM3 memory. So with that, I am super excited to show you, for the very first time, MI300X. For those of you who are paying attention, it looks very similar to MI300A, because basically we took three chiplets off, put two chiplets on, and stacked more HBM3 memory. But with MI300X, we truly designed this product for generative AI. It combines CDNA 3 with an industry-leading 192 gigabytes of HBM3 that delivers 5.2 terabytes per second of memory bandwidth, and it has 153 billion transistors across twelve 5-nanometer and 6-nanometer chiplets.

So look, you need great compute engines, but you also need a lot of memory for everything that's going on. When you compare MI300X to the competition, MI300X offers 2.4 times more memory and 1.6 times more memory bandwidth, and with all of that additional memory capacity we actually have an advantage for large language models, because we can run larger models directly in memory. For the largest models, that significantly reduces the number of GPUs you need, speeding up performance, especially for inference, and reducing total cost of ownership.
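[A rough back-of-the-envelope sketch of that GPU-count claim, assuming 16-bit weights (2 bytes per parameter) and counting weights only, with no activations or KV cache. The 192 GB figure is quoted above; the 80 GB comparison point simply follows from the quoted "2.4 times more memory" (192 / 2.4 = 80). The model sizes are illustrative.]

```python
import math

BYTES_PER_PARAM = 2  # assumes fp16/bf16 weights; activations/KV cache ignored

def min_gpus(params_b: float, gpu_mem_gb: float) -> int:
    """Lower bound on accelerators needed just to hold the weights."""
    weights_gb = params_b * BYTES_PER_PARAM  # 1B params ~ 2 GB at 16-bit
    return math.ceil(weights_gb / gpu_mem_gb)

for params_b in (40, 80, 175):
    print(f"{params_b}B params: {min_gpus(params_b, 192)} x 192 GB "
          f"vs {min_gpus(params_b, 80)} x 80 GB")
```

[On these assumptions, an ~80-billion-parameter model just fits on one 192 GB accelerator (~160 GB of weights), which matches the single-GPU claim in the demo that follows.]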
So of course I want to show you the chip in action, for the very first time. Let's watch MI300X in action, shall we? For this demo, we wanted to show you a large language model running real-time inference on a single GPU. We're going to run the recently released Falcon-40B foundational large language model, currently the most popular model on Hugging Face, featuring 40 billion parameters. So let's watch, for the first time ever, MI300X running Falcon on a single GPU accelerator.

All right, let's start. We have to give the model a prompt. We're here in San Francisco, so let's say: "Write a poem about San Francisco." Here we go, the poem's coming; you can see it's responding in real time. I'm not a great poet, I don't know about you guys... "the city of dreams that always keeps you yearning for more." I'd say that poem's pretty good, huh?

Now look, you've all used generative AI already, and you've seen a number of generative AI demos in the past few months. But what I want to emphasize about this demo is that it's the first time a large language model of this size can be run entirely in memory on a single GPU. A single MI300X can run models up to approximately 80 billion parameters. So what does this actually mean? If you look at the industry today, model sizes are getting much larger, and you actually need multiple GPUs to run the latest large language models. With MI300X you can reduce the number of GPUs needed, and as model sizes continue growing, this will become even more important. With more memory, more memory bandwidth, and fewer GPUs needed, what this means for cloud providers as well as enterprise users is that we can run more inference jobs per GPU than before, and that enables us to deploy MI300X at scale to power next-generation LLMs with lower total cost of ownership, really making the technology much more accessible to the broader ecosystem. It also means that not only do we believe we have better total cost of ownership, we've also been able to significantly reduce the amount of development time needed to deploy MI300X.

Our goal with MI300X is to make it as easy to deploy as possible, and that means the infrastructure is also incredibly important, which is why I'm also excited to announce the AMD Instinct Platform. With this platform, again, we're all about open infrastructure: we're putting eight MI300Xs in the industry-standard OCP infrastructure. For customers, that means they can use all of the AI compute capability and memory of MI300X in an industry-standard platform that drops right into their existing infrastructure with very minimal changes. By leveraging the OCP platform specification, we're accelerating customers' time to market and reducing overall development costs, while making it really easy to deploy MI300X into their existing AI rack and server infrastructure.

So I'm happy to say that MI300A began sampling to our lead HPC and AI customers earlier this quarter, we're on track to begin sampling MI300X and the eight-GPU Instinct Platform in the third quarter, and we expect both of these products to ramp in production in the fourth quarter of this year. I really look forward to sharing more details on the MI300 family when we launch later this year.
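[For reference, the Falcon-40B checkpoint named in the demo is publicly available on the Hugging Face Hub as tiiuae/falcon-40b, and a generic single-device inference recipe with the standard transformers APIs looks roughly like the sketch below. This is not AMD's demo code, just a plausible stand-in; `device_map="auto"` additionally requires the `accelerate` package.]

```python
# A minimal sketch of single-accelerator Falcon-40B inference, in the spirit
# of the demo above; not the keynote's actual demo code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"  # the public checkpoint named in the demo
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 2 bytes/param -> roughly 80 GB of weights
    device_map="auto",            # place the model on one large-memory GPU
    trust_remote_code=True,       # needed on older transformers releases
)

prompt = "Write a poem about San Francisco"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=120)
print(tok.decode(out[0], skip_special_tokens=True))
```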
Now, since launching EPYC in 2017, we have been laser-focused on building the industry's best data center CPUs. EPYC is now the industry standard in the cloud, given our leadership performance and TCO across a wide range of workloads, and that momentum is just growing as we ramp our fourth-generation EPYC "Genoa" processors. Genoa features up to 96 high-performance 5-nanometer Zen 4 cores, and it has the latest I/O, including PCIe Gen 5, 12 channels of DDR5 memory, and support for CXL. We launched Genoa last November with leadership performance and efficiency, so let's take a look at some of those metrics.

Starting with the cloud, where integer performance is key: we deliver up to 1.8 times more performance per watt than the competition on the industry-standard SPECpower benchmark, which means Genoa is by far the best choice for anybody who cares about sustainability. And actually, today the vast majority of AI workloads are run on CPUs, and Genoa is also the best CPU for AI. The best way to look at AI performance is to look at a broad set of end-to-end workloads, so we use the industry-standard TPCx-AI benchmark, which measures end-to-end AI performance across ten different use cases and a host of different algorithms, and here EPYC is 1.9 times more performant than the competition.

Now, you can see I'm tremendously excited about Genoa and all of the applications our customers are running on it, but as I said earlier, data center workloads are becoming increasingly specialized, requiring optimized computing solutions across CPUs, DPUs, and of course AI accelerators. All of these factors drove the development of Bergamo, our first EPYC processor designed specifically for cloud workloads. As you know, I love my chips, so I am very happy to show you Bergamo. This is our new cloud-native processor, and it actually has a new compute die, different from Genoa's. Using our chiplet technology, each of the eight compute dies has 16 of our Zen 4c cores, and in the center we use the same 6-nanometer I/O die used by Genoa. If you take a look at the core, Zen 4c is an enhanced version of the Zen 4 core, optimized for the sweet spot of performance and power, and that is what gives us the much better density and energy efficiency. We optimized the physical implementation of Zen 4c for power and area, and we also redesigned the L3 cache hierarchy for greater throughput. Put all this together and the result is a design with 35 percent smaller area, so each of the eight compute chiplets on Bergamo contains twice the number of cores as Genoa's, and that's how we get to 128 cores per socket. Importantly, as I said, it's fully software compatible and fully platform compatible with Genoa, which means customers can easily deploy either Bergamo or Genoa depending on their overall compute needs and workloads, really leveraging their overall platform investment in AMD.
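[The chiplet arithmetic here is easy to check. One detail not stated in the talk: Genoa reaches its 96 cores with twelve 8-core Zen 4 compute dies, so Bergamo's eight 16-core Zen 4c dies trade die count for per-die core density.]

```python
# Core-count arithmetic for the two EPYC layouts described above.
# Genoa's 12-die layout is a known product detail, not stated in the talk.
genoa_cores = 12 * 8    # twelve Zen 4 compute dies x 8 cores each
bergamo_cores = 8 * 16  # eight Zen 4c dies x 16 cores ("twice the cores per die")
print(genoa_cores, bergamo_cores)  # 96 128
```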
Let me show you some of the performance metrics on Bergamo. If you compare Bergamo to our competition's top of stack, you'll see just incredible performance: we're delivering up to 2.6 times more performance across a wide range of cloud-native applications, whether web front ends, in-memory analytics, or very heavy transactional workloads. And looking beyond that, at overall density and capability, Bergamo is again significantly better than the competition in compute density and energy efficiency: we see more than double the number of containers per server, and two times the energy efficiency in Java workloads. So I'm happy to say that Bergamo is shipping in volume now to our hyperscale customers.

Now, to enable generative AI you need best-in-class hardware, but you also need a great software ecosystem. So let me invite AMD President Victor Peng to the stage to share more about the growing software ecosystem for AI solutions. [Music]

Victor Peng: Thank you, Lisa. Good morning. It's really great to be here, and it's especially exciting for me, not only because we've now put the data center GPU team and our newly formed AI group under one roof, but because I can sum up our direction, the current state of things, and how we're enabling AI software development with three words: open, proven, and ready. While this is a journey, we've made really great progress in building a powerful software stack that works with the open ecosystem of models, libraries, frameworks, and tools. Realizing application performance really does require a leadership software stack optimized for the ecosystem.

Let me first cover ROCm, the software stack for our Instinct data center GPUs. ROCm is a complete set of libraries, runtimes, compilers, and tools needed to develop, run, and tune AI models and algorithms. A significant portion of the ROCm stack is actually open: our drivers, language runtime, tools like our debugger and profiler, and our libraries are all open. ROCm also supports the AI software ecosystem, including open frameworks, models, and tools. ROCm is now in its fifth generation, and it includes a very comprehensive suite of optimizations for AI as well as high-performance computing. This provides a very sound foundation for continued PyTorch and TensorFlow compliance, which enables a really rich out-of-the-box AI experience.

So now let's move up the stack to frameworks, and specifically PyTorch, one of the most popular and fastest-growing frameworks. And what better person to talk about the collaboration AMD and PyTorch are doing to advance AI than one of the founders of PyTorch. I'd like to introduce Soumith Chintala to the stage. It's really great to see you; thank you for coming. Can you share some of your thoughts about the AMD and PyTorch collaboration?

Soumith Chintala: For sure. The PyTorch and AMD collaboration goes back several years; AMD and Meta have been collaborating in various forms, and PyTorch mostly came out of Meta, so it's a multi-year collaboration. We've been giving AMD a lot of feedback on many aspects of the ideal hardware and software for running AI workloads, and AMD and we have been partnering to build the ROCm stack and a bunch of PyTorch operators and integrations to robustly test the whole stack. I'm pretty excited about the current support, especially on the Instinct accelerators that ROCm enables, and I'm pretty excited about MI300 as well. I think this is the start; I'm looking forward to how customers find the maturity of the stack, but we've spent a lot of time trying to make sure it comes out right.

Victor: We're really excited about MI300 as well, along with several others, so thank you for that. So how do you see the collaboration benefiting the developer community?

Soumith: Sure. In AI workloads we generally have a single dominating vendor, and when you write your workloads and then have to make them run on different hardware, there's a lot of software work developers have to do to move their neural network workloads from one platform to another. One of the things we've done with AMD, with the ROCm stack and the PyTorch integration, is that you don't actually have to do that much work, or in a lot of cases almost no work, to go from one platform to the other. You might do a little bit of tuning once you're deployed onto your AMD GPU, but it's super seamless, so I think developers are going to get a huge productivity boost when they switch to the AMD backend of PyTorch versus the TPU or NVIDIA backends. I'm pretty excited about the overall productivity developers will have when switching to the AMD backend, starting with the Instinct GPUs, and I'm hoping you'll enable it for all of your other classes of GPUs as well.

Victor: Absolutely, the goal is to enable that capability for all developers across all our platforms. Thank you so much for the partnership; I look forward to more together.
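[A small illustration of the portability point Soumith makes above: on ROCm builds, PyTorch exposes AMD GPUs through the same `torch.cuda` interface, so a typical snippet like the one below runs unchanged on either backend. This is a generic sketch, not code from the keynote.]

```python
# Device-agnostic PyTorch sketch; on a ROCm build, "cuda" maps to AMD GPUs.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
print("HIP runtime:", torch.version.hip)  # non-None on ROCm builds of PyTorch

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 1024),
).to(device)

x = torch.randn(8, 1024, device=device)
with torch.inference_mode():
    y = model(x)
print(y.shape, y.device)  # same code, CUDA or ROCm underneath
```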
PyTorch 2.0 provides an open, performant, and productive option for developers to build their latest AI inventions, and that option, that creativity, and that democratization, if you will, is super important. We are one of only two GPU platforms integrated with PyTorch at this level.

Okay, so we've covered ROCm and integration with open frameworks. That leads us to the top of the stack: AI models and algorithms. Hugging Face is the leading enabler of AI model innovation in the open-source community. They offer an extremely wide range of models, including the transformers that are at the heart of generative AI, but also vision models and models for all kinds of other applications and use cases. So here to talk more about their groundbreaking work and our partnership with Hugging Face is Clément Delangue, CEO of Hugging Face. Please welcome Clem. [Applause]

Victor: Hey everyone. Clem, it's great to see you; thanks for coming. Why don't we start off with your thoughts about why open source matters for the growth and proliferation of AI?

Clément Delangue: Thanks for having me. First, it's important to remember that most progress in AI in the past five to ten years has been thanks to open science and open source; maybe we would be fifty years away from where we are today without it. Now, when we look to the future, open science and open-source AI are not only a way to accelerate technology but also a way to level the playing field. In the future, we want every single company to be able to train and run their own ChatGPT on AMD hardware. All of that allows companies to build AI themselves, not just to use AI through APIs, most of the time with customized, specialized, smaller models. It makes AI faster, cheaper, and better to run. It also makes it safer, for those companies but also for the field in general, because it creates more opportunities for transparency and accountability, which fosters more ethical AI.

Victor: Yeah, that's tremendous.
And I'm personally thrilled that we just recently formalized our relationship. Can you share what the Hugging Face and AMD partnership is going to deliver?

Clem: Yes, we're super excited about this new partnership. Hugging Face is lucky to have become the most used open platform for AI today: we have 15,000 companies using our software, and they have shared over half a million open models and datasets, some of which you might have heard of, like Stable Diffusion, Falcon, BLOOM, StarCoder, and MusicGen, which was just released by Meta a few days ago. Over 5,000 new models were added just last week, which shows the crazy speed of the open-source AI community these days. We will optimize all of that for AMD platforms, starting with Instinct GPUs. We will also include AMD hardware in our regression tests for some of our most popular libraries, like Transformers, and in our CI/CD, to ensure that new models, like the five thousand added last week, are natively optimized for AMD platforms. It's really important that hardware doesn't become a bottleneck or gatekeeper for AI as it develops, so what we're trying to do is extend the range of options for AI builders, both for training and inference. I'm super excited in particular about the ability of AMD to power large language models in data centers, thanks to the memory capacity and bandwidth advantage. Ultimately, AI is becoming the default way to build all technology, for all industries: the language models we've been talking about, but also image, audio, and video, and we're seeing more and more time series, biology, chemistry, and many more domains. Hopefully this collaboration will be one great step toward democratizing AI even further and improving everyone's lives.

Victor: That is a truly inspired vision, which we share at AMD; we deeply believe in it, and we're so excited to be working with Hugging Face to make that vision a reality. Thank you so much; I appreciate it. You know, the rate of innovation in AI is unprecedented, and the open-source community is clearly a major driver of that rate and breadth of innovation. AMD is ready to help our customers and developers achieve the next breakthrough in AI. Thank you so much, and now I'd like to invite Lisa back to the stage.

Lisa Su: All right, thank you, Victor. Let me wrap things up for today. We showed you a tremendous amount of new data center and AI technologies, from our expanded portfolio of leadership EPYC server processors with Genoa, Bergamo, and Genoa-X, to our expanding AI software ecosystem and our next-generation MI300X data center AI accelerators. It's just an incredible pace of innovation, truly exciting. But more than that, we really appreciate the partnership, and a very special thank you to our partners who are here with us today: AWS, Meta, Microsoft, Citadel, PyTorch, and Hugging Face. We truly believe in co-development, co-innovation, and partnership. It's been a great day as we take another major step forward to make AMD the data center and AI partner of choice. Thank you so much for joining us.
Info
Channel: Ticker Symbol: YOU
Views: 94,105
Keywords: nvidia, nvda, nvidia stock, nvda stock, jensen huang, nvidia keynote, openai, chatgpt, gpt4, gpt3, msft, microsoft stock, msft stock, artificial intelligence stocks, nvidia stock news, semiconductor stocks, tsmc, tsm stock, gpt-4, stable diffusion, jensen huang keynote, nvidia 2023, ai copilot, computex 2023, nvidia computex 2023, nvidia keynote 2023, ai stocks, best ai stocks, amd keynote, amd stock, amd stock news, amd, amd lisa su keynote, amd ai chip, amd datacenter, amd ai
Id: dU3xW5GcJLo
Length: 27min 13sec (1633 seconds)
Published: Thu Jun 15 2023