AMD Presents: Advancing AI

Hey, good morning. Good morning, everyone. Welcome to all of you who are joining us here in Silicon Valley and to everyone who's joining us online from around the world. It has been just an incredibly exciting year with all of the new products and all the innovation that has come across our business and our industry. But today, it's all about AI. We have a lot of new AI solutions to launch today and news to share with you, so let's go ahead and get started. Now, I know we've all felt this this year. I mean, it's been just an amazing year. I mean, if you think about it, a year ago, OpenAI unveiled ChatGPT. And it's really sparked a revolution that has totally reshaped the technology landscape. In just this short amount of time, AI hasn't just progressed. It's actually exploded. The year has shown us that AI isn't just kind of a cool new thing. It's actually the future of computing. And at AMD, when we think about it, we actually view AI as the single most transformational technology over the last 50 years. Maybe the only thing that has been close has been the introduction of the internet. But what's different about AI is that the adoption rate is just much, much faster. So although so much has happened, the truth is right now, we're just at the very beginning of the AI era. And we can see how it's so capable of touching every aspect of our lives. So if you guys just take a step back and just look, I mean, AI is already being used everywhere. Think about improving healthcare, accelerating climate research, enabling personal assistants for all of us and greater business productivity, things like industrial robotics, security, and providing lots of new tools for content creators. Now the key to all of this is generative AI. It requires a significant investment in new infrastructure. And that's to enable training and all of the inference that's needed. And that market is just huge. Now a year ago when we were thinking about AI, we were super excited. And we estimated the data center AI accelerator market would grow approximately 50% annually over the next few years, from something like $30 billion in 2023 to more than $150 billion in 2027. And that felt like a big number. However, as we look at everything that's happened in the last 12 months and the rate and pace of adoption that we're seeing across the industry, across our customers, across the world, it's really clear that the demand is just growing much, much faster. So if you look now at what it takes to enable AI infrastructure, of course it starts with the cloud, but it goes into the enterprise. We believe we'll see plenty of AI throughout the embedded markets and into personal computing. We're now expecting that the data center accelerator TAM will grow more than 70% annually over the next four years to over $400 billion in 2027. So does that sound exciting for us as an industry? I have to say for someone like me who's been in the industry for a while, this pace of innovation is faster than anything I've ever seen before. And for us at AMD, we are so well positioned to power that end-to-end infrastructure that defines this new AI era. From massive cloud server installations, to the on-prem enterprise clusters we're going to talk about, to the next generation of AI in embedded devices and PCs, our AI strategy is really centered around three big strategic priorities. First, we must deliver a broad portfolio of very performant, energy-efficient GPUs, CPUs, and adaptive computing solutions for AI training and inference. 
And we believe, frankly, that you're going to need all of these pieces for AI. Second, it's really about expanding our open, proven, and very developer-friendly software platform to ensure that leading AI frameworks, libraries, and models are all fully enabled for AMD hardware and that it's really easy for people to use. And then third, it's really about partnership. You're going to see a lot of partners today. That's who we are as a company. It's about expanding the co-innovation work and working with all parts of the ecosystem, including cloud providers, OEMs, software developers. You're going to hear from some real AI leaders in the industry to really accelerate how we work together and get that widespread deployment of our solutions across the board. So we have so much to share with you today. I'd like to get started. And of course, let's start with the cloud. Generative AI is the most demanding data center workload ever. It requires tens of thousands of accelerators to train and refine models with billions of parameters. And that same infrastructure is also needed to answer the millions of queries from everyone around the world to these smart models. And it's very simple. The more compute you have, the more capable the model, the faster the answers are generated. And the GPU is at the center of this generative AI world. And right now, I think we all know it, everyone I've talked to says it, the availability and capability of GPU compute is the single most important driver of AI adoption. Do you guys agree with that? So that's why I'm so excited today to launch our Instinct MI300X. It's the highest performance accelerator in the world for generative AI. MI300X is actually built on our new CDNA 3 data center architecture. And it's optimized for performance and power efficiency. CDNA 3 has a lot of new features. It combines a new compute engine. It supports sparsity, the latest data formats, including FP8. It has industry-leading memory capacity and bandwidth. And we're going to talk a lot about memory today. And it's built on the most advanced process technologies and 3D packaging. So if you compare it to our previous generation, which frankly was also very good, CDNA 3 actually delivers more than three times higher performance for key AI data types, like FP16 and BF16, and a nearly seven times increase in INT8 performance. So if you look underneath it, how do we get MI300X? It's actually 153 billion transistors, 153 billion. It's across a dozen 5-nanometer and 6-nanometer chiplets. It uses the most advanced packaging in the world. And if you take a look at how we put it together, it's actually pretty amazing. We start with four IO dies in the base layer. And what we have on the IO dies are 256 megabytes of Infinity Cache and all of the next-gen IO that you need. Things like 128-channel HBM3 interfaces, PCIe Gen 5 support, our fourth-gen Infinity Fabric that connects multiple MI300Xs so that we get 896 gigabytes per second. And then we stack eight CDNA 3 accelerator chiplets, or XCDs, on top of the IO dies. And that's where we deliver 1.3 petaflops of FP16 and 2.6 petaflops of FP8 performance. And then we connect these 304 compute units with dense through-silicon vias, or TSVs, and that supports up to 17 terabytes per second of bandwidth. And of course, to take advantage of all of this compute, we connect eight stacks of HBM3 for a total of 192 gigabytes of memory at 5.3 terabytes per second of bandwidth. That's a lot of stuff on there. 
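As a rough back-of-the-envelope sketch, not a calculation from the presentation, the headline memory figures above can be reconstructed from per-stack HBM3 numbers, and the same arithmetic hints at why 192 GB matters when fitting a large model onto a single accelerator. The per-stack values and the 70-billion-parameter example below are illustrative assumptions, not official specifications.

```python
# Back-of-the-envelope sketch of the MI300X memory figures quoted above.
# Per-stack values are illustrative assumptions, not official specifications.

HBM3_STACKS = 8
GB_PER_STACK = 24            # assumed: 8 x 24 GB -> 192 GB total
TBPS_PER_STACK = 0.665       # assumed per-stack bandwidth, ~5.3 TB/s aggregate

total_capacity_gb = HBM3_STACKS * GB_PER_STACK          # 192 GB
total_bandwidth_tbps = HBM3_STACKS * TBPS_PER_STACK     # ~5.3 TB/s

# Why capacity matters for inference: a dense model stored in FP16 needs
# roughly 2 bytes per parameter for the weights alone (the KV cache and
# activations add more on top of this).
params_billion = 70                                      # e.g., a Llama 2 70B-class model
weight_bytes_gb = params_billion * 1e9 * 2 / 1e9         # ~140 GB of weights in FP16

fits_on_one_gpu = weight_bytes_gb < total_capacity_gb

print(f"Capacity: {total_capacity_gb} GB, bandwidth: {total_bandwidth_tbps:.1f} TB/s")
print(f"FP16 weights for a {params_billion}B model: ~{weight_bytes_gb:.0f} GB "
      f"-> fits on one 192 GB GPU: {fits_on_one_gpu}")
```

The same per-GPU capacity arithmetic is what the later platform discussion leans on when it talks about running larger models on fewer GPUs.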
I have to say, it's truly the most advanced product we've ever built, and it is the most advanced AI accelerator in the industry. Now let's talk about some of the performance and why it's so great. For generative AI, memory capacity and bandwidth are really important for performance. If you look at MI300X, we made a very conscious decision to add more flexibility, more memory capacity, and more bandwidth, and what that translates to is 2.4 times more memory capacity and 1.6 times more memory bandwidth than the competition. Now when you run things like lower precision data types that are widely used in LLMs, the new CDNA 3 compute units and memory density actually enable MI300X to deliver 1.3 times more teraflops of FP8 and FP16 performance than the competition. Now these are good numbers, but what's more important is how things look in real world inference workloads. So let's start with some of the most common kernels used by the latest AI models. LLMs use attention algorithms to generate precise results. So for something like FlashAttention-2 kernels, MI300X actually delivers up to 1.2 times better performance than the competition. And if you look at something like the Llama 2 70B LLM, and we're going to use this a lot throughout the show, MI300X again delivers up to 1.2 times more performance. And what this means is the performance at the kernel level actually directly translates into faster results when running LLMs on a single MI300X accelerator. But we also know, we talked about these models getting so large, so what's really important is how that AI performance scales when you go to the platform level and beyond. So let's take a look at how MI300X scales. Let's start first with training. Training is really hard. People talk about how hard training is. When you look at something like the 30 billion parameter model from Databricks, MPT LLM, it's a pretty good example of something that is used by multiple enterprises for a lot of different things. And you can see here that the training performance for MI300X is actually equal to the competition. And that means it's actually a very, very competitive training platform today. But when you turn to the inference performance of MI300X, this is where our performance really shines. We're showing some data here, measured data on two widely used models, Bloom 176B. It's the world's largest open multi-language AI model. It generates text in 46 languages. And our Llama 2 70B, which is also very popular, as I said, for enterprise customers. And what we see in this case is a single server with eight MI300X accelerators is substantially faster than the competition, 1.4 to 1.6X. So these are pretty big numbers here. And what this performance does is it just directly translates into a better user experience. You guys have used it. When you ask the model something, you'd like it to come back faster, especially as the responses get more complicated. So that gives you a view of the performance of MI300X. Now excited as we are about the performance, we are even more excited about the work we're doing with our partners. So let me turn to our first guest, very, very special. Microsoft is truly a visionary leader in AI. We've been so fortunate to have a deep partnership with Microsoft for many, many years across all aspects of our business. And the work we're doing today in AI is truly taking that partnership to the next level. So here to tell us more about that is Microsoft's Chief Technology Officer, Kevin Scott. Kevin, it is so great to see you. 
Thank you so much for being here with us. It's a real pleasure to be here with you all today. We've done so much work together on EPYC and Instinct over the years. Can you just tell our audience a little bit about that partnership? Yeah, I think Microsoft and AMD have a very special partnership. And as you mentioned, it has been one that we've enjoyed for a really long time. It started with the PC. It continued then with a bunch of custom silicon work that we've done together over the years on Xbox. It's extended through the work that we've done with you all on EPYC for the high-performance computing workloads that we have in our cloud. And the thing that I've been spending a bunch of time with you all on the past couple of years, actually a little bit longer even, is AI compute, which I think everybody now understands how important it is to driving progress on this new platform that we're trying to deliver to the world. I have to say we talk pretty often. We do. But Kevin, what I admire so much is just your vision, Satya's vision about where AI is going in the industry. So can you just give us a perspective of where we are on this journey? Yeah, so with a huge amount of intensity over the past five years or so, we have been trying to prepare for the moment that I think we brought the world into over the past year. So it is almost a year to the day since the launch of ChatGPT, which I think is perhaps most people's first contact with this new wave of generative AI. But the thing that allowed Microsoft and OpenAI to do this was just a deep amount of infrastructure work that we've been investing in for a very long while. And one of the things that we realized fairly early in our journey is just how important compute was going to be and just how important it is to think about the sort of full systems optimization. So the work that we've been doing with you all has been not just about figuring out what the silicon architecture looks like, but that's been a very important thing and making sure that we together are building things that are going to intercept where the actual platform is going to be years in advance, but also just doing all of that software work that needs to be done to make this thing usable by all the developers of the world. I think that's really key. I think sometimes people don't understand, they think about AI as this year, but the truth is we've been building the foundation for so many years. Kevin, I want to take this moment to really acknowledge that Microsoft has been so instrumental in our AI journey. The work we've done over the last several generations, the software work that we're doing, the platform work that we're doing, we're super excited for this moment. Now I know you guys just had Ignite recently and Satya previewed some of the stuff you're doing with 300X, but can you share that with our audience? We're super enthusiastic about 300X. Satya announced that the MI300X VMs were going to be available in Azure. It's really, really exciting right now seeing the bring-up of GPT-4 on MI300X, seeing the performance of Llama 2, getting it rolled into production. The thing that I'm excited about here today is that we will have the MI300X VMs available in preview today. I completely agree with you. The thing that's so exciting about AI is every day we discover something new and we're learning that together. Kevin, we're so honored to be Microsoft's partner in AI. Thank you for all the work that your teams have done, that we've done together. 
We look forward to a lot more progress. Likewise. Thank you very much. All right. So look, we certainly do learn a tremendous amount every day and we're always pushing the envelope. Let me talk to you a little bit about how we bring more people into our ecosystem. When I talk about the Instinct platform, you have to understand our goal has really been to enable as many customers as possible to deploy Instinct as fast and as simply as possible. To do this, we really adopted industry standards. We built the Instinct platform based on an industry-standard OCP server design. I'd actually like to show you what that means because I don't know if everyone understands. Let's bring her out. Her or him? Let me show you the most powerful gen AI computer in the world. Those of you who follow our shows know that I'm usually holding up a chip, but we've shown you the MI300X chip already, so we thought it would be important to show you just what it means to do generative AI at a system level. What you see here is eight MI300X GPUs and they're connected by our high-performance Infinity Fabric in an OCP-compliant design. What makes that special? This board actually drops right into any OCP-compliant design, which is the majority of AI systems today. We did this for a very deliberate reason. We want to make this as easy as possible for customers to adopt so you can take out your other board and put in the MI300X Instinct platform. If you take a look at the specifications, we actually support all of the same connectivity and networking capabilities of our competition, so PCIe Gen 5, support for 400-gig Ethernet, that 896 gigabytes per second of total system bandwidth, but all of that is with 2.4 times more memory and 1.3 times more compute per server than the competition. That's really why we call it the most powerful gen AI system in the world. Now, I've talked about some of the performance in AI workloads, but I want to give you just a little bit more color on that. When you look at deploying servers at scale, it's not just about performance. Our customers are also trying to optimize power, space, CapEx and OpEx, and that's where you see some really nice benefits of our platform. When you compare our Instinct platform to the competition, I've already shown you that we deliver comparable training performance and significantly higher inference performance, but in addition, what that memory capacity and bandwidth gives us is that customers can actually either run more models, if you're running multiple models on a given server, or you can run larger models on that same server. In the case where you're running multiple different models on a single server, the Instinct platform can run twice as many models for both training and inference as the competition. On the other side, if what you're doing is trying to run very large models, you'd like to fit them on as few GPUs as possible. With the FP16 data format, you can run twice the number of LLMs on a single MI300X server compared to our competition. This directly translates into lower CapEx, and especially if you don't have enough GPUs, this is really, really helpful. So, to talk more about MI300X and how we're bringing it to market, let me bring our next guest to the stage. Oracle Cloud and AMD have been engaged for many, many years in bringing great computing solutions to the cloud. Here to tell us more about our work together is Karan Batta, Senior Vice President at Oracle Cloud Infrastructure. Hey, Karan. Hi, Lisa. Thank you so much for being here. 
Thank you for your partnership. Can you tell us a little bit about the work that we're doing together? Yeah, thank you. Excited to be here today. Oracle and AMD have been working together for a long, long time, right, since the inception of OCI back in 2017. And so, we've launched every generation of EPYC as part of our bare metal compute platform, and it's been so successful with customers like Red Bull, as an example. And we've expanded that across the board for all of our portfolio of PaaS services like Kubernetes, VMware, et cetera. And then we are also collaborating on Pensando DPUs, where we offload a lot of that logic so that customers can get much better performance, flexibility. And then, you know, earlier this year, we also announced that we're partnering with you guys on Exadata, which is a big deal, right? So, we're super excited about our partnership with AMD, and then what's to come with 300X? Yeah. We really appreciate that OCI has been a leading customer as we talk about how we bring new technology into Oracle Cloud. Now, you're spending a lot of time on AI as well. Tell us a little bit about your strategy for AI and how we fit into that strategy. Absolutely. You know, we're spending a lot of time on AI, obviously. Everyone is. We are. Everybody is. It's the new thing. You know, we're doing that across the stack, from infrastructure all the way up to applications. Oracle is an applications company as well. And so, we're doing that across the stack, but from an infrastructure standpoint, we're investing a lot of effort into our core compute stack, our networking stack. We announced cluster networking. And what I'm really excited to announce is that we're going to be supporting MI300X as part of that bare-metal compute stack. We are super thrilled about that partnership. We love the fact that you're going to have 300X. I know your customers and our customers are talking to us every day about it. Tell us a little bit about what customers are saying. Yeah, we've been working with a lot of customers. Obviously, we've been collaborating a lot at the engineering level as well with AMD. And you know, customers are seeing incredible results already from the previous generation. And so, I think that will actually carry through with the 300X. And so much so that we're also excited to actually support MI300X as part of our generative AI service that's going to be coming up live very soon as well. So, we're very, very excited about that. We're working with some of our early customer adopters like Naveen from Databricks Mosaic. So, we're very excited about the possibility. We're also very excited about the fact that the ROCm ecosystem is going to help us continue that effort moving forward. So, we're very pumped. That's wonderful. Karan, thank you so much. Thank your teams. We're so excited about the work we're doing together and look forward to a lot more. Thank you, Lisa. Thank you. Now, as important as the hardware is, software actually is what drives adoption. And we have made significant investments in our software capabilities and our overall ecosystem. So, let me now welcome to the stage AMD President Victor Peng to talk about our software and ecosystem progress. Thank you, Lisa. Thank you. And good morning, everyone. You know, last June at the AI event in San Francisco, I said that the ROCm software stack was open, proven, and ready. 
And today, I'm really excited to tell you about the tremendous progress we've made in delivering powerful new features as well as high performance on ROCm, and about how the ecosystem partners have been significantly expanding the support for Instinct GPUs and the entire product portfolio. Today, there are multiple tens of thousands of AI models that run right out of the box on Instinct. And more developers are running on the MI250, and soon they'll be running on the MI300. So we've expanded deployments in the data center, at the edge, in client, and in embedded applications of our GPUs, CPUs, FPGAs, and adaptive SoCs, really end to end. And we're executing on that strategy of building a unified AI software stack so any model, including generative AI, can run seamlessly across an entire product portfolio. Now, today, I'm going to focus on ROCm and the expanded ecosystem support for our Instinct GPUs. We architected ROCm to be modular and open source to enable very broad user accessibility and rapid contribution by the open source community and AI community. Open source and ecosystem are really integral to our software strategy, and in fact, really, open is integral to our overall strategy. This contrasts with CUDA, which is proprietary and closed. Now, the open source community, everybody knows, moves at the speed of light in deploying and proliferating new algorithms, models, tools, and performance enhancements. And we are definitely seeing the benefits of that in the tremendous ecosystem momentum that we've established. To further accelerate developer adoption, we recently announced that we're going to be supporting ROCm on our Radeon GPUs. This makes AI development on AMD GPUs more accessible to more developers, start-ups, and researchers. So our foot is firmly on the gas pedal with driving the MI300 to volume production and our next ROCm release. So I'm really super excited that we'll be shipping ROCm 6 later this month. I'm really proud of what the team has done with this really big release. ROCm 6 has been optimized for gen AI, particularly large language models, has powerful new features, library optimizations, expanded ecosystem support, and increases performance by factors. It really delivers for AI developers. ROCm 6 supports FP16, BF16, and the new FP8 data types for higher performance while reducing both memory and bandwidth needs. We've incorporated advanced graph and kernel optimizations and optimized libraries for improved efficiency. We're shipping state-of-the-art attention algorithms like FlashAttention-2 and PagedAttention, which are critical for performance in LLMs and other models. These algorithms and optimizations are complemented with a new release of RCCL, our collective communications library for efficient, very large-scale GPU deployments. So look, the bottom line is ROCm 6 delivers a quantum leap in performance and capability. Now I'm going to first walk you through the inference performance gains you'll see with some of these optimizations on ROCm 6. So for instance, running the 70 billion parameter Llama 2 model, PagedAttention and other algorithms speed up token generation by paging attention keys and values, delivering 2.6x higher performance. HIP Graph allows processing to be defined as graphs rather than single operations, and that delivers a 1.4x speedup. FlashAttention, which is a widely used kernel for high-performance LLMs, delivers a 1.3x speedup. 
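To make the paged attention point concrete, here is a small illustrative calculation, not one from the presentation, of how large the attention key/value cache can grow for a Llama 2 70B-class model. The layer, head, and head-dimension values follow the published Llama 2 70B configuration, while the batch size and sequence length are arbitrary assumptions.

```python
# Illustrative KV-cache sizing for a Llama 2 70B-class model in FP16.
# Model dimensions follow the published Llama 2 70B config; batch size and
# sequence length are arbitrary assumptions for illustration.

num_layers = 80          # transformer blocks
num_kv_heads = 8         # grouped-query attention: 8 key/value heads
head_dim = 128           # per-head dimension
bytes_per_elem = 2       # FP16

# Keys and values are cached for every prompt/generated token in every layer.
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem

batch_size = 16
seq_len = 4096
kv_cache_gb = kv_bytes_per_token * batch_size * seq_len / 1e9

print(f"KV cache per token: {kv_bytes_per_token / 1e6:.2f} MB")
print(f"KV cache for batch={batch_size}, seq={seq_len}: ~{kv_cache_gb:.1f} GB")
# Serving many concurrent requests quickly multiplies this footprint, which is
# why paging the KV cache (the idea behind PagedAttention) and large HBM
# capacity both matter for LLM inference throughput.
```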
So all those optimizations together deliver an 8x speedup on the MI300X with ROCm 6 compared to the MI250 and ROCm 5. That's 8x the performance in a single generation. So this is one of those huge benefits we provide to customers with this great performance improvement with the MI300X. So now let's look at it from a competitive perspective. Lisa had highlighted the performance of large models running on multiple GPUs. What I'm sharing here is the performance of smaller models running on a single GPU, in this case the 13 billion parameter Llama 2 model. The MI300X and ROCm 6 together deliver 1.2x higher performance than the competition. So this is the reason why our customers and our partners are super excited about creating the next innovations in AI on the MI300X. So we're relentlessly focused on delivering leadership technology and very comprehensive software support for AI developers. And to fuel that drive, we've been significantly strengthening our software teams through both organic and inorganic means, and we're expanding our ecosystem engagements. So we recently acquired Nod.ai and Mipsology. Nod brings world-class expertise in open source compilers and runtime technology. They've been instrumental in the MLIR compiler technology as well as in the communities. And as part of our team, they are significantly strengthening our customer engagements and they're accelerating our software development plans. Mipsology also strengthens our capabilities, especially in delivering to customers in very AI-rich applications like autonomous vehicles and industrial automation. So now let me turn over to the ecosystem. We announced our partnership with Hugging Face just last June. Today they have 62,000 models running daily on Instinct platforms. And in addition, we've worked closely on getting these LLM optimizations as part of their Optimum library and toolkit. Our partnership with the PyTorch Foundation has also continued to thrive with CI/CD pipelines and validation, enabling developers to target our platforms directly. And we continue to make very significant contributions to all the major frameworks, including upstream support for AMD GPUs in JAX, OpenXLA, CuPy, and even initiatives like DeepSpeed for Science. Just yesterday, the AI Alliance was announced with over 50 founding members, including AMD, IBM, Meta, and other companies. And I'm really delighted to share some very late-breaking news. AMD GPUs, including the MI300, will be supported in the standard OpenAI Triton distribution starting with the 3.0 release. We're really thrilled to be working with Philippe Tillet, who created Triton, and the whole OpenAI team. AI developers using OpenAI Triton are more productive working at a higher level of design abstraction, and they still get really excellent performance. This is great for developers and aligned with our strategy to empower developers with powerful and open software stacks and GPU platforms. This is in contrast to the much greater effort developers would need to invest working at a much lower level of abstraction in order to eke out performance. Now I've shared a lot with you about the progress we made on software, but the best indication of the progress we've really made are the people who are using our software and GPUs and what they're saying. So it gives me great pleasure to have three AI luminaries and entrepreneurs from Databricks, Essential AI, and Lamini to join me on stage. 
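For readers unfamiliar with the abstraction level being described, the snippet below is a minimal Triton kernel in the style of Triton's own introductory examples, a hedged sketch rather than anything shown at the event. Per the announcement above, the standard distribution is expected to target AMD GPUs starting with the 3.0 release; the exact version and environment requirements are assumptions, and the kernel itself is generic.

```python
# Minimal Triton vector-add kernel, adapted from Triton's introductory
# examples, to illustrate the "higher level of design abstraction" point.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n_elements = out.numel()
    grid = (triton.cdiv(n_elements, 1024),)  # one program per 1024 elements
    add_kernel[grid](x, y, out, n_elements, BLOCK_SIZE=1024)
    return out

if __name__ == "__main__":
    # PyTorch's ROCm builds expose AMD GPUs through the same "cuda" device name.
    x = torch.randn(98432, device="cuda")
    y = torch.randn(98432, device="cuda")
    print(torch.allclose(add(x, y), x + y))
```

The appeal described in the keynote is that this block-level style hides the per-architecture tuning a hand-written low-level kernel would require, while the compiler handles the hardware-specific details.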
Please give a very warm welcome to Ion Stoica, Ashish Vaswani, and Sharon Zhou. Great. Welcome, Ion, Ashish, and Sharon. Thank you so much for joining us here. Really appreciate it. So I'm gonna ask each of you first about the mission of your company, and to share the innovations you're doing with our GPUs and software and what the experience has been like. So Ion, let me start with you. Now you're not only a founder of Databricks, but you're also on the faculty at UC Berkeley, director of the Sky Computing Lab, and you've been involved with Anyscale and many AI startups. So maybe you could talk about your engagement with AMD as well as your experience with the MI200 and MI300. Yeah, thank you very much. Very glad to be here. And yes, indeed, I collaborated with AMD wearing multiple hats: director of the Sky Computing Lab at Berkeley, which AMD is supporting, and also founder of Anyscale and Databricks. And in all my work over the years, one thing I really focus on is democratizing access to AI. What this means is improving the scale and performance, and reducing the cost, of running these large AI applications and workloads, everything from training and fine-tuning to inference and generative AI applications. Just to give you some examples, we developed vLLM, which is arguably now the most popular open-source inference engine for LLMs. We have developed Ray, another open-source framework which is used to distribute machine learning workloads. Ray has been used by OpenAI to train ChatGPT. And more recently, in Sky Computing, one of the projects there is SkyPilot, which helps you run your machine learning applications and workloads across multiple clouds. And why do you want to do that? It's because you want to alleviate the scarcity of the GPUs and reduce the costs. Now, when it comes to our collaborations, we collaborate on all these kinds of projects. And one thing which was a very pleasant surprise is that it was very easy to run and include ROCm in our stack. It really runs out of the box from day one. Of course, you need to do more optimization for that. And this is what we are doing and we are working on. So for instance, we added support for the MI250 to Ray. And we are actually working and collaborating with AMD, like I mentioned, to optimize inference for vLLM, again, running on MI250 and MI300X. And from the point of view of SkyPilot, we're really looking forward to having more and more MI250s and MI300Xs in various clouds, so we have more choices. It sounds great. Thank you so much for all the collaboration across all those clouds. Ashish, why don't you tell us about Essential's mission and also your experience with ROCm and Instinct? Thank you. Great to be here, Victor. At Essential, we're really excited to push the boundaries of human-machine partnership in enterprises. We should be able to do it. We're at the beginning stages where we'll be able to do 10x or 50x more than what we can just do by ourselves today. So we're extremely excited. And what that's going to take, I believe, is a full-stack approach. So you're building the models and serving infrastructure, but more importantly, understanding workflows in enterprises today and giving people the tools to configure these models, to teach these models, to configure them for their workflows end to end. And so the models learn with feedback. They get better with feedback. They get smarter. 
And then they're eventually able to even guide non-experts to do tasks they were not able to do. We're really excited. And we actually were lucky to start to benchmark the 250s earlier this year. And hey, we want to solve a couple of hard problems, scientific problems. And we were like, hey, are we going to get long context? And check. OK, so are we going to be able to train larger models? Are we able to serve larger models on fewer chips? And as we saw, the ease of using the software was also very pleasant. And then we saw how things were progressing. For example, I think in two months, I believe, FlashAttention, which is a critical component to actually scale to longer sequences, appeared. So I was generally very happy and just impressed with the progress, and excited about the chips. Thanks so much, Ashish. And Sharon. So Sharon, Lamini has a very innovative business model, working with enterprises on their private models. Why don't you share the mission and how the experience with AMD has been? Yeah, thanks, Victor. So by way of quick background, I'm Sharon, co-founder and CEO of Lamini. Most recently, I was computer science faculty at Stanford, leading a research group in generative AI. I did my PhD there also under Andrew Ng and taught about a quarter million students and professionals online in generative AI. And I left Stanford to pursue Lamini and co-found Lamini on the premise of making the magical, difficult, expensive pieces of building your own language model inside an enterprise extremely accessible, easy to use so that companies who understand their domain-specific problems best can be the ones who can actually wield this technology and, more importantly, fully own that technology. In just a few lines of code, you can run an LLM and be able to imbue it with knowledge from millions of documents, which is 40,000 times more than hitting Claude 2 Pro on that API. So just a huge amount of information can be imbued into this technology using our infrastructure. And more importantly, our customers get to fully own their models. For example, NordicTrack, one of our customers that makes all the ellipticals and treadmills in the gym, and their parent company, iFit, they have over 6 million users on their mobile app platform. And so they're building an LLM that can actually create this personal AI fitness coach imbued with all the knowledge they have in-house on what a good fitness coach is. And it turns out it's actually not a professional athlete. They tried to hire Michael Phelps, did not work. So they have real knowledge inside of their company and they're imbuing the LLM with that so that we can all have personal fitness trainers. So we're very excited to be working with AMD. We actually have had an AMD cloud in production for over the past year on the MI200 series, so MI210s and MI250s. And we're very excited about the MI300s. And I think something that's been super important to us is that with Lamini software, we've actually reached software parity with CUDA on all the things that matter with large language models, including inference and training. And I would say even beyond CUDA. We have reached beyond CUDA for things that matter for our customers. So that includes higher memory; higher memory or higher capacity means bigger models. And our customers wanna be able to build bigger and more capable models. 
And then a second point, which Lisa kind of touched on earlier today, is that these machines, these chips, given higher bandwidth, can actually return results with lower latency, which matters for the user experience, certainly for a personal fitness coach, but for all of our customers as well. Super exciting, that's great. Great. So, Ion, back to you, changing this up a little bit. So, you heard that several key components of ROCm are open source. And we did that for rapid adoption and also to get more enhancements from the community, both open source and AI. So what do you think about this strategy and how do you think this approach might help some of the companies that you've founded? So obviously, given my history, I really love open source. I love the open source ecosystem, and over time we have tried to make our own contributions. And I think one thing to note is that many of the generative AI tools today are open source. And we are talking here about Hugging Face, about PyTorch, Triton, like I mentioned, vLLM, Ray, and many others. And many of these tools can actually run on the AMD ROCm stack today. And this makes ROCm another key component of the open source ecosystem. And I think this is great. And in time, I'm sure, and actually quite fast, the community will take advantage of the unique capabilities of AMD's MI250 and MI300X to innovate and to improve the performance of all these tools which are running at a higher level of the generative AI stack. Great, and that's our purpose and aim, so I'm glad to hear that. So I'm gonna, out of order execution, jump over to Sharon. So Sharon, what do you think about how AI workloads are evolving in the future? And what role do you think Instinct GPUs and ROCm, since you have great experience with them, can play in that future of AI development? Okay, so maybe a bit of a spicy take. I think that GOFAI, good old-fashioned AI, is not the future of AI. And I really do think it's LLMs, or some variant of LLMs, these models that are actually able to soak up all this general knowledge that is missing from these traditional algorithms. And we've seen this across so many different algorithms in our customers already. Those who are even at the bleeding edge of recommendation systems, forecasting systems, classification, are even using this because of that general knowledge that it's able to learn. So I think that's the future. It's maybe more known as Software 2.0, coined by my friend, Andrej Karpathy. And I really do think Software 2.0, which is hitting these models time and time again instead of writing really extensive software inside a company, will be supporting enterprises 2.0, meaning enterprises of the future, of the next generation. And I think the AMD Instinct GPUs are critical to basically supporting, ubiquitously supporting, the Software 2.0 of the future. And we absolutely need compute to be able to run these models efficiently, to run lots of these models, more of these models, and larger models with greater capabilities. So overall, very excited with the direction of not only these AI workloads, but also the direction that AMD is taking in doubling down on these MI300s that, of course, can take on larger models and more capable models for us. Awesome. So Ashish, we'll finish up with you and I'll give you the same kind of question. So what do you think about the future of AI workloads, and how do you think our GPUs and ROCm can play into how you're driving things at Essential? Yep. 
So I think that we have to improve reasoning and planning to solve these complex tasks. Take an analyst: they want to absorb an earnings call and figure out how they should revise their opinion, whether to invest in a company, or what recommendations they should provide. It's actually gonna take reasoning over multiple steps. It's gonna take ingesting a large document and being able to extract information from it, apply their models, actually ask for information when they don't have any, get world knowledge, but also maybe do some outside reasoning and planning there. So when I look at the MI300, with its very large HBM and high memory bandwidth, I think of what's gonna be unlocked, which capabilities are going to be improved, and what new capabilities will be available. So I mean, even with what we have today, just imagine a world where you can process long documents or you can make these models much more accurate by adding more examples in the prompt. But imagine complete user sessions where you can maintain model state, and how that would actually improve the end-to-end user experience, right? And I think that we're moving to a kind of architecture where what typically happens in inference, a lot of search, is now gonna go into training, where the models are gonna explore thousands of solutions and eventually pick the one that's actually the best option, the best solution for the goal. And that's good, and definitely the large HBM and high bandwidth is gonna not only be important for serving large models with low latency for a better end-to-end experience, but also for some of these new techniques that we're just exploring that are gonna improve the capabilities of these models. So very excited about the new chip and what it's gonna unlock. Great, thank you, Ashish. Ion, Ashish, Sharon, this has been really terrific. Thank you so much for all the great insights you have provided us. Thank you. And thank you for joining us today. Thank you. Thank you. Thank you. Thank you. It's just so exciting to hear what companies like Databricks, Essential AI, and Lamini are achieving with our GPUs, and just super thrilled that their experience with our software has been so smooth and really a delight. So you can tell, they see absolutely no barriers, right? And they're extremely motivated to innovate on AMD platforms. Okay, to sum it up, what we delivered over the past six months is empowering developers to execute their mission and realize their vision. We'll be shipping ROCm 6 very soon. It's optimized for LLMs, and together with the MI300X, it's gonna deliver an 8x gen-on-gen performance improvement and higher inference performance than the competition. We have 62,000 models running on Instinct today and more models will be running on the MI300 very soon. We have very strong momentum, as you can see, in the ecosystem, adding OpenAI Triton to our extensive list of industry-standard frameworks, models, runtimes, and libraries. And you heard from the panel, right? Our tools are proven and easy to use. Innovators are advancing the state of the art of AI on AMD GPUs today. ROCm 6 and the MI300X will drive an inflection point in developer adoption, I'm confident of that. We're empowering innovators to realize the profound benefits of pervasive AI faster on AMD. Thank you. And now I'd like to invite Lisa back on the stage. Thank you, Victor. And weren't those innovators great? 
I mean, you love the energy and just all of the thought there. So look, as you can see, the team has really made great, great progress with ROCm and our overall software ecosystem. Now, as I said though, we really want broad adoption for MI300X. So let's go through and talk to some additional customers and partners who are early adopters of MI300X. Our next guest is a partner really at the forefront of GenAI innovation, working across models, software and hardware. Please welcome Ajit Matthews of Meta to the stage. Hello, Ajit, it's so nice of you to be here. We're incredibly proud of our partnership together. Meta and AMD have been doing so much work together. Can you tell us a little bit about Meta's vision in AI? Because it's really broad and key for the industry. Absolutely, thanks Lisa. We are excited to partner with you and others and innovate together to bring generative AI to people around the world at scale. Generative AI is enabling new forms of connection for people around the world, giving them the tools to be more creative, expressive and productive. We are investing for the future by building new experiences for people across our services and advancing open technologies and research for the industry. We recently launched AI stickers, image editing, Meta AI, which is our AI assistant that spans our family of apps and devices, and lots of AIs for people to interact with in our messaging platforms. In July, we opened access to our Llama 2 family of models and, as you've seen, we have been blown away by the reception from the community, who have built some truly amazing applications on top of them. We believe that an open approach leads to better and safer technology in the long run, as we have seen from our involvement in the PyTorch Foundation, the Open Compute Project, and across dozens of previous AI model and data set releases. We're excited to have partnered with the industry on our generative AI work, including AMD. We have a shared vision to create new opportunities for innovation in both hardware and software to improve the performance and efficiency of AI solutions. That's so great, Ajit. We completely agree with the vision. We agree with the open ecosystem and that really being the path to get all of the innovation from all the smart folks in the industry. Now, we've collaborated a lot on the product front as well, both EPYC and Instinct. Can you talk a little bit about that work? Yeah, absolutely. We have been working together on EPYC CPUs since 2019 and most recently deployed Genoa- and Bergamo-based servers at scale across Meta's infrastructure, where they now serve many diverse workloads. But our partnership is much broader than EPYC CPUs, and we have been working together on Instinct GPUs starting with the MI100 in 2020. We have been benchmarking ROCm and working together on improvements for its support in PyTorch across each generation of AMD Instinct GPU, leading up to MI300X now. Over the years, ROCm has evolved, becoming a competitive software platform due to optimizations and ecosystem growth. AMD is a founding member of the PyTorch Foundation and has made a significant commitment and investment in PyTorch, providing day zero support for PyTorch 2.0 with ROCm, torch.compile, torch.export, all of those things are great. 
We have seen tremendous progress on both Instinct GPU performance and ROCm maturity and are excited to see ecosystem support grow beyond PyTorch 2.0, like OpenAI Triton and today's announcement with respect to AMD being a supported backend, that's great, FlashAttention-2 is great, Hugging Face, great, and other industry frameworks. All of these are great partnerships. It really means a lot to hear you say that, Ajit. I think we also view that it's been an incredible partnership. I think the teams work super closely together, that's what you need to do to drive innovation. And the work with the PyTorch Foundation is foundational for AMD, but really for the ecosystem as well. But our partnership is very exciting right now with GPUs, so can you talk a little bit about the 300X plans? Oh, here we go. We are excited to be expanding our partnership to include Instinct MI300X GPUs in our data centers for AI inference workloads. Thank you so much. So, just to give you a little background, MI300X leverages the OCP Accelerator Module standard and platform, which has helped us adopt it in record time. In fact, MI300X is trending to be one of the fastest design-to-deployment solutions in Meta's history. We have also had a great experience with ROCm, and the performance it is able to deliver with MI300X. The optimizations and the ecosystem growth over the years have made ROCm a competitive software platform. As model parameters increase and the Llama family of models continues to grow in size and power, which it will, the MI300X with its 192 GB of memory and higher memory bandwidth meets the expanding requirements for large language model inference. We are really pleased with the ROCm optimizations that AMD has done, focused on the Llama 2 family of models on MI300X. We are seeing great, promising performance numbers, which we believe will benefit the industry. So, to summarize, we are thrilled with our partnership and excited about the capabilities offered by the MI300X and the ROCm platform as we start to scale their use in our infrastructure for production workloads. That is absolutely fantastic, Ajit. Thank you, Lisa. Thank you so much. We are thrilled with the partnership and we look forward to seeing lots of MI300Xs in your infrastructure. So, thank you for being here. That's good. Thank you. So, super exciting. We said cloud is really where a lot of the infrastructure is being deployed, but enterprise is also super important. So, when you think about the enterprise right now, many enterprises are actually thinking about their strategy. They want to deploy AI broadly across both cloud and on-prem, and we're working very closely with our OEM partners to bring very integrated enterprise AI solutions to the market. So, to talk more about this, I'd like to invite one of our closest partners to the stage, Arthur Lewis, President of Dell Technologies Infrastructure Solutions Group. Hey, welcome, Arthur. I'm so glad you could join us for this event. And Dell and AMD have had such a strong history of partnership. I actually also think, Arthur, you have a very unique perspective of what's happening in the enterprise, just given your purview. So, can we just start with giving the audience a little bit of a view of what's happening in enterprise AI? Yeah, Lisa, thank you for having me today. We are at an inflection point with artificial intelligence. Traditional machine learning and now generative AI are a catalyst for much greater data utilization, making the value of data tangible and therefore quantifiable. 
Data, as we all know, is growing exponentially. A hundred zettabytes of data was generated last year, more than doubling over the last three years. And IDC projects that data will double again by 2026. And it is clear that data is becoming the world's most valuable asset. And this data has gravity. 83% of the world's data resides on-prem, and much of the new data will be generated at the edge. Yet customers are dealing with years of rapid data growth, multiple copies on-prem and across clouds, and proliferating data sources, formats, and tools. These challenges, if not overcome, will prevent customers from realizing the full potential of artificial intelligence and maximizing real business outcomes. Today, customers are faced with two suboptimal choices. Number one, stitch together a complex web of technologies and tools and manage it themselves, or two, replicate their entire data estate in the public cloud. Customers need and deserve a better solution. Our job is to bring artificial intelligence to the data. That's great perspective, Arthur. And that 83% of the data and where it resides, I think, is something that sticks in my mind a lot. Now let's move to a little bit of the technology. I mean, we've been partnering together to bring some great solutions to the market. Tell us more about what you have planned from a tech standpoint. Well, today's an exciting day. We are announcing a much-anticipated update to our PowerEdge XE9680 family, the fastest-growing product in Dell ISG history, with the addition of AMD's Instinct MI300X accelerator for artificial intelligence. Effective today, we are going to be able to offer a new configuration of eight MI300X accelerators, providing 1.5 terabytes of coherent HBM3 memory and delivering bandwidth of 5.3 terabytes per second. This is an unprecedented level of performance in the industry and will allow customers to consolidate large language model inferencing onto a smaller number of servers, while providing for training at scale and also reducing complexity, cost, and data center footprint. We are also leveraging AMD's Instinct Infinity Platform, which provides a unified fabric for connecting multiple GPUs within and across servers, delivering near-linear scaling and low latency for distributed AI. Further, and there's more. Through our collaboration with AMD on software and open source frameworks, which Lisa, you talked a lot about today, including PyTorch and TensorFlow, we can bring seamless services for customers and an out-of-the-box LLM experience. We talked about making it simple. This makes it incredibly simple. And we've also optimized the entire stack with Dell storage, specifically PowerScale and ObjectScale, providing ultra-low-latency Ethernet fabrics, which are designed specifically to deliver the best performance and maximum throughput for generative AI training and inferencing. This is an incredibly exciting step forward. And again, effective today, Lisa, we're open for business, we're ready to quote, and we're taking orders. I like the sound of that. Look, it's so great to see how this all comes together. Our teams have been working so closely together over the last few years and definitely over the last year. Tell us though, there's a lot of co-innovation and differentiation in these solutions. So just tell us a little bit more about that. Well, our biggest differentiator is really the breadth of our technology portfolio at Dell Technologies. 
Products like PowerScale, with our OneFS file system for unstructured data storage, have been helping customers in industries like financial services, manufacturing, and life sciences solve the world's most challenging problems for decades as the complexity of their workflows and the scale of their data estate increases. And with AMD, we are bringing these components together with open networking products and AI fabric solutions, taking the guesswork out of building tailored gen AI solutions for customers of all sizes, again, making it simple. We have both partnered with Hugging Face to ensure transformers and LLMs for generative AI don't just work on our combined solutions but are optimized for AMD's accelerators and are easy to configure and size for workloads with our products. And in addition to that, with Dell Validated Designs, we have a comprehensive and growing array of services and offerings that can be tailored to meet the needs of customers, from a complimentary gen AI strategy consultation all the way up to a fully managed solution for generative AI. That's fantastic, Arthur. Great set of solutions, love the partnership and love what we can do for our enterprise customers together. Thank you so much for being here. Thank you for having me, Lisa. Yeah. Our next guest is another great friend. Supermicro and AMD have been working together to bring leadership computing solutions to the market for many years based on AMD EPYC processors as well as Instinct accelerators. Here to tell us more about that, please join me in welcoming CEO Charles Liang to the stage. Congratulations. Thank you so much. Hello, Charles. For a successful launch. Yeah, thank you so much for being here. I mean, Supermicro is really well known for building highly optimized systems for lots of workloads. We've done so much together. Can you share a little bit about how you're approaching gen AI? Thank you. Our building block solutions are based on a modularized design. That enables Supermicro to design products quicker than others and deliver products to customers quicker, better leverage inventory, and provide better service. And thank you for our close relationship and all the support. That's why we are able to bring products to market as soon as possible. Well, I really appreciate that our teams also work very closely together. And we now know that everybody is calling us for AI solutions. You've built a lot of AI infrastructure. What are you seeing in the market today? Oh, the market continues to grow very fast. The only limitation is- Very fast, right? Very fast. Maybe more than very fast. So all we need is just more chips. I know. So today, across the USA, the Netherlands, Taiwan, and Malaysia, we have more than 4,000 racks per month of capacity, and customers are facing not-enough-power, not-enough-space problems. So our rack-scale building block solutions, optimized for hybrid air and free-air cooling or for liquid cooling, can help customers save energy, up to 30 or even 40%. And that allows customers to install more systems within a fixed power budget: same power, same systems, but less energy cost. So with all of those, together with our rack-scale building block solution, we install a whole rack, including CPU, GPU, and storage, switches, firmware, management software, and security functions. And when we ship to the customer, the customer simply plugs in two cables, a power cable and a data cable, and it's ready to run, ready to go online. 
For liquid cooling customers, for sure they also need water tubing. So customers can easily get online as soon as chips are available. Yeah, no, that's fantastic. Thank you, Charles. Now, let's talk a little bit about MI300X. What do you have planned for MI300? Okay, the big product. We have products based on MI300X: an 8U optimized for air cooling, and a 4U optimized for liquid cooling. For air cooling, per rack we support up to 40 or 50 kilowatts. For liquid cooling, we support up to 80 or even 100 kilowatts. And it's all rack-scale plug and play. So when customers need it, once we have chips, we can ship to customers quicker. That sounds wonderful. Well, look, we appreciate all the partnership, Charles, and we will definitely see a lot of opportunity to collaborate together on generative AI. So thank you so much. Thank you so much. Thank you. Okay, now let's turn to our next guest. Lenovo and AMD have a broad partnership as well that spans from data center to workstations and PCs, and now to AI. So here to tell us about this special partnership, please welcome to the stage Kirk Skaugen, EVP and President of the Infrastructure Solutions Group at Lenovo. Hello, Kirk. Thank you so much for being here. We truly appreciate the partnership with Lenovo. You have a great perspective as well. Tell us about your view of AI and what's going on in the market. Sure. Well, AI is not new for Lenovo. We've been talking and innovating around AI for many years. We just had a great Supercomputing conference, where we're the number one supercomputer provider on the TOP500, and we're proud that IDC just ranked us number three in AI server infrastructure in the world as well. So it's not new to us, but you were at Tech World, so thanks for joining us in Austin. We're trying to help shape the future of AI from the pocket to the edge to the cloud, and we've had this kind of concept of AI for all. So what does that mean? Pocket meaning Motorola smartphones and AI devices, and then all the way to the cloud with our ODM Plus model. So our collaboration with our customers is really to accelerate AI adoption, and we recently announced another billion dollars on top of the original $1.2 billion we announced a few years ago to deliver AI solutions to businesses of all sizes, from the smallest business to the largest cloud. So we believe that generative AI will ultimately be a hybrid approach, and fundamentally we do want to bring AI to the data. I think one of the most exciting things for me is, I think like Arthur said, right, we'll see data doubling in the world over the next few years. 75% of that compute is moving to the edge, and today we're only computing 2% of it, so we're throwing away 98%. So more data is going to be created in the next few years than in the entire history of the world combined, and together we're bringing AI to the edge with the ThinkEdge SE455 that we recently announced. We think that there's kind of three views of generative AI, public AI, private AI, and personal AI, and the key for us is protecting privacy and addressing data security. So public AI where you'd use obviously public data, enterprise AI where you'd use only your enterprise data within your firewall, and then on things like an AI PC, things that you choose to have only on your device, whether that's a phone, a tablet, or a PC. Yeah, no, no, it's a very comprehensive vision, and we see it very much the same way. Now, you talked a lot about your AI strategy at Tech World, and you had some key pillars there. 
Do you want to just tell us a little bit more about that? Yeah, so I think there's three fundamental pillars of our AI vision and strategy. First, we have an AI product roadmap, I think that's second to none, from a rich smart device portfolio, and we'll talk about AI PCs probably more in another day, smartphones and tablets. Then we have a huge array now of over 70 AI-ready server and storage infrastructure products, and then we've recently launched a whole set of solutions and services around that as well. So more than 70 products, and we'll talk about the new ones we're announcing today, which are very exciting. The second thing is we have something called an AI innovators program. What's really daunting to people is there's over 16,000 AI startups out there. So if you have an IT department of a few dozen people, how do you even start? So we've gone and scoured the earth, we've found 65 ISVs, 165 solutions where we've optimized them on top of Lenovo infrastructure for some of the key verticals, and are delivering kind of simplified AI to the customer base. And then at Tech World, we launched a comprehensive set of professional services. Now Lenovo, more than 40% of our revenue is non-PC, so we're transforming into data center and services. So we're doing everything in the AI from just basic customer discovery of what you can do if you're a stadium, what are the best-in-class stadium solutions if you're a fast food chain, if you're a supermarket, all the way to AI adoption. And then even from a sustainability perspective, things like asset recovery services to make sure you have a sustainable AI journey as well. Yeah, I know it makes a lot of sense. And you know, gen AI and large language models is sort of the defining moment for us right now. You're spending a lot of time with customers. What are you hearing from them and what are their challenges? Yeah, so I think the key message is that customers need help in simplifying their AI journey. I mean, there's so much coming at them. So our investments in that $2 billion they talked about are really expanding our AI-ready portfolio to deliver fully integrated systems that bring AI-powered computing to everywhere data is created, especially the edge, and helping businesses easily and efficiently deploy generative AI applications. We're also hearing that customers want choice. Choice in systems, choice in software, choice in services, and definitely large language models and model training are creating a lot of buzz. But over time, I think we all know inference is gonna become the dominant AI workload as data flows from these billions of connected devices at the edge. So generative AI from our perspective, like you said, I think in your opening comments, needs high-performance compute, large and fast memory, and a software stack to support the leading AI ecosystem solution. So with that, I believe Lenovo and AMD are really uniquely positioned to take advantage of these trends. Yeah, absolutely. And our teams are doing a lot of work together and working closely on MI300X. Tell us more about your plans. Well, we have a long proven track record as a PC company and as a data center company of bringing Ryzen AI to our ThinkPads, and we're committed to being time to market on large language models, on inferencing, and we're working with AMD to develop our next-gen AI product roadmap and our solution portfolios. So we're incredibly excited today about the addition of the MI300X to the Lenovo ThinkSystem platform. It's gonna be very exciting. 
Thank you. Thank you. So we're committed to be time to market with a dual-EPYC 8 GPU MI300X and have a lot of customer interest on that. So bottom line, from edge to cloud, we are incredibly excited about what's ahead for us. We're gonna have all of this available as a service through our Lenovo TruScale as well. So you only have to pay for what you need. So as we move to an asset service model, everything we talked about today will be available through that as well. So thank you very much and look forward to continuing the collaboration. Absolutely, Kirk. Thank you so much. Thanks for the partnership. All right, thank you. So that's great. Big thank you to Kirk and Arthur and Charles for all the work that we're doing together to really bring MI300X to our customers. It really does take an entire ecosystem. We're very proud of actually the broad OEM and ODM ecosystem that we have brought together to bring a wide range of MI300X solutions to market in 2024. And in addition to the OEM and ODM ecosystem, we're also significantly expanding our work with some of these specialized AI cloud partners. So I'm happy to say today that all of these partners are adding MI300X to their portfolio. And what's important about this is it will actually make it easier for developers and AI startups to get access to MI300X GPUs as soon as possible with a proven set of providers who each have their unique value and capabilities. So that tells you a little bit about the ecosystem that we're putting together for MI300X. Now, we've given you a lot of information already, but what is very, very important is not just the hardware and the software and all of our customer partnerships, but it's also the rest of the system partnerships. So now let me welcome to the stage Forrest Norrod to talk more about our AI networking and high-performance computing solutions. Thank you, Lisa. Good morning. So far, we've talked about the amazing GPU and open software ecosystem that AMD is building to power generative AI systems. But there's a third element that's equally important to the performance and scalability of these large AI deployments, and that's networking. The compute required to train the most advanced models has increased by a factor of 50 billion over the past decade. While GPU performance has also increased, what that performance demand means is we need many GPUs in order to deliver the required total performance. Leading AI clusters are now tens of thousands of GPUs, and that's only going to increase. Well, so the first way we've scaled to meet that demand is within the server. A typical server has perhaps a couple of high-performance x86 CPUs and perhaps eight GPUs. You've seen that today. These are interconnected with a high-performance, low-latency, non-blocking local fabric. In the case of NVIDIA, that's NVLink. For AMD, that's Infinity Fabric. Both have high signaling rates, low latency, both are coherent. Both have demonstrated the ability to offer near-linear scaling performance as you increase the number of GPUs, and both have been proprietary, effectively only supported by the companies that created them. I'm pleased to say that today, AMD is changing that. We are extending access to the Infinity Fabric ecosystem to strategic partners and innovative companies across the industry. Doing so allows others to innovate around the AMD GPU ecosystem to the benefit of customers and the entire industry. You'll hear more about this from one of our partners in a few minutes and much more on this initiative next year. 
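[Editor's note] Forrest's point about near-linear scaling across the eight GPUs in a node is what frameworks lean on when they shard work over the local fabric. As a minimal, hedged sketch (not AMD's code): a PyTorch distributed data-parallel loop, assuming a ROCm or CUDA build of PyTorch, where the "nccl" backend maps to RCCL on Instinct GPUs; the gradient all-reduce traffic is what rides Infinity Fabric or NVLink inside the box.

```python
# Minimal data-parallel sketch: one process per GPU in a single node, with
# gradients synchronized over the local fabric (Infinity Fabric or NVLink).
# Assumes a ROCm or CUDA build of PyTorch; under ROCm the "nccl" backend is
# backed by RCCL. Launch with: torchrun --nproc_per_node=8 train_ddp.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")        # RCCL under ROCm
    local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
    torch.cuda.set_device(local_rank)              # HIP devices use the same API

    model = torch.nn.Linear(4096, 4096).to(local_rank)
    model = DDP(model, device_ids=[local_rank])    # all-reduce over the local fabric
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                            # gradient all-reduce happens here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

The near-linear scaling Forrest describes shows up here as the all-reduce cost staying small relative to compute as GPUs are added within the node.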
But beyond the node, we still need to connect and scale to much larger numbers. We need fabrics to connect the servers to one another, welding them into one resource. Now, there are usually two networks connected to each of these GPU servers. A traditional ethernet network used to connect the server to the rest of the data center traditional infrastructure, and more importantly, a backside network to interconnect the GPUs, allowing them to share parameters, results, activations, and coordinate in the overall training and inference tasks. When we're connecting thousands of nodes like we do in AI systems, the network is critical to overall performance. It has to deliver fast switching rates at very low latency. It must be efficiently scalable so that congestion problems don't limit performance. And in AMD, we believe it must also be open, open to allow innovation. Today, there are two options for the backend fabric, InfiniBand or Ethernet. At AMD, we believe Ethernet is the right answer. It's a high-performance technology with leading signaling rates. It has extensions such as RoCE and RDMA to efficiently move data between nodes, a set of innovations developed for leading supercomputers over the years. It's scalable, offering the highest-rate switching technology from leading vendors such as Broadcom, Cisco, and Marvell. And we've seen tremendous innovation recently in advanced congestion control to deal with the issues of scale effectively. And most of all, it's open. Open means companies can extend ethernet, innovating on top as needed to solve new problems. We've seen that from Hewlett Packard Enterprise with their Slingshot technology, which powers the network at the heart of Frontier, the world's fastest supercomputer, enabling it to achieve exascale performance. And we've seen Google and AWS, who run some of the largest clusters in the world, develop their own ethernet extensions. And finally, maybe most importantly, we've seen the industry come together to create the Ultra Ethernet Consortium and Standard, where leaders across the field have united to drive the future of ethernet and ensure it's the best high-performance interconnect for AI and HPC. And we're proud to welcome to the stage today some of those networking leaders. Andy Bechtolsheim from Arista, Jas Tremblay from Broadcom, and Jonathan Davidson from Cisco. Welcome, gentlemen. It's not often that we have such a panel of ethernet experts on the stage. But before we jump right into ethernet, perhaps we can talk a little bit about the work of enabling an ecosystem for AI solutions, and what that looks like, and why is it so important to have an open approach? And maybe, Jonathan, you can start. Sure, absolutely. Well, first of all, congratulations on all the announcements today. We look at how ethernet is so critical, because I remember back in the day doing testing on 10 megabit ethernet interoperability. We're now at 400 gig, 800 gig. We have line of sight to 1.6 terabit. It is absolutely ubiquitous across the industry, and it's also interoperable. It's a beautiful thing. So that open standard is really important for us to be able to make this successful. Absolutely. And, Jas, your thoughts as well. No, I 100% agree. Forrest, you and I share a vision of the power of the data center ecosystem. You think about a data center, you've got thousands of companies coming together to work as one, and this is really enabled by open standards and a code of conduct that we shall interop. 
We're gonna make things work together across companies, in some cases, across competitors, and I'm especially excited about the work that you and I have been doing on Infinity Fabric xGMI, and we wanna let the industry know that the next generation of Broadcom PCIe switches, which are used as the internal fabric inside AI servers, are gonna support Infinity Fabric xGMI, and we'll be sharing more details around that over the next few quarters. But I think it's important that we offer choices and options to customers, and that we come together and jointly innovate. I completely agree, and Andy, you've long been a proponent of open. Yeah, well, open standards have been the driving force for a lot of the innovation throughout the industry's history, but nowhere is this more true than in the case of ethernet, where the incredible progress we've seen for the last 40 years would not have happened without the contributions of many, many ecosystem participants, including the companies that are represented here on this stage. Absolutely, well, okay, so since this is a panel of ethernet luminaries, let's talk about ethernet in particular. What are the advantages of ethernet for AI? What are the advantages of ethernet in general, and how are customers using it today? We'll talk about the future in a minute, but let's reflect on current state. Maybe, Andy, you can start out. Yeah, so ethernet, at least to me, is the clear choice for AI fabrics, and for very basic reason, it doesn't have a scalability limit. It can truly support not just 10,000s of nodes today, but 100,000s, perhaps even a million nodes in the future, and there is no other network technology that has that attribute, and without that scalability, you're just boxing yourself in. Yeah, very true, and Jonathan, I know you guys have been working quite a bit on AI networking systems as well. Maybe you could amplify. Absolutely. Well, for today specifically, we see the majority of hyperscalers, as you've had some of them on the stage today, are either using ethernet for AI fabrics, or there's a high desire for them to move to ethernet for the AI fabrics, and so that requires a lot of collaboration from the folks up here on stage to make that happen. We also have been helping customers deploy in the past their AI networks for enterprise use cases globally, and it might have started more in the financial trading sector in the past, but we're seeing a tremendous amount of interest in use cases for that whole system and how you pull all those things together from the network, the GPU, the NIC, the DPU, all the way to how you wrap the software around that to really make it simple and understand how things are working, and when they're not working, why, and making that simple for them to do that as well. Absolutely, and Jas, I know, well, all of us have been working together in deploying ethernet-based solutions for AI leaders today, but I mean, we've been working with the two gentlemen on the end on switching, but Jas, maybe you can reflect on the NIC as well. I think the NIC is critical. People want choices, and we need to move the innovation even faster in the NIC, and you'll see much more linkages between the NIC and the switch, where before you had a compute domain and a network domain, and these things are really coming together, and AI is a driving force of that, because the complexity is going up so much. Yeah, absolutely. Well, okay, so let's talk about the future a little bit. 
You know, the Ultra Ethernet Consortium is all three, all four companies on stage are founding members, and there's many others that have joined. You know, UEC is one of the fastest growing, or maybe the fastest growing consortium under the Linux Foundation, which has been great to see. It's gonna shape, I think, UEC is gonna shape the future of AI networking, and so let's unpack that, because I think that's a critical topic for folks. And maybe, Jas, why don't you go ahead and start off. Yeah, so first of all, ethernet is ready today for AI, but we need to continue to innovate, and UEC started with a group of eight companies, including four of our companies here, cloud providers, system providers, and semiconductor providers, coming together around a common vision, and the vision is AI networks need to be open, standards-based, we need to offer choices, and we need to enhance them. And with that common vision, you know, the engineers we've assigned from other companies really got together and rolled up their sleeves, and the innovation happened extremely quickly. It's quite exciting, actually. And one of the things that I'm most excited about this is we're not building something new. We are jointly going to enhance ethernet that's existed for 50 years. So it's not starting from scratch, it's enhancing, it's recognizing that ethernet is what people want. We just need to continue to enhance it and making this open and standards-based. Absolutely, and Jonathan, I know Cisco's been a huge proponent of UEC as well. Maybe you can reflect on your thoughts of where this is going. Absolutely, well, I think that UEC absolutely is very critical for Cisco, everyone on the panel, and the whole industry so that we can continue to drive that movement towards open. It always takes time. You gotta debate what are the right technical way to solve things, but I think that overall it's moving in the right direction. What I see what's happening here is that we're gonna have to have interoperability in more than just one area. Andy, I wanna talk about LPO and all the things that we need to do there to make that actually happen. And what's happening at UEC is another important part. And what I see what's happening between now and when the first standard comes out is really a coalition of the willing. Like, how do we get all of us together to drive towards those open interfaces, whether it be at the ethernet layer, whether it be at things that you need to plug into it, how the GPUs connect into that, how you're actually gonna spray traffic across a very broad radix, how you're gonna make sure you can reorder packets in a consistent way. These are all things that we need to make sure that we are driving towards from an interoperability perspective. And we've got our own silicon, we've got optics, but we also are in the component business at Cisco. And so we sell those things. Hyperscalers might wanna just buy pieces from us, like the silicon, and enterprises may want the full system. But we wanna make sure that it's absolutely 100% interoperable in every single environment. Absolutely. And Andy, maybe you can hone in a little bit more. I mean, I think many people that aren't familiar with networking may think, hey, how hard can this be? We're just shuffling bits around between systems, but there's a lot of problems to solve. Yeah, so UEC is in fact solving a very important technical problem, which is the way we describe it is modern RDMA at scale. And this has not been solved before. 
To be clear, you know, RoCE today exists, but it has its limitations. And it does take an ecosystem effort, and it involves in particular the adapter and NIC silicon vendors, but also the whole end-to-end interoperability of that architecture. We're very excited to be part of this. We're not in the NIC business ourselves, but this is absolutely key to enable scaling of RDMA across 100,000s, if not a million nodes. Yeah, absolutely. And when you look at what's being predicted in terms of million-node systems, hundreds of thousands up to a million nodes, I mean, we all have our work cut out for us, but working together, I know we can solve the problems. Well, guys, thanks so much for coming to talk to us today. I'd like to thank you all for your partnership in this journey, and thank you all for coming today. Thanks very much. Thanks, guys. Thank you so much. I'd really like to thank our partners from Arista, Broadcom, and Cisco for attending and for their partnership in driving this critical third leg that determines the performance of AI systems. Now, let's turn our focus to high-performance computing, the traditional realm of the world's largest systems. AMD has been driving HPC technology for many years. In 2021, we delivered the MI250, introducing the third-generation Infinity architecture. It connected an EPYC CPU to the MI250 GPU through a high-speed bus, Infinity Fabric. That allowed the CPU and the GPU to share a coherent memory space and easily trade data back and forth, simplifying programming and speeding up processing. But today, we're taking that concept one step further, really to its logical conclusion, with the fourth-generation Infinity architecture bringing the CPU and the GPU together into one package, sharing a unified pool of memory. This is an APU, an Accelerated Processing Unit. And I'm very proud to say that the industry's first data center APU for AI and HPC, the MI300A, began volume production earlier this quarter and is now being built into what we expect to be the world's highest-performing system. Now, Lisa already showed you what our chiplet technologies make possible with the MI300X. The MI300A uses those same building blocks in a slightly different fashion. The IO die is laid down first, as before, and contains the Infinity Cache and connections to memory and IO. The XCD accelerator chiplets are bonded on top, as in the MI300X. But with the MI300A, we also take CPU chiplets leveraged directly from our fourth-generation EPYC CPUs, Genoa, and we put those on top of the IODs as well, thus bringing together our leading Zen CPU and CDNA GPU technologies into one amazing part. Finally, eight stacks of HBM3 with up to 128 gigs of capacity complete the MI300A. A key advantage of the APU is no longer needing to copy data from one processor to another, even through a coherent link, because the memory is unified, both in the RAM as well as in the cache. The second advantage is the ability to optimize power management between the CPU and the GPU. That means dynamically shifting power from one processor to another, depending on the needs of the workload, optimizing application performance. And very importantly, an APU can dramatically streamline programming, making it easier for HPC users to unlock its full performance. And let's talk about that performance. 61 teraflops of double-precision floating point, FP64. 122 teraflops of single-precision.
Combined with that 128 gigabytes of HBM3 memory at 5.3 terabytes per second of bandwidth, the capabilities of the MI300A are impressive. And they're impressive, too, when you compare it to the alternative. When you look at the competition, the MI300A has 1.6 times the memory capacity and bandwidth of Hopper. For low-precision operations like FP16, the two are at parity in terms of computational performance. But where precision is needed, MI300A delivers 1.8 times the double- and single-precision (FP64 and FP32) floating point performance. And beyond simple benchmarks, the real advantages of an APU come with the performance of real-world applications which have been tuned for the APU architecture. For example, let's look at OpenFOAM. OpenFOAM is a set of computational fluid dynamics codes widely used across research, academia, and industry. With MI300A, we see four times the performance of Hopper on common OpenFOAM codes. Now, that performance comes from several places: from higher-performance math operations as we talked about, from larger memory, and from the increased memory bandwidth. But much of that uplift really comes from that unified memory eliminating the need to copy data around the system. For tuned applications, that can deliver truly transformative performance. And I'm also proud to say that beyond performance, AMD has stayed true to its heritage, to its history of leading in power efficiency. At the node level, the MI300A has twice the HPC performance per watt of the nearest competitor. Customers can thus fit more nodes into their overall facility power budget and better support their sustainability goals. With the MI300A, we set out to help our customers advance the frontiers of research, not just run traditional HPC applications. One of the most exciting new areas in HPC is actually the convergence with AI, where AI is used in conjunction with HPC techniques to help steer simulations, thus getting much better results much faster. A great example of this is CosmoFlow. It couples deep learning with traditional HPC simulation methods, giving researchers the ability to probe more deeply and allowing us to learn more about the universe at scale. CosmoFlow is one of the first applications targeted to be run on El Capitan, which we believe will be the industry's first true two-exaflop supercomputer in double-precision floating point when it's fully commissioned at Lawrence Livermore National Labs. It's gonna be an amazing machine. So let's hear more about El Capitan and its applications for HPC and AI from our partners at LLNL and Hewlett Packard Enterprise. We expect El Capitan to be an engine for artificial intelligence and deep learning. We will recreate the experimental environment in simulation, generate lots of data, for example, and then train our artificial intelligence methods on that simulation data. El Capitan will be the most capable AI machine, and its use of APUs at this scale will be the first of its kind. As you operate these exascale-level workloads, all of those nodes talk to each other. AMD and HPE have a long legacy of partnership, and it was only natural for us to partner again for El Capitan. The MI300A can be versatile across many different workloads, and we couple it directly with our Slingshot fabric to give it high performance as it operates as a system. We work very closely with AMD and HPE to deliver the hardware and the software that's actually used by the scientists in the machine itself.
It's really that partnership together that can really go after and build these supercomputers. El Capitan will be 16 times faster than our existing machine here at Lawrence Livermore. It will enable scientific breakthroughs that we can't even imagine. We're proud to have partnered with Hewlett Packard Enterprise to design and now build this amazing system. And so I'd like to invite to the stage Trish Damkroger, the Senior Vice President and Chief Product Officer for HPC, AI and Labs from Hewlett Packard Enterprise. Thank you. Welcome, Trish. The AMD and HPE teams have been working closely together over the years to deliver some next-generation supercomputers. Most recently, of course, we've broken the exascale barrier. I gotta say that again. We've broken the exascale barrier with Frontier for Oak Ridge National Labs. And now we're looking forward to powering another exascale system, another benchmark, another record with you, with El Capitan for Lawrence Livermore National Labs, another US Department of Energy lab. Maybe you can share more with this audience about our journey together and the innovations that we've ushered in on this journey to exascale. Sure. First, I wanna echo the long partnership that we've had with AMD. Frontier continues to be the fastest computer in the world. Many doubted our ability to actually reach exascale, but we were able to achieve this feat with industry-leading liquid cooling infrastructure, next-generation high-performance interconnect with Slingshot, our highly differentiated system management and Cray programming environment software, along with the incredible MI250. With Frontier, exascale computing has already made breakthroughs in areas such as aerospace, climate modeling, healthcare, and nuclear physics. Frontier is also one of the world's top 10 greenest supercomputers. In fact, HPE and AMD have the majority of the world's top 10 energy-efficient supercomputers. I am very excited to deliver El Capitan to Lawrence Livermore. As you know, I worked there for over 15 years. El Capitan's computing prowess will fundamentally shift what the scientists and engineers will be able to achieve. El Capitan's gonna be 15 to 20 times faster than their current system. Supercomputing is truly essential to the mission of the Department of Energy. Lawrence Livermore has been at the forefront driving the convergence of HPC and AI, demonstrated by work at the National Ignition Facility and other national security programs. I'm really looking forward to continuing our journey of bringing more leadership-class systems to the world. Absolutely. I couldn't agree more, Trish. It's been a rewarding journey working together with HPE. But speaking of our shared success in building these record-breaking systems, can you tell us a bit more about El Capitan and how HPE is bringing the Instinct MI300A-powered nodes to El Capitan? Great, yes. El Capitan will feature the HPE Cray EX supercomputer with the MI300A accelerators to power large AI-driven scientific projects. The HPE Cray EX supercomputer was built from the ground up with end-to-end capabilities to support the magnitude of exascale. El Capitan nodes include the MI300A, coupled with our Slingshot fabric to operate as a fully integrated system. Supercomputing is the foundation needed for large-scale AI, and HPE is uniquely positioned to deliver this with our Cray supercomputers. El Capitan will be that engine for AI and deep learning for the Department of Energy.
They will be recreating the experimental environment in simulations and training the AI models with all of that vast amount of data. El Capitan will be one of the most capable AI systems in the world. And beyond El Capitan, we're excited to have expanded our supercomputing portfolio with the MI300A to bring next-generation accelerated compute to a broad set of customers. Yeah, so Trish, that's fantastic. And actually, let's double-click into that a little bit more. I know that there are a growing number of supercomputing customers, not just at LLNL, that are really applying AI to their projects. Can you tell us a little bit more about that? Sure, so AI undoubtedly will be the catalyst to transform scientific research. As I said earlier, supercomputing is the foundation needed to run AI. And HPE is the undisputed leader in delivering supercomputers. Some examples where AI will be fundamental on El Capitan include the National Ignition Facility, where they will be using 1D, 2D, and 3D simulations, along with trained AI models, to develop a more robust design for higher-yield fusion reactions. Just imagine fusion energy in our future. Another application is high-resolution earthquake modeling, essential for understanding building structural integrity and also for emergency planning. And one more application is bioassurance, where simulation and AI models will be key in developing rapid therapeutics. Supercomputing and AI are tools that give engineers and scientists the ability to find the unknown. I'm thrilled to be part of the journey of accelerating scientific discovery and the scale of impact it has on changing the way people live and work. Fantastic. Well, Trish, thank you. I'm so excited about the opportunities that researchers and scientists will have with the systems that we're bringing to the market together. Thanks so much. Thank you. Yeah, on behalf of AMD and the entire team, I really wanna just thank HPE and our customers for the opportunity to participate in the development of these massive systems. Because El Capitan will be an amazing machine and a real showcase for the MI300A, which defines leadership at this critical juncture as HPC and AI converge. AMD is proud of the leadership systems powered by MI300A, which will be available soon from partners around the world. I can't wait to see what researchers and scientists are gonna do with these systems. And with that, I'd like to welcome Lisa back on stage to conclude our journey today. Thank you. All right, thank you, Forrest, and thank you to all of our partners who joined us. You've heard from Victor, Forrest, and our key partners. We have significant momentum, and we're building on that for our data center AI platforms. To cap off the day, let me now talk about another important area for AMD where we're delivering leadership AI solutions, and that's the PC. Now, for PCs, we recognized several years ago that on-chip AI accelerators, or NPUs, would be very, very important for next-generation PCs. And the NPU is actually the compute engine that will enable us to reimagine what it means to build a truly intelligent and personal PC experience. At AMD, we're actually on a multi-year journey. We have a strong roadmap to deliver the highest-performance and most power-efficient NPUs possible. We were actually the first company to integrate an NPU into an x86 processor when we launched the Ryzen Mobile 7040 series earlier this year, and we integrated the XDNA architecture that actually came from our acquisition of Xilinx.
It actually took us less than a year to bring Xilinx's proven technology into our PC products. Let me tell you a little bit about XDNA. It's a scalable and adaptive computing architecture. It's built around a large compute array that can efficiently transfer the massive amounts of data required for AI inference. And as a result, XDNA is both extremely performant and also very energy-efficient, so you can run multiple AI workloads simultaneously in real time. Now, I'm happy to say that we've already shipped millions of Ryzen AI-enabled PCs into the market with all of the leading PC OEMs, and all of this provides the hardware foundation for developers to leverage this first wave of AI PCs. Now, if you look at some of the applications, today Ryzen AI powers hundreds of different AI functions, things like advanced motion tracking and sharpening to de-blurring 4K video, enabling production-level digital production capabilities with unlimited virtual cameras, all in an ultra-thin notebook for the very first time. We're also working with key software leaders like Adobe and Blackmagic, and they're using our on-chip Radeon GPU to accelerate AI-enabled editing features that dramatically improve productivity for content creators. And of course, we've worked very, very closely with Microsoft to enable Windows 11 Studio Effects on Ryzen AI. Now, today we're launching some additional capabilities. Ryzen AI 1.0 software will make it easier for developers to add advanced gen AI capabilities. With this new package, developers can create an AI-enabled application that's ready to run on Ryzen AI hardware just by choosing a pre-trained model. So for example, you can choose one of the models that are available on Hugging Face, quantize it based on your needs, and then deploy it through ONNX Runtime. So this is a major step forward when you think about the broad ecosystem that wants to run AI apps for Windows, and we can't wait to see what ISVs will do when they really capture the leadership performance that you can get from the NPU in Ryzen AI. Now, of course, we know developers always want more AI compute. So today, I'm very happy to say that we're launching our "Hawk Point" Ryzen 8040 Series Mobile Processors. Thank you. "Hawk Point" combines all of our industry-leading performance and battery life, and it increases AI TOPS by 60% compared to the previous generation. So if you just take a look at some of the performance metrics for the Ryzen 8040 Series, at the top of the stack, the Ryzen 9 8945 is actually significantly faster than the competition in many areas, delivering more performance for multi-threaded applications, 1.8x higher frame rates for games, and 1.4x faster performance across content creation applications. But when you look at the AI improvements of Ryzen 8040, you really see some substantial gains. So I talked about the additional TOPS in "Hawk Point", and what that results in is faster performance when you're running the key models. Things like Llama 2 7B run 1.4x faster, and we're also 1.4x faster on things like AI image recognition and object detection models. So all of this, what does it do? It provides faster response times and overall better experiences. Now, I really believe that we're actually at the beginning of this AI PC journey, and it's something that is really gonna change the way we think about productivity at a personal level.
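[Editor's note] The flow Lisa describes for Ryzen AI 1.0 — pick a pre-trained Hugging Face model, quantize it, deploy it through ONNX Runtime — might look roughly like the sketch below. This is a hedged illustration, not the Ryzen AI SDK itself: the Optimum export step, the dynamic-int8 quantization choice, the example model name, and the "VitisAIExecutionProvider" name are assumptions drawn from public ONNX Runtime and Ryzen AI tooling, with a CPU fallback when the NPU provider isn't present.

```python
# A minimal sketch of the described flow: export a pre-trained Hugging Face
# model to ONNX, quantize it, and run it through ONNX Runtime, preferring the
# NPU execution provider if it is available on the machine.
import onnxruntime as ort
from onnxruntime.quantization import quantize_dynamic, QuantType
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # example model

# 1. Export the pre-trained Hugging Face model to ONNX.
ORTModelForSequenceClassification.from_pretrained(model_id, export=True).save_pretrained("model_onnx")

# 2. Quantize the exported graph (dynamic int8, just as an example choice).
quantize_dynamic("model_onnx/model.onnx", "model_onnx/model_int8.onnx", weight_type=QuantType.QUInt8)

# 3. Deploy through ONNX Runtime, using the NPU provider when present.
preferred = ["VitisAIExecutionProvider", "CPUExecutionProvider"]  # NPU name assumed
providers = [p for p in preferred if p in ort.get_available_providers()]
session = ort.InferenceSession("model_onnx/model_int8.onnx", providers=providers)

tok = AutoTokenizer.from_pretrained(model_id)
inputs = tok("Ryzen AI makes on-device inference easy.", return_tensors="np")
logits = session.run(None, dict(inputs.items()))[0]
print(providers, logits.argmax(-1))
```

The point of the sketch is the shape of the workflow, not the specific model or quantization scheme; a real Ryzen AI deployment would follow AMD's own provider configuration and quantization guidance.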
So we've been working very closely with Microsoft to ensure that we are co-innovating across hardware and software to enable the next generation of AI PCs. To share more about this work, I'm pleased to welcome Pavan Davuluri, Corporate Vice President of Windows and Devices at Microsoft, to the stage. Hey, how are you? Great to be here. Pavan, thank you so much for being here. We started the show with Kevin Scott talking about the great partnership between Microsoft and AMD, and all the work we're doing on the big iron, and the cloud, and Azure. And it seemed fitting that we close the show with the other very, very important work that we're doing together on the client side. So can you tell us a little bit, Pavan, about all the great work and your vision for client AI? For sure. As you and Kevin covered, Microsoft and AMD have a long partnership together across Azure and Windows. And it's incredible to see us moving that partnership together into the next wave of technology with AI. As you shared, Lisa, for us, there are millions of PCs right now with Ryzen 7040 AI in market. And that's amazing, because these are the first x86 PCs with integrated NPUs, enabling enhanced AI experiences. You told me everybody wanted NPUs. Absolutely. And you know, right now we get to see some incredible AI features. Somebody talked about Windows Studio Effects coming to life across the scale of the ecosystem. Absolutely fantastic, I would say. Now, for us at Microsoft and for the ecosystem, our marquee AI experience is really Copilot. Similar to how the Start button is the gateway into Windows, Copilot for us is the entry point into this world of AI on the PC. It has a fundamental impact on everything we will do on a computer, from work and school to play, entertainment, and creation. You know, I completely agree, Pavan. I think Copilot is so transformational. I mean, for everyone who's had a chance to experience it, it really changes the way we do work. So let's talk about the tech that's underneath it. So, to enable Copilot and everything that we want to do on PCs, we are putting together new system architectures that really power those experiences going forward, and they really pull together the GPU, the NPU, and certainly the cloud as well. And quite honestly, we're seeing customer habits change early at this point in time, and we believe, to your point earlier, we're early in the cycle of innovation that's coming. When we have these powerful NPUs like the ones you're building, it gives us an opportunity to create apps that take advantage of both local and cloud inferencing. And to me, that's what the Windows AI ecosystem is about, and that's what we're building in partnership with you. It's designed to enable those scenarios with the ONNX Runtime, of course, and the Olive toolchain to back this up. Applications are gonna have many models, like the Llama that you mentioned, or Phi-2, running, and they will run very capably in the TOPS that we will have. And of course, not to mention the foundation models that are powered by the GPUs in the cloud. Yeah, I mean, I think this is an area where Microsoft and AMD really have a very unique position, because we have so much capability in the cloud, and we also have access to the client and the local view. Can you share a bit about how we're thinking across all of these, the cloud and local view? Yeah, with AMD, we're making it simpler to incorporate what we call the hybrid pattern, or the hybrid loop, into applications.
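[Editor's note] To make the "hybrid pattern" Pavan mentions a bit more concrete, here is a purely illustrative sketch of shifting work between client and cloud: run locally through ONNX Runtime when a non-CPU execution provider (such as an NPU) is available, otherwise call a hosted model. The endpoint URL, payload shape, and decision rule are hypothetical; this is not Microsoft's or AMD's actual hybrid-loop API.

```python
# Illustrative hybrid client/cloud inference: local path when an on-device
# accelerator shows up as an ONNX Runtime execution provider, cloud fallback
# otherwise. Endpoint, payload, and decision rule are hypothetical.
import onnxruntime as ort
import requests

CLOUD_ENDPOINT = "https://example.invalid/v1/generate"   # hypothetical endpoint

def local_accelerator_available() -> bool:
    # Treat any non-CPU execution provider as a usable local accelerator.
    return any(p != "CPUExecutionProvider" for p in ort.get_available_providers())

def infer(prompt: str, local_model_path: str = "model_int8.onnx") -> str:
    if local_accelerator_available():
        # Local path: lower latency, data stays on the device.
        session = ort.InferenceSession(local_model_path)
        # ... tokenize `prompt`, call session.run(...), decode the outputs ...
        return f"[local:{session.get_providers()[0]}] result for: {prompt}"
    # Cloud path: larger models, heavier workloads.
    resp = requests.post(CLOUD_ENDPOINT, json={"prompt": prompt}, timeout=30)
    resp.raise_for_status()
    return resp.json()["text"]

print(infer("Summarize today's announcements."))
```

In a production design the decision rule would likely weigh model size, battery, privacy requirements, and connectivity rather than just provider availability.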
And we wanna be able to load shift between the cloud and the client to provide the best of computing across both those worlds. For us, it's really about seamless computing across the cloud and the client. It brings together the benefits of local compute, things like enhanced privacy, responsiveness, and low latency, with the power of the cloud: high-performance models, large data sets, cross-platform inferencing. And so for us, we feel like we're working together to build that future where Windows is the destination for the best AI experiences on PCs. Yeah, no, I think that sounds great. Now, one of the things, though, that you definitely are always talking to me about is more TOPS. I ask for more TOPS all the time. So look, we completely believe that to enable your vision for AI experiences, we've really thought about how we actually accelerate our client AI roadmap. So I wanna share a little bit of our roadmap today. With Ryzen 7040 and 8040, we've already delivered those industry-leading NPU capabilities. But today, I'm very excited to announce that our next-gen Strix Point Ryzen processors will actually include a new NPU powered by our second-generation XDNA architecture, coming in 2024. Congratulations. Thank you. So a little bit about XDNA 2. It's designed really for leadership gen AI performance. It delivers more than three times the NPU performance of our current Ryzen 7040 series. And Pavan, I'm very happy to share, I know your teams already know this because you have the silicon, but today, Strix Point is running great in our labs and we're really excited about it. Our teams have been working really closely together to make sure that all of those great future Windows AI features run really well on Strix Point. So I can't wait to share more about that later this year. Lisa, that's awesome. And we will use every TOP you will provide us. You promised, right? Absolutely. And it's not just the size of the neural engines; it's the dramatic increase in efficiency, in performance per watt, of these next-generation NPUs. We think they'll bring a whole new level of capabilities to the market, enabling personalization in every interaction on these devices. Together with Windows, we feel like we're building that future for Copilot, where we will orchestrate across multiple apps, services, and devices, quite frankly, functioning as an agent in your life that has context and maintains context across entire workflows. So we're very excited about these devices coming to life for the Windows ecosystem. We're excited to see what developers will do with this technology. And quite frankly, at the end of the day, ultimately, what customers will do with all of this innovation. Thank you so much, Pavan. We are so excited about the partnership. We appreciate all the long-term work we're doing together and look forward to lots of great things to come. Thank you for having me, Lisa. Thank you, Pavan. Thank you. Thank you. All right, so it's been such a fun day, but now it's time for me to wrap up a bit. We've shown you a lot of new products, a lot of new platforms, a lot of new technologies that are all about taking AI infrastructure to the next level. The MI300X and MI300A accelerators are all shipping today in production. They're already being adopted by Microsoft, Oracle, Meta, Dell, HP Enterprise, Lenovo, Supermicro, and many others.
You heard from Victor how we're expanding the ecosystem of AI developers working with us, with ROCm 6 software and the open ecosystem, and that our goal is to make it incredibly easy for everyone to use Instinct GPUs. You heard from Forrest and our panel on the overall system architecture, and our work with Arista, Broadcom, and Cisco. We believe that to create this high-performance AI infrastructure, it has to be open, and that's what we're doing together for scale-out AI solutions. And then you heard what we're doing on the other side, the client part of our business, because we actually believe AI should be everywhere. So our latest Ryzen processors really extend our compute vision and our AI leadership. I hope you can see that AI is absolutely the number one priority at AMD. Our goal is to push the envelope, to bring innovation to the market, to do more than anyone thought was possible, because we believe, as wonderful as our technology is, it is about doing it together in a partner ecosystem where everybody brings their best to the market. Today is a... I want to say, on a personal level, today is an incredibly proud moment for AMD. If you think about all of the innovation, everything that we bring to the market, to be part of AI at this time, at the beginning of this era, to work with these amazing people throughout the industry, throughout the ecosystem, at AMD, I can say that I've never seen something more exciting. A very, very special thank you to all of our partners who joined us today, and thank you all for joining us.
Info
Channel: AMD
Views: 164,990
Keywords: AMD, Advanced Micro Devices, EPYC, cpu, processors, gpu, graphics, pc, together we advance_, Gaming, Server, Computer, Desktop, Laptop, Ryzen, AI
Id: tfSZqjxsr0M
Length: 130min 9sec (7809 seconds)
Published: Wed Dec 06 2023