FINALLY! Intel Joins the Generative AI Chip War (Supercut)

Video Statistics and Information

Captions
I'm excited to introduce the Intel Gaudi 3 AI accelerator for the first time. So, you ready? Intel Gaudi [Applause] [Music] 3: huge advancements in generative AI, building on the proven performance of Gaudi 2, delivering 4x the AI compute with BF16, 2x with FP8, 2x the networking bandwidth, and 1.5x the memory bandwidth for massive scale-out. A significant leap in performance for AI training, popular LLMs, multi-model support. And Intel Gaudi 3 is architected from the ground up for efficient large-scale AI computing, supporting both scale-up and scale-out configurations.

Gaudi 3 is great on benchmarks versus the H100 on top open Llama models: 50% faster on time to train, 50% better on inferencing, 40% better on inference power efficiency. A scalable solution with Gaudi 3 using industry-standard Ethernet, like we talked about before, scaling performance from one socket to an eight-accelerator UBB, a single node, up to thousands of nodes and racks.

And nearly all of the GenAI developments today are moving to higher-level environments: PyTorch frameworks and other community models from Hugging Face. The industry is quickly moving away from proprietary CUDA models. Literally a few lines of code and you're able to be up and running with industry-standard frameworks on power-performant, efficient Gaudi infrastructure.

Now let's talk a little bit more about availability, and get a bit nerdier together. Intel Gaudi 3 is available to OEMs starting this quarter of this year, in three different industry-standard form factors: the accelerator card that I was just showing off here to you; we're also adding to the Gaudi family a PCIe card; and finally our universal baseboard. Man, look at this big boy. Yeah, I always bring big brother along with me. And with it, you know, that range of solutions, and here, up to 14.6 FP8 petaflops, greater than a terabyte of HBM3E, 192 native 200-gigabit Ethernet connections in this big boy. And with our first Gaudi PCIe card, extraordinary compute
density, the same capabilities as the mezzanine card, giving our customers great flexibility: 1.8 petaflops of FP8 capabilities, 128 GB of memory in a 600-watt TDP, making it an easy adjunct to a Xeon deployment, being able to deliver industry-standard infrastructure into your data center. And Gaudi 3 eliminates vendor lock-in with standard networking solutions in your data centers and in your cloud, and with open software environments. Best of all, you ready? Huge TCO advantages for your deployment. This is a winning solution, ladies and gentlemen. Welcome to the Gaudi 3 generation.

And I'm very happy to announce the next generation of Xeon: Xeon 6 processors, the new brand for our next generation of both Efficient-core and Performance-core solutions. Xeon 6. And with the rapid rise in computing requirements and the immense data sets that we're working with, workloads are driving trade-offs for performance, efficiency, and density, and in return more demand for CPUs and cores that deliver better performance per watt. For the first time, our Xeon 6 with E-cores is uniquely addressing these challenges. And this is our first volume part on Intel 3 that we'll be bringing into the marketplace. Super excited, it's coming along very well, and we'll be moving Sierra Forest into production this quarter. And we're seeing this ability to deliver up to 2.7x better rack density and 2.5x performance-per-watt improvements.

Let's just take an example of a telco server deployment where they might have 200 racks using second-gen Xeon for their infrastructure, for their telco environment. With Xeon 6 with E-cores we can reduce that down to just 72 racks: same performance, same capabilities, and less management, less networking, less things to upgrade, less things to be supporting in the environment. And this results in over 1 megawatt of power savings in a traditional data center, or 1,300 Arizona homes, you know, now have energy sources available to them. Simply put, it frees up energy capacity, builds TCO
efficiency, improves physical space, and that enables our next-generation GenAI and P-core solutions into the marketplace.

Well, the big brother of Sierra Forest is our Xeon 6 processor, formerly known as Granite Rapids, with P-cores. And of course, right, we have one of those wafers here too. What a surprise, right? And here's our Granite Rapids Intel 3 wafer, which will be launching shortly after the Sierra Forest product, and we're quite excited to be ramping this into the marketplace this year, also on the Intel 3 process technology. So it's sort of like twins, right?

Over 60% of computing is now done in the cloud or in some cloud embodiments, but the vast majority of data is still on-prem, and indications are that 66% of that data is unused and 90% of the unstructured data is unused. What a value unlock. And this idea of LLMs and RAG gives us an extraordinary opportunity to unlock this hidden asset that you and your businesses aren't able to leverage and operate on today. And Xeon is clearly a tremendous machine for running these RAG environments and making LLMs more effective and efficient on your data. Xeon is not only able to be your database front end, but increasingly it's able to run the LLMs as well. No new management, no new networking, no new security models, no new things to learn, no proprietary networking. It just works on the industry standards that you know and love.

With that, let me introduce you to the next star of the GenAI show: the Xeon 6 P-core, or Granite Rapids, that we were looking over. So here's my Granite Rapids. Quite excited for this baby coming into the marketplace this year. But let's get started by looking at four different tech streaming services that are running on three generations of Xeon. Over here, right, we have two Gen 4 systems, we have Gen 5, and then Xeon 6 here. And what we're going to do is take one common Llama 2 70B, and in our first two systems we're going to write a haiku about Intel AI solutions, and this one's
going to be using FP16 and this one's going to be using the 4-bit that we just described. So let's kick these off, and we're going to see the comparison of just moving to the 4-bit format on the same generation of hardware. So we'll get this kicked off: "Write a haiku about Intel AI solutions." Okay, sure. And quickly running through it, and we see over here 170 milliseconds of latency, and not quite done over here yet, we're getting to it shortly. And what you'll see is about a 3x improvement in the latency just as we move to the modern data types for FP4.

Okay, now let's go from Gen 4 to Gen 5, and we're going to run the same query and we'll see how long it takes us on a Gen 5 machine. We'll have this guy run through here. So here's a haiku about Intel AI software solutions, and we're now at 150 milliseconds, or about a 3x reduction as we went from Gen 4 to Gen 5. What do you think, pretty good? And that is really good, but most would say 100 milliseconds is sort of the threshold. What do you think, can we get below 100 milliseconds? Maybe. Let's give it a try, and we'll kick this baby off here, if I hit the enter versus the shift key. And over here we'll quickly see that's 82 milliseconds.

Literally across three generations of Xeon, taking advantage of our latest innovations on Granite Rapids, Monument Creek memory, the highest-performance DRAM memory, combining that with FP4 model, MXFP4 capabilities, brings us clearly in range where I can now run very hefty models right on the Xeon platform. From fourth-gen Xeon with 16 bits up to today, less than 100 milliseconds: a 6.4x improvement from Gen 4 to Xeon 6. Again, without major upgrades, just run it in your data centers. This is, you know, Llama 2 70B parameters. No expensive proprietary stuff, crap, whatever, entering your data center.

And with that, we see them turning to models like Gaudi, and Intel Gaudi's price-performance advantages lie in three fundamental domains. First, choice: people want an alternative,
right? Open software approach and investments, and providing time-to-market-available, lower-TCO solutions. And Gaudi is the only benchmarked alternative to the Nvidia H100 for training LLMs.

And with bringing Xeon and Gaudi together, you know, it really is an AI system. But then how do I scale those systems in my data centers? And that's where the networking fabric comes to play. While there's some proprietary solutions available, we also see that the Ultra Ethernet Consortium and the work that we're driving is standing up to fill this scale-up and scale-out networking domain. And through UEC, Intel is revolutionizing, building on Ethernet networking for AI fabrics for the future, and we'll be introducing an array of AI-optimized Ethernet solutions. This lineup will include cutting-edge AI NIC cards that we'll be delivering, standard NIC solutions. It will also include AI chiplets that we'll enable for our customers and partners in Intel Foundry. It will also include soft and hard IP through Intel Foundry. And it's building on and leveraging the work that we've done, integrating it fully into our Gaudi solutions as well. A range of soft and hard reference designs for AI connectivity, because we don't need proprietary networking for our AI solutions for the future. What do you think, is that a winner?

But should the AI happen in the cloud or at the edge? And I call it the three laws: the laws of economics, the laws of physics, and the laws of the land. Economics: it's simply too expensive to bring your data back to the cloud. It gives you better cost control. Bandwidth is the most expensive resource that you have, and cloud is the most expensive compute that you have in the fleet. So the laws of economics move it to the edge. But second is the laws of physics, as I like to say. And how many of you have changed the speed of light this week? What's the round-trip requirement? What's the user interface expectation for instantaneous response? Or how fast does the robotic arm need a response? If it's 20
milliseconds, you can't do a 200-millisecond round trip to the cloud. The laws of physics draw you to the edge. And hey, we're pretty profound technologists, but we ain't changing the speed of light. But maybe the third is the most powerful: the laws of the land. We often need to keep data on-prem for privacy, regulatory, and security reasons, and data laws have become increasingly diverse and restrictive by regions across the world, and every nation has some form of GDPR. And this volume of data to tap into at the edge is simply staggering. So, the three laws.

And as the edge becomes increasingly important, it's going to become, we believe, the dominant AI workload. Research indicates that by 2026, 50% of edge computing deployments will involve machine learning and AI, compared to just 5% today.

A killer use case, where before competitors ship their first chips, we're launching our second: Lunar Lake. 3x the AI performance. This little marvel has over 100 platform TOPS, 45 NPU TOPS alone. And before others get started, we are on to the second generation, and the third generation's in fab, because we're going to drive the AI PC category to every fingertip of your users and your customers. And just like with Wi-Fi, after you've started to use an AI PC, you can't imagine not: "What do you mean you don't have a Wi-Fi connection? What's the matter with you? Get rid of this piece of trash."

AI will be enabling every business-critical worker: automating, streamlining, collaborating, new insights. Everything I've talked about today, it's not going to happen overnight, but it's also not going to take long. We see this enterprise AI transformation happening right around us today, but we see it unfolding in three different stages. First, the age of the AI copilot, and we expect productivity improvements of maybe 25% from those kinds of capabilities. But the second age is nigh upon us: the age of agents, where the AI automates entire workflows. Think about this like an autonomous car: I let go of the wheel and it's doing it for me. The
age of AI agents is nigh upon us, and we expect that enterprises will develop domain-specific models, and agents will be programmed to act autonomously, through their periods and checkout agents and compliance and customer service, automatically partnering with us 7 by 24, and with this just being able to unlock next phases of productivity. But the third age, the age of functions, is where agents start to interact with other agents and literally entire departments become AI-automated solutions. And maybe we'll have the age of the first one-person, billion-dollar company.

As we look to it: securely unlocking the world's data that's still held inside of these databases on-prem, that's not being leveraged today. And how do we, you know, bring this to every portion of our business operations? As you saw, I need to transform my manufacturing. In our fabs we need to be 10x, because the people I'm competing with, they have lower costs of labor. We have to out-innovate them, using AI to drive our fabs and manufacturing, and each one of you needs to do the same for your businesses as well.

Where the technology is gaining cognitive capabilities as strong as humans in different domains, we expect that we need not a little bit more computing power but 10,000 times more computing power for this generation, and achieving full AGI capabilities literally this decade. And as stewards of Moore's Law, we are going to relentlessly pursue more power-efficient computing, because all of this that we've talked about today, everything we've talked about, is made possible by the power of silicon.

As I said at the start, every company is going to become an AI company. Together we'll unleash the power of your data, we're going to accelerate your productivity, we're going to enable your workforce, we're going to vastly improve the ROI of your technology investments, all while making it more sustainable and secure. Intel: we were made for moments like this. Together with all of you, we're going to change the world again. Thank you.
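Several of the figures quoted in the captions can be sanity-checked with simple arithmetic. The sketch below is not from the keynote; it only recombines numbers stated above. The function names are hypothetical, and the baseboard totals assume eight accelerators that each match the per-card figures quoted for the PCIe card (1.8 FP8 petaflops, 128 GB), which the captions imply but do not state outright.

```python
# Figures as quoted in the captions; nothing here is measured.
RACKS_BEFORE = 200            # second-gen Xeon racks in the telco example
RACKS_AFTER = 72              # Xeon 6 E-core racks, same performance
XEON6_LATENCY_MS = 82.0       # Llama 2 70B demo latency on Xeon 6
GEN4_TO_XEON6_SPEEDUP = 6.4   # quoted Gen 4 (16-bit) to Xeon 6 improvement
CARD_FP8_PFLOPS = 1.8         # quoted per PCIe card
CARD_MEMORY_GB = 128          # quoted per PCIe card
ACCELERATORS_PER_BASEBOARD = 8  # assumption: "eight-accelerator UBB"
EDGE_BUDGET_MS = 20.0         # robotic-arm response example
CLOUD_ROUND_TRIP_MS = 200.0   # cloud round trip in the same example


def consolidation_ratio(before: int, after: int) -> float:
    """How many old racks each new rack replaces."""
    return before / after


def implied_baseline_ms(final_ms: float, speedup: float) -> float:
    """Starting latency implied by a final latency and a speedup factor."""
    return final_ms * speedup


def baseboard_total(per_card: float, cards: int) -> float:
    """Aggregate a per-accelerator figure across one baseboard."""
    return per_card * cards


def meets_budget(round_trip_ms: float, budget_ms: float) -> bool:
    """True if the round trip fits inside the response budget."""
    return round_trip_ms <= budget_ms


if __name__ == "__main__":
    print("rack consolidation:", round(consolidation_ratio(RACKS_BEFORE, RACKS_AFTER), 2))
    print("implied Gen 4 latency (ms):", round(implied_baseline_ms(XEON6_LATENCY_MS, GEN4_TO_XEON6_SPEEDUP), 1))
    print("baseboard FP8 petaflops:", round(baseboard_total(CARD_FP8_PFLOPS, ACCELERATORS_PER_BASEBOARD), 1))
    print("baseboard memory (GB):", baseboard_total(CARD_MEMORY_GB, ACCELERATORS_PER_BASEBOARD))
    print("cloud fits edge budget:", meets_budget(CLOUD_ROUND_TRIP_MS, EDGE_BUDGET_MS))
```

Under these assumptions, 200 racks to 72 racks works out to roughly 2.78x, in line with the "up to 2.7x" rack density claim, and eight cards at the rounded 1.8 petaflops give 14.4, slightly under the quoted 14.6, which suggests the per-card figure was rounded down.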
Info
Channel: Ticker Symbol: YOU
Views: 38,622
Keywords: intc stock, intel stock, intc, intel gaudi 3, intel gaudi 3 vs nvidia h100, intel vs nvidia, openai, chatgpt, sora, sam altman, msft, msft stock, microsoft stock, arm stock, tsmc, tsm stock, intel foundry, nvda, nvda stock, nvidia stock, amd stock, ai chips, semiconductor stocks, intel keynote, ai stocks, best ai stocks, top ai stocks
Id: j9C03FJZVww
Length: 16min 49sec (1009 seconds)
Published: Fri Apr 12 2024