New IBM AI Chip Explained: Faster than Nvidia GPUs and the Rest!

Video Statistics and Information

Captions
This video is brought to you by Gradient. A few days ago IBM released a new AI chip called NorthPole. It's 20 times faster and 25 times more energy efficient than any other AI chip on the market, and what's interesting, it even outperforms the Nvidia H100 GPU, which is fabricated in a more advanced process node. As a chip designer, I'm impressed by this masterpiece of engineering. Let me explain!

What makes this chip so interesting is that it implements a neuromorphic architecture: it draws inspiration from the way our brain works, and our brain is vastly more energy efficient than any silicon chip on Earth. You can have a chip with hundreds of megabytes of SRAM, so you can fit your entire application on there, but you still have to read the SRAM and get the data to the processing unit, and this process of moving and shuffling data around takes a lot of time and energy. We can address this bottleneck the way our brain does, because my brain has no off-chip memory, at least for now. We can apply the same approach to computing by performing the computing operations directly in the memory. One of the most fundamental properties of a neuromorphic architecture is that there's no external memory. The memory, which in a conventional processor is the cache, sits apart from the processor, in some cases far away on a different chip, as with DRAM. That's not the case in a neuromorphic architecture: the memory elements are intertwined at a very fine scale, close to the processing elements where the neural processing is happening. To this end, IBM has built the computing structure around the memory. Now, IBM's NorthPole chip is a fully digital chip. It is fabricated using 12-nanometer technology, which is a mature process node, and that's perfect because, unlike analog, digital is fully scalable. It means that this chip could be fabricated in a 5- or 3-nanometer process in the future,
which would allow even better computing efficiency.

Here is the NorthPole chip. It was designed by IBM's research lab in California, and it's actually the next generation of the TrueNorth chip. Do you guys still remember it? Let me know in the comments, because IBM presented the TrueNorth chip for the first time back in 2015, and its new generation, the NorthPole chip, is roughly 4,000 times faster than the previous chip. Wow. It's built of 22 billion transistors placed on 800 square millimetres of silicon, so it's a pretty big chip. Here you can see its architecture: the memory is coloured in blue and the compute is in red. You can see it has many computing engines distributed across the chip, and each has memory closely intertwined with a computing engine. With that on-chip memory, they are able to store an entire network on the chip during the computation, and in case we run a neural network which is too large to fit on a single NorthPole chip, we can distribute it across several of them.

Now let's have a look at what they've achieved with this chip. It's described in this paper, which is beautifully called "Neural inference at the frontier of energy, space, and time." I love it. Here they've listed almost all the AI chips which are currently on the market, and they did a pretty solid comparison. Starting from the top, they list Intel chips, the Google TPU, then Intel's Habana chip, a Qualcomm chip, next a Baidu chip, then two Amazon chips, then the TSP by Applied Brain Research, then they've got the SambaNova chip, the Graphcore IPU, one I have no idea about, then the Cerebras Wafer Scale Engine 2, another Intel chip, and then all the Nvidia chips, starting with the V100 all the way to the A100 GPU and the latest H100 GPU. They list the process nodes, power consumption, throughput, and energy.

Let's focus on the energy first. It actually shows us the throughput delivered per unit of power for each chip, and these results are achieved using the ResNet-50 benchmark, which is widely used for image classification. When they compare it to the Nvidia V100 GPU, which is fabricated in the same 12-nanometer technology, NorthPole delivers 25 times more frames per joule; in other words, it's 25 times more energy efficient. And on this benchmark it even outperforms the latest Nvidia H100 GPU, which is taped out in a more advanced process node at 4 nanometers, and 4-nanometer transistors inherently have greater energy efficiency. Still, despite that, NorthPole is five times more energy efficient. Let me know what you think in the comments; to me, that's impressive. They've achieved this energy efficiency by implementing in-memory computing and also high data parallelism, again thanks to the decentralised memory.

Now let's have a look at the time metric. Put simply, it describes how much time it takes for a chip to perform its computations, so to say, to arrive at a result. When we compare NorthPole to the Nvidia V100, it's 18 times faster, and in comparison to the latest Nvidia H100 GPU it is eight times faster while achieving a latency 22 times lower, and that's cool. What's interesting is that out of all the chips listed here, only three are mentioned which have on-chip memory: the Wafer Scale Engine 2 from Cerebras, the TSP from Applied Brain Research, and IBM's NorthPole. There are a bunch of other chips which for some reason didn't make it onto this list, but before we discuss them: if you are looking to incorporate AI into your product to help drive more value to your business, there is an easy solution.
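The frames-per-joule metric used in the paper's energy comparison is just throughput divided by power (a joule is one watt for one second). Here is a minimal sketch of that arithmetic; the throughput and power figures below are made-up placeholders chosen only to reproduce the shape of the 25x claim, not the paper's measured values:

```python
# Frames per joule = throughput (frames/s) / power (W), since 1 W = 1 J/s.
# All numbers below are illustrative placeholders, not published measurements.

def frames_per_joule(throughput_fps: float, power_watts: float) -> float:
    """Energy efficiency: how many inference frames one joule of energy buys."""
    return throughput_fps / power_watts

# Hypothetical ResNet-50 figures for two chips on the same 12 nm node:
northpole = frames_per_joule(throughput_fps=25_000, power_watts=50)   # 500 frames/J
v100 = frames_per_joule(throughput_fps=2_000, power_watts=100)        # 20 frames/J

print(f"NorthPole: {northpole:.0f} frames/J")
print(f"V100:      {v100:.0f} frames/J")
print(f"Efficiency ratio: {northpole / v100:.0f}x")  # 25x with these placeholder inputs
```

The same shape applies to the five-times figure against the H100: divide each chip's benchmark throughput by its power draw and compare the quotients.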
Let's say you want to build a chatbot, a chatbot that manages customer support or helps you with scheduling. The process of building a custom AI application like this can be challenging; however, Gradient makes it easy and cost effective. Creating a chatbot typically requires a large language model, and fine-tuning a large language model can significantly increase its accuracy and overall performance. The only problem is that this process of fine-tuning the model can be complex and expensive, requiring specialised hardware and resources, and this is where Gradient can help you. Gradient is a platform that provides simple web APIs for fine-tuning models and generating completions and embeddings. With just a few lines of code you can fine-tune state-of-the-art open-source models like Llama 2, BLOOM-560, and others to help you create a private AI application. You can use JavaScript, Python, or the Gradient command-line interface, whatever you like, and you only pay for the tokens that you use. If you sign up to Gradient AI through the link below, you will get $5 in free credits to get started.

Now, where will IBM's new NorthPole chip be used? IBM is positioning this chip for AI inference, applying already pre-trained AI models locally on different devices, like on my phone, and by that avoiding the cloud, drastically reducing latency, and improving privacy. Today, if you look at the progress that's being made, it's very exciting what's happening with conventional architectures and conventional very large Transformer networks, so I would not bet against conventional architectures in terms of overcoming the limitations they face today. But where neuromorphic architecture clearly thrives, and where I believe it's very clear it will succeed in the long term, is where we need to deploy this intelligence into devices that respond to real-world change and
stimulus, and have to control systems in response to that stimulation: make decisions and inferences, adapt, add to their knowledge, and do all of that in a real-time setting. The way I look at it is this: if you want to deploy intelligence to the data center, where vast troves of knowledge are built up in datasets and databases that sit on disk right next to those processors, conventional architectures are probably going to do very well for a long period of time. On the other hand, if you want to deploy intelligence out into the world, into systems, into your vehicles, into robots, or even just into your cell phone, where that real-time interaction is required, that's where neuromorphic computing will really thrive and succeed in the future.

In the paper they mention that NorthPole can be placed in devices locally and used for image and speech recognition as well as for natural language processing. It can also be used in satellites to monitor agriculture or wildlife populations. Actually, a great example here is that NorthPole could serve as the brain in autonomous vehicles and robots, and the reason researchers see the neuromorphic architecture as the most suitable for robotics is that our brain has evolved over a long period of time to control organs and bodies and to learn patterns from the world and adapt to them. That's where IBM's NorthPole chip could thrive!
IBM is not the only company working on neuromorphic chips. Intel has been developing their Loihi 2 chip for several years now. This chip is still in the research phase, and the most interesting thing about it is that it implements spiking neural networks, an entirely different class of neural networks that directly mimics our brain's activity. It has no synchronous clock, just like our brain. That's a really interesting project; I have a separate video about it, which will be linked below.

Then what about the Akida chip? It's developed by an interesting startup, BrainChip, originally from Australia, which has created quite a buzz among investors. They were the first company to commercialise neuromorphic technology, to bring the first neuromorphic chip to the market. Their chip is also fully digital, it's fabricated in 28 nanometers by TSMC, and they also make use of a spiking-based model, just like our brain. Then there is the DYNAP chip by SynSense, a spin-off from the University of Zurich and ETH Zurich, and they're also developing chips to be used at the edge and for robotics.

You see, many companies and investors see a lot of potential in neuromorphic computing, and that's why this space is getting pretty crowded and competitive. Apart from the digital chips we discussed, there are also several ongoing projects working on analog neuromorphic chips, which use different flavours of analog memory. One of them is another IBM chip, Hermes, which I discussed in my previous video about a month ago. There they use phase-change memory to perform in-memory computing, and they have been getting pretty great results with it. The big difference is that digital memory stores binary values, 0s or 1s, while analog memory can store several values per cell, which basically allows the chip to store more information in the same amount of space, say, four or eight bits in just one memory cell.
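To make the multi-bit point concrete: a cell that can reliably distinguish 2^n analog levels stores n bits, so one 4-bit cell does the work of four binary cells. A small sketch of that relationship (generic arithmetic, not tied to any specific IBM or Mythic device):

```python
import math

def levels_needed(bits_per_cell: int) -> int:
    """A cell storing n bits must resolve 2**n distinct analog levels."""
    return 2 ** bits_per_cell

def bits_stored(levels: int) -> float:
    """Conversely, a cell with L distinguishable levels holds log2(L) bits."""
    return math.log2(levels)

print(levels_needed(1))  # 2   -> an ordinary binary (digital) cell
print(levels_needed(4))  # 16  -> one analog cell replacing four binary cells
print(levels_needed(8))  # 256 -> why precision, noise, and drift get hard
print(bits_stored(16))   # 4.0
```

The exponential growth in required levels is also why the multi-bit analog approach runs into the maturity and robustness problems mentioned below: distinguishing 256 levels demands far tighter device precision than distinguishing 2.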
As a result, it can perform calculations much more energy-efficiently. It's also worth mentioning the NeuRRAM AI chip from Stanford; it's a similar one, but instead of phase-change memory they use analog resistive memory. Finally, there is the famous US-based startup Mythic, which is working on analog AI chips. There were a lot of discussions about Mythic over the last several years; they ran out of capital by the end of last year, but luckily this spring they raised $13 million more and have kept working on their analog AI chip. They've now pivoted, though, and are focusing more on the defence sector. Still, I would say the analog approach is in the research phase and has certain limitations. The new memory technologies used here, like phase-change memory, resistive memory, ECRAM, and so on, are not yet as mature and robust as multi-bit digital memory. What makes it even more complicated is that, in order to run your AI models on analog chips, you need to do additional pre-training of the models, and there are still many things that have to be figured out.

So what do you think, guys? Let me know your thoughts in the comments. To me, the performance of NorthPole is truly impressive, especially that it beats even the Nvidia H100 GPU on some of the benchmarks. However, I think it will still take quite some time, maybe 3 to 5 years, until we see this chip, or such chips, deployed in real systems, if IBM ever decides to move forward and create a product out of it. What I can say is that with the advent of humanoid robots we will definitely see more custom chips like this one created for these applications, and computing at the edge is really where neuromorphic computing can stand out! If you enjoyed this video, leave me a comment below and share it with your friends. Remember to check out Gradient with the link below. Thank you for watching, and we will see you in the next video. Ciao!
Info
Channel: Anastasi In Tech
Views: 214,927
Keywords: IBM, New AI Chip, NorthPole, IBM Chip, Computer Chip, Neuromorphic Chip, Neuromorphic Computing
Id: p0W5eHn5sZ0
Length: 15min 33sec (933 seconds)
Published: Thu Nov 02 2023