New IBM AI Chip Explained: Faster than Nvidia GPUs and the Rest!

Video Statistics and Information

Captions
This video is brought to you by Gradient. A few days ago IBM released a new AI chip called NorthPole. It's 20 times faster and 25 times more energy efficient than any other AI chip on the market, and what's interesting, it even outperforms the Nvidia H100 GPU, which is fabricated in a more advanced process node. As a chip designer, I'm impressed by this masterpiece of engineering. Let me explain!

What makes this chip so interesting is that it implements a neuromorphic architecture: it draws inspiration from the way our brain works, and our brain is vastly more energy efficient than any silicon chip on Earth. You can have a chip with hundreds of megabytes of SRAM, so you can fit your entire application on there, but you still have to read the SRAM and get the data to the processing unit, and this process of moving and shuffling data around takes a lot of time and energy. We can address this bottleneck the way our brain does, because my brain has no off-chip memory, at least for now. We can apply the same approach to computing by performing the computing operations directly in the memory. One of the most fundamental properties of a neuromorphic architecture is that there's no external memory. The memory, which in a conventional processor is the cache, sits apart from the processor, in some cases far away on a different chip, as with DRAM. That's not the case in a neuromorphic architecture: the memory elements are intertwined at a very fine scale, close to the processing elements where the neural processing is happening. To this end, IBM has built the computing structure around the memory. Now, IBM's NorthPole chip is a fully digital chip. It is fabricated using 12-nanometer technology, which is a mature process node, and that's perfect because, unlike analog, digital is fully scalable. It means that this chip could be fabricated in a 5- or 3-nanometer process in the future,
which would allow even better computing efficiency.

Here is the NorthPole chip. It was designed by IBM's research lab in California, and it's actually the next generation of the TrueNorth chip. Do you guys still remember it? Let me know in the comments, because IBM presented the TrueNorth chip for the first time back in 2015, and its new generation, the NorthPole chip, is roughly 4,000 times faster than the previous chip. Wow. It's built of 22 billion transistors placed on 800 square millimetres of silicon, so it's a pretty big chip. Here you can see its architecture: the memory is coloured in blue and the compute is in red. You can see it has many computing engines distributed across the chip, and each has memory closely intertwined with a computing engine. With that on-chip memory, they are able to store an entire network on the chip during the computation, and in case we run a neural network which is too large to fit on a single NorthPole chip, we can distribute it across several of them.

Now let's have a look at what they've achieved with this chip. It's described in this paper, which is beautifully called "Neural inference at the frontier of energy, space, and time." I love it. Here they've listed almost all the AI chips which are currently on the market, and they did a pretty solid comparison. Starting from the top, they list Intel chips, the Google TPU, then Intel's Habana chip, a Qualcomm chip, next a Baidu chip, then two Amazon chips, then the TSP by Applied Brain Research, then they've got the SambaNova chip, the Graphcore IPU, one I have no idea about, then the Cerebras Wafer Scale Engine 2, another Intel chip, and then all the Nvidia chips, starting with the V100 all the way to the A100 GPU and the latest H100 GPU. They list the process nodes, power consumption, throughput, and energy.

Let's focus on the energy first. It actually shows us the throughput delivered per unit of power for each chip, and these results are achieved using the ResNet-50 benchmark, which is widely used for image classification. When they compare it to the Nvidia V100 GPU, which is fabricated in the same 12-nanometer technology, NorthPole delivers 25 times more frames per joule; in other words, it's 25 times more energy efficient. And on this benchmark it even outperforms the latest Nvidia H100 GPU, which is taped out in a more advanced process node at 4 nanometers, and 4-nanometer transistors inherently have greater energy efficiency. Still, despite that, NorthPole is five times more energy efficient. Let me know what you think in the comments; to me, that's impressive. They've achieved this energy efficiency by implementing in-memory computing and also high data parallelism, again thanks to the decentralised memory.

Now let's have a look at the time metric. Put simply, it describes how much time it takes for a chip to perform its computations, so to say, to arrive at a result. When we compare NorthPole to the Nvidia V100, it's 18 times faster, and in comparison to the latest Nvidia H100 GPU it is eight times faster while achieving a latency 22 times lower, and that's cool. What's interesting is that out of all the chips listed here, only three are mentioned which have on-chip memory: the Wafer Scale Engine 2 from Cerebras, the TSP from Applied Brain Research, and IBM's NorthPole. There are a bunch of other chips which for some reason didn't make it onto this list, but before we discuss them: if you are looking to incorporate AI into your product to help drive more value to your business, there is an easy solution.
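The frames-per-joule metric used in the paper's energy comparison is just throughput divided by power (a joule is one watt for one second). Here is a minimal sketch of that arithmetic; the throughput and power figures below are made-up placeholders chosen only to reproduce the shape of the 25x claim, not the paper's measured values:

```python
# Frames per joule = throughput (frames/s) / power (W), since 1 W = 1 J/s.
# All numbers below are illustrative placeholders, not published measurements.

def frames_per_joule(throughput_fps: float, power_watts: float) -> float:
    """Energy efficiency: how many inference frames one joule of energy buys."""
    return throughput_fps / power_watts

# Hypothetical ResNet-50 figures for two chips on the same 12 nm node:
northpole = frames_per_joule(throughput_fps=25_000, power_watts=50)   # 500 frames/J
v100 = frames_per_joule(throughput_fps=2_000, power_watts=100)        # 20 frames/J

print(f"NorthPole: {northpole:.0f} frames/J")
print(f"V100:      {v100:.0f} frames/J")
print(f"Efficiency ratio: {northpole / v100:.0f}x")  # 25x with these placeholder inputs
```

The same shape applies to the five-times figure against the H100: divide each chip's benchmark throughput by its power draw and compare the quotients.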
Let's say you want to build a chatbot, a chatbot that manages customer support or helps you with scheduling. The process of building a custom AI application like this can be challenging; however, Gradient makes it easy and cost effective. Creating a chatbot typically requires a large language model, and fine-tuning a large language model can significantly increase its accuracy and overall performance. The only problem is that this process of fine-tuning the model can be complex and expensive, requiring specialised hardware and resources, and this is where Gradient can help you. Gradient is a platform that provides simple web APIs for fine-tuning models and generating completions and embeddings. With just a few lines of code you can fine-tune state-of-the-art open-source models like Llama 2, BLOOM-560, and others to help you create a private AI application. You can use JavaScript, Python, or the Gradient command-line interface, whatever you like, and you only pay for the tokens that you use. If you sign up to Gradient AI through the link below, you will get $5 in free credits to get started.

Now, where will IBM's new NorthPole chip be used? IBM is positioning this chip for AI inference, applying already pre-trained AI models locally on different devices, like on my phone, and by that avoiding the cloud, drastically reducing latency, and improving privacy. Today, if you look at the progress that's being made, it's very exciting what's happening with conventional architectures and conventional very large Transformer networks, so I would not bet against conventional architectures in terms of overcoming the limitations they face today. But where neuromorphic architecture clearly thrives, and where I believe it's very clear it will succeed in the long term, is where we need to deploy this intelligence into devices that respond to real-world change and
stimulus, and have to control systems in response to that stimulation: make decisions and inferences, adapt, add to their knowledge, and do all of that in a real-time setting. The way I look at it is this: if you want to deploy intelligence to the data center, where vast troves of knowledge are built up in datasets and databases that sit on disk right next to those processors, conventional architectures are probably going to do very well for a long period of time. On the other hand, if you want to deploy intelligence out into the world, into systems, into your vehicles, into robots, or even just into your cell phone, where that real-time interaction is required, that's where neuromorphic computing will really thrive and succeed in the future.

In the paper they mention that NorthPole can be placed in devices locally and used for image and speech recognition as well as for natural language processing. It can also be used in satellites to monitor agriculture or wildlife populations. Actually, a great example here is that NorthPole could serve as the brain in autonomous vehicles and robots, and the reason researchers see the neuromorphic architecture as the most suitable for robotics is that our brain has evolved over a long period of time to control organs and bodies and to learn patterns from the world and adapt to them. That's where IBM's NorthPole chip could thrive!
IBM is not the only company working on neuromorphic chips. Intel has been developing their Loihi 2 chip for several years now. This chip is still in the research phase, and the most interesting thing about it is that it implements spiking neural networks, an entirely different class of neural networks that directly mimics our brain's activity. It has no synchronous clock, just like our brain. That's a really interesting project; I have a separate video about it, which will be linked below.

Then what about the Akida chip? It's developed by an interesting startup, BrainChip, originally from Australia, which has created quite a buzz among investors. They were the first company to commercialise neuromorphic technology, to bring the first neuromorphic chip to the market. Their chip is also fully digital, it's fabricated in 28 nanometers by TSMC, and they also make use of a spiking-based model, just like our brain. Then there is the DYNAP chip by SynSense, a spin-off from the University of Zurich and ETH Zurich, and they're also developing chips to be used at the edge and for robotics.

You see, many companies and investors see a lot of potential in neuromorphic computing, and that's why this space is getting pretty crowded and competitive. Apart from the digital chips we discussed, there are also several ongoing projects working on analog neuromorphic chips, which use different flavours of analog memory. One of them is another IBM chip, Hermes, which I discussed in my previous video about a month ago. There they use phase-change memory to perform in-memory computing, and they have been getting pretty great results with it. The big difference is that digital memory stores binary values, 0s or 1s, while analog memory can store several values per cell, which basically allows the chip to store more information in the same amount of space, say, four or eight bits in just one memory cell.
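To make the multi-bit point concrete: a cell that can reliably distinguish 2^n analog levels stores n bits, so one 4-bit cell does the work of four binary cells. A small sketch of that relationship (generic arithmetic, not tied to any specific IBM or Mythic device):

```python
import math

def levels_needed(bits_per_cell: int) -> int:
    """A cell storing n bits must resolve 2**n distinct analog levels."""
    return 2 ** bits_per_cell

def bits_stored(levels: int) -> float:
    """Conversely, a cell with L distinguishable levels holds log2(L) bits."""
    return math.log2(levels)

print(levels_needed(1))  # 2   -> an ordinary binary (digital) cell
print(levels_needed(4))  # 16  -> one analog cell replacing four binary cells
print(levels_needed(8))  # 256 -> why precision, noise, and drift get hard
print(bits_stored(16))   # 4.0
```

The exponential growth in required levels is also why the multi-bit analog approach runs into the maturity and robustness problems mentioned below: distinguishing 256 levels demands far tighter device precision than distinguishing 2.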
As a result, it can perform calculations much more energy-efficiently. It's also worth mentioning the NeuRRAM AI chip from Stanford; it's a similar one, but instead of phase-change memory they use analog resistive memory. Finally, there is the famous US-based startup Mythic, which is working on analog AI chips. There were a lot of discussions about Mythic over the last several years; they ran out of capital by the end of last year, but luckily this spring they raised $13 million more and have kept working on their analog AI chip. They've now pivoted, though, and are focusing more on the defence sector. Still, I would say the analog approach is in the research phase and has certain limitations. The new memory technologies used here, like phase-change memory, resistive memory, ECRAM, and so on, are not yet as mature and robust as multi-bit digital memory. What makes it even more complicated is that, in order to run your AI models on analog chips, you need to do additional pre-training of the models, and there are still many things that have to be figured out.

So what do you think, guys? Let me know your thoughts in the comments. To me, the performance of NorthPole is truly impressive, especially that it beats even the Nvidia H100 GPU on some of the benchmarks. However, I think it will still take quite some time, maybe 3 to 5 years, until we see this chip, or such chips, deployed in real systems, if IBM ever decides to move forward and create a product out of it. What I can say is that with the advent of humanoid robots we will definitely see more custom chips like this one created for these applications, and computing at the edge is really where neuromorphic computing can stand out! If you enjoyed this video, leave me a comment below and share it with your friends. Remember to check out Gradient with the link below. Thank you for watching, and we will see you in the next video. Ciao!
Info
Channel: Anastasi In Tech
Views: 214,927
Keywords: IBM, New AI Chip, NorthPole, IBM Chip, Computer Chip, Neuromorphic Chip, Neuromorphic Computing
Id: p0W5eHn5sZ0
Length: 15min 33sec (933 seconds)
Published: Thu Nov 02 2023