Jetson Xavier NX versus Jetson Nano - Benchmarks and Custom Model Inference

Video Statistics and Information

Captions
Hello and welcome to my channel, Hardware.ai. Today we are going to have a good look at a new development board from Nvidia, the Jetson Xavier NX, and compare it to another development kit from Nvidia, the Jetson Nano. The Xavier NX compute module was announced on November 6, 2019, but the development kit, which includes the module itself and a reference carrier board, was announced half a year later, on May 14, 2020. This is the opposite of what happened with the Jetson Nano: in that case the development kit came first and the compute module became available for purchase later.

So how does the Xavier NX compare to the Jetson Nano? The price difference is significant: 99 US dollars versus 399 for the Xavier NX, or about 400%. You cross that threshold when you know the Jetson Nano is not going to be enough for your application and you need to up your game a level. Before we start the comparison, we should note that these two products are targeted at different consumer markets: the Nvidia Jetson Nano is aimed at makers and STEM education, while the new Xavier NX is geared more toward professional and commercial use. That was very obvious from the sample applications released at each product's launch: for the Jetson Nano it was JetBot with a series of user-friendly notebooks, and for the Jetson Xavier NX it was a demo of cloud-native applications, which appeals more to commercial users. A comparison between the two is still warranted; think of the Xavier NX as a sports car with two turbo engines and the Jetson Nano as a more down-to-earth sedan. We don't always go for the fastest car available; there are other considerations. In this video we'll go over hardware specs, benchmarking, the cloud-native container demo, and custom model inference, looking at small but important details that are sometimes overlooked in other reviews.

Let's start by looking at the two dev kits side by side. I'll skip the unboxing, because, well, you know how to open a box without me; if not, please leave a comment below and I'll make another tutorial soon. First, the similarities. Both development boards are similar in size, and the modules are exactly the same size and form factor, which is great for developers: during the development stage, swapping the Nano module for a Jetson Xavier NX won't require changing connectors or the physical design of the carrier board, provided you use revision B01 of the Jetson Nano development kit. Both carrier boards have a Gigabit Ethernet jack, four USB 3 ports (USB 3.0 on the Jetson Nano, USB 3.1 on the Xavier NX), a micro-USB port, an HDMI port, and a DisplayPort.

Now for the differences. The Xavier NX development board has one M.2 Key E and one M.2 Key M connector, with the M.2 Key E slot already occupied by the built-in Wi-Fi module; the M.2 Key M slot can be used to attach an NVMe SSD. The Jetson Nano carrier board has only one M.2 Key E connector. The Jetson Nano carrier board I have here is an A02, the one that originally went on sale when the Jetson Nano development kit was first released. It has some differences from the B01 carrier board, notably that the B01 has two CSI-2 camera interfaces, same as the Xavier NX carrier board, while my old A02 board has only one CSI-2 camera connector. The Xavier NX has active cooling installed, while the Jetson Nano only has a heatsink; that is because the Xavier NX is much more power hungry than its younger cousin. Its development kit cannot be powered over 5 V USB and requires a 19 V power supply, which fortunately is included in the box. The reference carrier board of the Xavier NX also has a more thought-out design, with a little plastic base that serves two purposes: it protects the circuits from directly touching the surface of your workbench, and it holds the two antennas, unlike on the Jetson Nano, where the two antennas sort of dangle around. It's a small detail, but a very nice one.

Let's quickly compare the specs now, using the data from the official Nvidia website. Apart from the very obvious things, such as the Xavier NX having a CPU with more cores and higher clocks, plus more and faster memory, it is the AI performance column that is worth paying more attention to. Before you go "holy cow, the AI performance of the Xavier NX is 44 times higher than the Jetson Nano", note that the comparison is between different units: TOPS versus gigaflops or teraflops. The reason is that for the Xavier NX, the compute of the NVDLA engines is included in that very impressive number. The Jetson Nano and the other members of the Jetson family, the TX1 and TX2, only have a GPU to accelerate machine learning inference, and GPUs are optimized for floating-point operations. NVDLA engines, on the other hand, are a different beast entirely. They are ASICs, application-specific integrated circuits, more akin to the Google TPU or Intel Movidius chips. They excel at running CNN inference in integer precision, doing that task faster and more energy efficiently than GPUs; the downside is that they are not as general purpose when it comes to supporting different network architectures. Nvidia realized that GPUs alone cannot beat highly specialized hardware and decided to take the best of both worlds by putting both an advanced 384-core Nvidia Volta GPU with 48 Tensor Cores and dedicated CNN accelerators on the new module.

Speaking of Tensor Cores, we also see that these are missing from the Nvidia Jetson Nano's GPU. Now, what on earth is a Tensor Core? That's exactly the title of an article on the first page of Google search results for "tensor core compared to CUDA cores". CUDA cores operate on a per-calculation basis: each individual CUDA core can perform one precise calculation per clock cycle of the GPU, so clock speed plays a major role in CUDA performance, as does the number of CUDA cores available on the card. A Tensor Core, on the other hand, can compute an entire 4x4 matrix operation per clock. Look at the animation here to get an intuitive understanding of what is going on in a regular CUDA core versus a Tensor Core.
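
To make the difference in granularity concrete, here is a toy NumPy sketch (plain CPU code, only an illustration, not how a GPU is actually programmed): a CUDA core effectively performs one multiply-add at a time, whereas a Tensor Core takes whole 4x4 half-precision tiles and computes D = A x B + C in a single operation, accumulating in FP32.

```python
import numpy as np

# Toy illustration of the granularity difference between CUDA cores and Tensor Cores.
A = np.random.rand(4, 4).astype(np.float16)   # FP16 input tile
B = np.random.rand(4, 4).astype(np.float16)   # FP16 input tile
C = np.random.rand(4, 4).astype(np.float32)   # FP32 accumulator tile

# "CUDA-core style": 64 scalar multiply-adds, one element pair at a time.
D_scalar = C.copy()
for i in range(4):
    for j in range(4):
        for k in range(4):
            D_scalar[i, j] += float(A[i, k]) * float(B[k, j])

# "Tensor-core style": the same 4x4 matrix multiply-accumulate as one fused operation.
D_tile = A.astype(np.float32) @ B.astype(np.float32) + C

assert np.allclose(D_scalar, D_tile, atol=1e-3)  # same result, very different granularity
```
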
Alright, that was a brisk but invigorating walk through the forests of high-performance computing. If you're an engineer like me, you want to know exactly how all of that translates into inference performance. This time Nvidia created a dedicated GitHub repository with easily downloadable and executable benchmarks, a decision I can only applaud. So I ran the benchmarks from this repository on both the Nvidia Jetson Nano and the Xavier NX, and quite unsurprisingly the results were very close to those from the Nvidia blog article. Something worth paying attention to is that for the Xavier NX, the total FPS is the sum of the FPS obtained from running the model on the two DLA engines and the GPU simultaneously. You can see that by looking at the contents of benchmark.py, and you can also modify it to print out the total FPS and the inference time for each device, like you saw in my benchmark.
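
To illustrate how that headline number is put together (this is not the repository's actual benchmark.py, and the throughput values below are invented for the example, not measurements), the arithmetic is simply:

```python
# Illustrative only: per-device throughput numbers are made up, not measured.
# On the Xavier NX the benchmark runs the model on the GPU and both DLA engines
# concurrently, so the reported total is the sum of the three.
per_device_fps = {"GPU": 250.0, "DLA0": 80.0, "DLA1": 80.0}

total_fps = sum(per_device_fps.values())

for device, fps in per_device_fps.items():
    latency_ms = 1000.0 / fps  # average time per frame on that device
    print(f"{device}: {fps:.1f} FPS ({latency_ms:.2f} ms per frame)")

print(f"Total (GPU + 2x DLA running concurrently): {total_fps:.1f} FPS")
```
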
For the second test we'll be using a much-touted new feature in JetPack 4.4: cloud native. Basically, it is about bringing containerization and orchestration from servers to edge devices. Why do this? To simplify continued development. Containers are self-contained packages that include all the environment necessary to run the application. The main selling point of containerization is that it makes upgrades easier: because containers are self-contained, you don't need to worry about how changing one application will affect the others and the environment you work in; containers come and go and are easily replaceable. Nvidia prepared a customer service robot demo to showcase the new container management system and the hardware capabilities of the Xavier NX. Because the containers for that demo have TensorRT engine files built for the Jetson AGX Xavier and Jetson Xavier NX and can only run on these two platforms, we'll download and try another container, deepstream-l4t, and run sample applications from within it. The sample applications are already compiled and ready to run inside the container; unfortunately, they are hard-coded to look for their respective config files in the folder where you run the application, which might cause some confusion. Have a look at the article to find out how to run the sample applications.

We will go for the DeepStream test 3 sample app. The Jetson Nano can only run one video stream smoothly; once we add a second one, performance drops significantly. Surprisingly enough, on the Jetson Xavier NX we see a similar picture: once we add the second stream, it starts dropping frames. As we can see, GPU and CPU load did not go to 100%, which means the bottleneck hurting our performance is somewhere else. My suspicion is that it is the slow speed of the memory card; for Nvidia's customer service robot demo it is necessary to use an NVMe SSD, so perhaps that is the reason.

As for our final comparison, let's try stepping aside from the demos Nvidia has provided and use a model we trained ourselves. In the end, this is the real test of performance and user friendliness: the results you can achieve yourself, as opposed to carefully optimized demos. We will use aXeleRate, a Keras-based framework for AI on the edge. Its purpose is to simplify the training and conversion of models to be run with hardware acceleration on various edge devices, such as the K210, Edge TPU, Android and Raspberry Pi, and also Nvidia development boards. Nvidia's model optimization toolkit, TensorRT, is very different from other model conversion toolkits such as nncase or the Google Coral converter: unlike the rest of them, with TensorRT you need to optimize the model on the target device, since the optimizations depend on the target device's architecture. TensorRT definitely deserves a video, or possibly a video series, of its own, and I will make one in the future. We will use a NASNetMobile model trained on the Stanford dog breeds dataset. After training is done, download the ONNX file to the Jetson and run the onnx_to_trt.py file in the example scripts Nvidia Jetson classifier folder to convert it.
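
For reference, building a TensorRT engine from an ONNX file on the device generally looks like the sketch below, written against the TensorRT 7.x Python API that ships with JetPack 4.4. This is only an outline of the kind of conversion onnx_to_trt.py performs, not that script's actual contents; the file names are placeholders, and the same job can also be done with Nvidia's trtexec command-line tool.

```python
import tensorrt as trt  # TensorRT 7.x, as bundled with JetPack 4.4

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, engine_path, fp16=True, workspace=1 << 28):
    """Parse an ONNX model and build + serialize a TensorRT engine on the device."""
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(explicit_batch) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        config = builder.create_builder_config()
        config.max_workspace_size = workspace
        if fp16 and builder.platform_has_fast_fp16:
            config.set_flag(trt.BuilderFlag.FP16)  # half precision is usually a big win on Jetson
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("Failed to parse the ONNX model")
        engine = builder.build_engine(network, config)
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())  # save so the optimization only has to run once
        return engine

# Example usage (file names are placeholders, not the ones used in the video):
# build_engine("dog_breed_classifier.onnx", "dog_breed_classifier.trt")
```
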
Now we can run the classifier video script on a sample video file: the Jetson Nano averages around fifteen frames per second, and the Jetson Xavier NX can process the video with adorable dogs at about thirty frames per second. Consult the article to reproduce the experiments yourself and to convert other models to run on Jetson devices.

As for the conclusion: the new Nvidia Jetson Xavier NX is a beast. It is power hungry, but if performance is what you're aiming for, this is possibly the best module you can get at this footprint and price. If you are developing an application that needs to process multiple high-resolution video streams while also performing ASR or NLP tasks, or other GPU-related tasks such as CUDA-enabled SLAM, then the Jetson Xavier NX's deep learning accelerators can take on the CNN inference and leave the GPU free for the other work, something the older TX2 or Jetson Nano are not capable of. For makers and hobbyists: well, before buying it, make sure you have enough technical knowledge to properly utilize the capabilities of this hardware and to eliminate the performance bottlenecks if you encounter them. Do you have any ideas in mind for what sort of application could use the full extent of the Jetson Xavier NX's hardware capabilities? If you do, leave a comment below. If the video was helpful for you, press the like button and subscribe to my channel. Until the next time!
Info
Channel: Hardware.ai
Views: 28,151
Rating: 4.873817 out of 5
Keywords: nvidia, jetson, jetson nano, jetson xavier nx, tensorrt, computer vision, asr, nlp, comparison, benchmark, versus, better
Id: s0lW3Hkcx-k
Length: 17min 4sec (1024 seconds)
Published: Wed Jun 03 2020