M3 Max Benchmarks with Stable Diffusion, LLMs, and 3D Rendering

Video Statistics and Information

Captions
In this video I want to share some impressions and benchmark results for the 16-inch M3 Max MacBook Pro. I've seen quite a few videos on this machine, but none have dove into what I consider this machine's target demo, which is creators and consumers of video, 3D, and AI. If you don't actively require computing power for these types of applications, this machine is probably overkill, and your money is probably better spent on the MacBook Pro with the Pro chip, specifically the base Pro with its 12-core CPU and 18-core GPU with 18 GB of unified memory; the 14-inch starts at around $1,999 and the 16-inch at $2,499. As specced, the machine that I bought runs $4,199. It has a top-of-the-line M3 Max chip with its 16-core CPU and 40-core GPU, along with 64 GB of RAM and, for me, a more than adequate 1 TB SSD. This is honestly a lot of money to pay for any computer, but if you want a portable Mac that should fly through complex computing tasks, this is the one to get.

So the question is: does it fly through those tasks? To help us decide, and to see how far we've come, we're benchmarking against my outgoing M1 MacBook Pro. This was the base model at the time and, for around $2,000, came with the 8-core CPU and 14-core GPU, and I specced it with 16 GB of RAM. Now, to be clear, the M3 should be much faster than this one, but that's not really the point. My expectation is a machine that will enable me to work more efficiently and, more importantly, open up new avenues of work that simply weren't possible before.

Let's start with AI. I have two main programs to test: Private LLM for chat, and Draw Things for image generation. The results here actually surprised me quite a bit. First, chat. The program we're using is a paid app on the Mac App Store called Private LLM. It features two models: Mistral 7B OpenOrca and, for 16 GB machines like we have here, the more capable WizardLM 13B. Here's where I was surprised: using the more complex 13-billion-parameter model routinely resulted in the older M1 completing chat tasks in about the same time as the
M3, but the M3's answers were significantly more detailed. For example, asking what clouds are made from resulted in an answer almost five times as long on the M3 compared to the M1. Now, that said, the initial times to first response were much faster on the M3, and individual token generation, especially for shorter responses, was also much faster. Here we can see the M3 generated a response faster than I can read, as opposed to the M1, which pokes along a lot slower. Overall, this is a huge win for the M3, and better still, the developer plans to release larger models that'll require 32 GB of RAM or more. My Max's 64 GB of RAM provides plenty of overhead, whereas the M1's 16 GB is simply going to be left out in the cold. That's one potential buying lesson here: if you're already spending several thousand on one of these machines, I'd highly recommend you grab at least 64 GB of RAM. Honestly, having less means you're potentially depriving these incredible processors of their full potential.

One quick oddity I have to mention: while running these LLMs, both machines produced a high-pitched whine. It's not hurting anything, but it's the first time I've ever heard this behavior from these machines, and it happened on both of them, the M3 and the M1.

Okay, let's move on to image generation via Stable Diffusion. For this task I use an app called Draw Things, and going into this purchase my primary hope was that the performance I'd get would be much closer to my PC with its dedicated GPU, which is an RTX 3080. I want to be clear here: for this task it's about more than just raw time; it's about how I tend to work with image generation. The gist is, for me, this is an iterative task, meaning I rarely get the result I want on the first go, which means the faster I can spit out images, the more likely I am to use this technology. With my PC, I'm able to spit out a square 512-pixel image with 20 steps in just over 2 seconds. That's fast enough that this may as well be instantaneous. And here's where I was most looking forward
to this new machine: with the same task, the M1 was just too darn slow. Each image, using approximately the same settings, would take anywhere between 30 seconds and a minute to generate, making that iterative workflow a real pain. So how does the M3 Max chip do? Surprisingly well, in short. A 512x512 image takes about 4 and a half seconds. Not quite as good as the 3080, but now more than fast enough to enable that quick iterative workflow. Moving up to a 768x768 image, the PC clocks in at 6 seconds, the M1 at 65, and the M3 at 14. I'm quite happy with this result, but I would implore Apple to keep improving its hardware and software in this area. Finally, using the Stable Diffusion XL 8-bit model at 30 steps, the M3 took 11 seconds compared to 55 on the M1, and going up to 150 steps, the M3 completed it in 54 seconds compared to the M1's 272.

So as far as image generation takeaways: the M3 is much more usable, to the point where I have no complaints. The PC still dominates (and to be clear, the RTX 4090 is twice as fast again as my 3080), but overall the M3 enables AI workflows to a far greater degree than my M1. It goes from "that's neat" to a tool I actually want to use. Now, a couple of quick notes. Draw Things lets you set which compute units are used, with your options being CPU and GPU, CPU and Neural Engine, or all. I found that with the M1 the results were about the same across all options, but on the M3 the Neural Engine becomes a liability, posting results that are up to 40% slower than just using the CPU and GPU. So that's something to keep in mind moving forward, and for Apple's designers of the Neural Engine.

Okay, on to 3D. Here the results are far more stark and, honestly, all come with a pretty big caveat. See, the M3 chips now support hardware ray tracing, and just as we saw when the same feature came to the Nvidia 30 and AMD 6000 series cards, there is a night-and-day difference between rendering with hardware ray tracing and without. Bottom line, though: I ran a few common scenes and the results are just plain
awesome. For the classic BMW scene, the M3 finished in 8 and a half seconds compared to a minute 15 on the M1. And there's a note here too, which is that the BMW benchmark was created long before optical denoisers became as powerful as they are today. This means we can easily pull our samples down to around 300 while making sure to turn on Open Image Denoise; the results are of the same quality but much faster still. In this case, on the M1 we get down to 19 seconds, and on the M3, 3 seconds. The big surprise for me here was that the M3 is now almost as fast as my 3080, which finishes a second faster using the OptiX rendering engine. When the PC uses CUDA, however, the M3 was actually faster by about 20%. Again, a great result for Apple here. Moving on to the more complex Tugboat scene, we find the PC finished in 24 seconds, the M3 at 30, and the M1 at a rather leisurely 3 minutes and 54 seconds. Now, honestly, I could go on with more scenes, but the point is hopefully clear: the M3's hardware ray tracing is, well, a night-and-day difference, and honestly kind of a game-changer.

One final note about these tests: hardware ray tracing is so key to this machine's performance that you actually get worse performance and higher power consumption when you enable the CPU. Specifically, for Tugboat I saw render times go from 28 to 32 seconds, all while the fan spun up to be quite loud, whereas with the GPU only we were actually faster, and with total silence.

And so, for my conclusion: this is, for me personally, a fantastic upgrade. Every task I perform is faster, and specifically when it comes to hardware ray tracing, strikingly so.
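The "grab at least 64 GB" advice above can be sanity-checked with rough arithmetic: quantized model weights need roughly parameter count times bits per weight divided by 8 bytes, before counting KV cache and runtime overhead. A minimal sketch of that estimate (the function name and the assumption of 4-bit quantization are mine, not from the video):

```python
# Rough RAM footprint for quantized LLM weights:
# bytes ~= params * bits_per_weight / 8, before KV-cache/runtime overhead.
def model_ram_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

# Illustrative model sizes; a 13B model fits easily in 16 GB, but larger
# models quickly push past it, which matches the video's buying advice.
for name, params in [("7B", 7), ("13B", 13), ("34B", 34), ("70B", 70)]:
    print(f"{name}: ~{model_ram_gb(params):.1f} GiB of weights at 4-bit")
```

By this estimate a 70B model's weights alone are over 30 GiB at 4-bit, which is why 16 GB machines get "left out in the cold" as larger models ship.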
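The headline M1-versus-M3 comparisons reduce to simple ratios; a small sketch using the times quoted in the video (the dictionary layout is mine):

```python
# Generation/render times in seconds, as quoted in the video.
timings = {
    "SD 768x768":                 {"M1": 65,  "M3 Max": 14},
    "SDXL 8-bit, 150 steps":      {"M1": 272, "M3 Max": 54},
    "Blender BMW (full samples)": {"M1": 75,  "M3 Max": 8.5},
    "Blender Tugboat":            {"M1": 234, "M3 Max": 30},
}

for task, t in timings.items():
    speedup = t["M1"] / t["M3 Max"]  # how many times faster the M3 Max is
    print(f"{task}: {speedup:.1f}x faster on the M3 Max")
```

The ratios cluster around 5x for Stable Diffusion and roughly 8x for ray-traced Blender scenes, which is where the "night-and-day" framing comes from.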
Info
Channel: Matthew Grdinic
Views: 11,083
Id: YN4jFm-Eg6Q
Length: 8min 24sec (504 seconds)
Published: Mon Nov 20 2023