Logic Pro X CPU Optimization | 2021 | M1 System Tips | Run more Plugins!

Captions
Hey Logic users. You want to run more plugins, you want to master on the fly, and you want your system to run stably without system overloads or interrupts. How do I tune Logic's settings to get the absolute best out of my system? If you run Logic Pro, this is for you. If you run Logic Pro on Apple M1 systems, this is definitely for you. My name is Mark Payne. I'm a live sound engineer and producer working here in the UK. Let's go.

We're going to split this into six sections. In the notes below I'm going to give the timestamps so you can jump around, but I really suggest you come with me on the whole journey. In section 1 we're going to discuss how macOS assigns processes to processors. In section 2 we're going to look at the Apple M1 system and both the Icestorm high-efficiency cluster of CPUs and the Firestorm cores for high performance. In section 3 we will load our benchmark Logic session and look at how the CPU is utilized while it's running. In section 4 we're going to add more and more plugins to chew up the cores until we force a system overload. In section 5 we will tune the audio settings of Logic to make best use of the system resources. In section 6 we will conclude and bring together our ideas for how we tune Logic to run the most complex sessions without interrupt. We will also be able to advise on the difference between running an Air and a Pro version of the Apple MacBook, and maybe how this would compare to running a Mac mini M1 machine.

In section 1 we're going to talk about processes and processor affinity. A program developer knows that their application is going to be run on a multi-core CPU, so they take the one application and write it as multiple threads, so that the threads can be distributed across all of the CPUs to get the best possible throughput. A program developer cannot know in advance the specifications of the machine that's going to run their application. For example, my iMac Pro here has
got 10 cores, but it supports hyper-threading, so there are 20 logical cores there, whereas my system-on-chip M1 machine has got eight cores, and some of them are the slower cores and some of them are the faster cores. So how can the application developer know this? Well, the answer is they can't. It's the job of the operating system to assign the threads to the cores.

Now, sometimes you want threads to run on the same CPU. You might have a single channel that's got multiple plugins, and those plugins will be created as separate threads within the Logic Pro system. It would appear sensible to spread those threads over the multiple cores, but that's not necessarily the wise thing to do, because these threads on a single channel have got a lot of commonality. They basically want to be manipulating the same data, and because that data lives in the L2 cache, which is closely associated with a processor, you can see that for a single audio channel it's likely that we want the threads for those particular plugins to all run on the same core. Otherwise we'll have to be juggling the data and sending it from one CPU to another just so that we can get the process chain done.

Because of this we talk about this concept of processor affinity, where the application developer can say, "I would like these to run together and these to run separately, if at all possible." This is done through system calls. There are no system calls that allow a developer to nail a process to a processor, but there are calls that allow certain processes to be grouped together and other processes to be grouped somewhere else. We can pre-advertise our desire for how that load is spread over the CPUs. The actual one-to-one relationship of the thread and the core is always made by the operating system.

In section 2 we're going to look at some practical demonstrations of CPU load. Now let's just review the M1 machine. Effectively it's got, if you like, four high-performance cores, four kings, you know, they're high-performance
cards, and it's also got four high-efficiency cores, which are of lower processing power. This is the Firestorm cluster: it's a cluster of cores which are high performance but use more energy. And this is the Icestorm cluster. So if you're just doing your basic word processing or you're reading your mail, you don't need to be using the heavyweight cores for that. You can play your lower-value cards and get away with using less energy. So eight cores in total, two clusters.

One of the advantages of a system on chip is that these cores, these clusters, share L2 caches, so these guys can access the same high-speed memory shared between them without penalty. This gives us a lot more flexibility with processor affinity, because we can spread different processes over these processors with confidence, because they are sharing the same L2 cache.

Now it's not obvious, in fact I'm going to say more than that, it's very unobvious, which cores are which. We can see the overall processor utilization here is only around ten percent. If we look at Activity Monitor we can see that around six percent is user process, that's programs that I'm running, and around three to four percent of it is system overhead, the operating system running. Although this machine is doing nothing, it's still running QuickTime, which is doing background screen recording, so it's not true to say that there's nothing going on. In fact you can see here that the screen capture process is using 31% of CPU. Now that's not 31% of the whole of the CPU resource, the whole of the eight cards if you like. It's 31% of one of these single cores.

So to start the CPU stressing process, I'm going to run the yes command, redirect its output to /dev/null, and put that in the Unix background. Basically what that's done is it's just made a process which is wasting time by saying yes, yes, yes, yes and throwing that in the bin. It is using 98% of CPU. 98% of one of these, or 98% of one of
those? Now, I don't know. Let's have a look and see if we can tell. My guess is a bit on three, a bit on four, a bit on five, and a bit on seven. Even though you can see that it's only a single thread, the operating system is choosing to run it in multiple places. Why would it take a hungry caterpillar and decide to just keep on hitting a different CPU with it? Well, it spreads the load. Maybe that's a thermal thing. Maybe if you were a program that was already resident on one of these cores, it would be very disadvantageous for you to have your core completely obliterated by one program causing pain, so maybe we spread the pain.

So let's carry on and make this even worse by introducing four of these things. Now it starts to become clear which cores are being dominated. It is in fact cores three, four, five and seven. If you listen carefully, the fans are running. In iStatistica, in fact, the fans are now running at 5,000 rpm, because we are starting to warm the processor up. So three, four, five and seven are the Firestorm cores. One, two, six and eight are the lower-performance but lower-energy Icestorm cores.

You'll notice that although fifty percent of the cores have been utilized, we've still got notionally 50% left. But that's a bit of a misleading figure, isn't it? Because if you're a card player, and I play cards in my family, we've already played the big cards. So although it's 50 percent of the core resources, it's certainly not 50 percent of the power of the CPU that's left. So be careful when you're looking at CPU utilization, because the percentages won't always tell you the truth. Okay, so let's kill off this ridiculous CPU load, and you can see now that the CPU fan is starting to slow and obviously the machine will cool.

So, section 3. We're going to benchmark the system with a Logic session that I've been using in some of my other demonstrations. I've just had my dinner and I've got a glass of wine now, so it's got to be six o'clock
somewhere in the world if you're watching out there on YouTube. So if it's an appropriate time for yourself, pour yourself a glass. What's going to happen now is I'm going to think that my YouTube delivery is going to improve, but in actual fact it's going to get worse. But there we go.

So let's have a look at what's going on. I've got an empty Logic session. We can see that the machine is not busy, and I'm running the performance monitor. The performance monitor comes up by clicking on the CPU bar, and then you can see that we've got four cores in use. The default behavior of Logic, on an M1 machine anyway, is to only use the four performance cores, the Firestorm cores. The idea is that there's no point using the low-performance cores; we might as well leave those for something else. I'll be discussing later on whether that's a good idea and whether there's any benefit to using all of the cores. So you can see nothing going on in the machine, not really any CPU activity, and nothing measured on the performance monitor from Logic's point of view.

So let's load the machine up with some of those pointless yes-going-into-/dev/null processes. Our general behavior is to start four of those to really get some load going. We know by testing that these are going to go to the performance cores. Okay, so what we can see here is that even though the four cores are now very busy with the load of that yes program, in actual fact Logic is registering no busyness, if you like, in its affinity queues. The reason for that is that there are no plugins, there's no DSP load, there really aren't any tracks. So Logic isn't busy but the machine is. What this reinforces is that these performance meters are not telling us how busy the machine is, but really telling us how busy Logic is, and those two things are very, very separate. That's really why you don't want to be running other programs at the same
time as you're running Logic, because otherwise it will make bad decisions on how it tries to load up the affinity queues. If you are drinking along with me, then have a little sip.

We've opened up a real session, which I'll just get running in the background so that we're checking that's going to work fine. Let me just check on my own cans. Yeah, that's okay, that's running in the background. This is Logic out of the box, standard settings, using the four performance cores. You can see that the loading of this session is reasonably high. I'm using all my favorite plugins, I'm mastering on the fly, I'm running a 96k session. As I've shown in a lot of my other tests in this playlist of Logic running on M1 machines, there is no way this session runs on an Intel MacBook Pro of any description in the 13-inch kind of world. It would run on a 16-inch Intel machine, but no chance on the 13. I've already discussed that and I don't want to go into that whole thing again. But here you can see our affinity queues are just over 50% utilized, we've got reasonable busyness in the high-performance cores, and the machine thinks that it's in some way 36% utilized. We've probably used pretty much half of all of the resource we've got available to us, and you can see that kind of utilization on the CPU history, which is part of Activity Monitor.

Let's have a look at those settings together, and let me just introduce them. If you go to Preferences and Audio preferences here, the settings that we're going to consider in the rest of this session are: changing the I/O buffer size; the processing threads, and how many of the threads we're using (at the moment we're defaulted to the four high-performance cores); the size of the process buffer, which by default is medium; and the multithreading option of whether we're optimized for playback and
live tracks or playback only. The default is, as you see, playback and live tracks. Together those really are the things we can manipulate to optimize how Logic runs in playback. Now I will say that trying to relate these four affinity queues to the four CPUs is a pointless task. There is no one-to-one relationship, because remember that the operating system, macOS, is a variable here and has the ultimate responsibility to assign an affinity queue to a processor.

So in this fourth section we're going to look at loading up Logic with more and more plug-in load. To do this I'm going to use an example of a very expensive plug-in, which is the Abbey Road reverb simulator from Waves. Now I'm not blaming the plug-in, I'm not down on this plug-in. I don't happen to use it very much. What's important here is that you know just how hungry the Abbey Road reverb is. I've created nine instances. They're on auxiliaries 18, 19 and 20, and I've put three plugins per auxiliary. I can turn this plugin on, and what I want to show you is that in doing that the affinity queue really hasn't changed that much. Add a second one of these on another auxiliary. I'm telling you that this is an expensive plug-in, but not a lot's going on here. Over here I happen to have kicks and a couple of snares, which I've opted to assign to those sends. You can see that they're going to 18, 19 and 20, but the sends are turned off. But watch what happens when I turn one of the sends on. So here's 18 going on. Boom, bang. Look at the CPU utilization: now we've jumped up to 43%. And now let's turn on another audio send: we're now at 48% overall utilization.

My point is this. Listen, never watch me talk about Final Cut Pro X with any authority. I use it for editing this kind of thing, but it's not my profession. Audio is what I do, and Logic is certainly what I do when I'm working in the studio. I get really bored of these kinds of YouTube demonstrations where people are
trying to prove benchmarks using 70 tracks of audio which are the same copy of some sine wave or something. It's not routed anywhere, they're not using the machine as a mix engine, and it just makes no sense. It's not a quality metric, because it has no relationship to how you and I are going to use Logic. Stop watching those kinds of YouTube things where people are going, "Oh, and I could add 72 tracks to this." Hopefully what I'm showing you here is that sometimes plugins don't apply the CPU load until you actually route audio through them. Because you may be automating the routing of audio in and out during the length of your session, this CPU load is a dynamic, changing thing as the parameters of our mixes change, and that's why Logic requires CPU headroom. You can start a session and then suddenly later you might see a system overload that you weren't expecting. Well, maybe it's because you've opened up an audio stream where you're routing maybe to an aux, and nothing else is going in that direction. So now let's come back to the setup, where now I've opened up auxiliaries 18, 19 and 20.
Get things running: 40-odd percent CPU, and we're balancing around that 50%, just over, within the affinity queues. Now let's turn on the first plugin. Let's not even think that one reverb is one thread and one plug-in is one thread; the plugins themselves might be multi-threaded. Let's open up another one, and you can see again the affinity queues have become even busier. Now let's open up a third one. Ah, now you can see here that Logic has aborted on a system overload. What I should do is give Logic another go. Okay, it aborts. Let's let it run again. It's running again. We've come up with an affinity queue balance which is not sending the system over the edge, if you like. So I've managed to get two of these Abbey Road reverbs added, and we're on the ragged edge of this not working. I'm going to suggest to you this ain't going to run a third one, but let's try. Oh yeah, we've managed to run a third one, no system overload. So let's stop it and try it again to see if it can find an affinity queue balance which will work throughout the CPUs. We're almost giving Logic a chance to rebalance its load, and then the operating system a chance to assign that to the cores. At the moment it looks like it's working. Oh no, it's just gone. So three instances would appear to be definitely over the edge; two instances, maybe. That's the benchmark that I wanted to draw for the standard settings as Logic comes out of the box, when we're really pushing into the maximum plug-in load.

So in this next section we're going to look at tuning some of the audio parameters that we have available to us within Logic. You'll notice that I've turned off all of these hungry Abbey Road plugins. Now let's talk about some of these options. If you go to Preferences and Audio preferences, we're going to talk about the options that we see within this panel here. When you see a system overload, one of the first things it will say to you is that you might want to consider increasing your I/O buffer
size, and this is this setting here, which at the moment I have set to 1024 samples. Mostly I record live, so if I'm recording I will max that out. I want to use maximal buffers, which is the safest I can be. I don't really care about latency. It says here that this is going to give me a total latency, a round-trip latency from input to output, of 26 milliseconds. Now if I was trying to play and record, listening to that in my ears, it would be almost impossible. At 26 milliseconds I can't integrate that as the same event as what I'm doing, so you could never record with that degree of latency if you're trying to play along to what you're playing back. But if you're tracking, then there's no good reason not to max this and make your recording as safe as it possibly can be. Now I'm recording blind, of course. On my digital consoles, signals come into the mic preamp and go through the A-to-D conversion stage. I take a split at that point, and it's that direct split of the mic-pre digital signal that gets recorded. My live mix then hits the console and gets processed, but what I actually record is the clean feed from the mic pre.

But I wanted to say that this really has no impact on memory at all, because 1024 samples multiplied by 24 bits is 24 kilobits, and we need to divide that by eight to get it into bytes: that's 3 kilobytes per track. If I've got 3 kilobytes per track and multiply that by, let's say, even 72 tracks being recorded, we're talking about a total of 216K maximum for that buffer. And of course in memory, for a program that's already 4.7 gigabytes, 200K is nothing. It's not even measurably interesting. So really for playback, tuning this parameter of the I/O buffer size is going to achieve nothing. It's a tuning for recording, which is basically trading off risk versus latency.

So next we're going to talk about this process buffer range. It's the size of the buffers that we're using for processing DSP. So the bigger the buffers we make
during playback, the more data we can put in there. We're increasing maybe playback latency, but we really don't care about that. You can see that at the moment Logic is using 4.57 gigabytes in memory. Let's change this to be a large process buffer range and apply that. So we've started Logic again, and we're going to have a look at the size of Logic in memory. We're now using something in the order of 4.59 gigabytes. So this is using slightly more memory, but in the overall memory profile of the machine it's no big deal. So I see no good reason not to run it large, and we'll carry on with this discussion.

Then the performance impact on being able to run more plug-ins with this larger buffer. Well, let's see. We couldn't really get beyond three, and three wasn't stable. I need to run it for a bit longer just to be sure, and at the moment we seem to be running three okay. I want to give it a bit longer, and I also want to maybe stop it and restart it to be sure. It would appear that three is running reasonably reliably, and if you remember, before, when we had the setting down at medium, three was a struggle. You'll notice that we've got audio here, so we can't blame the fact that I've not routed. I definitely have routed: 18, 19 and 20 are live. That would appear to be running three of the additional plug-in loads with stability. Let's see what happens when we go to four. No, four bombs it out. Let me take the fourth one away and restart it. Yeah. You'll remember that three was on the ragged edge, and now I've not had a problem running three at all. So running a large buffer size improves performance. We can now reliably run the third copy of our heavy Abbey Road plug-in, whereas before we could only run two reliably and three was on the ragged edge.

Let's talk about the next one, the multithreading policy. What Logic is doing is preserving a degree of CPU headroom for the recording of additional tracks, or in fact the
live input from maybe a keyboard to a software instrument, to make sure that none of your playing or additional recordings are lost during the session. We reserve some headroom in CPU to ensure reliability in that area. We have the option of turning off that protection to benefit only the playback tracks. So in setting that and applying it, let's see what happens now when we run the session. We are successfully running the three; we have this stable situation now where we can run three. So let's try and add a fourth. No, that sends us over the threshold. Let's allow it to rethread that. No. And another time. No. Let's turn back to three. Yes, that is running fine. Allowing the CPU power to be dedicated to playback really hasn't made that much of a difference. It's not made it worse, but it's not made it any better.

So the final part of this tuning is to now allow the M1 machine to have all eight of its cores used. In this situation you can see that the automatic, recommended setting is to only use the Firestorm cluster, the high-performance CPUs, four CPUs, which is why we see these four bars here. Now, why don't we max that out and say, well, why don't you use all eight of the cores, so you've got the Firestorm and the Icestorm cores available to you? In applying that, when it restarts, instead of the four cores being available there are now eight cores. If we now run the session and allow it to stabilize, you can see that we've been able to spread the plug-in load wider over the eight cores, and so the overall affinity thread loading, if you like, has gone down. So maybe this will give us the opportunity to load the machine up with more plug-ins. Well, there's the fourth one that we couldn't run before, which is now running fine. And there's the fifth one gone in, which again is running fine. Now I want to stop this and restart it just to see if I have any problems with the reliability of
Logic on the fly, but that would seem to be fine. And then here comes the sixth one, which is running. So we are successfully running now with six, and we couldn't get beyond three before. You can see now we are getting very close to full utilization of the cores. There's a bit of a spinny wheel. I'm actually losing some responsiveness in the machine, because we're using the low-performance cores for plug-in processing. You can imagine that we don't have CPU available anymore to deal with my system interrupts of moving the mouse, playing with the keyboard, maybe refreshing the screen, although that's more GPU, but you know what I'm saying. We've used up those resources which were spare and available, but we are getting a lot more plug-in work done. Let's try number seven. That's where I get the overload. Run it again: you can see six is reliable. Now that's twice the number of these really heavyweight Abbey Road programs compared to what I was able to run before. If this was a lower-cost plugin, you can imagine there might be another 20 plugins that you could run that you weren't able to run before.

Some of these settings that we saw earlier weren't making an obvious difference. So if we go to Preferences, Audio preferences, the process buffer range didn't really make an obvious difference, but if we change it back now, it won't run. I've done quite a bit of testing on this; let me do it again for you. It's not going to run. Set that back to the large buffer range, and now you're going to find that it does make a difference when we have this ability to spread the load over all of the cores. So again we're running. I am convinced that the best setup for playback performance is the one I'm showing to you now, which is: on an M1 machine you open up all eight cores to the playback processing, you set a large buffer size, and you prioritize the multithreading option to playback
only, and sacrifice the performance and reliability of the machine as you overdub and hit it with more live input to software instruments, which obviously in this situation we're not doing, because we're in mixdown.

So finally I want to bring together certain things in conclusion. First of all, we've found that Logic in terms of performance is a dynamic machine, in the same way that audio is dynamic. Because we are automating in our mix, and we might be turning on routing through automation, our CPU needs to be dynamic. We need headroom in our CPU capability, otherwise in a session that we start in real time, at some point as we try to do something extra within our plug-in resources, we might actually have a system overload. So that's the first thing: it's a dynamic process.

Secondly, we found that the M1 system is incredibly thermally efficient. Because of the peaky nature of Logic's requirement for CPU, although we do have spikes, we're not really heating up the CPUs. We've seen a maximum of maybe 60%, tops 70%, CPU utilization. This leads me to the suggestion that if funds are tight, the MacBook Air is probably still a really good choice for Logic. With the RAM effectively being the same, if you compare a MacBook Air M1 to a MacBook Pro M1, really the difference is that the Air has no active cooling. We find that even when we're having unstable Logic performance we're nowhere near that thermal limit, and so I would suggest that a MacBook Air M1 would be a very, very good choice for running Logic. Now I'm not saying that you shouldn't put 16 gigs in it. I'm still a firm believer that having the memory does make a difference. If you were making a pound-for-pound comparison for Logic, then a MacBook Air with 16 gigabytes is probably the same kind of price as a MacBook Pro with eight gigabytes, so go with the Air and put a little bit more memory in it, and your buck is doing a better job for Logic.

In terms of the new M1 Mac mini,
that has a cooling architecture quite similar to the MacBook Pro, so really the MacBook Pro and the Mac mini M1 are both good examples of the best we can do in terms of cooling.

I would love to see comments from you: what plugins are you using, what's your favorite stuff, what do you find to be the biggest bang for buck in terms of CPU utilization? What's your most efficient plugin that's your favorite? What is your most inefficient, hungry plugin that's your favorite? If this has been helpful, please subscribe to the channel so you keep up to date with whatever else I do videos of next. This was all about CPU utilization. I will put in a link for you here that will show you more about memory utilization.

Finally, let's keep mixing and keeping it real. I just think we need to be really careful of Logic demonstrations on YouTube where you never hear any audio and nothing is happening for real. We've learned that Logic can actually be a really interesting beast when you don't route anything anywhere. Okay, see you next time.
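
The I/O buffer arithmetic from section 5 of the video can be checked with a short sketch. This is an illustrative Python calculation, not anything Logic exposes: the function names are hypothetical, the 1024 samples, 24 bits and 72 tracks are the figures quoted in the video, and the 96 kHz rate assumes the 96k session mentioned in section 3. It also assumes one mono buffer per track, which is a simplification.

```python
def buffer_bytes(samples: int, bit_depth: int) -> int:
    """Bytes needed to hold one buffer of `samples` at `bit_depth` bits each."""
    return samples * bit_depth // 8


def buffer_duration_ms(samples: int, sample_rate_hz: int) -> float:
    """Playing time of one buffer, in milliseconds."""
    return samples / sample_rate_hz * 1000.0


# Figures quoted in the video; the sample rate is an assumption (96k session).
SAMPLES = 1024
BIT_DEPTH = 24
TRACKS = 72
SAMPLE_RATE_HZ = 96_000

per_track = buffer_bytes(SAMPLES, BIT_DEPTH)   # bytes for one track's buffer
total = per_track * TRACKS                     # bytes across all tracks
duration = buffer_duration_ms(SAMPLES, SAMPLE_RATE_HZ)

print(f"per track: {per_track} bytes ({per_track / 1024:.0f} KiB)")
print(f"{TRACKS} tracks: {total} bytes ({total / 1024:.0f} KiB)")
print(f"one buffer at {SAMPLE_RATE_HZ} Hz: {duration:.2f} ms")
```

Under these assumptions the buffers total 216 KiB, matching the figure in the video, and a single 1024-sample buffer lasts roughly 10.7 ms at 96 kHz. The 26 ms round trip that Logic reports is plausibly two such buffers plus driver and converter overhead, although the video does not break that figure down.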
Info
Channel: Mark Payne
Views: 69,074
Id: W3zabsRfT8c
Length: 33min 16sec (1996 seconds)
Published: Tue Jan 05 2021