We Exploded the AMD Ryzen 7 7800X3D & Melted the Motherboard

Reddit Comments

Lol that comment about building an ultimate death-PC by combining the exploding motherboard BIOS with the 7800X3D, the 4090 with improperly inserted cable and the Gigabyte PSU that explodes when overloaded. The ultimate nightmare build.

👍 717 · u/sips_white_monster · Apr 30 2023

I'm midway through the video and glancing over at my 7800X3D and Asus X670E Hero. HWMonitor is reporting 1.25V for SOC. I updated my BIOS to the second latest (I noticed a new BETA BIOS released yesterday, but I'm going to wait until that's out of BETA).

I think I dodged this issue, fingers crossed! I only ran the chip on the older bios for a week.

Still, very disappointed at the whole situation and if my CPU fails months from now, I'll always wonder if it was slowly being cooked in that first week when I got it before the BIOS update became public.

👍 263 · u/[deleted] · Apr 30 2023

asus wtf???

👍 266 · u/puffz0r · Apr 30 2023

Imagine if we didn't have people like Steve doing this kind of work. Companies would have free rein for consistently poor quality control.

👍 756 · u/heymikeyp · Apr 30 2023

so for the user, basically just make sure bios is up to date and keep an eye on SOC and make sure its not too high or just not use expo at all and wait until bios updates become more stable? am I getting that right, does anyone want to correct me? just trying to make sure since I'm building two pcs with 7000 series cpus in a week or so

👍 178 · u/Mouselift · Apr 30 2023
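A rough way to act on the "keep an eye on SOC" advice is to compare a measured reading against the 1.3 V ceiling that, per the video captions, AMD's updated statement gives. The sketch below is illustrative only: the readings are the per-board figures quoted in the captions, and `vsoc_status` and `AMD_VSOC_LIMIT_V` are hypothetical names, not part of any monitoring tool.

```python
# Illustrative check of SoC voltage readings against the 1.3 V ceiling
# cited in the video captions. All names here are hypothetical.
AMD_VSOC_LIMIT_V = 1.30


def vsoc_status(measured_v: float, limit_v: float = AMD_VSOC_LIMIT_V) -> tuple[str, int]:
    """Return ("ok" or "over", headroom in millivolts; negative means over the limit)."""
    headroom_mv = round((limit_v - measured_v) * 1000)
    return ("over" if headroom_mv < 0 else "ok", headroom_mv)


# Highest EXPO-enabled readings quoted in the captions, in volts.
readings_v = {
    "Asus X670E Crosshair Hero": 1.40,
    "ASRock LiveMixer B650": 1.246,
    "Gigabyte": 1.22,
    "MSI": 1.20,
}

for board, volts in readings_v.items():
    status, headroom_mv = vsoc_status(volts)
    print(f"{board}: {volts:.3f} V -> {status} ({headroom_mv:+d} mV headroom)")
```

Software readouts can differ from what is actually delivered (the captions report a 1.35 V setting measuring about 1.4 V at the board), so a check like this is only as good as the reading fed into it.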

Who else is watching this on their Ryzen 7000 system, running a beta BIOS?

Oh, it would be the ultimate irony.

👍 321 · u/jacf182 · Apr 30 2023

Asus just used to be the go to, now it's just problem after problem. "For those who dare" sounds like a threat now lmao

👍 222 · u/techtimee · Apr 30 2023

Looking back over the video, and at vendor responses given previously, I can't help but read all this as really shady and unethical behavior from all of these companies. Each of them more or less kinda pawned it off as overclocking or cooling related at some stage, implying on some level it's related to user error:

We are aware of a limited number of reports online claiming that excess voltage while overclocking may have damaged the motherboard socket and pin pads - AMD

To support EXPO and/or memory overclocking at DDR5-6000 and beyond, SoC voltage has to be sufficiently increased to ensure compatibility and stability - Asus

As confirmed with AMD, any intentional manipulation of these settings can damage the processor, socket, and motherboard. - Asus

AMD EXPO technology can be used to optimize memory performance by appropriately increasing the CPU SoC voltage to ensure system stability when operating at higher memory frequencies - MSI

Everyone kiiinda soooorta admitted it was related to excess SOC voltages, but didn't really own the fact that they're the ones who caused those excessive voltages or that it was done deliberately. That part wasn't a bug, they chose to do it.

That creates a funny problem. If memory DDR5-6000+ functions at 1.3v SOC or less, then it validates GN's statements that EXPO shouldn't be messing with SOC at all, and establishes the above statements as outright lies. If DDR5-6000+ now ceases to function on a bunch of these boards, then all of them have been falsely advertising speeds they can't support.

And at the end of all that, it's not even the whole issue. They look even worse once you look at the overarching issue, especially in relation to OCP and PROCHOT.

No one's faultless here, and Intel's pulled their own share of insidious crap over the years, but this leaves a really bad taste in my mouth about AMD and Asus in particular.

Multi-billion dollar companies with thousands of employees, but it takes a comparatively tiny operation on friggin' YouTube to sink a week plus and thousands of bucks into it for people to get honesty? That's so messed up (but appreciated).

👍 136 · u/duskmarch · Apr 30 2023

Paying 700$ for a MOBO only to have OCP not trigger when the socket is trying to feed a CPU 440W, bro, even the FX 9590 didnt use 440W, who in their right mind thought a CPU needs 400W+ of power, especially a CPU with a TDP of 120-180W???? Engineers at ASUS thought everyone got that liquid nitrogen cooling😭😭😭

👍 112 · u/Laziik · Apr 30 2023
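The 440 W figure follows from the clamp reading reported in the captions: 37 A on the EPS 12 V cables. Below is a minimal sketch of that arithmetic, assuming a nominal 12.0 V rail and treating VRM efficiency as an assumed parameter (the captions say only "80-plus percent"). The function names are illustrative.

```python
# Back-of-the-envelope power arithmetic for the 37 A EPS12V clamp reading.
# The rail voltage and efficiency values are assumptions, not measurements.

def eps_input_power_w(current_a: float, rail_v: float = 12.0) -> float:
    """Power drawn through the EPS cables: P = I * V."""
    return current_a * rail_v


def delivered_power_w(input_w: float, efficiency: float) -> float:
    """Power reaching the socket after VRM conversion losses."""
    return input_w * efficiency


input_w = eps_input_power_w(37.0)
print(f"EPS input: {input_w:.0f} W")

for efficiency in (0.80, 0.90):
    out_w = delivered_power_w(input_w, efficiency)
    print(f"  at {efficiency:.0%} efficiency: {out_w:.1f} W into the socket")

# Compare against the nominal TDP figures mentioned in the captions.
for tdp_w in (120, 170):
    print(f"  input is {input_w / tdp_w:.1f}x a {tdp_w} W TDP")
```

Under these assumptions the input works out to 444 W. The "400 watts at least" figure in the captions corresponds to roughly 90% efficiency; at exactly 80% it would be about 355 W, which is still around three times the 120 W TDP.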
Captions
Holy, it's at 35 amps. Oh, it's on fire. Yeah, I can smell it. Jesus Christ, 200 Celsius. Holy... that's, um, that's not supposed to be like that. There's also a level of possible fury here that you should be aware of. [Music]

These CPUs self-destructed. Literally: they internally combusted to such a degree that the inside of them blasted outward. When we conducted internal failure analysis, we saw scorch marks in the same spot on three dead CPUs: the center of the I/O die and the edge of the CCD, and that immediately pointed us to the root cause. We also saw fractured silicon. Although AMD released a statement about lowering the SoC voltage, we still need to understand exactly what happened. These companies bottle up details forever when they find the actual root cause, and in our testing there's more to it than simply the SoC voltage at the surface level. It's not just death over time from VSOC. We found improper fail-safe protections on Asus and useless OCP on Asus. We also found bugs with thermal trips between a couple of boards. We found a BIOS so buggy with Gigabyte that it can kill chips accidentally. "Now that's interesting. Wow. Okay, what the hell? Something's wrong." "Yeah, maybe their SSC knob is broken." And we found borderline adversarial communication between AMD and its partners, which is a different issue entirely.

We're sending one of these out to the same failure analysis lab that worked on the Nvidia 12-volt high-power cables for us, but that content will be in a few weeks. Today, though, we soldered leads to rails that we suspected were running voltage too high, we employed thermography to prove that yes, explosions are hot, we used an external voltage controller to read the VRM configuration, and we ran testing on the BIOS revisions that Asus silently snapped out of existence. Oh, and we also killed two of our 7800X3Ds and two motherboards, so it's quite expensive, but it's in the name of science.

This video is brought to you by us and our special limited edition 15-year
anniversary GN foil shirt on store.gamersnexus.net. I started the Gamers Nexus website in May of 2008, and within the next few months of 2023 we're going to hit 2 million subscribers. To commemorate this, we're launching this shirt on pre-sale right now. It features a vibrant, brilliant blue component-diagram GN logo, complete with PCIe slots, GPUs, fans, and more modeled in. We also used a gold foil "15" integrated with this special variation of the GN logo. The back of the shirt uses gold foil to show our trend line from when we were only a website until today. There will be a total of five special GN15 items launching over the course of the next year or so to commemorate 15 years in operation, all of which are limited and collectible within their different categories. This shirt directly funds massive testing efforts like this video and helps us to continue investing money and time into public-service testing and analysis, and you get something unique, shiny, and brilliant, just like exploding CPUs. Backorder one on store.gamersnexus.net today.

All this started going down right as we arrived at AMD's headquarters for a totally unrelated video, so that was convenient for us. AMD was in the middle of its scramble trying to diagnose it, so they were just as fresh working on this as us, maybe with a little bit of lead time, but this is all a very new issue. As soon as we got back, I set forth to working on some specific aspects of diagnosis and asked Patrick on our team to assist me in tackling the rest. Here's a quick recap of what's going on. Although there's evidence of this happening prior, the biggest post was on April 20th of 2023, when user Speedrookie posted this harrowing set of images on Reddit showing a CPU and a motherboard. We reached out to the user and requested the fastest shipping under $500, and then we paid the user for their board, their CPU, and the shipping. This arrangement, by the way, we are always open to. If you have some kind of catastrophic failure of a part,
it allows you to skip the RMA process, it allows us to research it and have a public demonstration of what happened, rather than the company going "oopsie, here's a new one, let's find a rug to put that under," and it moves you along quickly. So this allowed Speedrookie to move forward and we got our content. A trend of other users with Asus boards especially and dead CPUs popped up on Reddit, but since then other users have posted in a megathread about similar issues, including one from user Skyfish JY. This one was of a Gigabyte board and a 7800X3D as well. The user didn't apply EXPO and used only factory settings. In the photos, the user's board looked okay, but the CPU was clearly bulged. Once again, we bought these components from Skyfish as well and got the expedited shipping so we could add it to this video.

So this setup behind me is what we use for testing. We have a two-camera setup plus a thermal camera. We added a camera this time from the 12-volt high-power stuff because last time we were like, "wow, that footage is some good meme material forever." "It's currently at 258 degrees." "I think we see the smoke. Steve, you see the smoke?"

There's no cause for panic immediately. So if you have an X3D CPU, even if you have it in an Asus board, you don't need to freak out right now. We do have some steps at the end of the video, and we're going to explain why it's all happening, but this isn't so widespread that everyone's CPU is going to fail. Just like the 12-volt high-power stuff, there tends to be a lot of chaos and panic as soon as the first couple of these get posted, and you don't need to panic completely. Only some of you will have this happen. Does that help? But we think most of these are related to an SoC voltage that's too high, causing a slow death of the CPU. This is something that's known at this point and well documented.

Future Steve here. However, we also discovered that Asus has a major [expletive]-up on its over-current protection, where its super fancy $700 ultra-premium motherboards are
designed in such a way that there are insufficient protection mechanisms, where over-current protection allows for the CPU to be fast-cooked from underneath after the CPU is separately slow-cooked from within. So it's a combination of failures here, but the catastrophic part of this is one that Asus can completely avoid, which we're going to talk about a little bit later. It's possible this happens on motherboards outside of Asus too. We can't possibly test them all, but Asus was the easy one to figure out, and it took four days of non-stop testing.

So let's start with filtering what we know and some plans. First, as we entered this testing, we kept an open mind to the variables presented by motherboard vendors, AMD, BIOS versions, and users alike. We noticed some of the failed parts online didn't exhibit the same mutual destruction between the board and the CPU. That indicated to us that there are actually two problems rolled into one: one is that the CPU is dying, and the other is that some motherboards then proceed to self-immolate and barbecue the CPU even further because a protection mechanism probably failed. "That's not good. You killed it the wrong way." Based on the data we're presenting today, then, we believe this to be one part AMD's fault and one part board vendors' fault.

We're going to get into the hypothesis and some of the physical inspection first. Again, we have a separate video coming out in a few weeks. This stuff is not quick to do. We're trying to move as fast as we can because the internet moves fast, and this particular failure analysis we're doing separately is going to require external lab investigation, and that's going to take some time. But for the first part, here's our hypothesis. We have three sets of components we're working on. We have a dead CPU and a catastrophically failed motherboard from Speedrookie. We have a dead CPU and actually a still fully functional Gigabyte board from Skyfish; that one failed properly. And we have a was-working-but-died-for-the-cause
GN CPU, and the motherboard still works. Actually, make that four sets of parts, not three. While filming this video we decided on one more try and YOLO'd another $650 on a 7800X3D and another $700 on a motherboard, and we managed to kill $1,300 worth of parts. That is all the more reason you should go to store.gamersnexus.net and grab one of these shirts, because it would help us out with funding these types of videos. We also have a promo where if you spend $65 or more you get an additional discount, and that's at the top of the store page at store.gamersnexus.net.

Using the full CPU pinout and schematic from a friend of ours (it's a lengthy document), we were able to use these images of the pinout to figure out what we care about on these CPUs. At this point Patrick and I split them up and we started doing some resistance checking. We found a low-resistance short on the CPU that failed catastrophically and torched the motherboard, but we found zero-ohm complete shorts for the CPU we killed, which we'll talk about later, and for the second CPU, from our viewer Skyfish. In both instances of zero-ohm shorts, the motherboard was still working. In the instance of the low-resistance short, however, the motherboard incinerated itself to the point of melting the socket. Speedrookie's system was that one, and it was still on after the failure because power good was still asserted. That implies that either Asus has OCP that's too high, such as on Loop 2, or the chip burned in a way that the low resistance kept it being fed current.

If the CPU is destroyed in such a way that the motherboard shuts down, obeying OCP and OTP, then the silicon will be damaged irreparably but the motherboard will survive. Low-resistance shorts can break OCP and other protections, which may allow the motherboard to remain on, shoving current into a dead chip. In such an instance, and especially if a motherboard vendor like Asus configured Loop 2 OCP to be too high to where it's ineffective, the system ends up in
a state where the board continually attempts to power up a dead CPU, sort of like a boot loop with improper or blown-apart protection mechanisms. An additional form of catastrophic failure could be the indium melting and shorting the SMDs on the CPU. In conversations with experts in the industry, many of whom we can't name, we received confirmation that a low-resistance short like this could lead to a catastrophic motherboard failure. However, in our final hours of attempting to reproduce the motherboard-incinerating-itself problem, we made a new discovery that threw a wrench into the gears, and that discovery was an additional failure path.

In the final test, we ran the 7800X3D and the Asus Hero with a full cooler. We ran a Prime95 small-FFT heat load and monitored the thermals. Now, it's well known at this point that a high VSOC set point of, say, 1.35 volts, especially if it actualizes as 1.4 volts, which is what was happening, will slowly degrade a CPU, and it could eventually lead to the failure that we saw. It's a time bomb on an extended scale: maybe never if you have a really good piece of silicon, but worse lottery pulls on chips will yield failures faster. So it's just an aging problem. We chose a VSOC set point of 1.45 volts, which is actually within 50 millivolts of what some motherboard vendors were originally running anyway. Now, this is hot, but the point of this is a rapid simulation of aging and silicon degradation, the equivalent of, let's say, a 1.35-volt set point, which was the original Asus set point we'll talk about later, over maybe six months of use. In any event, even with a VSOC 50 millivolts higher than the board vendor set point, the result shouldn't be fire. There should be some kind of protection; they're supposed to be in place. We observed as our clamped VSOC pads that we soldered to climbed slowly, from a combination of leakage, heat load, and a screwy Asus VRM configuration, to actually 1.5 volts SoC despite a much lower set point. This is insane, and it's what will happen from
prolonged exposure to excessive ODM voltages eventually. Without ever hitting TjMax and thermal-tripping in the OS (we were still under temperature control here), over-current protection kicked in instead and the system partially shut down to a double-zero BIOS code. This is normal. We observed that the fans weren't spinning in this state, which made us wonder if that's leading to a thermal failure from very slow cooking, say overnight. Fortunately, this was not the case. As Wendell noted in our simultaneous testing call: "It's on, but you're getting double zero, but the SoC voltage... voltage okay, probably considered as safe forever." But we allowed the system to run in the double-zero state for a while, and we even removed the cooler completely. It's not like it was on anyway. The CPU IHS never breached 38 degrees Celsius in thermal imaging with an RSE 600, so this alone doesn't cause the failure, but it's related to one. One external mechanism for throttling the CPU is PROCHOT, usually asserted by the VRM, which would allow it to avoid an over-temperature-protection shutdown, but something didn't work here.

So we flipped the power supply off, then we flipped the switch off, we remounted the cooler, and we turned the system on. Instantaneously, we were met with this. "That's it, we did it. We desoldered it. We desoldered the CPU in the socket." Here's what happened. The part is obviously shorted out; we're very close to it in this state. So what should have happened after the first OCP kicked in and shut us down is that Asus's OCP should have taken over on that next boot and prohibited the scenario we ended up in. We reached out to an electrical engineer embedded deeply with this issue, who we can't name, and we asked for some additional assistance in explaining it, and they put it in simple terms. The engineer said, "Yeah, that's Asus's OCP [expletive]-up," so everyone can understand that. When you turn on the power supply and attempt to boot it like we did, the system immediately knows there's a short, and it should cut the power
at that point. Asus even has an extra embedded controller on these high-end boards, and that controller could be leveraged for additional fail-safes in really cool ways to prevent any of this from happening. Unfortunately, laziness prevented Asus from taking advantage of its own advertising and marketing gimmick, which is the controller. It could be not a gimmick.

From the board's perspective, a double-zero code means that the power-good signal has not yet been asserted, so the CPU should be holding itself in reset until the power stabilizes. Double zero is basically the same as no CPU present at all. Once the power-good signal is asserted, it'll start the process and exit double zero. Double zero is essentially holding the board in a cold reset. You can see the system shoving current into the CPU even though it's clearly already shorted, and it can only lead to the socket melting. We watched as the current clamp climbed to 37 amps on the EPS 12-volt cables, or about 440 watts. The VRM can take it, so that means that OTP in the MOSFETs won't kick in, and we were running without a VRM heatsink anyway, so we actually put it in the best possible position to shut itself down safely. A real user would be at more risk here than we were. We pulled the cooler when we started seeing smoke, and the thermal camera revealed that the CPU was at 200 degrees Celsius. Insanely, we noticed that the substrate of the CPU was hotter than the IHS itself. This, in combination with the crack we heard, indicates to us that the CPU literally desoldered itself and sunk all the heat into the substrate instead of the IHS. But it was already long dead at this point, and because Asus has an unreasonably high OCP for these CPUs, really in general, the end result is that the CPU continues to get force-fed current that it eventually can't handle, and that's in a double-zero state with no cooling, potentially. This is pretty damning, and changing the VSOC isn't enough here. That's the lazy approach. It's a catch-all, but there's more that
could be done, and Asus and AMD both need to take action. If we were Asus, here's how we'd look at this bad scenario: 37 amps in this test at, say, 80-plus percent efficiency puts us at 400 watts at least being force-fed into the CPU, and this is for a non-overclockable CPU that was not overclocked, with a nominal TDP of, say, 120 watts, maybe 170 watts if you're running something high-end. So the OCP here is way overrated for what's actually in the socket, and that's bad. The fact that it was at 400 watts at least and power good was still not asserted indicates to us, minimally, that Asus's fancy premium controller on its fancy premium motherboard should at least be detecting that, let's say, the CPU is on for five seconds or attempting to be on and it's receiving excessive current, so maybe we should do something about that and shut the system down. The controller has these capabilities. They could also reconfigure OCP to be lower. Again, these aren't overclockable chips anyway, and for the overclockers and for XOC you can always manually bypass that, so it's not taking away any of the levers that we have. It's just enforcing stricter ones for the more likely stock scenario.

And Asus isn't alone here. AMD has a lot it can improve as well, besides just the communication, which AMD has kind of always struggled with. "What's up, gamers?" And we haven't even gotten to Gigabyte yet; that's in a few minutes. This particular issue should have never made it out of Asus, and it's up to AMD, the makers of the CPU and the chipset, the ones who have the actual knowledge of what it can tolerate safely, to enforce and police its partners in a way that at least makes them follow these basic guidelines. You don't want too much enforcement and too many restrictions on the partners, because it restrains creativity, it stops the motherboard market from being as diverse and interesting as it is, and it could have knock-on bad effects for enthusiast things like overclocking. But at the very least, something
like this is not unreasonable for AMD to add to its list of "hey, we should make sure the CPUs don't melt themselves in the socket because the motherboard vendor maybe did something stupid." It's up to AMD to tell Asus that it is in fact an issue that could escalate to a safety issue. Now, it's an extremely unlikely thing, but just to maybe scare some of these vendors into gear without hopefully pushing them too far: it is feasible that in the exact wrong scenario, with perhaps the raw material buildup around the socket, with a short to ground somewhere else that's not accounted for in the CPU or board, you could potentially get into a scenario where there's a complete runaway and a house fire, or at least a small containable fire. Extremely unlikely, to emphasize that, but we've seen it before, and we've repaired user systems that have been through such scenarios. So AMD at least is responsible for getting insights from its partners and providing them to make sure that settings like OCP are not incompetently set, perhaps under an assumption that guidance really is just guidance, as in "you should do this," not "hey, it's going to fail if you do this the wrong way."

Now it's time to move to the delidding for internal inspection. For this, we completely preserved the motherboard and the CPU from Speedrookie. Those are the important ones. We did not tamper with them at all beyond probing some specific pins, and we informed the FA lab which those were. So instead we delidded the CPU from Skyfish, because that's the one that came with the Gigabyte board that still works. We confirmed it's working; it even got the BIOS F5a or whatever it is. The other one will go through submersive acoustic testing, scanning electron microscopes, and more. But to delid the CPU while preserving as much damage as we could, Patrick and Vitalii flossed the eight legs of the CPU to cut through the adhesive without affecting surface-mount devices. "Making a dentist..." [Music] "Cutting that tree. Are you a dentist or a lumberjack?" Engine
professions. You give this guy a strand of floss and he's like three different professions. Once done, I clamped the CPU lightly in a vise and shoved the thermocouple under the heat spreader. Using guidance from our friend and community-voted number one bromance of the year, der8auer, we used a heat gun to heat the IHS up to 160 degrees Celsius, approximately the melting point of the indium solder holding the heat spreader to the silicon. At this point the lid fell freely, no prying required whatsoever, and we were able to delid it without any damage to the silicon itself.

We took some microscope photos and videos and we sent those to the external component failure analyst we worked with for 12-volt high power, and the same analyst said this, quote: "I'm seeing a cracked die from the high heat of the overstress event. Cracks emanate from the center melt site. That's not something you could have done with the delid. There's evidence of overheating (that's the darkened silicon) and of melted silicon (1410 degrees Celsius melting point). If so, there may be a few others right of center. Fortunately, acoustic scans should clue us into whether such material discontinuities are present prior to destructive testing. Sometimes when larger-scale EOS, or electrical overstress, occurs, a lot of different surrounding materials are pulled into the fray. I think this is what happened here too." Our analyst said it looks like an organic material has bubbled up through the silicon melt site. The nearest organic polymer could be at the active site, where the connection from the die to the substrate occurs; that is surrounded by an organic-like solder mask or underfill. "The easy solution would be to run it through EDS, or energy-dispersive X-ray spectroscopy, to confirm." Yes, yes, easy. That is why we're sending it to them.

The result was a ripple that cracked the silicon from the inside out. You can actually see it originating at that scorch mark. It exploded to such an extreme that it bulged the land
grid array outward on the bottom of the CPU. When we scoped our own dead X3D that we killed during the course of testing (more on that later), we saw a burn mark in nearly the same spot on the SoC. More importantly, though, we noticed at least one of these CPUs had discoloration of the epoxy at the edge of the CCD and a scorch mark on the adjacent part of the substrate. In fact, VSOC feeds part of the CCD, so there may be an issue here of a voltage differential where it is forced to exit through the explosion of the silicon.

To do our next part of analysis, we used some images from Locuza, who posts a lot of very high-quality photography of dies and die shots online and does phenomenal work, and we transformed that by multiplying it as a mask on top of the damaged silicon. First, we used a clean delidded die that we also have here and we scoped it to figure out the orientation of Locuza's image. Then we multiplied the image onto the failed I/O die. The failure within the I/O die appears to be in the middle of the GPU core complex, which actually drives the display.
Interestingly, we learned that the AMD IGP is one of the I/O die parts that's fed voltage via VCore, not just VSOC. We have a voltage block diagram we acquired from a contact at a motherboard manufacturer that confirms, again, that VSOC actually goes to part of the CCD. It's specifically this combination of the damage at the edge of the CCD, with the discoloration of epoxy and the scorch mark, plus the I/O die scorch mark within the GPU complex, that led us down the path of "this is probably why it exploded." And it was such an explosion that Gigabyte's power supply team would blush.

We brought this next theory to a different contact, our friend Wendell from Level1Techs, and he had this to say. This makes sense, because we know there are different voltages going into different parts of the I/O die and chiplets. The voltage differential has to go somewhere. Those are probably not completely isolated circuits. You're going to have some leakage even if they're perfectly isolated, but the reality is they probably aren't. You're going to have at least the logic circuits talking to each other between those sides. If you have a voltage differential, the differential has to be sunk to ground somewhere, somehow. In other words, it's going to go to heat. The voltage differential could go to heat that the processor experiences in a way that the engineers did not anticipate, because the voltage is so much higher than stock. It'd be 50 percent higher if it's 1 volt versus 1.5 volts or whatever. I would go so far as to say it's not even really just the voltage by itself. If you have a really high voltage but you only let a few electrons through, that's fine, but the SoC could be saying, "No, no, I'm in boot-up-now mode, I need all the current." And it makes sense why it'd be tricky to catch in QA or QC, since it's likely on an axis where a combination of silicon quality between the I/O die and CCD are related. You roll a one for the I/O die, for the CCD, and for the BIOS quality control. Congratulations, what do you win? Rapid
unscheduled disassembly. Thanks, Wendell, for the quote. I'll use that forever.

Now, the BIOS quality control point is important. This brings us to the next aspect of this, which is that it's not just an AMD thing; it comes down to the board vendors as well. Prior to any of the physical testing, we noticed one big problem with the Asus board specifically. In order to do this testing, we soldered some leads to the Asus X670E Crosshair Hero to get more accurate readings on the SoC voltage that's actually delivered. Methodologically, this is necessary because software reporting and BIOS settings are often different from what can be measured. A massive shout-out to Elmor from ElmorLabs for helping us with some guidance on the right direction to go for this section. Elmor provided us some expert insight that helped us formulate test plans, and also builds custom overclocking tools (the External Voltage Controller is one, which we're using) and other cool enthusiast gear. You can find him on elmorlabs.com.

I just want to jump back in here for a second and say this piece was awesome to work on because we collaborated with so many experts in the industry. Some could be named, some couldn't, but to all of you, thank you. It's been a lot of fun working on this.

So here are our findings. Enabling EXPO has the Crosshair Hero running at 1.35 volts SoC set in the BIOS. By "set," what we mean is that's the number it says, but not necessarily what it reads. This is already very high by AMD's own standards, though, and AMD's new statement says it shouldn't go higher than 1.3. The actual voltage delivered when probed at SoC MLCCs or on the VSOC check pads on the board is 1.4 volts. We saw one spike up to 1.41 despite the already high 1.35 volts in BIOS, and we most commonly saw it sitting at about a 1.39-volt reading, like you see here. That's getting dangerously high for the CPU, and at this point this is also something AMD has confirmed, but we'll come back to that. One thing is for certain: Asus is running its SoC
voltage way too high. It should be closer to 1.3 volts, or ideally 1.25 volts, and the reason to blast the SoC voltage like this largely comes down to laziness. It's the lazy way to ensure that there's wide compatibility without finer, deeper tuning of the BIOS in order to get the memory compatibility that they want.

As for these numbers we're reporting: we confirmed this on BIOS 0922, which was the one just before official X3D support. We also confirmed this behavior of the 1.4-ish voltage for SoC on 1101. That BIOS officially supported X3D, but don't let Asus trick you, because they retconned it and decided "oops, no it doesn't." But originally it said it supported X3D. And then we also confirmed this behavior on 1202. That is the last BIOS that Asus left on its site before it started hiding and burying all prior BIOSes. So they all lazily blast the SoC.

Because Patrick and I were so slammed trying to get the content through the pipeline, we again reached out to Wendell and asked him for assistance checking more motherboards. He was happy to assist, maybe a little too happy. "We are going to film the death of the CPU, or try to film the end of the CPU." Here's what Wendell found, using a mixture of leads wired to the boards, like we did, and HWiNFO. Wendell noted that the ASRock LiveMixer B650 board pulled 1.01 volts on auto, 1.20 volts with the BIOS set to EXPO 1, and 1.246 volts with Prime95 loading it. He found the Phantom Gaming pulled between 1.22 and 1.24 volts with EXPO, depending on whether Prime was loading it, and he found that Gigabyte was at 1.2 to 1.22 volts. MSI was at 1.19 to 1.2 volts as well. So Asus is a little bit of an outlier here. They're much higher than all those, but they're still not alone; they're not the only reason these things can fail. That's where part AMD and part other board vendors comes in, because if you remember, Skyfish's system was a Gigabyte board. That one failed; the motherboard lived, though, and the Gigabyte board didn't even have EXPO turned on, according to
skyfish, so that would be close to the 1.0 VSOC, which makes things a little trickier, and that's probably where the FA lab will come in, in a few weeks. So we think it's more than VSOC, and if you look at AMD's statement, they did talk about tuning other voltages, including the SOC, but we'll come back to that. We're also aware of at least one Biostar board that was at or in excess of 1.4 volts SOC as well. But one important note here for EXPO: think of EXPO itself, again, as sort of an XMP. The profiles have nothing to do with VSOC. EXPO does not contain a number for the SOC, or system-on-chip, voltage to apply, and EXPO shouldn't necessarily mean that you are also applying a dangerously high voltage. That's on the motherboard vendors; that is ASUS's decision to do that. So EXPO, although it is a potential trigger if configured incorrectly by the motherboard vendor itself, is not the cause for this. Just want to make that really clear, because EXPO is pretty cool. It has a lot of really high-quality tuning and timings built into it for the different kits, and that's what's making AMD as competitive as it is right now, because without it they're kind of weak. Regardless, SOC voltages of 1.4 are clearly not necessary. In the week or so of testing that we ran on this, we found a whole host of other bugs, some related to this, some not, but there were a lot of them and we can't fit them all in this video. We tried; it added 15 minutes. We're going to run a follow-up piece with some more information on that, and some ranting about how this was handled very poorly in a lot of cases by everybody, AMD and motherboard vendors, especially ASUS, though. But that's a separate thing. So the main issues, the severe ones, from this piece were PROCHOT, VSOC problems of various types, and OCP being basically improperly executed. To quickly recap some of the other ones we found, because we've never had to dig this hard into AMD's AM5 platform before, here's the short version. On some platforms, we noticed X3D CPUs
were shutting down at inappropriately high temperatures for over-temperature. So on ASRock, Gigabyte, and MSI, on various boards, we observed, in combination with Wendell, a 116 degree thermal trip point, but on ASUS, for the board we tested, it was 106. It's supposed to be 106 for X3D and 116 for non-X3D, so this got royally screwed up by at least three board partners, which to us means AMD is at fault for not communicating it properly and offering adequate support to those partners. On Gigabyte F5a, we observed a potentially fatal bug where it was sometimes not possible to reset VSOC to auto or defaults. And on ASUS, we noted another bug where loading EXPO via the AMD Overclocking menu provided by the AMD AGESA binary, rather than the ASUS menu, would load the correct SOC voltage, so it'd often be 1.2 or 1.25, while ASUS's menu loaded it at 1.35 for the BIOSes we were testing, except for those newest patches that came out. Now, unfortunately, ASUS has another bug, or AMD does (it's in their menu, after all), where it doesn't load the correct VDDIO voltage using this menu. That results in blue screens and crashes because of the wrong VDDIO. Honestly, this platform is just a complete mess, and we didn't realize that going into the initial reviews because we followed our standard review process and everything worked fine for the purposes of testing, other than whatever we might have noted in those reviews. But this issue necessitated a much deeper look, exploring a lot more settings, a lot more combinations of hardware, and it just made it look really much messier than the standard approach does. So that's not good for AM5, and there were more bugs that, again, we're not even talking about here because they're just not related enough. For the rest, here's the quick and final rundown for now. This, as always, is not fully and definitively conclusive, because it's an extremely complex issue. It's really hard to replicate this problem; it took us the entire time to get the catastrophic failure to happen, so
that should reduce your concerns significantly. We had to actively try to find out how to make this happen. That doesn't mean it won't accidentally happen, probably a much higher likelihood, obviously, than trying, but the point is it's not easy, and so that should temper the concern about how likely this is to happen to you. We had a limited sample size, of course, and we also were working off of changing information. Now, the good news here is we had some extremely conclusive results, so we're out of hypothesis territory and into just actually proven fact with some of these problems. The OCP issue: that was proven fact. The theory about a sort of voltage differential causing an escape path that is undesirable: that's in theory territory. Here's the bulleted recap. AMD and its partners are actively rolling out BIOSes to address this; most address it by locking down the VSOC. The rollout has been complete chaos. ASUS, for example, is clearly incapable of getting the CPU names right. Gigabyte doesn't get out unscathed either; they have their own rollout issues. AMD is offering replacements for CPUs killed in this manner; that's even if EXPO is on. We asked AMD, and we confirmed that at least the US will have free shipping both directions if one of your CPUs dies. Motherboard vendors may not be so kind; it's to be determined. Next, the question of: is this resolved? We believe the SOC clampdown is inadequate as a blanket fix. It's a quick fix, and it resolves a lot of the concerns of longevity and slowly damaging the processor over time. Remember, that's sort of what you get with typing in a higher VSOC: you simulate that aging much faster. But in a real use case, running at 1.4 for months or a year on end or whatever, you're going to lose some kind of ability over time, and maybe it results in a failure, maybe it doesn't. It depends on the chip quality. But it doesn't really matter; what matters is you stop running at that voltage, because there's no reason to. Maybe there's some memory stability problems,
but you could probably bring it down to a more reasonable 1.3 or something. The ASUS OCP problem needs to be resolved separately. The Gigabyte problem, where it just refuses to listen to user input and won't zero out and reset the SOC voltage to auto on some BIOSes, that's another problem too. And right now, as an end user wondering if this affects you, here's the answer. If you have X3D especially, but really any of them, you should update BIOS. You should check the SOC as a baseline in HWiNFO; there's a free version. This number isn't perfect, but it's a good start, and some boards have embedded controllers that give you even more accurate information that's more useful. If VSOC is greater than 1.3 when you're looking at this, lower it manually. Check under load and idle. Remember that setpoint VSOC can often be different from actualized, but typically not by as much as that 50 millivolt deviation we saw earlier. Also, X3D is more fragile. So this has been a lot of fun. It's been really exciting to work on this. It's a different process from the standard reviews; there's just sort of a lot of mental puzzling with it, trying to figure it out. Video's long enough, though, and at this point, just the reminder that there are always limitations to testing. We've worked very hard to close all those loops that we can. There are still, of course, limitations, and that's why we're going to have follow-ups, but even then, if some new information pops up, please make sure we see it. You can tweet it at us or something. We feel pretty good about the results that we've found, but some of our attempts at explanation, of course, rely on assumptions, so that's the one thing we want to make sure everyone keeps in mind, just for full transparency. We tested ASUS the most heavily since it was the clearest offender, but that doesn't mean other motherboards can't also have been problematic. And ultimately, updating the BIOS and quickly checking the SOC should be enough to just put it out of your mind once you
do that, because any other issues are probably out of your control anyway and just pending more updates, and there's no point being anxious about something that you have absolutely no control over anyway. So we're doing more testing; we'll keep you all updated. Subscribe for more. Go to store.gamersnexus.net to grab this shirt to help us out directly and support our efforts. This is for 15 years in business now; we've been doing it a while, and we're looking forward to another 15. Thanks for watching. We'll see you all next time.
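
The end-user check from the recap (read VSOC at idle and under load, and lower it manually if it is above 1.3 V) can be sketched in a few lines of Python. This is only an illustration, not a tool used in the video: the function names `check_vsoc` and `deviation_mv` are hypothetical, the readings are the figures quoted in the transcript, and in practice the numbers would come from HWiNFO or from leads on the board.

```python
# Ceiling quoted in the video: AMD's statement says SOC should not exceed 1.3 V.
VSOC_LIMIT_V = 1.30


def check_vsoc(readings_v, limit_v=VSOC_LIMIT_V):
    """Return (peak, verdict) for VSOC readings taken at idle and under load.

    Hypothetical helper: the verdict follows the advice in the recap,
    "if VSOC is greater than 1.3, lower it manually".
    """
    peak = max(readings_v)
    verdict = "lower manually" if peak > limit_v else "ok"
    return peak, verdict


def deviation_mv(set_v, measured_v):
    """Gap between the BIOS setpoint and the measured voltage, in millivolts."""
    return round((measured_v - set_v) * 1000)


# Readings quoted in the transcript (volts).
boards = {
    "ASUS Crosshair Hero, EXPO, probed": [1.39, 1.40, 1.41],
    "ASRock Live Mixer B650, EXPO": [1.20, 1.246],
    "ASRock Phantom Gaming, EXPO": [1.22, 1.24],
    "Gigabyte": [1.20, 1.22],
    "MSI": [1.19, 1.20],
}

for name, readings in boards.items():
    peak, verdict = check_vsoc(readings)
    print(f"{name}: peak {peak:.3f} V -> {verdict}")

# ASUS: 1.35 V set in BIOS versus 1.40 V measured at the board.
print(f"ASUS setpoint deviation: {deviation_mv(1.35, 1.40)} mV")
```

With these figures, only the ASUS readings cross the 1.3 V line, and the setpoint-to-measured gap works out to the 50 millivolt deviation mentioned above.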
Info
Channel: Gamers Nexus
Views: 1,162,011
Keywords: gamersnexus, gamers nexus, computer hardware
Id: kiTngvvD5dI
Length: 38min 46sec (2326 seconds)
Published: Sat Apr 29 2023