Author Topic: Debugging silent reboot HW Issues  (Read 1719 times)

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Debugging silent reboot HW Issues
« on: November 20, 2020, 04:14:57 PM »
I have a PC in following configuration:

- Power Supply Corsair 650W
- Intel i5-6600
- 16GB DDR3
- GPU: Radeon RX 560 (16 CU) or GTX 960 (for a while I had both installed)

The PC crashes once in a while. It happens much more frequently when I load a GPU with computations, but it can also reboot without any warning.

It happens both with the NVidia GTX 960 (much more frequently; it has PCIe power connectors) and with the MSI Radeon RX 560, which gets all its power from the motherboard.

When I removed both cards and used the integrated Intel GPU for two weeks, I had no issues. When I put the GTX 960 back in, it crashed frequently. When I replaced it with the RX 560, it could run for many days, including through games, but it still crashes once in a while.

How can I debug the problem?

1. It isn't consistently reproducible, so I can't just buy another power supply for testing and return it
2. I don't think it is the GPU, since two different GPUs from different vendors have exactly the same issue
3. Memory checks OK.

What can it be and how can I figure out the faulty part?
Artik, 101 "Red" Squadron, Israel

Offline TyFoo

  • Copper Member
  • **
  • Posts: 214
Re: Debugging silent reboot HW Issues
« Reply #1 on: November 20, 2020, 10:39:34 PM »
How old is your PC?

Are all of the fans operating? GPU, Chipset Fans if any, Power Supply?

Can you see what temperature the Processor is/ has been running at?



Ty

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Re: Debugging silent reboot HW Issues
« Reply #2 on: November 20, 2020, 10:46:49 PM »
Quote from: TyFoo
How old is your PC?

Are all of the fans operating? GPU, Chipset Fans if any, Power Supply?

Can you see what temperature the Processor is/ has been running at?

The PC is 4.5 years old. All fans are operating, including the PSU fan; there is no chipset fan on the motherboard. There is no overheating: I watched this closely, and both CPU and GPU temperatures are low when it happens.
I checked the voltages in the BIOS and they are within specs.
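
Since the reboots leave no trace, one low-tech way to pin down exactly when they happen (and compare that with what the GPU was doing) is a heartbeat log. This is only a minimal sketch, assuming Python is available on the machine; the file name and the function names are made up for illustration, and it does not read temperatures or voltages.

```python
import os
import time

def write_heartbeat(path, now=None):
    """Append one timestamp line and force it onto the disk."""
    now = time.time() if now is None else now
    with open(path, "a") as f:
        f.write(f"{now:.3f}\n")
        f.flush()
        os.fsync(f.fileno())  # so the line survives a sudden reset

def read_heartbeats(path):
    """Load all timestamps written so far."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def find_gaps(timestamps, interval_s=5, tolerance=3):
    """Return (last_seen, next_seen) pairs where the heartbeat stopped."""
    limit = interval_s * tolerance
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > limit]

def run(path, interval_s=5, beats=None):
    """Write heartbeats forever, or only `beats` times if given."""
    n = 0
    while beats is None or n < beats:
        write_heartbeat(path)
        n += 1
        time.sleep(interval_s)
```

Leave `run("heartbeat.log")` going in the background; after the next silent reboot, `find_gaps(read_heartbeats("heartbeat.log"))` shows the last moment the machine was alive.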
Artik, 101 "Red" Squadron, Israel

Offline TyFoo

  • Copper Member
  • **
  • Posts: 214
Re: Debugging silent reboot HW Issues
« Reply #3 on: November 21, 2020, 12:15:08 AM »
You more or less eliminated my troubleshooting list. . . . lol

If you can run onboard Graphics without issue, I would think that you have an issue with either the PCI socket, where the socket attaches to the MB or somewhere downstream on the MB itself.

I don't think I am saying anything you don't already know, but if you are creating demand on the GPU when it crashes, it sounds like something is heating up and disrupting flow. If it isn't the PCI socket, then it has to be the MB. The outlier would be two bad GPUs; while possible, it's unlikely.

Offline Bizman

  • Plutonium Member
  • *******
  • Posts: 9508
Re: Debugging silent reboot HW Issues
« Reply #4 on: November 21, 2020, 01:03:21 AM »
I'd say you've pretty much nailed it to the Power Supply as it happens most often with the most power hungry GPU. The PSU may not be faulty, it may just be underpowered for the task. Without knowing the brand and model it's hard to tell whether it's a known poor firecracker or a quality unit.
Quote from: BaldEagl, applies to myself, too
I've got an older system by today's standards that still runs the game well by my standards.

My homepage

Offline zack1234

  • Plutonium Member
  • *******
  • Posts: 13182
Re: Debugging silent reboot HW Issues
« Reply #5 on: November 21, 2020, 03:31:47 AM »
My Corsair PSU blew; they sent me a new one
Bizman is awesome as well
There are no pies stored in this plane overnight

                          
The GFC
Pipz lived in the Wilderness near Ontario

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Re: Debugging silent reboot HW Issues
« Reply #6 on: November 22, 2020, 02:15:37 AM »
Quote from: Bizman
I'd say you've pretty much nailed it to the Power Supply as it happens most often with the most power hungry GPU. The PSU may not be faulty, it may just be underpowered for the task. Without knowing the brand and model it's hard to tell whether it's a known poor firecracker or a quality unit.

Actually, the GPU that is in there now, an MSI Radeon RX 560 (16 CU, 4GB), draws under 75W and takes all its power from the motherboard. The other GPU I tested, a Gigabyte GTX 960 4GB OC, crashes even more easily; it has 6+8 pin connectors (120W TDP). My PSU is 650W, and half a year ago it handled both cards together (960 + 560), since I had a dual-GPU setup for development.

So a single GPU isn't that hungry.
Artik, 101 "Red" Squadron, Israel

Offline Bizman

  • Plutonium Member
  • *******
  • Posts: 9508
Re: Debugging silent reboot HW Issues
« Reply #7 on: November 22, 2020, 03:11:24 AM »
That's exactly why I believe the PSU is the culprit. Correct me if I'm wrong in the following:
  • The crashes happen more easily with the externally powered GTX 960 - 12 V, TDP 120W + 65W for CPU
  • Fewer crashes happen with the motherboard powered Radeon - 12 V, TDP <75W + 65W for CPU
  • NO crashes happened with the CPU integrated Intel graphics - 0.55 V-1.52 V, TDP 65W

To me it seems that the 12V line has fried. No matter how new your PSU is, it can be a lemon. Corsairs have been made at least by Channel Well, Chicony, Flextronics and Seasonic, and the quality can vary even within the same series.
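
A back-of-the-envelope check of the numbers in the list above, as a rough sketch only. It assumes TDP is a fair stand-in for sustained draw and ignores drives, fans, the motherboard and short transient spikes; the names `budget` and `CONFIGS` are made up for illustration.

```python
# Rough power budget for the configurations discussed in this thread.
# TDP figures are the ones quoted by the posters, not measurements.
PSU_RATED_W = 650
CPU_TDP_W = 65  # i5-6600

CONFIGS = {
    "GTX 960 (6+8 pin)": 120,
    "RX 560 (slot powered)": 75,
    "Intel iGPU only": 0,          # iGPU draw is already inside the CPU TDP
    "GTX 960 + RX 560": 120 + 75,  # the old dual-GPU setup
}

def budget(gpu_tdp_w, cpu_tdp_w=CPU_TDP_W, psu_w=PSU_RATED_W):
    """Return (estimated load in W, fraction of the PSU rating)."""
    load = gpu_tdp_w + cpu_tdp_w
    return load, load / psu_w

for name, gpu_w in CONFIGS.items():
    load, frac = budget(gpu_w)
    print(f"{name:24s} ~{load:3d} W  ({frac:.0%} of {PSU_RATED_W} W)")
```

Even the dual-GPU case stays well under half of the rated 650W by this estimate, so raw wattage alone would not explain the reboots; that fits a degraded 12V rail better than a PSU that is simply too small.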

A few years ago a friend had a 1½-year-old computer with similar issues to yours - can't remember if the PSU was a Corsair or maybe a Chieftec, but it started with a C. Anyway, as I studied the reviews to find out if the PSU was a known poor choice, I learned that in that series the lowest (450) and highest (850) wattage versions were built by a higher tier maker than the two middle versions (550 and 650). The reviewers wondered why the mid powered versions had cheaper (both in price and quality) capacitors than the other two. When we tried to get a new one through warranty, that very model was no longer available; it had been replaced by a model with a letter added to the name...

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Re: Debugging silent reboot HW Issues
« Reply #8 on: November 22, 2020, 05:13:44 AM »
Quote from: Bizman
That's exactly why I believe the PSU is the culprit. Correct me if I'm wrong in the following:
  • The crashes happen more easily with the externally powered GTX 960 - 12 V, TDP 120W + 65W for CPU
  • Fewer crashes happen with the motherboard powered Radeon - 12 V, TDP <75W + 65W for CPU
  • NO crashes happened with the CPU integrated Intel graphics - 0.55 V-1.52 V, TDP 65W

To me it seems that the 12V line has fried. No matter how new your PSU is, it can be a lemon. Corsairs have been made at least by Channel Well, Chicony, Flextronics and Seasonic, and the quality can vary even within the same series.

A few years ago a friend had a 1½-year-old computer with similar issues to yours - can't remember if the PSU was a Corsair or maybe a Chieftec, but it started with a C. Anyway, as I studied the reviews to find out if the PSU was a known poor choice, I learned that in that series the lowest (450) and highest (850) wattage versions were built by a higher tier maker than the two middle versions (550 and 650). The reviewers wondered why the mid powered versions had cheaper (both in price and quality) capacitors than the other two. When we tried to get a new one through warranty, that very model was no longer available; it had been replaced by a model with a letter added to the name...

when you say it this way... it is very-very logical... :-)


Artik, 101 "Red" Squadron, Israel

Offline Denniss

  • Nickel Member
  • ***
  • Posts: 607
Re: Debugging silent reboot HW Issues
« Reply #9 on: November 22, 2020, 10:20:48 AM »
What's the exact model number of your power supply and how old is it?
If it's a modular one, check the connectors (cable and PSU side) for anything that looks abnormal, like brown or melted spots

Offline Shuffler

  • Radioactive Member
  • *******
  • Posts: 26763
Re: Debugging silent reboot HW Issues
« Reply #10 on: November 22, 2020, 11:48:03 PM »
It does point to a bad rail in the psu... if the cables are good.
80th FS "Headhunters"

S.A.P.P.- Secret Association Of P-38 Pilots (Lightning In A Bottle)

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Artik, 101 "Red" Squadron, Israel

Offline Bizman

  • Plutonium Member
  • *******
  • Posts: 9508
Re: Debugging silent reboot HW Issues
« Reply #12 on: November 23, 2020, 04:05:12 AM »
There aren't many reviews available other than those on the online shops. I found a couple in Spanish and some threads about it. None of those said that it's a firecracker, but as Corsair say on their website, it's "ideal for powering your new home or office PC".

I found out a few things: It's made by HEC, who are a decent manufacturer building what the branded customer wants. The capacitors are made by Teapo, which rhymes with cheapo for a reason. There were some comments about the rails in the Spanish reviews, but Google Translate wasn't too exact. All in all, it's a cheap product seemingly intended for light use. The Bronze certification is a sign of that as well.

 

Online artik

  • Silver Member
  • ****
  • Posts: 1895
      • Blog
Re: Debugging silent reboot HW Issues
« Reply #13 on: November 23, 2020, 04:11:40 AM »
What are the recommended brands and price range? I thought Corsair was supposed to be a good brand
Artik, 101 "Red" Squadron, Israel

Offline Bizman

  • Plutonium Member
  • *******
  • Posts: 9508
Re: Debugging silent reboot HW Issues
« Reply #14 on: November 23, 2020, 04:30:46 AM »
Seasonic is a safe bet. They both design and build their products and give them a 10-year warranty. As I said in a previous post, they've also built PSUs for Corsair. Unfortunately the Who's Who list hasn't been updated since 2013, so reviews and PSU/tech forums are the only sources of reliable information.

As a rule of thumb, regardless of brand, Gold, Platinum and Titanium rated models should be of better quality, as the higher efficiency rating requires a more carefully thought-out design. Corsair is a good brand, but they have several series, some of which are intended for low-budget markets: https://www.tomshardware.com/reviews/best-psus,4229.html
« Last Edit: November 23, 2020, 04:40:36 AM by Bizman »