My new RAID 10 arrays are unstable and give me problems

mh321

Member
Jul 16, 2013
33
0
0
This is my first time setting up RAID 10 on my 6-month-old home computer. It has an Adaptec 5805 controller. I used to have 2 RAID 0 arrays on it, which were stable. I decided to upgrade to 4x Kingston V300 SSDs in RAID 10, and 4x Seagate ST1000DM003 in RAID 10.

I have been having problems with both arrays:
-I get frequent alarms on my RAID controller
-Both arrays will randomly get degraded/rebuild on system startup after a proper shutdown
-Chkdsk runs on random startups
-Sometimes Windows won't start up, and I only get a black screen until I repair it
-I get messages about delayed writes failing
-I have gotten "The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume"
-I have gotten "The system failed to flush data to the transaction log. Corruption may occur"


Turning both the controller write cache and the disks' write caches on seems to cause the most instability. Running the computer with either write cache on still causes instability, but maybe a little less often. Running with both write caches off seems to be the most stable, but I still get problems. I would like to keep write caching on for better performance. All problems started when I installed and set up the new arrays a week ago, and they continue to this moment. I haven't found a way to reproduce the problems at will; they are random, but they probably get worse when I write to the disks. My arrays randomly get degraded no matter what the write cache settings are.

System specs:
Asus M5A99FX Pro R2.0 motherboard
Adaptec 5805 with 512MB cache, running latest firmware and driver
4x New Kingston SSDnow V300 240GB in RAID 10
4x Manufacturer Refurbished Seagate Barracuda ST1000DM003 1TB in RAID 10
2x Norco SS-500 hot-swap backplanes/cages
Dual boot Windows XP Pro x64 and Server 2008 x64
Adaptec 29320A SCSI card connected to LTO-3 tape drive


Does anybody know the cause or how to fix this? Thanks
 

VirtualLarry

No Lifer
Aug 25, 2001
56,452
10,120
126
I have been having problems with both arrays:
-I get frequent alarms on my RAID controller
-Both arrays will randomly get degraded/rebuild on system startup after a proper shutdown

System specs:
Asus M5A99FX Pro R2.0 motherboard
Adaptec 5805 with 512MB cache, running latest firmware and driver
4x New Kingston SSDnow V300 240GB in RAID 10
4x Manufacturer Refurbished Seagate Barracuda ST1000DM003 1TB in RAID 10

Using refurb drives? In a RAID array?
 

mh321

Member
Jul 16, 2013
33
0
0
Yes, the computer is only for home use, and it was cheaper. I only intend to store media and other random non-critical things on it. If one disk fails, then the array is supposed to keep going. I have a tape drive for backups. Back when I had RAID 0, all my drives were pre-owned, and they never gave me problems, or I got lucky. Why not with RAID 10?


I currently have the refurb Seagate Barracudas disconnected from the computer. Only the brand new SSDs are connected right now, and I still have problems even with write caches disabled.

Could I have a bad raid controller?

Could my backplanes be causing problems?

Any ideas?

Thanks
 

mh321

Member
Jul 16, 2013
33
0
0
What is the best way to test the controller and backplanes? I don't have a spare controller, and both backplanes are the same model.
 

mh321

Member
Jul 16, 2013
33
0
0
So I took one of the backplanes out of the equation. I connected all 4 new SSDs directly to the RAID controller, and I still got a degraded array after a few startup/shutdown cycles. Controller write caching was enabled, and the disks' write cache was disabled in the controller BIOS. The Seagate hard drives are currently disconnected to rule them out.
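
One way to keep isolation tests like this comparable is to tally boot cycles per configuration. Below is a minimal Python sketch, assuming the outcomes are recorded by hand. The `Config` fields, the `degrade_rate` helper, and the sample entries are all hypothetical placeholders, not measurements from this system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One hardware/cache configuration under test (illustrative fields)."""
    controller_cache: bool
    disk_cache: bool
    backplane: bool


def degrade_rate(trials):
    """Return {config: fraction of boot cycles that came up degraded}."""
    boots = defaultdict(int)
    degraded = defaultdict(int)
    for config, was_degraded in trials:
        boots[config] += 1
        if was_degraded:
            degraded[config] += 1
    return {config: degraded[config] / boots[config] for config in boots}


if __name__ == "__main__":
    # Placeholder entries only; replace them with your own observations.
    direct = Config(controller_cache=True, disk_cache=False, backplane=False)
    trials = [(direct, False), (direct, False), (direct, True)]
    for config, rate in degrade_rate(trials).items():
        print(config, f"{rate:.0%}")
```

With enough cycles per configuration, a clear difference in rates points at the component that changed; similar rates everywhere point at something common to all of them, such as the controller or the slot.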

What could be the problem?

Thanks
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
Hmm, what slot have you got the card in? You've got what, two I/O cards and a GPU? I'm not familiar with these boards. Which slots are off the southbridge?
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
There should be logs showing WHY the array is degraded.


Logs aren't really helpful in diagnosing the issue that "caused" the dropped drive. It's usually more for event tracking for ex. oh hey drive dropped then it started up the rebuild, not there was x error which caused...
 

mh321

Member
Jul 16, 2013
33
0
0
The motherboard has an AMD 990FX/SB950 chipset.
The GPU is in PCIEx16 slot 1.
The 5805 is in PCIEx16 slot 4. All hard drives are connected to this card.
The 29320A is in PCI slot 5.


The logs are very long, so I can't paste everything, but they show "Logical device is degraded", followed by a scrub/rebuild, and then completion. In the deaddrives log, I have failure reason codes 1, 2, or 8 (I don't know if that includes drives I intentionally disconnect/remove).
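
For logs too long to paste, a short filter can pull out just the relevant lines. This is a rough Python sketch, assuming the controller log has been exported to a plain text file. The only phrase it searches for by default is the "Logical device is degraded" message quoted above; the `find_events` name and the file name in the usage comment are hypothetical.

```python
import sys


def find_events(lines, phrases=("Logical device is degraded",)):
    """Yield (line_number, text) for lines containing any phrase, ignoring case."""
    lowered = [phrase.lower() for phrase in phrases]
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if any(phrase in text.lower() for phrase in lowered):
            yield number, text


if __name__ == "__main__":
    # Usage: python scan_log.py exported_log.txt   (file name is hypothetical)
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for number, text in find_events(handle):
                print(f"{number}: {text}")
```

Extra phrases can be passed in `phrases` once you know the exact wording your log uses for rebuilds or drive failures.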
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
What are the x4 slots connected to, the southbridge? How much bandwidth is there? Are those slots shared with the other peripherals usb, sata, etc? I have a 5805z too btw, but I am on x79 with ample pcie lanes and no issues. I run this card and a 12 port card, total of 20 ports on tap.
 

BonzaiDuck

Lifer
Jun 30, 2004
15,785
1,500
126
What are the x4 slots connected to, the southbridge? How much bandwidth is there? Are those slots shared with the other peripherals usb, sata, etc? I have a 5805z too btw, but I am on x79 with ample pcie lanes and no issues. I run this card and a 12 port card, total of 20 ports on tap.

I think I agree with that.

Pull up the motherboard manual and look for a table of the bandwidth and lanes available to each PCI-E slot, and how they combine with other motherboard devices like internal USB 3.0, auxiliary e-SATA controllers, etc. If the table shows that you have a problem, test it and move on. Otherwise, disable in the BIOS menus everything on the motherboard you can do without for a few hours or days. Even avoid using certain e-SATA ports and so on.

Then test again.
 

mh321

Member
Jul 16, 2013
33
0
0
What are the x4 slots connected to, the southbridge? How much bandwidth is there? Are those slots shared with the other peripherals usb, sata, etc? I have a 5805z too btw, but I am on x79 with ample pcie lanes and no issues. I run this card and a 12 port card, total of 20 ports on tap.

Where do I find this information? I checked the Asus website, the BIOS, and the owner's manual, and I cannot find anything saying what is connected to what.

The manual has a PCIE operating modes table:
Single PCIe card: (slot1 x16 Recommended for single) (Slot2 x4) (Slot3 x16) (Slot4 x4)
Dual PCIe card: (slot1 x16) (Slot2 x4) (Slot3 x16) (Slot4 x4)

There is also an IRQ assignments table that tells me PCIe 1+2+3+HD audio have IRQ A (shared), and PCIe 4+PCI+ASM SATA+OnchipUSB_2 have IRQ E (shared).
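
On the "how much bandwidth is there" question, the theoretical ceiling for an x4 link can be estimated from the lane count alone. A back-of-the-envelope Python sketch follows; which PCIe generation the 5805 actually negotiates in slot 4 is not stated anywhere in this thread, so both 8b/10b-encoded generations are shown, and the function name is made up for illustration.

```python
# Per-lane signalling rate in GT/s for the 8b/10b-encoded PCIe generations.
GT_PER_S = {1: 2.5, 2: 5.0}


def pcie_mb_per_s(generation, lanes):
    """Theoretical one-direction bandwidth in MB/s, before protocol overhead."""
    # 8b/10b encoding carries 8 data bits in every 10 bits on the wire.
    data_bits_per_s = GT_PER_S[generation] * 1e9 * lanes * 8 / 10
    return data_bits_per_s / 8 / 1e6


if __name__ == "__main__":
    for generation in (1, 2):
        print(f"PCIe gen {generation} x4: {pcie_mb_per_s(generation, 4):.0f} MB/s")
```

These are upper bounds; real throughput is lower once packet overhead and any sharing with other devices are taken into account.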
 

XavierMace

Diamond Member
Apr 20, 2013
4,307
450
126
Logs aren't really helpful in diagnosing the issue that "caused" the dropped drive. It's usually more for event tracking for ex. oh hey drive dropped then it started up the rebuild, not there was x error which caused...

You don't think knowing which drive is dropping would be helpful?
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
You don't think knowing which drive is dropping would be helpful?


Did I write that it's not helpful to know which drive went down?

There should be logs showing WHY the array is degraded.

The problem is the log only tells you which drive is down, not why. Or you could say the array is down because of X drive, but you're still only chasing the result not the cause.

In the OP's case it's looking more and more like the card is bad or it's a software issue. I don't know what he's running but it looks trim from his comments.

The file system structure on the disk is corrupt and unusable.

This error can be anything from bad memory to db issues.
 

yinan

Golden Member
Jan 12, 2007
1,801
2
71
Plus, a lot of times, you need a REALLY good controller to handle SSD RAID. The drives are simply too fast for the controller and it doesn't know how to handle it.
 

mh321

Member
Jul 16, 2013
33
0
0
The graphics card is in slot #1
Slot #2 is covered up by the GPU
Slot #3 is unused
The RAID controller is in slot #4. It is the only x16 slot available once one GPU is installed.
I have the SCSI card in slot 5
I have a SIIG Texas Instruments firewire card in slot #6.

The entire system was built in September 2015.

The card is a 3Gb/s card, however the SSDs are advertised as compatible with 3Gb/s. The 5805 is designed to work with both SSDs and HDDs.
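
For reference, the per-port ceiling implied by a SATA line rate can be worked out directly. This is a small Python sketch of that arithmetic only, assuming the standard 8b/10b encoding on the SATA link; it says nothing about what the controller's processor or cache can sustain, and the function name is illustrative.

```python
def sata_mb_per_s(line_rate_gbit):
    """Usable MB/s on one SATA port: 8b/10b encoding leaves 8 data bits per 10 line bits."""
    data_bits_per_s = line_rate_gbit * 1e9 * 8 / 10
    return data_bits_per_s / 8 / 1e6


if __name__ == "__main__":
    for rate in (3.0, 6.0):
        per_port = sata_mb_per_s(rate)
        print(f"{rate:g} Gb/s port: {per_port:.0f} MB/s, four ports: {4 * per_port:.0f} MB/s")
```

Comparing the four-port total against the slot's link bandwidth shows which side is the tighter limit in theory.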

The card has the latest firmware and driver installed.

The only external things I have connected to the PC are a PS/2 keyboard and mouse, a USB floppy drive, monitor, ethernet, and a MOTU 828 MKII firewire audio interface that I only power on once in a while.


I am having a feeling that something is wrong with this card, and I am having serious thoughts about replacing it. I should be able to have write caching enabled without serious problems like this, and even with it disabled, I still have problems. Does anyone else think that I should get a different card (or what else I can try?)? Thanks.
 

thesmokingman

Platinum Member
May 6, 2010
2,307
231
106
You have the card in the only slot that you could use it in.

Hmm, when you created your array, what did you use for stripe size and cluster size?

As a test, have you tried using one SSD as a single drive? Just initializing the drive allows it to be used as a regular drive. I wonder if the drives will drop in a JBOD config as well?
 

mh321

Member
Jul 16, 2013
33
0
0
The stripe size was 256 (the default). I don't think there was a cluster size setting in the RAID card BIOS, but if there was, it would be at the default. I will try using the drives in different configurations.
 

heymrdj

Diamond Member
May 28, 2007
3,999
63
91
The fact you mention that turning on caching makes the issue more prevalent makes me think it's a card setup issue.

My initial thought is the card is just too old to handle RAID 10 with SSDs. It's a 3Gb/s per port card. RAID calculations for a card are complex and vary by vendor, but essentially with RAID 0 the card performs fewer checks than with RAID 10. Once you enable RAID 10, the card really cares about how in sync the drives are. Depending on how you're wired, your card's ports may simply be too saturated for the second set of drives to synchronize in time, and this flags an alarm that your drives are out of sync and a rebuild has to be done. RAID 10 puts a lot of write stress on the card cache. Disabling the cache (which is also slow; that card uses DDR2, and I don't know of any SSD card that's not running DDR3) disables the high burst capability of the card and makes the OS wait to write or request more data. This allows the card to keep up longer until it finally still trips up.

The fact is the card was compatible with old Intel X-25E 60GB SSDs and even then had issues with throughput in RAID 10 arrays back in 2009. As of 2013 the Kingston V300 series was nearly twice as fast as the Intel X-25E in the Anandtech bench. The card is just not up to that throughput with drives that run at such low latency. The only thing I can think of is whether Adaptec put out any updates for it. But from the website I don't see that the 5805 was ever made compatible with SSD storage (only SSD caches).
 

mh321

Member
Jul 16, 2013
33
0
0
What specs should I look for in a replacement card? What should I avoid in a replacement card? Any brands to avoid? Can I go for OEM Dell, HP, etc? Which cards have quicker startup times? I am looking to find a decent new/used card for under $200.
Thanks
 

mh321

Member
Jul 16, 2013
33
0
0
Can I go with a 6Gb/s card with DDR2 memory, or is it necessary to have a card with DDR3 memory when I use SSDs?
 