RAID-5 probably will amplify writes; why not just do JBOD or spanning?
One write will have to write to all drives (edit for posterity: not necessarily, it could be only 2 for a small in-place edit). A very small 4K-multiple, 4K-aligned stripe size should be mandatory. Other than that, it ought to be OK.
One write could cause all drives to write a new sector.
Rebuilds and verifications are a lot more common than you might expect.
Have you tracked down the causes? I can't say that I can vouch for that, even with home servers not protected by so much as a UPS. You might want to see if the drives are in a more extreme power-saving mode, causing false dropouts.
Remember, ALL of the drives are going to fail at the same write age, so you can take an arrow to all drives at once.
hhhd1, how much does the Intel 710 overprovision for MLC?
With any decent hardware controller, a write on a RAID5 array will only write to two of the drives in the array. If writes were performed on all drives for a single write operation, there's no way you'd see a performance increase for both reads and writes in RAID5.
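To see why only two members get written, here is a rough Python sketch of the usual small-write parity update (a generic illustration of the technique, not any particular controller's firmware): read the old data chunk and the old parity, XOR the old data out and the new data in, and write back just those two chunks.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

old_data   = bytes(4096)          # chunk being overwritten (previously all zeroes)
other_data = b"\x5a" * 4096       # chunk on the member that is NOT touched
old_parity = xor(old_data, other_data)

new_data   = b"\xff" * 4096
# Two reads (old data, old parity) and two writes (new data, new parity):
new_parity = xor(xor(old_parity, old_data), new_data)

assert new_parity == xor(new_data, other_data)   # parity still covers both chunks
```

However many drives are in the set, this read-modify-write shortcut keeps a small in-place write down to two member reads and two member writes.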
How much writing are you looking at doing? I honestly wouldn't worry about write endurance unless you're looking at several hundred gigabytes per day. StorageReview did a test on a Patriot Wildfire comparing the performance between a fresh drive and one that they had written 270TB of data to. There's little to no performance degradation with 20% life left.
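As a rough back-of-the-envelope check (the daily write rates below are made-up examples, and 270 TB is simply what StorageReview wrote in that test, not a rated endurance figure):

```python
# How long it takes to write 270 TB at a few hypothetical daily write rates.
ENDURANCE_TB = 270
for gb_per_day in (20, 100, 500):
    days = ENDURANCE_TB * 1000 / gb_per_day      # decimal TB -> GB
    print(f"{gb_per_day:>3} GB/day -> {days / 365:.1f} years to write {ENDURANCE_TB} TB")
```

Even at 100 GB every single day, it takes over seven years to put that much data on the drive.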
How do you figure? Two data disks and one parity disk = three writes.
The total data set is {A,B,P}, where A is drive n%3's portion, B is drive (n+1)%3's portion, and P is drive (n+2)%3's portion, the parity.
Modify A -> {A,B,P} must be read, and P must be recalculated, but only {A,P} must be written. B never changed, so A xor B = P is still true with the old value of B.
Modify B -> {A,B,P} must be read, and P must be recalculated, but only {B,P} must be written. A never changed, so A xor B = P is still true with the old value of A.
Modify A and B -> {A,B,P} must be read, P must be recalculated, and all of {A,B,P} must be written.
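To make those three cases concrete, here is a toy Python model of a single stripe on a 3-drive array, with a per-member write counter (my own sketch for illustration, not how mdadm or any controller actually implements it):

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

class Stripe:
    """One stripe of a 3-drive RAID 5: data chunks A and B, parity P = A xor B."""
    def __init__(self, chunk_size: int = 4096):
        self.chunks = {"A": bytes(chunk_size), "B": bytes(chunk_size)}
        self.parity = bytes(chunk_size)            # 0 xor 0 = 0 after initialization
        self.writes = {"A": 0, "B": 0, "P": 0}     # how many times each member is written

    def modify(self, updates: dict):
        # Read the data chunks (conceptually the whole stripe), recompute parity,
        # then write back only the chunks that changed, plus the parity chunk.
        for name, data in updates.items():
            self.chunks[name] = data
            self.writes[name] += 1
        self.parity = xor(self.chunks["A"], self.chunks["B"])
        self.writes["P"] += 1

stripe = Stripe()
stripe.modify({"A": b"\x01" * 4096})                        # modify A -> writes A and P
stripe.modify({"A": b"\x02" * 4096, "B": b"\x03" * 4096})   # modify A and B -> writes A, B, P
print(stripe.writes)                                        # {'A': 2, 'B': 1, 'P': 2}
```

The counters come out as {'A': 2, 'B': 1, 'P': 2}: the first write touched two members, the second touched all three.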
So you're making the assumption that a write is a modify, not new data...
Any write is a modify, to any fully initialized array. That it has been initialized is the only assumption I made, and I'd say it's a pretty good one to assume. You may also notice that I short-circuited my brain in my first post, then edited after Zxian's posting, having reasoned it out, which is also why I took the time to explicate on it.
If the array is initialized, there is no longer any such thing as new data. All writes are replacing existing data.
If mdadm keeps bitmaps of used stripes, then there may be new writes as the FS gets used, and so it would be possible to need to write new data to the array. However, that would be a special optimization (easy to extend for TRIM support, if that is ever accepted); I don't know if it is actually done, and it could only occur once per stripe during the array's lifetime.
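Purely as a sketch of that hypothetical optimization (again, not a claim about what mdadm actually does), here is the idea in Python for a 3-drive array, with a set standing in for the per-stripe bitmap:

```python
used = set()   # stand-in for a per-stripe "has ever been written" bitmap

def write_chunk(stripe_no: int, new_data: bytes) -> str:
    """Describe the I/O a data-chunk write would need under this scheme."""
    if stripe_no not in used:
        used.add(stripe_no)
        # First touch of the stripe: the other data chunk is assumed to be all
        # zeroes, so on this 3-drive layout the new parity equals the new data
        # and nothing has to be read first.
        return "write data + parity, no read"
    # Stripe already in use: the normal read-modify-write described earlier.
    return "read stripe, recompute parity, write data + parity"

print(write_chunk(7, b"\x01" * 4096))   # first write to stripe 7
print(write_chunk(7, b"\x02" * 4096))   # later write to the same stripe
```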
Hmmm. I guess I've never gained that deep of an understanding of how RAID operates. So if I understand correctly, because there's existing 1's and 0's on the disk, parity has already been calculated for those even if they're "empty" space as far as the file system is concerned? That would make sense.
Pretty much. The traditional method is to assume unused stripes are zeroes (though any values would work, as long as the parity matches), and make them so if they aren't (0 xor 0 = 0).
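A quick Python check of that: whatever bytes happen to be sitting in the data chunks, once the parity agrees with them the array is effectively initialized, and a rebuild will faithfully reconstruct that "empty" space just like real data (random bytes below stand in for whatever was already on the drives):

```python
import os

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))

A = os.urandom(4096)      # whatever was already on drive 1 ("empty" to the FS)
B = os.urandom(4096)      # whatever was already on drive 2
P = xor(A, B)             # initialization just makes the parity agree with it

rebuilt_B = xor(A, P)     # drive holding B dies: rebuild it from A and parity
assert rebuilt_B == B
print("rebuild recovered the 'empty' data exactly")
```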
Is it really that often that A or B is written but not both? If the changes you make can be made by modifying A and not B, maybe your stripe size is too large?
Yes. Hard drives seek slowly. Traditionally, stripe sizes have been very large relative to data, because RAID 5 must read the whole stripe to be able to verify the data, and to change it. 64KB, 128KB, and 256KB were/are very common, to reduce drive thrashing with random access, as a larger stripe size forces more nearby data into RAM. Most random access tends to occur nearby, in real workloads on real files [that aren't swap files]. With SSDs, small sizes make a lot more sense.
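To put rough numbers on that, here is a small Python sketch of how many members a single 16 KB write touches on a 3-drive RAID 5 at two different chunk sizes ("chunk" here is the per-drive piece, what the posts above call the stripe size; parity rotation is ignored and the sizes are only examples):

```python
def members_touched(offset: int, length: int, chunk: int, data_disks: int = 2):
    """Which data chunks/drives a write hits, and how many parity chunks get rewritten."""
    first = offset // chunk                       # first data chunk the write lands in
    last = (offset + length - 1) // chunk         # last data chunk the write lands in
    chunks = range(first, last + 1)
    disks = sorted({c % data_disks for c in chunks})   # data drives that get written
    stripes = {c // data_disks for c in chunks}         # one parity rewrite per stripe
    return len(chunks), disks, len(stripes)

for chunk_kb in (4, 256):
    n_chunks, disks, n_stripes = members_touched(offset=8 * 1024, length=16 * 1024,
                                                 chunk=chunk_kb * 1024)
    print(f"{chunk_kb:>3} KB chunks: {n_chunks} data chunk(s) on drive(s) {disks}, "
          f"parity rewritten in {n_stripes} stripe(s)")
```

With 256 KB chunks the 16 KB write lands on one data drive plus one parity update; with 4 KB chunks the same write spreads across both data drives and two parity updates.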