Yes. Latency is the minimum time (milliseconds or microseconds) it takes to do something
I would put it differently. I would say: latency is the
actual time, not the minimum time. When people talk about latency, they usually mean latency for random I/O like 4K reads, but latency is more universal and also applies to sequential I/O with larger transfer sizes; in that case the latency is simply a bigger number, not the lowest possible one. For example, you can easily calculate the latency for sequential reads with a 1MiB request size:
Assuming the SSD can do 500MB/s and the host sends 1MiB requests, the latency is roughly 1/500th of a second, or about 2.0ms.
For random reads, the latency is usually much lower on SSDs. Assuming 250MB/s of multi-queue random read performance with 4KiB requests, the latency is now 0.016ms.
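To make the arithmetic explicit, here is a minimal sketch that reproduces both numbers above. The throughput and request-size figures are just the hypothetical values from this post, not measurements:

```python
# Per-request latency implied by a given throughput and transfer size.
# The inputs are the example numbers from the text, not real measurements.

def latency_ms(throughput_mb_s: float, request_bytes: int) -> float:
    """Time to transfer one request of request_bytes at the given MB/s, in milliseconds."""
    bytes_per_second = throughput_mb_s * 1_000_000  # decimal MB/s
    return request_bytes / bytes_per_second * 1000

print(latency_ms(500, 1024 * 1024))  # sequential, 1MiB request -> ~2 ms
print(latency_ms(250, 4096))         # random, 4KiB request     -> ~0.016 ms
```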
So for a single disk, they correlate pretty closely.
Even a single SSD is already a 15 or 16-way interleave (effectively a RAID0) of NAND, so it has performance characteristics comparable to a RAID0 of 16 hard drives, for example. That is: you need a queue depth greater than 1 to saturate a single SSD with random reads.
When you RAID a bunch of disks together, typically, there is an increase in potential IOPS, but latency doesn't budge.
True. This is why a single NAND SSD using AHCI always has about 20MB/s of blocking (queue depth 1) random read performance -- RAID0 cannot improve this.
This is because if you request a particular block of data, it is still sitting on a single drive somewhere (so minimum "do something" time doesn't change) but all the other drives can, theoretically, be servicing other requests. So as long as all your read/write requests don't hammer a particular drive unfairly, you get a pretty good (close to linear, but never perfectly so) increase in IOPS.
This also explains why increasing the queue depth does not increase IOps linearly: the I/O requests are not evenly distributed across the channels. That is why you need 16 or 32 queued I/Os to saturate a 10-channel controller.
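A toy model of that effect: if outstanding requests land on channels uniformly at random, then at a queue depth equal to the channel count many channels still sit idle, and you need roughly 2-3x the channel count to keep nearly all of them busy. This is only a back-of-the-envelope simulation, not a real SSD model:

```python
# Rough illustration of why queue depth must exceed the channel count:
# with QD outstanding random requests over N channels, some channels get
# several requests while others sit idle, so utilisation < 100% at QD == N.
import random

def busy_fraction(channels: int, queue_depth: int, trials: int = 20_000) -> float:
    """Average fraction of channels that have at least one request queued."""
    busy = 0
    for _ in range(trials):
        hit = {random.randrange(channels) for _ in range(queue_depth)}
        busy += len(hit)
    return busy / (trials * channels)

for qd in (1, 4, 10, 16, 32):
    print(f"QD {qd:2d}: ~{busy_fraction(10, qd):.0%} of 10 channels busy")
```

At QD 10 only about 65% of a 10-channel controller is busy on average; at QD 32 it is around 97%, which matches the 16-32 figure above.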
The problem for home users putting SSDs in RAID-0 is that many (cheap) RAID controllers actually create a latency bottleneck. So for many users (particularly with older on-board motherboard RAID) it's a tradeoff: you can get a single SSD that provides 40k peak IOPS and ~500MB/sec sequential throughput. However, when you stick a pair of those same SSDs into a RAID-0, you get the expected ~900MB/sec sequential throughput but maybe only ~30k peak IOPS. Or less.
I disagree. You seem to blame the controller, but note that onboard RAID is FakeRAID, and many cheap add-on controllers are FakeRAID as well. FakeRAID means the controller is just a SATA controller acting as an HBA; the (Windows-only) drivers do the actual RAID part.
The real reasons RAID0 might not provide increased IOps or increased throughput:
1. With a PCI or PCI-express add-on FakeRAID controller, the controller is limited by PCI(e) bandwidth. Plain PCI is already a huge bottleneck, and PCI-express x1 runs at 250MB/s or 500MB/s, from which you can subtract roughly 15% overhead/inefficiency depending on the PCIe payload size.
This has nothing to do with the RAID part: the controller itself simply cannot move enough data to memory because of interface bottlenecks.
2. Using an add-on FakeRAID card means you have to use the vendor's drivers to provide the RAID functionality: ASMedia, Promise, Silicon Image, Marvell, JMicron, etc. These drivers are not at all efficient or properly engineered. Some always transfer the entire stripe block even when only a fraction of it has been requested.
3. People configure their RAID0 the wrong way, using a stripe size that is too small. People have somehow been taught that you need a large stripe size for good MB/s scores and a small stripe size for good IOps scores. It is actually the other way around: smaller stripe sizes are good for throughput, larger stripe sizes are good for IOps. Stripe sizes of 1MiB and above are quite common when optimizing for IOps, as can be done with Linux/BSD software RAID.
4. People use old-fashioned operating systems like Windows XP that start the partition at a 63-sector offset, i.e. 31.5KiB. This causes a misalignment issue which is not so bad for throughput (MB/s) but kills the IOps performance potential (see the sketch after this list).
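A small sketch of point 4. With the classic 63-sector offset, every 4KiB logical block ends up straddling two physical 4KiB units, so each small random read costs two device accesses instead of one; that is why misalignment hurts IOps far more than sequential MB/s. The 1MiB "aligned" offset is just an example of a modern default:

```python
def crosses_boundary(offset: int, request_size: int, boundary: int) -> bool:
    """Does a request starting at `offset` straddle a `boundary`-sized unit?"""
    return offset // boundary != (offset + request_size - 1) // boundary

# A logical 4KiB block at logical address L sits at physical address
# partition_offset + L. With an aligned 1MiB offset it maps onto exactly
# one physical 4KiB unit; with the 63-sector XP offset it always straddles two.
for name, part_offset in (("1MiB aligned", 1024 * 1024), ("63-sector XP", 63 * 512)):
    hits = sum(crosses_boundary(part_offset + i * 4096, 4096, 4096)
               for i in range(1000))
    print(f"{name}: {hits}/1000 4KiB requests straddle two physical 4KiB units")
```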
Today, if one uses Windows 7 or later with Intel onboard RAID, most of the above points no longer apply and the user will get roughly doubled IOps. But because Windows has a single-threaded storage backend, this can be bottlenecked by single-core CPU performance; with two SSDs and a decent CPU that should not happen easily, though.
It didn't matter for HDDs, since most of them were hard pressed to push more than a couple hundred IOPS each - even a cheap RAID controller would never bottleneck a handful of those.
That is not true. The RAID0 reviews in which Anandtech and StorageReview concluded that RAID0 had no place on the desktop were based on bad testing and a bad setup: a FakeRAID PCI card plus misalignment issues. This hurt RAID0 HDD performance to the point where there was only a small benefit in throughput, while all of the IOps benefit was penalized. In those tests the latency actually became worse because of all the bottlenecks. That is why RAID0 has such a bad name on the desktop.
The funny thing is that RAID0 is exactly what gives SSDs their speed today: without that internal interleaving, all the numbers in benchmarks would be very low.
Adding another few hundred bucks for a "good" RAID controller adds additional complexity and cost to a system that rarely benefits from the increased potential performance.
Actually, the true hardware RAID controllers do exactly what you say: increase latency. This is because they have their own Intel IOP (ARM-based) processor with its own memory, so basically every request has to be processed twice. People were taught that hardware RAID was better because those controllers were good at the XOR calculations for RAID5 and RAID6. That is not true: your own CPU can do XOR at multiple GB/s fairly easily; XOR is one of the cheapest operations for a CPU and is mostly bound by memory bandwidth.
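If you want to convince yourself, a quick ballpark test with numpy (buffer size and loop count are arbitrary, and the result depends entirely on your machine) shows that even an interpreted-language wrapper around the CPU's XOR reaches multiple GB/s:

```python
# Rough XOR throughput check: RAID5 parity is just XOR over the data blocks,
# and a modern CPU does this at memory-bandwidth-class speeds.
import time
import numpy as np

a = np.random.randint(0, 256, size=64 * 1024 * 1024, dtype=np.uint8)  # 64 MiB
b = np.random.randint(0, 256, size=64 * 1024 * 1024, dtype=np.uint8)

loops = 20
start = time.perf_counter()
for _ in range(loops):
    parity = np.bitwise_xor(a, b)  # parity block of a two-disk stripe
elapsed = time.perf_counter() - start

gb = loops * a.nbytes / 1e9
print(f"XOR throughput: ~{gb / elapsed:.1f} GB/s")
```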
The real reason hardware RAID was better at RAID5 and RAID6 was its better firmware, which provided a true Split&Combine engine. This is required to cut host I/O into optimally sized pieces, namely the full stripe block. For a RAID5 of 4 disks and a 128KiB stripe size, that is 128KiB * (4 - 1) = 384KiB. Only writes of this size will be fast; other writes require a read-modify-write phase in which data must be read before the write can begin.
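The same arithmetic in a couple of lines, using the 4-disk / 128KiB example from above (and assuming the write starts on a stripe boundary):

```python
# Full-stripe write size for RAID5: (disks - 1) data members per stripe.
# Writes that are a whole multiple of this size, aligned to a stripe boundary,
# can be written without first reading old data/parity (no read-modify-write).

def full_stripe_bytes(disks: int, stripe_unit_kib: int) -> int:
    return (disks - 1) * stripe_unit_kib * 1024

def needs_read_modify_write(write_bytes: int, disks: int, stripe_unit_kib: int) -> bool:
    fsw = full_stripe_bytes(disks, stripe_unit_kib)
    return write_bytes % fsw != 0  # assumes the write is stripe-aligned

print(full_stripe_bytes(4, 128) // 1024, "KiB")      # 384 KiB
print(needs_read_modify_write(384 * 1024, 4, 128))   # False: full-stripe write
print(needs_read_modify_write(64 * 1024, 4, 128))    # True: read-modify-write
```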
Software RAID is superior to hardware RAID. But in reality, hardware RAID in the past had better-engineered firmware than the software RAID engines available at that time. Today, GEOM RAID is the best, followed by Linux MD-raid. Windows continues to provide very poor software RAID features, but Intel's onboard RAID actually has good features and decent performance. It also provides 'Volume Write-Back Caching', which adds a nice RAM buffer cache that accelerates writes regardless of RAID level.