A lot of topics have been discussed here regarding performance so apologies if I stray a little from what you require.
1. You want guidelines on performance from CV.
This is kind of hard to do when CV does not control the hardware side of the backup process. From a development perspective, the product needs to be robust enough to work across a whole multitude of hardware configurations while still performing at a decent speed.
To achieve this, we rely heavily on the OS abstraction layer. To perform the reads, we issue read requests to the OS, which then carries out the requested work. You can imagine a lot of performance variation occurring here when customers have completely different hardware. That's why baseline performance figures are rarely given out, as they may set the wrong expectation about something we do not control.
2. Random vs sequential reads/writes
Deduped CV chunk data writes references into the chunk metadata files recording where each deduped chunk resides. As a result, a request to read a particular chunk may actually traverse a number of directories before the read can be performed. In most cases, raw throughput is not the correct figure for judging how well or poorly a RAID array will perform. I would recommend running IOPS tests instead to get a more accurate picture of how the array will behave in real-world use.
E.g. during a DASH copy, a source disk array of 12 drives in RAID6 may only see actual throughput of 2-3MB/s per disk while IOPS has already reached 80-100 per disk. For a given disk, IOPS and throughput usually have an inverse relationship: many small random operations saturate the disk's operation limit long before its raw throughput limit. This is also the reason most storage vendors quote IOPS rather than MB/s of throughput, since throughput alone is not a true indication of real-world usage.
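To make the DASH copy numbers above concrete, here is a quick back-of-the-envelope sketch (the function name and figures are mine, picked from the example above, not anything CV ships):

```python
def avg_io_size_kb(throughput_mb_s, iops):
    """Average I/O size implied by a measured throughput/IOPS pair, in KB."""
    return throughput_mb_s * 1024 / iops

# ~2.5 MB/s per disk at ~90 IOPS per disk works out to roughly 28 KB
# per operation, i.e. the array is IOPS-bound long before it comes
# anywhere near its raw sequential throughput limit.
print(round(avg_io_size_kb(2.5, 90)))  # -> 28
```

This is exactly why a benchmark that only measures large sequential reads will wildly overestimate what the array can do under deduped workloads.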
Ok, so I understand there is a lot of confusion out there on streams. Phillipe is correct in saying that a backup written with 1 stream will only be able to use 1 stream during aux copy. I know this causes issues with large backup sets, but unfortunately the only way around it that I see is to use more readers for the backup job in the first place, and potentially turn on the "Allow multiple readers per mount point" option.
E.g. your 8TB backup job using 1 reader will be assigned 1 stream ID as the job is written down. When the Aux Copy runs, because the job has 1 stream, the entire job will be allocated to a single Aux Copy stream. If your backup job had used 4 readers instead, it would have 4 stream IDs assigned. Depending on the size of each stream, the Aux Copy job can then break the job apart and allocate an appropriate number of streams to copy it, up to 4.
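The rule in the example above boils down to a simple cap, which can be sketched like this (a simplification; the function name is mine, not a CV API):

```python
def aux_copy_streams(backup_stream_ids, streams_available):
    """An aux copy can parallelise a job only up to the number of
    stream IDs the original backup was written with (a simplification)."""
    return min(backup_stream_ids, streams_available)

# 8TB job written with 1 reader: one aux copy stream, no matter how
# many streams the copy itself could run.
print(aux_copy_streams(1, 8))  # -> 1
# Same job written with 4 readers: the copy can use up to 4 streams.
print(aux_copy_streams(4, 8))  # -> 4
```

The takeaway: parallelism for downstream copies is decided at backup time, so size your reader count with the aux copy in mind.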
This same stream number lock applies to Media Refresh. If two tapes were written with different stream numbers, the refresh cannot combine them onto 1 tape under a single stream number; the numbering has to stay consistent. A workaround is to create an additional secondary copy, select the jobs from both tapes to copy, and set this copy to combine to 1 stream. The destination tape will then have a stream number of 1.
If you performed the Aux Copy with "combine to streams" in the first place, you limit the range of stream numbers used on tape (e.g. combining to 2 streams means only tapes with stream numbers 1 and 2 will ever exist, which greatly alleviates the problems during media refresh).
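The combine-to-streams effect can be sketched as a mapping from arbitrary source stream IDs onto a capped set of destination stream numbers (my own illustration of the idea, not CV's actual allocation logic):

```python
def combine_to_streams(source_stream_ids, max_streams):
    """Map arbitrary source stream IDs onto at most max_streams
    destination stream numbers, round-robin (a simplification)."""
    return {sid: (i % max_streams) + 1
            for i, sid in enumerate(sorted(source_stream_ids))}

# Jobs originally written with stream IDs 1..6, combined to 2 streams:
# only destination stream numbers 1 and 2 ever appear on tape.
mapping = combine_to_streams({1, 2, 3, 4, 5, 6}, 2)
print(sorted(set(mapping.values())))  # -> [1, 2]
```

With the stream numbers bounded like this, a later media refresh never has to reconcile a sprawl of different stream numbers across tapes.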
I hope this clears up a few things for everyone. Let me know if you have further questions. It's almost midnight where I am so apologies if I don't respond quickly.