Fighting Slow Aux Copies Between Datacenters

Last post 11-01-2019, 4:29 PM by oalexis. 11 replies.
  • Fighting Slow Aux Copies Between Datacenters
    Posted: 10-25-2019, 6:46 PM

    Folks,

    I've been dealing with this issue since August and haven't gotten anywhere. We Aux Copy from a library in one datacenter to a library in the other (DASH optimized). Support seems to think the problem is slow disk reads, which are about 1-2 MB/s. I've reached out to my hardware vendor and they are not seeing any disk latency issues. We are using direct-attached storage connected via fiber.

    We used to get 3-4 TB/hr of processing speed. It's now down to less than 1 TB/hr.
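    For scale, the hourly figures above can be converted to per-second rates. This is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes); the function names are illustrative only and not part of any Commvault tooling.

```python
def tb_per_hr_to_mb_per_s(tb_per_hr):
    """Convert TB/hr to MB/s, assuming decimal units (1 TB = 1,000,000 MB)."""
    return tb_per_hr * 1_000_000 / 3600


def tb_per_hr_to_gbps(tb_per_hr):
    """Convert TB/hr to Gbit/s, assuming decimal units and 8 bits per byte."""
    return tb_per_hr * 1e12 * 8 / 3600 / 1e9


# The rates mentioned in the post: under 1 TB/hr now, 3-4 TB/hr before.
for rate in (1, 3, 4):
    mb_s = tb_per_hr_to_mb_per_s(rate)
    gbps = tb_per_hr_to_gbps(rate)
    print(f"{rate} TB/hr = {mb_s:.0f} MB/s = {gbps:.2f} Gbit/s")
```

    Note that these are application-level processing rates. With a DASH (deduplicated) copy, the bytes actually sent over the wire can be far lower than the processed size, so the figures are not directly comparable to link bandwidth.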

    We've also played with the advanced settings on the MAs, such as LookAheadLinkReaderSlots, but no dice.

    We've run disk performance tools like diskspd from MS and are not seeing horrible performance per se. Backups are running fine, as we're using client-side dedupe, so not much data is being written to the library.

    However, we are seeing somewhat slow performance for disk-read-intensive jobs like Synthetic Fulls and DDB Verifications. But again, we're not seeing slow disk performance outside of CV.

    It's so slow that the backup jobs expire before the Aux Copy finishes, so at this point it's pointless to Aux Copy to protect the data.

    I am at my wit's end here, as I am out of ideas. I stumbled upon the attached setting and wanted to ask if anyone has enabled it on a secondary copy that is involved in a DASH-optimized Aux Copy job over a high-latency WAN?

    Frustrated

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-25-2019, 7:15 PM

    Hi oalexis

    Is this the setting you are referring to?

    Enable source side disk cache: You can optimize the signature lookup process by setting up the local source-side cache on the client or source MediaAgent (for DASH Copy). After you set up the local source-side cache, the signatures are first looked up in the local source-side cache. A remote lookup is initiated only when the signatures are not available in the local source-side cache. The local lookup reduces the response time for a signature comparison in a network with high latency.

    This will definitely be helpful when there is high latency across the network involved in the DASH Copy. If the storage is not the bottleneck, then using the above option might help reduce the number of remote lookups, resulting in better DASH Copy performance.
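    The lookup order described above (local cache first, remote only on a miss) can be sketched roughly as follows. This is a toy illustration, not Commvault's implementation: the class names, the use of SHA-256 as the signature, and the in-memory sets are all assumptions made for the example.

```python
import hashlib


class RemoteSignatureStore:
    """Hypothetical stand-in for the destination-side signature store."""

    def __init__(self):
        self.signatures = set()
        self.lookups = 0  # each lookup models one WAN round trip

    def contains(self, sig):
        self.lookups += 1
        return sig in self.signatures

    def add(self, sig):
        self.signatures.add(sig)


class SourceSideCache:
    """Hypothetical local cache consulted before any remote lookup."""

    def __init__(self, remote):
        self.remote = remote
        self.local = set()

    def needs_transfer(self, block):
        # SHA-256 is an assumption; the real signature algorithm may differ.
        sig = hashlib.sha256(block).hexdigest()
        if sig in self.local:
            return False  # known duplicate, answered without a round trip
        known_remotely = self.remote.contains(sig)  # cache miss: go remote
        if not known_remotely:
            self.remote.add(sig)  # the block itself would be sent here
        self.local.add(sig)  # remember the signature for next time
        return not known_remotely


remote = RemoteSignatureStore()
cache = SourceSideCache(remote)
blocks = [b"a", b"b", b"a", b"a", b"b"]
sent = [cache.needs_transfer(b) for b in blocks]
print(sent, "remote lookups:", remote.lookups)
```

    In this sketch only the first sighting of each block costs a round trip, so the benefit grows as the local cache fills with signatures that recur.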

    Side Note:

    Regarding the slow read performance you mentioned: when carrying out disk performance tests (outside of Commvault), are you also taking into account the storage's cache capability? In most scenarios, vendors testing storage performance might not account for the storage cache (in-memory or SSD), so initial writes/reads will be cached and will perform relatively well.

    However, when reading from the disk/RAID, the associated blocks need to be promoted to the cache, then traverse the front-end bus, and then be presented to the host. If there is latency at the disk/RAID level, then the read IOPS will be poor.

    Regards

    Winston 

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-27-2019, 11:18 PM

    Thank you so much for responding. Yes, that's what's killing me. When running the test with the cache-disabled option, we don't get the horrible performance seen on the stream reads for the Aux Copy in the logs or with the cvdiskperf tool.

    What is even more confusing is that I took the same 'slow storage' and created another Aux Copy job to another datacenter, but this one had 1 ms WAN latency, and the Aux Copy performance was through the roof! That tells me it's the network, but support says otherwise.

    I am going to play with the cache options to see if this will make any difference.

    thank you

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 2:30 AM

    Hi oalexis

    Based on your testing and your deduction from moving to another network, I strongly believe the latency of the WAN is the cause of the slowness.

    Is there any way to improve the bandwidth for the primary WAN?

    Regards

    Winston 

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 9:09 AM

    Hi oalexis,

    Can you please escalate the issue so we can check the slow disk reads, especially for Synthetic Fulls and DDB Verification?

    Thanks,

    Ankur

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 9:22 AM
    Can you please escalate this so we can take a look at the setup and figure out what is causing the bottleneck?
  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 10:30 AM

    Are you using Commvault software encryption on the source copy? If so, what option are you using for the secondary copy (preserve/re-encrypt/plain text)? We have an open issue where there are so many encryption keys needing to be queried from the CSDB that it slows down any type of read operation (Aux copy/DV2/synth fulls/restores), basically spending most of the time waiting for keys to be pulled.

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 11:51 AM

    No, we're not using compression on the source copy.

    Thank you

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 12:00 PM

    Yes, that's what I'm saying! I have a ticket open with my network team and it doesn't seem to be a bandwidth issue. It's a 5 Gbps circuit. When using iperf, I am able to push 800 Mbps. But when I change the window size to something like 1000000, I am able to push 3 Gbps consistently.
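    The window-size effect seen with iperf follows from the bandwidth-delay product: a single TCP stream can have at most one window of data in flight per round trip, so its ceiling is roughly window / RTT. Below is a minimal sketch of that arithmetic. The 30 ms RTT and the 64 KB default window are hypothetical values chosen for illustration (the actual latency of this WAN is not stated in the thread), and the function names are made up for the example.

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Approximate single-stream TCP ceiling in Mbit/s: window / RTT."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def window_for_target(target_mbps, rtt_ms):
    """Window size in bytes needed to sustain target_mbps at a given RTT."""
    return target_mbps * 1e6 / 8 * (rtt_ms / 1000)


RTT_MS = 30  # hypothetical round-trip time, not a measured value

for window in (65_535, 1_000_000):
    ceiling = max_throughput_mbps(window, RTT_MS)
    print(f"window={window:>9} bytes -> ~{ceiling:.1f} Mbit/s per stream")

needed = window_for_target(3000, RTT_MS)
print(f"window needed for 3 Gbit/s in one stream: ~{needed / 1e6:.2f} MB")
```

    The same relationship is why adding parallel streams or enlarging the window raises aggregate throughput on a high-latency link, while on a 1 ms link even a small window is enough.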

    I wonder if there is a similar setting in Commvault? We have the stream count set to 200, which isn't helping.

    Also enabling source cache didn't help at all with performance.

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 12:03 PM

    Support isn't willing to escalate as they are focusing on the slow disk reads. Yet I say it's network latency.

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 10-28-2019, 8:38 PM

    Hi oalexis

    Can you PM me the incident number? I can take a look and have the right party engaged to assist further.

    Regards

    Winston 

  • Re: Fighting Slow Aux Copies Between Datacenters
    Posted: 11-01-2019, 4:29 PM

    So I discovered that it takes a few days for the cache to build. After that, I could see the throughput almost triple. While we still have slow disk reads, enabling the source-side cache proves network latency to be a factor. We're in the process of building another datacenter and will move the Aux Copies to that one instead.

    Thank you all for the responses
