Pruning and Amazon S3

Last post 02-11-2015, 11:19 AM by Ali. 4 replies.
  • Pruning and Amazon S3
    Posted: 02-05-2015, 12:59 PM

    Hi,

    I am not sure if I am posting this question in the right section. If this is the wrong place, I request the admin to move it to the right section.

     

    I want to understand how pruning works in Simpana; I am totally new to this product. While I was working on using Amazon S3 cloud storage with Simpana, I came across a warning about using it. From what I understand, the problem is only with S3 storage and not with other cloud storage.

    Could someone please explain to me in detail what pruning (micro/macro) is and how it affects us when we use S3 storage? Is this something Commvault is working to fix, or is this a problem with S3? It would really help with my solution.

     

    Thanks

    Vijay

    dishvijay@yahoo.com

  • Re: Pruning and Amazon S3
    Posted: 02-05-2015, 3:08 PM

    Good Day Vijay,

    Multi-layered question here.

    Question/Comment: I want to understand how pruning works in Simpana. Could someone please explain to me in detail what pruning (micro/macro) is?

    Data Aging is the process of removing old data from secondary storage so that the associated media can be reused for future backups. A basic explanation of micro vs. macro pruning is below. It is a lot more technical than this, but these are the basics.

    Micro pruning is a term specific to using the DDB (deduplication database): it removes specific references and data from disk storage based on retention criteria. Logical pruning occurs first and decrements the reference count in the DDB; once all references to an archive file (the way we track our data) reach zero, we then physically delete the data.

    Macro pruning is the method always used for non-deduplicated data. Once the archive files associated with jobs meet retention, the data or volume directories are immediately removed from disk or storage. Macro pruning is also seen with deduplication when the DDB is sealed and all jobs tied to that DDB have met retention (the aging process is then similar to non-dedupe data, because we no longer use the DDB process noted under micro pruning).
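    The micro-prune reference counting described above can be sketched roughly as follows. This is a simplified toy illustration, not Commvault's actual implementation; the class and method names are made up:

    ```python
    class DedupStore:
        """Toy model of micro pruning: blocks are reference-counted in a
        DDB-like index and physically deleted only when no archive file
        references them anymore. Not Commvault's real implementation."""

        def __init__(self):
            self.refcount = {}  # block hash -> number of archive files using it
            self.blocks = {}    # block hash -> stored data

        def write(self, block_hash, data):
            # A new backup either stores the block or just bumps its count.
            if block_hash in self.refcount:
                self.refcount[block_hash] += 1
            else:
                self.refcount[block_hash] = 1
                self.blocks[block_hash] = data

        def micro_prune(self, block_hashes):
            # Logical prune: decrement references; physical delete at zero.
            for h in block_hashes:
                self.refcount[h] -= 1
                if self.refcount[h] == 0:
                    del self.refcount[h]
                    del self.blocks[h]  # physical deletion from storage

    store = DedupStore()
    store.write("abc", b"data")  # job 1 stores the block
    store.write("abc", b"data")  # job 2 deduplicates against it
    store.micro_prune(["abc"])   # job 1 ages: refcount drops to 1, block kept
    store.micro_prune(["abc"])   # job 2 ages: refcount hits 0, block deleted
    ```

    Macro pruning, by contrast, skips this per-block bookkeeping entirely: the whole sealed store or volume directory is deleted in one step once every job in it has met retention.
    
    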

    Question/Comment: While working on using Amazon S3 cloud storage with Simpana, I came across a warning about using it.

    What warning specifically are you referring to? I am assuming it is that micro pruning is not supported? If so, it is explained below.

    Question/Comment: From what I understand the problem is only with S3 storage and no other cloud storage. How does this affect us when we use S3 storage? Is this something Commvault is working to fix, or is this a problem with S3?

    The problem, or rather limitation, I believe you are referring to involves using this type of storage with deduplicated data: the data cannot be micro pruned (pretty sure this happens with other cloud vendors as well). This is discussed briefly in the CV link provided below: if you have cloud storage configured with deduplication, the data is not pruned until the DDB is sealed and all of the backup jobs associated with that DDB meet the retention rules, allowing the DDB to become aged. This is MACRO pruning. This limitation is typically a cloud thing, and we are always looking to enhance our part when possible.

    Question/Comment: It would really help with my solution.

    This CV link may help you.

    http://documentation.commvault.com/commvault/v10/article?p=features/cloud_storage/cloud_storage_faq.htm#Why_is_the_Data_Aging_operation_not_pruning_the_data_for_the_cloud_storage_library

    Considerations

    • You must have enough space on the Amazon S3 storage to hold one full-sealed store (that is, 30 days of deduplication backup data).
    • For Silo restores, you must ask Amazon to migrate the data from Amazon Glacier to Amazon S3. Therefore, you must have enough space on the Amazon S3 storage to perform restores.
    • Make sure to include the cost of uploading the full store in your overall cloud storage costs.
    • Reading data from Amazon Glacier is extremely slow. Reading large amounts of data can be very expensive. Therefore, you should choose Amazon Glacier storage for data that has a long-term retention period and when the probability of recovering the data is very low.

    Does Amazon Glacier support encryption?

    Yes, but data must be encrypted on the local disk before it is moved to Amazon S3, and then migrated to Amazon Glacier.

    Is job-based (micro) pruning available on Amazon Glacier?

    No. The entire Silo must be aged at the same time. Since Amazon Glacier should be used only for long-term retention copies, it would take a long time before you could start deleting data.

    Is Cache Indexing available on Amazon Glacier?

    No. You must index the local copy.

    What should I consider when I calculate the cost of storing data on Amazon Glacier?

    1. Estimate how much data is stored locally for 30 days, or whatever your local retention period is.

      This is the space that you require on Amazon S3.

    2. Multiply the space requirement by the number of months you will store the data on Amazon Glacier.

      This is the cost of storing the data on Amazon Glacier.

    3. Add the cost of storing data on Amazon S3 and the cost of uploading the data to Amazon S3 to your overall calculations.
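    The three steps above can be turned into a quick back-of-the-envelope calculation. This is a hypothetical sketch: the function name and the per-TB prices in the example are made-up placeholders, so substitute Amazon's current rates for your region:

    ```python
    def glacier_cost_estimate(local_tb, months_retained,
                              s3_price_per_tb, glacier_price_per_tb,
                              upload_price_per_tb):
        """Rough cost estimate following the three steps in the FAQ.
        All prices are placeholders - use current Amazon pricing."""
        # Step 1: space needed on S3 = one local retention period of data.
        s3_space_tb = local_tb
        # Step 2: Glacier cost = that space held for the full retention period.
        glacier_cost = s3_space_tb * months_retained * glacier_price_per_tb
        # Step 3: add S3 storage and upload costs to the overall total.
        s3_cost = s3_space_tb * s3_price_per_tb
        upload_cost = s3_space_tb * upload_price_per_tb
        return glacier_cost + s3_cost + upload_cost

    # Example: 10 TB kept locally for 30 days, retained 12 months on Glacier,
    # with illustrative prices of $23/TB (S3), $4/TB/month (Glacier), free upload.
    total = glacier_cost_estimate(10, 12, s3_price_per_tb=23.0,
                                  glacier_price_per_tb=4.0,
                                  upload_price_per_tb=0.0)
    print(total)  # 480 (Glacier) + 230 (S3) + 0 (upload) = 710.0
    ```
    
    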

    Can I restore deduplicated data that has been moved to Amazon Glacier?

    If deduplicated data was moved to Amazon Glacier, you must ask Amazon to move it back to the main cloud storage location first. When all of the required data is moved back to the main cloud storage location, then you can restore the data.

    Can I perform Auxiliary Copy or Synthetic Full backup operations on data that is stored on Amazon Glacier?

    No. In this case, you must restore the data to the main cloud storage location. When all of the required backup data is restored to the main cloud storage location, then you can run Auxiliary Copy or Synthetic Full backup operations on the data.

    Supplemental

    http://documentation.commvault.com/commvault/v10/article?p=features/data_aging/data_aging_advanced.htm

     

    Best Regards

    SG1

  • Re: Pruning and Amazon S3
    Posted: 02-10-2015, 8:50 AM

    SG1,

     

    I really appreciate the detailed explanation. From your post I understood what macro and micro pruning are. Adding to this further:

    If we enable the deduplication feature on cloud-based storage, what best practices, if any, need to be followed from a technical and cost standpoint?

    Thanks

    Vijay

    dishvijay@yahoo.com

  • Re: Pruning and Amazon S3
    Posted: 02-10-2015, 9:04 AM

    Hi Vijay,

    Review these most frequently asked questions; they may answer the things you are thinking about from a technical perspective.

    http://documentation.commvault.com/commvault/v10/article?p=features/cloud_storage/cloud_storage_faq.htm

    Cost >>

    What should I consider when I calculate the cost of storing data on Amazon Glacier?

    1. Estimate how much data is stored locally for 30 days, or whatever your local retention period is.

      This is the space that you require on Amazon S3.

    2. Multiply the space requirement by the number of months you will store the data on Amazon Glacier.

      This is the cost of storing the data on Amazon Glacier.

    3. Add the cost of storing data on Amazon S3 and the cost of uploading the data to Amazon S3 to your overall calculations.

    Regards

    SG1

  • Re: Pruning and Amazon S3
    Posted: 02-11-2015, 11:19 AM by Ali

    Hi Vijay,

    I would also suggest reaching out to your account team. Though much of the info you will receive here is experiential, there may be a unique use case or business requirement we are missing clarity on, which the account team may be able to provide insight into.

    Backup and data aging are easy to understand and implement; however, you must also consider restore times and whether your SLAs are being met, or the entire solution will fall short. This is a common pitfall which hopefully you will test and validate prior to going live. Hope that makes sense.
