Good Day Vijay,
Multi layered question here.
Question\Comment: I want to understand how pruning in Simpana works. Could someone please explain to me in detail what pruning is (micro/macro)?
Data Aging is the process of removing old data from secondary storage to allow the associated media to be reused for future backups. A basic explanation of micro vs. macro pruning follows. It is a lot more technical than this, but these are the basics.
Micro pruning is a term that is specific to using the DDB (deduplication database): specific references and data are removed from disk storage based on retention criteria. Logical pruning occurs first and decrements the reference count in the DDB; once all references to an archive file (the way we track our data) reach zero, we physically delete the data.
Macro pruning is the method always used for non-deduplicated data. Once the archive files associated with jobs meet retention, the data or volume directories are immediately removed from disk or storage. Macro pruning is also seen with deduplication when the DDB is sealed and all jobs tied to that DDB have met retention (the aging process is then similar to that for non-deduplicated data, because we no longer use the DDB process noted under micro pruning).
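To make the difference concrete, here is a minimal sketch of the two pruning styles using a toy reference-counted store. The class and method names (DedupDB, micro_prune, macro_prune) are illustrative only, not Simpana internals; they just model "decrement references, physically delete at zero" versus "sealed store ages, drop everything at once".

```python
class DedupDB:
    """Toy deduplication database: tracks how many archive files reference each block."""

    def __init__(self):
        self.refcounts = {}         # block id -> number of archive files referencing it
        self.blocks_on_disk = set() # blocks physically present on disk

    def add_reference(self, block):
        self.refcounts[block] = self.refcounts.get(block, 0) + 1
        self.blocks_on_disk.add(block)

    def micro_prune(self, archive_file_blocks):
        """Micro prune: logically decrement counts; physically delete only at zero."""
        for block in archive_file_blocks:
            self.refcounts[block] -= 1
            if self.refcounts[block] == 0:      # no archive file references it any more
                del self.refcounts[block]
                self.blocks_on_disk.discard(block)  # physical delete

    def macro_prune(self):
        """Macro prune: store is sealed and all jobs have aged, so drop everything."""
        self.refcounts.clear()
        self.blocks_on_disk.clear()


db = DedupDB()
job1 = ["A", "B"]
job2 = ["B", "C"]  # block B is deduplicated across both jobs
for blocks in (job1, job2):
    for b in blocks:
        db.add_reference(b)

db.micro_prune(job1)  # job1 meets retention: A is deleted, B survives (job2 still references it)
print(sorted(db.blocks_on_disk))  # ['B', 'C']

db.macro_prune()      # sealed-store case: everything goes at once
print(sorted(db.blocks_on_disk))  # []
```

The key point the sketch shows is why cloud storage that cannot micro prune must wait: without the per-block decrement path, the only option left is the macro path at the bottom.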
Question\Comment: While working on using Amazon S3 cloud storage with Simpana, I came across a warning about using this.
What warning specifically are you referring to? Assuming it is that micro pruning is not supported; if so, it is explained below.
Question\Comment: From what I understand, the problem is only with S3 storage and no other cloud storage. How does this affect us when we use S3 storage? Is this something Commvault is working to fix, or is this a problem with S3?
The problem (or limitation) I believe you are referring to concerns using this type of storage with deduplicated data: the data cannot be micro pruned (and I am pretty sure this happens with other cloud vendors as well). This is discussed briefly in the CV link provided below: if you have cloud storage configured with deduplication, the pruning of the data is not done until the DDB is sealed and all of the backup jobs that are associated with that DDB meet the retention rules for the DDB to become aged. This is a MACRO prune. This limitation is typically a cloud thing, and we are always looking to enhance our part when possible.
Question\Comment: It would really help with my solution.
This CV link may help you.
- You must have enough space on the Amazon S3 storage to hold one full-sealed store (that is, 30 days of deduplication backup data).
- For Silo restores, you must ask Amazon to migrate the data from Amazon Glacier to Amazon S3. Therefore, you must have enough space on the Amazon S3 storage to perform restores.
- Make sure to include the cost of uploading the full store in your overall cloud storage costs.
- Reading data from Amazon Glacier is extremely slow. Reading large amounts of data can be very expensive. Therefore, you should choose Amazon Glacier storage for data that has a long-term retention period and when the probability of recovering the data is very low.
Does Amazon Glacier support encryption?
Yes, but data must be encrypted on the local disk before it is moved to Amazon S3, and then migrated to Amazon Glacier.
Is job-based (micro) pruning available on Amazon Glacier?
No. The entire Silo must be aged at the same time. Since Amazon Glacier should be used only for long-term retention copies, it would take a long time before you could start deleting data.
Is Cache Indexing available on Amazon Glacier?
No. You must index the local copy.
What should I consider when I calculate the cost of storing data on Amazon Glacier?
- Estimate how much data is stored locally for 30 days, or whatever your local retention period is.
This is the space that you require on Amazon S3.
- Multiply the space requirement by the number of months you will store the data on Amazon Glacier.
This is the cost of storing the data on Amazon Glacier.
- Add the cost of storing data on Amazon S3 and the cost of uploading the data to Amazon S3 to your overall calculations.
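The cost steps above can be sketched as a short calculation. All figures here (data volume, months of retention, per-GB rates, upload cost) are made-up placeholders for illustration, not current AWS pricing; substitute your own numbers.

```python
# Worked example of the Glacier cost estimate, with placeholder values only.

local_retention_gb = 500            # data stored locally for the 30-day retention period;
                                    # this is also the space required on Amazon S3
months_on_glacier = 36              # long-term retention period on Amazon Glacier

s3_price_per_gb_month = 0.023       # placeholder rate, not real AWS pricing
glacier_price_per_gb_month = 0.004  # placeholder rate, not real AWS pricing
upload_price_per_gb = 0.01          # placeholder data-transfer cost

# Step 1: space requirement on S3 = locally retained data (one full sealed store).
s3_space_needed_gb = local_retention_gb

# Step 2: multiply the space by the number of months on Glacier.
glacier_cost = s3_space_needed_gb * glacier_price_per_gb_month * months_on_glacier

# Step 3: add the cost of holding the store on S3 and of uploading it.
s3_cost = s3_space_needed_gb * s3_price_per_gb_month
upload_cost = s3_space_needed_gb * upload_price_per_gb

total = glacier_cost + s3_cost + upload_cost
print(f"Glacier: ${glacier_cost:.2f}, S3: ${s3_cost:.2f}, "
      f"upload: ${upload_cost:.2f}, total: ${total:.2f}")
```

With these placeholder rates the total comes out to $88.50; the structure of the calculation, not the numbers, is the point.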
Can I restore deduplicated data that has been moved to Amazon Glacier?
If deduplicated data was moved to Amazon Glacier, you must ask Amazon to move it back to the main cloud storage location first. When all of the required data is moved back to the main cloud storage location, then you can restore the data.
Can I perform Auxiliary Copy or Synthetic Full backup operations on data that is stored on Amazon Glacier?
No. In this case, you must restore the data to the main cloud storage location. When all of the required backup data is restored to the main cloud storage location, then you can run Auxiliary Copy or Synthetic Full backup operations on the data.