questions about DDB

Last post 05-16-2018, 8:45 PM by Wwong. 3 replies.
  • questions about DDB
    Posted: 05-15-2018, 11:59 PM

    v11 SP10

     

    Hi,

     

    1. I am confused about where the data is located, based on the description below. The DDB contains block signatures. Is the index also stored in the DDB? What about the unique data blocks themselves?

    ~~~

    https://documentation.commvault.com/commvault/v11/article?p=12401.htm

    Comparing signatures

    The new signature is compared against a database of existing signatures for previously backed up data blocks on the destination storage. The database that contains the signatures is called the Deduplication Database (DDB).

    • If the signature exists, the DDB records that an existing data block is used again on the destination storage. The associated MediaAgent writes the index information to the DDB on the destination storage, and the duplicate data block is discarded.
    • If the signature does not exist, the new signature is added to the DDB. The associated MediaAgent writes both the index information and the data block to the destination storage.

    ~~~

    2. I was told that even if the last reference to a unique block is deleted, e.g. as a result of deleting backups, the data block will not be freed because it can be used for future comparisons. If so, doesn't that mean that wherever the unique data blocks are stored, usage will only grow and I will eventually run out of space?

    I am currently facing a problem: the storage used by my MediaAgent is running out because, as it turns out, one of the big jobs did not have dedup turned on. But even after deleting many such jobs, I still don't see significant space freed up.

     

    Thanks,

  • Re: questions about DDB
    Posted: 05-16-2018, 1:05 AM

    Hi acpp

    With Commvault deduplication there are two major components:

    - DDB volume on the MediaAgent -> this stores the primary and secondary database tables, which keep track of where the actual data (signatures) is written on the Disk Library (storage array or cloud, depending on your setup).

    - Disk Library -> this stores the actual metadata references and signatures that make up the job.

    When you run your first backup, all of the blocks are unique, and references are inserted into the DDB volume to identify where each block is written on your storage.

    Deduplication is the process of referencing existing blocks on the Storage Array to reduce data consumption. 
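
    To make the write path concrete, here is a minimal sketch in Python. It is a toy model, not Commvault's implementation: the class name `DedupStore`, the SHA-256 signature, and the 4-byte block size are all assumptions chosen for illustration. The `ddb` dictionary stands in for the DDB volume and `library` stands in for the Disk Library.

```python
import hashlib


class DedupStore:
    """Toy model of signature-based deduplication (hypothetical, not Commvault code)."""

    def __init__(self):
        self.ddb = {}      # signature -> reference count (stand-in for the DDB)
        self.library = {}  # signature -> block bytes (stand-in for the Disk Library)
        self.jobs = {}     # job id -> ordered list of signatures that make up the job

    def backup(self, job_id, data, block_size=4):
        """Split data into blocks; return how many new bytes were actually written."""
        written = 0
        signatures = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            sig = hashlib.sha256(block).hexdigest()
            if sig in self.ddb:
                # Signature already known: record another reference, discard the block.
                self.ddb[sig] += 1
            else:
                # New signature: add it to the DDB and write the block to storage.
                self.ddb[sig] = 1
                self.library[sig] = block
                written += len(block)
            signatures.append(sig)
        self.jobs[job_id] = signatures
        return written

    def restore(self, job_id):
        """Rebuild a job from its per-job signature list and the stored blocks."""
        return b"".join(self.library[sig] for sig in self.jobs[job_id])


store = DedupStore()
first = store.backup("job1", b"AAAABBBBCCCC")   # three unique blocks are written
second = store.backup("job2", b"AAAABBBBDDDD")  # only the DDDD block is new
```

    In this sketch the first backup writes every block, while the second writes only the one block whose signature was not already known.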

    So even when you delete deduplicated jobs, the blocks written on the storage array may still be referenced by other jobs.

    However, if the original job that you ran was non-deduplicated, deleting that job should free up all of its space.

    In short, when you delete deduplicated jobs, the amount of space freed depends on how many of their blocks are still referenced by other jobs.
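
    As a rough worked example of why the amount varies, the sketch below estimates what deleting one job could free. The function `reclaimable_bytes`, the job names, and the 128 KB block size are assumptions made up for illustration; a non-deduplicated job is modelled simply as a job whose blocks nobody else shares.

```python
def reclaimable_bytes(job_blocks, job_to_delete, block_size):
    """Estimate bytes freed by deleting one job.

    job_blocks maps a job id to the set of block signatures it references.
    Only blocks that no other job references can be reclaimed.
    """
    still_needed = set()
    for job, blocks in job_blocks.items():
        if job != job_to_delete:
            still_needed |= blocks
    only_here = job_blocks[job_to_delete] - still_needed
    return len(only_here) * block_size


BLOCK = 128 * 1024  # illustrative block size, an assumption

jobs = {
    "full_monday": {"a", "b", "c", "d"},
    "incr_tuesday": {"a", "b", "c", "e"},
    # Modelled as non-deduplicated: none of its blocks are shared with other jobs.
    "non_dedup_copy": {"x1", "x2", "x3", "x4"},
}

shared = reclaimable_bytes(jobs, "full_monday", BLOCK)       # only block "d" is unshared
isolated = reclaimable_bytes(jobs, "non_dedup_copy", BLOCK)  # all four blocks are unshared
```

    Deleting the deduplicated job frees one block here, while deleting the unshared job frees all four.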

    The advantage of having the DDB volume keep track of the signatures while the Disk Library stores the metadata and signatures: in the worst-case scenario, if the DDB volume is corrupted, we are still able to restore data from the Disk Library, because additional metadata references are also written to the Disk Library, which removes the need to interact with the DDB.

    Thank you 

    Winston 

  • Re: questions about DDB
    Posted: 05-16-2018, 1:34 PM

    Hi Winston,

     

    Thanks for the explanation.

     

    >>So although you delete jobs (Deduplicated), the blocks written on the Storage Array, could be referenced for other Jobs.

    When there are no jobs referencing the blocks any more, will Commvault remove the unique data blocks from the storage array? I am curious whether storage consumption only grows over time, even with deduplication.

     

    Thanks,

  • Re: questions about DDB
    Posted: 05-16-2018, 8:45 PM

    Hi acpp

    That's correct. When the unique blocks stored on the storage array no longer have any references, Commvault will automatically prune those blocks to reclaim the space.

    In practice, in most deduplication environments the consumed space will not drop much, because you will always have backups running, so while old jobs meet retention, new jobs come in and associate themselves with the unique blocks already stored on the storage array.
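
    The pruning behaviour can be sketched with simple reference counting. This is a simplified model under stated assumptions, not how Commvault implements it: `PruningModel` and its methods are hypothetical names, and `prune()` is only a stand-in for the pruning that follows Data Aging.

```python
from collections import Counter


class PruningModel:
    """Toy reference-counting model of block pruning (hypothetical names)."""

    def __init__(self):
        self.refs = Counter()  # signature -> number of job references
        self.jobs = {}         # job id -> list of signatures
        self.stored = set()    # signatures that currently occupy space on storage

    def add_job(self, job_id, signatures):
        self.jobs[job_id] = list(signatures)
        for sig in signatures:
            self.refs[sig] += 1
            self.stored.add(sig)

    def delete_job(self, job_id):
        # Deleting a job only drops references; no space is freed at this point.
        for sig in self.jobs.pop(job_id):
            self.refs[sig] -= 1

    def prune(self):
        # Remove blocks that no remaining job references and return their signatures.
        dead = {sig for sig in self.stored if self.refs[sig] <= 0}
        self.stored -= dead
        for sig in dead:
            del self.refs[sig]
        return dead


model = PruningModel()
model.add_job("old", ["a", "b", "c"])
model.add_job("new", ["b", "c", "d"])  # the newer job re-references b and c
model.delete_job("old")
before = len(model.stored)  # blocks are still on storage until pruning runs
pruned = model.prune()      # only "a" has no remaining references
```

    Because the newer job still references two of the older job's three blocks, deleting the older job reclaims only one block in this model.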

    Unless you stop running backups and manually delete jobs, it's hard to determine space reclamation from a storage perspective.

    Note: if you simply stop running jobs, you will also not see much space returned, because the jobs will not meet basic cycle and retention and will not age out.

    So in most cases, with deduplication you will most likely see storage consumption stay the same or grow slowly.

    The only times you will see a large amount of space being reclaimed are:

    • Deleting a large number of jobs -> then running Data Aging.
    • Disassociating multiple clients that back up to a specific DDB and deleting a large number of jobs -> then running Data Aging.

    The above is only an example. We don't usually recommend deleting jobs, and would rather have jobs age out naturally when they meet retention.

    Hopefully the above provides a clearer overview of deduplication.

    Thank you

    Winston