We have had some performance problems with backups (very high Q&I time), and when we contacted Commvault support, they suggested that we compact the GDDB database so we could enable garbage collection. The compact failed, however, with the result that both partitions in our GDDB became corrupt and went offline.
This happened on 23 January, and since then we have managed to recover one of the partitions. The other one, however, is very slow in the Prune Records state. It has been running since 29/01/2020 12:45 and is now at 87 replayed records with 13 pending records. After multiple contacts with Commvault support, the only response we get is: you need to wait until the job is finished.
We have put some additional workarounds in place: we disabled dedupe while we still had no partition back, and once the first partition was back, we allowed jobs to run to the copy with at least 1 partition available. Right now, however, we are running out of storage, and it seems that the pruning job is going to take 4 days to finish (if it ever gets there, which I'm not that sure about).
Multiple escalations have not helped, and neither has changing the priority to critical; the only response we get is: you just have to wait until it is done. I understand that may be the case, but surely there should be some kind of workaround so we can keep our backups running? We can't free any more storage, and sealing the DB isn't an option anymore since we don't have enough space. It used to be an option, but support thought it better to wait until the job is complete.
Does anyone here have any ideas for surviving the weekend? Would it be an option to disable the MA that still has the Prune Records job running, so that all jobs go to the online MA?
The incident number at Commvault is 200115-309.