US 11,704,037 B2
Deduplicated storage disk space utilization
Anirvan Duttagupta, San Jose, CA (US); Shreyas Talele, Santa Clara, CA (US); and Anubhav Gupta, Sunnyvale, CA (US)
Assigned to Cohesity, Inc., San Jose, CA (US)
Filed by Cohesity, Inc., San Jose, CA (US)
Filed on Mar. 30, 2020, as Appl. No. 16/834,136.
Prior Publication US 2021/0303192 A1, Sep. 30, 2021
Int. Cl. G06F 3/06 (2006.01); G06F 16/901 (2019.01); H04L 9/40 (2022.01)
CPC G06F 3/0641 (2013.01) [G06F 3/067 (2013.01); G06F 3/0608 (2013.01); G06F 16/9027 (2019.01); H04L 63/20 (2013.01)]
16 Claims
 
1. A method, comprising:
traversing a plurality of different views of data associated with a first storage domain stored on a deduplicated storage to determine data chunks belonging to each view of the plurality of different views of data associated with the first storage domain, wherein the deduplicated storage separately deduplicates the data associated with the first storage domain from data associated with one or more other storage domains, wherein at least a first copy and a second copy of a first data chunk are stored by the deduplicated storage and the first copy of the first data chunk is included in the data associated with the first storage domain and the second copy of the first data chunk is included in the data associated with a second storage domain;
receiving a request for a metric associated with disk space utilization of a group of one or more selected views of data associated with the first storage domain included in the plurality of different views of data associated with the first storage domain that are stored on the deduplicated storage;
identifying data chunks belonging to the one or more selected views of data associated with the first storage domain of the group but not other views of the plurality of different views of data associated with the first storage domain stored on the deduplicated storage, wherein the identified data chunks includes the first copy of the first data chunk;
determining an incremental disk space utilization of the group including by determining a total size of the identified data chunks;
providing the metric associated with disk space utilization based on the determined incremental disk space utilization of the group; and
using the metric associated with disk space utilization to manage the deduplicated storage, wherein managing the deduplicated storage comprises:
determining a trajectory associated with disk space utilization associated with the group;
adjusting one or more policies associated with the deduplicated storage, wherein the one or more policies associated with the deduplicated storage reduce a frequency at which one or more protection jobs associated with the group are performed; and
determining a modified trajectory associated with the disk space utilization associated with the group based on the one or more adjusted policies associated with the deduplicated storage.
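
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical: `Chunk`, `incremental_utilization`, `project_utilization`, the view names, and the sizes. It assumes each view of the first storage domain has already been traversed into a set of chunks, and that the trajectory is a simple linear projection driven by protection job frequency; the claim does not specify a trajectory model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A deduplicated data chunk (hypothetical model)."""
    domain: str    # storage domain; deduplication is separate per domain
    chunk_id: str  # content fingerprint
    size: int      # bytes on disk


def incremental_utilization(views: dict[str, set[Chunk]], group: set[str]) -> int:
    """Total size of chunks referenced only by views in `group`.

    `views` maps each view name in ONE storage domain to the chunks found
    by traversing that view. Chunks shared with any view outside the group
    are excluded, because deleting the group would not free them.
    """
    in_group = set().union(*(views[name] for name in group))
    others = set().union(
        *(chunks for name, chunks in views.items() if name not in group)
    )
    return sum(chunk.size for chunk in in_group - others)


def project_utilization(current_bytes: int, bytes_per_job: int,
                        jobs_per_day: int, days: int) -> int:
    """Linear trajectory (an assumption, not taken from the claim)."""
    return current_bytes + bytes_per_job * jobs_per_day * days


# Chunks of the first storage domain "A".
c1 = Chunk("A", "c1", 100)
c2 = Chunk("A", "c2", 200)
c3 = Chunk("A", "c3", 300)
c4 = Chunk("A", "c4", 400)
# A second copy of c1 in domain "B" is a distinct chunk and is never
# traversed when measuring views of domain "A".
c1_copy_in_b = Chunk("B", "c1", 100)

views = {
    "v1": {c1, c2},
    "v2": {c2, c3},
    "v3": {c3, c4},
}

group = {"v1", "v2"}
metric = incremental_utilization(views, group)  # c1 + c2 = 300 bytes

# Trajectory before and after a policy change that reduces job frequency.
before = project_utilization(metric, bytes_per_job=100, jobs_per_day=4, days=10)
after = project_utilization(metric, bytes_per_job=100, jobs_per_day=1, days=10)
```

In this sketch the group {v1, v2} reports 300 bytes rather than the 600 bytes its views reference in total, because chunk c3 is also held by v3 and would remain on disk if the group were removed. Keying `Chunk` on the storage domain is one way to reflect that the copy of c1 in the second domain is stored and counted separately.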