US 9,811,532 B2
Executing a cloud command for a distributed filesystem
Brian Christopher Parkison, Santa Cruz, CA (US); Andrew P. Davis, Santa Cruz, CA (US); John Richard Taylor, Tiburon, CA (US); and Randy Yen-pang Chou, San Jose, CA (US)
Assigned to PANZURA, INC., Campbell, CA (US)
Filed by Panzura, Inc., Campbell, CA (US)
Filed on Sep. 5, 2013, as Appl. No. 14/019,247.
Application 14/019,247 is a continuation in part of application No. 13/971,621, filed on Aug. 20, 2013.
Application 13/971,621 is a continuation in part of application No. 12/772,806, filed on May 3, 2010, granted, now 8,719,444.
Application 12/772,806 is a continuation in part of application No. 13/782,729, filed on Mar. 1, 2013.
Application 13/782,729 is a continuation in part of application No. 13/769,185, filed on Feb. 15, 2013.
Application 13/769,185 is a continuation in part of application No. 13/725,738, filed on Dec. 21, 2012, granted, now 8,799,413.
Application 13/725,738 is a continuation in part of application No. 12/772,927, filed on May 3, 2010, granted, now 8,341,363, issued on Dec. 25, 2012.
Application 12/772,927 is a continuation in part of application No. 13/225,194, filed on Sep. 2, 2011, granted, now 8,356,016, issued on Jan. 15, 2013.
Application 13/225,194 is a continuation in part of application No. 13/295,844, filed on Nov. 14, 2011, granted, now 8,788,628.
Prior Publication US 2014/0006354 A1, Jan. 2, 2014
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 15/16 (2006.01); G06F 17/30 (2006.01); G06F 3/06 (2006.01)
CPC G06F 17/30194 (2013.01) [G06F 3/065 (2013.01); G06F 3/067 (2013.01); G06F 3/0611 (2013.01); G06F 3/0635 (2013.01); G06F 17/30097 (2013.01); G06F 17/30132 (2013.01); G06F 17/30159 (2013.01); G06F 17/30203 (2013.01)]
18 Claims
 
1. A computer-implemented method for performing a distributed-filesystem-specific action, the method comprising:
collectively managing the data of the distributed filesystem using two or more cloud controllers, wherein collectively managing the data comprises:
storing the data for the distributed filesystem in a cloud storage system, wherein the cloud controllers cache and ensure data consistency for data stored in the cloud storage system;
maintaining in each cloud controller a metadata hierarchy that reflects the current state of the distributed filesystem, wherein the metadata hierarchy is stored in the local storage device, wherein changes to the metadata for the distributed filesystem are synchronized across the cloud controllers for the distributed filesystem to ensure that the clients of the distributed filesystem share a consistent view of the files in the distributed filesystem; and
collectively presenting a unified namespace for the distributed filesystem to the clients of the distributed filesystem via the two or more cloud controllers, wherein the clients access the distributed filesystem via the cloud controllers, wherein the file data for the distributed filesystem is stored in the cloud storage system, wherein cloud controllers cache in their local storage devices a subset of the file data from the cloud storage system that is being actively accessed by each respective cloud controller's clients, wherein new file data received by each cloud controller from its clients is written to the cloud storage system;
presenting a distributed-filesystem-specific capability to a client system as a file in the distributed filesystem using a file abstraction;
receiving at a cloud controller a request to perform a distributed-filesystem-specific action in response to the file access, wherein the request comprises a cloud-aware copy operation and specifies a source file and a destination file in the distributed filesystem, wherein the cloud controller is currently not caching the data blocks for the source file;
using the metadata hierarchy on the cloud controller to create the destination file on the cloud controller, wherein the metadata for the destination file references the same data blocks that are associated with the source file in a set of cloud files that are stored in the cloud storage system, wherein the cloud controller creates the destination file without accessing any of the data blocks or cloud files that are associated with the source file, wherein the cloud controller updates the deduplication reference counts for the data blocks of the source file to account for the newly created destination file; and
distributing a metadata snapshot that includes the metadata for the destination file and the updated deduplication information to the other cloud controllers that collectively manage the distributed filesystem to notify of the creation of the destination file;
wherein no data blocks for the source file need to be transmitted from the cloud storage system or the cloud controller for the cloud-aware copy operation, thereby substantially reducing the network bandwidth and latency associated with copying large files on the cloud controller and substantially reducing the perceived command execution time for the client system.
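The cloud-aware copy operation recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `CloudController` class, its fields, `cloud_copy`, `apply_snapshot`, and the snapshot layout are all hypothetical names chosen for illustration, and the metadata hierarchy is reduced to a flat mapping from file path to block identifiers. The sketch shows only the core idea that the destination file is created by referencing the source file's existing data blocks and incrementing their deduplication reference counts, with no block data read or transferred.

```python
from dataclasses import dataclass, field


@dataclass
class CloudController:
    """Hypothetical controller: holds metadata, refcounts, and a local block cache."""
    name: str
    metadata: dict = field(default_factory=dict)   # file path -> list of block ids
    refcounts: dict = field(default_factory=dict)  # block id -> dedup reference count
    cache: dict = field(default_factory=dict)      # block id -> bytes cached locally
    blocks_fetched: int = 0                        # block downloads from cloud storage

    def cloud_copy(self, src: str, dst: str) -> dict:
        """Create dst from metadata alone; returns a snapshot for peer controllers."""
        if src not in self.metadata:
            raise FileNotFoundError(src)
        if dst in self.metadata:
            raise FileExistsError(dst)
        block_ids = list(self.metadata[src])
        # The destination references the same blocks as the source.
        self.metadata[dst] = block_ids
        # Account for the new reference to each block.
        for block_id in block_ids:
            self.refcounts[block_id] = self.refcounts.get(block_id, 0) + 1
        # Assumed snapshot layout: new file metadata plus updated refcounts.
        return {
            "files": {dst: list(block_ids)},
            "refcounts": {b: self.refcounts[b] for b in block_ids},
        }

    def apply_snapshot(self, snapshot: dict) -> None:
        """Merge a metadata snapshot received from another controller."""
        for path, block_ids in snapshot["files"].items():
            self.metadata[path] = list(block_ids)
        self.refcounts.update(snapshot["refcounts"])


# Two controllers share the same view; neither caches the source file's blocks.
initial_files = {"/big.iso": ["b1", "b2"]}
a = CloudController("a", metadata=dict(initial_files), refcounts={"b1": 1, "b2": 1})
b = CloudController("b", metadata=dict(initial_files), refcounts={"b1": 1, "b2": 1})

snapshot = a.cloud_copy("/big.iso", "/big-copy.iso")
b.apply_snapshot(snapshot)
```

After the copy, both controllers list `/big-copy.iso` with the same block identifiers as `/big.iso`, each block's reference count is 2, and `blocks_fetched` remains 0 on both controllers, which mirrors the claim's statement that no data blocks for the source file need to be transmitted.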