US 9,811,391 B1
Load balancing and conflict processing in workflow with task dependencies
Ryan Barrett, San Francisco, CA (US); Taylor Sittler, San Francisco, CA (US); Krishna Pant, San Jose, CA (US); and Zhenghua Li, San Jose, CA (US)
Assigned to COLOR GENOMICS, INC., Burlingame, CA (US)
Filed by Ryan Barrett, San Francisco, CA (US); Taylor Sittler, San Francisco, CA (US); Krishna Pant, San Jose, CA (US); and Zhenghua Li, San Jose, CA (US)
Filed on Mar. 3, 2017, as Appl. No. 15/449,579.
Claims priority of provisional application 62/303,529, filed on Mar. 4, 2016.
Int. Cl. G06F 19/24 (2011.01); G06F 19/22 (2011.01); G06F 19/18 (2011.01); G06F 9/52 (2006.01); G06F 9/48 (2006.01)
CPC G06F 9/52 (2013.01) [G06F 9/4881 (2013.01); G06F 19/18 (2013.01); G06F 19/22 (2013.01); G06F 19/24 (2013.01)] 20 Claims
1. A computer-implemented method comprising:
receiving a plurality of reads, each read of the plurality of reads being associated with a client;
performing, for each read of the plurality of reads, an alignment process using the read and a reference data set to generate an alignment result, the alignment result identifying one or more portions of the reference data set to which the read is to be aligned, the reference data set including, at each position of a plurality of positions, a reference-data identifier corresponding to the position;
identifying a first pre-identified portion of the reference data set, the first pre-identified portion being related to a second pre-identified portion of the reference data set, and each of the first pre-identified portion and the second pre-identified portion corresponding to a subset of the plurality of positions;
for each position of one or more positions of the first pre-identified portion:
identifying a subset of the plurality of reads, each read in the subset having an identifier aligned to the position;
for each of one or more identifiers:
determining a quantity of the subset of reads that include the identifier at a read position aligned to the position;
determining that a downstream-processing criterion is satisfied based on the quantity;
in response to determining that the downstream-processing criterion is satisfied:
determining that the identifier matches a reference-data identifier in the reference data set at the position; and
in response to determining that the identifier does not match a reference-data identifier at the reference data set at the position:
defining a sparse indicator that represents a difference between the identifier and the reference-data identifier; and
assigning the sparse indicator to a bucket representative of a state-transition likelihood attributable to the sparse indicator;
determining that a data verification condition is satisfied based on the bucket assignments; and
in response to determining that the data verification condition is satisfied, transmitting a communication to a data generation system that identifies the client.
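The per-position steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that reads are already aligned and given as (start offset, sequence) pairs, that the downstream-processing criterion is a simple minimum count, and that the state-transition likelihood comes from a lookup table. The names `SparseIndicator`, `call_sparse_indicators`, `assign_bucket`, `verification_needed`, `min_count`, `likelihoods`, and `threshold` are hypothetical and do not appear in the claim.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SparseIndicator:
    """Difference between an observed identifier and the reference identifier."""
    position: int
    reference: str
    observed: str


def call_sparse_indicators(reference, aligned_reads, positions, min_count=2):
    """Return sparse indicators for the given positions.

    aligned_reads: list of (start, sequence) pairs, where start is the
    0-based reference offset of the first identifier in the read (assumed format).
    min_count: assumed stand-in for the downstream-processing criterion.
    """
    indicators = []
    for pos in positions:
        # Subset of reads with an identifier aligned to this position,
        # tallied by the identifier each read carries there.
        counts = Counter(
            seq[pos - start]
            for start, seq in aligned_reads
            if start <= pos < start + len(seq)
        )
        for identifier, quantity in sorted(counts.items()):
            if quantity < min_count:
                continue  # criterion not satisfied; skip this identifier
            if identifier != reference[pos]:
                indicators.append(SparseIndicator(pos, reference[pos], identifier))
    return indicators


def assign_bucket(indicator, likelihoods, threshold=0.5):
    """Assign a bucket from an assumed likelihood lookup keyed by (position, observed)."""
    score = likelihoods.get((indicator.position, indicator.observed), 0.0)
    return "high" if score >= threshold else "low"


def verification_needed(buckets):
    """Assumed data verification condition: any indicator in the 'high' bucket."""
    return any(bucket == "high" for bucket in buckets)


# Usage with made-up data.
reference = "ACGTACGT"
reads = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (3, "TACG")]
indicators = call_sparse_indicators(reference, reads, positions=range(2, 5))
buckets = [assign_bucket(ind, {(3, "A"): 0.9}) for ind in indicators]
needs_verification = verification_needed(buckets)
```

In this sketch, a true `needs_verification` value is the point at which a system would transmit the communication identifying the client to the data generation system. That transmission is left out because the claim does not specify its form.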