US 11,816,243 B2
Preserving user-entity differential privacy in natural language modeling
Thi Kim Phung Lai, Kearny, NJ (US); Tong Sun, San Jose, CA (US); Rajiv Jain, Vienna, VA (US); Nikolaos Barmpalios, Palo Alto, CA (US); Jiuxiang Gu, College Park, MD (US); and Franck Dernoncourt, Sunnyvale, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Aug. 9, 2021, as Appl. No. 17/397,407.
Prior Publication US 2023/0059367 A1, Feb. 23, 2023
Int. Cl. G06F 21/62 (2013.01); G06N 20/00 (2019.01); G06F 40/295 (2020.01)
CPC G06F 21/6245 (2013.01) [G06F 40/295 (2020.01); G06N 20/00 (2019.01)] 20 Claims
OG exemplary drawing
 
1. In a digital medium environment for natural language processing, a computer-implemented method for implementing differential privacy that protects data owners and sensitive textual information within textual datasets, the method comprising:
determining a set of sensitive data points based on sampled users and sampled sensitive entities from a natural language dataset, wherein each sensitive data point is associated with at least one sampled user and comprises at least one sampled sensitive entity; and
generating, utilizing the set of sensitive data points, a natural language model that simultaneously provides protection for users and sensitive entities represented within the natural language dataset via user-entity differential privacy by:
determining an average gradient corresponding to the set of sensitive data points using a user-entity estimator;
determining a noise scale for the user-entity estimator based on the sampled users and the sampled sensitive entities associated with the set of sensitive data points; and
generating parameters for the natural language model using the average gradient and the noise scale.
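The claimed steps follow the shape of a DP-SGD training loop extended with two-level (user and entity) sampling and noise calibration. Below is a minimal NumPy sketch of one such training step. It is an illustration under stated assumptions, not the patented implementation: the function name, the record layout (hypothetical "user" and "entities" fields), the Poisson sampling, the gradient clipping, and the exact noise-scale formula are all assumptions, since the gazette entry discloses no implementation details.

```python
import numpy as np


def user_entity_dp_step(params, grad_fn, dataset, users, entities,
                        user_rate=0.1, entity_rate=0.1, clip_norm=1.0,
                        noise_multiplier=1.0, lr=0.05, rng=None):
    """One user-entity differentially private update (illustrative sketch).

    dataset: list of dicts with hypothetical "user" and "entities" keys.
    grad_fn(params, record) -> per-example gradient as an ndarray.
    """
    if rng is None:
        rng = np.random.default_rng(0)

    # Sample a subset of users and of sensitive entities.
    sampled_users = {u for u in users if rng.random() < user_rate}
    sampled_entities = {e for e in entities if rng.random() < entity_rate}

    # Sensitive data points: records associated with a sampled user that
    # contain at least one sampled sensitive entity (first claimed step).
    batch = [r for r in dataset
             if r["user"] in sampled_users
             and sampled_entities & set(r["entities"])]
    if not batch:
        return params  # nothing sampled this round; parameters unchanged

    # Clip each per-example gradient so no single user or entity can move
    # the average by more than clip_norm, then average the clipped
    # gradients (the role of the "user-entity estimator" in the claim).
    clipped = []
    for r in batch:
        g = grad_fn(params, r)
        g_norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(g_norm, 1e-12)))
    avg_grad = np.mean(clipped, axis=0)

    # Calibrate Gaussian noise to the numbers of sampled users and sampled
    # entities that actually contributed to the batch; this particular
    # formula is an assumption, as the entry does not disclose one.
    n_users = len({r["user"] for r in batch})
    n_entities = len({e for r in batch for e in r["entities"]
                      if e in sampled_entities})
    sigma = noise_multiplier * clip_norm / max(min(n_users, n_entities), 1)
    noisy_grad = avg_grad + rng.normal(0.0, sigma, size=avg_grad.shape)

    # Generate the next model parameters from the noised average gradient.
    return params - lr * noisy_grad
```

In this reading, clipping bounds the sensitivity of the averaged gradient to any single user or entity, so the Gaussian noise added before the parameter update masks both kinds of contribution at once, matching the simultaneous user-and-entity protection the claim recites.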