US 11,704,386 B2
Multi-stage feature extraction for effective ML-based anomaly detection on structured log data
Amin Suzani, Vancouver (CA); Saeid Allahdadian, Vancouver (CA); Milos Vasic, Zurich (CH); Matteo Casserini, Zurich (CH); Hamed Ahmadi, Burnaby (CA); Felix Schmidt, Vancouver (CA); Andrew Brownsword, Vancouver (CA); and Nipun Agarwal, Saratoga, CA (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Mar. 12, 2021, as Appl. No. 17/199,563.
Prior Publication US 2022/0292304 A1, Sep. 15, 2022
Int. Cl. G06F 18/214 (2023.01); G06N 20/00 (2019.01); G06V 10/75 (2022.01); G06F 18/23 (2023.01)
CPC G06F 18/214 (2023.01) [G06F 18/23 (2023.01); G06N 20/00 (2019.01); G06V 10/758 (2022.01)]
23 Claims
1. A method comprising:
extracting a plurality of fields from a log message, wherein each field of the plurality of fields specifies: a name, a text value, and a type;
for each field of the plurality of fields:
a) dynamically selecting a field transformer for the field, wherein the selecting the field transformer is based on at least one selected from the group consisting of: the name of the field and the type of the field;
b) converting, by the field transformer, the text value of the field into a value of the type of the field;
c) dynamically selecting a feature encoder for the value of the type of the field, wherein the selecting the feature encoder is based on at least one selected from the group consisting of: the type of the field and a range of values of the field that occur in a training corpus of a machine learning model; and
d) storing, from the feature encoder, an encoding of the value of the type of the field into a feature vector;
detecting, based on the machine learning model and the feature vector, whether the log message is anomalous.
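
The steps of claim 1 can be sketched as follows. This is an added example, not code from the patent or from the assignee. The field names, the transformer and encoder choices, the centroid-distance model with its `margin`, and the log messages are all constructed assumptions, and every message is assumed to carry the same fields.

```python
import math, zlib
from datetime import datetime

# Steps (a)/(b): field transformer picked by field name first, else by field type.
BY_NAME = {"latency": lambda s: float(s.removesuffix("ms"))}
BY_TYPE = {"float": float, "string": str.lower,
           "timestamp": lambda s: datetime.fromisoformat(s).hour}

def transform(name, text, ftype):
    return (BY_NAME.get(name) or BY_TYPE[ftype])(text)

# Step (c): feature encoder picked by value type and the range seen in training.
def pick_encoder(values, max_onehot=4, buckets=8):
    if all(isinstance(v, (int, float)) for v in values):
        lo, hi = min(values), max(values)
        return lambda v: [(v - lo) / ((hi - lo) or 1)]    # min-max scaling
    vocab = sorted(set(values))
    if len(vocab) <= max_onehot:    # few distinct values: one-hot + "unseen" slot
        return lambda v: [float(v == x) for x in vocab] + [float(v not in vocab)]
    def hashed(v):                  # many distinct values: hash into fixed buckets
        out = [0.0] * buckets
        out[zlib.crc32(v.encode()) % buckets] = 1.0
        return out
    return hashed

class Detector:
    def fit(self, corpus, margin=1.5):
        typed = [{n: transform(n, t, ty) for n, t, ty in m} for m in corpus]
        self.enc = {n: pick_encoder([m[n] for m in typed]) for n in typed[0]}
        vecs = [self.vector(m) for m in corpus]
        self.center = [sum(col) / len(vecs) for col in zip(*vecs)]
        self.limit = margin * max(math.dist(v, self.center) for v in vecs)
        return self

    def vector(self, message):      # step (d): store each encoding in the vector
        vec = []
        for name, text, ftype in message:
            vec += self.enc[name](transform(name, text, ftype))
        return vec

    def is_anomalous(self, message):  # stand-in model: distance from training centroid
        return math.dist(self.vector(message), self.center) > self.limit

def fields(ts, user, latency):      # constructed (name, text value, type) fields
    return [("ts", ts, "timestamp"), ("user", user, "string"), ("latency", latency, "float")]

TRAIN = [fields("2021-03-12T09:15:00", "Alice", "12ms"), fields("2021-03-12T10:02:00", "bob", "15ms"),
         fields("2021-03-12T11:40:00", "alice", "11ms"), fields("2021-03-12T14:30:00", "carol", "18ms"),
         fields("2021-03-12T16:05:00", "bob", "14ms")]
DET = Detector().fit(TRAIN)
```

A value outside the training range is not clipped: an unseen user lands in the extra one-hot slot and an out-of-range latency scales past 1.0, which is what pushes the vector away from the centroid.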