US 7,539,690 B2
Data mining method and system using regression clustering
Bin Zhang, Fremont, Calif. (US)
Assigned to Hewlett-Packard Development Company, L.P., Houston, Tex. (US)
Filed on Oct. 27, 2003, as Appl. No. 10/694,367.
Prior Publication US 2005/0091189 A1, Apr. 28, 2005
Int. Cl. G06F 7/00 (2006.01); G06F 17/00 (2006.01)
U.S. Cl. 707—101  [707/7; 707/10; 707/100; 707/102; 707/200] 22 Claims
1. A method, comprising:
a processor which performs the following:
selecting a set number of functions correlating variable parameters of a dataset; and
clustering the dataset by iteratively applying a regression algorithm and a K-Harmonic Means performance function on the set number of functions to determine a pattern in said dataset;
wherein said clustering comprises determining distances between data points of the dataset and values correlated with the set number of functions, regressing the set number of functions using data point probability and weighting factors associated with the determined distances, calculating a difference of harmonic averages for the distances determined prior to and subsequent to said regressing, and repeating said regressing, determining and calculating upon determining the difference of harmonic averages is greater than a predetermined value.
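The loop recited in claim 1 can be illustrated with a minimal sketch, offered only as one possible reading and not as the patented implementation. It assumes the K functions are straight lines y = a*x + b, that the regression step is weighted least squares, and that the data point probability and weighting factor take a commonly published K-Harmonic Means form. The claim does not state those formulas, so they are assumptions here. The names `harmonic_performance`, `regression_clustering_khm`, `p`, `tol`, and `max_iter` are hypothetical.

```python
import numpy as np


def harmonic_performance(d, p):
    """Sum over data points of the harmonic average of d[i, k]**p across the K functions."""
    K = d.shape[1]
    return float(np.sum(K / np.sum(d ** -p, axis=1)))


def regression_clustering_khm(x, y, K=2, p=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fit K lines to (x, y) by alternating distance, weighting and regression steps."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([x, np.ones_like(x)])   # design matrix for y = a*x + b
    coef = rng.normal(size=(K, 2))              # select a set number (K) of functions
    eps = 1e-12                                 # keeps distances strictly positive

    # Distances between each data point and the value predicted by each function.
    d = np.abs(X @ coef.T - y[:, None]) + eps
    perf = harmonic_performance(d, p)

    for _ in range(max_iter):
        inv = d ** -(p + 2.0)
        # Assumed form: probability that point i belongs to function k.
        prob = inv / inv.sum(axis=1, keepdims=True)
        # Assumed form: per-point weighting factor derived from the distances.
        weight = inv.sum(axis=1) / np.sum(d ** -p, axis=1) ** 2

        # Regress each function with weights prob * weight (weighted least squares).
        for k in range(K):
            w = np.sqrt(weight * prob[:, k])
            coef[k] = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

        # Recompute distances and compare harmonic averages before and after regressing.
        d = np.abs(X @ coef.T - y[:, None]) + eps
        new_perf = harmonic_performance(d, p)
        converged = abs(perf - new_perf) <= tol
        perf = new_perf
        if converged:   # repeat only while the difference exceeds the threshold
            break

    return coef, perf
```

Calling `regression_clustering_khm(x, y, K=2)` on one-dimensional NumPy arrays `x` and `y` returns a `(K, 2)` array of slope and intercept pairs together with the final performance value. As with other iterative clustering schemes, the result may depend on the random initialization controlled by `seed`.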