US 7,453,816 B2
Method and apparatus for automatic recovery from a failed node concurrent maintenance operation
James Stephen Fields, Jr., Austin, Tex. (US); Michael Stephen Floyd, Austin, Tex. (US); Benjiman Lee Goodman, Cedar Park, Tex. (US); Paul Frank Lecocq, Cedar Park, Tex. (US); and Praveen S. Reddy, Austin, Tex. (US)
Assigned to International Business Machines Corporation, Armonk, N.Y. (US)
Filed on Feb. 09, 2005, as Appl. No. 11/54,288.
Prior Publication US 2006/0187818 A1, Aug. 24, 2006
Int. Cl. G01R 31/08 (2006.01); G06F 13/00 (2006.01)
U.S. Cl. 370/241 [370/248; 370/252; 710/302; 710/304]
11 Claims
 
1. A method in a data processing system for automatic recovery from a failed node concurrent maintenance add operation, the method comprising:
disabling communications between a plurality of processor nodes in the data processing system;
sending, by a control logic, a first test command to processors of a new processor node to be added to the data processing system;
detecting, by the control logic, one of a response from the processors of the new processor node and a timeout;
responsive to detecting the response from the processors of the new processor node, determining, by the control logic, if the response is a correct response matching the first test command;
responsive to a correct response, updating, by the control logic, values of a current mode register, wherein the values comprise configuration settings of processors of the new processor node;
updating, by the control logic, all processors in the plurality of processor nodes to use values of the current mode register;
sending, by the control logic, a second test command to all processors in the plurality of processor nodes;
detecting, by the control logic, one of a response from all processors in the plurality of processors and a timeout;
determining, by the control logic, if the response is a correct response matching the second test command; and
responsive to a correct response, initializing, by the control logic, communications between the plurality of processor nodes in the data processing system including the new processor node.
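
The sequence recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Node`, `ControlLogic`, `add_node`, `echo`, `silent`, the `"TEST1"`/`"TEST2"` command strings, the `speed` setting) is hypothetical. It assumes that a "correct response" is one that echoes the test command, and it models a timeout as a processor returning `None`. Claim 1 does not recite what happens after an incorrect response or a timeout, so the sketch simply reports failure and leaves communications disabled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical processor model: given a test command, return a response,
# or None to model a timeout.
Processor = Callable[[str], Optional[str]]


def echo(command: str) -> Optional[str]:
    """A healthy processor: answers with the command it received."""
    return command


def silent(command: str) -> Optional[str]:
    """A failed processor: never answers, modeled as a timeout."""
    return None


@dataclass
class Node:
    name: str
    processors: List[Processor]
    settings: Dict[str, int]                               # node configuration settings
    mode: Dict[str, int] = field(default_factory=dict)     # values the node currently uses


class ControlLogic:
    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = list(nodes)
        self.current_mode: Dict[str, int] = {}   # stands in for the current mode register
        self.links_enabled = True

    def _test(self, processors: List[Processor], command: str) -> bool:
        """Send a test command; succeed only if every processor echoes it."""
        for proc in processors:
            response = proc(command)
            if response is None:        # timeout
                return False
            if response != command:     # incorrect response
                return False
        return True

    def add_node(self, new_node: Node) -> bool:
        # Disable communications between the processor nodes.
        self.links_enabled = False

        # First test command goes only to the new node's processors.
        if not self._test(new_node.processors, "TEST1"):
            return False

        # Correct response: update the current mode register with the
        # new node's configuration settings.
        self.current_mode.update(new_node.settings)

        # Update all nodes, including the new one, to use the register values.
        candidates = self.nodes + [new_node]
        for node in candidates:
            node.mode = dict(self.current_mode)

        # Second test command goes to all processors in all nodes.
        all_processors = [p for n in candidates for p in n.processors]
        if not self._test(all_processors, "TEST2"):
            return False

        # Correct response: initialize communications including the new node.
        self.nodes = candidates
        self.links_enabled = True
        return True
```

With these assumptions, adding a node whose processors all echo the command returns `True` and re-enables the links, while adding a node that contains a `silent` processor returns `False` after the first test and leaves the existing node list unchanged.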