ON STEPWISE MULTIPLE LINEAR REGRESSION
Abstract:
Stepwise multiple linear regression has proved to be an extremely useful computational technique in data analysis problems. The procedure has been implemented in numerous computer programs and overcomes an acute problem that often arises with the classical computational methods of multiple linear regression: the excessive computation time needed to solve the 2^N - 1 sets of normal equations that arise when seeking an optimum linear combination of variables from among the nonempty subsets of N variables. The procedure takes advantage of recurrence relations among the covariances of residuals, the regression coefficients, and the elements of the inverses of partitions of the covariance matrix. Applying these recurrence formulas is equivalent to introducing a variable into, or deleting a variable from, the linear approximating function that is being sought as the solution to a data analysis problem. This report derives the recurrence formulas, shows how they are implemented in a computer program, and presents an improved algorithm that halves the storage requirements of previous algorithms. A computer program for the BRLESC computer that incorporates this procedure is described by the author and others in a previous report, BRL Report No. 1330, July 1966. The present report amplifies the statistical theory and computational procedures presented in that report and sets out the improved algorithm.
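The two ideas in the abstract, the 2^N - 1 candidate subsets and the one-step recurrence that enters or removes a variable, can be illustrated with a minimal sketch. It is not the BRLESC program or the improved algorithm of this report. It assumes a common Gauss-Jordan pivot convention for the recurrence, stores the full matrix rather than half of it, and uses a made-up 3 x 3 covariance matrix; the names sweep, count_subsets, and S are hypothetical.

```python
from itertools import combinations


def count_subsets(n):
    """Count the nonempty subsets of n variables by enumeration (equals 2**n - 1)."""
    return sum(1 for r in range(1, n + 1) for _ in combinations(range(n), r))


def sweep(a, k):
    """Return a copy of the square matrix a after a Gauss-Jordan pivot on a[k][k].

    Assumed convention: pivoting on a variable that is out of the regression
    brings it in, and pivoting on the same index again takes it back out.
    """
    n = len(a)
    d = a[k][k]
    out = [row[:] for row in a]
    out[k] = [v / d for v in a[k]]           # scale the pivot row
    for i in range(n):
        if i == k:
            continue
        b = a[i][k]
        out[i] = [a[i][j] - b * out[k][j] for j in range(n)]  # eliminate column k
        out[i][k] = -b / d                   # store the inverse-related element
    out[k][k] = 1.0 / d
    return out


# Hypothetical covariance matrix for (x1, x2, y); the last row and column belong to y.
S = [[4.0, 2.0, 6.0],
     [2.0, 3.0, 5.0],
     [6.0, 5.0, 14.0]]

step1 = sweep(S, 0)      # enter x1
step2 = sweep(step1, 1)  # enter x2

# After each step the y column holds the regression coefficients of the
# variables currently in the fit, and the lower-right element holds the
# residual variance of y.
print(count_subsets(3))          # number of candidate subsets for N = 3
print(step1[0][2], step1[2][2])  # x1 only: coefficient and residual variance
print(step2[0][2], step2[1][2], step2[2][2])
```

Each call to sweep costs on the order of N^2 operations and reuses the previous state, which is why moving between neighbouring subsets is much cheaper than solving a fresh set of normal equations for every subset. Calling sweep(step2, 0) would remove x1 again and leave the fit of y on x2 alone.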