Multiple Linear Regression

Suppose we have a large number of data points giving the value of v
as a function of x and y, and we wish to perform a least-squares fit
of the data to a function of the form

             v = A + Bx + Cy + Dx^2 + Ey^2

This is called multiple linear regression, and can be applied to give 
the least-squares fit to any polynomial function in any number of 
variables.  Essentially we just find the coefficients such that the 
sum of the squares of the errors is a minimum.
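
For concreteness, here is a minimal Python sketch of the quantity being
minimized (the function and array names are our own, not part of the
derivation; x, y, v are taken to be NumPy arrays holding the data):

  import numpy as np

  def sum_squared_error(coeffs, x, y, v):
      # Total squared error of the model A + Bx + Cy + Dx^2 + Ey^2
      # evaluated against the observed values v.
      A, B, C, D, E = coeffs
      predicted = A + B*x + C*y + D*x**2 + E*y**2
      return np.sum((predicted - v)**2)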

Consider the first data point, (x,y,v), where "v" is our dependent
variable.  For any given choice of coefficients A,B,..,E the square 
of the "error" for the first data point is

  [ A + Bx + Cy + Dx^2 + Ey^2  - v ]^2

         =  A^2 + 2ABx    + 2ACy    + 2ADx^2  + 2AEy^2    - 2Av
                + B^2 x^2 + 2BCxy   + 2BDx^3  + 2BExy^2   - 2Bxv
                          + C^2 y^2 + 2CDyx^2 + 2CEy^3    - 2Cyv
                                    + D^2 x^4 + 2DE(xy)^2 - 2Dvx^2
                                              + E^2 y^4   - 2Evy^2
                                                          + v^2
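
As a quick check of the expansion, the same 21 terms can be reproduced
symbolically (assuming SymPy is available):

  from sympy import symbols, expand

  A, B, C, D, E, x, y, v = symbols('A B C D E x y v')
  print(expand((A + B*x + C*y + D*x**2 + E*y**2 - v)**2))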

The squared error for each of the N data points is of this form, so we
can add them all together to give the total sum-of-squares of all the
errors. Clearly this yields an expression identical to the one above, 
except that "x" is replaced by the sum of x_i  (i=1 to N), and "yx^2" 
is replaced by the sum of  (y_i)(x_i)^2  (i=1 to N),  and so on.  For
convenience, let [*] denote the sum of the bracketed expression over 
all N data points.
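
In a program, gathering these sums is just one pass over the data.  A
sketch, continuing with NumPy arrays x, y, v of the N data points (the
variable names are our own, mirroring the bracket notation):

  # [x], [y], [x^2], [y^2], [xy], ... accumulated over all N data points
  Sx,   Sy    = x.sum(),          y.sum()
  Sxx,  Syy   = (x**2).sum(),     (y**2).sum()
  Sxy         = (x*y).sum()
  Sx3,  Sy3   = (x**3).sum(),     (y**3).sum()
  Sx2y, Sxy2  = (x**2 * y).sum(), (x * y**2).sum()
  Sx4,  Sy4   = (x**4).sum(),     (y**4).sum()
  Sx2y2       = (x**2 * y**2).sum()
  Sv,   Sxv,  Syv  = v.sum(), (x*v).sum(), (y*v).sum()
  Sx2v, Sy2v  = (x**2 * v).sum(), (y**2 * v).sum()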

Now we can take the partial derivatives of this total sum of squares 
with respect to each coefficient A,B...E in turn, and set each of 
these partials to zero to give the minimum sum of squares.  This 
results in the following set of simultaneous equations

  2AN     + 2B[x]    + 2C[y]    + 2D[x^2]    + 2E[y^2]    =  2[v]
  2A[x]   + 2B[x^2]  + 2C[xy]   + 2D[x^3]    + 2E[xy^2]   =  2[xv]
  2A[y]   + 2B[xy]   + 2C[y^2]  + 2D[yx^2]   + 2E[y^3]    =  2[yv]
  2A[x^2] + 2B[x^3]  + 2C[yx^2] + 2D[x^4]    + 2E[(xy)^2] =  2[vx^2]
  2A[y^2] + 2B[xy^2] + 2C[y^3]  + 2D[(xy)^2] + 2E[y^4]    =  2[vy^2]

Dividing all terms by 2 and putting these equations into matrix form 
gives the 5x5 system of equations
  _                                          _   _   _     _       _
 |                                            | |     |   |         |
 |   N     [x]     [y]     [x^2]     [y^2]    | |  A  |   |  [v]    |
 |  [x]    [x^2]   [xy]    [x^3]     [xy^2]   | |  B  |   |  [xv]   |
 |  [y]    [xy]    [y^2]   [yx^2]    [y^3]    |.|  C  | = |  [yv]   |
 |  [x^2]  [x^3]   [yx^2]  [x^4]     [(xy)^2] | |  D  |   |  [vx^2] |
 |  [y^2]  [xy^2]  [y^3]   [(xy)^2]  [y^4]    | |  E  |   |  [vy^2] |
 |_                                          _| |_   _|   |_       _|

Solve this system in the usual way (e.g., multiply the right-hand 
column vector by the inverse of the 5x5 matrix) to give the best 
fit coefficients A, B, C, D, and E.  So all we need to do is compute 
all the sums [*] from our N data points, then perform one matrix 
inversion and one matrix multiplication, and we have our least 
squares fit.
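
Continuing the sketch above, the normal equations can be assembled from
the accumulated sums and solved directly (np.linalg.solve is used here
rather than an explicit matrix inverse, which amounts to the same thing
but is numerically better behaved):

  # 5x5 normal-equations matrix and right-hand side built from the sums above
  M = np.array([
      [len(x), Sx,   Sy,   Sxx,   Syy  ],
      [Sx,     Sxx,  Sxy,  Sx3,   Sxy2 ],
      [Sy,     Sxy,  Syy,  Sx2y,  Sy3  ],
      [Sxx,    Sx3,  Sx2y, Sx4,   Sx2y2],
      [Syy,    Sxy2, Sy3,  Sx2y2, Sy4  ],
  ])
  rhs = np.array([Sv, Sxv, Syv, Sx2v, Sy2v])
  A, B, C, D, E = np.linalg.solve(M, rhs)   # best-fit coefficients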

Obviously this same method can be applied to find the constant 
coefficients c1,c2,..,ck that minimize the sum of squares of the 
error of any given set of data points v(x1,x2,..,xj) in accord 
with the model 

 v(x1,...,xj) 

    =  c1 f1(x1,...,xj)  +  c2 f2(x1,..,xj)  +  ...  + ck fk(x1,..,xj)

where f1,f2,..,fk are arbitrary functions of the independent variables
x1,x2,..,xj.  Proceeding exactly as before, we square the difference
between the modeled value and the actual value of v for each data point,
sum these squared errors over all N data points, and then set the
partial derivatives of the resulting expression with respect to each of
the k coefficients to zero.  This gives a set of k linear simultaneous
equations which can be written in matrix form as
  _                                         _   _   _     _       _
 |                                           | |     |   |         |
 |  [f1 f1]  [f1 f2]  [f1 f3] ...   [f1 fk]  | |  c1 |   |  [v f1] |
 |  [f2 f1]  [f2 f2]  [f2 f3] ...   [f2 fk]  | |  c2 |   |  [v f2] |
 |                  .                        |.|  .  | = |     .   |
 |                  .                        | |  .  |   |     .   |
 |                  .                        | |  .  |   |     .   |
 |                                           | |     |   |         |
 |  [fk f1]  [fk f2]  [fk f3] ...   [fk fk]  | |  ck |   |  [v fk] |
 |_                                         _| |_   _|   |_       _|

where [fm fn] signifies the sum of the products of fm(x1,..,xj) and
fn(x1,..,xj) over all N data points.  Multiplying the right-hand
column vector by the inverse of the square matrix gives the least
squares fit for the coefficients c1,c2,..,ck.  So, in general, if
we model the dependent variable v as a linear combination of k
arbitrary functions of an arbitrary number of independent variables,
we can find the combination that minimizes the sum of squares of the
errors on a set of N (no less than k) data points by evaluating the
inverse of a k x k matrix.
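
The general case is just as easy to code.  A sketch (again with names of
our own choosing), in which X holds the data points one row per point,
v holds the dependent values, and funcs is a list of the k basis
functions f1,..,fk:

  import numpy as np

  def linear_least_squares(funcs, X, v):
      # F[i,m] = fm evaluated at the i-th data point
      F = np.column_stack([f(X) for f in funcs])
      G = F.T @ F            # k x k matrix of the sums [fm fn]
      b = F.T @ v            # right-hand side of the sums [v fm]
      return np.linalg.solve(G, b)

  # Example: the quadratic model used earlier, with X an N-by-2 array
  # whose columns are the x and y values.
  funcs = [lambda X: np.ones(len(X)),
           lambda X: X[:, 0],
           lambda X: X[:, 1],
           lambda X: X[:, 0]**2,
           lambda X: X[:, 1]**2]
  # coefficients = linear_least_squares(funcs, X, v)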
