
I'm new to PROC IML.

 

I'm trying to develop a scenario/example on why Least Squares is needed. As part of that I am building a simple example.

 

I just have a basic matrix A (a design matrix), 2 by 2, whose columns are not linearly independent.

The constant vector b (or y) produces an inconsistent system with no solution. Therefore, we have to find the projection of b onto C(A), and then I use that projection p in place of b (y). So I want to solve Ax = p.

 

So, a couple of questions.

 

1. First, is it possible to calculate the projection of b onto C(A) and/or the component of b that is orthogonal to C(A)?

2. What code would compute a simple ordinary least-squares solution?

 

 

 

 

Rick_SAS
SAS Super FREQ

Sure. Can you let us know what course you are taking and what methods you have recently studied so we can craft an appropriate answer? For example, in a linear algebra course, the operations are generally row operations that reduce a matrix to row echelon form. In a statistics course, you might use the "hat" matrix X*inv(X`*X)*X` as the projection matrix. Look back over your notes and tell us which techniques you know about.

 

1. Yes, the SAS/IML language enables you to perform most basic linear algebra operations. A projection is just a matrix multiplication (a linear transformation), so you can use matrix multiplication to project a vector onto any subspace. For example, when X has full column rank, w = X*inv(X`*X)*X`*y is the projection of y onto the column space of X.
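For instance, here is a minimal sketch (the values are made up, and X is assumed to have full column rank) that projects a vector onto C(X) by multiplying by a projection matrix:

proc iml;
X = {1 0,
     0 1,
     1 1};                 /* illustrative design matrix with full column rank */
y = {1, 2, 4};
P = X * inv(X`*X) * X`;    /* projection matrix onto the column space of X */
w = P * y;                 /* orthogonal projection of y onto C(X) */
print w;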

2. The getting-started example in the SAS/IML User's Guide discusses the linear algebra of least-squares regression. The least-squares solution of the matrix equation X*b = y is the vector b that solves the so-called normal equations, which are the linear system

(X`*X)*b = X`*y

As you point out, the solution is not unique if (X`*X) is singular, so statisticians use the idea of a generalized inverse to solve the problem. You can read about generalized inverses in the article, "Generalized inverses for matrices."
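For instance, a minimal sketch (with made-up numbers) of solving the normal equations when X`*X is nonsingular; the rank-deficient case calls for a generalized inverse instead:

proc iml;
X = {1 1,
     1 2,
     1 3};                 /* illustrative full-rank design matrix */
y = {1, 3, 4};
b = solve(X`*X, X`*y);     /* least-squares coefficients from (X`*X)*b = X`*y */
print b;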

edasdfasdfasdfa
Quartz | Level 8

Hello,

 

I am more familiar with the Linear Algebra and Geometry lingo than Stats.

 

So I am trying to solve for Ax=b.

 

1. As mentioned, I have a simple matrix A (2 x 2) whose columns are linearly dependent. With the outcome vector b, reducing the augmented matrix [A | b] to reduced row echelon form (RREF) shows an inconsistent system with no solution. Therefore, I am used to dealing with it geometrically.

 

2. I find the orthogonal projection using the projection formula p = (a`*b)/(a`*a) * a, where a is a vector that spans C(A).

3. I solve Ax = p, that is, I find a vector x such that Ax = p.

4. I then find the component of b that is orthogonal to C(A) by subtracting the projection p from the vector b. That is the error. (I think in stats speak that is y minus y-hat, the residual.) A sketch of this step appears after this list.
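A minimal IML sketch of step 4 (the matrix and vector are illustrative; they match the 2 x 2 example that comes up later in the thread):

proc iml;
A = {4  8,
     6 12};
b = {5, 1};
a = A[,1];                       /* any nonzero column spans C(A) here */
p = (a`*b)/(a`*a) * a;           /* orthogonal projection of b onto C(A) */
e = b - p;                       /* component of b orthogonal to C(A) */
print p e (A`*e)[label="A`*e"];  /* A`*e is zero, so e is orthogonal to C(A) */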

 

 

I'm not sure there is even a need for proc iml, as data step + proc reg might do the trick?

edasdfasdfasdfa
Quartz | Level 8

Something as simple as this?

 

OPTIONS NOCENTER NODATE;

data ls;
   input x1 x2 p;
   datalines;
4 8 2
6 12 3
;

proc reg data=ls;
   model p = x1 x2;
   output out=simp1 predicted=yhat residual=r;
run;

PaigeMiller
Diamond | Level 26

Regression residuals are not orthogonal to the regression line, and in your case the predictions lie exactly on the line. If you had run the code you provided, you would have seen that you get zero residuals from that regression. Also, when you say "That is the error (in stats speak, y minus y-hat)," that quantity is the regression error; it is not perpendicular to the regression line but parallel to the y-axis.

 

You can get perpendicular projections and residuals from the first principal component computed by PROC PRINCOMP, although, to be honest, I'm not sure that's what you are asking for.
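For example, here is a minimal sketch (the data set name HAVE and the variable names are illustrative): the scores on the first principal component give each point's position along the PC1 line, and the PC2 scores are the signed perpendicular distances from that line.

proc princomp data=have cov out=pc_scores prefix=pc;
   var x1 x2;
run;

/* pc1 = location of each point's perpendicular projection along the first PC axis;
   pc2 = signed perpendicular distance (orthogonal residual) from that axis */
proc print data=pc_scores;
   var x1 x2 pc1 pc2;
run;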

--
Paige Miller
edasdfasdfasdfa
Quartz | Level 8

I am not quite understanding it.

 

I'm sure it will help if I point you to the example I am following and trying to replicate via coding.

 

The particular scenario is the answer given by Chad. 

https://math.stackexchange.com/questions/1298261/difference-between-orthogonal-projection-and-least-...

PaigeMiller
Diamond | Level 26

Thanks.

 

You have two people in this thread who aren't quite sure what you are trying to do. It would have been very helpful to provide this reference in your initial post; perhaps you'd already be much closer to an answer. I will look this over and see if I understand your question better.

--
Paige Miller
edasdfasdfasdfa
Quartz | Level 8

Thanks!

 

Chad, in the referenced link, seems to have found a least squares solution for the scenario he set up.

I'm just not sure how to do this programmatically. As you say, what I did doesn't make much sense, since p, the projection, already lies on the line that is the column space of A.

 

 

edasdfasdfasdfa
Quartz | Level 8

Just to bump this!

 

I'd really appreciate any solution, whether DATA step or IML.

Rick_SAS
SAS Super FREQ

If you want to reproduce Chad's example, here it is:

proc iml;
A = {4  8,
     6 12};
b = {5, 1};

/* find p as orthogonal projection onto first column of A */
A1 = A[,1];
p = (A1`*b)/norm(A1)**2 * A1;
print p;

/* use a generalized inverse to find ANY x such that 
   A*x = p */
x = ginv(A)*p;
print x;

/* Check that A*x = p */
z = A*x;
print z p;
edasdfasdfasdfa
Quartz | Level 8

Thank you so much! 

 

2 follow-ups.

1. Based on my research, I thought I might have to go down the SVD route because we have a rank-deficient matrix. Why is it that we can use the generalized inverse here instead?

2. The least-squares solution seems to be 1/2 (0.5)? But based on his solution set, isn't there also a free variable, which leads to an infinite set of solutions?

Rick_SAS
SAS Super FREQ

1. The generalized inverse is equivalent to using SVD. In fact, the documentation for GINV says it is computed by using the SVD.

2. Yes. Please read the article, "Generalized inverses for matrices," which shows how to get the family of general solutions.
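For reference, a minimal sketch (not the article's code) of the standard construction: a particular solution from GINV plus any vector in the null space of A.

proc iml;
A = {4  8,
     6 12};
b = {5, 1};
A1 = A[,1];
p  = (A1`*b)/norm(A1)**2 * A1;   /* projection of b onto C(A), as before */
xp = ginv(A)*p;                  /* particular least-squares solution */
N  = I(2) - ginv(A)*A;           /* orthogonal projector onto the null space of A */
z  = {1, 0};                     /* every choice of z yields another solution */
x  = xp + N*z;
print xp x (A*x)[label="A*x"] p; /* A*x = p for every z */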

 

edasdfasdfasdfa
Quartz | Level 8

Thanks so much.

 

Is the reason that a unique x works here that A is a square matrix?

Rick_SAS
SAS Super FREQ

I encourage you to answer your own question. You have a program that computes a particular solution. What happens if you add a new column to A? For example, use

A = {4  8 3,
     6 12 1};

Does the program still work? If so, what is the new solution x? If not, can you modify it so that it works for the new example?

 

edasdfasdfasdfa
Quartz | Level 8

No problem. I will test that. But I have a PROC IML question: how do you select two columns? I found how to select a single column (you did that in the example above), how to select a row, and how to select a particular element, but I am not sure how to select, say, the 1st and 3rd columns.

 

Edit: Just to add, the reason I'm asking is that we now have two columns that are not simply multiples of each other, so I'm thinking the second independent column has to be included? I also, of course, made vector b 3 elements.
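In the meantime, a minimal sketch of the subscript syntax (illustrative values): put an index vector inside the brackets to select several columns at once.

proc iml;
A = {4  8 3,
     6 12 1};
B = A[ , {1 3}];   /* keep the 1st and 3rd columns of A */
print B;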

 

 
