
I'm new to PROC IML.

 

I'm trying to develop a scenario/example on why Least Squares is needed. As part of that I am building a simple example.

 

I just have a basic 2-by-2 matrix A (the design matrix); the columns of A are not linearly independent.

The constant vector b (or y) gives an inconsistent system with no solution. Therefore, I have to find the projection of b onto C(A) and then use that projection in place of b (y). So I want to solve Ax = p.

 

So, a couple of questions.

 

1. First, is it possible to calculate the projection of b onto C(A) and/or the component of b that is orthogonal to C(A)?

2. What code would let us do a simple ordinary least squares solution?

 

 

 

 

Rick_SAS
SAS Super FREQ

Sure. Can you let us know what course you are taking and what methods you have recently studied, so we can craft an appropriate answer? For example, in a linear algebra course, the operations are generally row operations that reduce a matrix to row echelon form. In a statistics course, you might build a projection (hat) matrix from X. Look back over your notes and tell us what techniques you know about.

 

1. Yes, the SAS/IML language enables you to perform most basic linear algebraic operations. A projection is just a matrix multiplication (a linear transformation), so you can use matrix multiplication to project a vector onto any subspace. For example, if X has full column rank, then w = X*inv(X`*X)*X`*y is the projection of y onto the column space of X.

2. The getting-started example in the SAS/IML User's Guide discusses the linear algebra of least-squares regression. The least-squares solution of the matrix equation X*b = y is the vector b that solves the so-called normal equations, which is the linear system

(X`*X)*b = X`*y

As you point out, the solution is not unique if (X`*X) is singular, so statisticians use the idea of a generalized inverse to solve the problem. You can read about generalized inverses in the article, "Generalized inverses for matrices."
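
Here is a minimal sketch (with made-up X and y, not values from this thread) of solving the normal equations in IML:

proc iml;
/* minimal sketch: solve the normal equations (X`*X)*b = X`*y
   for a small made-up X and y */
X = {1 1,
     1 2,
     1 3};
y = {1, 3, 4};
b    = ginv(X`*X) * (X`*y);   /* GINV also covers the case of a singular X`*X */
yhat = X*b;                   /* projection of y onto the column space of X */
print b yhat;
quit;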

edasdfasdfasdfa
Quartz | Level 8

Hello,

 

I am more familiar with the Linear Algebra and Geometry lingo than Stats.

 

So I am trying to solve Ax = b.

 

1. As mentioned, I have a simple 2 x 2 matrix A whose columns are linearly dependent. Reducing the augmented matrix [A | b] (b is the outcome vector) to RREF shows an inconsistent system with no solution. Therefore, I am used to then dealing with it geometrically.

 

2. I find the orthogonal projection using the projection formula, p = (a`*b)/(a`*a) * a, where a spans C(A).

3. I solve Ax = p, i.e., find a vector x such that Ax = p.

4. I then find the component of b that is orthogonal to C(A) by simply subtracting the projection p from b. That is the error. (I think in stats speak that is y - y-hat.) A rough sketch of steps 2-4 in IML is below.
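
In IML, steps 2-4 might look something like this minimal sketch (A and b here are made-up values, just to show the operations; the columns of A are dependent and b is not in C(A)):

proc iml;
A = {1 2,
     2 4};
b = {1, 3};
a1 = A[,1];                     /* C(A) is spanned by the first column */
p  = (a1`*b) / (a1`*a1) * a1;   /* step 2: orthogonal projection of b onto C(A) */
x  = ginv(A)*p;                 /* step 3: one x that satisfies A*x = p */
e  = b - p;                     /* step 4: component of b orthogonal to C(A) */
check = A`*e;                   /* should be (numerically) zero */
print p x e check;
quit;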

 

 

I'm not sure there is even a need for PROC IML; maybe a DATA step plus PROC REG would do the trick?

edasdfasdfasdfa
Quartz | Level 8

Something as simple as this?

 

options nocenter nodate;

data ls;
   input x1 x2 p;
   datalines;
4 8 2
6 12 3
;

/* first attempt: regress p on x1 and x2 */
proc reg data=ls;
   model p = x1 x2;
   output out=simp1;
run;

/* same model, but also save the predicted values and residuals */
proc reg data=ls;
   model p = x1 x2;
   output out=simp1 predicted=yhat residual=r;
run;

PaigeMiller
Diamond | Level 26

Regression residuals are not orthogonal to the regression line, and in your case the predictions are all exactly on the line. If you had run the code you provided, you would have seen that you get zero residuals from the regression in your case. Also, when you say "That is the error (I think in stats speak that is y - y-hat)," note that this is the regression error, which is not perpendicular to the regression line but parallel to the y-axis.

 

You can get perpendicular projections and residuals from the first principal component of PROC PRINCOMP. Although, to be honest, I'm not sure that's what you are asking for.
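
For example, a minimal sketch that requests just the first principal component (pc_scores is an arbitrary output name, and this assumes the ls data set from your PROC REG step; turning the scores back into perpendicular projections takes a few more steps):

proc princomp data=ls out=pc_scores n=1 cov;  /* COV keeps the geometry in the original units */
   var x1 x2;                                 /* OUT= data set gets the score variable Prin1 */
run;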

--
Paige Miller
edasdfasdfasdfa
Quartz | Level 8

I am not quite understanding it.

 

I'm sure it will help if I point you to the example I am following and trying to replicate via coding.

 

The particular scenario is the answer given by Chad. 

https://math.stackexchange.com/questions/1298261/difference-between-orthogonal-projection-and-least-...

PaigeMiller
Diamond | Level 26

Thanks.

 

You have two people in this thread who aren't quite sure what you are trying to do. It would have been very helpful to provide this reference in your initial post; perhaps you'd already be much closer to an answer. I will look this over and see if I understand your question better.

--
Paige Miller
edasdfasdfasdfa
Quartz | Level 8

Thanks!

 

Chad, in the referenced link, seems to have found a least squares solution for the scenario he set up.

I'm just not sure how to do this programmatically. As you say, what I did doesn't make much sense, since p, the projection, already lies on the line that is the column space of A.

 

 

edasdfasdfasdfa
Quartz | Level 8

Just to bump this!

 

I'd really appreciate any solution, whether DATA step or IML.

Rick_SAS
SAS Super FREQ

If you want to reproduce Chad's example, here it is:

proc iml;
A = {4  8,
     6 12};
b = {5, 1};

/* find p as orthogonal projection onto first column of A */
A1 = A[,1];
p = (A1`*b)/norm(A1)**2 * A1;
print p;

/* use a generalized inverse to find ANY x such that 
   A*x = p */
x = ginv(A)*p;
print x;

/* Check that A*x = p */
z = A*x;
print z p;
edasdfasdfasdfa
Quartz | Level 8

Thank you so much! 

 

2 follow-ups.

1. Based on my research, I thought I might have to go down the SVD route because we have a rank-deficient matrix. Why is it that we can use the generalized inverse here instead?

2. The least-squares solution seems to be 1/2 (0.5)? But based on his solution set, isn't there also a free variable, which leads to an infinite set of solutions?

Rick_SAS
SAS Super FREQ

1. The generalized inverse is equivalent to using SVD. In fact, the documentation for GINV says it is computed by using the SVD.

2. Yes. Please read the article, "Generalized inverses for matrices," which shows how to get the family of general solutions.
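
For this example, a minimal sketch of that family of solutions (z is an arbitrary vector; any choice of z gives another solution of A*x = p):

proc iml;
/* the family of solutions to A*x = p is x = G*p + (I - G*A)*z,
   where G is a generalized inverse and z is arbitrary */
A = {4  8,
     6 12};
b = {5, 1};
A1 = A[,1];
p  = (A1`*b) / norm(A1)**2 * A1;   /* projection of b onto C(A), as before */
G  = ginv(A);
x0 = G*p;                          /* particular (minimum-norm) solution */
z  = {1, 0};                       /* try other z values as well */
x  = x0 + (I(2) - G*A)*z;          /* another solution */
check = A*x;                       /* equals p for every choice of z */
print x0 x check;
quit;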

 

edasdfasdfasdfa
Quartz | Level 8

Thanks so much.

 

Is the reason that a unique x works here that A is a square matrix?

Rick_SAS
SAS Super FREQ

I encourage you to answer your own question. You have a program that computes a particular solution. What happens if you add a new column to A? For example, use

A = {4  8 3,
     6 12 1};

Does the program still work? If so, what is the new solution x? If not, can you modify it so that it works for the new example?

 

edasdfasdfasdfa
Quartz | Level 8

No problem, I will test that. But I have a PROC IML question: how do you select two columns? I found out how to select a single column (you did that in the example above), how to select a row, and how to select a particular element, but I'm not sure how to select the 1st and 3rd columns.

 

Edit: Just to add: the reason I'm asking is that we now have two columns that are not simply multiples of each other, so I'm thinking that needs to be accounted for? I also, of course, made vector b 3 elements.
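
For reference, a minimal sketch (not from the thread) of selecting two columns: you can subscript with a vector of column indices.

proc iml;
A = {4  8 3,
     6 12 1};
B = A[ , {1 3}];   /* all rows, columns 1 and 3 */
print B;
quit;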

 

 
