NKormanik
Barite | Level 11

Here are a few rows of data (the actual dataset has over a million rows):

id c1 c2 c3
a 0.87396 2.00827 2.81477
b 0.97002 2.00064 2.81468
c 0.68026 1.86006 2.81403
d 0.58230 1.84271 2.81402
e 0.73355 1.59606 2.81368
f 1.20452 2.07169 2.81365
g 0.91387 1.19560 2.81352
h 1.19418 2.10696 2.81306
i -0.10435 1.34213 2.81296
j -0.06286 1.50670 2.81225
k 1.73225 2.37420 2.81185
l 0.53130 1.87948 2.81164
m 0.86563 1.65322 2.81151
n 1.05598 2.40835 2.81065
o 0.15833 1.55971 2.81035
p 0.00912 0.61380 2.81033

The objective is to compute the slope for each row and store it in a fourth column.

Assume the X values are 1, 2, 3.

 

Y values are those given in each row.
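
With X fixed at 1, 2, 3, the least-squares slope of a line with an intercept collapses to a one-line expression, which is handy for checking any code against a row by hand. Writing the three Y values of a row as $y_1, y_2, y_3$ (notation introduced here for illustration only):

```latex
\bar{x} = 2, \qquad
\hat{\beta}
  = \frac{\sum_{i=1}^{3} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{3} (x_i - \bar{x})^2}
  = \frac{-(y_1 - \bar{y}) + 0 \cdot (y_2 - \bar{y}) + (y_3 - \bar{y})}{1 + 0 + 1}
  = \frac{y_3 - y_1}{2}
```

For row a this gives (2.81477 - 0.87396) / 2 = 0.970405, assuming an ordinary regression with an intercept is what is wanted.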

 

The code below is what I have been trying, but the resulting slopes don't look right.

Please take a look at the code and tell me if you see any problem with it.

Thanks much!

 

 

data nicholas.n_slope_means__7;
set nicholas.n_slope_means__7;
array ys(3) _50501 _50502 _50503;
array vals(3) (1 2 3);
xbar = mean(of vals(*));
ybar = mean(of ys(*));
do i=1 to dim(vals);
s_xy=sum(s_xy, i*ys(i));
s_y=sum(s_y, ys(i));
s_x2=sum(s_x2, vals(i)**2);
num=(vals(i)-xbar)*(ys(i)-ybar);
den=(vals(i)-xbar)**2;
num_tot=sum(num, num_tot);
den_tot=sum(den, den_tot);
end;
s_x=sum(of vals(*));
n1=dim(vals);
slope2 = (n1*s_xy - s_x*s_y)/(n1*s_x2 - s_x**2);
Slope_5050x_3 = num_tot/den_tot;
run;

 

1 ACCEPTED SOLUTION

Ksharp
Super User
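OK. Assuming you want the Beta parameter (slope) from PROC REG, here are two versions of the code: one uses the DATA step with PROC REG, the other uses IML.
Note that they differ: the PROC REG model as written includes an intercept, while the IML code fits a line through the origin, matching the commented-out /noint model.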


data have;
input id $ c1 c2 c3;
cards;
a 0.87396 2.00827 2.81477
b 0.97002 2.00064 2.81468
c 0.68026 1.86006 2.81403
d 0.58230 1.84271 2.81402
e 0.73355 1.59606 2.81368
f 1.20452 2.07169 2.81365
g 0.91387 1.19560 2.81352
h 1.19418 2.10696 2.81306
i -0.10435 1.34213 2.81296
j -0.06286 1.50670 2.81225
k 1.73225 2.37420 2.81185
l 0.53130 1.87948 2.81164
m 0.86563 1.65322 2.81151
n 1.05598 2.40835 2.81065
o 0.15833 1.55971 2.81035
p 0.00912 0.61380 2.81033
;
run;
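/* Reshape wide to long: one row per id and column, values in COL1 */
proc transpose data=have out=temp;
by id;
var c:;
run;
data temp;
 set temp;
 by id;
 if first.id then x=0;  /* restart the x counter for each id */
 x+1;                   /* x = 1, 2, 3 within each id */
 drop _name_;
run;
/* OUTEST= data set WANTT has one row per id; the slope is in variable x */
proc reg data=temp outest=wantt noprint;
 by id;
 model col1=x ; 
/* model col1=x /noint;  <-- Which  is the same as IML code*/
quit;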

 








data have;
input id $ c1 c2 c3;
cards;
a 0.87396 2.00827 2.81477
b 0.97002 2.00064 2.81468
c 0.68026 1.86006 2.81403
d 0.58230 1.84271 2.81402
e 0.73355 1.59606 2.81368
f 1.20452 2.07169 2.81365
g 0.91387 1.19560 2.81352
h 1.19418 2.10696 2.81306
i -0.10435 1.34213 2.81296
j -0.06286 1.50670 2.81225
k 1.73225 2.37420 2.81185
l 0.53130 1.87948 2.81164
m 0.86563 1.65322 2.81151
n 1.05598 2.40835 2.81065
o 0.15833 1.55971 2.81035
p 0.00912 0.61380 2.81033
;
run;
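proc iml;
use have;
read all var _num_ into y[r=id c=vnames];  /* numeric columns c1-c3; id becomes the row names */
close;
x={1 ,2 ,3};  /* design matrix with no intercept column */
beta=j(nrow(y),1,.);
do i=1 to nrow(y);
 /* normal equations (x`x)b = x`y for row i, i.e. a fit through the origin */
 beta[i]=solve(x`*x,x`*y[i,]`);
end;
want=y||beta;
create want from want[r=id c=(vnames||'Beta')];
append from want[r=id];
close;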
quit;
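
Because the x values are fixed at 1, 2, 3, both estimates can also be written in closed form, which avoids the transpose and the per-row loop on a million-row table. The following is a minimal sketch rather than part of the reply above; the names `want_direct`, `slope_int` and `slope_noint` are made up for illustration, and it assumes the `have` data set defined above.

```sas
/* Sketch: closed-form slopes for x = 1, 2, 3 (hypothetical output names). */
data want_direct;
  set have;
  /* With intercept: sum((x-2)*(y-ybar)) / sum((x-2)**2) reduces to (c3 - c1) / 2 */
  slope_int = (c3 - c1) / 2;
  /* Through the origin: sum(x*y) / sum(x**2), where 1 + 4 + 9 = 14 */
  slope_noint = (c1 + 2*c2 + 3*c3) / 14;
run;
```

Here `slope_int` should correspond to the PROC REG model with an intercept and `slope_noint` to the /noint model and the IML code. A row with a missing value in any column it uses gets a missing slope.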

 




4 REPLIES
Ksharp
Super User
OK. You have three variables, not two, so can you explain how you get a slope for each row?
What is your logic for that slope? Do you mean the parameter estimate (slope) from PROC REG?


 


NKormanik
Barite | Level 11

Been having some computer problems.  Thanks very much for your continued assistance with this slope matter.  I'll try and use your code.

 

 

gergely_batho
SAS Employee

I see no problem with your code.

I tried it, and it works. The two slope calculations give the same results.

You probably have missing values in the _505xx variables. That can cause differences, because SUM() and MEAN() skip missing values while n1 and den_tot still count all three points.

 

 

data have;
input _50501 _50502 _50503;
datalines;
3 4 5
7 7 7
3 2 3
;
run;
data want;
set have;
array ys(3) _50501 _50502 _50503;
array vals(3) _temporary_ (1 2 3);
xbar = mean(of vals(*));
ybar = mean(of ys(*));
do i=1 to dim(vals);
s_xy=sum(s_xy, i*ys(i));
s_y=sum(s_y, ys(i));
s_x2=sum(s_x2, vals(i)**2);
num=(vals(i)-xbar)*(ys(i)-ybar);
den=(vals(i)-xbar)**2;
num_tot=sum(num, num_tot);
den_tot=sum(den, den_tot);
end;
s_x=sum(of vals(*));
n1=dim(vals);
slope2 = (n1*s_xy - s_x*s_y)/(n1*s_x2 - s_x**2);
Slope_5050x_3 = num_tot/den_tot;
run;
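
If missing values are the cause, one option is to compute the slope only for complete rows. The sketch below is an assumption about what is wanted rather than part of the reply above: it reuses the `have` data set, writes to a hypothetical `want_checked`, and relies on the fact that for x = 1, 2, 3 the with-intercept slope reduces to (y3 - y1) / 2.

```sas
/* Sketch: flag incomplete rows instead of letting SUM()/MEAN() skip them silently. */
data want_checked;                    /* hypothetical output name */
  set have;
  array ys(3) _50501 _50502 _50503;
  n_missing = nmiss(of ys(*));        /* number of missing y values in this row */
  if n_missing = 0 then slope = (ys(3) - ys(1)) / 2;
  else slope = .;                     /* leave the slope missing for incomplete rows */
run;
```

For the three test rows above this gives slopes of 1, 0 and 0. Writing to a new data set, rather than overwriting the input, also keeps accumulators such as s_xy from being read back in by SET on a re-run, where they would be added to again and distort slope2.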
