## Once again, row slope revisited

Solved
Regular Contributor
Posts: 238

Here are a few rows of data (actual dataset has over a million rows):

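| id | c1 | c2 | c3 |
|----|---------:|--------:|--------:|
| a | 0.87396 | 2.00827 | 2.81477 |
| b | 0.97002 | 2.00064 | 2.81468 |
| c | 0.68026 | 1.86006 | 2.81403 |
| d | 0.58230 | 1.84271 | 2.81402 |
| e | 0.73355 | 1.59606 | 2.81368 |
| f | 1.20452 | 2.07169 | 2.81365 |
| g | 0.91387 | 1.19560 | 2.81352 |
| h | 1.19418 | 2.10696 | 2.81306 |
| i | -0.10435 | 1.34213 | 2.81296 |
| j | -0.06286 | 1.50670 | 2.81225 |
| k | 1.73225 | 2.37420 | 2.81185 |
| l | 0.53130 | 1.87948 | 2.81164 |
| m | 0.86563 | 1.65322 | 2.81151 |
| n | 1.05598 | 2.40835 | 2.81065 |
| o | 0.15833 | 1.55971 | 2.81035 |
| p | 0.00912 | 0.61380 | 2.81033 |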

The objective is to compute the slope for each row, and place that slope computation into column 4.

Assume the X values are 1, 2, 3.

Y values are those given in each row.

The code below is what I have been trying, but something seems incorrect, as the resulting slope computations don't appear correct.

Please take a look at the code and tell me if you see any problem with it.

Thanks much!

``````data nicholas.n_slope_means__7;
set nicholas.n_slope_means__7;
array ys(3) _50501 _50502 _50503;
array vals(3) (1 2 3);
xbar = mean(of vals(*));
ybar = mean(of ys(*));
do i=1 to dim(vals);
s_xy=sum(s_xy, i*ys(i));
s_y=sum(s_y, ys(i));
s_x2=sum(s_x2, vals(i)**2);
num=(vals(i)-xbar)*(ys(i)-ybar);
den=(vals(i)-xbar)**2;
num_tot=sum(num, num_tot);
den_tot=sum(den, den_tot);
end;
s_x=sum(of vals(*));
n1=dim(vals);
slope2 = (n1*s_xy - s_x*s_y)/(n1*s_x2 - s_x**2);
Slope_5050x_3 = num_tot/den_tot;
run;
``````

Accepted Solutions
Solution
‎08-14-2016 06:19 AM
Super User
Posts: 10,778

## Re: Once again, row slope revisited

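OK. I assume you want the Beta (slope) parameter that PROC REG estimates.
Below are two versions of the code: one uses data steps with PROC REG, the other uses IML. Note that they differ: PROC REG fits an intercept by default, while the IML code fits no intercept (it matches the commented-out /noint model).

```
data have;
input id $ c1 c2 c3;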
cards;
a 0.87396 2.00827 2.81477
b 0.97002 2.00064 2.81468
c 0.68026 1.86006 2.81403
d 0.58230 1.84271 2.81402
e 0.73355 1.59606 2.81368
f 1.20452 2.07169 2.81365
g 0.91387 1.19560 2.81352
h 1.19418 2.10696 2.81306
i -0.10435 1.34213 2.81296
j -0.06286 1.50670 2.81225
k 1.73225 2.37420 2.81185
l 0.53130 1.87948 2.81164
m 0.86563 1.65322 2.81151
n 1.05598 2.40835 2.81065
o 0.15833 1.55971 2.81035
p 0.00912 0.61380 2.81033
;
run;
proc transpose data=have out=temp;
by id;
var c:;
run;
data temp;
set temp;
by id;
if first.id then x=0;
x+1;
drop _name_;
run;
proc reg data=temp outest=wantt noprint;
by id;
model col1=x ;
/* model col1=x /noint;  <-- Which  is the same as IML code*/
quit;

data have;
input id $ c1 c2 c3;
cards;
a 0.87396 2.00827 2.81477
b 0.97002 2.00064 2.81468
c 0.68026 1.86006 2.81403
d 0.58230 1.84271 2.81402
e 0.73355 1.59606 2.81368
f 1.20452 2.07169 2.81365
g 0.91387 1.19560 2.81352
h 1.19418 2.10696 2.81306
i -0.10435 1.34213 2.81296
j -0.06286 1.50670 2.81225
k 1.73225 2.37420 2.81185
l 0.53130 1.87948 2.81164
m 0.86563 1.65322 2.81151
n 1.05598 2.40835 2.81065
o 0.15833 1.55971 2.81035
p 0.00912 0.61380 2.81033
;
run;
proc iml;
use have;
read all var _num_ into y[r=id c=vnames];
close;
x={1 ,2 ,3};
beta=j(nrow(y),1,.);
do i=1 to nrow(y);
beta[i]=solve(x`*x,x`*y[i,]`);
end;
want=y||beta;
create want from want[r=id c=(vnames||'Beta')];
append from want[r=id];
close;
quit;

```
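
To make the difference between the two programs concrete, here is a short worked comparison. It is only a sketch: it assumes complete rows and the X values 1, 2, 3 from the question, and the symbols $y_1, y_2, y_3$ (standing for c1, c2, c3) are notation introduced here rather than anything in the code.

$$
\hat\beta_{\text{intercept}}
= \frac{\sum_i (x_i-\bar x)(y_i-\bar y)}{\sum_i (x_i-\bar x)^2}
= \frac{y_3-y_1}{2},
\qquad
\hat\beta_{\text{noint}}
= \frac{\sum_i x_i y_i}{\sum_i x_i^2}
= \frac{y_1+2y_2+3y_3}{14}
$$

For row a this gives $(2.81477-0.87396)/2 = 0.970405$ with an intercept, and $(0.87396 + 2\cdot 2.00827 + 3\cdot 2.81477)/14 \approx 0.95249$ without one. Choose the version that matches the slope you actually want.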

All Replies
Super User
Posts: 10,778

## Re: Once again, row slope revisited

OK. You have three variables, not two, so can you explain how you get the row slope?
What is your logic for computing it? Do you mean the slope parameter estimated by PROC REG?

```
data nicholas.n_slope_means__7;
set nicholas.n_slope_means__7;
array ys(3) _50501 _50502 _50503;
array vals(3) (1 2 3);
xbar = mean(of vals(*));
ybar = mean(of ys(*));
do i=1 to dim(vals);
s_xy=sum(s_xy, i*ys(i));
s_y=sum(s_y, ys(i));
s_x2=sum(s_x2, vals(i)**2);
num=(vals(i)-xbar)*(ys(i)-ybar);
den=(vals(i)-xbar)**2;
num_tot=sum(num, num_tot);
den_tot=sum(den, den_tot);
end;
s_x=sum(of vals(*));
n1=dim(vals);
slope2 = (n1*s_xy - s_x*s_y)/(n1*s_x2 - s_x**2);
Slope_5050x_3 = num_tot/den_tot;
run;

```
Regular Contributor
Posts: 238

## Re: Once again, row slope revisited

I've been having some computer problems. Thanks very much for your continued assistance with this slope matter. I'll try your code.

SAS Employee
Posts: 340

## Re: Once again, row slope revisited

I see no problem with your code.

I tried it, and it works. The 2 types of slope calculations give the same results.
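
As a sketch of why they agree, assume no missing values and write $y_1, y_2, y_3$ for the three values in a row (notation introduced here). With X values 1, 2, 3 we have $\bar x = 2$, $\sum x_i = 6$ and $\sum x_i^2 = 14$, so:

$$
\text{slope2}
= \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}
= \frac{3(y_1+2y_2+3y_3) - 6(y_1+y_2+y_3)}{3\cdot 14 - 36}
= \frac{y_3-y_1}{2}
$$

$$
\text{Slope\_5050x\_3}
= \frac{\sum_i (x_i-\bar x)(y_i-\bar y)}{\sum_i (x_i-\bar x)^2}
= \frac{-(y_1-\bar y) + 0 + (y_3-\bar y)}{1+0+1}
= \frac{y_3-y_1}{2}
$$

Both expressions reduce to the same quantity, so a quick check on any complete row is half the difference between the third and first values.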

You probably have missing values in the _505xx variables. That can cause differences.

```
data have;
  input _50501 _50502 _50503;
  datalines;
3 4 5
7 7 7
3 2 3
;
run;

data want;
  set have;
  array ys(3) _50501 _50502 _50503;
  array vals(3) _temporary_ (1 2 3);
  xbar = mean(of vals(*));
  ybar = mean(of ys(*));
  do i=1 to dim(vals);
    s_xy=sum(s_xy, i*ys(i));
    s_y=sum(s_y, ys(i));
    s_x2=sum(s_x2, vals(i)**2);
    num=(vals(i)-xbar)*(ys(i)-ybar);
    den=(vals(i)-xbar)**2;
    num_tot=sum(num, num_tot);
    den_tot=sum(den, den_tot);
  end;
  s_x=sum(of vals(*));
  n1=dim(vals);
  slope2 = (n1*s_xy - s_x*s_y)/(n1*s_x2 - s_x**2);
  Slope_5050x_3 = num_tot/den_tot;
run;
```
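
With complete rows and X values 1, 2, 3, the slope is $(y_3-y_1)/2$, so the three test rows give 1, 0 and 0. To illustrate the missing-value point, the sketch below traces what the same code would compute if the first value in a row were missing. It assumes the usual SAS behavior that `mean()` and `sum()` skip missing arguments while arithmetic on a missing value yields missing; $y_2, y_3$ are notation introduced here for the remaining values.

$$
s_{xy} = 2y_2 + 3y_3,\quad s_y = y_2 + y_3,\quad s_{x2} = 14,\quad s_x = 6
\;\;\Rightarrow\;\;
\text{slope2} = \frac{3(2y_2+3y_3) - 6(y_2+y_3)}{6} = \frac{y_3}{2}
$$

$$
\bar y = \frac{y_2+y_3}{2},\quad
\text{num\_tot} = 0\cdot(y_2-\bar y) + (y_3-\bar y) = \frac{y_3-y_2}{2},\quad
\text{den\_tot} = 2
\;\;\Rightarrow\;\;
\text{Slope\_5050x\_3} = \frac{y_3-y_2}{4}
$$

The two results no longer match, and neither equals $y_3 - y_2$, the slope of the line through the two remaining points. Rows with missing values therefore need separate handling before either formula is applied.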

☑ This topic is solved.