asasha
Obsidian | Level 7

Hi,

 

I want to do a linear regression on every row and attach the slope as a new column: I have four measurements with unequal intervals, and I need to see the trend. How can I do this in SAS? My desired columns are as follows:

 

time_1 time_3 time_4 time_5 calculated_slope

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Putting variable values in variable names is bad practice. Anyhow, here is a way to do this:

 

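/* Example data: one row per id; the time point is encoded in the variable name */
data have;
input id time1 time3 time4 time5;
datalines;
1 1 2 3 4
2 3 2 1 1
;

/* Wide to long: one row per (id, time variable); the values land in COL1 */
proc transpose data=have out=temp;
by id;
var time:;
run;

/* Recover the numeric time from the variable name: "kd" keeps digits only */
data regr;
set temp;
x = input(compress(_name_,,"kd"), best.);
drop _name_;
run;

/* One regression per id; OUTEST= stores the coefficient of x in a variable named x */
proc reg data=regr outest=estm noprint;
by id;
model col1 = x;
run;
quit;

/* Attach the slope to the original rows */
data want;
merge have estm(keep=id x rename=(x=slope));
by id;
run;

proc print data=want noobs; run;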
id 	time1 	time3 	time4 	time5 	slope
1 	1 	2 	3 	4 	0.74286
2 	3 	2 	1 	1 	-0.54286
PG


9 REPLIES
ballardw
Super User

"Slope" implies a change in y as x changes. You show only one set of values, so either the y values or the x values are missing.

If the values shown are the measurements at the given times, then we need to know what the actual intervals are, especially since you say the intervals are unequal.


asasha
Obsidian | Level 7

Hi, the intervals are (1,3,4,5) as in the column names. Here is how it is done in R:

https://stackoverflow.com/questions/21783871/calculating-a-linear-trend-line-for-every-row-of-a-tabl...

 

The exception is: I have 4 points, the days are not (1,2,3) but (1,3,4,5), and I only care about the slope. Does that make sense? Thank you.

PaigeMiller
Diamond | Level 26

@asasha wrote:

Hi, the intervals are (1,3,4,5) as in the column names. Here is how it is done in R:

https://stackoverflow.com/questions/21783871/calculating-a-linear-trend-line-for-every-row-of-a-tabl...

 

The exception is: I have 4 points, the days are not (1,2,3) but (1,3,4,5), and I only care about the slope. Does that make sense? Thank you.


The functions in R have no corresponding function in SAS. As I said above, you can convert the formula for slope to a SAS formula, using the USS() function, the SUM() function, and other functions as appropriate.
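
For reference, a sketch of the ordinary least-squares slope that this formula approach relies on. Here $x_i$ are the time points (assumed to be 1, 3, 4, 5, taken from the variable names) and $y_i$ are the measurements in one row; the symbols are introduced for illustration only.

```latex
\[
b \;=\; \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
             {\sum_{i=1}^{n} (x_i - \bar{x})^2}
  \;=\; \frac{\sum x_i y_i \;-\; \frac{1}{n}\sum x_i \sum y_i}
             {\sum x_i^2 \;-\; \frac{1}{n}\left(\sum x_i\right)^2}
\]
% Worked example with x = (1, 3, 4, 5) and y = (1, 2, 3, 4):
%   sum x = 13, sum x^2 = 51, sum y = 10, sum xy = 39, n = 4
%   numerator   = 39 - 13*10/4 = 6.5
%   denominator = 51 - 13^2/4  = 8.75
%   b = 6.5 / 8.75 = 0.74286 (rounded)
```

In SAS terms, USS() returns the uncorrected sum of squares (the $\sum x_i^2$ term) and SUM() returns the plain sums. The worked value agrees with the PROC REG slope reported for id 1 in the accepted solution.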

--
Paige Miller
PGStats
Opal | Level 21

Please show an example of inputs.

PG
asasha
Obsidian | Level 7

This is what I want:

time1 time3 time4 time5        slope
    1     2     3     4      0.00000
    3     2     1     1      4.00000
    1     1     5     1     -1.66667

The slope variable is not currently in the data. The slope values in the example above are placeholders, not correctly computed slopes. Thank you.

PaigeMiller
Diamond | Level 26

You can rewrite the regression formula as a function of the variables in each row. For example, part of the linear regression formula for the slope is a sum of squares, which you can compute with the USS() function.

Or you can transpose the data so that each row becomes a column of four numbers, add a column of the x values, and then run PROC REG with a BY statement.
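
The first option might look something like the sketch below. It assumes the HAVE data set and the time points 1, 3, 4, 5 from the accepted solution; the data set name WANT_FORMULA and the intermediate variable names are made up for illustration. It also assumes that no measurement is missing.

```sas
/* Sketch only: closed-form least-squares slope for each row of HAVE.   */
/* Assumes x = 1, 3, 4, 5 for every row and no missing measurements.    */
data want_formula;
  set have;
  n   = 4;                                      /* number of time points */
  sx  = sum(1, 3, 4, 5);                        /* sum of x              */
  sxx = uss(1, 3, 4, 5);                        /* sum of x squared      */
  sy  = sum(time1, time3, time4, time5);        /* sum of y              */
  sxy = 1*time1 + 3*time3 + 4*time4 + 5*time5;  /* sum of x*y            */
  slope = (sxy - sx*sy/n) / (sxx - sx**2/n);
  drop n sx sxx sy sxy;
run;
```

Because the x values are the same for every row, the denominator is a constant, so the whole calculation stays in one DATA step with no transpose or merge.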

--
Paige Miller
Reeza
Super User

I agree with everyone else that this isn't a good idea. That being said, it can be implemented.

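data test;
input x1 x3 x4 x5 ;
datalines;
-0.069965723 0.492749371 0.955245597 1.346963522 
;
run;

/* Slope computed directly in a DATA step, one row at a time */
data slope;
set test;
array ys(4) x1 x3 x4 x5;
/* time points; the array size must match the number of values and the size of ys */
array vals(4) (1 3 4 5);
xbar = mean(of vals(*));
ybar = mean(of ys(*));

do i=1 to dim(vals);
num=sum(num, (vals(i)-xbar)*(ys(i)-ybar));
den=sum(den, (vals(i)-xbar)**2);
end;

slope = num/den;
run;

/* Cross-check with PROC REG on the transposed data */
proc transpose data=test out=test2(rename=(col1=y));
run;

data test2;
set test2;
/* take x from the variable name (1, 3, 4, 5); _n_ would give 1, 2, 3, 4 instead */
x = input(compress(_name_,,"kd"), best.);
run;

proc reg data=test2;
model y=x;
run;
quit;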

https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Regression-type-problem-data-in-a-ro...

 


@asasha wrote:

Hi,

 

I want to do a linear regression on every row and attach the slope as a new column: I have four measurements with unequal intervals, and I need to see the trend. How can I do this in SAS? My desired columns are as follows:

 

time_1 time_3 time_4 time_5 calculated_slope

 

Thank you!


 

PaigeMiller
Diamond | Level 26

I point out that the PROC TRANSPOSE method will handle missing Y values properly. The method of writing your own formula for slope, that I and others have mentioned, will not handle missing Y values properly without additional attention.
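
As a rough illustration of what that additional attention could look like, the sketch below accumulates the sums only over the non-missing measurements in each row. It assumes the HAVE data set and the time points 1, 3, 4, 5 from the accepted solution; the data set name SLOPE_MISSING_OK and the work variables are hypothetical.

```sas
/* Sketch only: per-row slope that skips missing measurements.          */
/* Each (x, y) pair is used only when y is non-missing.                 */
data slope_missing_ok;
  set have;
  array ys(4) time1 time3 time4 time5;
  array xs(4) _temporary_ (1 3 4 5);   /* time points, assumed fixed */
  n = 0; sx = 0; sy = 0; sxx = 0; sxy = 0;
  do i = 1 to dim(ys);
    if not missing(ys(i)) then do;
      n   = n + 1;
      sx  = sx  + xs(i);
      sy  = sy  + ys(i);
      sxx = sxx + xs(i)**2;
      sxy = sxy + xs(i)*ys(i);
    end;
  end;
  /* a slope needs at least two points; otherwise slope stays missing */
  if n >= 2 then slope = (sxy - sx*sy/n) / (sxx - sx**2/n);
  drop i n sx sy sxx sxy;
run;
```

The key point is that x and y are dropped together for a missing measurement, which is also what PROC REG does when it excludes an observation with a missing response.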

 
--
Paige Miller
