turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Analytics
- /
- Stat Procs
- /
- Regression data are across rows

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 06:14 AM

Row 1 is 1, 2, 3, 4, 5, 8 (representing days).

Row 2 are various values going out several decimal points.

I want to run a regression on Row 1 vs. Row 2.

I need the intercept and slope. Ideally in columns following the values in Row 2.

There are many more rows besides Row 2. Hundreds. Each row needs a regression run, to find intercept and slope.

But Row 1 remains the same throughout all the regressions.

Since the SAS worksheet is already set up as described above it would be preferable to be able to use it as is.

Any possibilities going about this??

Help and suggestions greatly appreciated.

Nicholas

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 07:28 AM

Hi Nicholas,

You will have to normalize your data before trying to do anything with it. No proc will ever be able to make sense of the current shape it's in. Regression is performed on values in rows, not columns. It is one of the common mistakes described in the excellent article "3 common messy data problems and how to tidy them in SAS".

Use a datastep to combine two consecutive rows and create an identifier for that pair of rows. Take care to rename the variables on the two reads so not to overwrite values. Then output every variable pair (from row1-var1, row2-var1 etc) plus the identifier to a new dataset. Then you can run your regression on that dataset using a BY on the identifier. If row 1 is not repeated (please provide sample data) then retain that first row and loop over the rest. It could in fact make the datastep easier. But we'd have to know. Please provide sample data before I can have a shot at coding.

And I do assume that what you call a SAS data sheet is in fact a SAS dataset.

Regards,

- Jan.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 07:58 AM

Oke so I couldn't resist myself. Here is one way to go about it:

```
data have;
input var1-var7;
cards;
1 2 3 4 5 6 8
0.0 0.1 0.2 0.3 0.4 0.5 0.6
1.1 1.2 1.2 1.3 1.3 1.4 1.4
;
run;
data want;
if _n_=1 then set have(rename=(var1=day1 var2=day2 var3=day3 var4=day4 var5=day5 var6=day6 var7=day7));
array days{*} day1-day7;
array var{*} var1-var7;
do until(eod);
set have end=eod;
id+1;
do i=1 to 7;
day=days{i};
value=var{i};
output;
end;
end;
keep id day value;
run;
proc print data=want;
run;
```

You can run proc reg by id on dataset "want" and have reg create an output dataset.

And again, there is no way this can be done on your dataset as it is. You need to rearrange it.

Hope this helps,

- Jan.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 08:51 AM

You can also get the desired ouput by flipping the data and using GLM to fit the model. This will handle missing values properly.

```
data wide;
input y1-y5;
cards;
60 61 62 63 65
3.1 3.6 3.8 4 4.1
4.5 3.4 2.3 5 6
3.5 . 2.3 5 6
;;;;
run;
proc print;
run;
proc transpose data=wide out=flip(drop=_name_ rename=(col1=X col2-col5=y1-y4));
var y:;
run;
proc print;
run;
ods select none;
ods output ParameterEstimates=ParameterEstimates;
proc glm;
model y: = x;
run;
ods select all;
proc print;
run;
```

You don't have to rename the new columns and just know that COL1 is X and the others are Y.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 08:21 AM

If you really only need slope and intercept you "can" do it in a data step. This assumes NO missing values. Proper handling of missing values if just one of may advantages to using PROC REG and with proper data structure.

```
data wide;
input y1-y5;
cards;
60 61 62 63 65
3.1 3.6 3.8 4 4.1
4.5 3.4 2.3 5 6
;;;;
run;
data reg(keep=y: slope intercept);
if _n_ eq 1 then set wide(rename=(y1-y5=x1-x5));
array x[*] x:;
set wide;
array y[*] y:;
array x_y[5];
do i = 1 to dim(y);
x_y[i] = x[i]*y[i];
end;
n = n(of y[*]);
sx = sum(of x[*]);
sy = sum(of y[*]);
sxy = sum(of x_y[*]);
sx2 = uss(of x[*]);
slope = ((n*sxy)-(sx*sy)) / ((n*sx2)-sx**2);
intercept = (sy - slope*sx) / n;
run;
proc print;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 07:00 PM

Because of the apparent complexity (for me, that is) I'm wondering if we could just take the one-row case.

Here is the first row, y:

-1.6264 | -0.2521 | 1.2354 | 1.2354 | 2.148 | 2.148 |

The x, then, is 1, 2, 3, 4, 5, 8

I'm hoping the code to do the transformation and regression will be more understandable.

Another related question: Is there another SAS procedure that will provide a 'forecast' to a succeeding value. Say, for instance, if x=10?

Thanks to all here for the assistance.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-11-2016 09:45 PM

It sounds that you are doing Time Series Analysis.

Better post it at SAS Forecasting and Econometrics

There are tons of PROC can do that, such as PROC USM , PROC ARIMA ........