Help using Base SAS procedures

How to capture the regression coeffecient for certain time interval in panel data.

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

How to capture the regression coeffecient for certain time interval in panel data.

Hello,

 

I have a dataset like this.

data temp.database ;
input
YEAR ID SALE;
datalines;
2000 001 23
2001 001 25
2002 001 28
2003 001 30
2004 001 32
2005 001 40
2000 002 23
2001 002 25
2002 002 28
2003 002 30
2004 002 32
2005 002 40
;
run;

What I want is like this

data temp.database ;
input
YEAR ID SALE BETA;
datalines;
2000 001 23     /* here it is empty as there are no value for previous year */
2001 001 25 
2002 001 28 
2003 001 30 0.3 /*the value I just make up, but it should be the coefficient from regressing SALE in 2000,2001,2002,2003 on the YEAR value (2000,2001,2002,2003)*/
2004 001 32 0.4 /*Again, this BETA should be the coefficient from regressing SALE in 2001,2002,2003,2004 on YEAR value 2001,2002,2003,2004 */
2005 001 40 0.5 /*Again, this BETA should be the coefficient from regressing SALE in 2002,2003,2004,2005 on YEAR value 2002,2003,2004,2005 */
2000 002 23 
2001 002 25 
2002 002 28 
2003 002 30 0.2
2004 002 32 0.3
2005 002 40 0.4
;
run;

If it is in a regression model, it will be like this (use obs in 2003 as example)

[SALE in 2000                    [2000

SALE in 2001   =    BETA*  2001     +    residual

SALE in 2002                     2002

SALE in 2003]                    2003]

Hope this makes sense to you.

How can I capture this BETA and residual respectively?


Accepted Solutions
Solution
‎03-16-2018 05:09 PM
Esteemed Advisor
Posts: 5,543

Re: How to capture the regression coeffecient for certain time interval in panel data.

This is how to achieve what you want with proc reg

 

data have;
input YEAR ID SALE;
datalines;
2000 001 23
2001 001 25
2002 001 28
2003 001 30
2004 001 32
2005 001 40
2000 002 23
2001 002 25
2002 002 28
2003 002 30
2004 002 32
2005 002 40
;

%let window=4;

data temp;
set have;
do y = 1 to &window.;
    grp = year + &window. - y;
    output;
    end;
run;

proc sql;
create table reg as
select *
from temp
group by id, grp
having count(*) = &window.
order by id, grp, y;
quit;

proc reg data=reg noprint outest=regEst;
by id grp;
model sale=y;
output out=regOut(where=(y=&window)) r=residual;
run;

proc sql;
create table want as
select 
    a.*,
    c.y as beta,
    b.residual
from
    have as a left join 
    regOut as b on a.id=b.id and a.year=b.grp left join
    regEst as c on a.id=c.id and a.year=c.grp
order by id, year;
select * from want;
quit;
                      YEAR        ID      SALE      beta  Residual
                  ------------------------------------------------
                      2000         1        23         .         .
                      2001         1        25         .         .
                      2002         1        28         .         .
                      2003         1        30       2.4      -0.1
                      2004         1        32       2.3      -0.2
                      2005         1        40       3.8       1.8
                      2000         2        23         .         .
                      2001         2        25         .         .
                      2002         2        28         .         .
                      2003         2        30       2.4      -0.1
                      2004         2        32       2.3      -0.2
                      2005         2        40       3.8       1.8
PG

View solution in original post


All Replies
Esteemed Advisor
Posts: 5,543

Re: How to capture the regression coeffecient for certain time interval in panel data.

Why don't you regress on all years for a given ID?

PG
Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

This is how the previous researchers did. Only care about the previous certain years. Sometimes it is previous 4 years, sometimes 5 years.

And there are more than 900 ID, so I cannot do this manually.

Esteemed Advisor
Posts: 5,543

Re: How to capture the regression coeffecient for certain time interval in panel data.

I assume that BETA is the regression slope? Where is the INTERCEPT?

PG
Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

Yea, coefficient is the slope.

Constant value is not needed in this research. The regression result will have it, but I will not use it.

Super User
Posts: 23,787

Re: How to capture the regression coeffecient for certain time interval in panel data.


@yanshuai wrote:

This is how the previous researchers did. Only care about the previous certain years. Sometimes it is previous 4 years, sometimes 5 years.

And there are more than 900 ID, so I cannot do this manually.


How do you know which to do when?

When do you do a 4 year window or 5 year window. 

 

There’s a data step solution on here from mkeintz a few weeks ago about a moving slope calculation. 

 

Do you have SAS / ETS license, if so PROC EXPAND may be useful. 

Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

OK.

In this case, I will do 4 year window.

4 year or 5 year are usually based on previous studies.

No matter 4 or 5, I am just curious about how to acheive this.

Solution
‎03-16-2018 05:09 PM
Esteemed Advisor
Posts: 5,543

Re: How to capture the regression coeffecient for certain time interval in panel data.

This is how to achieve what you want with proc reg

 

data have;
input YEAR ID SALE;
datalines;
2000 001 23
2001 001 25
2002 001 28
2003 001 30
2004 001 32
2005 001 40
2000 002 23
2001 002 25
2002 002 28
2003 002 30
2004 002 32
2005 002 40
;

%let window=4;

data temp;
set have;
do y = 1 to &window.;
    grp = year + &window. - y;
    output;
    end;
run;

proc sql;
create table reg as
select *
from temp
group by id, grp
having count(*) = &window.
order by id, grp, y;
quit;

proc reg data=reg noprint outest=regEst;
by id grp;
model sale=y;
output out=regOut(where=(y=&window)) r=residual;
run;

proc sql;
create table want as
select 
    a.*,
    c.y as beta,
    b.residual
from
    have as a left join 
    regOut as b on a.id=b.id and a.year=b.grp left join
    regEst as c on a.id=c.id and a.year=c.grp
order by id, year;
select * from want;
quit;
                      YEAR        ID      SALE      beta  Residual
                  ------------------------------------------------
                      2000         1        23         .         .
                      2001         1        25         .         .
                      2002         1        28         .         .
                      2003         1        30       2.4      -0.1
                      2004         1        32       2.3      -0.2
                      2005         1        40       3.8       1.8
                      2000         2        23         .         .
                      2001         2        25         .         .
                      2002         2        28         .         .
                      2003         2        30       2.4      -0.1
                      2004         2        32       2.3      -0.2
                      2005         2        40       3.8       1.8
PG
Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

Thank you so much!

An interesting thing found: sale=y here y is only a series of number like 1 2 3 4. So I tried replace y with YEAR, but the result BETA is the same. So I guess the 2001 2002 2003 2004 is the same with 1 2 3 4 in regression model.

proc reg data=reg noprint outest=regEst;
by id grp;
model sale=YEAR;
output out=regOut(where=(y=&window)) r=residual;
run;

 By the way, if I would like to capture the BETA in previous 5 years rather than 4 years, can I simply switch to this?

%window=5

 

Esteemed Advisor
Posts: 5,543

Re: How to capture the regression coeffecient for certain time interval in panel data.


@yanshuai wrote:

 By the way, if I would like to capture the BETA in previous 5 years rather than 4 years, can I simply switch to this?

%window=5

 


That's the purpose of creating the macro variable window. Don't forget the semicolon at the end.

PG
Super User
Posts: 23,787

Re: How to capture the regression coeffecient for certain time interval in panel data.

I didn't do the residual portion, not sure how you'll do that without the intercept for starters or how you'd do it with multiple models. If you have 4 models you then have at least 4 residuals for each observations which would get cumbersome quickly. 

 

Screen Shot 2018-03-15 at 8.49.16 PM.png

Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

Thank you Reeza.

I just want to run a simple OLS regression. So there might be only one model used I guess.

Thank you so much anyway. I will figure out your code.

Super User
Posts: 10,788

Re: How to capture the regression coeffecient for certain time interval in panel data.

It is very easy for IML code.

 

data database ;
input
YEAR ID SALE;
datalines;
2000 001 23
2001 001 25
2002 001 28
2003 001 30
2004 001 32
2005 001 40
2000 002 23
2001 002 25
2002 002 28
2003 002 30
2004 002 32
2005 002 40
;
run;
proc iml;
use database;
read all var{id};
level=unique(id);

do i=1 to ncol(level);
 read all var{id year sale} where(id=(level[i]));
 beta=j(nrow(year),1,.);
 k=0;
 do j=4 to nrow(year);
  k=k+1;
  _year=year[k:j]; 
  _sale=sale[k:j];
  xx=j(4,1)||_year;
  beta[j]=solve(xx`*xx,xx`*_sale)[2];
 end;
 temp=id||year||sale||beta;
 want=want//temp;
end;
close;
create want from want[c={id year sale beta}];
append from want;
close;
quit;
Contributor
Posts: 50

Re: How to capture the regression coeffecient for certain time interval in panel data.

Thank you. This is cool. First time heard about IML.

But is this code trying to figure out the BETA manually? Because this code looks really fimiliar with the slope formula in Econometrics.

beta[j]=solve(xx`*xx,xx`*_sale)[2]
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 510 views
  • 7 likes
  • 4 in conversation