wriccar
Fluorite | Level 6

Hello fellow SAS users!

I need some quick help writing code that produces estimated coefficients from an OLS regression model for each year, using only the preceding years' data.

 

The model is:
DV = var1 + var2

 

The code is relatively straightforward:

PROC REG DATA=WRDS_OUTPUT PLOTS=NONE OUTEST=PARAM;
MODEL DV = VAR1 VAR2;
BY YEAR INDUSTRY; 
RUN;

My dilemma is that I need to estimate the constant and coefficients on VAR1 and VAR2 using data through year t-1, which means the sample period will differ each year. 

 

Is there an easy way to incorporate this? 

 

FreelanceReinh
Jade | Level 19

Hello @wriccar,

 

Let's assume your WRDS_OUTPUT dataset contains data from 2001 through 2004. Then you could create three models per value of the variable INDUSTRY:


1. one for 2002 using the data with YEAR=2001 of the respective INDUSTRY
2. one for 2003 using the data with YEAR in (2001, 2002) of the respective INDUSTRY
3. one for 2004 using the data with YEAR in (2001, 2002, 2003) of the respective INDUSTRY

 

If this is what you want, I suggest:

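/* Sort by INDUSTRY so that the BY statement in PROC REG works */
proc sort data=wrds_output;
by industry;
run;

/* Generate one PROC REG step per target year; each step uses only data from earlier years */
data _null_;
do year=2002 to 2004;
  /* .z<year excludes missing YEAR values; year<(target year) keeps only the preceding years */
  call execute(cats('proc reg data=wrds_output(where=(.z<year<',year,')) plots=none outest=param',year,';'));
  call execute('model dv = var1 var2;');
  call execute('by industry;');
  call execute('run;');
end;
run;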

The above code creates work datasets PARAM2002, PARAM2003 and PARAM2004 corresponding to items 1 - 3 above. You can concatenate these datasets easily, if needed:

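data param;
length year 8;
/* INDSNAME= stores the name of the contributing dataset (e.g. WORK.PARAM2002) in DSN */
set param2002-param2004 indsname=dsn;
/* Keep only the digits of the dataset name to recover the target year */
year=input(compress(dsn,,'kd'), 4.);
run;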
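
As an alternative to CALL EXECUTE, the same expanding windows can be built by stacking the data once per target year and running a single PROC REG with an extra BY variable. This is only a sketch of a different technique, not the approach used above: the names STACKED, EST_YEAR and PARAM_ALL are made up for illustration, and the 2002 to 2004 range is the same assumption as in the example.

```sas
/* Copy each observation into every target year that follows its own YEAR */
data stacked;
  set wrds_output;
  where year > .z;                      /* drop missing YEAR values */
  do est_year = max(year + 1, 2002) to 2004;
    output;                             /* no output when YEAR is 2004 or later */
  end;
run;

proc sort data=stacked;
  by est_year industry;
run;

/* One row of estimates per EST_YEAR and INDUSTRY */
proc reg data=stacked plots=none outest=param_all;
  model dv = var1 var2;
  by est_year industry;
run;
quit;
```

The stacked dataset grows roughly with the square of the number of years, so for a long sample period the CALL EXECUTE loop may be the lighter option.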
wriccar
Fluorite | Level 6

Hi @FreelanceReinh

 

Thanks for the reply. This code worked perfectly and I thank you for it. I have one question, though you may not be able to answer without having access to the data. 

 

First, keep in mind I have already removed observations with missing values from the data set. 

Second, my actual data set is from 1957 - 2016; I want to begin regressions as of 1972, so I adjusted the "do fyear=" statement of your code. It still worked. 

 

The problem, though, is that until 1999 the regression doesn't run, saying there are no valid observations for each BY group. I thought this could be a problem of not having enough observations per industry classification, but it is giving the message for ALL industry groups. (Also, it gives the same message if I leave out the BY statement altogether.) 

 

Obviously this could be an issue for me to resolve looking through the data a bit more. But I suppose my question is, does your code inherently require a number of prior years in order to run? 

 

 

FreelanceReinh
Jade | Level 19

Hi @wriccar,

 

This must be a data issue. Even a single observation with non-missing values of DV, VAR1 and VAR2 (in the respective BY group) would prevent the message "ERROR: No valid observations are found." (The results based on a single observation would not be very useful, though.) So, in your case, for many YEARs and apparently all values of INDUSTRY, every observation must have a missing value of DV, VAR1 or VAR2. But if you "have already removed observations with missing values from the data set" (as you wrote), this situation should not occur.
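
One way to check this is to count the complete cases per year and industry. The following is a minimal sketch, assuming the variable names YEAR, INDUSTRY, DV, VAR1 and VAR2 from the posts above; the output dataset name COMPLETE_CASES is made up for illustration.

```sas
/* Count all observations and complete cases per YEAR and INDUSTRY */
proc sql;
  create table complete_cases as
  select year,
         industry,
         count(*) as n_obs,
         /* CMISS returns the number of missing arguments; 0 means a complete case */
         sum(cmiss(dv, var1, var2) = 0) as n_complete
  from wrds_output
  group by year, industry
  order by year, industry;
quit;
```

Years in which N_COMPLETE is 0 for every industry contribute nothing to the regressions, so a target year fails with the message above whenever all of its preceding years look like that.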

 

My test data had at least four observations without missing values in each BY group (which is the minimum non-degenerate case).
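
The figure of four follows from the error degrees of freedom, assuming the model as posted (an intercept plus VAR1 and VAR2, so p = 3 estimated parameters) and n complete observations in a BY group:

```latex
\[
\mathrm{df}_{\text{error}} = n - p = n - 3,
\qquad
\mathrm{df}_{\text{error}} \ge 1 \iff n \ge 4 .
\]
```

With n = 3 the fitted plane in general passes exactly through the three points, leaving no degrees of freedom to estimate the error variance.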

wriccar
Fluorite | Level 6

This was my thought, too. Just wanted to make sure.

I will give the data a thorough going-over.

Thanks again.
