DATA Step, Macro, Functions and more

Skip Over Missing Variables

Occasional Contributor
Posts: 14

Skip Over Missing Variables

I've been working on a list of linear regressions and have run into a problem when I am missing variables. Currently my code looks like this

proc reg data=work.danreg outest=work.regoutput edf;

'11/15/14'n:model lnBeansY0M10 = lnBeansY0M11/noprint;

'12/15/14'n:model lnBeansY0M10 = lnBeansY0M12/noprint;

'01/15/15'n:model lnBeansY0M10 = lnBeansY1M01/noprint;

'02/15/15'n:model lnBeansY0M10 = lnBeansY1M02/noprint;

'03/15/15'n:model lnBeansY0M10 = lnBeansY1M03/noprint;

'04/15/15'n:model lnBeansY0M10 = lnBeansY1M04/noprint;

'05/15/15'n:model lnBeansY0M10 = lnBeansY1M05/noprint;

'06/15/15'n:model lnBeansY0M10 = lnBeansY1M06/noprint;

'07/15/15'n:model lnBeansY0M10 = lnBeansY1M07/noprint;

'08/15/15'n:model lnBeansY0M10 = lnBeansY1M08/noprint;

'09/15/15'n:model lnBeansY0M10 = lnBeansY1M09/noprint;

'10/15/15'n:model lnBeansY0M10 = lnBeansY1M10/noprint;

'11/15/15'n:model lnBeansY0M10 = lnBeansY1M11/noprint;

'12/15/15'n:model lnBeansY0M10 = lnBeansY1M12/noprint;

'01/15/15'n:model lnBeansY0M10 = lnBeansY2M01/noprint;

'03/15/15'n:model lnBeansY0M10 = lnBeansY2M03/noprint;

'04/15/15'n:model lnBeansY0M10 = lnBeansY2M04/noprint;

'05/15/15'n:model lnBeansY0M10 = lnBeansY2M05/noprint;

'06/15/15'n:model lnBeansY0M10 = lnBeansY2M06/noprint;

'07/15/15'n:model lnBeansY0M10 = lnBeansY2M07/noprint;

This runs fine if all of the independent variables exist. But occasionally one of the independent variables doesn't get pulled in for a particular day (ex. if lnBeansY2M06 didn't exist) and the entire proc statement cancels out. Is there any way around this to just have the proc statement skip over regressions with missing variables?

Super User
Posts: 13,563

Re: Skip Over Missing Variables

Posted in reply to dangoebel

I'm fairly sure it is up to you to ensure the variable exists and/or has non-missing values for independent variables in model statement.

If the variable doesn't exist then how does it end up in the code? if you are generating the code using a macro then you might want to add a step to skip creating that line when the variable doesn't exist.

Respected Advisor
Posts: 3,852

Re: Skip Over Missing Variables

Posted in reply to dangoebel

I would transpose work.danreg to have one independent X variable and a BY variable with values INBEANSYnMmm.

Show example of WORK.DANREG.

Occasional Contributor
Posts: 14

Re: Skip Over Missing Variables

Posted in reply to data_null__

My data set is

data work.danreg;

  set prod_stg.&lastdat;

BeansY0M10 = S_00998_Y0_M10 / lag(S_00998_Y0_M10);

lnBeansY0M10 = log(BeansY0M10);

BeansY0M11 = S_00998_Y0_M11 / lag(S_00998_Y0_M11);

lnBeansY0M11 = log(BeansY0M11);

BeansY0M12 = S_00998_Y0_M12 / lag(S_00998_Y0_M12);

lnBeansY0M12 = log(BeansY0M12);

BeansY1M01 = S_00998_Y1_M01 / lag(S_00998_Y1_M01);

lnBeansY1M01 = log(BeansY1M01);

BeansY1M02= S_00998_Y1_M02 / lag(S_00998_Y1_M02);

lnBeansY1M02 = log(BeansY1M02);

BeansY1M03 = S_00998_Y1_M03 / lag(S_00998_Y1_M03);

lnBeansY1M03 = log(BeansY1M03);

BeansY1M04 = S_00998_Y1_M04 / lag(S_00998_Y1_M04);

lnBeansY1M04 = log(BeansY1M04);

BeansY1M05 = S_00998_Y1_M05 / lag(S_00998_Y1_M05);

lnBeansY1M05 = log(BeansY1M05);

BeansY1M06 = S_00998_Y1_M06 / lag(S_00998_Y1_M06);

lnBeansY1M06 = log(BeansY1M06);

BeansY1M07 = S_00998_Y1_M07 / lag(S_00998_Y1_M07);

lnBeansY1M07 = log(BeansY1M07);

BeansY1M08 = S_00998_Y1_M08 / lag(S_00998_Y1_M08);

lnBeansY1M08 = log(BeansY1M08);

BeansY1M09 = S_00998_Y1_M09 / lag(S_00998_Y1_M09);

lnBeansY1M09 = log(BeansY1M09);

BeansY1M10 = S_00998_Y1_M10 / lag(S_00998_Y1_M10);

lnBeansY1M10 = log(BeansY1M10);

BeansY1M11 = S_00998_Y1_M11 / lag(S_00998_Y1_M11);

lnBeansY1M11 = log(BeansY1M11);

BeansY1M12 = S_00998_Y1_M12 / lag(S_00998_Y1_M12);

lnBeansY1M12 = log(BeansY1M12);

BeansY2M01 = S_00998_Y2_M01 / lag(S_00998_Y2_M01);

lnBeansY2M01 = log(BeansY2M01);

BeansY2M03 = S_00998_Y2_M03 / lag(S_00998_Y2_M03);

lnBeansY2M03 = log(BeansY2M03);

BeansY2M04 = S_00998_Y2_M04 / lag(S_00998_Y2_M04);

lnBeansY2M04 = log(BeansY2M04);

BeansY2M05 = S_00998_Y2_M05 / lag(S_00998_Y2_M05);

lnBeansY2M05 = log(BeansY2M05);

BeansY2M06 = S_00998_Y2_M06 / lag(S_00998_Y2_M06);

lnBeansY2M06 = log(BeansY2M06);

BeansY2M07 = S_00998_Y2_M07 / lag(S_00998_Y2_M07);

lnBeansY2M07 = log(BeansY2M07);

Respected Advisor
Posts: 3,852

Re: Skip Over Missing Variables

Posted in reply to dangoebel

I wanted to a representative sample of the rows and columns in work.DANREG but I think I would rather see a sample of the row and columns in prod_stg.&lastdat  maybe both.

Occasional Contributor
Posts: 14

Re: Skip Over Missing Variables

Posted in reply to data_null__

I have attached an old data file I use for my regression and will paste my entire code below. What I'd like to do is to take the natural log of the daily change for every month but have the proc statement ignore months that do not come through in the data set.

%let lastdat=;

/*Macro that automatically pulls in most recent data*/

proc sql noprint;

select memname into :lastdat from dictionary.tables where

upcase(libname) = 'PROD_STG' and upcase(memname) like 'PRICES_TRANSPOSE_%'

order by input(scan(memname,-1,'_'),date.) desc;


/*taking the natural log of the daily change for each commodity month*/

data work.danreg;

  set prod_stg.&lastdat;

cornY0M10 = CYELLOW_00998_Y0_M10 / lag(CYELLOW_00998_Y0_M10);

lncornY0M10 = log(cornY0M10);

cornY0M11 = CYELLOW_00998_Y0_M11 / lag(CYELLOW_00998_Y0_M11);

lncornY0M11 = log(cornY0M11);

cornY0M12 = CYELLOW_00998_Y0_M12 / lag(CYELLOW_00998_Y0_M12);

lncornY0M12 = log(cornY0M12);

cornY1M01 = CYELLOW_00998_Y1_M01 / lag(CYELLOW_00998_Y1_M01);

lncornY1M01 = log(cornY1M01);

cornY1M02= CYELLOW_00998_Y1_M02 / lag(CYELLOW_00998_Y1_M02);

lncornY1M02 = log(cornY1M02);

cornY1M03 = CYELLOW_00998_Y1_M03 / lag(CYELLOW_00998_Y1_M03);

lncornY1M03 = log(cornY1M03);

cornY1M04 = CYELLOW_00998_Y1_M04 / lag(CYELLOW_00998_Y1_M04);

lncornY1M04 = log(cornY1M04);

cornY1M05 = CYELLOW_00998_Y1_M05 / lag(CYELLOW_00998_Y1_M05);

lncornY1M05 = log(cornY1M05);

cornY1M06 = CYELLOW_00998_Y1_M06 / lag(CYELLOW_00998_Y1_M06);

lncornY1M06 = log(cornY1M06);

cornY1M07 = CYELLOW_00998_Y1_M07 / lag(CYELLOW_00998_Y1_M07);

lncornY1M07 = log(cornY1M07);

cornY1M08 = CYELLOW_00998_Y1_M08 / lag(CYELLOW_00998_Y1_M08);

lncornY1M08 = log(cornY1M08);

cornY1M09 = CYELLOW_00998_Y1_M09 / lag(CYELLOW_00998_Y1_M09);

lncornY1M09 = log(cornY1M09);

cornY1M10 = CYELLOW_00998_Y1_M10 / lag(CYELLOW_00998_Y1_M10);

lncornY1M10 = log(cornY1M10);

cornY1M11 = CYELLOW_00998_Y1_M11 / lag(CYELLOW_00998_Y1_M11);

lncornY1M11 = log(cornY1M11);

cornY1M12 = CYELLOW_00998_Y1_M12 / lag(CYELLOW_00998_Y1_M12);

lncornY1M12 = log(cornY1M12);

cornY2M01 = CYELLOW_00998_Y2_M01 / lag(CYELLOW_00998_Y2_M01);

lncornY2M01 = log(cornY2M01);

cornY2M03 = CYELLOW_00998_Y2_M03 / lag(CYELLOW_00998_Y2_M03);

lncornY2M03 = log(cornY2M03);

cornY2M04 = CYELLOW_00998_Y2_M04 / lag(CYELLOW_00998_Y2_M04);

lncornY2M04 = log(cornY2M04);

cornY2M05 = CYELLOW_00998_Y2_M05 / lag(CYELLOW_00998_Y2_M05);

lncornY2M05 = log(cornY2M05);

cornY2M06 = CYELLOW_00998_Y2_M06 / lag(CYELLOW_00998_Y2_M06);

lncornY2M06 = log(cornY2M06);

cornY2M07 = CYELLOW_00998_Y2_M07 / lag(CYELLOW_00998_Y2_M07);

lncornY2M07 = log(cornY2M07);

cornY2M08 = CYELLOW_00998_Y2_M08 / lag(CYELLOW_00998_Y2_M08);

lncornY2M08 = log(cornY2M08);

cornY2M10 = CYELLOW_00998_Y2_M10 / lag(CYELLOW_00998_Y2_M10);

lncornY2M10 = log(cornY2M10);


/*taking numerous single linear regressions of the front month

against each subsequent month*/

proc reg data=work.danreg outest=work.regoutput edf;

'11/15/14'n:model lncornY0M10 = lncornY0M11/noprint;

'12/15/14'n:model lncornY0M10 = lncornY0M12/noprint;

'01/15/15'n:model lncornY0M10 = lncornY1M01/noprint;

'02/15/15'n:model lncornY0M10 = lncornY1M02/noprint;

'03/15/15'n:model lncornY0M10 = lncornY1M03/noprint;

'04/15/15'n:model lncornY0M10 = lncornY1M04/noprint;

'05/15/15'n:model lncornY0M10 = lncornY1M05/noprint;

'06/15/15'n:model lncornY0M10 = lncornY1M06/noprint;

'07/15/15'n:model lncornY0M10 = lncornY1M07/noprint;

'08/15/15'n:model lncornY0M10 = lncornY1M08/noprint;

'09/15/15'n:model lncornY0M10 = lncornY1M09/noprint;

'10/15/15'n:model lncornY0M10 = lncornY1M10/noprint;

'11/15/15'n:model lncornY0M10 = lncornY1M11/noprint;

'12/15/15'n:model lncornY0M10 = lncornY1M12/noprint;

'01/15/15'n:model lncornY0M10 = lncornY2M01/noprint;

'03/15/15'n:model lncornY0M10 = lncornY2M03/noprint;

'04/15/15'n:model lncornY0M10 = lncornY2M04/noprint;

'05/15/15'n:model lncornY0M10 = lncornY2M05/noprint;

'06/15/15'n:model lncornY0M10 = lncornY2M06/noprint;

'07/15/15'n:model lncornY0M10 = lncornY2M07/noprint;

/*'08/15/15'n:model lncornY0M10 = lncornY2M08/noprint;

this is inactive since it stops the proc step*/

'10/15/15'n:model lncornY0M10 = lncornY2M10/noprint;


data want;

set work.regoutput;

coefficient=coalesce(of lncornY0M11, lncornY0M12, lncornY1M01, lncornY1M02, lncornY1M03, lncornY1M04, lncornY1M05,

lncornY1M06, lncornY1M07, lncornY1M08, lncornY1M09, lncornY1M10, lncornY1M11, lncornY1M12, lncornY2M01, lncornY2M03,

lncornY2M04, lncornY2M05, lncornY2M06, lncornY2M07, lncornY2M08, lncornY2M10);

drop lncornY0M11 lncornY0M12 lncornY1M01 lncornY1M02 lncornY1M03 lncornY1M04 lncornY1M05

lncornY1M06 lncornY1M07 lncornY1M08 lncornY1M09 lncornY1M10 lncornY1M11 lncornY1M12 lncornY2M01 lncornY2M03

lncornY2M04 lncornY2M05 lncornY2M06 lncornY2M07 lncornY2M08 lncornY2M10;


proc print;

VAR _MODEL_ Intercept _RSQ_ coefficient;


Occasional Contributor
Posts: 14

Re: Skip Over Missing Variables

Posted in reply to data_null__

Is there a macro I'd be able to use that would create the natural log of the daily change (currently lnBeans) by what variables are available in the dataset? something like if S_00998_Y0M10 exists create lnBeansY0M10, but have it start at Y0M01 and go though Y2M12? Y0 is current year and M stands for the month within that year if that's any help.

Ask a Question
Discussion stats
  • 6 replies
  • 3 in conversation