DATA Step, Macro, Functions and more

Find columns of dummy variables that correspond to each categorical variable

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Find columns of dummy variables that correspond to each categorical variable

[ Edited ]

Dear All,

 

I am using PROC SYSLIN with dummy variables generated by PROC GLMMOD. However, I must manually intervene as shown below:

 

proc glmmod data=dataset1 outdesign=dummies noprint;

    class customer supplier;     /* Each customer and supplier has a unique number */

    model y = customer supplier; /* Generate dummy variables for customers and suppliers */

run;

 

data dataset2; /* Add dummy variables columns to the data set */

    set dataset1;

    set dummies;

run;

 

/* See which columns are for customers and which columns are for suppliers */

proc contents data=dataset2;

run;

 

/* MANUALLY add customer and suppliers columns */

proc syslin data=dataset2 3sls;

    endogenous y1 y2;

    instruments w z  Col2-Col100 /* assuming 100 customer and supplier dummies */;

customer: model y1 = w Col3 - Col61   /* customer columns only */;

supplier: model y2 = z Col63 - Col100 /* supplier columns only */;

run;

 

My questions is: How can I get the range of columns corresponding to customer and supplier dummies and pass that info as a macro variable to PROC SYSLIN? I do not want to hardcode the column numbers in PROC SYSLIN. Any help will be greatly appreciated.

 

Sincerely,

Cuneyt


Accepted Solutions
Solution
2 weeks ago
SAS Super FREQ
Posts: 4,241

Re: Passing output from DATA step to a PROC

In addition to the OUTDESIGN= option, which writes the dummy variables to a data set, you will want to use the OUTPARM= option, which outputs a table that connects the levels of the original variable to the names of the dummy variables. For an example, see the call to PROC GLMMOD in the article "Create dummy variables in SAS."  You can use a DATA step or PROC SQL to extract the column numbers for the CUSTOMER and SUPPLIER variables. 

 

For example, the following uses the SEX and BP_STATUS variables in the Sashelp.Heart data.  I am not familiar with PROC SYSLIN, but I use PROC MEANS to count the number of levels in each dummy variable.

 

proc glmmod data=sashelp.heart outdesign=GLMDesign outparm=GLMParm noprint;
   class sex BP_Status;
   model Cholesterol = Sex BP_Status;
run;
 
data _NULL_;
set GLMParm(where=(EFFNAME^="Intercept"));
by notsorted EFFNAME;
if first.EFFNAME then do;
   macroName = catt(EFFNAME, "Min");
   call symputx(macroName, _COLNUM_);
end;
if last.EFFNAME then do;
   macroName = catt(EFFNAME, "Max");
   call symputx(macroName, _COLNUM_);
end;
run;
 
%put &=SexMin;       /* first column for SEX */
%put &=SexMax;       /* last column for SEX */
%put &=BP_StatusMin; /* first column for BP_STATUS */
%put &=BP_StatusMax; /* last column for BP_STATUS */

proc means data=GLMDesign sum;
var COL&SexMin - COL&SexMax
    COL&BP_StatusMin - COL&BP_StatusMax;
run;

/* should give the same counts as PROC FREQ */
/*
proc freq data=Sashelp.Heart(where=(Cholesterol^=.));
tables Sex BP_Status;
run;
*/

View solution in original post


All Replies
Solution
2 weeks ago
SAS Super FREQ
Posts: 4,241

Re: Passing output from DATA step to a PROC

In addition to the OUTDESIGN= option, which writes the dummy variables to a data set, you will want to use the OUTPARM= option, which outputs a table that connects the levels of the original variable to the names of the dummy variables. For an example, see the call to PROC GLMMOD in the article "Create dummy variables in SAS."  You can use a DATA step or PROC SQL to extract the column numbers for the CUSTOMER and SUPPLIER variables. 

 

For example, the following uses the SEX and BP_STATUS variables in the Sashelp.Heart data.  I am not familiar with PROC SYSLIN, but I use PROC MEANS to count the number of levels in each dummy variable.

 

proc glmmod data=sashelp.heart outdesign=GLMDesign outparm=GLMParm noprint;
   class sex BP_Status;
   model Cholesterol = Sex BP_Status;
run;
 
data _NULL_;
set GLMParm(where=(EFFNAME^="Intercept"));
by notsorted EFFNAME;
if first.EFFNAME then do;
   macroName = catt(EFFNAME, "Min");
   call symputx(macroName, _COLNUM_);
end;
if last.EFFNAME then do;
   macroName = catt(EFFNAME, "Max");
   call symputx(macroName, _COLNUM_);
end;
run;
 
%put &=SexMin;       /* first column for SEX */
%put &=SexMax;       /* last column for SEX */
%put &=BP_StatusMin; /* first column for BP_STATUS */
%put &=BP_StatusMax; /* last column for BP_STATUS */

proc means data=GLMDesign sum;
var COL&SexMin - COL&SexMax
    COL&BP_StatusMin - COL&BP_StatusMax;
run;

/* should give the same counts as PROC FREQ */
/*
proc freq data=Sashelp.Heart(where=(Cholesterol^=.));
tables Sex BP_Status;
run;
*/
Occasional Contributor
Posts: 10

Re: Passing output from DATA step to a PROC

Hi Rick,

 

Thank you very much. This solution worked like a charm!

 

Sincerely,

Cuneyt

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 56 views
  • 0 likes
  • 2 in conversation