Programming the statistical procedures from SAS

Possible to create dummy variables with proc transpose?

Reply
Frequent Contributor
Posts: 81

Possible to create dummy variables with proc transpose?

Is it possible to create dummy variables with proc transpose using the prefix option and id statement?

 

Using the prefix=race_ and prefix=hospital_ options and id statement, I've created the proper dummy variable names but I can't seem to assign the proper values to the new variables.

 

I'd like the following dataset:

 

data test;
input membno race $ hospital $10.;
cards;
1 white capital
2 black capital
3 hispanic capital
4 asian capital
5 native stmary
6 white stmary
7 black stmary
;
run;

 

 

To be transformed as follows:

membno race_asian race_black race_hispanic race_native race_white hospital_capital hospital_stmary
1
2
3
4
5
6
7

where each member # is assigned the correct race and hospital with a value of some kind (for example, values of "." and " " in place of 1 and 0 would be fine).

 

 

Thanks

 

 

Super User
Posts: 18,514

Re: Possible to create dummy variables with proc transpose?

In your data step add a variable, either a one or a letter that is constant and use that in the VAR statement in your proc transpose.

 

data have;

set have;

Constant='X';

run;

 

I don't know that you can do both variables in a single proc transpose unfortunately. 

Frequent Contributor
Posts: 81

Re: Possible to create dummy variables with proc transpose?

Thanks Reeza - like this?

 

data test;
input membno race $ hospital $10.;
constant=1;
cards;
1 white capital
2 black capital
3 hispanic capital
4 asian capital
5 native stmary
6 white stmary
7 black stmary
;
run;

proc sort data=test;
	by race;
run;
proc transpose data=test out=test_trans prefix=race_;
	by race;
	id race;
	var constant;
run;

 

The output dataset still isn't quite right, I'll fiddle around some more.

Respected Advisor
Posts: 4,745

Re: Possible to create dummy variables with proc transpose?

Here is how to do this with proc glmmod :

 


proc glmmod data=test outdesign=dum outparm=parm;
class race hospital;
model membno = race hospital;
run;

data _null_;
if _n_=1 then 
    call execute("data want; set dum; drop col1;");
set parm(firstobs=2) end=done;
call execute(cats("rename col",_colnum_,"=",effname,"_",race,hospital,";"));
if done then call execute("run;");
run;
PG
Frequent Contributor
Posts: 81

Re: Possible to create dummy variables with proc transpose?

Thanks PGStats, very slick!

Respected Advisor
Posts: 4,745

Re: Possible to create dummy variables with proc transpose?

Added a few bells and whistles...

 

proc glmmod data=test outdesign=want outparm=parm noprint;
class race hospital;
model membno = race hospital / noint;
run;

data _null_;
if _n_=1 then 
    call execute("proc datasets noprint; modify want;");
set parm;
call execute(cats("label col", _colnum_, "='",
    propcase(effname), "=",
    propcase(race), propcase(hospital), "';"));
call execute(cats("rename col", _colnum_, "=", effname, "_",
    race, hospital, ";"));
run;
quit;

Note: the quit; statement at the end terminates the datasets procedure inserted with the execute calls.

PG
SAS Super FREQ
Posts: 3,542

Re: Possible to create dummy variables with proc transpose?

Slick, indeed.

 

But be careful to look at your data before applying this technique. In general the levels of the categorical variables might not form valid SAS variable names. For example, if race="African American" or hospital="St. Mary's", then this automated step will fail and you will need to write the RENAME= statements manually.

 

If you do go with PGStats's automatied method, I suggest making the QUIT statement part of the CALL EXECUTE chain, as he originally suggested. I was confused by having the dangling QUIT at the end.

 

data _null_;

...

set parm end=done;

...

if done then call execute("quit;");

run;

Frequent Contributor
Posts: 81

Re: Possible to create dummy variables with proc transpose?

Agreed, good points.

 

I'm going to hold out on designating a correct answer for a bit longer & see if there's a sneaky way to use proc transpose to create the dummy variables.

SAS Employee
Posts: 51

Re: Possible to create dummy variables with proc transpose?

Hello,

 

No PROC TRANSPOSE solution, sorry.

But with PROC GLMSELECT (unlike GLMMOD) you get the right (design-) variable names immediatly (no renaming needed)!

 

ods html close;

ods preferences;

ods html;

 

proc glmselect data=sashelp.prdsale namelen=50

               OUTDESIGN(ADDINPUTVARS FULLMODEL)=work.DesignVariables;

class        country region division ProdType Product / param=reference ShowCoding split;

model ACTUAL=country region division ProdType Product / /* STEPS=1 */ STOP=1;

output out=new;

run;

QUIT;

/* end of program */

 

Fit an intercept-only model (you must build a dummy model to get the design variables) for whatever target variable.

 

Cheers,

Koen

Frequent Contributor
Posts: 81

Re: Possible to create dummy variables with proc transpose?

Thank you Koen -

 

So running the following code:

 

data test;
input membno y race $ hospital $10.;
cards;
1 0 white capital
2 1 black capital
3 1 hispanic capital
4 0 asian capital
5 0 native stmary
6 1 white stmary
7 0 black stmary
;
run;

ods html close;
ods preferences;
ods html;
proc glmselect data=test namelen=50 OUTDESIGN(ADDINPUTVARS FULLMODEL)=work.DesignVariables;
	class race hospital / param=reference ShowCoding split;
	model y = / /* STEPS=1 */ STOP=1;
	output out=new;
run;
QUIT;

 should generate the dummy coded variables in the "new" dataset, is that correct?

 

SAS Employee
Posts: 51

Re: Possible to create dummy variables with proc transpose?

Hello RobF,

 

You forgot to specify the categorical regressors (class variables) in the right hand side of the model equation.

The dummy variables can be found in the dataset work.DesignVariables!! (I replaced new by _NULL_)

Take care: for p categories you get only p-1 dummies (the alphabetically last class level has no dummy and a 0-value for the p-1 created dummies = reference coding with default reference level). Maybe another parametrization scheme exists that does over-parametrization (p dummies for p categories as PROC GLM does)? You should have a look to the possibilities of the param= option in the CLASS statement of PROC GLMSELECT.

 

Here's the corrected code:

data test;

input membno y race $ hospital $10.;

cards;

1 0 white capital

2 1 black capital

3 1 hispanic capital

4 0 asian capital

5 0 native stmary

6 1 white stmary

7 0 black stmary

;

run;

ods html close;

ods preferences;

ods html;

proc glmselect data=test namelen=50 OUTDESIGN(ADDINPUTVARS FULLMODEL)=work.DesignVariables;

class race hospital / param=reference ShowCoding split;

model y = race hospital / /* STEPS=1 */ STOP=1;

output out=_NULL_;

run;

QUIT;

/* end of program */

 

 

Good luck,

Koen

 

Respected Advisor
Posts: 4,745

Re: Possible to create dummy variables with proc transpose?

Good to know, thanks! Playing with the code, it seems that good results are obtained with:

 


proc glmselect data=test namelen=50 noprint
    OUTDESIGN(FULLMODEL)=want;
class race hospital / param=glm;
model membno = race hospital / include=99 noint;
run;
PG
SAS Super FREQ
Posts: 3,542

Re: Possible to create dummy variables with proc transpose?

I think you'll be waiting a long time. PROC TRANSPOSE is not designed to create dummy variables.  It doesn't create new (0/1) data; it rearranges existing data.

Frequent Contributor
Posts: 81

Re: Possible to create dummy variables with proc transpose?

Thanks for the suggestions, gentlemen - now which response to declare as the official solution?

 

Any one of the solutions would be worth a presentation at the next World SAS Forum if it hasn't already been covered. Smiley Happy

Ask a Question
Discussion stats
  • 13 replies
  • 922 views
  • 9 likes
  • 5 in conversation