Create dummy variable across multiple columns

Reply
New Contributor
Posts: 3

Create dummy variable across multiple columns

I'm using SAS University Edition and I need to create dummy variables for my data, but the data I'm using is spread across multiple columns.  Each row is a patient's hospital visit.  I would like to create a dummy variable for each individual diagnosis code.  

 

Here's an example of what my data looks like:

have.PNG

 

And here's what I want it to look like in the end:

want.PNG

 

Another complicating matter is that there are hundreds of Diagnosis columns so I would prefer a way that doesn't require typing them all out.  Can anyone offer advice?

Super User
Posts: 6,785

Re: Create dummy variable across multiple columns

[ Edited ]

Hundreds of names isn't a problem.  But some other issues are.  For example, what do these diagnosis codes look like?  Are they valid names for variables in SAS?  Perhaps they begin with a number, or perhaps they contain illegal characters such as a decimal point.  Or perhaps they contain so many characters that they would exceed SAS's limit of 32 characters for a name.  Will the spelling be consistent from one instance to the next?

 

If they start with a number, how would you like to handle that?  Could the word "Diag_" be added as the first part of the name?

New Contributor
Posts: 3

Re: Create dummy variable across multiple columns

Posted in reply to Astounding

The diagnosis codes are valid names for variables-- they look like K8001, S722XA, etc. and they would not exceed the limit of 32 characters.  They all start with a letter but are a combinations of letters and numbers after that.  The spelling will be consistent as well.  If it helps, they are ICD-10-CM diagnosis codes.

Super User
Posts: 6,785

Re: Create dummy variable across multiple columns

[ Edited ]

That's a good start. 

 

To process the data, then, begin by getting lists of all possible diagnosis codes:

 

proc sql;

   create table diag1 as select distinct diagnosis1 as diagnosis from have order by diagnosis;

   create table diag2 as select distinct diagnosis2 as diagnosis from have order by diagnosis;

   create table diag3 as select distinct diagnosis3 as diagnosis from have order by diagnosis;

quit;

 

data all_diags;

merge diag1 diag2 diag3;

by diagnosis;

if diagnosis > ' ';

run;

 

This gives you a unique list of all codes found in the data.

 

Next, create a macro variable holding those codes:

 

proc sql;

select diagnosis into : all_codes separated by ' ' from all_diags;

quit;

 

%put &all_codes;

 

You don't need the %PUT statement, but it might help illustrate what is happening at this point.

 

Finally, use that list in a DATA step and search for matches.  Assuming 6 characters are needed for the longest code:

 

data want;

set have;

array diags {*} $ 6 &all_codes;

do _n_ = 1 to dim(diags);

   diags{_n_} = 0;

   if diagnosis1 = vname(diags{_n_}) then diags{_n_} = 1;

   if diagnosis2 = vname(diags{_n_}) then diags{_n_} = 1;

   if diagnosis3 = vname(diags{_n_}) then diags{_n_} = 1;

end;

run;

 

I'm assuming here that you really have just 3 diagnoses per hospital stay.  You could always add a few more to the program if need be.

 

The code is untested.  It looks right, so see if it works for what you need.

 

New Contributor
Posts: 3

Re: Create dummy variable across multiple columns

Posted in reply to Astounding

All of this seems to work up until the last section of code.  Then I get the following error:

 

Screen Shot 2018-07-02 at 9.03.34 AM.png

 

Do you know what is wrong or how to get around this error?

Super User
Posts: 10,784

Re: Create dummy variable across multiple columns

data have;
input id (d1-d3) ($);
cards;
1 A B C
2 Y B D
3 Z B .
;
run;

data temp;
 set have;
 d=d1;v=1;output;
 d=d2;v=1;output;
 d=d3;v=1;output;
run;

proc transpose data=temp(where=(d is not missing)) out=temp1;
by id;
id d;
var v;
run;
proc stdize data=temp1 out=want reponly missing=0;
var _numeric_;
run;
Ask a Question
Discussion stats
  • 5 replies
  • 115 views
  • 0 likes
  • 3 in conversation