Solved: Re: Organizing wide string data into binary columns

Lefty · Posted 09-09-2022 04:21 PM

Hi, I am using SAS 9.4. I have wide data of all the drugs used by patients that looks like this:

ID	Drug1	Drug2	Drug3	Drug4	Drug5	Drug6	Drug7
1	Ceftazidme	Ciprofloxacin
2	Ceftazidme	Colistin	Vancomycin
3	Cefazolin	Ceftazidme	Meropenem
4	Cefepime	Minocycline	Colistin
…

I would like to be able to tell how many people used each type of drug no matter which column the drugs are in, which is hard to do with the way my data are now (e.g., I want to know that 3 people used Ceftazidme, 2 people used Colistin, etc).

I would like my data to look like this:

ID	Ceftazidme	Ciprofloxacin	Colistin	Vancomycin	Cefazolin	Meropenem	Cefepime	Minocycline	…	Drug45
1	1	1	0	0	0	0	0	0
2	1	0	1	0	0	0	0	0
3	1	0	0	0	1	1	0	0
4	0	0	1	0	0	0	1	1
…

Although each ID had a maximum of 7 drugs, there are 45 unique drug names among all the IDs, so it's a little unwieldy to write out arrays for each drug name.

Any ideas would be great! Thanks in advance.

PeterClemmensen · Posted 09-09-2022 04:46 PM

Try this

data have;
input ID (Drug1 - Drug7)(:$20.);
infile datalines missover;
datalines;
1 Ceftazidme Ciprofloxacin        
2 Ceftazidme Colistin Vancomycin  
3 Cefazolin Ceftazidme Meropenem  
4 Cefepime Minocycline Colistin   
; 

data temp(keep = ID v drug);
   set have;
   array d Drug1 - Drug7;
   do over d;
      drug = d;
      v    = 1;
      if drug ne '' then output;
   end;
run;

proc transpose data = temp out = want(drop = _:);
   by ID;
   id drug;
   var v;
run;

The DATA to DATA Step Macro
Blog: SASnrd

View solution in original post

PeterClemmensen · Posted 09-09-2022 04:46 PM

Try this

data have;
input ID (Drug1 - Drug7)(:$20.);
infile datalines missover;
datalines;
1 Ceftazidme Ciprofloxacin        
2 Ceftazidme Colistin Vancomycin  
3 Cefazolin Ceftazidme Meropenem  
4 Cefepime Minocycline Colistin   
; 

data temp(keep = ID v drug);
   set have;
   array d Drug1 - Drug7;
   do over d;
      drug = d;
      v    = 1;
      if drug ne '' then output;
   end;
run;

proc transpose data = temp out = want(drop = _:);
   by ID;
   id drug;
   var v;
run;

The DATA to DATA Step Macro
Blog: SASnrd

Lefty · Posted 09-09-2022 04:48 PM

Amazing! Works perfectly. Thank you!!

HB · Posted 09-09-2022 06:26 PM

I would argue you actually don't want your data in your binary column arrangement.

Add a new drug and you have to add a new column.

I would argue more normalized data like:

ID	Drug	Date or dosage or whatever
1	Ceftazidme
1	Ciprofloxacin
2	Ceftazidme
2	Colistin
2	Vancomycin
3	Cefazolin
3	Ceftazidme
3	Meropenem
4	Minocycline
4	Colistin

which is essentially the format of the TEMP table in the @PeterClemmensen code would you do you more favors in the long run in terms of querying.

proc sql;
	select drug, count(id) as "Number of IDs Taking Drug"n
	from temp
	group by drug;
quit;

drug	Number of IDs Taking Drug
Cefazolin	1
Cefepime	1
Ceftazidme	3
Ciprofloxacin	1
Colistin	2
Meropenem	1
Minocycline	1
Vancomycin	1

Organizing wide string data into binary columns

Re: Organizing wide string data into binary columns

Re: Organizing wide string data into binary columns

Re: Organizing wide string data into binary columns

Re: Organizing wide string data into binary columns

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away