Hi all,
I'm trying to transpose my long data to wide data, but both variables are character variables. I'd like to count the number of drugs identified per study-ID. Some of the study-IDs also have more than 1 drug found. I also want to create a variable that is the total number of drugs per study-ID after its been transposed:
Data have:
Study-ID Drugs_found
ID-1-RMC Opioid
ID-2-RMC None
ID-3-RMC Benzo
ID-4-RMC Cocaine
ID-4-RMC Opioid
ID-5-RMC Meth
Data want:
Study-ID OPIOIDS METH NONE COCAINE BENZOS Total_drugs
ID-1-RMC 1 0 0 0 0 1
ID-2-RMC 0 0 0 0 0 0
ID-3-RMC 0 0 0 0 1 1
ID-4-RMC 1 0 0 1 0 2
ID-5-RMC 0 1 0 0 0 1
data long ;
set have;
count = 1;
run;
proc transpose data=long out=wide prefix=DRUG_;
by studyID;
ID Drugs_Found;
IDLABEL drugs_found;
var Count;
run;
data want;
set wide;
array _drugs(*) drugs_:;
do i=1 to dim(_drugs);
_drugs (i) = coalesce(_drugs(i), 0);
end;
#calculate total drugs;
total_drugs = sum(of drugs_:);
run;
PROC TRANSPOSE works well. If all you need is the output though, PROC TABULATE or PROC REPORT maybe a better option.
data long ;
set have;
count = 1;
run;
proc transpose data=long out=wide prefix=DRUG_;
by studyID;
ID Drugs_Found;
IDLABEL drugs_found;
var Count;
run;
data want;
set wide;
array _drugs(*) drugs_:;
do i=1 to dim(_drugs);
_drugs (i) = coalesce(_drugs(i), 0);
end;
#calculate total drugs;
total_drugs = sum(of drugs_:);
run;
PROC TRANSPOSE works well. If all you need is the output though, PROC TABULATE or PROC REPORT maybe a better option.
Thanks Reeza,
The first part worked, but the array came back in my log as the following error:
74 data want;
475 set wide;
476 array _drugs(*) drugs_:;
WARNING: Defining an array with zero elements.
477 do i=1 to dim(_drugs);
478 _drugs (i) = coalesce(_drugs(i), 0);
479 end;
480
481 **#calculate total drugs;
482 total_drugs = sum(of drugs_:);
---
71
ERROR 71-185: The SUM function call does not have enough arguments.
483
484 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 15 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds
@stancemcgraw wrote:
Thanks Reeza,
The first part worked, but the array came back in my log as the following error:
74 data want;
475 set wide;
476 array _drugs(*) drugs_:;
WARNING: Defining an array with zero elements.
477 do i=1 to dim(_drugs);
478 _drugs (i) = coalesce(_drugs(i), 0);
479 end;
480
481 **#calculate total drugs;
482 total_drugs = sum(of drugs_:);
---
71
ERROR 71-185: The SUM function call does not have enough arguments.
483
484 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 0
observations and 15 variables.
WARNING: Data set WORK.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds
This bit of your code:
476 array _drugs(*) drugs_:; WARNING: Defining an array with zero elements.
almost certainly means that your data set Wide has no variables whose names start with "drugs_".
And no existing variables would be the cause of the error using the Sum function to sum the nonexistent.
It is best to provide data in form of a data step.
Try this:
data have; input StudyID $ Drugs_found $; datalines; ID-1-RMC Opioid ID-2-RMC None ID-3-RMC Benzo ID-4-RMC Cocaine ID-4-RMC Opioid ID-5-RMC Meth ; proc summary data=have nway completetypes; class StudyID drugs_found ; output out=work.summary (drop=_type_ rename=(_freq_=count)) ; run; proc transpose data=work.summary out=work.trans (drop=_name_); by StudyID; var count; id drugs_found; run;
The summary step COMPLETETYPES makes sure that all of the combinations of the class variables that appear are in the output data. The NWAY option only gives us the summaries of both class variables, other wise there would also be summaries overall and for each of the variables only, which is useful sometimes.
Without any VAR variable all we get is a _freq_ variable that I renamed count just for fun.
The default behavior of Proc Summary will sort by the values of the class variables so the output is sorted by StudyId and ready for Proc transpose.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.