I really cannot figure this one out...
I have 15 columns (drug1-drug15) and I need to combine select values from one column with a value from another column (this will be different combinations for every row, and only for select rows, and some rows will need to combine multiple values).
Here is what I have compared to what I need. For example, for Observation 1 I need to combine 'carboxy-thc' from drug1 and 'thc' from drug2 and enter the new value 'thc/carboxy-thc' to drug1 and leave drug2 blank. I think I need to use an array, but I do not have experience using them.
HAVE | |||||
Obs | drug1 | drug2 | drug3 | drug4 | … |
1 | carboxy-thc | thc | |||
2 | benzoylecgonine | cocaethylene | cocaine | ||
3 | carboxy-thc | diazepam | nordiazepam | thc | |
4 | amphetamines | methamphetamine | midazolam | ||
5 | fluoxetine | norfluoxetine | thc | ||
NEED | |||||
Obs | drug1 | drug2 | drug3 | drug4 | … |
1 | thc/carboxy-thc | ||||
2 | cocaine/benzoylecgonine/cocaethlene | ||||
3 | thc/carboxy-thc | diazepam/nordiazepam | |||
4 | methamphetamines/amphetamines | midazolam | |||
5 | fluoxetine/norfluoxetine | thc | |||
Any help is much appreciated! Thank you!
That might be the result of transposing.
But the drug dictionary information you need would look more like:
data drugs ;
input group drug :$30. ;
cards;
1 thc
1 carboxy-thc
2 cocaine
2 benzoylecgonine
2 cocaethlene
3 diazepam
3 nordiazepam
4 methamphetamines
4 amphetamines
5 fluoxetine
5 norfluoxetine
;
So if HAVE contains an ID variable to unique identify the observations and DRUG1 to DRUG15 then you could use PROC TRANSPOSE to first convert the current HAVE dataset from 15 variable to 15 observations.
proc transpose data=have out=tall (drop=_name_ rename=(col1=drug));
by id;
var drug1-drug15;
run;
Now combine that with the drug dictionary and re-order.
proc sql ;
create table tall_groups as
select distinct a.id, b.group, a.drug
from tall a
inner join drugs b
on a.drug=b.drug
order by a.id, b.group, a.drug
;
quit;
And finally you can collapse the multiple drugs for a group into one observation.
data tall_concat ;
do until(last.group);
set tall_group
by id group;
length drug_group $100 ;
drug_group=catx('/',drug_group,drug);
end;
run;
You will need additional information you have not provided to actually be able to code this. In particular how did you know that "thc" and "carboxy-thc" should be combined? Why not combine carboxy-thc with diazepam?
Anyway the first thing you probably need to do is transpose the 15 drug variables into one drug variable with up to 15 observations. Then once you have mapped the drug names to something (drug class? drug generic name?) that will allow you to combine them you can group them and collapse the similar names into one observations. You could then convert that back into the wide format if you need it for a report.
Thanks for the feedback.
The drug pairing are based on parent-metabolite associations. So 'carboxy-thc' is the metabolite of 'thc'. So I want to combine the parent-metabolite drugs into a single drug.
Is this what you are describing?
Obs | drug |
1 | carboxy-thc |
1 | thc |
2 | benzoylecgonine |
2 | cocaethylene |
2 | cocaine |
3 | carboxy-thc |
3 | diazepam |
3 | nordiazepam |
3 | thc |
4 | amphetamines |
4 | methamphetamine |
4 | midazolam |
That might be the result of transposing.
But the drug dictionary information you need would look more like:
data drugs ;
input group drug :$30. ;
cards;
1 thc
1 carboxy-thc
2 cocaine
2 benzoylecgonine
2 cocaethlene
3 diazepam
3 nordiazepam
4 methamphetamines
4 amphetamines
5 fluoxetine
5 norfluoxetine
;
So if HAVE contains an ID variable to unique identify the observations and DRUG1 to DRUG15 then you could use PROC TRANSPOSE to first convert the current HAVE dataset from 15 variable to 15 observations.
proc transpose data=have out=tall (drop=_name_ rename=(col1=drug));
by id;
var drug1-drug15;
run;
Now combine that with the drug dictionary and re-order.
proc sql ;
create table tall_groups as
select distinct a.id, b.group, a.drug
from tall a
inner join drugs b
on a.drug=b.drug
order by a.id, b.group, a.drug
;
quit;
And finally you can collapse the multiple drugs for a group into one observation.
data tall_concat ;
do until(last.group);
set tall_group
by id group;
length drug_group $100 ;
drug_group=catx('/',drug_group,drug);
end;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.