Hi all,
I have a variable called drugs found, and it's in character text with a lot of text per line. I'd like a count of each type of drug found per ID.
Data have;
studyid drugs_found
5 Methamphine, amphetamines, cocaine
6 none
7 THC
data want;
studyid Methamphine Cannabis Amphetamines Cocaine No_drugs
5 1 0 1 1 0
6 0 0 0 0 1
7 0 1 0 0 0
1. Split into a long format using SCAN()
2. Use previously shown methods, from your recent questions, to transpose and create your indicators.
data have;
infile cards dlm='09'x;
length drugs_found $50.;
input studyid $ drugs_found $;
cards;
5 Methamphine, amphetamines, cocaine
6 none
7 THC
;;;;
run;
data long;
set have;
nTerms = countw(drugs_found) ;
do i=1 to nTerms;
word = scan(drugs_found, i, ",");
put word;
if word in ("THC", "CBD") then word = "Cannabis";
else if word in ("none") then call missing(word);
output;
end;
run;
Now use the similar methods shown. If you need to recode values, ie THC/CBD to Cannabis you can add that in the data step above and then transpose your data.
I don't see "Cannabis" in the input data.
Also, are these drugs in the input data always going to be spelled identically each time? Do you know in advance what the possible drugs are, or does the program have to figure it out? Does the final table always have values zero and 1, or could there be integers 2, 3, ...?
It says THC in the input data, thus I wanted it recorded as CANNABIS and yes, I'd like it to be recorded as dummy variables with 0 and 1.
I included a screenshot of what it looks like, because it does not always look the same and some of the words are cut-off.
@stancemcgraw wrote:
It says THC in the input data, thus I wanted it recorded as CANNABIS and yes, I'd like it to be recorded as dummy variables with 0 and 1.
I included a screenshot of what it looks like, because it does not always look the same and some of the words are cut-off.
Ok, thank you, but I asked other questions that you didn't answer.
1. Split into a long format using SCAN()
2. Use previously shown methods, from your recent questions, to transpose and create your indicators.
data have;
infile cards dlm='09'x;
length drugs_found $50.;
input studyid $ drugs_found $;
cards;
5 Methamphine, amphetamines, cocaine
6 none
7 THC
;;;;
run;
data long;
set have;
nTerms = countw(drugs_found) ;
do i=1 to nTerms;
word = scan(drugs_found, i, ",");
put word;
if word in ("THC", "CBD") then word = "Cannabis";
else if word in ("none") then call missing(word);
output;
end;
run;
Now use the similar methods shown. If you need to recode values, ie THC/CBD to Cannabis you can add that in the data step above and then transpose your data.
Thank you I always forget the going from wide to long step, you're right. You've really helped my day, thanks so much!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.