I apologize if a solution has already been presented for a similar question. I have tried using both of the below codes to detect specified strings of text and count the number of occurrences in a new column. The issue I am having is drugs like 'amphetamine' are still being counted for occurrences of 'methamphetamine'. Is there an alternative code that uses leading characters to exclude a string from being counted? I am not able to share the exact data but have included an example of how the 'identified' column appears and how substances are listed. I am using SAS 9.4.
Sample Code 1:
data comb3;
set drugseizures;
if index(identified, 'methylenedioxymethamphetamine')> 0 then MDMA = 1;
else if index(identified, 'methylenedioxyamphetamine')>0 then MDA = 1;
else if index(identified, 'methamphetamine')>0 then methamphetamine = 1;
else if index(identified, 'amphetamine')>0 then amphetamine = 1;
run;
Sample Code 2:
proc sql;
create table comb5 as
select
*
,case when identified contains 'amphetamine' and identified not like 'methamphetamine' then 1 else 0 end as amphetamine
,case when identified contains 'cannabidiol' and identified not like '%ocannabinol' then 1 else 0 end as cannabidiol
,case when (identified contains 'fentanyl') and (identified not like '%acetyl fentanyl%' or identified not like '%furanyl fentanyl%' or identified not like '%carfentanil%') then 1 else 0 end as fentanyl
,case when identified contains 'aspirin' then 1 else 0 end as aspirin
,case when identified contains 'methadone' then 1 else 0 end as methadone
,case when identified contains 'methylphenidate' then 1 else 0 end as methylphenidate
from drugseizures
;
Have you tried INDEXW instead of INDEX?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.