I am trying to find observations that contain one word in a string but NOT another.
For example, my observations look something like this:
Diabetes mellitus
Screening for diabetes mellitus
Type II diabetes
History of diabetes
Family history of type II diabetes
Encounter for screening of diabetes mellitus
I want to return observations that contain 'diabetes' or 'Diabetes' but NOT 'Screening', 'screening', 'History', or 'history'.
I tried what I thought was the most intuitive way to do this:
proc freq data=data;
tables diagnosis;
where find(diagnosis, 'diabetes') or find(diagnosis, 'Diabetes') and not find(diagnosis, 'History') or find(diagnosis, 'screening');
run;
But obviously this did not work: it returned observations that DO contain History and Screening, ignoring the 'not'. Is the FIND function the best way to do this?
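data want;
  set have;
  /* Keep rows that mention diabetes but neither history nor screening; the 'i' modifier ignores case. */
  if find(diagnosis,'diabetes','i')>0
     and find(diagnosis,'history','i')=0
     and find(diagnosis,'screening','i')=0;
run;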
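As for why the original WHERE clause kept the History and Screening rows: SAS evaluates NOT before AND, and AND before OR, so the unparenthesized condition is read as find('diabetes') OR (find('Diabetes') AND NOT find('History')) OR find('screening'). Any row containing lowercase 'diabetes' or 'screening' therefore passes, whatever else it contains. Here is a minimal sketch of the same PROC FREQ with explicit parentheses. It assumes the variable is named diagnosis as in the question, and it builds a small test dataset called have from the example values in the question, purely for illustration.

```sas
/* Illustrative test data built from the example values in the question. */
data have;
  infile datalines truncover;
  input diagnosis $60.;
  datalines;
Diabetes mellitus
Screening for diabetes mellitus
Type II diabetes
History of diabetes
Family history of type II diabetes
Encounter for screening of diabetes mellitus
;
run;

proc freq data=have;
  tables diagnosis;
  /* Parentheses make the exclusion apply to both unwanted words. */
  /* The 'i' modifier makes FIND ignore case, so one test covers   */
  /* 'diabetes'/'Diabetes', 'history'/'History', and so on.        */
  where find(diagnosis, 'diabetes', 'i') > 0
    and not (find(diagnosis, 'history', 'i') > 0
          or find(diagnosis, 'screening', 'i') > 0);
run;
```

With this sample data, only 'Diabetes mellitus' and 'Type II diabetes' should remain in the frequency table.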