I am trying to find observations that contain one word in a string but NOT another.
For example, my observations look something like this:
Diabetes mellitus
Screening for diabetes mellitus
Type II diabetes
History of diabetes
Family history of type II diabetes
Encounter for screening of diabetes mellitus
I want to return observations that contain 'diabetes' or 'Diabetes' but NOT 'Screening', 'screening', 'History', or 'history'.
I tried what seemed like the most intuitive way to do this:
proc freq data=data;
tables diagnosis;
where find(diagnosis, 'diabetes') or find(diagnosis, 'Diabetes') and not find(diagnosis, 'History') or find(diagnosis, 'screening');
run;
but obviously, this did not work and returned observations that DO contain History and Screening, ignoring the 'not.' Is the find function the best way to do this?
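data want;
  set have;
  /* Subsetting IF: keep rows that mention diabetes but neither history nor screening. */
  /* The 'i' modifier makes FIND ignore case. */
  if find(diagnosis,'diabetes','i')>0
     and find(diagnosis,'history','i')=0
     and find(diagnosis,'screening','i')=0;
run;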
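As for why the original WHERE clause let History and Screening rows through: SAS evaluates NOT before AND, and AND before OR, so the condition was read as find('diabetes') OR (find('Diabetes') AND NOT find('History')) OR find('screening'). Any row containing lowercase 'diabetes' therefore passed regardless of the other terms. Below is a minimal sketch of the same PROC FREQ with parentheses and the 'i' modifier. The dataset name have and the sample rows are assumptions, taken from the examples in the question.

```sas
/* Sample data copied from the question (assumed layout: one text column) */
data have;
  infile datalines truncover;
  input diagnosis $60.;
  datalines;
Diabetes mellitus
Screening for diabetes mellitus
Type II diabetes
History of diabetes
Family history of type II diabetes
Encounter for screening of diabetes mellitus
;
run;

proc freq data=have;
  tables diagnosis;
  /* Parentheses make NOT apply to both exclusions; 'i' ignores case */
  where find(diagnosis, 'diabetes', 'i') > 0
    and not (find(diagnosis, 'history', 'i') > 0
          or find(diagnosis, 'screening', 'i') > 0);
run;
```

With these six sample rows, only 'Diabetes mellitus' and 'Type II diabetes' should remain in the frequency table. FIND is a reasonable choice here; the fix is in the grouping of the conditions rather than in the function.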