I am trying to create categories from open-ended questions. What is the best way to go about this if I want to use full strings? Is there a way to do this without identifying position or case (e.g. lowercase)? I have attached code so you can get an idea of what I am trying to do; I am sure the INDEX function is probably not the best way to go about this. I basically want to say, "if a response in var contains this word and/or string, then var=1 ..."
SAS base 9.4
Two things to consider ...
You are probably better off using INDEXW rather than INDEX. INDEX looks for the characters anywhere in the string, while INDEXW matches only whole words. If you search for "dense" and your longer string contains "condense", INDEX will find "dense", but INDEXW will not.
And it's easy to apply UPCASE to both arguments with INDEXW so matches can be found regardless of upper vs. lower case.
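To illustrate the difference, here is a minimal sketch. The data set name demo, the variable names, and the two sample responses are made up for the example; they are not from the original post.

```sas
/* Hypothetical sample data: two made-up responses */
data demo;
  length response $60;
  input response $char60.;

  /* INDEX finds the characters anywhere, even inside a longer word */
  hit_index  = index(upcase(response), 'DENSE') > 0;

  /* INDEXW finds them only when they stand alone as a whole word */
  hit_indexw = indexw(upcase(response), 'DENSE') > 0;

  datalines;
My tissue is dense
I had to condense my schedule
;
run;

proc print data=demo;
run;
```

Both rows should get hit_index=1, but only the first row should get hit_indexw=1. Keep in mind that with the default blank delimiter, punctuation attached to a word (such as "dense.") will likely prevent an INDEXW match, so you may need to pass a delimiter list as the third argument or strip punctuation first.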
Assuming you are testing for whole words and a response may contain more than one of them, you can try this alternative code:
data want;
  set have;
  length word $15; /* adapt to the longest word you check for */
  i = 1;
  word = lowcase(scan(id01q01txt, i));
  do while (word ne ' ');
    reasonmam = .; /* reset so a match on an earlier word does not carry over */
    if word in ('cost' 'insurance' 'afford' ...) then reasonmam=1; else
    if word in ('busy' 'availability' 'time' ... ) then reasonmam=2; else
    ... etc. up to ...
    if word in ('age' 'old' 'over' ...) then reasonmam=10;
    if reasonmam ne . then OUTPUT; /* one row for each word that fits a category */
    i+1;
    word = lowcase(scan(id01q01txt, i));
  end;
run;
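If you would rather keep one row per response and record only the first category that matches, a variation along these lines might work. This is only a sketch: the data set have below is made-up test data, the word lists are cut down to the words shown above (your real lists will be longer), and the delimiter list passed to COUNTW and SCAN is an assumption about the punctuation in your responses.

```sas
/* Hypothetical test data standing in for the real HAVE data set */
data have;
  length id01q01txt $60;
  input id01q01txt $char60.;
  datalines;
I could not afford it
Too BUSY, no time.
Nothing in particular
;
run;

data want_one_row;
  set have;
  length word $15;
  reasonmam = .;  /* stays missing when no word matches */
  /* Treat blanks and common punctuation as word delimiters */
  do i = 1 to countw(id01q01txt, ' ,.;:!?');
    word = lowcase(scan(id01q01txt, i, ' ,.;:!?'));
    if word in ('cost' 'insurance' 'afford') then reasonmam = 1;
    else if word in ('busy' 'availability' 'time') then reasonmam = 2;
    if reasonmam ne . then leave;  /* stop at the first matching word */
  end;
  drop i word;
run;
```

Because the loop stops at the first match, the order of the words in the response decides which category wins when a response mentions more than one reason.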
I will give this a try. Thank You!