I am trying to create categories from open-ended questions. What is the best way to go about this if I want to match full strings? Is there a way to do this without specifying position or case (e.g., lowercase)? I have attached code so you can get an idea of what I am trying to do; I am sure the INDEX function is probably not the best way to go about this. I basically want to say, "if a response in var contains this word and/or string, then var=1 ..."
SAS base 9.4
Two things to consider ...
You are probably better off using INDEXW rather than INDEX. If you search for "dense", INDEX looks for that sequence of characters anywhere in the string, so it will also find "dense" inside "condense". INDEXW matches only whole words, so it will not.
And it's easy to apply UPCASE to both arguments with INDEXW so matches can be found regardless of upper vs. lower case.
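To illustrate the difference, here is a minimal sketch. The dataset name example, the flag variables hit_index and hit_indexw, and the three sample responses are made up for the demonstration; only id01q01txt comes from the thread. It assumes words are separated by blanks (the INDEXW default delimiter), so a response such as "dense," with trailing punctuation would need a delimiter list as the third argument.

```sas
/* Hypothetical sample responses, for illustration only */
data example;
  infile datalines truncover;
  input id01q01txt $char60.;

  /* INDEX matches the characters anywhere, even inside a longer word */
  hit_index  = index(upcase(id01q01txt), 'DENSE') > 0;

  /* INDEXW matches only when the characters form a whole word */
  hit_indexw = indexw(upcase(id01q01txt), 'DENSE') > 0;

  datalines;
They said the tissue is dense
I had to condense my schedule
Dense tissue was mentioned
;
run;

proc print data=example;
run;
```

Upcasing the response and writing the search word in upper case makes the comparison case-insensitive. In this sketch the second response is flagged by hit_index but not by hit_indexw.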
Assuming you are testing for whole words and a response may contain more than one of them, you can try this alternative code:
data want;
set have;
length word $15; /* adapt to the max length of the words to check */
i = 1;
word = lowcase(scan(id01q01txt, i));
do until (word = ' ');
if word in ('cost' 'insurance' 'afford' ...) then reasonmam=1; else
if word in ('busy' 'availability' 'time' ... ) then reasonmam=2; else
... etc. up to ...
if word in ('age' 'old' 'over' ...) then reasonmam=10;
i+1;
word = lowcase(scan(id01q01txt, i));
if reasonmam ne . then OUTPUT; /* one row any time a word is fitting */
reasonmam = .; /* reset before checking the next word */
end;
run;
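If you would rather keep one row per response, a variant along these lines stops at the first matching word. It is only a sketch under assumptions: the dataset name want_first, the sample responses, and the shortened keyword lists are hypothetical, and the delimiter list passed to SCAN and COUNTW is a guess at the punctuation in your data.

```sas
/* Hypothetical sample responses and shortened keyword lists */
data want_first;
  infile datalines truncover;
  input id01q01txt $char80.;
  length word $15;   /* adapt to the longest keyword */
  reasonmam = .;

  /* Walk through the words; blanks and common punctuation are delimiters */
  do i = 1 to countw(id01q01txt, ' ,.;!?');
    word = lowcase(scan(id01q01txt, i, ' ,.;!?'));
    if word in ('cost' 'insurance' 'afford') then reasonmam = 1;
    else if word in ('busy' 'availability' 'time') then reasonmam = 2;
    if reasonmam ne . then leave;   /* first match wins */
  end;

  drop i word;
  datalines;
It costs too much, no insurance
Too BUSY right now
No reason
;
run;
```

Note that whole-word matching means "costs" does not match 'cost', so the first response is categorized through "insurance". A response with no keyword keeps reasonmam missing.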
I will give this a try. Thank You!