I have a data set that looks something like this:
data have;
infile datalines delimiter=',';
input IDnumber String $50.;
datalines;
2477,Cat Dog A blue car
2431,Cat Fish Yelllow Submarine
;
run;
I need to extract each data item from the string to one row so that i get this:
data want;
infile datalines delimiter=',';
input IDnumber String $50.;
datalines;
2477,Cat
2477,Dog
2477,A blue car
2431,Cat
2431,Fish
2431,Yelllow Submarine
;
run;
The text will sometimes be split by a single word, other times several words constitute one phrase, and one phrase can sometimes start with a capital letter followed by a small letter in the next word, but othertimes it has capital letters in both words of one phrase, so I cannot use regexp or the like to do a sort of conditional do loop or array. I basically need to hardcode it so something like this, but cant quite wrap my head around the right syntax:
data try;
set have;
do ... ;
if index(String,'Cat') > 0 then output Phrase = 'Cat' ;
if index(String,'Dog') > 0 then output Phrase = 'Dog' ;
if index(String,'A blue car') > 0 then output Phrase = 'A blue car' ;
if index(String,'Fish') > 0 then output Phrase = 'Fish' ;
if index(String,'Yellow submarine') > 0 then output Phrase = 'Yellow submarine' ;
;
end;
run;
An interesting issue. You don't need a loop, just and do...end and swap output-statement and assignment.
data try;
set have;
if index(String,'Cat') > 0 then do; Phrase = 'Cat' ; output; end;
...
run;
Sorry, I don't see logically how I would know that "A blue car" is a separate from another string. I mean its simple enough to split by word:
data want; length word $200; set have; do i=1 to countw(string," "); word=scan(string,i," "); output; end; run;
But I don't see anything logical about how to get Yellow Submarine or A blue car?
An interesting issue. You don't need a loop, just and do...end and swap output-statement and assignment.
data try;
set have;
if index(String,'Cat') > 0 then do; Phrase = 'Cat' ; output; end;
...
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.