I have a data set that looks something like this:
data have;
infile datalines delimiter=',';
input IDnumber String $50.;
datalines;
2477,Cat Dog A blue car
2431,Cat Fish Yelllow Submarine
;
run;
I need to extract each data item from the string to one row so that i get this:
data want;
infile datalines delimiter=',';
input IDnumber String $50.;
datalines;
2477,Cat
2477,Dog
2477,A blue car
2431,Cat
2431,Fish
2431,Yelllow Submarine
;
run;
The text will sometimes be split by a single word, other times several words constitute one phrase, and one phrase can sometimes start with a capital letter followed by a small letter in the next word, but othertimes it has capital letters in both words of one phrase, so I cannot use regexp or the like to do a sort of conditional do loop or array. I basically need to hardcode it so something like this, but cant quite wrap my head around the right syntax:
data try;
set have;
do ... ;
if index(String,'Cat') > 0 then output Phrase = 'Cat' ;
if index(String,'Dog') > 0 then output Phrase = 'Dog' ;
if index(String,'A blue car') > 0 then output Phrase = 'A blue car' ;
if index(String,'Fish') > 0 then output Phrase = 'Fish' ;
if index(String,'Yellow submarine') > 0 then output Phrase = 'Yellow submarine' ;
;
end;
run;
An interesting issue. You don't need a loop, just and do...end and swap output-statement and assignment.
data try;
set have;
if index(String,'Cat') > 0 then do; Phrase = 'Cat' ; output; end;
...
run;
Sorry, I don't see logically how I would know that "A blue car" is a separate from another string. I mean its simple enough to split by word:
data want; length word $200; set have; do i=1 to countw(string," "); word=scan(string,i," "); output; end; run;
But I don't see anything logical about how to get Yellow Submarine or A blue car?
An interesting issue. You don't need a loop, just and do...end and swap output-statement and assignment.
data try;
set have;
if index(String,'Cat') > 0 then do; Phrase = 'Cat' ; output; end;
...
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.