Hello,
I have a long string field.
I want to check whether each of 10 substrings exists in that field and create a new field containing the substrings that were found.
For example:
If the long string value is "London is a nice city and I like Ice cream and I wish summer come soon"
and I check whether the following substrings exist in the long string:
London, USA, summer, New York, football, basketball, winter, Berlin, Autumn, food
then the new field should contain: "London,summer"
How can I do this, please?
Use an array for the indicator variables, together with the INDEX, INDEXW, FIND, or FINDW function.
If the list of values to search for is the same for all records, a temporary array holding them is likely a good idea. If the values change from record to record, you have a much more complicated problem and we would need to see actual data.
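The temporary-array suggestion above could look roughly like the sketch below. It assumes the ten search terms are fixed for every record. The data set name `have`, the variable names `x` and `found`, and the chosen lengths are illustrative and not taken from the original post. FINDW is called with a blank delimiter plus the `i` (ignore case) and `p` (treat punctuation as delimiters) modifiers, which is one reasonable choice rather than the only one.

```sas
/* Illustrative input: one record holding the long string from the question */
data have;
  length x $200;
  x = "London is a nice city and I like Ice cream and I wish summer come soon";
run;

data want;
  set have;
  /* Assumption: the search terms are the same for every record,
     so they can be held in a temporary array */
  array terms[10] $20 _temporary_
    ('London' 'USA' 'summer' 'New York' 'football'
     'basketball' 'winter' 'Berlin' 'Autumn' 'food');
  length found $200;
  found = '';
  do i = 1 to dim(terms);
    /* FINDW looks for the term as a whole word; STRIP removes the
       padding blanks that the array element carries */
    if findw(x, strip(terms[i]), ' ', 'ip') > 0 then
      found = catx(',', found, terms[i]);
  end;
  drop i;
run;
```

For the example string, `found` should end up as `London,summer`. If substring matches (not whole words) are wanted, FIND or INDEX could replace FINDW in the same loop.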
I don't know if this would do the trick, but I coded a simple example that might extend to your case. Essentially, you test the string once for each substring you want to extract, carrying the accumulated matches forward from one variable to the next. In the end, the final x-sub-n variable (x4 here) is the one you keep.
data have;
  infile datalines dsd;
  input x : $100. ;
cards;
Shall I compare thee to a summer day in Boston?
;
run;
data want;
  set have;
  /* No DO loop is needed: each test runs once per observation. */
  /* Each xN prepends its term, if found, to the matches collected so far. */
  if index(x, 'Boston') then x1 = 'Boston'; else x1 = '';
  if index(x, 'compare') then x2 = catx(', ', 'compare', x1); else x2 = x1;
  if index(x, 'practice') then x3 = catx(', ', 'practice', x2); else x3 = x2;
  if index(x, 'day') then x4 = catx(', ', 'day', x3); else x4 = x3;
run;
Look at the CALL PRXNEXT routine. Modify the documentation example slightly and you get this:
data _null_ ;
length result $100 ; /* result variable */
string="London is a nice city and I like Ice cream and I wish summer come soon" ; /* String to search */
searchTerms="/London|USA|summer|New York|football|basketball|winter|Berlin|Autumn|food/" ; /* Search terms */
searchPattern=prxparse(searchTerms); /* Creates the pattern Id for the search terms */
start=1 ; /* Position to start search from */
stop=length(string) ; /* Position to end search */
call prxnext(searchPattern, start, stop, string, position, length); /* Search for the first occurrence of a term in the string, return the position and length of the found term */
result="" ; /* initialize result to blank */
flag=0 ; /* flag for commas in the result */
do while (position > 0); /* loop while we find matches */
found = substr(string, position, length); /* What did we find */
put found= position= length=;
if flag=1 then /* Build the result variable */
result=trim(result)||","||found ;
else
result=trim(result)||found ;
call prxnext(searchPattern, start, stop, string, position, length); /* search for the next occurrence of a term */
flag=1 ;
end;
put result= ; /* write result to log */
run ;
data have;
  infile datalines dsd;
  input x : $100. ;
cards;
London is a nice city and I like Ice cream and I wish summer come soon
Shall I compare thee to a summer day in Boston?
;
run;
%let keys= London,USA,summer ,New York, football,basketball,winter,Berlin,Autumn,food ;
data want;
  set have;
  length want $200. ;
  do i=1 to countw("&keys.",',');
    temp=scan("&keys.",i,',');
    if findw(x,temp,,'itsp') then want=catx(',',want,temp);
  end;
  drop i temp;
run;