BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASPhile
Quartz | Level 8

I'm trying to find some words in a variable called notes. There are more than 100 words that I need to look for.Is there a better way to subsitue the following, instead of write 100 like statements?

  proc sql;

   create table want as

  select a.* from have where

    upcase(notes) like '%WORD1%' or upcase(notes) like '%WORD2%' ......

quit;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Wrong Brackets apparently ({ to ()

data want;

set sashelp.class;

array names_to_find(2) $ ("Alfred", "Jane");

array flag(2) flag1-flag2;

do i = 1 to dim(names_to_find);

    if name=names_to_find(i) then flag(i)=1;

    else flag(i)=0;

end;

run;

View solution in original post

9 REPLIES 9
Reeza
Super User

1. Can you put the words in a table and then merge with a like merge?

2. Use a datastep and then put the words in an array and use the whichc function?

Jagadishkatam
Amethyst | Level 16

you can try sounds like operator

proc sql;

   create table want as

  select a.* from have where

    upcase(notes)=*"WORD";

quit;

thanks,

jagadish

Thanks,
Jag
SASPhile
Quartz | Level 8

When I say WORD in my example I'm not looking exactly for WORD. Word1 can be a Lion ,Word2 can be water word3 can be Mars etc etc.

SASPhile
Quartz | Level 8

Notes is a varible that contains complaints from customers. I identified a 100 different words in the complaints column and need to look for them and filter those records.

Reeza
Super User

Do you  want to flag which word matches? what about multiple matches?

I think the array solution above is the simplest one to implement in a datastep from a programming standpoint.

data want;

set have;

array words_to_find(100) $  {"Word1", "word2", "word3"....};

array flag_word(100) flag1-flag100;

do i=1 to 100;

if find(string, words_to_find(i))>0 then flag_word(i)=1;

else flag_word(i)=0;

end;

if sum(of flag_word(*))>0 then output;

run;

SASPhile
Quartz | Level 8

I'm gettin this error after declaring array:

array words_to_find(100) $  {"Word1", "word2", "word3"....};

ERROR 391-185: Expecting a variable length specification

Reeza
Super User

Wrong Brackets apparently ({ to ()

data want;

set sashelp.class;

array names_to_find(2) $ ("Alfred", "Jane");

array flag(2) flag1-flag2;

do i = 1 to dim(names_to_find);

    if name=names_to_find(i) then flag(i)=1;

    else flag(i)=0;

end;

run;

Vince28_Statcan
Quartz | Level 8

Lookup this fairly recent thread with similar topic:

https://communities.sas.com/thread/46818

You would really need to provide additionnal details on why exactly you need to use the like operator and what the 100 words you are searching for resemble.

Many like-operator jobs can be solved fairly easily with PERL (or PRX functions/routines in SAS) but the problem described reads as though you shouldn't use the like operator at all can could use a simple syntax.

you could also use an array _temporary_ to act as a 100 word dictionary that you loop through for syntax.

There are also other solutions than like-operator and PERL such as the use of string distance functions such as complev and compged (levenstein and generalized edit distance)

Worst case scenario, you can always use macro to write your statement with like-operator. I doubt the like operator is very efficient but if your job is small and you are satisfied with the like operator results, that can be a fast and easy solution to writing the programming statement.

Peter_C
Rhodochrosite | Level 12


SASPhile wrote:

I'm trying to find some words in a variable called notes. There are more than 100 words that I need to look for.Is there a better way to subsitue the following, instead of write 100 like statements?

  proc sql;

   create table want as

  select a.* from have where

    upcase(notes) like '%WORD1%' or upcase(notes) like '%WORD2%' ......

quit;

rather than approach a SQL question with a data step solution, consider the benefits of placing those 100+ words to be searched for, into a table of their own. Then the information handling problem moves from the syntax layer  into the data layer, and PROC SQL can provide its solution with a simple join using the FIND() function, like

select * from have

inner join word_list_table

on FIND( have.notes, word_list_table.word, 'IT' ) >0

In that IT parameter the 'I' treats upper and lower case the same. and the 'T' removes trailing blanks from the WORD variable before making that search.

Read the documentation on the FIND() function at  http://support.sas.com/documentation/cdl/en/lefunctionsref/64814/HTML/default/viewer.htm#p00ab6ey29t...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2470 views
  • 4 likes
  • 5 in conversation