Emma_at_SAS
Lapis Lazuli | Level 10

In the CARS dataset from SASHELP, I want to find any observation whose MODEL value contains "Cent" or "Quatt". I tried several search functions, and FIND seems to work for one of the strings I am searching for. How can I search for multiple strings with one FIND function?

A practical example would be finding articles by keyword: all articles that have "gene", "genetic", "genetically", "allele", "genes", "inherited", or "history" in their title, where the title is a character variable in a SAS dataset.

Thank you for your help!

 

 

data cars;
set sashelp.cars;
run;

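data cars_model;
set cars;
Model_up=upcase(Model);
/* each function returns a position, or 0 when nothing is found */
model_var_f = find (Model_up, "CENT");     /* substring anywhere in the value */
model_var_fw = findw (Model_up, "CENT");   /* "CENT" as a whole word only */
model_var_i = index (Model_up, "CENT");    /* substring, same as FIND without modifiers */
model_var_ic = indexc (Model_up, "CENT");  /* first of ANY of the characters C, E, N, T */
model_var_iw = indexw (Model_up, "CENT");  /* whole word delimited by blanks */
run;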

proc freq data=cars_model; tables model_up; run;
proc freq data=cars_model; tables model_var_f  model_var_fw  model_var_i  model_var_ic  model_var_iw; run;

 

 

PaigeMiller
Diamond | Level 26

@Emma_at_SAS wrote:

In the cars dataset from SASHELP, I want to find any MODEL observation that contains "Cent" or "Quatt".


How about this:

 

model_var_f = find (Model, "cent", 'i') or find(model,'quatt','i');

 

 

If you have lots of strings to match, please see the method explained at https://communities.sas.com/t5/SAS-Programming/Check-if-a-list-of-substrings-is-in-a-string/td-p/766...
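For a longer keyword list, one possible approach (a minimal sketch, not necessarily the method in the linked thread) is to hold the terms in a temporary array and loop over them with FIND. The TITLES dataset, the TITLE variable, and the keyword list below are made up for illustration.

```sas
/* Hypothetical sample data: one article title per row */
data titles;
  infile datalines truncover;
  input title $80.;
  datalines;
Inherited risk and family history
Allele frequency in island populations
Cooking with cast iron
;

data flagged;
  set titles;
  /* Keywords to search for; edit this list as needed */
  array kw[4] $ 12 _temporary_ ('gene' 'allele' 'inherited' 'history');
  found = 0;
  /* Stop at the first keyword that occurs anywhere in the title */
  do i = 1 to dim(kw) until (found);
    /* TRIM removes the array element's trailing blanks; 'i' ignores case */
    found = (find(title, trim(kw[i]), 'i') > 0);
  end;
  drop i;
run;
```

Because FIND matches substrings, 'gene' also covers "genes", "genetic", and "genetically", but it would match unrelated words such as "general" too.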

--
Paige Miller
Ksharp
Super User
data cars;
set sashelp.cars;
if prxmatch('/Cent|Quatt/i',model);
run;
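The same alternation pattern could be extended to the keyword example from the question. The sketch below assumes a made-up TITLES dataset with a TITLE variable, and adds `\b` word boundaries so that only whole words match.

```sas
/* Hypothetical sample data: one article title per row */
data titles;
  infile datalines truncover;
  input title $80.;
  datalines;
Genes and inherited disease
General notes on engines
A family history of diabetes
;

data want;
  set titles;
  /* Keep rows where any listed keyword occurs as a whole word; /i ignores case */
  if prxmatch('/\b(gene|genes|genetic|genetically|allele|inherited|history)\b/i', title);
run;
```

With the word boundaries, "General" does not count as a match for "gene"; dropping them would give substring matching like the `/Cent|Quatt/i` pattern above.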
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @PaigeMiller and @Ksharp. Both your methods work. I have a follow-up question: how do I handle a space before the strings? I know of the STRIP function, but I do not know how to use it with FIND or PRXMATCH. I appreciate your suggestions. Thanks

PaigeMiller
Diamond | Level 26

With the FIND function, you don't have to worry about leading blanks. FIND reports a match if "Cent" or "Quatt" occurs anywhere in the text string, so the leading blanks don't interfere.

 

So, with the FIND expression I showed, model_var_f has the value 1 for those models with "Cent" or "Quatt" in the value of MODEL, and 0 otherwise.

 

A potential problem is that this also finds the Hyundai Accent models, and maybe you don't want that? Again, easily fixed if you want to exclude Hyundai Accent models.
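One way to exclude such mid-word hits, sketched here as an assumption about what "fixed" could look like, is to require the match to start at a word boundary. The names CHECK, HIT_ANY, and HIT_WORD are made up for illustration.

```sas
data check;
  set sashelp.cars;
  /* FIND scans the whole value, so leading blanks in MODEL do not matter */
  hit_any  = (find(model, 'cent', 'i') > 0) or (find(model, 'quatt', 'i') > 0);
  /* \b requires the match to begin a word, so "Accent" is not counted */
  hit_word = (prxmatch('/\b(cent|quatt)/i', model) > 0);
run;

/* List the models that match only in the middle of a word */
proc print data=check;
  where hit_any and not hit_word;
  var make model hit_any hit_word;
run;
```

Subsetting on HIT_WORD instead of HIT_ANY would then drop the rows that PROC PRINT lists.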

--
Paige Miller
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @PaigeMiller. Your response was very helpful. 

Patrick
Opal | Level 21

@Emma_at_SAS  Below is a coding option in which you only need to change an informat to change the set of terms to look for.

data cars;
  set sashelp.cars;
run;

proc format;
  invalue myterms_find
    '/\b(cent|quatt)/i' (regexp) = 1
    other=0
    ;
  invalue $myterms_get
    's/^.*?(\b(cent|quatt).*?\b).*$/$1/i' (regexpe) =  _same_
    other= ' '
    ;
  invalue $mycategory_get
    's/^.*?\b(cent|quatt).*$/$1/i' (regexpe) =  _same_
    other= ' '
    ;
run;

data want;
  length category $10 term $20;
  set cars;
  if input(strip(model),myterms_find.)=1;
  /* first term selected if multiple matching terms */
  category=input(strip(model),$mycategory_get.);
  term    =input(strip(model),$myterms_get.);
run;

 


 
