Emma_at_SAS
Lapis Lazuli | Level 10

In the cars dataset from SASHELP, I want to find any MODEL observation that contains "Cent" or "Quatt". I tried several search functions, and FIND works for finding one of the strings I am searching for. How can I search for multiple strings using one FIND function?

A practical example I have in mind is finding articles by keyword: for example, all articles that have "gene", "genetic", "genetically", "allele", "genes", "inherited", or "history" in their title, where the title is a character variable in a SAS dataset.

Thank you for your help!

 

 

data cars;
set sashelp.cars;
run;

data cars_model;
set cars;
Model_up=upcase(Model);
model_var_f = find (Model_up, "CENT");
model_var_fw = findw (Model_up, "CENT");
model_var_i = index (Model_up, "CENT");
model_var_ic = indexc (Model_up, "CENT");
model_var_iw = indexw (Model_up, "CENT");
run;

proc freq data=cars_model; tables model_up; run;
proc freq data=cars_model; tables model_var_f  model_var_fw  model_var_i  model_var_ic  model_var_iw; run;

 

 

PaigeMiller
Diamond | Level 26

@Emma_at_SAS wrote:

In the cars dataset from SASHELP, I want to find any MODEL observation that contains "Cent" or "Quatt".


How about this:

 

model_var_f = find (Model, "cent", 'i') or find(model,'quatt','i');

 

 

If you have lots of strings to match, please see the method explained at https://communities.sas.com/t5/SAS-Programming/Check-if-a-list-of-substrings-is-in-a-string/td-p/766...
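For a longer keyword list, one possible sketch (not necessarily the method in the linked thread) is to hold the terms in a temporary array and stop at the first match. The `articles` dataset, the `title` variable, and the `hit` flag are hypothetical names used only for illustration.

```sas
/* Hypothetical input: a small table of article titles */
data articles;
  infile datalines truncover;
  input title $80.;
  datalines;
Genetic markers in wheat
A history of printing
Regenerative farming
Allele frequency drift
;

data want;
  set articles;
  /* Search terms; edit this list to change what is matched */
  array terms[7] $12 _temporary_
    ('gene' 'genes' 'genetic' 'genetically' 'allele' 'inherited' 'history');
  hit = 0;
  do i = 1 to dim(terms) until (hit);
    /* 'i' ignores case; 't' trims trailing blanks from both arguments */
    hit = (find(title, terms[i], 'it') > 0);
  end;
  drop i;
run;
```

Note that FIND matches substrings, so "Regenerative" is flagged because it contains "gene". If only whole words should match, FINDW or a regular expression with word boundaries may be a better fit.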

--
Paige Miller
Ksharp
Super User
data cars;
set sashelp.cars;
if prxmatch('/Cent|Quatt/i',model);
run;
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @PaigeMiller and @Ksharp. Both your methods work. I have a follow-up question: how do I handle spaces before the strings? I know of the STRIP function, but I do not know how to use it with FIND or PRXMATCH. I appreciate your suggestions. Thanks.

PaigeMiller
Diamond | Level 26

With the FIND function, you don't have to worry about leading blanks. FIND searches the whole string, so "Cent" or "Quatt" is found wherever it occurs, and leading blanks don't interfere.
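As a small illustration (the literal value and the variable names `pos_raw` and `pos_strip` are made up for this sketch), STRIP can be nested inside FIND, but it only shifts the reported position; whether a match is found does not change.

```sas
data _null_;
  model = '  Century Custom';                      /* two leading blanks */
  pos_raw   = find(model, 'cent', 'i');            /* position in the original value */
  pos_strip = find(strip(model), 'cent', 'i');     /* position after removing leading blanks */
  put pos_raw= pos_strip=;                         /* writes pos_raw=3 pos_strip=1 to the log */
run;
```

Both positions are greater than zero, so a test such as `if find(model, 'cent', 'i')` behaves the same with or without STRIP. The same nesting works with PRXMATCH, for example `prxmatch('/cent|quatt/i', strip(model))`.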

 

So, using the FIND statement I showed, model_var_f has the value 1 for those models whose Model value contains "Cent" or "Quatt" (in any case), and 0 otherwise.

 

A potential problem is that this also finds the Hyundai Accent models, and maybe you don't want that? Again, easily fixed if you want to exclude Hyundai Accent models.
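One way this could be handled, as a sketch only, is to require the term to start at a word boundary with `\b` in a regular expression, so that "Accent" no longer matches while "Century" and "Quattro" still do. The output dataset name `cars_flagged` and the variable `model_var_rx` are hypothetical.

```sas
data cars_flagged;
  set sashelp.cars;
  /* \b means the term must begin a word, so the "cent" inside "Accent" is skipped */
  model_var_rx = (prxmatch('/\b(cent|quatt)/i', model) > 0);
run;

/* Quick check of which models were flagged */
proc freq data=cars_flagged;
  where model_var_rx = 1;
  tables make*model / list;
run;
```

An alternative, if the unwanted matches are limited to one make, would be to add a condition such as `and make ne 'Hyundai'` to the FIND expression.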

--
Paige Miller
Emma_at_SAS
Lapis Lazuli | Level 10

Thank you @PaigeMiller. Your response was very helpful. 

Patrick
Opal | Level 21

@Emma_at_SAS Below is a coding option that lets you change the set of search terms by changing only an informat.

data cars;
  set sashelp.cars;
run;

proc format;
  invalue myterms_find
    '/\b(cent|quatt)/i' (regexp) = 1
    other=0
    ;
  invalue $myterms_get
    's/^.*?(\b(cent|quatt).*?\b).*$/$1/i' (regexpe) =  _same_
    other= ' '
    ;
  invalue $mycategory_get
    's/^.*?\b(cent|quatt).*$/$1/i' (regexpe) =  _same_
    other= ' '
    ;
run;

data want;
  length category $10 term $20;
  set cars;
  if input(strip(model),myterms_find.)=1;
  /* first term selected if multiple matching terms */
  category=input(strip(model),$mycategory_get.);
  term    =input(strip(model),$myterms_get.);
run;
