BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BernyOsuna
Obsidian | Level 7

I have this list of words and i need to check if those words exist in another variable, in the same dataset, which cointains a set of words separated by ','.

 

data dataset;

set dataset;

x = find(upcase(var1),upcase(var2));

run;

 

this works most of the time, but if i have to look for SUCESS its telling me that the word exist because it reads SUCCESSfully, so need te exact word.

 

So then i moved to FINDW

 

data dataset;

set dataset;

x = find(upcase(var1),upcase(var2));

run;

 

But for 2 words at the same time like: BEST BUY and the string is COCA COLA, BEST BUY it brings 0 meaning it couldnt find it.

 

  • It was to work for both: single words or 2 or more words a the same time.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Usually this is caused by including the trailing spaces in the text that you are searching for.  SAS stores character strings as fixed length.  So you need to remove the trailing spaces before passing the string to the FINDW() function.  You can use the modifiers of the FINDW() command to make this easier.  I like to use SPIT as the modifiers for this.

data test;
  length str $200 word $50 ;
  infile cards dsd dlm='|' truncover ;
  input str word ;
  found_any=index(upcase(str),upcase(strip(word)));
  found_word=findw(str,word,,'spit');
  put (_all_) (=) ;
cards;
Successfully|success
Coca Cola,Best Buy|BEST BUY
;
str=Successfully word=success found_any=1 found_word=0
str=Coca Cola,Best Buy word=BEST BUY found_any=11 found_word=11

 

 

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

data w;
k='SUCCESSfully';
if findw(k,'SUCCESS')>0 then put 'found';
else put'not found';
run;

BernyOsuna
Obsidian | Level 7

Thats ok but it need one variable to be searched in the other one

 

key = find(upcase(var1),upcase(var2));

 

so the issue is that composed words like SUCESSFULLY will be count as 1 when searching for SUCESS

 

and using FINDW would not work all the time.

 

novinosrin
Tourmaline | Level 20

Can you please post a sample of few records ?

BernyOsuna
Obsidian | Level 7

VAR1                           VAR2

COCA COLA               PEPSI, COCA COLA

LANIX                          LANIX, SONY

 

KEY = FIND(VAR1,VAR2);

 

VAR1                           VAR2                                    KEY

COCA COLA               PEPSI, COCA COLA            8

LANIX                          LANIX, SONY                       1

SUCESS                      SUCESSFULLY                    1

 

KEY = FINDW(VAR1,VAR2);

 

VAR1                           VAR2                                    KEY

COCA COLA               PEPSI, COCA COLA            O

LANIX                          LANIX, SONY                       1

SUCESS                      SUCESSFULLY                    0    

 

novinosrin
Tourmaline | Level 20
data have;
input VAR1  & $10.                         VAR2 &  $20.;
cards;
COCA COLA               PEPSI, COCA COLA            8
LANIX                          LANIX, SONY                       1
SUCESS                      SUCESSFULLY                    1
PEPSI				PEPSI
;

data want;
set have;
do _n_=1 to countw(var2,',');
k=findw(var1,strip(scan(var2,_n_,',')));
if k>0 then leave;
end;
run;
Astounding
PROC Star

Do you have control over the format of VAR2?  The easiest way would be to add commas, and take out spaces around the commas.  For example:

 

VAR2=",COCA COLA,BEST BUY,";

 

Second easiest would be if you could just eliminate the spaces around the commas:

 

VAR2="COCA COLA,BEST BUY";

 

In the top version, you could use:

 

found = index(var2, "," || strip(VAR1) || ",");

 

You might have to adjust the value of FOUND by 1, due to the presence of the extra commas.

 

For the bottom version, you would end up with a more complex expression:

 

found = index("," || strip(var2) || "," , "," || strip(VAR1) || "," ) ;

 

Again, you might have to adjust the result by 1.

BernyOsuna
Obsidian | Level 7

I HOPE YOU FIND THIS MORE UNDERSTANDABLE

 

* CREATE MAIN TABLE *;

DATA TABLE1;

INFORMAT CAT $15. VALUE $500.;

INFILE DATALINES DLM = '09X';

INPUT CAT $ VALUE $;

DATALINES;

PAYMENT09XSUNOCO PAY, BEST BUY PAY, SUCESSFUL

;

RUN;

 

*UNIFY FORMATS*;

DATA TABLE1;

RETAIN SOURCE CAT VALUE;

SET TABLE1;

  SOURCE ='EMAIL';

  VALUE=COMPRESS(VALUE,'09X');   * ELIMINATE TABS *;

  VALUE=TRANWRD(VALUE," ,",","); * REMOVE SPACE BETWEEN WORD AND COMMA *;

  VALUE=TRANWRD(VALUE,", ",","); * REMOVE SPACE BETWEEN COMMA AND WORD *;

  VALUE=TRANWRD(VALUE,",",", "); * PUT A  SPACE BETWEEN COMMA AND WORD *;

  USER=UPCASE("&SYSUSERID.");    * RECORD USERNAME *;

  TIMESTAMP = DATETIME();

  FORMAT TIMESTAMP DATETIME16.;

RUN;

 

* TABLE 2 IS IMPORTED BUT FOR TIME REASONS I'LL JUST CREATE IT *;

 

DATA TABLE2;

INFORMAT WORD $15.;

INFILE DATALINES DSD;

DATALINES;

INPUT WORD $; 

SUCESS

BEST BUY PAY

;

RUN;

 

* PREPARE TABLE 2 *;

 

PROC SORT DATA = TABLE2 NODUPKEY OUT=T2; BY WORD; RUN;

DATA T2; RETAIN CAT; SET T2; CAT = "PAYMENT"; RUN;

 

* CREATE TABLE 3 THATS WHERE THE ISSUE HAPPENS *;

 

DATA TABLE3;

LENGTH CAT $15 WORD $50. VALUE $500.;

 IF _N_ = 1 THEN DO;

   DECLARE HASH 3 (DATASET:'TABLE1');

   E.DEFINEKEY('CAT');

   E.DEFINEDATA('CAT','VALUE');

   E.DEFINEDONE();

   CALL MISSING(CAT,VALUE);

 END;

 SET T2;

 DROP RC;

 RC = E.FIND();

 FINDW = FINDW(VALUE,WORD,'I');

 FIND  = FIND(TRIM(VALUE),TRIM(UPCASE(WORD)));

RUN;

 

 

 

 

Tom
Super User Tom
Super User

Usually this is caused by including the trailing spaces in the text that you are searching for.  SAS stores character strings as fixed length.  So you need to remove the trailing spaces before passing the string to the FINDW() function.  You can use the modifiers of the FINDW() command to make this easier.  I like to use SPIT as the modifiers for this.

data test;
  length str $200 word $50 ;
  infile cards dsd dlm='|' truncover ;
  input str word ;
  found_any=index(upcase(str),upcase(strip(word)));
  found_word=findw(str,word,,'spit');
  put (_all_) (=) ;
cards;
Successfully|success
Coca Cola,Best Buy|BEST BUY
;
str=Successfully word=success found_any=1 found_word=0
str=Coca Cola,Best Buy word=BEST BUY found_any=11 found_word=11

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 14479 views
  • 0 likes
  • 4 in conversation