DATA Step, Macro, Functions and more

FIND vs FINDW or anything else

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

FIND vs FINDW or anything else

I have this list of words and i need to check if those words exist in another variable, in the same dataset, which cointains a set of words separated by ','.

 

data dataset;

set dataset;

x = find(upcase(var1),upcase(var2));

run;

 

this works most of the time, but if i have to look for SUCESS its telling me that the word exist because it reads SUCCESSfully, so need te exact word.

 

So then i moved to FINDW

 

data dataset;

set dataset;

x = find(upcase(var1),upcase(var2));

run;

 

But for 2 words at the same time like: BEST BUY and the string is COCA COLA, BEST BUY it brings 0 meaning it couldnt find it.

 

  • It was to work for both: single words or 2 or more words a the same time.

 


Accepted Solutions
Solution
Monday
Super User
Super User
Posts: 8,120

Re: FIND vs FINDW or anything else

[ Edited ]
Posted in reply to BernyOsuna

Usually this is caused by including the trailing spaces in the text that you are searching for.  SAS stores character strings as fixed length.  So you need to remove the trailing spaces before passing the string to the FINDW() function.  You can use the modifiers of the FINDW() command to make this easier.  I like to use SPIT as the modifiers for this.

data test;
  length str $200 word $50 ;
  infile cards dsd dlm='|' truncover ;
  input str word ;
  found_any=index(upcase(str),upcase(strip(word)));
  found_word=findw(str,word,,'spit');
  put (_all_) (=) ;
cards;
Successfully|success
Coca Cola,Best Buy|BEST BUY
;
str=Successfully word=success found_any=1 found_word=0
str=Coca Cola,Best Buy word=BEST BUY found_any=11 found_word=11

 

 

View solution in original post


All Replies
PROC Star
Posts: 1,833

Re: FIND vs FINDW or anything else

Posted in reply to BernyOsuna

data w;
k='SUCCESSfully';
if findw(k,'SUCCESS')>0 then put 'found';
else put'not found';
run;

Occasional Contributor
Posts: 9

Re: FIND vs FINDW or anything else

Posted in reply to novinosrin

Thats ok but it need one variable to be searched in the other one

 

key = find(upcase(var1),upcase(var2));

 

so the issue is that composed words like SUCESSFULLY will be count as 1 when searching for SUCESS

 

and using FINDW would not work all the time.

 

PROC Star
Posts: 1,833

Re: FIND vs FINDW or anything else

Posted in reply to BernyOsuna

Can you please post a sample of few records ?

Occasional Contributor
Posts: 9

Re: FIND vs FINDW or anything else

Posted in reply to novinosrin

VAR1                           VAR2

COCA COLA               PEPSI, COCA COLA

LANIX                          LANIX, SONY

 

KEY = FIND(VAR1,VAR2);

 

VAR1                           VAR2                                    KEY

COCA COLA               PEPSI, COCA COLA            8

LANIX                          LANIX, SONY                       1

SUCESS                      SUCESSFULLY                    1

 

KEY = FINDW(VAR1,VAR2);

 

VAR1                           VAR2                                    KEY

COCA COLA               PEPSI, COCA COLA            O

LANIX                          LANIX, SONY                       1

SUCESS                      SUCESSFULLY                    0    

 

PROC Star
Posts: 1,833

Re: FIND vs FINDW or anything else

Posted in reply to BernyOsuna
data have;
input VAR1  & $10.                         VAR2 &  $20.;
cards;
COCA COLA               PEPSI, COCA COLA            8
LANIX                          LANIX, SONY                       1
SUCESS                      SUCESSFULLY                    1
PEPSI				PEPSI
;

data want;
set have;
do _n_=1 to countw(var2,',');
k=findw(var1,strip(scan(var2,_n_,',')));
if k>0 then leave;
end;
run;
Super User
Posts: 6,785

Re: FIND vs FINDW or anything else

Posted in reply to BernyOsuna

Do you have control over the format of VAR2?  The easiest way would be to add commas, and take out spaces around the commas.  For example:

 

VAR2=",COCA COLA,BEST BUY,";

 

Second easiest would be if you could just eliminate the spaces around the commas:

 

VAR2="COCA COLA,BEST BUY";

 

In the top version, you could use:

 

found = index(var2, "," || strip(VAR1) || ",");

 

You might have to adjust the value of FOUND by 1, due to the presence of the extra commas.

 

For the bottom version, you would end up with a more complex expression:

 

found = index("," || strip(var2) || "," , "," || strip(VAR1) || "," ) ;

 

Again, you might have to adjust the result by 1.

Occasional Contributor
Posts: 9

Re: FIND vs FINDW or anything else

Posted in reply to Astounding

I HOPE YOU FIND THIS MORE UNDERSTANDABLE

 

* CREATE MAIN TABLE *;

DATA TABLE1;

INFORMAT CAT $15. VALUE $500.;

INFILE DATALINES DLM = '09X';

INPUT CAT $ VALUE $;

DATALINES;

PAYMENT09XSUNOCO PAY, BEST BUY PAY, SUCESSFUL

;

RUN;

 

*UNIFY FORMATS*;

DATA TABLE1;

RETAIN SOURCE CAT VALUE;

SET TABLE1;

  SOURCE ='EMAIL';

  VALUE=COMPRESS(VALUE,'09X');   * ELIMINATE TABS *;

  VALUE=TRANWRD(VALUE," ,",","); * REMOVE SPACE BETWEEN WORD AND COMMA *;

  VALUE=TRANWRD(VALUE,", ",","); * REMOVE SPACE BETWEEN COMMA AND WORD *;

  VALUE=TRANWRD(VALUE,",",", "); * PUT A  SPACE BETWEEN COMMA AND WORD *;

  USER=UPCASE("&SYSUSERID.");    * RECORD USERNAME *;

  TIMESTAMP = DATETIME();

  FORMAT TIMESTAMP DATETIME16.;

RUN;

 

* TABLE 2 IS IMPORTED BUT FOR TIME REASONS I'LL JUST CREATE IT *;

 

DATA TABLE2;

INFORMAT WORD $15.;

INFILE DATALINES DSD;

DATALINES;

INPUT WORD $; 

SUCESS

BEST BUY PAY

;

RUN;

 

* PREPARE TABLE 2 *;

 

PROC SORT DATA = TABLE2 NODUPKEY OUT=T2; BY WORD; RUN;

DATA T2; RETAIN CAT; SET T2; CAT = "PAYMENT"; RUN;

 

* CREATE TABLE 3 THATS WHERE THE ISSUE HAPPENS *;

 

DATA TABLE3;

LENGTH CAT $15 WORD $50. VALUE $500.;

 IF _N_ = 1 THEN DO;

   DECLARE HASH 3 (DATASET:'TABLE1');

   E.DEFINEKEY('CAT');

   E.DEFINEDATA('CAT','VALUE');

   E.DEFINEDONE();

   CALL MISSING(CAT,VALUE);

 END;

 SET T2;

 DROP RC;

 RC = E.FIND();

 FINDW = FINDW(VALUE,WORD,'I');

 FIND  = FIND(TRIM(VALUE),TRIM(UPCASE(WORD)));

RUN;

 

 

 

 

Solution
Monday
Super User
Super User
Posts: 8,120

Re: FIND vs FINDW or anything else

[ Edited ]
Posted in reply to BernyOsuna

Usually this is caused by including the trailing spaces in the text that you are searching for.  SAS stores character strings as fixed length.  So you need to remove the trailing spaces before passing the string to the FINDW() function.  You can use the modifiers of the FINDW() command to make this easier.  I like to use SPIT as the modifiers for this.

data test;
  length str $200 word $50 ;
  infile cards dsd dlm='|' truncover ;
  input str word ;
  found_any=index(upcase(str),upcase(strip(word)));
  found_word=findw(str,word,,'spit');
  put (_all_) (=) ;
cards;
Successfully|success
Coca Cola,Best Buy|BEST BUY
;
str=Successfully word=success found_any=1 found_word=0
str=Coca Cola,Best Buy word=BEST BUY found_any=11 found_word=11

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 119 views
  • 0 likes
  • 4 in conversation