DATA Step, Macro, Functions and more

Using MERGE with LIKE command

Accepted Solution Solved
Reply
Contributor
Posts: 43
Accepted Solution

Using MERGE with LIKE command

hello, how can I use LIKE command in a MERGE?

For example I have this code:

data C;

MERGE A B;

BY ID;

WHERE VAR1 LIKE '%X%';

RUN;

 

but var1 is only in A, so sas send me this error - ERROR: Variable var1 is not on file WORK.B.

 

thanks


Accepted Solutions
Solution
‎09-30-2016 12:10 PM
Super User
Posts: 11,343

Re: Using MERGE with LIKE command

[ Edited ]
Posted in reply to Angel_Saenz

If you are trying to subset the result AFTER merging then use

IF VAR1 LIKE '%X%'

 

If you only want to select records from data set A that meet the condtion then use the dataset option to filter before merging:

data C;
   MERGE 
      A  ( WHERE=( VAR1 LIKE '%X%'))
      B
   ;
   BY ID;
  ;
RUN;

The error is because the WHERE statement applies to all the datasets in the SET or MERGE statements. The dataset options only applies to the data set followed by the (). Note that the dataset option doesn't really allow use of functions though, only comparison operators.

 

View solution in original post


All Replies
Solution
‎09-30-2016 12:10 PM
Super User
Posts: 11,343

Re: Using MERGE with LIKE command

[ Edited ]
Posted in reply to Angel_Saenz

If you are trying to subset the result AFTER merging then use

IF VAR1 LIKE '%X%'

 

If you only want to select records from data set A that meet the condtion then use the dataset option to filter before merging:

data C;
   MERGE 
      A  ( WHERE=( VAR1 LIKE '%X%'))
      B
   ;
   BY ID;
  ;
RUN;

The error is because the WHERE statement applies to all the datasets in the SET or MERGE statements. The dataset options only applies to the data set followed by the (). Note that the dataset option doesn't really allow use of functions though, only comparison operators.

 

Contributor
Posts: 43

Re: Using MERGE with LIKE command

THANKS! solution2 works, your solution 1 send me this errors:

 

15         data C;
16         MERGE A
17           B;
18         BY ID;
19         IF VAR LIKE '%X%';
                        ___
                        388
                        202
ERROR 388-185: Expecting an arithmetic operator.
ERROR 202-322: The option or parameter is not recognized and will be ignored.

 

I think because LIKE dosn't work with IF

Super User
Super User
Posts: 7,074

Re: Using MERGE with LIKE command

Posted in reply to Angel_Saenz

VAR LIKE '%X' is the same as INDEX(VAR,'X') or INDEXC(VAR,'X').

Super User
Posts: 11,343

Re: Using MERGE with LIKE command

Posted in reply to Angel_Saenz

LIKE is only allowed in a WHERE statement.

You'll need to use other character functions to find the records you want if subsetting after the merge.

If you are looking with anything with an X then

 

IF index(var1,'X')> 0;

would work. Index searches for any match of the string in quotes in the variable.

PROC Star
Posts: 1,760

Re: Using MERGE with LIKE command

[ Edited ]

Note that from a performance viewpoint, the different solutions are not equivalent.

 

option (where=( VAR1 like '%X%' )) is  faster than

option (where=( index(VAR1,'X') ))   which is in turn faster than

option (where=( find(VAR1,'X') ))  

  

In theory, the IF test is even slower as it evaluates the data once it has been read and loaded rather than at the source when reading the table. This is a common misconception.

In real life, this is not always the case. I suspect that the WHERE clause slows down the read operation whereas the IF clause allows for bulk loading and then discards the superfluous records.

This should not be the case, and there is something not quite right in the way WHERE clauses are executed. That's a place SAS can look at if they want to optimise their data step compiler. Another place is that WHERE clauses don't utilise the same resources when used as an option or as a statement.

 

See below how IF can be twice the speed of WHERE.

 

More performance tips in https://www.amazon.com/High-Performance-SAS-Coding-Christian-Graffeuille/dp/1512397490

 

If you want to benchmark on your system, here are examples I just ran (SAS9.4 on WIN 64) :

 

%* CPU times and real times are equal;

options nofullstimer;

%* Create sample and load it in memory;

data A(compress=no); A='ssssssssssssssssss'; drop I; do I=1 to 1e8; output; end; run;

sasfile A load;

%* Select none;

data _null_; set A ( where=( A like '%X%' )) ; run; %*  5.8 seconds;

data _null_; set A ( where=( index(A,'X') )) ; run; %*  7.9 seconds;

data _null_; set A ( where=( find(A,'X') ))  ; run; %*  8.9 seconds;

data _null_; set A ; if find(A,'X')                 ; run; %*  4.4 seconds;

%* Select all;

data _null_; set A ( where=( A like '%s%' )) ; run; %* 10.6 seconds;

data _null_; set A ( where=( index(A,'s') )) ; run; %* 13.8 seconds;

data _null_; set A ( where=( find(A,'s') ))  ; run; %* 14.0 seconds;

data _null_; set A ; if find(A,'s')                 ; run; %*  3.4 seconds;

sasfile A close;

 

 Also note how the IF statement becomes faster for the "select all" test since the string match happens right away, whereas the WHERE clauses become slower even though the string match also happens sooner. Not good!

It looks like SAS wastes time slow-reading, then assessing, then slow-loading the data.

Contributor
Posts: 43

Re: Using MERGE with LIKE command

Thanks for your answer
Contributor
Posts: 43

Re: Using MERGE with LIKE command

Thanks   I accepted your first answer as solution.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 689 views
  • 5 likes
  • 4 in conversation