Hello,
I am trying to capture all the names that match any of several patterns using the LIKE operator in PROC SQL. I don't want to use an OR condition or a UNION in my SQL. Is there a better way of doing it, such as combining IN with LIKE?
data test ;
input id name $ ;
datalines ;
12 ANOOP
13 ANEESH
14 REMYA
15 REMY
16 JACK
17 JACOB
19 AMY
20 ROSE
;
run ;
proc sql;
select *
from test
where name like in('%AN%','%RE%') ;
When I execute this, I get the error below:
ERROR: LIKE operator requires character operands.
So I removed IN and ran it again, and I got the errors below:
43 where name like ('%AN%','%RE%') ;
_
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, ), *, **, +, -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, IN, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, ^, ^=, |, ||, ~, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
Thanks,
I don't know if there would be any benefit to using:
where prxmatch ("/AN|RE/oi", name) > 0;
PG
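Dropped into the query from the question, that condition might look like the sketch below. It assumes the test data set created in the original post. Note that the i modifier makes the match case-insensitive, which LIKE is not; remove it if case matters. With the sample data, this should select ANOOP, ANEESH, REMYA and REMY.

```sas
proc sql;
  select *
  from test
  /* one regular expression with alternation stands in for
     several LIKE '%...%' conditions joined by OR */
  where prxmatch("/AN|RE/oi", name) > 0;
quit;
```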
Are you sure that you want the LIKE condition with the leading %? That would mean you are looking to select rows where the name contains the string AN or RE anywhere in the name, rather than just as a prefix, as your data suggests.
If (1) you really are looking for only the prefix AN or RE, and (2) you are willing to switch from SQL to a DATA step, it becomes easy:
where name in : ('AN', 'RE');
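As a minimal sketch of that DATA step version, assuming the test data set from the question (the output name prefix_match is made up for illustration):

```sas
data prefix_match;   /* hypothetical output data set name */
  set test;
  /* the colon after IN truncates the comparison to the length of
     each listed value, so only names starting with AN or RE pass */
  where name in: ('AN', 'RE');
run;
```

With the sample data, AMY is excluded because it starts with AM, not AN.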
If you really are looking for only the prefix AN or RE and prefer to stay with SQL, use:
where substr(name, 1, 2) in ("AN", "RE");
PG
Thanks. That almost gave me what I am looking for. But if the pattern occurs in the middle of the string, I may not be able to use IN with SUBSTR. In that case, is the best approach to switch from PROC SQL to a basic DATA step with a WHERE condition?
@Astounding's proposition and mine above are both for the prefix-only case. If the substring can occur anywhere in the sentence, go for the PRXMATCH solution proposed earlier. - PG
If you want to search a string for more than one substring anywhere in the string (as your LIKE indicates), then I would go for a regular expression, as already proposed.
If you run this SQL against a database, then I would use explicit pass-through SQL and the database's own regular expression function (many databases have one). That avoids downloading all the data into SAS before subsetting, because prxmatch() can't be pushed to the database for execution.
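A rough sketch of what explicit pass-through could look like, assuming an Oracle database accessed through SAS/ACCESS. The connection options, the macro variables dbuser and dbpass, the path mydb and the table name test_table are all placeholders, not anything from this thread; REGEXP_LIKE is Oracle's function, and other databases use different names and syntax.

```sas
proc sql;
  /* hypothetical connection details: replace with your own */
  connect to oracle as db (user=&dbuser password=&dbpass path="mydb");

  /* the inner query is sent to Oracle as written, so the regular
     expression filter runs in the database and only matching rows
     are returned to SAS */
  select *
  from connection to db
    ( select id, name
      from test_table
      where regexp_like(name, 'AN|RE') );

  disconnect from db;
quit;
```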