- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am trying to capture all the names that follow a pattern using the LIKE operator in proc sql. I don't want to use a OR command or a UNION in my sql. Is there a better way of doing it like using a IN associated with LIKE.
data test ;
input id name $ ;
datalines ;
12 ANOOP
13 ANEESH
14 REMYA
15 REMY
16 JACK
17 JACOB
19 AMY
20 ROSE
;
run ;
proc sql;
select *
from test
where name like in('%AN%','%RE%') ;
when i execute this I get the below error
ERROR: LIKE operator requires character operands.
so I removed IN and ran it and I got the below errors
43 where name like ('%AN%','%RE%') ;
_
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, ), *, **, +, -, /, <, <=, <>, =, >, >=, ?, AND, BETWEEN,
CONTAINS, EQ, EQT, GE, GET, GT, GTT, IN, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, ^, ^=, |, ||, ~, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
Thanks,
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you really are looking for only the prefix AN or RE and prefer to stay with SQL, use:
where substr(name, 1, 2) in ("AN", "RE");
PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't know if there would be any benefit to using :
where prxmatch ("/AN|RE/oi", name) > 0;
PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Are you sure that you want the LIKE condition with the preceding %. Meaning that you are looking to select rows where the name contains the string AN or RE anywhere in the name, rather than just as a prefix, as your data suggests?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If (1) you really are looking for only the prefix AN or RE, and (2) you are willing to switch from SQL to a DATA step, it becomes easy:
where name in : ('AN', 'RE');
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you really are looking for only the prefix AN or RE and prefer to stay with SQL, use:
where substr(name, 1, 2) in ("AN", "RE");
PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks. That almost gave me what I am looking for. But if they come between the sentence I may not be able to use the IN and SUBSTR. So in that case is the best approach to shift from proc sql using to the basic data step and using a Where condition?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Astounding's proposition and mine above are both for the prefix only case. If the substring can occur anywhere in the sentence, go for the PRXMATCH solution proposed earlier. - PG
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you want to search a string for more than one substring anywhere in the string (as your LIKE indicates) then I would go for a regular expression as already proposed.
If you run this SQL against a data base then I would use explicit SQL and use the data base function for regular expressions (a lot of data bases have such an implementation) to avoid downloading all the data first into SAS for sub-setting (prxmatch() can't be pushed to the data base for execution).