Hi,
I want to search multiple values in a string field. e.g. currently i am using this code to look for either Val1 or Val2 :
Newfield = Index (Field, in "Val1", "Val2"))
The original field has values separted by comma like this Val1,Val2, ....,Val50
SAS error says too many arguments.
Any ideas how can search more than 1 value in the same field?
Thanks
why not just
where Field in ("Val1", "Val2");
data looks like this:
Looking for VAL1 or VAL2 in "Field"
Have
ID Field
1 VAL1, VAL40, VAL60
2 VAL0, VAL40, VAL61
3 VAL1, VAL40, VAL2
want
ID Field WANT Comments
1 VAL1, VAL40, VAL60 1 has VAL1
2 VAL0, VAL40, VAL61 0 does not have VAL1 OR VAL2
3 VAL1, VAL40, VAL2 2 has both VAL1 OR VAL2
data have;
input ID Field $30.;
cards;
1 VAL1, VAL40, VAL60
2 VAL0, VAL40, VAL61
3 VAL1, VAL40, VAL2
;
data want;
set have;
length want $50;
do _n_=1 to countw(field,',');
temp=strip(scan(field,_n_,','));
if temp in ('VAL1','VAL2') then
want=catx('-',want,temp);
end;
drop temp;
run;
When the data value itself is string that is a comma separated value list you can use the INDEXW function to locate the position of a word in the value. If you increase the number of flag cases the number of comments possibilities increase in a combinatoric fashion.
Example:
data have;
input ID & Field $40.;
datalines;
1 VAL1, VAL40, VAL60
2 VAL0, VAL40, VAL61
3 VAL1, VAL40, VAL2
;
data want;
set have;
flag1 = indexw(field, 'VAL1', ' ,') > 0;
flag2 = indexw(field, 'VAL2', ' ,') > 0;
want = flag1 + flag2;
length comments $50;
select (cats(flag1,flag2));
when ('00') comments = 'Has neither VAL1 nor VAL2';
when ('10') comments = 'Has only VAL1';
when ('01') comments = 'Has only VAL2';
otherwise comments = 'Has both VAL1 and VAL2';
end;
run;
For more flagging based on more complex patterns you can use PRXMATCH function to search using a PERL regular expression.
If you have a list of match terms and you need to know the index position of a match in the list do the following
Place the terms in an array and use the WHICHC function to find the matching index:
data have; input field $; datalines; xyzzy foo bar 123 walter run; data want; set have; length termIndex 8; array terms(15) $20 ( "ralph", "crimson", "zork", "xyzzy", "romeo", "xyzzy", "herald", "beluga", "foo", "snafu", "egbert", "herd", "scope", "bar", "123" ); termIndex = whichC (field, of terms(*)); run;
A non-match will return 0.
If you only need to test for any match just use the IN operator
flag = field in ( "ralph", "crimson", "zork", "xyzzy", "romeo", "xyzzy", "herald", "beluga", "foo", "snafu", "egbert", "herd", "scope", "bar", "123" );
For a list of 50 terms, you might be better off putting the values in a separate data set and querying. The control of the terms is easier when in a data set, and source code does not need to be changed if the terms change.
Example:
data terms;
infile datalines dsd dlm=',';
input term $ @@;
if not missing(term);
datalines;
"ralph", "crimson", "zork", "xyzzy", "romeo",
"xyzzy", "herald", "beluga", "foo", "snafu",
"egbert", "herd", "scope", "bar", "123"
;
proc sql;
create table want as
select
have.field,
exists (select * from terms where terms.term=have.field) as flag
from
have
;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.