Hi,
I want to search multiple values in a string field. e.g. currently i am using this code to look for either Val1 or Val2 :
Newfield = Index (Field, in "Val1", "Val2"))
The original field has values separted by comma like this Val1,Val2, ....,Val50
SAS error says too many arguments.
Any ideas how can search more than 1 value in the same field?
Thanks
why not just
where Field in ("Val1", "Val2");
data looks like this:
Looking for VAL1 or VAL2 in "Field"
Have
ID Field
1 VAL1, VAL40, VAL60
2 VAL0, VAL40, VAL61
3 VAL1, VAL40, VAL2
want
ID Field WANT Comments
1 VAL1, VAL40, VAL60 1 has VAL1
2 VAL0, VAL40, VAL61 0 does not have VAL1 OR VAL2
3 VAL1, VAL40, VAL2 2 has both VAL1 OR VAL2
data have;
input ID Field $30.;
cards;
1 VAL1, VAL40, VAL60
2 VAL0, VAL40, VAL61
3 VAL1, VAL40, VAL2
;
data want;
set have;
length want $50;
do _n_=1 to countw(field,',');
temp=strip(scan(field,_n_,','));
if temp in ('VAL1','VAL2') then
want=catx('-',want,temp);
end;
drop temp;
run;
When the data value itself is string that is a comma separated value list you can use the INDEXW function to locate the position of a word in the value. If you increase the number of flag cases the number of comments possibilities increase in a combinatoric fashion.
Example:
data have; input ID & Field $40.; datalines; 1 VAL1, VAL40, VAL60 2 VAL0, VAL40, VAL61 3 VAL1, VAL40, VAL2 ; data want; set have; flag1 = indexw(field, 'VAL1', ' ,') > 0; flag2 = indexw(field, 'VAL2', ' ,') > 0; want = flag1 + flag2; length comments $50; select (cats(flag1,flag2)); when ('00') comments = 'Has neither VAL1 nor VAL2'; when ('10') comments = 'Has only VAL1'; when ('01') comments = 'Has only VAL2'; otherwise comments = 'Has both VAL1 and VAL2'; end; run;
For more flagging based on more complex patterns you can use PRXMATCH function to search using a PERL regular expression.
If you have a list of match terms and you need to know the index position of a match in the list do the following
Place the terms in an array and use the WHICHC function to find the matching index:
data have; input field $; datalines; xyzzy foo bar 123 walter run; data want; set have; length termIndex 8; array terms(15) $20 ( "ralph", "crimson", "zork", "xyzzy", "romeo", "xyzzy", "herald", "beluga", "foo", "snafu", "egbert", "herd", "scope", "bar", "123" ); termIndex = whichC (field, of terms(*)); run;
A non-match will return 0.
If you only need to test for any match just use the IN operator
flag = field in ( "ralph", "crimson", "zork", "xyzzy", "romeo", "xyzzy", "herald", "beluga", "foo", "snafu", "egbert", "herd", "scope", "bar", "123" );
For a list of 50 terms, you might be better off putting the values in a separate data set and querying. The control of the terms is easier when in a data set, and source code does not need to be changed if the terms change.
Example:
data terms; infile datalines dsd dlm=','; input term $ @@; if not missing(term); datalines; "ralph", "crimson", "zork", "xyzzy", "romeo", "xyzzy", "herald", "beluga", "foo", "snafu", "egbert", "herd", "scope", "bar", "123" ; proc sql; create table want as select have.field, exists (select * from terms where terms.term=have.field) as flag from have ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.