I need to filter data in SAS/IML with a USE or READ statement by selecting rows where a variable DOES NOT start with certain characters. SAS/IML supports a WHERE clause with the =: operator, such as USE mydata where(myID =: "AB"); which selects rows where the myID variable starts with AB. In my application, I only know that I do not want rows that begin with AB, but SAS/IML does not seem to support a statement like: USE mydata where(myID ^=: "AB");
I know I can preprocess the data before calling IML, but I would like to find a way to filter the incoming data directly in IML.
> I know I can preprocess the data before calling IML, but I would like to find a way to filter the incoming data directly in IML.
When programmers want to "filter ...directly in IML," it is often because the value they are filtering on is not known before the IML program runs. Rather, it is produced during the program.
As you have discovered, the WHERE clause on the USE statement in IML does not support the same operators as the WHERE= data set option in Base SAS. I suggest you use a SUBMIT/ENDSUBMIT block to call the DATA step from within your IML program to mark or filter the data. If necessary, you can pass a parameter that indicates the value you are using to filter the data.
For example, the following program uses a SUBMIT/ENDSUBMIT block to exclude all students whose names begin with the prefix "Al":
proc iml;
Str = "Al";                        /* string to search for */
Len = nleng(Str);                  /* length of string */
submit Str Len;                    /* send parameters to Base SAS */
   data _Filter / view=_Filter;    /* create data VIEW */
      set sashelp.class;
      /* use subsetting IF statement to exclude certain obs */
      if substr(Name, 1, &Len) ^= "&Str";
   run;
endsubmit;
use _Filter;
read all var "name";
close;
print name;
If, for some reason, you don't want to use the subsetting IF statement to exclude the observations, you can create an indicator variable in the DATA step such as
_Include = (substr(Name, 1, &Len) ^= "&Str");
and then in your IML program use the syntax
use _Filter where(_Include=1);
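Put together, the indicator-variable variant might look like the following. This is only a sketch that combines the two fragments above with the earlier program. It assumes the same sashelp.class data and the same Str and Len parameters, and the view now keeps every row and only flags the ones to read.

```sas
proc iml;
Str = "Al";                        /* prefix to exclude */
Len = nleng(Str);                  /* length of the prefix */
submit Str Len;                    /* send parameters to Base SAS */
   data _Filter / view=_Filter;    /* view keeps all rows */
      set sashelp.class;
      /* 1 if Name does NOT start with the prefix; 0 otherwise */
      _Include = (substr(Name, 1, &Len) ^= "&Str");
   run;
endsubmit;
use _Filter where(_Include=1);     /* read only the flagged rows */
read all var "name";
close;
print name;
quit;
```

Because the WHERE clause here is a plain equality test on a numeric variable, it stays within what the IML USE statement handles.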
Why not just use normal SAS data set options, like you would in PROC PRINT, instead of trying to use ANY type of IML syntax?
proc print data=sashelp.class(where=(name ^=: 'A'));
run;
So in IML try:
USE mydata(where=(myID ^=: "AB"));