Yao_W
Calcite | Level 5

Hi all,


I would like to do the same thing (see below) in PROC IML. I created two variables and output observations based on values of "b".


data test;
input a b;
cards;
1 0
2 1
3 2
4 0
5 1
6 2
7 0
;
run;

data test1 test2 test3;
      set test;
      if b=0 then output test1;
      else if b=1 then output test2;
      else output test3;
run;

With only two groups, I tried the syntax below and it worked.

proc iml;
a={1, 2, 3, 4, 5, 6, 7};                    *a vector with random numbers;
b={0, 1, 0, 0, 1, 0, 1};                     *a vector indicating the group membership;
c=t(remove(a,loc(b)));                     *elements from a when b=0;
d=t(remove(a,loc(element(a,c))));     *elements from a when b=1;
quit;

BUT, I have difficulty if there are three groups such as:
a={1, 2, 3, 4, 5, 6, 7};
b={0, 1, 2, 0, 1, 2, 1};

I could save those values in a SAS data set and then split it into three data sets with a DATA step, but I would like to learn how to do this in PROC IML.

Thanks to all!

Accepted Solutions
Rick_SAS
SAS Super FREQ

Two approaches:

1) Use the UNIQUE-LOC technique  (Search for "unique-loc" at blogs.sas.com/content/iml)

2) Use the UNIQUEBY approach

Assuming that you are using SAS/IML 12.1 or better, you can construct the data set names inside the DO loop.

For example, here is the UNIQUE-LOC approach:

proc iml;
a={1, 2, 3, 4, 5, 6, 7};                  
b={0, 1, 2, 0, 1, 2, 1};

u = unique(b);   
do i = 1 to ncol(u);
   v = a[loc(b=u[i])];   /* elements of a in the i_th category of b */
   /* if you want it written to a data set, do the following */
   dsname = "test" + strip(char(i));
   create (dsname) var {v};  append;  close (dsname);
end;
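
The second approach (UNIQUEBY) is only named above, so here is a minimal sketch of how it might look for the same data. It is not part of the original reply. The matrix name m and the helper vector first are made up for illustration, and the sketch assumes the values and group codes can be combined into one numeric matrix.

```sas
proc iml;
a = {1, 2, 3, 4, 5, 6, 7};
b = {0, 1, 2, 0, 1, 2, 1};

m = a || b;                            /* column 1 = values, column 2 = group code */
call sort(m, {2 1});                   /* UNIQUEBY expects rows sorted by the BY column */
first = uniqueby(m, 2, 1:nrow(m));     /* row where each group starts in the sorted matrix */
first = colvec(first) // (nrow(m)+1);  /* sentinel row marks the end of the last group */

do i = 1 to nrow(first)-1;
   idx = first[i]:(first[i+1]-1);      /* rows that belong to the i_th group */
   v = m[idx, 1];                      /* values of a for that group */
   print i v;
end;
quit;
```

Compared with UNIQUE-LOC, this sorts the data once and then slices contiguous row ranges instead of searching the whole vector for each category.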

Yao_W
Calcite | Level 5

Thanks for your quick reply, Rick.

I tried the UNIQUE-LOC approach. This approach does not create three vectors or data sets. In my understanding, v is a temporary vector overwritten every time in the do loop. I want three separate vectors like v1, v2, and v3. I remember I also tried UNIQUEBY last time but I failed. I will try again to see what I will get.

I also tried WHERE clause before. I failed because the length of the categorical variable is not fixed.

Rick_SAS
SAS Super FREQ

Your post says "I would like to do the same thing (see below) in PROC IML."  That is what I showed. The code I provided creates three data sets (TEST1, TEST2, and TEST3) that contain the information you asked for.
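
If separate in-memory vectors named v1, v2, and v3 are what is wanted rather than data sets, one possibility (a sketch that is not part of the original reply) is to assign each subset to a matrix whose name is built at run time with CALL VALSET. The variable vname is a made-up helper name.

```sas
proc iml;
a = {1, 2, 3, 4, 5, 6, 7};
b = {0, 1, 2, 0, 1, 2, 1};

u = unique(b);
do i = 1 to ncol(u);
   v = a[loc(b = u[i])];            /* elements of a in the i_th category */
   vname = "v" + strip(char(i));    /* build the matrix name: v1, v2, v3 */
   call valset(vname, v);           /* assign v to the matrix named in vname */
end;
print v1 v2 v3;
quit;
```

This hard-codes the three names in the PRINT statement, so it only makes sense when the number of categories is known beforehand.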

Yao_W
Calcite | Level 5

I might have done something wrong then.

I copied and ran your code directly, but got error messages.

186 create (dsname) var {v};
                     -
                    22
                    76

ERROR 22-322: Expecting a name.

ERROR 76-322: Syntax error, statement will be ignored.

So I deleted the parentheses. Then I got

NOTE: The data set WORK.DSNAME has 12 observations and 1 variables.

NOTE: The data set WORK.DSNAME has 12 observations and 1 variables.

NOTE: The data set WORK.DSNAME has 76 observations and 1 variables.

So only one data set (DSNAME) was created, although there should be three: "test1", "test2", and "test3".

Rick_SAS
SAS Super FREQ

Apparently you are not running SAS/IML 12.1 or better. What appears in the SAS log when you submit the following statement?

%put &SYSVLONG;

Yao_W
Calcite | Level 5

244  %put &SYSVLONG;

9.03.01M1P110211

Rick_SAS
SAS Super FREQ

You have SAS 9.3, which is about 3 years old. (That would have been good information to provide in your question.) The CREATE and CLOSE statements in your version of SAS do not support data set names that are defined at run time, which is why you are getting an error.

With software this old, I would either upgrade or use Base SAS tools to do the splitting. Do an internet search for "sas split dataset by variable" and you'll find dozens of techniques and macros, such as this page.  It's not that hard, but if you get stuck the fine folks at the Base SAS and macro support community can help you over the hurdles.
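
As a rough sketch of the Base SAS fallback for this thread's example (an illustration under the assumption that the group values 0, 1, and 2 are known in advance, not one of the techniques the search would turn up): write both vectors to a single data set with a fixed name, which does not need run-time names, and split it with a DATA step. The data set name grouped is made up.

```sas
proc iml;
a = {1, 2, 3, 4, 5, 6, 7};
b = {0, 1, 2, 0, 1, 2, 1};
create grouped var {a b};   /* fixed data set name, so no run-time name is needed */
append;
close grouped;
quit;

/* split with a DATA step: one output data set per known value of b */
data test1 test2 test3;
   set grouped;
   if b = 0 then output test1;
   else if b = 1 then output test2;
   else output test3;
run;
```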

Yao_W
Calcite | Level 5

Thanks, Rick. I will try it on other computers that have SAS 9.4 installed. The UNIQUE function works for me so far.

Rick_SAS
SAS Super FREQ

You can also use the WHERE clause to read only certain observations into SAS/IML within a loop. For example, this blog post shows an example where the READ statement is

read all var {a} where (b=i); /* read i_th category */

The downside of this WHERE clause approach is that you have to know the categories in advance and you end up passing through the data multiple times.  If the data fit in memory, I'd stick with UNIQUE-LOC.  BTW, several useful variations of the UNIQUE-LOC technique are described on pp 60-72 of Statistical Programming with SAS/IML Software.
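
For the data in this thread, the WHERE clause loop might be sketched as follows. This is an illustration rather than the code from the blog post, and it assumes the categories are known to be 0, 1, and 2.

```sas
data test;
input a b;
datalines;
1 0
2 1
3 2
4 0
5 1
6 2
7 1
;
run;

proc iml;
use test;
do i = 0 to 2;                      /* categories must be known in advance */
   read all var {a} where (b=i);    /* read only the i_th category into vector a */
   print i a;
end;
close test;
quit;
```

Each iteration passes through the data set again, which is the cost mentioned above.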
