I need to write the contents of a 1-column matrix into a new column in an existing dataset. Is there a way of doing this in IML such that the values being written into the new column line up with the corresponding content in the old data? (The matrix and the old dataset have the same number of rows).
Yes. Here is a simple example that you can study and play with.
data Have;
do i = 1 to 10;
x = i + 1;
y = -2*i -4;
output;
end;
run;
proc iml;
newVar = T(10:1); /* 10, 9, 8,..., 1 */
create Calc var "newVar";
append;
close Calc;
quit;
data Want;
merge Have Calc;
run;
proc print data=Want;
run;
You write the contents of the 1-column matrix to a SAS data set (IML commands to use are CREATE and APPEND). Then in a SAS data step, you merge the original data set and the data set created from PROC IML.
I agree with @PaigeMiller. SAS stores data sets rowwise, so to add a column you actually need to create a new data set. You can append rows to an existing data set, but not columns.
You could do this operation entirely in IML, but it will be more efficient to use the MERGE statement in the DATA step to combine the data (unless all the data are already in IML vectors or matrices).
In other words, with the MERGE statement, the rows in two datasets will line up? (The new dataset is calculated based on the old one, so has the same number of rows, but does not share a common field with it.)
Sorry for these kinds of questions, I'm utterly new to SAS and IML.
Yes. Here is a simple example that you can study and play with.
data Have;
do i = 1 to 10;
x = i + 1;
y = -2*i -4;
output;
end;
run;
proc iml;
newVar = T(10:1); /* 10, 9, 8,..., 1 */
create Calc var "newVar";
append;
close Calc;
quit;
data Want;
merge Have Calc;
run;
proc print data=Want;
run;
Into the Calc dataset, I also want to read the variable "key_num" (or whatever I'll call the common field from the Have dataset) for use in the "BY" statement in the later MERGE. How do I make sure that the values in "newVar" line up with the corresponding values in "key_num"?
You create your matrix in IML from the two columns in your original data set, key_num and the variable of interest.
Then when you use IML to create a new SAS data set, it will have key_num and the computed column. If there is any need to sort in IML, you'd want to sort both key_num and the data.
How do I perform a computation (for example, elementwise multiplication) on only that column of the matrix which holds the variable of interest? In other words, how do I avoid changing the key value during computations?
It is the programmers responsibility to make sure the rows line up. SAS will not guarantee this. If you have done nothing in either IML or outside of IML to change the orders of the rows, then they should line up.
You can guarantee that the rows will line up if you have some sort of row identifier variable, then you can merge the datasets using a BY statement and including this row identifier varialbe in the BY statement. For this to work, you need to have the row identifier in IML as well as in the original data set.
So, to guarantee that MERGE works properly, do I need the new dataset to have two columns: the new content and the common field (to use in the "BY" statement) from the old dataset?
Either that, or by confirming that your code has not changed the order of the data in IML and has not changed the order of the data in your data step.
Are you sure that you need IML for this problem? Perhaps the DATA step would be easier for you,
If you want to learn IML, here are some tips for getting started with IML programming. I think after you complete steps 1-3 you should be able to make a good first attempt at this problem.
I, too, am wondering whether IML is how this is done, since I'm not really doing anything that necessitates matrices (all my calculations are elementwise). What I need to do is multiply every value in a column by a certain number; raise every value in a column to a certain power; multiply each value by the corresponding value in another column; and obtain a sum of all the values in a column. Can all of these be done with the Data step?
@dvtarasov wrote:
I, too, am wondering whether IML is how this is done, since I'm not really doing anything that necessitates matrices (all my calculations are elementwise). What I need to do is multiply every value in a column by a certain number; raise every value in a column to a certain power; multiply each value by the corresponding value in another column; and obtain a sum of all the values in a column. Can all of these be done with the Data step?
So let's place the two constants into macro variables to make it easier to modify later.
%let coeffient=1.5;
%let power=3 ;
Now it sounds like you want to do something like this:
data want ;
set have ;
new_var = &coefficient*VAR1 * (VAR2 ** &power);
run;
proc means data=want sum;
var new_var ;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.