BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
msantosp
Fluorite | Level 6

Dear all,

 

I am using SAS 9.3 and I would like to fill a two-dimensional array from a current table of data in a data step (I do not have proc IML and I want to learn how to proceed using arrays).

 

 

data work.staff;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;
58# 976# 1
78# 571# 2
49# 221# 2
39# 056# 2
40# 592# 2
38# 382# 3
12# 400# 5
21# 237# 1
38# 258# 3 
;

 

I am not able to get it. I can imput then on a one dimension array but it is not enough for my purpose.

 

 

Can you help me?

 

Thank you in advance.

 

1 ACCEPTED SOLUTION

Accepted Solutions
KachiM
Rhodochrosite | Level 12

Doing Matrix Multiplication in a Data Step is possible. Usually it is not for beginners. But you must be courageous to read and understand the code. You have two Data sets with 3 rows and 3 columns, 3 rows and 2 columns. The product matrix is 3 by 2. Firstly, the Data set (Table1 say T1) is loaded into the Array M[3,3]. Similarly T2 is loaded into Array N[3,2]. Both Loadings are done when _N_ =1. The Product Matrix P[3,2] is computed in the usual way as we do in matrix multiplication in the Data Step. Here is the code. Be patient to understand the code.

data work.t1;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
datalines;
58 # 976 # 1
78 # 571 # 2
49 # 221 # 2
;
run;

data work.t2;
   infile datalines dlm='#';
   input Col1 Col2;
   datalines;
12# 15
748# 48
494# 112
;
run;

data t3;
array m[3,3] _temporary_;
array n[3,2] _temporary_;
array p[3,2] _temporary_;
call missing(of p[*]);
if _n_ = 1 then do;
   /* Load the Data SET T1 into Array sized to m[3,3] */
   do i = 1 by 1 until(z1);
      set t1 end = z1;
      array c[3] col1 - col3;
      do j = 1 to 3;
         m[i, j] = c[j];
      end;
   end;
   /* LOad the Data Set T2 into Array sized to n[3,2] */
   do i = 1 by 1 until(z2);
      set t2 end = z2;
      set t2 end = z2;
      array v[2] col1 - col2;
      do j = 1 to 2;
         n[i,j] = v[j];
      end;
  end;
end;
/* Do the Matrix Product (3*3 X 3*2 = 3*2) */
/* p[i,j] = SUM of m[i,k] * n[k,j] */
do i = 1 to 3;
   do j = 1 to 2;
      do k = 1 to 3;
         p[i,j] + (m[i,k] * n[k,j]);
      end;
   end;
end;
/** Output Matrix **/
do i = 1 to 3;
   do j = 1 to 2;
      put p[i,j] =;
   end;
end;
stop;
run;

This can be done easily using PROC FCMP which also comes with Base SAS. Again you consult the SAS Documentation for Proc FCMP. There are SPECIAL Matrix Functions. I am using CALL MULT(M, N, P) to get the Product Matrix. Read_array and Call Dynamic_array are special FCMP functions which you have to read. Read_array reads a Data Set into a 2-dimensional Array. The Call Dynamic_array creates an array of required dimensions (in this case 3 by 2) at run_time. The second function is available in Proc FCMP and is not supported in Data Step.

 

Here is the magic program:

proc fcmp;
   array m[1] / nosym;
   array n[1] / nosym;
   array p[1] / nosym;
   file log;
   rc = read_array('t1', m);
   rc = read_array('t2', n);
   call dynamic_array(p,3,2);
   call mult(m,n,p);
   put p =;
quit;

Here is the Product Matrix(P)

 

p[1, 1]=731238 p[1, 2]=47830 p[2, 1]=429032 p[2, 2]=28802 p[3, 1]=166884 p[3, 2]=11567

 

 

AFTER 6 Hours of my Reply:

 

There were two unwanted statements in the first solution. They are deleted and the revised code is given below:

 

data _null_;
array m[3,3] _temporary_;
array n[3,2] _temporary_;
array p[3,2] _temporary_;

if _n_ = 1 then do;
   /* Load the Data SET T1 into Array sized to m[3,3] */
   do i = 1 by 1 until(z1);
      set t1 end = z1;
      array c[3] col1 - col3;
      do j = 1 to 3;
         m[i, j] = c[j];
      end;
   end;
   /* LOad the Data Set T2 into Array sized to n[3,2] */
   do i = 1 by 1 until(z2);
      set t2 end = z2;
      array v[2] col1 - col2;
      do j = 1 to 2;
         n[i,j] = v[j];
      end;
  end;
end;
/* Do the Matrix Product (3*3 X 3*2 = 3*2) */
/* p[i,j] = SUM of m[i,k] * n[k,j] */
do i = 1 to 3;
   do j = 1 to 2;
      do k = 1 to 3;
         p[i,j] + (m[i,k] * n[k,j]);
      end;
   end;
end;
/** Output Matrix **/
do i = 1 to 3;
   do j = 1 to 2;
      put p[i,j] =;
   end;
end;
stop;
run;

To the Community Manager:

 

Why the replies are not time-sequenced? Earlier replies come at the bottom and the readability of the thread becomes fragmented. At least the Original Poster could have NOTICED and reacted to the reply. To me replying to Posts become dull and monotonous as there were no reactions shown to the reply. A mere 'LIKE' is the only solace.

View solution in original post

14 REPLIES 14
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

this is a two dimensional that might help.

 

data totals(drop= t_1-t_&upper. x y);
   array N_t(1:&n_m.);
   array t_(1:&n_m.,1:&n_m.);
   set mydata;
   do x = 1 to &n_m.;
      do y = 1 to &n_m.;
         N_t(y)=t_(x,y);
         if N_t(y)=0 then N_t(y)=.;
         if y = &n_m. then output;
      end;
   end;
run;

where you would need to change x and y to the columns of your choice.  I edit because of typo.

 

 

ballardw
Super User

@msantosp wrote:

Dear all,

 

I am using SAS 9.3 and I would like to fill a two-dimensional array from a current table of data in a data step (I do not have proc IML).

 

 

data work.staff;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;
58# 976# 1
78# 571# 2
49# 221# 2
39# 056# 2
40# 592# 2
38# 382# 3
12# 400# 5
21# 237# 1
38# 258# 3 
;

 

I am not able to get it. I can imput then on a one dimension array but it is not enough for my purpose.

 

 

Can you help me?

 

Thank you in advance.

 


A SAS array will only use variables/values available for a single record. So there really isn't a way to place an entire data set into an array as I believe you are contemplating.

 

You might describe what you are attempting to do with the data.

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please explain your question show what you expect out or how you are going to use it.  An array in SAS is not the same as other languages.  Arrays are a shorthand way of referring to collections of variables within a datastep.  At the end of the datastep the array disappears and you are only left with the variables.

msantosp
Fluorite | Level 6

Yes, I know that every array will dissapear when the data step finish...

 

I would like to define a two dimension array because I would like to operate it like a matrix. For example, to compute the product of two dataset...

 

So I think that the best method should be this and, once calculated, output it to a dataset.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, here is the SAS documents on the syntax for multi-dimensional arrays;

http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000739621.htm

 

I am not sure what help you expect us to provide if you do not provide an example?

msantosp
Fluorite | Level 6

Ok, so my example is the next

 

Could you use arrays tu compute the matrix result as the product of the following dataset:

 

data work.table1;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;
58# 976# 1
78# 571# 2
49# 221# 2
;
run;

data work.table2;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;
12# 15
748# 48
494# 112
;
run;

The result should be:

58*12+976*748+1*15

78*12+571*748+ 2*15

49*12+221*748+ 2*15

RW9
Diamond | Level 26 RW9
Diamond | Level 26

There are several parts to this question.

1) How are you going to get that data together, there are no identifying variables by which to merge them.

2) Once you have them merged which elements form your formula, as:

58*12+976*748+1*15   -- 58, 976, 1 all come from table 1 row 1, and 12 and 15 come from table 2 row 1.  But 748 is not on row 1 of either dataset.

3) I believe your test data is incorrect, the second dataset is reading 3 variables, but there are only 2 per row.  You would end up with one row of data with 12,15,78 and the other 3 data items would be lost.

msantosp
Fluorite | Level 6

Yes, you are right. My example is wrong. What I mean was

 

data work.table1;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;58# 976# 1
78# 571# 2
49# 221# 2;
run;

data work.table2;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;12# 15
748# 48
494# 112;
run;

The result should be:

58*12+976*748+1*494       58*15+976*48+1*112

78*12+571*748+ 2*494      78*15+571*48+2*112

49*12+221*748+ 2*494      49*15+221*48+2*112

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Nope, still not right.  There are still 3 variables being read in table2 but only 2 data items per row.

 

Let me add some code to show, here I use observation number as the merging variable which gets all the data into one dataset:

data work.table1;
  infile datalines dlm='#';
  input Col1 Col2 Col3;
  id=_n_;
datalines;
58# 976# 1
78# 571# 2
49# 221# 2
;
run;

data work.table2;
  infile datalines dlm='#';
  input Col1 Col2;
  id=_n_;
datalines;
12# 15
748# 48
494# 112
;
run;

data want;
  merge table1 table2 (rename=(col1=t2col1 col2=t2col2));
  by id;
  result=col1 * t2col1 + col2 * t2col2 + col3 * ???;
run;

In the want dataset I then use the data which I merge in to do the formula, however I am missing one data item - from table 2 as there are only two data items.  

 

Next up is your formula:

58*12+976*748+1*494       58*15+976*48+1*112    -- in this, 58, 12, 15, 976, come from row 1, but 748, 48 come from row 2, and 494 and 112 come from row 3.  So this is not an array, this is lagged or retained data, or all values should be on one row.  In which case we then get to the code:

data work.table1;
  infile datalines dlm='#';
  input Col1 Col2 Col3 col4 col5 col6 col7 col8 col9;
datalines;
58# 976# 1
78# 571# 2
49# 221# 2
;
run;

data work.table2;
  infile datalines dlm='#';
  input tCol1 tCol2 tcol3 tcol4 tcol5 tcol6;
datalines;
12# 15
748# 48
494# 112
;
run;

data want;
  merge table1 table2;
  array col{9};
  array tcol{6};
  do i=1 to 9;
    result=col{i} * tcol{i} + col2...;
output; end; run;

This has all the data you presented on one line, you can then loop over that one line of data using the arrays to calculate results outputting each one.

 

KachiM
Rhodochrosite | Level 12

Doing Matrix Multiplication in a Data Step is possible. Usually it is not for beginners. But you must be courageous to read and understand the code. You have two Data sets with 3 rows and 3 columns, 3 rows and 2 columns. The product matrix is 3 by 2. Firstly, the Data set (Table1 say T1) is loaded into the Array M[3,3]. Similarly T2 is loaded into Array N[3,2]. Both Loadings are done when _N_ =1. The Product Matrix P[3,2] is computed in the usual way as we do in matrix multiplication in the Data Step. Here is the code. Be patient to understand the code.

data work.t1;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
datalines;
58 # 976 # 1
78 # 571 # 2
49 # 221 # 2
;
run;

data work.t2;
   infile datalines dlm='#';
   input Col1 Col2;
   datalines;
12# 15
748# 48
494# 112
;
run;

data t3;
array m[3,3] _temporary_;
array n[3,2] _temporary_;
array p[3,2] _temporary_;
call missing(of p[*]);
if _n_ = 1 then do;
   /* Load the Data SET T1 into Array sized to m[3,3] */
   do i = 1 by 1 until(z1);
      set t1 end = z1;
      array c[3] col1 - col3;
      do j = 1 to 3;
         m[i, j] = c[j];
      end;
   end;
   /* LOad the Data Set T2 into Array sized to n[3,2] */
   do i = 1 by 1 until(z2);
      set t2 end = z2;
      set t2 end = z2;
      array v[2] col1 - col2;
      do j = 1 to 2;
         n[i,j] = v[j];
      end;
  end;
end;
/* Do the Matrix Product (3*3 X 3*2 = 3*2) */
/* p[i,j] = SUM of m[i,k] * n[k,j] */
do i = 1 to 3;
   do j = 1 to 2;
      do k = 1 to 3;
         p[i,j] + (m[i,k] * n[k,j]);
      end;
   end;
end;
/** Output Matrix **/
do i = 1 to 3;
   do j = 1 to 2;
      put p[i,j] =;
   end;
end;
stop;
run;

This can be done easily using PROC FCMP which also comes with Base SAS. Again you consult the SAS Documentation for Proc FCMP. There are SPECIAL Matrix Functions. I am using CALL MULT(M, N, P) to get the Product Matrix. Read_array and Call Dynamic_array are special FCMP functions which you have to read. Read_array reads a Data Set into a 2-dimensional Array. The Call Dynamic_array creates an array of required dimensions (in this case 3 by 2) at run_time. The second function is available in Proc FCMP and is not supported in Data Step.

 

Here is the magic program:

proc fcmp;
   array m[1] / nosym;
   array n[1] / nosym;
   array p[1] / nosym;
   file log;
   rc = read_array('t1', m);
   rc = read_array('t2', n);
   call dynamic_array(p,3,2);
   call mult(m,n,p);
   put p =;
quit;

Here is the Product Matrix(P)

 

p[1, 1]=731238 p[1, 2]=47830 p[2, 1]=429032 p[2, 2]=28802 p[3, 1]=166884 p[3, 2]=11567

 

 

AFTER 6 Hours of my Reply:

 

There were two unwanted statements in the first solution. They are deleted and the revised code is given below:

 

data _null_;
array m[3,3] _temporary_;
array n[3,2] _temporary_;
array p[3,2] _temporary_;

if _n_ = 1 then do;
   /* Load the Data SET T1 into Array sized to m[3,3] */
   do i = 1 by 1 until(z1);
      set t1 end = z1;
      array c[3] col1 - col3;
      do j = 1 to 3;
         m[i, j] = c[j];
      end;
   end;
   /* LOad the Data Set T2 into Array sized to n[3,2] */
   do i = 1 by 1 until(z2);
      set t2 end = z2;
      array v[2] col1 - col2;
      do j = 1 to 2;
         n[i,j] = v[j];
      end;
  end;
end;
/* Do the Matrix Product (3*3 X 3*2 = 3*2) */
/* p[i,j] = SUM of m[i,k] * n[k,j] */
do i = 1 to 3;
   do j = 1 to 2;
      do k = 1 to 3;
         p[i,j] + (m[i,k] * n[k,j]);
      end;
   end;
end;
/** Output Matrix **/
do i = 1 to 3;
   do j = 1 to 2;
      put p[i,j] =;
   end;
end;
stop;
run;

To the Community Manager:

 

Why the replies are not time-sequenced? Earlier replies come at the bottom and the readability of the thread becomes fragmented. At least the Original Poster could have NOTICED and reacted to the reply. To me replying to Posts become dull and monotonous as there were no reactions shown to the reply. A mere 'LIKE' is the only solace.

mkeintz
PROC Star

If you want to "compute the product of two datasets", consider using proc score, whose original purpose was to compute factor scores from original variables, using a data set of scoring coefficients, means, and std's..  In your case, you might not need the means or std's.  Here's an example:

 

data work.staff;
   infile datalines dlm='#';
   input Col1 Col2 Col3;
   datalines;
58# 976# 1
78# 571# 2
49# 221# 2
39# 056# 2
40# 592# 2
38# 382# 3
12# 400# 5
21# 237# 1
38# 258# 3 
;
data scores;
input _TYPE_ :$1.  _name_ :$7.  col1 col2 col3 ;
datalines;
S  result1   1 2 3
S  result2  10 20 30
run;
proc score data=staff score=scores nostd type='S'  out=want;
  var col1 col2 col3;
run;

 

Now the matrix in dataset SCORE (ignoring the _TYPE_ and _NAME_ variables) is not in conformable arrangement for multiplying against staff - at least it's not conformable as we usually implement matrix multiplication.   Instead it's basically the transpose of a conformable matrix, so you could run a proc transpose, if needed, against it first, before proc score.

 

You can look up documentation on proc score for more information.  For instance the SCORES dataset could also have a row of means and a row of std's to get standardized results of matix multiplication (I eliminated that via the NOSTD option).  Take a look at Getting Started Example for PROC SCORE for more details.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
msantosp
Fluorite | Level 6

Thank you but I would like to learn how can I do it with arrays.

VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

Please post an example of your wanted output based on the example of the data you have.

KachiM
Rhodochrosite | Level 12

You have Col1, col2 and Col3 and want to use 2-dimensional Array. Tell use whether you want to use 

 

Array k[col1, col2] = col3 

 

or

 

Array k[col2, col3] = col1

 

Assuming you want to use the former, here is how it can be done:

data want;

array k[12:78, 56:976] _temporary_;
     do until(last);
        set staff end = last;
        k[col1,col2] = col3;
    end;
   ....    Do other things with your saved array ------
run;

This is one way of using array. But this wastes memory spaces but gives you fast Lookup-time. In one hit you find whether LOOKUP Succeeded or Failed.


Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 7625 views
  • 2 likes
  • 6 in conversation