Statistical programming, matrix languages, and more

Dynamically create matrices using a SAS dataset with row and column

Accepted Solution Solved
Reply
Contributor
Posts: 42
Accepted Solution

Dynamically create matrices using a SAS dataset with row and column

[ Edited ]

Hi all,

 

My SAS dataset looks like following:

 

Year

countrymethodSegmentpr_st1-2pr_st1-3pr_st1-1pr_st2-1pr_st2-3pr_st2-2pr_st3-1pr_st3-2pr_st3-3
2017USAABCRetail0.20.60.20.30.60.10.10.30.6
2017USAXYZCorporate0.10.50.40.20.60.20.20.30.5

 

Now, the first four columns are the key, and from each row I need to create a matrix. For example, For '2017 USA ABC Retail' the matrix would be:

 

 state1state2state3
state10.20.20.6
state20.30.10.6
state30.10.30.6

 

They are probabilities and would sum to 1. And I would need to create a matrix for each row (with unique keys).

 

The problem is the number of states would be dynamic and can not be hard-coded. It should depend on the input file. For example, there could be 20 states and then I would have two matrix each with 20x20 dimension, the dimension of the matrix would be square root of number of columns (except the keys).

 

I relaise the problem has more than one dimension, so for now I'm just looking to be able to create one matrix dynamically from one row. Could you please help me do this using proc iml? Thanks!


Accepted Solutions
Highlighted
Solution
‎12-15-2017 11:49 AM
Regular Contributor
Posts: 162

Re: Dynamically create matrices using a SAS dataset with row and column

You will need to extract the row/col indices from the variables names.  Here's one possible way putting the final square matrix together using the FULL function:

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));
  do i = 1 to 2;
    p = full (  t( x[i,]) || rcidx );
    print p;
  end;
quit;

 

Storing information in variable names is never a good idea...

View solution in original post


All Replies
SAS Super FREQ
Posts: 4,110

Re: Dynamically create matrices using a SAS dataset with row and column

Sure.

1. Read the variables that have a common prefix. 'pr_st'  (OR you might be able to use READ ALL VAR _NUM_ INTO XSmiley Wink

2. Read in all variables to a matrix X. Compute n = sqrt(nrow(X)), which should be an integer.

3. The first matrix is x1 = shape(X[1,], n); The second matrix is x2 = shape(X[2,], n);

 

Contributor
Posts: 42

Re: Dynamically create matrices using a SAS dataset with row and column

[ Edited ]

Thanks a lot for the quick reply @Rick_SAS. I understood the overall methodology you are mentioning, and I was able to create the matrices. Additional question: how do I manage the key and the number of matrices?

 

For example, here it was two rows, and hence two matrices (each nxn, here n=3). Let's say my data is now axb. So I'll have 'a' different matrices each with sqrt(b)xsqrt(b) dimension. How to efficiently create and store them?

 

 

proc iml;
use A(keep=pr_st:);
  read all var _ALL_ into m[colname=c];
  x1 = shape(m[1,], sqrt(ncol(m)));
  x2 = shape(m[2,], sqrt(ncol(m)));
close A;

I'm using this code now, but instead of X1 and X2 it would be X1, X2, ..., Xa matrices.

 

Thanks again!

 

 

 

Regular Contributor
Posts: 162

Re: Dynamically create matrices using a SAS dataset with row and column

I am not sure that you can use the SHAPE function if the data is like the example you give above, since the 9 variables are not in row major order.  For example the 2nd variable gives the value for the 1st row and 3rd column.

Contributor
Posts: 42

Re: Dynamically create matrices using a SAS dataset with row and column

Posted in reply to IanWakeling

@IanWakeling yes, I can't use shape since they are not in order, which I didn't notice before. Is there another way to  do this?

Highlighted
Solution
‎12-15-2017 11:49 AM
Regular Contributor
Posts: 162

Re: Dynamically create matrices using a SAS dataset with row and column

You will need to extract the row/col indices from the variables names.  Here's one possible way putting the final square matrix together using the FULL function:

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));
  do i = 1 to 2;
    p = full (  t( x[i,]) || rcidx );
    print p;
  end;
quit;

 

Storing information in variable names is never a good idea...

Contributor
Posts: 42

Re: Dynamically create matrices using a SAS dataset with row and column

Posted in reply to IanWakeling

 

Thanks a lot @IanWakeling. This works. Just two questions,

 

1. How do I get the number of rows of the data set instead of having 2 hard-coded in the line do i = 1 to 2; ?

and

2.  Instead of p = full ( t( x[i,]) || rcidx ); if I can use p[i] and call it later? not working now.

 

 

Regular Contributor
Posts: 162

Re: Dynamically create matrices using a SAS dataset with row and column

Just count the number of rows that have been read into the matrix x.

 

To store the matrices you could use indirect assignment with VALSET and VALUE.

 

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));

  n = nrow(x);
  pn = 'p1' : 'p'+strip(char(n));

  do i = 1 to n;
    call valset( pn[i], full (  t( x[i,]) || rcidx ) );
  end;
  q = value(pn[1]);
  print q;
quit;
Super User
Posts: 10,611

Re: Dynamically create matrices using a SAS dataset with row and column

After ordering the variable names, you could use SHAPE().

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;
proc sql noprint;
select name	into : vname separated by ' '
 from dictionary.columns
  where libname='WORK' and memname='HAVE' and lowcase(name) eqt 'pr_st'
   order by input(compress(scan(name,-2,'_'),,'kd'),best.),input(compress(scan(name,-1,'_'),,'kd'),best.);
quit;

proc iml;
use have nobs nobs;
read all var {&vname} into x;
close;
do i=1 to nobs;
temp=shape(x[i,],sqrt(ncol(x)));
print temp;
end;
quit;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 420 views
  • 7 likes
  • 4 in conversation