BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ss59
Obsidian | Level 7

Hi all,

 

My SAS dataset looks like following:

 

Year

countrymethodSegmentpr_st1-2pr_st1-3pr_st1-1pr_st2-1pr_st2-3pr_st2-2pr_st3-1pr_st3-2pr_st3-3
2017USAABCRetail0.20.60.20.30.60.10.10.30.6
2017USAXYZCorporate0.10.50.40.20.60.20.20.30.5

 

Now, the first four columns are the key, and from each row I need to create a matrix. For example, For '2017 USA ABC Retail' the matrix would be:

 

 state1state2state3
state10.20.20.6
state20.30.10.6
state30.10.30.6

 

They are probabilities and would sum to 1. And I would need to create a matrix for each row (with unique keys).

 

The problem is the number of states would be dynamic and can not be hard-coded. It should depend on the input file. For example, there could be 20 states and then I would have two matrix each with 20x20 dimension, the dimension of the matrix would be square root of number of columns (except the keys).

 

I relaise the problem has more than one dimension, so for now I'm just looking to be able to create one matrix dynamically from one row. Could you please help me do this using proc iml? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
IanWakeling
Barite | Level 11

You will need to extract the row/col indices from the variables names.  Here's one possible way putting the final square matrix together using the FULL function:

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));
  do i = 1 to 2;
    p = full (  t( x[i,]) || rcidx );
    print p;
  end;
quit;

 

Storing information in variable names is never a good idea...

View solution in original post

8 REPLIES 8
Rick_SAS
SAS Super FREQ

Sure.

1. Read the variables that have a common prefix. 'pr_st'  (OR you might be able to use READ ALL VAR _NUM_ INTO X;)

2. Read in all variables to a matrix X. Compute n = sqrt(nrow(X)), which should be an integer.

3. The first matrix is x1 = shape(X[1,], n); The second matrix is x2 = shape(X[2,], n);

 

ss59
Obsidian | Level 7

Thanks a lot for the quick reply @Rick_SAS. I understood the overall methodology you are mentioning, and I was able to create the matrices. Additional question: how do I manage the key and the number of matrices?

 

For example, here it was two rows, and hence two matrices (each nxn, here n=3). Let's say my data is now axb. So I'll have 'a' different matrices each with sqrt(b)xsqrt(b) dimension. How to efficiently create and store them?

 

 

proc iml;
use A(keep=pr_st:);
  read all var _ALL_ into m[colname=c];
  x1 = shape(m[1,], sqrt(ncol(m)));
  x2 = shape(m[2,], sqrt(ncol(m)));
close A;

I'm using this code now, but instead of X1 and X2 it would be X1, X2, ..., Xa matrices.

 

Thanks again!

 

 

 

IanWakeling
Barite | Level 11

I am not sure that you can use the SHAPE function if the data is like the example you give above, since the 9 variables are not in row major order.  For example the 2nd variable gives the value for the 1st row and 3rd column.

ss59
Obsidian | Level 7

@IanWakeling yes, I can't use shape since they are not in order, which I didn't notice before. Is there another way to  do this?

IanWakeling
Barite | Level 11

You will need to extract the row/col indices from the variables names.  Here's one possible way putting the final square matrix together using the FULL function:

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));
  do i = 1 to 2;
    p = full (  t( x[i,]) || rcidx );
    print p;
  end;
quit;

 

Storing information in variable names is never a good idea...

ss59
Obsidian | Level 7

 

Thanks a lot @IanWakeling. This works. Just two questions,

 

1. How do I get the number of rows of the data set instead of having 2 hard-coded in the line do i = 1 to 2; ?

and

2.  Instead of p = full ( t( x[i,]) || rcidx ); if I can use p[i] and call it later? not working now.

 

 

IanWakeling
Barite | Level 11

Just count the number of rows that have been read into the matrix x.

 

To store the matrices you could use indirect assignment with VALSET and VALUE.

 

proc iml;
  use have(keep = pr_st:);
  read all var _num_ into x [colname=vn];

  rcidx = t( num(scan(vn,3,'t_')) // num(scan(vn,4,'t_')));

  n = nrow(x);
  pn = 'p1' : 'p'+strip(char(n));

  do i = 1 to n;
    call valset( pn[i], full (  t( x[i,]) || rcidx ) );
  end;
  q = value(pn[1]);
  print q;
quit;
Ksharp
Super User

After ordering the variable names, you could use SHAPE().

 

data have;
input Year (country method Segment) ( : $12.)
      pr_st1_2 pr_st1_3 pr_st1_1 pr_st2_1 pr_st2_3 pr_st2_2 pr_st3_1 pr_st3_2 pr_st3_3;
datalines;
2017 USA ABC Retail    0.2 0.6 0.2 0.3 0.6 0.1 0.1 0.3 0.6
2017 USA XYZ Corporate 0.1 0.5 0.4 0.2 0.6 0.2 0.2 0.3 0.5
;
run;
proc sql noprint;
select name	into : vname separated by ' '
 from dictionary.columns
  where libname='WORK' and memname='HAVE' and lowcase(name) eqt 'pr_st'
   order by input(compress(scan(name,-2,'_'),,'kd'),best.),input(compress(scan(name,-1,'_'),,'kd'),best.);
quit;

proc iml;
use have nobs nobs;
read all var {&vname} into x;
close;
do i=1 to nobs;
temp=shape(x[i,],sqrt(ncol(x)));
print temp;
end;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 8 replies
  • 3760 views
  • 7 likes
  • 4 in conversation