Statistical programming, matrix languages, and more

Convert dataset elements into matrix

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Convert dataset elements into matrix

Dear all,

 

I have a dataset that looks like this:

 

OrderID    JourneyPosition     Channel

1234         1                             D

1234         2                             T

1235         1                             B

 

I have sucessfully concatenated the channels into a list item, ordered by OrderID using

data abc.concat;
length Channels $ 200 /* need to be the max */;
set abc.cj2;
by OrderID;

retain Channels;

if first.OrderID then Channels = ChannelShort;
else
Channels=trim(Channels) || ' ' || trim(ChannelShort);


if last.OrderID then output;
run;

How can I achieve the same result as a matrix?

It should look like

 

OrderID   C1    C2

1234        D     T

1235        B

 

Thank you very much!

 


Accepted Solutions
Solution
‎03-11-2016 04:25 PM
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

It looks like the program editor corrupted my previous program. Sorry about that.

 

I assume that you have or can create a variable that enumerates the observations within each level of the OrderID variable.  Then you want to do something like this example. I've indicated where I think your variable names would go:

.

/* Create some sorted data similar to OP's */
proc sort data=sashelp.class out=class;
by sex;
run;
/* add variable that enumerates obs within each group */
data class;
set class;
by sex;
if first.sex then Position = 0;
Position + 1;
run;

/* begin SAS/IML solution */
proc iml;
use class; 
   read all var {sex} into C;    /* =OrderID */
   read all var {name} into x;   /* =ChannelShort? */ 
   read all var {position};      /* =JourneyPosition? */
close;

/* unique-loc technique: 
   http://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html */ 
u = unique(C);  
s = j(ncol(u), 1+max(position), blankstr(32)); 
do i = 1 to ncol(u);   
   idx = loc(C=u[i]);  
   s[i, 1] = u[i];
   s[i, 2:ncol(idx)+1] = rowvec(x[idx]);
end;

print s;

View solution in original post


All Replies
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

[ Edited ]

1. Your matrix doesn't reflect the DATA step code. In the DATA You are concatenating strings together, so the matrix would have two columns, not three. The columns would be OrderID and Channel, where Channel is the concatenation of all values of the ShortChannel variable.

2. If the data are in unsorted order, I'd use the UNIQUE-LOC technique to loop over the unique values of the OrderID variable. Use the ROWCAT function to concatentate the values into a single string.

3. If the data  are sorted, it would be more efficient to use the UNIQUEBY subroutine.

 

Here's an example of using the UNIQUE-LOC trick (item 2) to concatenate the NAME variable for each value of the SEX variable in the Sashelp.Class data.

 


proc iml;
use sashelp.class;
read all var {sex

 

Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

[ Edited ]

Sorry for being unclear. I want a matrix that contains n columns, n being the max count of elements in ChannelShort. 

I want a new row for each unique OrderID.

 

And you are right, the data step does something different. 

Solution
‎03-11-2016 04:25 PM
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

It looks like the program editor corrupted my previous program. Sorry about that.

 

I assume that you have or can create a variable that enumerates the observations within each level of the OrderID variable.  Then you want to do something like this example. I've indicated where I think your variable names would go:

.

/* Create some sorted data similar to OP's */
proc sort data=sashelp.class out=class;
by sex;
run;
/* add variable that enumerates obs within each group */
data class;
set class;
by sex;
if first.sex then Position = 0;
Position + 1;
run;

/* begin SAS/IML solution */
proc iml;
use class; 
   read all var {sex} into C;    /* =OrderID */
   read all var {name} into x;   /* =ChannelShort? */ 
   read all var {position};      /* =JourneyPosition? */
close;

/* unique-loc technique: 
   http://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html */ 
u = unique(C);  
s = j(ncol(u), 1+max(position), blankstr(32)); 
do i = 1 to ncol(u);   
   idx = loc(C=u[i]);  
   s[i, 1] = u[i];
   s[i, 2:ncol(idx)+1] = rowvec(x[idx]);
end;

print s;
Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

[ Edited ]

That looks perfect! 

 

Though, I am not sure why I am getting this error:

 

 

 NOTE: Closing WORK.CLASS
 62         
 63         /* unique-loc technique:
 64            http://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html */
 65         u = unique(C);
 66         s = j(ncol(u), 1+max(Position), blankstr(32));
 ERROR: (execution) Matrix has not been set to a value.

 

Printing C and x looks like we're getting there.

 

SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

Please post your SAS version and your complete log beginning with the "proc iml" statement

You can obtain the SAS version in the SAS log by submitting:


%put &sysvlong;

Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 55         
 56         libname ABC '/folders/myfolders/ABC/';
 NOTE: Libref ABC was successfully assigned as follows: 
       Engine:        V9 
       Physical Name: /folders/myfolders/ABC
 57         /* add variable that enumerates obs within each group */
 58         data class;
 59         set ABC.cj2;
 60         by OrderID;
 61         if first.OrderID then Position = 0;
 62         Position + 1;
 63         drop JourneyPosition;
 64         run;
 
 NOTE: There were 192469 observations read from the data set ABC.CJ2.
 NOTE: The data set WORK.CLASS has 192469 observations and 3 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.03 seconds
       cpu time            0.02 seconds
       
 
 65         
 66         /* begin SAS/IML solution */
 67         proc iml;
 NOTE: IML Ready
 68         use class;
 69            read all var {OrderID} into C;
 69       !                                      /* =OrderID */
 70            read all var {ChannelShort} into x;
 70       !                                          /* =ChannelShort? */
 71         close;
 NOTE: Closing WORK.CLASS
 72         
 73         /* unique-loc technique:
 74            http://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html */
 75         s = j(ncol(u), 1+max(Position), blankstr(32));
 ERROR: (execution) Matrix has not been set to a value.
 
  operation : MAX at line 75 column 21
  operands  : Position
 
 Position      0 row       0 col     (numeric)
 
 
  statement : ASSIGN at line 75 column 1
 76         do i = 1 to ncol(u);
 77            idx = loc(C=u[i]);
 78            s[i, 1] = u[i];
 79            s[i, 2:ncol(idx)+1] = rowvec(x[idx]);
 80         end;
 81         
 82         
 83         print s;
 ERROR: Matrix s has not been set to a value.
 
  statement : PRINT at line 83 column 1
 84         
 85         %put &sysvlong;
 9.04.01M3P062415
 86         
 87         
 88         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 100   
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

You are not reading in the Position variable in the Use/Read/Close statement.

Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

[ Edited ]

Oh I'm sorry. I tried removing lines of code to find the error. 

 

This is the error I got in the first place:

 

 1 
 65         
 66         /* begin SAS/IML solution */
 67         proc iml;
 NOTE: IML Ready
 68         use class;
 69            read all var {OrderID} into C;
 69       !                                      /* =OrderID */
 70            read all var {ChannelShort} into x;
 70       !                                          /* =ChannelShort? */
 71            read all var {position};
 72         close;
 NOTE: Closing WORK.CLASS
 73         
 74         /* unique-loc technique:
 75            http://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html */
 76         s = j(ncol(u), 1+max(position), blankstr(32));
 NOTE: Module BLANKSTR loaded from the storage SASHELP.IMLMLIB.
 ERROR: (execution) Invalid operand to operation.
 
  operation : J at line 76 column 6
  operands  : _TEM1001, _TEM1003, _TEM1004
 
 _TEM1001      1 row       1 col     (numeric)
 
          0
 
 _TEM1003      1 row       1 col     (numeric)
 
        767
 
 _TEM1004      1 row       1 col     (character, size 32)
 
                                  
 
  statement : START at offset   1 column   1
 77         do i = 1 to ncol(u);
 78            idx = loc(C=u[i]);
 79            s[i, 1] = u[i];
 80            s[i, 2:ncol(idx)+1] = rowvec(x[idx]);
 81         end;
 82         
 83         
 84         print s;
 ERROR: Matrix s has not been set to a value.
 
  statement : PRINT at line 84 column 1
 85         
 86         %put &sysvlong;
 9.04.01M3P062415
 87         
 88         
 89         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 101 
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

Please think about what you are doing and make an effort to check your program against the one that I gave you. You deleted the line

u = unique(C);

 

I'm sure that you can solve your problem from here. Good luck.

Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

Sorry about that. Should get some sleep. 

 

Thank you so much! You are my hero! I spent 2 days searching for solutions to my problem, but it seems like I chose the wrong keywords. Should have found your posts earlier.

 

Thanks again!

Grand Advisor
Posts: 9,593

Re: Convert dataset elements into matrix

You can use START-END INDEX skill I proposed to handle such group operation problem, and could use it to replace the UNIIQYE-LOC skill ,which is a little more complicated and not friendly  .

 

data have;
input OrderID    JourneyPosition     Channel $;
cards;
1234         1                             D
1234         2                             T
1235         1                             B
;
run;
proc iml;
use have nobs nobs;
read all var{OrderID  Channel};
close;

staet_end=t(loc(OrderID^=t({.}||remove(OrderID,nobs))))||
          t(loc(OrderID^=t(remove(OrderID,1)||{.}))) ;
id=OrderID[staet_end[,1]];
n=nrow(staet_end);
x=j(n,max((staet_end[,2]-staet_end[,1]))+1,blankstr(nleng(Channel)));
do i=1 to n;
 x[i,1:staet_end[i,2]-staet_end[i,1]+1]=t(Channel[staet_end[i,1]:staet_end[i,2]]);
end;

print id x ;
quit;
SAS Super FREQ
Posts: 3,416

Re: Convert dataset elements into matrix

KSharp's indexing trick is similar to the UNIQUEBY technique that I linked to earlier. If the data are already sorted, you can skip the UNIQUEBY call and go directly to the processing. See also the CUSUM-LAG trick, which gets the indices of each BY group for sorted data. 

Grand Advisor
Posts: 9,593

Re: Convert dataset elements into matrix

Rick,

Yeah. But UNIQUEBY() need firstly sort data and can get the first value index for a group  after data is ordered. But if data is not ordered? like:

data x;

 set x;

 by id NOTSORTED ;

Occasional Contributor
Posts: 17

Re: Convert dataset elements into matrix

Thank you both so much! 

I implemented Ksharp's solution, as I have the impression it runs faster.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 674 views
  • 1 like
  • 3 in conversation