Solved: creating a matrix

dustychair · Posted 08-22-2018 12:26 AM

Hi,

I have a data such as below.I tried to create a matrix as attached using this data but I failed. I tried merge or other similar statements but I think I need something more than these statements. If anyone can help me, I'll appreciate. Thanks.

Cand1 Cand2 Prob

87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06

5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05

5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06

dustychair · Posted 08-22-2018 01:36 AM

Hi,

Thanks for your response. Let me explain the logic/story of my data. Cand1 and cand2 are candidates and the last column is probability. for example the probability between candidate 87 and candidate 627 is .204E-05. In the raw data, candidate 5152 occurs three times and candidate 5522 occurs twice. Each candidates occurs once in the rows or columns and the probabilities related to these candidates are in the matrix in the excel file. That's why the size of raw data and matrix are different. I have updated the excel file, it is attached.

Many thanks,

View solution in original post

PeterClemmensen · Posted 08-22-2018 01:09 AM

Can you be more specific about the logic? In your Excel Workbook, you have a 46x46 matrix that you want to create. The Row/Column names seem to resemble the Cand1 and Cand2 variables combined, which you can create in a data step like this

data have;
input Cand1 Cand2 Prob;
datalines;
87 627 .204E-05 
2291 5787 .281E-05 
2397 5823 .133E-05 
5011 5014 .395E-05 
5096 5097 .389E-05 
5122 5125 .336E-05 
5152 5153 .300E-05 
5152 5154 .377E-05 
5152 5158 .241E-05 
5175 5176 .151E-05 
5214 5216 .257E-05
5353 5358 .415E-05 
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06 
5522 5530 .178E-05 
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05 
5775 5778 .147E-05 
5803 5810 .110E-05 
5871 5874 .987E-06 
5874 5879 .471E-06 
5932 5936 .406E-06
;

data help;
   set have;
   Cand=Cand1;output;
   Cand=Cand2;output;
run;

However, if we use this data, we will get a 50x50 matrix, since you have 25 rows in your original data set. Are we to exclude certain observations?

The DATA to DATA Step Macro
Blog: SASnrd

dustychair · Posted 08-22-2018 01:36 AM

Hi,

Thanks for your response. Let me explain the logic/story of my data. Cand1 and cand2 are candidates and the last column is probability. for example the probability between candidate 87 and candidate 627 is .204E-05. In the raw data, candidate 5152 occurs three times and candidate 5522 occurs twice. Each candidates occurs once in the rows or columns and the probabilities related to these candidates are in the matrix in the excel file. That's why the size of raw data and matrix are different. I have updated the excel file, it is attached.

Many thanks,

PeterClemmensen · Posted 08-22-2018 04:03 AM

Ok. I think something like this will meet your needs.

data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05 
2291 5787 .281E-05 
2397 5823 .133E-05 
5011 5014 .395E-05 
5096 5097 .389E-05 
5122 5125 .336E-05 
5152 5153 .300E-05 
5152 5154 .377E-05 
5152 5158 .241E-05 
5175 5176 .151E-05 
5214 5216 .257E-05
5353 5358 .415E-05 
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06 
5522 5530 .178E-05 
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05 
5775 5778 .147E-05 
5803 5810 .110E-05 
5871 5874 .987E-06 
5874 5879 .471E-06 
5932 5936 .406E-06
;

proc iml;

use have;
   read all var _ALL_; 
close have;

Cand=unique(Cand1//Cand2);
call sort(Cand);

X=j(ncol(Cand), ncol(Cand), .);

do i=1 to nrow(Prob);
   X[loc(Cand=Cand1[i]),loc(Cand=Cand2[i])]=Prob[i];
end;

print X[colName=(char(Cand)) rowName=(char(Cand))];

quit;

The DATA to DATA Step Macro
Blog: SASnrd

dustychair · Posted 08-22-2018 06:29 PM

Hi Draycut,

I tried the code you wrote. Here is the the error I got. Many thanks!

127 proc iml;
NOTE: IML Ready
128
129 use tr;
130 read all var _ALL_;
131 close tr;
132
133 Cand=unique(Cand_1//Cand_2);
134 call sort(Cand);
135
136 X=j(ncol(Cand), ncol(Cand), .);
137
138 do i=1 to nrow(Prob);
139 X[loc(Cand=Cand_1[i]),loc(Cand=Cand_2[i])]=RT_PROB[i];
140 end;
141
142 print X[colName=(char(Cand)) rowName=(char(Cand))];
ERROR: (execution) Character argument should be numeric.

operation : CHAR at line 142 column 43
operands : Cand
Cand 1 row 46 cols (character, size 😎

statement : PRINT at line 142 column 1
143
144 quit;
NOTE: Exiting IML.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds

Rick_SAS · Posted 08-22-2018 06:51 PM

draycut and ksharp are both using numeric variables for the Cand_1 and Cand_2 variables. You are using character variables. Either change to numeric or get rid of the calls (such as CHAR) that assume that Cand_* are numeric.

dustychair · Posted 08-22-2018 07:04 PM

Hi Dr. Wicklin,

Your correction helped for both codes.

Thanks

Ksharp · Posted 08-22-2018 10:38 AM

Do you need IML code ? or sql ,data step could be accepted?

data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05 
2291 5787 .281E-05 
2397 5823 .133E-05 
5011 5014 .395E-05 
5096 5097 .389E-05 
5122 5125 .336E-05 
5152 5153 .300E-05 
5152 5154 .377E-05 
5152 5158 .241E-05 
5175 5176 .151E-05 
5214 5216 .257E-05
5353 5358 .415E-05 
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06 
5522 5530 .178E-05 
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05 
5775 5778 .147E-05 
5803 5810 .110E-05 
5871 5874 .987E-06 
5874 5879 .471E-06 
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;

levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;

dustychair · Posted 08-22-2018 06:25 PM

Hello Ksharp,

Thanks for your response. The code did not work. It creates a matrix but the cells are empty and the columns and rows don't have candidates id numbers. I attached the output and the code I had.

Thanks

Ksharp · Posted 08-22-2018 10:40 AM

Do you need IML code ? or sql ,data step could be accepted?

data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05 
2291 5787 .281E-05 
2397 5823 .133E-05 
5011 5014 .395E-05 
5096 5097 .389E-05 
5122 5125 .336E-05 
5152 5153 .300E-05 
5152 5154 .377E-05 
5152 5158 .241E-05 
5175 5176 .151E-05 
5214 5216 .257E-05
5353 5358 .415E-05 
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06 
5522 5530 .178E-05 
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05 
5775 5778 .147E-05 
5803 5810 .110E-05 
5871 5874 .987E-06 
5874 5879 .471E-06 
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;

levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;

Ksharp · Posted 08-22-2018 10:42 AM

data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05 
2291 5787 .281E-05 
2397 5823 .133E-05 
5011 5014 .395E-05 
5096 5097 .389E-05 
5122 5125 .336E-05 
5152 5153 .300E-05 
5152 5154 .377E-05 
5152 5158 .241E-05 
5175 5176 .151E-05 
5214 5216 .257E-05
5353 5358 .415E-05 
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06 
5522 5530 .178E-05 
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05 
5775 5778 .147E-05 
5803 5810 .110E-05 
5871 5874 .987E-06 
5874 5879 .471E-06 
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;

levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;

creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Re: creating a matrix

Catch up on SAS Innovate 2026