Hi,
I have a data such as below.I tried to create a matrix as attached using this data but I failed. I tried merge or other similar statements but I think I need something more than these statements. If anyone can help me, I'll appreciate. Thanks.
Cand1 Cand2 Prob
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
Hi,
Thanks for your response. Let me explain the logic/story of my data. Cand1 and cand2 are candidates and the last column is probability. for example the probability between candidate 87 and candidate 627 is .204E-05. In the raw data, candidate 5152 occurs three times and candidate 5522 occurs twice. Each candidates occurs once in the rows or columns and the probabilities related to these candidates are in the matrix in the excel file. That's why the size of raw data and matrix are different. I have updated the excel file, it is attached.
Many thanks,
Can you be more specific about the logic? In your Excel Workbook, you have a 46x46 matrix that you want to create. The Row/Column names seem to resemble the Cand1 and Cand2 variables combined, which you can create in a data step like this
data have;
input Cand1 Cand2 Prob;
datalines;
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
;
data help;
set have;
Cand=Cand1;output;
Cand=Cand2;output;
run;
However, if we use this data, we will get a 50x50 matrix, since you have 25 rows in your original data set. Are we to exclude certain observations?
Hi,
Thanks for your response. Let me explain the logic/story of my data. Cand1 and cand2 are candidates and the last column is probability. for example the probability between candidate 87 and candidate 627 is .204E-05. In the raw data, candidate 5152 occurs three times and candidate 5522 occurs twice. Each candidates occurs once in the rows or columns and the probabilities related to these candidates are in the matrix in the excel file. That's why the size of raw data and matrix are different. I have updated the excel file, it is attached.
Many thanks,
Ok. I think something like this will meet your needs.
data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
;
proc iml;
use have;
read all var _ALL_;
close have;
Cand=unique(Cand1//Cand2);
call sort(Cand);
X=j(ncol(Cand), ncol(Cand), .);
do i=1 to nrow(Prob);
X[loc(Cand=Cand1[i]),loc(Cand=Cand2[i])]=Prob[i];
end;
print X[colName=(char(Cand)) rowName=(char(Cand))];
quit;
Hi Draycut,
I tried the code you wrote. Here is the the error I got. Many thanks!
127 proc iml;
NOTE: IML Ready
128
129 use tr;
130 read all var _ALL_;
131 close tr;
132
133 Cand=unique(Cand_1//Cand_2);
134 call sort(Cand);
135
136 X=j(ncol(Cand), ncol(Cand), .);
137
138 do i=1 to nrow(Prob);
139 X[loc(Cand=Cand_1[i]),loc(Cand=Cand_2[i])]=RT_PROB[i];
140 end;
141
142 print X[colName=(char(Cand)) rowName=(char(Cand))];
ERROR: (execution) Character argument should be numeric.
operation : CHAR at line 142 column 43
operands : Cand
Cand 1 row 46 cols (character, size 😎
statement : PRINT at line 142 column 1
143
144 quit;
NOTE: Exiting IML.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
draycut and ksharp are both using numeric variables for the Cand_1 and Cand_2 variables. You are using character variables. Either change to numeric or get rid of the calls (such as CHAR) that assume that Cand_* are numeric.
Hi Dr. Wicklin,
Your correction helped for both codes.
Thanks
Do you need IML code ? or sql ,data step could be accepted?
data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;
levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;
Hello Ksharp,
Thanks for your response. The code did not work. It creates a matrix but the cells are empty and the columns and rows don't have candidates id numbers. I attached the output and the code I had.
Thanks
Do you need IML code ? or sql ,data step could be accepted?
data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;
levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;
data have;
input Cand1 Cand2 Prob :e7.;
datalines;
87 627 .204E-05
2291 5787 .281E-05
2397 5823 .133E-05
5011 5014 .395E-05
5096 5097 .389E-05
5122 5125 .336E-05
5152 5153 .300E-05
5152 5154 .377E-05
5152 5158 .241E-05
5175 5176 .151E-05
5214 5216 .257E-05
5353 5358 .415E-05
5372 5380 .731E-06
5402 5613 .188E-05
5505 5509 .356E-05
5522 5523 .945E-06
5522 5530 .178E-05
5533 5534 .122E-06
5667 5670 .950E-06
5714 5716 .248E-05
5775 5778 .147E-05
5803 5810 .110E-05
5871 5874 .987E-06
5874 5879 .471E-06
5932 5936 .406E-06
;
proc iml;
use have nobs nobs;
read all var{Cand1 Cand2 Prob};
close;
levels=unique(Cand1||Cand2);
want=j(ncol(levels),ncol(levels),.);
mattrib want r=(char(levels)) c=(char(levels)) label='' ;
row=char(Cand1);
col=char(Cand2);
do i=1 to nobs;
r=row[i];c=col[i];p=Prob[i];
want[r,c]=p;
end;
print want;
quit;
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.