Hi,
I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)
Person_ID | Center1 | Center2 | Center3 |
abc | 1 | 0 | 1 |
efg | 0 | 1 | 1 |
xyz | 1 | 1 | 1 |
What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1.
Person_ID | C1C2 | C1C3 | C2C3 |
abc | 0 | 1 | 0 |
efg | 0 | 0 | 1 |
xyz | 1 | 1 | 1 |
The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.
Center1 | Center2 | Center3 | |
Center1 | NA | 1 | 2 |
Center2 | NA | NA | 2 |
Center3 | NA | NA | NA |
Thank you so much for your help in advance!
Try the approach I have outlined here:
https://gist.github.com/statgeek/a5184a4e1678d81e2643#file-sas_distance_matrix-sas
I think it's exactly your question.
Note that this handles 2X2 comparisons, 3x3 comparisons are much more difficult.
@michellel wrote:
Hi,
I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)
Person_ID Center1 Center2 Center3 abc 1 0 1 efg 0 1 1 xyz 1 1 1
What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1.
Person_ID C1C2 C1C3 C2C3 abc 0 1 0 efg 0 0 1 xyz 1 1 1
The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.
Center1 Center2 Center3 Center1 NA 1 2 Center2 NA NA 2 Center3 NA NA NA
Thank you so much for your help in advance!
Are you prepared to analyze 5,000 additional variables? That's the number of pairs you will get (approximately) with 100 centers.
It seems too many variables. Maybe I should list 5000 results as observations instead of variables.
Try the approach I have outlined here:
https://gist.github.com/statgeek/a5184a4e1678d81e2643#file-sas_distance_matrix-sas
I think it's exactly your question.
Note that this handles 2X2 comparisons, 3x3 comparisons are much more difficult.
@michellel wrote:
Hi,
I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)
Person_ID Center1 Center2 Center3 abc 1 0 1 efg 0 1 1 xyz 1 1 1
What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1.
Person_ID C1C2 C1C3 C2C3 abc 0 1 0 efg 0 0 1 xyz 1 1 1
The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.
Center1 Center2 Center3 Center1 NA 1 2 Center2 NA NA 2 Center3 NA NA NA
Thank you so much for your help in advance!
How do you want to handle diagonals? Empty or number of people who visited Center1 in total?
data test;
input id $ center1-center3;
cards;
abc 1 0 1
efg 0 1 1
xyz 1 1 1
;
ods output sscp=coocs;
proc corr data=test sscp;
var center1-center3;
run;
proc print data=coocs;
format _numeric_ 8.;
run;
Results:
SAS Output
Obs | Variable | center1 | center2 | center3 |
---|---|---|---|---|
1 | center1 | 2 | 1 | 2 |
2 | center2 | 1 | 2 | 2 |
3 | center3 | 2 | 2 | 3 |
Thank you so much for your answer above! It is really helpful.
For the diagonals, Empty would be better, as I am not interested in the number of people who visited one center in total.
Here is a way to get your pair variables at compile time for array statement. I am sure if you are an intermediate level SAS user you can then write your if then look up statements to populate the pair variables accordingly.
data have;
input Person_ID $ Center1 Center2 Center3;
datalines;
abc 1 0 1
efg 0 1 1
xyz 1 1 1
;
%macro t(n);
%do i=1 %to (&n-1);
%do j=&i+1 %to &n;
%if &j ne &i %then center&i&j;
%end;
%end;
%mend t;
%put %t(3);
data want;
set have;
array t(*) %t(3);
/*---Write your look up statements here---*/
run;
I'll try to work on your complete requirement if I find time
Thank you!
a good array exercise
data have;
input Person_ID $ Center1 Center2 Center3;
datalines;
abc 1 0 1
efg 0 1 1
xyz 1 1 1
;
%macro t;
%global n;
%let n=3 ;/*number of centers parameter*/
%do i=1 %to (&n-1);
%do j=&i+1 %to &n;
center&i.center&j
%end;
%end;
%mend t;
%put %t ;
data want;
set have;
array k(*) %t;
array t(*) center1-center3;
_c=0;
do _i=1 to (&n-1);
do _j=_i+1 to &n;
_c=_c+1;
k(_c)=t(_i)+t(_j);
if k(_c)>1 then k(_c)=1;else k(_c)=0;
end;
end;
drop _:;
run;
in a datastep:
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.