BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michellel
Calcite | Level 5

Hi,

 

I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)

 

Person_ID Center1 Center2 Center3
abc 1 0 1
efg 0 1 1
xyz 1 1 1

 

What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1. 

Person_ID C1C2 C1C3 C2C3
abc 0 1 0
efg 0 0 1
xyz 1 1 1

 

The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.

  Center1 Center2 Center3
Center1 NA 1 2
Center2 NA NA 2
Center3 NA NA NA

 

Thank you so much for your help in advance!

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Try the approach I have outlined here:

https://gist.github.com/statgeek/a5184a4e1678d81e2643#file-sas_distance_matrix-sas

 

I think it's exactly your question.

 

Note that this handles 2X2 comparisons, 3x3 comparisons are much more difficult.

 


@michellel wrote:

Hi,

 

I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)

 

Person_ID Center1 Center2 Center3
abc 1 0 1
efg 0 1 1
xyz 1 1 1

 

What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1. 

Person_ID C1C2 C1C3 C2C3
abc 0 1 0
efg 0 0 1
xyz 1 1 1

 

The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.

  Center1 Center2 Center3
Center1 NA 1 2
Center2 NA NA 2
Center3 NA NA NA

 

Thank you so much for your help in advance!


 

 

 

View solution in original post

8 REPLIES 8
Astounding
PROC Star

Are you prepared to analyze 5,000 additional variables?  That's the number of pairs you will get (approximately) with 100 centers.

michellel
Calcite | Level 5

It seems too many variables. Maybe I should list 5000 results as observations instead of variables.

 

Reeza
Super User

Try the approach I have outlined here:

https://gist.github.com/statgeek/a5184a4e1678d81e2643#file-sas_distance_matrix-sas

 

I think it's exactly your question.

 

Note that this handles 2X2 comparisons, 3x3 comparisons are much more difficult.

 


@michellel wrote:

Hi,

 

I have a question about creating flag variables to do pair check. To make it simple, I listed a monk table below. It shows whether people went to each of centers, for each person. The center variables are flag variables showing if the person went to that center. (In reality, I have about 100 centers to check.)

 

Person_ID Center1 Center2 Center3
abc 1 0 1
efg 0 1 1
xyz 1 1 1

 

What kind of table I want is listed below. I want to create flag variables showing which pair of centers people went to. It does not matter about which center people go first. For person_ID xyz, this person went to three of the centers, so I want all of the pair-flag variables C1C2, C1C3, C2C3 be 1. 

Person_ID C1C2 C1C3 C2C3
abc 0 1 0
efg 0 0 1
xyz 1 1 1

 

The second table I want is listed below. I want to show how many people went to each pair of centers. From the first step table, we can see only 1 person went to Center1 & Center2, 2 people went to center1 & center3, 2 people went to center2 & center3.

  Center1 Center2 Center3
Center1 NA 1 2
Center2 NA NA 2
Center3 NA NA NA

 

Thank you so much for your help in advance!


 

 

 

Reeza
Super User

How do you want to handle diagonals? Empty or number of people who visited Center1 in total?

 

data test;
input id $ center1-center3;
cards;
abc    1    0    1
efg    0    1    1
xyz    1    1    1
;
ods output sscp=coocs;
proc corr data=test sscp;
var center1-center3;
run;
proc print data=coocs;
format _numeric_ 8.;
run;

Results:

 

SAS Output

Obs Variable center1 center2 center3
1 center1 2 1 2
2 center2 1 2 2
3 center3 2 2 3
michellel
Calcite | Level 5

Thank you so much for your answer above! It is really helpful.

 

For the diagonals, Empty would be better, as I am not interested in the number of people who visited one center in total.

novinosrin
Tourmaline | Level 20

Here is a way to get your pair variables at compile time for array statement. I am sure if you are an intermediate level SAS user you can then write your if then look up statements to populate the pair variables accordingly.

data have;
input Person_ID $	Center1	Center2	Center3;
datalines;
abc	1	0	1
efg	0	1	1
xyz	1	1	1
;




%macro t(n);
%do i=1 %to (&n-1);
%do j=&i+1 %to &n;
%if &j ne &i %then center&i&j;
%end;
%end;
%mend t;

%put %t(3);

data want;
set have;
array t(*) %t(3);
/*---Write your look up statements here---*/
run;

I'll try to work on your complete requirement if I find time

 

 

novinosrin
Tourmaline | Level 20

a good array exercise

data have;
input Person_ID $	Center1	Center2	Center3;
datalines;
abc	1	0	1
efg	0	1	1
xyz	1	1	1
;

%macro t;
%global n;
%let n=3 ;/*number of centers parameter*/
%do i=1 %to (&n-1);
	%do j=&i+1 %to &n;
	 center&i.center&j
	%end;
%end;
%mend t;

%put %t ;

data want;
set have;
array k(*) %t;
array t(*) center1-center3;
_c=0;
do _i=1 to (&n-1);
do _j=_i+1 to &n;
_c=_c+1;
k(_c)=t(_i)+t(_j);
if k(_c)>1 then k(_c)=1;else k(_c)=0;
end;
end;
drop _:;
run;

in a datastep:

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1397 views
  • 1 like
  • 4 in conversation