Hi everyone,
I've got a tricky one here (at least for me).I'd like to get a matrix that tells me how many staff members schools share
For example, I start from:
Staff ID School ID
1 50
2 50
3 80
4 80
4 100
5 50
5 100
Note that staff 4 and 5 is in two different schools. So in summary:
School_Staff
50 - 1, 2, 5
80 - 3, 4
100 - 4, 5
And ideally, the matrix I'd like to get is this:
50 80 100
50 3 0 1
80 0 2 1
100 1 1 2
Maybe I'll need to use proc iml? But that confuses a bit
Thanks a lot for your help! 🙂 Let me know if you need more info.
Something like below could do the job.
data have;
infile datalines truncover;
input Staff_ID School_ID;
datalines;
1 50
2 50
3 80
4 80
4 100
5 50
5 100
;
proc sql;
create table comb as
select
l.School_ID as l_School_ID,
r.School_ID as r_School_ID
from have l inner join have r
on l.Staff_ID=r.Staff_ID
;
quit;
options missing='0';
proc tabulate data=comb noseps;
keylabel n=' ';
class l_school_id r_school_id;
table l_school_id=' ', r_school_id=' ';
run;
Something like below could do the job.
data have;
infile datalines truncover;
input Staff_ID School_ID;
datalines;
1 50
2 50
3 80
4 80
4 100
5 50
5 100
;
proc sql;
create table comb as
select
l.School_ID as l_School_ID,
r.School_ID as r_School_ID
from have l inner join have r
on l.Staff_ID=r.Staff_ID
;
quit;
options missing='0';
proc tabulate data=comb noseps;
keylabel n=' ';
class l_school_id r_school_id;
table l_school_id=' ', r_school_id=' ';
run;
Hi Patrick,
Thanks a lot for that, I really appreciate it. However, if you don't mind, would you be able to quickly explain what the code does? I'm having trouble understanding it.
Thanks again!
@Antoine44 wrote:
Hi Patrick,
Thanks a lot for that, I really appreciate it. However, if you don't mind, would you be able to quickly explain what the code does? I'm having trouble understanding it.
Thanks again!
You sort-of want to "double count" people. The Proc SQL creates all combinations per staff_id (=adding additional rows to the table). The Proc Tabulate then just does a count of rows per combination.
@Antoine44 wrote:
Ok I think I'm starting to see what you're doing. How does Proc Sql create those combinations? Do l and r correspond to left and right?
l and r are just table aliases and you're free to use any name (i.e. t1, t2). If I've got only two tables then I tend to use l and R (and as you guessed that stands for left and right) - but that's just me.
Here some links to SQL docu:
Selecting Data from More Than One Table By Using Joins
Showing Relationships within a Single Table Using Self-Joins
Hi Patrick,
Thanks again for helping me on this. I've been trying to play around with proc tabulate as there's something else I'd like to display but again, can't seem to figure it out.
I"m wondering if in that output I can also display, in another column or same cell maybe but separated with a comma, the gender of each staff member that schools have in common. So like next to each number I'll get for example: 3 - Male, Female, Female. Is that something proc tabulate can do? I'm guessing Gender will also need to be in the comb dataset, that I think I can do but can't get proc tabulate to show it.
Thanks again for your help, really appreciate it.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.