How do I figure out the best way to pair up between students?

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

How do I figure out the best way to pair up between students?

Hi SAS Experts,

 

Need some help! I'm trying to start with a given "student list", to create a "synergy list" that looks like below. I've tried many different ways, still no success yet.

 

/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID $3 SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

/* Synergy List */
data synergy;
infile datalines dlm=',';
input STUDENT_ID $3 STUDENT1-STUDENT4;
datalines;
001,3,2,1,2
002,2,3,2,1
003,1,2,3,1
004,2,1,1,3
;

 

The idea is this. With a list of students and the subjects they are interested in (value "1" equals interested), I would like to find out for each student which of their friends have the best synergy with them, ie. someone who share the most subjects they both like.

 

The plan is for each common interested subject between two students, I will assign a "synergy score" of 1. Eg. if student 001 and 002 both like Biology and English, they would be assigned a synergy score of "2" between them.

 

In another words, I would like to figure a way to get a "synergy list" from a "student list". The synergy list is a N*N table with the synergy score between each student pair combination. N is total number of students.

 

For example, from the Synergy list above, I can tell Student 001 has the highest synergy score with 002 and 003 (both has synergy score of "2" with 001), which is potentially good for pair up.

 

Any tips or help is appreciated!!


Accepted Solutions
Solution
yesterday
Esteemed Advisor
Posts: 5,526

Re: How do I figure out the best way to pair up between students?

Posted in reply to XabiAlonso

The "natural" data format in SAS is the long form. Move your data from wide to long, do the tally, then move back to wide form:

 

/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID $3 SUBJECT1-SUBJECT5;
array s subject: ;
do subj = 1 to dim(s);
    interest = s{subj};
    output;
    end;
drop subject: ;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

proc sql;
create table syn as
select
    a.student_id as student_id,    
    b.student_id as stud2,
    sum(a.interest and b.interest) as common_interest
from
    student as a inner join
    student as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;
PG

View solution in original post


All Replies
Solution
yesterday
Esteemed Advisor
Posts: 5,526

Re: How do I figure out the best way to pair up between students?

Posted in reply to XabiAlonso

The "natural" data format in SAS is the long form. Move your data from wide to long, do the tally, then move back to wide form:

 

/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID $3 SUBJECT1-SUBJECT5;
array s subject: ;
do subj = 1 to dim(s);
    interest = s{subj};
    output;
    end;
drop subject: ;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

proc sql;
create table syn as
select
    a.student_id as student_id,    
    b.student_id as stud2,
    sum(a.interest and b.interest) as common_interest
from
    student as a inner join
    student as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;
PG
Regular Contributor
Posts: 159

Re: How do I figure out the best way to pair up between students?

Posted in reply to XabiAlonso

sounds like matching used in observational studies may be analogous: http://www2.sas.com/proceedings/sugi29/173-29.pdf

--------------
blog: papersandprograms.com
Super User
Posts: 23,700

Re: How do I figure out the best way to pair up between students?

Posted in reply to PaulBrownPhD

@PaulBrownPhD wrote:

sounds like matching used in observational studies may be analogous: http://www2.sas.com/proceedings/sugi29/173-29.pdf



Echoing @PaulBrownPhD it sounds like case control matching and if you’re using the latest version of SAS STAT there’s a PROC PSMATCH that can help with that. 

 

 

PROC Star
Posts: 1,792

Re: How do I figure out the best way to pair up between students?

Posted in reply to XabiAlonso
data student;
infile datalines dlm=',';
input STUDENT_ID $ SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

data w;
set student;
_Student_id=student_id;
do n=1 to nobs;
grp+1;
set student point=_n_;
output;
set student nobs=nobs point=n;
output;
end;
run;

data w1;
set w;
by grp;
array t(*) subject:;
do i=1 to dim(t);
if sum(t(i), lag(t(i)))=2 then count+1;
end;
if not last.grp then count=0;
if count ne 0;
keep _student_id grp count;
run;

proc transpose data=w1 out=want(drop=_name_)  prefix=Student_;
by _Student_id;
var count;
run;
PROC Star
Posts: 1,792

Re: How do I figure out the best way to pair up between students?

Posted in reply to novinosrin

@XabiAlonso Your first post(Welcome bienvenido) to the SAS forum and that too with an interesting question. I'd go with PG's answer  for real and have fun with mine. If you are experienced user, I'm sure you'd comprehend both the solutions. 

I can of course eliminate data w1 step and merge that into one with the previous and make in work in 2 steps. Have a great weekend

Esteemed Advisor
Posts: 5,526

Re: How do I figure out the best way to pair up between students?

Posted in reply to XabiAlonso

Here is another version of my code which is more appropriate if your data is already in a wide format dataset.

 

/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID $3 SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

/* Convert wide to long format */
proc transpose data=student out=studLong prefix=interest name=subj;
by student_id;
var subject: ;
run;

/* Count common interests */
proc sql;
create table syn as
select
    a.student_id as student_id,    
    b.student_id as stud2,
    sum(a.interest1 and b.interest1) as common_interest
from
    studlong as a inner join
    studlong as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

/* Convert long to wide format */
proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;
PG
New Contributor
Posts: 2

Re: How do I figure out the best way to pair up between students?

Thanks so much everyone on the input. Especially to PG and Novinosrin, you took the time to write the code, in SQL and PROC. Now I have some solid examples to follow to solve similar problems in the future. So appreciative on this!!

Smiley Very HappySmiley Very Happy

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 166 views
  • 2 likes
  • 5 in conversation