Solved
New Contributor
Posts: 2

# How do I figure out the best way to pair up between students?

Hi SAS Experts,

Need some help! I'm trying to start with a given "student list", to create a "synergy list" that looks like below. I've tried many different ways, still no success yet.

/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID \$3 SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

/* Synergy List */
data synergy;
infile datalines dlm=',';
input STUDENT_ID \$3 STUDENT1-STUDENT4;
datalines;
001,3,2,1,2
002,2,3,2,1
003,1,2,3,1
004,2,1,1,3
;

The idea is this. With a list of students and the subjects they are interested in (value "1" equals interested), I would like to find out for each student which of their friends have the best synergy with them, ie. someone who share the most subjects they both like.

The plan is for each common interested subject between two students, I will assign a "synergy score" of 1. Eg. if student 001 and 002 both like Biology and English, they would be assigned a synergy score of "2" between them.

In another words, I would like to figure a way to get a "synergy list" from a "student list". The synergy list is a N*N table with the synergy score between each student pair combination. N is total number of students.

For example, from the Synergy list above, I can tell Student 001 has the highest synergy score with 002 and 003 (both has synergy score of "2" with 001), which is potentially good for pair up.

Any tips or help is appreciated!!

Accepted Solutions
Solution
yesterday
Posts: 5,526

## Re: How do I figure out the best way to pair up between students?

The "natural" data format in SAS is the long form. Move your data from wide to long, do the tally, then move back to wide form:

``````/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID \$3 SUBJECT1-SUBJECT5;
array s subject: ;
do subj = 1 to dim(s);
interest = s{subj};
output;
end;
drop subject: ;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

proc sql;
create table syn as
select
a.student_id as student_id,
b.student_id as stud2,
sum(a.interest and b.interest) as common_interest
from
student as a inner join
student as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;``````
PG

All Replies
Solution
yesterday
Posts: 5,526

## Re: How do I figure out the best way to pair up between students?

The "natural" data format in SAS is the long form. Move your data from wide to long, do the tally, then move back to wide form:

``````/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID \$3 SUBJECT1-SUBJECT5;
array s subject: ;
do subj = 1 to dim(s);
interest = s{subj};
output;
end;
drop subject: ;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

proc sql;
create table syn as
select
a.student_id as student_id,
b.student_id as stud2,
sum(a.interest and b.interest) as common_interest
from
student as a inner join
student as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;``````
PG
Regular Contributor
Posts: 159

## Re: How do I figure out the best way to pair up between students?

sounds like matching used in observational studies may be analogous: http://www2.sas.com/proceedings/sugi29/173-29.pdf

--------------
blog: papersandprograms.com
Super User
Posts: 23,700

## Re: How do I figure out the best way to pair up between students?

@PaulBrownPhD wrote:

sounds like matching used in observational studies may be analogous: http://www2.sas.com/proceedings/sugi29/173-29.pdf

Echoing @PaulBrownPhD it sounds like case control matching and if you’re using the latest version of SAS STAT there’s a PROC PSMATCH that can help with that.

PROC Star
Posts: 1,792

## Re: How do I figure out the best way to pair up between students?

``````data student;
infile datalines dlm=',';
input STUDENT_ID \$ SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

data w;
set student;
_Student_id=student_id;
do n=1 to nobs;
grp+1;
set student point=_n_;
output;
set student nobs=nobs point=n;
output;
end;
run;

data w1;
set w;
by grp;
array t(*) subject:;
do i=1 to dim(t);
if sum(t(i), lag(t(i)))=2 then count+1;
end;
if not last.grp then count=0;
if count ne 0;
keep _student_id grp count;
run;

proc transpose data=w1 out=want(drop=_name_)  prefix=Student_;
by _Student_id;
var count;
run;``````
PROC Star
Posts: 1,792

## Re: How do I figure out the best way to pair up between students?

@XabiAlonso Your first post(Welcome bienvenido) to the SAS forum and that too with an interesting question. I'd go with PG's answer  for real and have fun with mine. If you are experienced user, I'm sure you'd comprehend both the solutions.

I can of course eliminate data w1 step and merge that into one with the previous and make in work in 2 steps. Have a great weekend

Posts: 5,526

## Re: How do I figure out the best way to pair up between students?

Here is another version of my code which is more appropriate if your data is already in a wide format dataset.

``````/* Student List */
data student;
infile datalines dlm=',';
input STUDENT_ID \$3 SUBJECT1-SUBJECT5;
datalines;
001,1,0,0,1,1
002,0,0,1,1,1
003,0,1,1,0,1
004,1,1,0,1,0
;

/* Convert wide to long format */
proc transpose data=student out=studLong prefix=interest name=subj;
by student_id;
var subject: ;
run;

/* Count common interests */
proc sql;
create table syn as
select
a.student_id as student_id,
b.student_id as stud2,
sum(a.interest1 and b.interest1) as common_interest
from
studlong as a inner join
studlong as b on a.subj=b.subj
group by a.student_id, b.student_Id;
quit;

/* Convert long to wide format */
proc transpose data=syn out=synergy(drop=_name_) prefix=student;
by student_id;
id stud2;
var common_interest;
run;

proc print data=synergy; run;
``````
PG
New Contributor
Posts: 2

## Re: How do I figure out the best way to pair up between students?

Thanks so much everyone on the input. Especially to PG and Novinosrin, you took the time to write the code, in SQL and PROC. Now I have some solid examples to follow to solve similar problems in the future. So appreciative on this!!

☑ This topic is solved.