Coffee anyone?

Contributor
Posts: 49
I tried this in C# without much success, so I am now trying SAS, working in an Enterprise Guide (EG) session. The participants are the students in SASHELP.CLASS. They want to get to know each other, so each month they are randomly paired up for a Coffee Date.

Rules: A random Coffee Date list is generated monthly. I store each month's pairings in a historical dataset, which I append to every month.

A person cannot have coffee with the same person twice within a 6-month period, so the historical dataset holds 3 variables: LastDate, InviterID, InvitedID.

We check each pairing against the historical list, loading only the most recent 6 months of data into a temporary dataset for the check.

If no recent match is found, the new pair is added to a new Paired dataset and the 2 names (rows) are removed from the Participants dataset. This repeats until Participants has fewer than 2 rows, since a single person cannot be paired.

Unfortunately we have 19 people in this list, so one person will be left out until we can add a new participant. Is anyone interested in joining our coffee club? :-)

So I start by deriving an ID (the row number, _N_) from the dataset, keeping only ID and Name:

Data Participants(Keep=ID Name);
FORMAT ID 8.;
set SASHelp.class;
ID=_n_;
run;

These 19 People will be my Participants in the Coffee Club.

I more or less follow this line of thought:

data _null_;
randvar = ceil(rand('UNIFORM') * 100000);
call symput('RANDSEED', randvar);
run;

data CR.names2(keep=MEMID randid);
set CR.Participants;
randid = rand('UNIFORM');
run;

proc sort data=CR.names2 ; by randid; run;

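data CR.pairs(keep=pairgrp MEMID);
 set CR.names2 nobs=num_peeps;
 /* With an odd head count, leave the last person out so no group gets a third member */
 if _n_ > 2*floor(num_peeps/2) then delete;
 pairgrp+1;
 if pairgrp > floor(num_peeps/2) then pairgrp=1;
run;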

proc sort data=CR.pairs; by pairgrp;run;

proc transpose data=CR.pairs 
                    out=CR.pairs2  (drop=_NAME_);
    var memid;
    by pairgrp;

run;

Data CR.Pairs3;
set CR.pairs2;
rename COL1=InviterID COL2=InvitedID;
run;

But I get stuck :-( I need help with the rest please...

Has anyone else done this type of random pairing successfully before? I am grasping at straws here...

I think at this point I still need to check if the selected pair exists in the Historical dataset, if not, I can add them into a currentselectedpair dataset, and remove them from the list of Participants.

The remaining people in Participants are then processed and paired again, checked again, until the list is empty or in our case less than 2 items.
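
The check described above could be sketched as a single join, under some assumptions: the history lives in a dataset called WORK.HISTLIST (a hypothetical name) with the three variables LastDate, InviterID and InvitedID, and CR.Pairs3 is the table built by the steps above. A pair counts as a clash whichever of the two people was the inviter.

```sas
/* Sketch only: WORK.HISTLIST is an assumed name for the historical dataset */
proc sql;
   create table work.clashes as
   select p.InviterID, p.InvitedID, h.LastDate
   from CR.Pairs3 as p, work.histlist as h
   where h.LastDate >= intnx('month', today(), -6, 'same')
     and (   (p.InviterID = h.InviterID and p.InvitedID = h.InvitedID)
          or (p.InviterID = h.InvitedID and p.InvitedID = h.InviterID));
quit;

/* SQLOBS holds the number of rows written by the CREATE TABLE above */
%put NOTE: &sqlobs clashing pair(s) found in the last 6 months.;
```

If WORK.CLASHES comes back empty, the month's pairing is usable as it stands. Otherwise the clashing pairs have to be redrawn.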

Any help much appreciated. Len


Accepted Solutions
Solution
‎07-11-2016 09:41 AM
Super Contributor
Posts: 298

Re: Coffee anyone?

The problem reduces to selecting 9 random pairs
out of 19 numbers. This must be repeated for 6 months
in such a way that no pair appears more than once. The
names of the persons are not needed to find the combinations.

There is no need to save the monthly selections to a data
set and consult them in subsequent months if they are
kept in memory.

Arrays greatly simplify the selection process.

ARRAY M[ ] flags the individuals already selected, so that
nobody is picked twice within a month.

ARRAY NAMES[ ] holds the names of the individuals. They are
used only at the output stage, not in finding the
combinations.

ARRAY K[6, 19] stores each individual's partner in each
month (1-6). It is used to check a candidate pair against
the selections of previous months.
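
As a small illustration of how such a K[ ] lookup behaves, here is a minimal sketch with made-up values (persons 2 and 8 paired in month 1). Because the partner is stored in both directions, looking up either person finds the earlier pairing, so the two-sided test in the full program is a belt-and-braces check.

```sas
data _null_;
   array k[6, 19] _temporary_;
   /* Month 1: persons 2 and 8 had coffee, so store the partner both ways */
   k[1, 2] = 8;
   k[1, 8] = 2;
   /* Month 2 candidate: the same two people, drawn in the opposite order */
   mon = 2; r1 = 8; r2 = 2;
   clash = 0;
   do v = 1 to mon - 1;
      if k[v, r1] = r2 then clash = 1;
   end;
   put clash=;   /* writes clash=1 to the log */
run;
```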

Finally, the output is sorted by month, first person, second person.
The sorted output shows both the individual numbers and
the corresponding names by month.

Here is the output:

Obs    mon    r1    r2    name1      name2

  1     1      2     8    Alice      Janet
  2     1      3    14    Barbara    Mary
  3     1      6    11    James      Joyce
  4     1      7     4    Jane       Carol
  5     1     10    19    John       William
  6     1     12     1    Judy       Alfred
  7     1     15    17    Philip     Ronald
  8     1     16     9    Robert     Jeffrey
  9     1     18     5    Thomas     Henry
 10     2      2    11    Alice      Joyce
 11     2      4     3    Carol      Barbara
 12     2      5    12    Henry      Judy
 13     2      6    13    James      Louise
 14     2      7     1    Jane       Alfred
 15     2     10    18    John       Thomas
 16     2     14    16    Mary       Robert
 17     2     17     9    Ronald     Jeffrey
 18     2     19    15    William    Philip
 19     3      1    17    Alfred     Ronald
 20     3      4    13    Carol      Louise
 21     3      6     9    James      Jeffrey
 22     3      7    14    Jane       Mary
 23     3      8    10    Janet      John
 24     3     11     5    Joyce      Henry
 25     3     12     3    Judy       Barbara
 26     3     16    15    Robert     Philip
 27     3     18     2    Thomas     Alice
 28     4      3    15    Barbara    Philip
 29     4      4     1    Carol      Alfred
 30     4      5    13    Henry      Louise
 31     4      7    11    Jane       Joyce
 32     4     10    16    John       Robert
 33     4     12     9    Judy       Jeffrey
 34     4     14     8    Mary       Janet
 35     4     18    17    Thomas     Ronald
 36     4     19     2    William    Alice
 37     5      1    19    Alfred     William
 38     5      4    12    Carol      Judy
 39     5      5     7    Henry      Jane
 40     5      8     3    Janet      Barbara
 41     5     11    15    Joyce      Philip
 42     5     13    10    Louise     John
 43     5     14     9    Mary       Jeffrey
 44     5     16    18    Robert     Thomas
 45     5     17     6    Ronald     James
 46     6      2    17    Alice      Ronald
 47     6      4    11    Carol      Joyce
 48     6      5    14    Henry      Mary
 49     6      9    15    Jeffrey    Philip
 50     6     10     3    John       Barbara
 51     6     12     6    Judy       James
 52     6     13     8    Louise     Janet
 53     6     16     7    Robert     Jane
 54     6     18    19    Thomas     William
 

The code used is presented below. The logic should port readily to a C program.

data pairs;
   set sashelp.class(keep = name);
   ID = _N_;
run;

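data monthly;
   array k[6, 19] _temporary_;      /* k[mon, i] = partner of person i in month mon */
   array m[19] _temporary_;         /* m[i] = 1 once person i is paired this month */
   array names[19] $8 _temporary_;
   /* Load all 19 names into memory in a single pass */
   do i = 1 by 1 until(last);
      set pairs end = last;
      names[i] = name;
   end;
   call streaminit(123);            /* fixed seed makes the draw reproducible */
   do mon = 1 to 6;
      do p = 1 to 9;
      THERE1:
         /* Draw a first person who is not yet paired this month */
         r1 = ceil(rand("Uniform") * 19);
         if m[r1] then goto THERE1;
         m[r1] = 1;
      THERE2:
         /* Draw a second person who is not yet paired this month */
         r2 = ceil(rand("Uniform") * 19);
         if m[r2] then goto THERE2;
         * Check for previous selections;
         do v = 1 to mon - 1;
            if k[v, r1] = r2 or k[v, r2] = r1 then do;
               /* This pair has already met: release r1 and redraw both */
               m[r1] = .;
               goto THERE1;
            end;
         end;
         m[r2] = 1;
         k[mon, r1] = r2;
         k[mon, r2] = r1;
         name1 = names[r1];
         name2 = names[r2];
         output;
      end;
      call missing(of m[*]);        /* reset the flags for the next month */
   end;

   keep mon r1 r2 name1 name2;
run;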

proc sort data = monthly;
by mon r1 r2;
run;

proc print data = monthly;
run;
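
One way to double-check the result is a quick query for repeated pairs. This is a sketch that assumes only the WORK.MONTHLY data set created above. With two arguments, MIN and MAX act row-wise in PROC SQL, so each pair is put into a fixed order before counting.

```sas
proc sql;
   /* List any pair that occurs in more than one month, whatever the order */
   select min(r1, r2) as a,
          max(r1, r2) as b,
          count(*)    as times
   from work.monthly
   group by calculated a, calculated b
   having count(*) > 1;
quit;
```

If the selection worked as intended, the query selects no rows.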


Contributor
Posts: 49

Re: Coffee anyone?

Hi datasp

 

I must say I like your approach, as it gives me a fresh perspective on this problem and its solution. ;-)

In our world, new people join the company / coffee club and others leave, so we need to accommodate newcomers and leavers, which is easy enough.

%Let Participants = 19;

We set this based on the number of names in our list.

Running the list 6 months ahead of time thus becomes a small problem, as it would not include newcomers who join in, say, month 3.

Sorry to be a pain in the butt. :-(

Hence the requirement to save the HistList and append each month's currentpairing to it.

 

I will see what I can do to load the Histlist into an array and do the checks anyway.

Thank you so much for a very clever and well thought-out answer.

:-)
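
For what it is worth, the idea of loading the HistList into an array and pairing one month at a time might look like the sketch below. Everything named here is an assumption rather than something confirmed in the thread: WORK.PARTICIPANTS(ID Name) holds the current members, WORK.HISTLIST(LastDate InviterID InvitedID) holds past pairings, IDs are never reused, and no ID exceeds the hypothetical macro variable MaxID. Instead of redrawing with GOTO, it shuffles the whole list, pairs neighbours, and starts over if any pair met recently. The cap on attempts is there so the step always ends, even when no clash-free pairing exists.

```sas
%let MaxID = 19;   /* assumption: highest ID ever issued */

data work.currentpairing(keep=LastDate InviterID InvitedID);
   array met[&MaxID, &MaxID] _temporary_;  /* met[a,b]=1: a and b paired recently */
   array ids[&MaxID] _temporary_;          /* IDs of this month's participants */
   format LastDate date9.;

   /* One pass over both inputs: recent history first, then participants */
   do until (done);
      set work.histlist(in=inhist
                        where=(LastDate >= intnx('month', today(), -6, 'same')))
          work.participants
          end=done;
      if inhist then do;
         met[InviterID, InvitedID] = 1;
         met[InvitedID, InviterID] = 1;
      end;
      else do;
         n + 1;
         ids[n] = ID;
      end;
   end;

   call streaminit(2016);   /* arbitrary fixed seed, for a repeatable run */
   ok = 0;
   do attempt = 1 to 1000 until (ok);
      /* Fisher-Yates shuffle of the participant IDs */
      do i = n to 2 by -1;
         j = ceil(rand('Uniform') * i);
         tmp = ids[i]; ids[i] = ids[j]; ids[j] = tmp;
      end;
      /* Neighbours form pairs; with an odd count the last ID sits out */
      ok = 1;
      do i = 1 to n - 1 by 2;
         if met[ids[i], ids[i+1]] then ok = 0;
      end;
   end;

   if ok then do i = 1 to n - 1 by 2;
      LastDate  = today();
      InviterID = ids[i];
      InvitedID = ids[i+1];
      output;
   end;
   else put 'WARNING: no clash-free pairing found in 1000 attempts.';
   stop;
run;

/* Add this month's pairs to the history for future checks */
proc append base=work.histlist data=work.currentpairing;
run;
```

Newcomers only need a row in WORK.PARTICIPANTS with a fresh ID, and leavers are simply dropped from it. Their old rows in the history do no harm.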

 

Super Contributor
Posts: 298

Re: Coffee anyone?

Your earlier post did not mention people leaving and joining partway through. Here are some questions.


Do you assign a new ID to each person who joins, and are the IDs of those who left retired for good?

What is the approximate number of leavers and joiners per month?

What is the maximum number of IDs to be handled in your realistic circumstances?

 

You may work on a sample input data set and show the required output. It would be better to close this thread and start a new one with a different subject line, posting your data there. If you consider my earlier answer acceptable, please mark it as such by clicking the appropriate box.

 

 

Super User
Posts: 10,041

Re: Coffee anyone?

It is hard to understand what you want to do.

Post all your data, and explain what you need step by step using that data.

☑ This topic is solved.
