DATA Step, Macro, Functions and more

Help in SAS Program automation

Reply
New Contributor
Posts: 4

Help in SAS Program automation

Hi all

I would need some help in writing a program to automate these steps:

1) Group client according to start & end time (if they overlapped, they will be group together) and the room number

2) Flag out the clients if their total sum of 'Est amount' is +/-$5000 - these could be of any combination within the group

Below is a set of sample data and the outcome i would like to achieve. Thanks.

Original Data

client_idEst amountMax PlayAvg. Playin_dateout_dateStart TimeEnd TimetimeplayedRoom Number
P1330002000050003/7/20123/7/20124:17:00 PM6:34:00 PM

137

4444

P2-30000700020003/7/20123/7/20124:31:00 PM6:16:00 PM1054444
P1-23300030000150003/7/20123/7/20126:34:00 PM7:55:00 PM814444
P3-5000050000500003/7/20123/7/20126:59:00 PM7:00:00 PM14444
P1-4700020000100003/7/20123/7/20129:05:00 PM9:48:00 PM436789
P24200025000100003/7/20123/7/20129:24:00 PM10:49:00 PM856789
P480002100080003/7/20123/7/20129:38:00 PM10:55:00 PM776789

Step 1

client_idEst amountMax PlayAvg. Playin_dateout_dateStart TimeEnd TimetimeplayedRoom Number
Group 1
P1330002000050003/7/20123/7/20124:17:00 PM6:34:00 PM

137

4444

P2-30000700020003/7/20123/7/20124:31:00 PM6:16:00 PM1054444
Group 2
P1-23300030000150003/7/20123/7/20126:34:00 PM7:55:00 PM814444
P3-5000050000500003/7/20123/7/20126:59:00 PM7:00:00 PM14444

Group 3

P1-4700020000100003/7/20123/7/20129:05:00 PM9:48:00 PM436789
P24200025000100003/7/20123/7/20129:24:00 PM10:49:00 PM856789
P480002100080003/7/20123/7/20129:38:00 PM10:55:00 PM776789

Step 2

CombinationsFlagged
Group 1
P1 & P2P1, P2

Group 2
P1 & P3

Group 3

P1 & P2P1 & P2
P1 & P4
P1 & P2 & P4P1 &P2 & P4
Super User
Posts: 9,687

Re: Help in SAS Program automation

"Group client according to start & end time (if they overlapped, they will be group together) and the room number"

Your post is obscure . where is start & end time and room number .

What is the meaning of "+/-$5000" ?

New Contributor
Posts: 4

Re: Help in SAS Program automation

Hi Ksharp

Thank you for the response. The Start time and end time is listed on the original data (column 7 and 8 respectively). Room number is on the last column. You might have to scroll to the right to see the figures.

I would like to flag out the combination where the sum of their "Est amount" (column 2) is within the range of + or - $5000.

Step 1 and 2 show the output that i would like to achieve.

Frequent Contributor
Posts: 95

Re: Help in SAS Program automation

data x;

input client_id $ Est_Amount Max_Play Avg_Play in_date anydtdte. out_date anydtdte.

Start_Time time11. End_Time time12. timeplayed Room_Number ;

IN_Datetime = in_date * 86400 + Start_Time;

Out_Datetime = out_date * 86400 + End_Time;

format IN_Datetime Out_Datetime datetime20. In_Date Out_Date Date9. Start_Time End_Time time8.;

datalines;

P1 33000 20000 5000 3/7/2012 3/7/2012 04:17:00 PM 06:34:00 PM 137 4444

P2 -30000 7000 2000 3/7/2012 3/7/2012 04:31:00 PM 06:16:00 PM 105 4444

P1 -233000 30000 15000 3/7/2012 3/7/2012 06:34:00 PM 07:55:00 PM 81 4444

P3 -50000 50000 50000 3/7/2012 3/7/2012 06:59:00 PM 07:00:00 PM 1 4444

P1 -47000 20000 10000 3/7/2012 3/7/2012 09:05:00 PM 09:48:00 PM 43 6789

P2 42000 25000 10000 3/7/2012 3/7/2012 09:24:00 PM 10:49:00 PM  85 6789

P4 8000 21000 8000 3/7/2012 3/7/2012 09:38:00 PM 10:55:00 PM  77 6789

;

run;

data want1;

    if _n_ = 1 then GroupId = 0;

    retain GroupId;

    set x;

    by Room_Number notsorted;

    Lag_Out_Datetime = LAG(Out_Datetime);

    if first.Room_Number or Lag_Out_Datetime < In_Datetime then do;

        GroupId + 1;

    end;

    drop Out_Datetime In_Datetime Lag:;

run;

proc sql noprint;

  select max(Freq) into :maxfreq

  from (

  select GroupId, Count(1) as Freq from want1 group by GroupId);

quit;

data want2;

  array p[1:&maxfreq] $3;

  array total[1:&maxfreq];

  do k=1 to &maxfreq;

    p = ' ';

    total = .;

  end;

  n = 0;

  if _n_ = 1 then do;

    declare hash h(ordered:'a');

    h.defineKey('client_id');

    h.defineData('client_id');

    h.defineDone();

    declare hiter hi('h');

  end;

  do until(last.GroupId);

      set want1;

      by GroupId;

    n + 1;

    p = client_id;

    total = est_amount;

  end;

  do k=2 to n;

    ncomb = comb(n,k);

    do i=1 to ncomb;

      rc=allcomb(i, k, of p

  • );
  •       rc=allcomb(i, k, of total

  • );
  •       tot = 0;

          length Combinations Flagged $200;

          h.clear();

          Combinations = '';

          do j=1 to k;

            tot = SUM(tot,total);

            client_id = p;

            h.add();

          end;

          rc = hi.first();

          do while(rc = 0);

            Combinations = CATX(',',Combinations,client_id);

            rc = hi.next();

          end;

          tot_flag = (ABS(tot) <= 5000);

          if tot_flag ne 1 then Flagged = '';

          else Flagged = Combinations;

          output;

       end;

    end;

    keep GroupId Combinations Flagged;

    run;

    Super User
    Posts: 9,687

    Re: Help in SAS Program automation

    Do you consider to use In_date and Out_date as grouping variable too ?

    If you only consider " start & end time and the room number" ,that might be meet a error .

    Anyway , Anyhow , that is really not easy for all the combination.

    data x;
    input client_id $ Est_Amount Max_Play Avg_Play in_date : mmddyy10. out_date : mmddyy10.
    Start_Time & time11. End_Time  & time12. timeplayed Room_Number ;
    
    format  In_Date Out_Date Date9. Start_Time End_Time time8.;
    datalines;
    P1 33000 20000 5000 3/7/2012 3/7/2012 04:17:00 PM  06:34:00 PM  137 4444
    P2 -30000 7000 2000 3/7/2012 3/7/2012 04:31:00 PM  06:16:00 PM  105 4444
    P1 -233000 30000 15000 3/7/2012 3/7/2012 06:34:00 PM  07:55:00 PM  81 4444
    P3 -50000 50000 50000 3/7/2012 3/7/2012 06:59:00 PM  07:00:00 PM  1 4444
    P1 -47000 20000 10000 3/7/2012 3/7/2012 09:05:00 PM  09:48:00 PM  43 6789
    P2 42000 25000 10000 3/7/2012 3/7/2012 09:24:00 PM  10:49:00 PM   85 6789
    P4 8000 21000 8000 3/7/2012 3/7/2012 09:38:00 PM  10:55:00 PM   77 6789
    ;
    run;
    data one;
     set x ;
     if (Room_Number=lag(Room_Number) and lag(End_Time) lt Start_Time) or (Room_Number ne lag(Room_Number)) then group+1;
    run;
    proc transpose data=one out=temp(drop=_name_) ;
    by group;
    id client_id;
    var Est_Amount;
    run; 
    data two(keep=group var);
     set temp;
     length var $ 200;
     array x{*} p: ;
      n=dim(x); 
    do k=2 to n;
     ncomb=comb(n,k);
     do j=1 to ncomb;
      call allcomb(j, k, of x{*});
      flag=0;sum=0;var=' ';
      do i=1 to k;
       if missing(x{i}) then flag=1;
      end;
      if not flag then do;
                          do i=1 to k;
                            sum+x{i};
                          end;
                          if abs(sum) le 5000 then do;
                                                     do i=1 to k;
                                                      var=catx(',',var,x{i});
                                                     end;
                                                     output;
                                                    end;
                        end;
     end;
    end;
    run;
    data two(keep=group var _var);
     merge two(in=a) temp;
     by group;
     length _var $ 200;
     array x{*} p: ;
     if a then do;
                 do j=1 to countw(var,',');
                  temp=input(scan(var,j,','),best32.);
                  do i=1 to dim(x);
                   if temp eq x{i} and not findw(_var,strip(vname(x{i}))) then _var=catx(',',_var,vname(x{i}));
                  end;
                 end;
                 output;
               end;
    run;
    
    
    
    
    

    Ksharp

    Ask a Question
    Discussion stats
    • 4 replies
    • 288 views
    • 0 likes
    • 3 in conversation