Help to choose max date based on priority

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Help to choose max date based on priority

[ Edited ]

Hi All,

I am facing a bit of dead end with the logic. Thought to post in on the forum to seek help.

I have the following datasets.

 

data have1;
   input case action_in_date action_out_date priority;
   cards;
100 28-MAY-2015 29SEP2015 02
100 04-SEP-2015 27NOV2015 03
100 27-NOV-2015 17SEP2016 04
;
run;

data have2;
input case post_date amount;
   cards;
100 18-SEP-2015 250
100 19-SEP-2015 100
100 30-NOV-2015 300
;
run;

 

Now, the post date in first 2 rows (18-SEP-2015 and 19-SEP-2015)  in have2 dataset have a post date which falls between two date ranges in have1 dataset. i e. between  28-MAY-2015 and 29SEP2015 and 04-SEP-2015 and  27NOV2015. But i want the money to be allocated to highest priority. ie between 04-SEP-2015 and  27NOV2015 which has a priority 03. The output i am looking for is below

/*output;*/
100 04-SEP-2015 27NOV2015 03 350
100 27-NOV-2015 17SEP2016 04 300

 

Any help is highly appreciated.


Accepted Solutions
Solution
‎09-28-2017 08:03 AM
Trusted Advisor
Posts: 1,311

Re: Help to choose max date based on priority

I presume you have different date ranges for difference CASE values.  If so, I think there are 3 steps.

 

  1. Read all the HAVE1 records for a CASE and build an array from the mindate to the maxdate indicating the highest priority, and corresponding date-group (action_in_date:action_out_date).  That is the arrays are a map from each date to the desired date group.
  2. Read all the HAVE2 records for the same case.  Use the map above to accumulate identify which date groups do with each post_date.
  3. The aggregate the amount's for each date group:

 


%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));

/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
  set have1;
  by case;

  array _aidate{&lobound:&upbound} _temporary_;
  array _aodate{&lobound:&upbound} _temporary_;
  array _priorities{&lobound:&upbound} _temporary_;

  if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);

  retain min_date max_date;
  min_date=min(min_date,action_in_date);
  max_date=max(max_date,action_out_date);

  do post_date=action_in_date to action_out_date;
    if priority>_priorities{post_date} then do;
	  _priorities{post_date}=priority;
	  _aidate{post_date}=action_in_date;
	  _aodate{post_date}=action_out_date;
	end;
  end;

  if last.case then do post_date=min_date to max_date;
    action_in_date=_aidate{post_date};
    action_out_date=_aidate{post_date};
	output;
	format post_date date9.;
  end;
run;

/** step 2 **/
data need / view=need;
  merge date_groups have2 (in=inh);
  by case post_date;
  if inh;
run;

/** step 3 **/

proc summary data=need noprint nway ;
  class case action_in_date action_out_date ;
  var amount;
  output out=want sum(amount)=total_amount;
run;

 

Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.

 

All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects.  This is relatively simpler to understand.

View solution in original post


All Replies
Solution
‎09-28-2017 08:03 AM
Trusted Advisor
Posts: 1,311

Re: Help to choose max date based on priority

I presume you have different date ranges for difference CASE values.  If so, I think there are 3 steps.

 

  1. Read all the HAVE1 records for a CASE and build an array from the mindate to the maxdate indicating the highest priority, and corresponding date-group (action_in_date:action_out_date).  That is the arrays are a map from each date to the desired date group.
  2. Read all the HAVE2 records for the same case.  Use the map above to accumulate identify which date groups do with each post_date.
  3. The aggregate the amount's for each date group:

 


%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));

/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
  set have1;
  by case;

  array _aidate{&lobound:&upbound} _temporary_;
  array _aodate{&lobound:&upbound} _temporary_;
  array _priorities{&lobound:&upbound} _temporary_;

  if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);

  retain min_date max_date;
  min_date=min(min_date,action_in_date);
  max_date=max(max_date,action_out_date);

  do post_date=action_in_date to action_out_date;
    if priority>_priorities{post_date} then do;
	  _priorities{post_date}=priority;
	  _aidate{post_date}=action_in_date;
	  _aodate{post_date}=action_out_date;
	end;
  end;

  if last.case then do post_date=min_date to max_date;
    action_in_date=_aidate{post_date};
    action_out_date=_aidate{post_date};
	output;
	format post_date date9.;
  end;
run;

/** step 2 **/
data need / view=need;
  merge date_groups have2 (in=inh);
  by case post_date;
  if inh;
run;

/** step 3 **/

proc summary data=need noprint nway ;
  class case action_in_date action_out_date ;
  var amount;
  output out=want sum(amount)=total_amount;
run;

 

Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.

 

All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects.  This is relatively simpler to understand.

PROC Star
Posts: 2,316

Re: Help to choose max date based on priority

[ Edited ]

Like this?

data HAVE1;
   input CASE ACTION_IN_DATE date9. ACTION_OUT_DATE : date9. PRIORITY;
   format ACTION_IN_DATE ACTION_OUT_DATE  date9.;
   cards;
100 28MAY2015 29SEP2015 02
100 04SEP2015 27NOV2015 03
100 27NOV2015 17SEP2016 04
run;

data HAVE2;
  input CASE  POST_DATE date9. AMOUNT;
  format POST_DATE   date9.;
  ID+1;
  cards;
100 18SEP2015 250
100 19SEP2015 100
100 30NOV2015 300
run;
 

proc sql;
  create table GET_CARTESIAN as
  select  have1.CASE 
        , ACTION_IN_DATE 
        , ACTION_OUT_DATE 
        , PRIORITY
        , ID
        , AMOUNT
  from HAVE1, HAVE2
  where have1.CASE=have2.CASE
    and ACTION_IN_DATE <= POST_DATE <= ACTION_OUT_DATE
  order by 1, ID, PRIORITY;
quit;

data GET_PRIORITY; 
  set GET_CARTESIAN;
  by CASE ID ;
  if last.ID;
run; 

proc sql;
  select  a.CASE 
        , ACTION_IN_DATE 
        , ACTION_OUT_DATE 
        , PRIORITY
        , sum(AMOUNT)
  from GET_PRIORITY a
  group by 1,2,3,4;
quit;
CASE ACTION_IN_DATE ACTION_OUT_DATE PRIORITY  
100 04SEP2015 27NOV2015 3 350
100 27NOV2015 17SEP2016 4 300

 

New Contributor
Posts: 2

Re: Help to choose max date based on priority

Thankyou both. Much appreciated

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 225 views
  • 1 like
  • 3 in conversation