Hi All,
I am facing a bit of dead end with the logic. Thought to post in on the forum to seek help.
I have the following datasets.
data have1;
input case action_in_date action_out_date priority;
cards;
100 28-MAY-2015 29SEP2015 02
100 04-SEP-2015 27NOV2015 03
100 27-NOV-2015 17SEP2016 04
;
run;
data have2;
input case post_date amount;
cards;
100 18-SEP-2015 250
100 19-SEP-2015 100
100 30-NOV-2015 300
;
run;
Now, the post date in first 2 rows (18-SEP-2015 and 19-SEP-2015) in have2 dataset have a post date which falls between two date ranges in have1 dataset. i e. between 28-MAY-2015 and 29SEP2015 and 04-SEP-2015 and 27NOV2015. But i want the money to be allocated to highest priority. ie between 04-SEP-2015 and 27NOV2015 which has a priority 03. The output i am looking for is below
/*output;*/
100 04-SEP-2015 27NOV2015 03 350
100 27-NOV-2015 17SEP2016 04 300
Any help is highly appreciated.
I presume you have different date ranges for difference CASE values. If so, I think there are 3 steps.
%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));
/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
set have1;
by case;
array _aidate{&lobound:&upbound} _temporary_;
array _aodate{&lobound:&upbound} _temporary_;
array _priorities{&lobound:&upbound} _temporary_;
if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);
retain min_date max_date;
min_date=min(min_date,action_in_date);
max_date=max(max_date,action_out_date);
do post_date=action_in_date to action_out_date;
if priority>_priorities{post_date} then do;
_priorities{post_date}=priority;
_aidate{post_date}=action_in_date;
_aodate{post_date}=action_out_date;
end;
end;
if last.case then do post_date=min_date to max_date;
action_in_date=_aidate{post_date};
action_out_date=_aidate{post_date};
output;
format post_date date9.;
end;
run;
/** step 2 **/
data need / view=need;
merge date_groups have2 (in=inh);
by case post_date;
if inh;
run;
/** step 3 **/
proc summary data=need noprint nway ;
class case action_in_date action_out_date ;
var amount;
output out=want sum(amount)=total_amount;
run;
Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.
All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects. This is relatively simpler to understand.
I presume you have different date ranges for difference CASE values. If so, I think there are 3 steps.
%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));
/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
set have1;
by case;
array _aidate{&lobound:&upbound} _temporary_;
array _aodate{&lobound:&upbound} _temporary_;
array _priorities{&lobound:&upbound} _temporary_;
if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);
retain min_date max_date;
min_date=min(min_date,action_in_date);
max_date=max(max_date,action_out_date);
do post_date=action_in_date to action_out_date;
if priority>_priorities{post_date} then do;
_priorities{post_date}=priority;
_aidate{post_date}=action_in_date;
_aodate{post_date}=action_out_date;
end;
end;
if last.case then do post_date=min_date to max_date;
action_in_date=_aidate{post_date};
action_out_date=_aidate{post_date};
output;
format post_date date9.;
end;
run;
/** step 2 **/
data need / view=need;
merge date_groups have2 (in=inh);
by case post_date;
if inh;
run;
/** step 3 **/
proc summary data=need noprint nway ;
class case action_in_date action_out_date ;
var amount;
output out=want sum(amount)=total_amount;
run;
Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.
All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects. This is relatively simpler to understand.
Like this?
data HAVE1;
input CASE ACTION_IN_DATE date9. ACTION_OUT_DATE : date9. PRIORITY;
format ACTION_IN_DATE ACTION_OUT_DATE date9.;
cards;
100 28MAY2015 29SEP2015 02
100 04SEP2015 27NOV2015 03
100 27NOV2015 17SEP2016 04
run;
data HAVE2;
input CASE POST_DATE date9. AMOUNT;
format POST_DATE date9.;
ID+1;
cards;
100 18SEP2015 250
100 19SEP2015 100
100 30NOV2015 300
run;
proc sql;
create table GET_CARTESIAN as
select have1.CASE
, ACTION_IN_DATE
, ACTION_OUT_DATE
, PRIORITY
, ID
, AMOUNT
from HAVE1, HAVE2
where have1.CASE=have2.CASE
and ACTION_IN_DATE <= POST_DATE <= ACTION_OUT_DATE
order by 1, ID, PRIORITY;
quit;
data GET_PRIORITY;
set GET_CARTESIAN;
by CASE ID ;
if last.ID;
run;
proc sql;
select a.CASE
, ACTION_IN_DATE
, ACTION_OUT_DATE
, PRIORITY
, sum(AMOUNT)
from GET_PRIORITY a
group by 1,2,3,4;
quit;
CASE | ACTION_IN_DATE | ACTION_OUT_DATE | PRIORITY | |
---|---|---|---|---|
100 | 04SEP2015 | 27NOV2015 | 3 | 350 |
100 | 27NOV2015 | 17SEP2016 | 4 | 300 |
Thankyou both. Much appreciated
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.