Hi All,
I am facing a bit of dead end with the logic. Thought to post in on the forum to seek help.
I have the following datasets.
data have1;
input case action_in_date action_out_date priority;
cards;
100 28-MAY-2015 29SEP2015 02
100 04-SEP-2015 27NOV2015 03
100 27-NOV-2015 17SEP2016 04
;
run;
data have2;
input case post_date amount;
cards;
100 18-SEP-2015 250
100 19-SEP-2015 100
100 30-NOV-2015 300
;
run;
Now, the post date in first 2 rows (18-SEP-2015 and 19-SEP-2015) in have2 dataset have a post date which falls between two date ranges in have1 dataset. i e. between 28-MAY-2015 and 29SEP2015 and 04-SEP-2015 and 27NOV2015. But i want the money to be allocated to highest priority. ie between 04-SEP-2015 and 27NOV2015 which has a priority 03. The output i am looking for is below
/*output;*/
100 04-SEP-2015 27NOV2015 03 350
100 27-NOV-2015 17SEP2016 04 300
Any help is highly appreciated.
I presume you have different date ranges for difference CASE values. If so, I think there are 3 steps.
%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));
/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
set have1;
by case;
array _aidate{&lobound:&upbound} _temporary_;
array _aodate{&lobound:&upbound} _temporary_;
array _priorities{&lobound:&upbound} _temporary_;
if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);
retain min_date max_date;
min_date=min(min_date,action_in_date);
max_date=max(max_date,action_out_date);
do post_date=action_in_date to action_out_date;
if priority>_priorities{post_date} then do;
_priorities{post_date}=priority;
_aidate{post_date}=action_in_date;
_aodate{post_date}=action_out_date;
end;
end;
if last.case then do post_date=min_date to max_date;
action_in_date=_aidate{post_date};
action_out_date=_aidate{post_date};
output;
format post_date date9.;
end;
run;
/** step 2 **/
data need / view=need;
merge date_groups have2 (in=inh);
by case post_date;
if inh;
run;
/** step 3 **/
proc summary data=need noprint nway ;
class case action_in_date action_out_date ;
var amount;
output out=want sum(amount)=total_amount;
run;
Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.
All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects. This is relatively simpler to understand.
I presume you have different date ranges for difference CASE values. If so, I think there are 3 steps.
%let lobound=%sysfunc(inputn(01may2015,date9.));
%let upbound=%sysfunc(inputn(30sep2016,date9.));
/** step 1 **/
data date_groups (keep=case action_in_date action_out_date post_date) / veiw=date_groups;
set have1;
by case;
array _aidate{&lobound:&upbound} _temporary_;
array _aodate{&lobound:&upbound} _temporary_;
array _priorities{&lobound:&upbound} _temporary_;
if first.case then call missing(of _aidate{*},of _aodate{*}, of _priorities{*}, min_date, max_date);
retain min_date max_date;
min_date=min(min_date,action_in_date);
max_date=max(max_date,action_out_date);
do post_date=action_in_date to action_out_date;
if priority>_priorities{post_date} then do;
_priorities{post_date}=priority;
_aidate{post_date}=action_in_date;
_aodate{post_date}=action_out_date;
end;
end;
if last.case then do post_date=min_date to max_date;
action_in_date=_aidate{post_date};
action_out_date=_aidate{post_date};
output;
format post_date date9.;
end;
run;
/** step 2 **/
data need / view=need;
merge date_groups have2 (in=inh);
by case post_date;
if inh;
run;
/** step 3 **/
proc summary data=need noprint nway ;
class case action_in_date action_out_date ;
var amount;
output out=want sum(amount)=total_amount;
run;
Note I've set LOBOUND and UPBOUND to the earliest and latest expected action dates over the entire set of CASEs.
All this could be done in a single DATA step, but I don't see how to do so without introducing hash objects. This is relatively simpler to understand.
Like this?
data HAVE1;
input CASE ACTION_IN_DATE date9. ACTION_OUT_DATE : date9. PRIORITY;
format ACTION_IN_DATE ACTION_OUT_DATE date9.;
cards;
100 28MAY2015 29SEP2015 02
100 04SEP2015 27NOV2015 03
100 27NOV2015 17SEP2016 04
run;
data HAVE2;
input CASE POST_DATE date9. AMOUNT;
format POST_DATE date9.;
ID+1;
cards;
100 18SEP2015 250
100 19SEP2015 100
100 30NOV2015 300
run;
proc sql;
create table GET_CARTESIAN as
select have1.CASE
, ACTION_IN_DATE
, ACTION_OUT_DATE
, PRIORITY
, ID
, AMOUNT
from HAVE1, HAVE2
where have1.CASE=have2.CASE
and ACTION_IN_DATE <= POST_DATE <= ACTION_OUT_DATE
order by 1, ID, PRIORITY;
quit;
data GET_PRIORITY;
set GET_CARTESIAN;
by CASE ID ;
if last.ID;
run;
proc sql;
select a.CASE
, ACTION_IN_DATE
, ACTION_OUT_DATE
, PRIORITY
, sum(AMOUNT)
from GET_PRIORITY a
group by 1,2,3,4;
quit;
| CASE | ACTION_IN_DATE | ACTION_OUT_DATE | PRIORITY | |
|---|---|---|---|---|
| 100 | 04SEP2015 | 27NOV2015 | 3 | 350 |
| 100 | 27NOV2015 | 17SEP2016 | 4 | 300 |
Thankyou both. Much appreciated
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.