BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tSAS1
Obsidian | Level 7

I have a database which contains IDs , dates and costs like this:

IDdatecostDifference
ID125/02/2020201
ID126/02/202010.
ID201/05/2020156
ID207/05/202023.
ID303/01/2019181
ID304/01/2019171
ID305/01/201911.

what I want to do is to sum the cost by ID and by dates with gap of 1 to 6 days, so the required data would be as below:

IDdatecost
ID125/02/202030
ID201/05/202038
ID303/01/201946

 

What I tried to do is to create another date variable that takes the same date if we have gaps, and then do the sum by this date variable and the ID with this code but it doesn't work:

 

data new;
set old;
format new_date yymmdd10.;
if (1<=lag_Diff<=6 ) then new_date=lag_date;
if (1<=lag_Diff<=6 and 1<=lag_lag_Diff<=6) then new_date=lag_lag_date;

if new_date eq . then new_date=date;
run;

I would be thankful if you help me 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

First, create a "group date", then run PROC SUMMARY:

data have;
input ID $ date :ddmmyy10. cost  Difference;
format date yymmdd10.;
datalines;
ID1 25/02/2020  20  1
ID1 26/02/2020  10  .
ID2 01/05/2020  15  6
ID2 07/05/2020  23  .
ID3 03/01/2019  18  1
ID3 04/01/2019  17  1
ID3 05/01/2019  11  .
;

data grouped;
set have;
by id;
retain gapdate;
format gapdate yymmdd10.;
if first.id
then gapdate = date;
else if date - gapdate > 6 then gapdate = date;
run;

proc summary data=grouped;
by id gapdate;
var cost;
output
  out=want (
    drop=_:
    rename=(gapdate=date)
  )
  sum()=
;
run;

proc print data=want noobs;
run;

Result:

ID	date	cost
ID1	2020-02-25	30
ID2	2020-05-01	38
ID3	2019-01-03	46

 

 Edit: changed the conditions in the second data step to avoid a "missing" NOTE.

View solution in original post

2 REPLIES 2
Kurt_Bremser
Super User

First, create a "group date", then run PROC SUMMARY:

data have;
input ID $ date :ddmmyy10. cost  Difference;
format date yymmdd10.;
datalines;
ID1 25/02/2020  20  1
ID1 26/02/2020  10  .
ID2 01/05/2020  15  6
ID2 07/05/2020  23  .
ID3 03/01/2019  18  1
ID3 04/01/2019  17  1
ID3 05/01/2019  11  .
;

data grouped;
set have;
by id;
retain gapdate;
format gapdate yymmdd10.;
if first.id
then gapdate = date;
else if date - gapdate > 6 then gapdate = date;
run;

proc summary data=grouped;
by id gapdate;
var cost;
output
  out=want (
    drop=_:
    rename=(gapdate=date)
  )
  sum()=
;
run;

proc print data=want noobs;
run;

Result:

ID	date	cost
ID1	2020-02-25	30
ID2	2020-05-01	38
ID3	2019-01-03	46

 

 Edit: changed the conditions in the second data step to avoid a "missing" NOTE.

tSAS1
Obsidian | Level 7
Thank you very much for the solution

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 703 views
  • 1 like
  • 2 in conversation