Hello there,
I have a data set as follows:
date Eventdate sales company
20150101 20150105 700 A
20150102 20150105 500 A
20150103 20150105 300 A
20150104 20150105 300 A
20150105 20150105 600 A
20150106 20150105 700 A
20150107 20150105 500 A
20150108 20150105 600 A
20150109 20150105 300 A
20150110 20150105 800 A
20150111 20150105 700 A
20150112 20150105 600 A
20150101 20150109 2000 B
20150102 20150109 2100 B
20150103 20150109 4400 B
20150104 20150109 3200 B
20150105 20150109 1600 B
20150106 20150109 2200 B
20150107 20150109 4200 B
20150108 20150109 2500 B
20150109 20150109 3600 B
20150110 20150109 4700 B
20150111 20150109 3300 B
20150112 20150109 2600 B
20150113 20150109 2300 B
20150114 20150109 3400 B
20150115 20150109 3000 B
20150116 20150109 2100 B
20150117 20150109 3600 B
20150118 20150109 3800 B
I want to calculate the three days’ total before and after the Eventdate. The new data should look like the following:
date Eventdate sales company aftertotal beforetotal
20150101 20150105 700 A 1800 1100
20150102 20150105 500 A 1800 1100
20150103 20150105 300 A 1800 1100
20150104 20150105 300 A 1800 1100
20150105 20150105 600 A 1800 1100
20150106 20150105 700 A 1800 1100
20150107 20150105 500 A 1800 1100
20150108 20150105 600 A 1800 1100
20150109 20150105 300 A 1800 1100
20150110 20150105 800 A 1800 1100
20150111 20150105 700 A 1800 1100
20150112 20150105 600 A 1800 1100
20150101 20150109 2000 B 10600 8900
20150102 20150109 2100 B 10600 8900
20150103 20150109 4400 B 10600 8900
20150104 20150109 3200 B 10600 8900
20150105 20150109 1600 B 10600 8900
20150106 20150109 2200 B 10600 8900
20150107 20150109 4200 B 10600 8900
20150108 20150109 2500 B 10600 8900
20150109 20150109 3600 B 10600 8900
20150110 20150109 4700 B 10600 8900
20150111 20150109 3300 B 10600 8900
20150112 20150109 2600 B 10600 8900
20150113 20150109 2300 B 10600 8900
20150114 20150109 3400 B 10600 8900
20150115 20150109 3000 B 10600 8900
20150116 20150109 2100 B 10600 8900
20150117 20150109 3600 B 10600 8900
20150118 20150109 3800 B 10600 8900
Can anyone help me? Thanks a lot!
data step solution:
data intermediate (keep=company aftertotal beforetotal);
set have;
by company;
retain aftertotal beforetotal;
if first.company
then do;
aftertotal = 0;
beforetotal = 0;
end;
if eventdate - 3 <= date < eventdate then beforetotal + sales;
if eventdate < date <= eventdate + 3 then aftertotal + sales;
if last.company then output;
run;
data want;
merge
have
intermediate
;
by company;
run;
data step solution:
data intermediate (keep=company aftertotal beforetotal);
set have;
by company;
retain aftertotal beforetotal;
if first.company
then do;
aftertotal = 0;
beforetotal = 0;
end;
if eventdate - 3 <= date < eventdate then beforetotal + sales;
if eventdate < date <= eventdate + 3 then aftertotal + sales;
if last.company then output;
run;
data want;
merge
have
intermediate
;
by company;
run;
Hi Kurt,
Thank you. Your code works perfectly!
If your data is not big , you can try SQL. data have; input date : yymmdd10. Eventdate : yymmdd10. sales company $; format date Eventdate yymmdd10.; cards; 20150101 20150105 700 A 20150102 20150105 500 A 20150103 20150105 300 A 20150104 20150105 300 A 20150105 20150105 600 A 20150106 20150105 700 A 20150107 20150105 500 A 20150108 20150105 600 A 20150109 20150105 300 A 20150110 20150105 800 A 20150111 20150105 700 A 20150112 20150105 600 A 20150101 20150109 2000 B 20150102 20150109 2100 B 20150103 20150109 4400 B 20150104 20150109 3200 B 20150105 20150109 1600 B 20150106 20150109 2200 B 20150107 20150109 4200 B 20150108 20150109 2500 B 20150109 20150109 3600 B 20150110 20150109 4700 B 20150111 20150109 3300 B 20150112 20150109 2600 B 20150113 20150109 2300 B 20150114 20150109 3400 B 20150115 20150109 3000 B 20150116 20150109 2100 B 20150117 20150109 3600 B 20150118 20150109 3800 B ; run; proc sql; create table want as select *, (select sum(sales) from have where company=a.company and date between a.Eventdate+1 and a.Eventdate+3) as aftertotal , (select sum(sales) from have where company=a.company and date between a.Eventdate-3 and a.Eventdate-1) as beforetotal from have as a; quit;
Hi Keshan,
My data is actually very big. You're always so warm-hearted. Whenever I have a questions about SAS code, I googled the solution first and I see you helped a lot of people here! Thank you!
The difference between the data step and the SQL solution is that with the data step, the creation of the intermediate sums is done explicitly, while the SQL does it implicitly (albeit with a note about the remerging of summary statistics). It's just that the way that SQL goes about it is often not really performant, and that is the reason why @Ksharp noted a dependency on data set size.
Still it's good to know the SQL way, as you might have to use it when dealing with data in a remote SQL database.
Hi Kurt,
Thank you so much for explaining the difference. It's really good for a learner like me.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.