Hi all,
I've rolled up email data at a date level for the year and I want to add a couple columns showing from each date, how many recipients or emails there were in the past week and I would like to write this out as a selection. The sample data set is attached, and you you look at it, I'd be looking to add a column where on 01/10/19 the value showing the total communications in the past 7 days would equal 6 total communications in the past week.
Thanks for your help and let me know if you need further clarification.
The way I would do it is using a proc sql (which a lot of people suggest not to) and do a self join.
if b.date in between a.date-7 (use intnx) and a.date then sum(comms) and group by a.date.
Question: are you asking for 7-day rolling sums? I notice that you have some missing dates in the time frame, but that can still be accommodated in a data step.
I'm not going to transcribe test data you provide in a picture instead of as text, so this program is untested. It assumes your data set (named HAVE below) is sorted by date, starts at Jan 1, 2019, and can have missing SEND_DATES:
data dummy_dates;
total_opens=0;
do send_date='01jan2019'd to today();
run;
data want;
merge dummy_dates have;
by send_date;
array opens {0:6} _temporary_ (7*0);
retain opens_rolling_n 0;
i=mod(date,7);
opens_rolling_n=opens_rolling_n-opens{i}+total_opens;
opens{i}=coalesce(total_opens,0);
run;
The technique here is to make a dummy dataset of all possible dates in your time frame, and assign TOTAL_OPEN=0 for each of those dates. Then merge this dataset (DUMMY_DATES) with HAVE by SENT_DATE. Any time a send_date is missing in HAVE, it is added with total_opens=0. When a send_date is present in both datasets, the value in HAVE prevails (because it is the rightmost dataset in the MERGE statement).
The program has an array for the 7 most recent values of total_open. A running OPENS_ROLLING_N is updated by adding in the current and subtracting out the 7-day old value (found in the array).
If you want to avoid having less than 7 day total in the first 6 observations, then:
data want (drop=_:);
merge dummy_dates have;
by send_date;
array opens {0:6} _temporary_ (7*0);
retain _opens 0;
i=mod(date,7);
_opens=_opens-opens{i}+total_opens;
opens{i}=coalesce(total_opens,0);
if _n_>6 then opens_rolling_n=_opens;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.