BookmarkSubscribeRSS Feed
Dogo23
Quartz | Level 8

Hi all,

 

I've rolled up email data at a date level for the year and I want to add a couple columns showing from each date, how many recipients or emails there were in the past week and I would like to write this out as a selection. The sample data set is attached, and you you look at it, I'd be looking to add a column where on 01/10/19 the value showing the total communications in the past 7 days would equal 6 total communications in the past week. 

 

Thanks for your help and let me know if you need further clarification. 

 

 

capture.jpg

3 REPLIES 3
Reeza
Super User
Can you show what you expect as output please?
Please post data as text, not an image.

Otherwise, you're likely looking for PROC MEANS with a format on data and possibly a WHERE statement to filter your data dynamically.
Ramakanthkrovi
Obsidian | Level 7

The way I would do it is using a proc sql (which a lot of people suggest not to) and do a self join.

 

if b.date in between a.date-7 (use intnx) and a.date then sum(comms) and group by a.date.

 

 

mkeintz
PROC Star

 

Question: are you asking for 7-day rolling sums?  I notice that you have some missing dates in the time frame, but that can still be accommodated in a data step.

 

I'm not going to transcribe test data you provide in a picture instead of as text, so this program is untested.  It assumes your data set (named HAVE below) is sorted by date, starts at Jan 1, 2019, and can have missing SEND_DATES:

 

data dummy_dates;  
  total_opens=0;
  do send_date='01jan2019'd to today();
run;

data want;
  merge dummy_dates have;
  by send_date;

  array opens {0:6} _temporary_ (7*0);
  retain opens_rolling_n 0;

  i=mod(date,7);

  opens_rolling_n=opens_rolling_n-opens{i}+total_opens;
  opens{i}=coalesce(total_opens,0);
run;
  

 

The technique here is to make a dummy dataset of all possible dates in your time frame, and assign TOTAL_OPEN=0 for each of those dates.  Then merge this dataset (DUMMY_DATES) with HAVE by SENT_DATE.  Any time a send_date is missing in HAVE, it is added with total_opens=0.  When a send_date is present in both datasets, the value in HAVE prevails (because it is the rightmost dataset in the MERGE statement).

 

The program has an array for the 7 most recent values of total_open.  A running OPENS_ROLLING_N is updated by adding in the current and subtracting out the 7-day old value (found in the array).

 

If you want to avoid having less than 7 day total in the first 6 observations, then:

 

data want (drop=_:);
  merge dummy_dates have;
  by send_date;

  array opens {0:6} _temporary_ (7*0);
  retain _opens 0;

  i=mod(date,7);

  _opens=_opens-opens{i}+total_opens;
  opens{i}=coalesce(total_opens,0);

  if _n_>6 then opens_rolling_n=_opens;
run;
  

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 1461 views
  • 0 likes
  • 4 in conversation