BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aperansi
Quartz | Level 8

I have a dataset that I created that pulls in data over a 13 month period. I am pulling a list of unique keys and dates, and I am getting a count of the keys within a given day.

 

The issue im running into i that I need the count to express the count of api_calls that happen in a month and go from the last Saturday of the previous month, to the last Friday of the current month. The current month should be the month where the count is recorded.

 

For example: I have an invoice for August. The start date of that invoice would be July 28, 2018 (Last Saturday of the Month) to August 31, 2018 (Last Friday of the current month). The count for August should reflect the above period. And all other months should be consistent with this as well.

 

Could someone please help me figure out how to code this into my program?

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
DBailey
Lapis Lazuli | Level 10

similar to other solution....

data have;
input api_date:mmddyy10. api_count;
format api_date mmddyy10. api_count 8.0;
datalines;
1/1/2017 125
8/2/2017 126
8/3/2017 127
8/4/2017 128
8/5/2017 129
8/6/2017 130
8/7/2017 131
8/8/2017 132
8/9/2017 133
8/10/2017 134
8/11/2017 135
8/12/2017 136
8/13/2017 137
8/14/2017 138
8/15/2017 139
8/16/2017 140
8/17/2017 141
8/18/2017 142
8/19/2017 143
8/20/2017 144
8/21/2017 145
8/22/2017 146
8/23/2017 147
8/24/2017 148
8/25/2017 149
8/26/2017 150
8/27/2017 151
8/28/2017 152
8/29/2017 153
8/30/2017 154
8/31/2017 155
9/1/2017 156
9/2/2017 157
9/3/2017 158
9/4/2017 159
9/5/2017 160
9/6/2017 161
;
run;

proc sql;
create table want as select		
	intnx('month',api_date,case when api_date>=nwkdom(5,6,month(api_date),year(api_date)) then 1 else 0 end,'begin') format mmddyy10. as  Period,
	case
		/*if api_date >= last saturday of current month then use last saturday of current month*/
		when api_date >=nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,6,month(api_date),year(api_date))
		/*else use last saturday of previous month*/
		else nwkdom(5,6,month(intnx('month',api_date,-1,'same')),year(intnx('month',api_date,-1,'same')))
		end format mmddyy10. as PeriodStart,
	case
		/*if api_date >= last saturday of current month then use last saturday of next month*/
		when api_date >=nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,6,month(intnx('month',api_date,1,'same')),year(intnx('month',api_date,1,'same')))
		/*else use last saturday of current month*/
		else nwkdom(5,6,month(api_date),year(api_date))
		end 
		/*and backup one day from that*/
		-1 format mmddyy10. as PeriodEnd,
	sum(api_count) as Total_API_Count
from	
	have
group by
	calculated period
	,calculated periodstart
	,calculated periodEnd
;
quit;

View solution in original post

8 REPLIES 8
pink_poodle
Barite | Level 11

A possible start is to express a date in a Month-day-week format

 

that has month ( 1 - 12 ), week (1 - 4), day (1 - 7)

 

Then assign variables month, week, day based on the formatted dates

 

group by month, week, day

 

(a) count whole month, (b) count last friday to the end of the month, (c) count last saturday of previous month to the end of that month.

 

final count = (a) - (b) + (c)

 

 

 

 

FreelanceReinh
Jade | Level 19

Hi @aperansi,

 

You can adapt the solution in the earlier thread to the new period definition. For example, the "invoice month" for a given api_date (e.g. '2018-08' for api_date='30JUL2018'd) can be written as

put(intnx('month',api_date,api_date>nwkdom(5,6,month(api_date),year(api_date))),yymmd7.)
aperansi
Quartz | Level 8

Im having trouble working through this in my head. Can you help me understand this a little more?

FreelanceReinh
Jade | Level 19

Sure. Let's analyze the expression from the inside outwards:

 

  1. The MONTH and YEAR functions are applied to the SAS date value in api_date to obtain month and year of the respective date. These are the third and fourth argument of function NWKDOM.
  2. The first two arguments of NWKDOM, 5 and 6, stand for the fifth (or last) Friday (=sixth day of the week in SAS) of the month specified in the other two arguments. So, nwkdom(5,6,month(api_date),year(api_date)) is the SAS date value of the last Friday of the month api_date falls into. (As described in the documentation of NWKDOM, the 5 in the first argument works even in cases where the last Friday is only the fourth Friday of the month.)
  3. The inequality api_date>nwkdom(...) evaluates to either 1 (true) or 0 (false) depending on whether api_date falls after the last Friday of the month or not. So, the first argument of the PUT function amounts to either intnx('month',api_date,1) or intnx('month',api_date,0).
  4. The INTNX function in turn computes a SAS date value from the next month (third argument 1) or the same month (third argument 0) api_date falls into, namely the beginning of the respective month (see default value of the fourth argument, 'alignment', in the documentation of INTNX).
  5. Finally, the PUT function converts the SAS date value obtained by INTNX into a seven-character string of the form YYYY-MM, e.g. 21397 (='01AUG2018'd) into '2018-08'.*

The result is what you desired as the month of invoice: The "next" month if api_date falls after the last Friday of the month and the "current" month otherwise. You can use this expression in a GROUP BY clause as was suggested by DBailey in the other thread (calculated period). The other two expressions in the GROUP BY clause can be defined in a similar way.

 

*Edit: Feel free to use a different format in place of yymmd7., e.g. monyy7., as long as it aggregates SAS date values on month-year level.

aperansi
Quartz | Level 8
Thank you for providing that clarification! What a great explanation! Im just still confused on how I would get the PeriodStart and Period End for what I need, using the intnx statement. Could you assist me with that?
FreelanceReinh
Jade | Level 19

Okay. Again, we have to distinguish between two cases:

 

  1. api_date falls after the last Friday of the month.
    In this case PeriodStart is the last Saturday of the same month or, equivalently, the day after the last Friday of that month.

    PeriodEnd is the last Friday of the next month.

  2. api_date does not fall after the last Friday of the month.
    In this case PeriodStart is the day after the last Friday of the previous month. (Note that there was an ambiguity in your definition where you wrote "last Saturday of the previous month"! Your example of August 2018, however, resolved this ambiguity: Instead of "last Saturday of the previous month" you meant "the day after the last Friday of the previous month" [which can be the first day of the current month!]. Otherwise, the last few days of August 2018 would appear again in the September invoice!)

    PeriodEnd is the last Friday of the current month.

 

So, we recognize a common pattern:

PeriodStart = nwkdom(5,6,m,y)+1 for certain month-year combinations (m, y) depending on the case. More precisely:

m=month(intnx('month', api_date, (api_date>nwkdom(5,6,month(api_date),year(api_date)))-1))
y= year(intnx('month', api_date, (api_date>nwkdom(5,6,month(api_date),year(api_date)))-1))

So, the final expression for PeriodStart can be written as

nwkdom(5,6,month(intnx('month', api_date, (api_date>nwkdom(5,6,month(api_date),year(api_date)))-1)),year(intnx('month', api_date, (api_date>nwkdom(5,6,month(api_date),year(api_date)))-1)))+1

An equivalent and perhaps better readable CASE expression (assuming you want to use PROC SQL) is:

case when api_date>nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,7,month(api_date),year(api_date))
     else nwkdom(5,6,month(intnx('month',api_date,-1)),year(intnx('month',api_date,-1)))+1
end

(Note that I use nwkdom(5,7,...), i.e. last Saturday in one place, as explained above.)

 

Similarly, PeriodEnd can be written as

nwkdom(5,6,month(intnx('month', api_date, api_date>nwkdom(5,6,month(api_date),year(api_date)))),year(intnx('month',api_date,api_date>nwkdom(5,6,month(api_date),year(api_date)))))

or, equivalently, as

case when api_date>nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,6,month(intnx('month',api_date,1)),year(intnx('month',api_date,1)))
     else nwkdom(5,6,month(api_date),year(api_date))
end

All these expressions are quite lengthy. Maybe there is a more elegant way to define these non-standard intervals as custom intervals which can be used in the INTNX (and INTCK) function (see documentation).

 

And definitely there are other ways to compute the (invoice-)monthly totals. Here is an example:

/* Create a format to assign the "invoice month" to a given date */

data invmonf / view=invmonf;
fmtname='invmonf';
do start='01AUG2017'd to '31DEC2020'd; /* modify the range as appropriate */
  label=put(intnx('month',start,start>nwkdom(5,6,month(start),year(start))),yymmd7.);
  output;
end;
format start weekdate.;
run;

proc format cntlin=invmonf;
run;

/* Compute total counts for each "invoice month" */

proc summary data=have nway;
format api_date invmonf.;
class api_date;
var count;
output out=mtotals(drop=_: rename=(api_date=invoice_month)) sum=;
run;

proc print data=mtotals;
run;

 

 

 

FreelanceReinh
Jade | Level 19

Interesting, I've just noticed that @DBailey had posted another reply. A quick comparison of the results (for made-up input data) has revealed that they differ! I leave it to you, @aperansi, to decide which solution is correct for your purposes. (I'd use your example, the invoice for August 2018, to find out.)

DBailey
Lapis Lazuli | Level 10

similar to other solution....

data have;
input api_date:mmddyy10. api_count;
format api_date mmddyy10. api_count 8.0;
datalines;
1/1/2017 125
8/2/2017 126
8/3/2017 127
8/4/2017 128
8/5/2017 129
8/6/2017 130
8/7/2017 131
8/8/2017 132
8/9/2017 133
8/10/2017 134
8/11/2017 135
8/12/2017 136
8/13/2017 137
8/14/2017 138
8/15/2017 139
8/16/2017 140
8/17/2017 141
8/18/2017 142
8/19/2017 143
8/20/2017 144
8/21/2017 145
8/22/2017 146
8/23/2017 147
8/24/2017 148
8/25/2017 149
8/26/2017 150
8/27/2017 151
8/28/2017 152
8/29/2017 153
8/30/2017 154
8/31/2017 155
9/1/2017 156
9/2/2017 157
9/3/2017 158
9/4/2017 159
9/5/2017 160
9/6/2017 161
;
run;

proc sql;
create table want as select		
	intnx('month',api_date,case when api_date>=nwkdom(5,6,month(api_date),year(api_date)) then 1 else 0 end,'begin') format mmddyy10. as  Period,
	case
		/*if api_date >= last saturday of current month then use last saturday of current month*/
		when api_date >=nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,6,month(api_date),year(api_date))
		/*else use last saturday of previous month*/
		else nwkdom(5,6,month(intnx('month',api_date,-1,'same')),year(intnx('month',api_date,-1,'same')))
		end format mmddyy10. as PeriodStart,
	case
		/*if api_date >= last saturday of current month then use last saturday of next month*/
		when api_date >=nwkdom(5,6,month(api_date),year(api_date)) then nwkdom(5,6,month(intnx('month',api_date,1,'same')),year(intnx('month',api_date,1,'same')))
		/*else use last saturday of current month*/
		else nwkdom(5,6,month(api_date),year(api_date))
		end 
		/*and backup one day from that*/
		-1 format mmddyy10. as PeriodEnd,
	sum(api_count) as Total_API_Count
from	
	have
group by
	calculated period
	,calculated periodstart
	,calculated periodEnd
;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 2494 views
  • 3 likes
  • 4 in conversation