BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
warnerj
Calcite | Level 5

I am working with a data set that has billing cycles.  Sadly the billing cycles do not correspond exactly with the start and end dates of the months.  I have a column STARTDATE and a column ENDDATE both formatted as dates: yyyymmdd. 

I need another two columns, one that counts from the STARTDATE to the end of the month i.e. if STARTDATE is 1/20/2011 then count the days until 1/31/2011,

and another that counts back from the ENDDATE to the start of the month, i.e. if the ENDDATE is 2/20/2011 then count back to 2/1/2011. 

Is there any way to do this?  The reason is that I need to proportion monthly variables to the billing cycles and of course for a number of the people billed they each have a unique start and end date each month.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

The INTNX function is perfect for your purpose :

data have;
informat startdate enddate yymmdd8.;
format startdate enddate yymmdd10.;
input startdate enddate;
datalines;
20110120 20110220
;

data want;
set have;
daysToEnd = intnx("MONTH",startdate,0,"END") - startdate;
daysToBeginning = enddate - intnx("MONTH",enddate,0,"BEGINNING");
run;

proc print; run;

PG

PG

View solution in original post

12 REPLIES 12
PGStats
Opal | Level 21

The INTNX function is perfect for your purpose :

data have;
informat startdate enddate yymmdd8.;
format startdate enddate yymmdd10.;
input startdate enddate;
datalines;
20110120 20110220
;

data want;
set have;
daysToEnd = intnx("MONTH",startdate,0,"END") - startdate;
daysToBeginning = enddate - intnx("MONTH",enddate,0,"BEGINNING");
run;

proc print; run;

PG

PG
warnerj
Calcite | Level 5

I tried using your code and I keep getting an error.  I think the problem is that I am somehow formatting the dates wrong.  Here is an example of what the date looks like in the data set: 20071225 ie year 2007 month 12 day 25.  Originally it was formatted as a numeric but I used your code to for the format and informat:

informat startdate enddate yymmdd8.;

format startdate enddate yymmdd10.;

and then tried the data step you suggested.  However, when I do I get the following error continually in the log:


NOTE: Invalid argument to function INTNX at line 140 column 13.

NOTE: Invalid argument to function INTNX at line 141 column 34.

New_ID=123353443 billyr=2007 billmo=200710 bill_per_beg=********** bill_per_end=**********

bill_days=29 daysToEnd=. daysToBeginning=. _ERROR_=1 _N_=1

and the resulting dataset has missing values "." for both daystoend and daystobeginning.  I'm sorry to ask again after you were kind enough to put the code up but I am at a loss.  I also tried just using the intnx function to get the beginning of the month but it still didn't work. I also also tried to jsut get the month from the bill_per_beg(Bill period beginning aka startdate) and both times I kept getting an error which is why I think it has something to do with how I'm formatting the columns.  But when I look at column attributes it says the format and informat are yymmdd10. and yymmdd8.

warnerj
Calcite | Level 5

I figured it out.  it was my formatting.  Your code works to a T!!  Thank you so much!

art297
Opal | Level 21

You would have to provide some example data and a complete description of your billing cycle but, from your overview, it sounds like creating an interval dataset and using it with the intervalds system option might be exactly what you are looking for.

Like PGStats suggested, the intnx function is probably what you want to use, but my first guess is that you would want to use it paired with an interval dataset.

It is a relatively new SAS option, thus there isn't much available about it.  I and another Forum colleague are presenting a paper on it at the global forum in April.  It is mentioned in the documentation at:

http://support.sas.com/documentation/cdl/en/etsug/63348/HTML/default/viewer.htm#etsug_intervals_sect...

warnerj
Calcite | Level 5

USER   STARTDATE  ENDDATE   AVGTEMP

1          1/12/10          2/12/10          ?

2          1/08/10          2/10/10          ?

3          1/15/10          2/18/10          ?

I am trying to get the average temperature(for example) over the billing period.  Thus I need the number of days from each month in the billing cycle so I can get the average temp for each user in each cycle, then divide by the number of days in the billing cycle and multiply by 30 to normalize.  Is that what you meant by example data?  As for a complete description of the billing cycle, I'm working with a cable tv company data and the billing months for each s\user is pretty much unique as it just starts on the day of each month whenever they started service.

art297
Opal | Level 21

If I correctly understand your specs, then you have 31 possible billing cycles and I presume that billing cycle is identified in your data.

Are the individual cycles based on the same day each cycle or, if not, how are they determined?

warnerj
Calcite | Level 5

You are correct on both counts, if by billing cycle you mean number of of days in the cycle.  The billing cycles are determined by when the customer first signed up, but the data itself seems to give or take a few days sometimes.

art297
Opal | Level 21

Then, like pgstats suggested, you can probably get away with just the int functions, unless you wanted to get precise.  I presume that the month-to-month billing date fluctuations are due to weekends and holidays occuring when the billing cycle comes up.  If you needed to deal with those in a  precise manner, then you could set up 31 different billing calendars and incorporate the intervalds option along with those functions.

You could probably build the interval datasets quite easily by bringing in the weekend and holiday functions.

warnerj
Calcite | Level 5

I tried using pgstats code and I keep getting an error.  I think the problem is that I am somehow formatting the dates wrong.  Here is an example of what the date looks like in the data set: 20071225 ie year 2007 month 12 day 25.  Originally it was formatted as a numeric but I used your code to for the format and informat:

informat startdate enddate yymmdd8.;

format startdate enddate yymmdd10.;

and then tried the data step you suggested.  However, when I do I get the following error continually in the log:


NOTE: Invalid argument to function INTNX at line 140 column 13.

NOTE: Invalid argument to function INTNX at line 141 column 34.

New_ID=123353443 billyr=2007 billmo=200710 bill_per_beg=********** bill_per_end=**********

bill_days=29 daysToEnd=. daysToBeginning=. _ERROR_=1 _N_=1

and the resulting dataset has missing values "." for both daystoend and daystobeginning.  I'm sorry to ask again but I am at a loss.  I also tried just using the intnx function to get the beginning of the month but it still didn't work. I also also tried to jsut get the month from the bill_per_beg(Bill period beginning aka startdate) and both times I kept getting an error which is why I think it has something to do with how I'm formatting the columns.  But when I look at column attributes it says the format and informat are yymmdd10. and yymmdd8.

art297
Opal | Level 21

Your post didn't include the code you ran and, from your error message, there doesn't appear to be a date field present, just a year and year+month field, that may or may not actually be date fields.

Seeing a few example records, as well as your code, would definitely help to understand the errors you got.

PGStats
Opal | Level 21

Variables STARTDATE and ENDDATE must be SAS dates for the intnx function to work. SAS dates are numeric variables. Your problem must be with the way the date variables are imported into a dataset. You would need to provide details of that step to get useful help.


PG

PG
Ksharp
Super User

Arthur

I hope be one of your authors. Just hope. Smiley Happy

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3163 views
  • 6 likes
  • 4 in conversation