03-04-2015 09:50 AM
I have data that looks like this:
Status_start_date Status_end_date Status ID
01JAN2001 17DEC2001 one 0001
18DEC2001 23JAN2002 two 0001
23JAN2002 15MAR2015 three 0001
01JAN2001 17MAY2005 three 0002
18MAY2005 23JAN2007 two 0002
23JAN2007 15MAR2015 one 0002
What I want to do is to create a distinct count of ID on monthly basis by status (therefore this should be an average count, since the status isn't always for a complete month..)
Output should look like (let's say I have 1000 ID's):
Monthly_date Status Average_count
01JAN2001 One 300.3
01JAN2001 Two 478.7
01JAN2001 Three 221,0
Can somebody help?
Thanks in advance
03-04-2015 10:03 AM
What do you mean by average count?
Isn't that more of a weighed count?
Would 1 mean the same status for on ID a whole month?
If it's tricky to explain, provide sample data that match (have - want).
03-04-2015 10:19 AM
Weighted count is more correct indeed.
WHAT I HAVE
What I did before was generating for every date range (status_start_date to status_end_date) all the in between lying months.
For the first line (of my example above) it results in
Status_start_date Status_end_date Status ID Month
01JAN2001 17DEC2001 one 0001 01JAN2001
01JAN2001 17DEC2001 one 0001 01FEB2001
01JAN2001 17DEC2001 one 0001 01DEC2001
Based on this table, I did a distinct count on ID by month and status.
WHAT I WANT
the previous isn't sufficient, because DEC2001 is counted as a complete month for status=one and ID=0001, while that status was only active for the first 17 days of DEC2001. I want to account for that problem..