Reeza
Super User

I think that with what you have, you can only say that you'll receive X dollars in the first month. You don't have enough information to say how long it will take to receive the remainder. To compute percentiles you need the distribution, and you don't have one: you have a single point in time.

 

To build a distribution curve, I think you need data like the following:

 

MonthIncurred  MonthReceived  Amount
2016-01        2016-01        10%
2016-01        2016-02        15%
2016-01        2016-03        25%
2016-01        2016-04        75%
...

acemanhattan
Quartz | Level 8

@Reeza

 

I appreciate your help. But I'm also laughing to myself at how badly we're communicating with each other.

 

I don't care about how the claims develop in months 2, 3, and so on. We know those data points are used for certain methods of projecting claims, but let's forget we know that. I'm interested in two things:

 

(1) I want to know what the ratio is between payments made in the first month, and total payments made, which I'm assuming are all payments made in the first 12 months after a given incurred month.  All I need to answer this is the historical data for month 1 and the total paid through month 12, and I have this data (Paid_0 and UIC_11). In other words, this question is answered.  In fact, I have it answered for every month and every line of business for the last 10 years up through 2016-10.  This problem is solved.

 

(2) I want to answer questions about the data points I've collected in (1); namely, I want to know the kth percentile of Paid_0/UIC_11 from (1). The only complication is that I want the kth percentile recalculated at each new month, with the distribution made up of the Paid_0/UIC_11 values from the previous months.

 

I feel like the misunderstanding of the business problem is what's making this difficult. Forget I have a business problem. Help me calculate the kth percentile of the CF (the Paid_0/UIC_11 ratio) for each group, on each observation, basing the kth percentile only on the CFs from the months that preceded the observation where the calculation occurs (within the given group, of course).

Reeza
Super User

You can use the approach in the gist linked below, adding in the metric for percentiles as I demonstrated earlier. You’ll have to adjust the WHERE to filter for periods less than the current one rather than for a window. If you want a percentile that is not among the PROC MEANS statistics, you’ll need to use PROC UNIVARIATE.

 

where period < &i.

 

The DO loop then runs from the minimum to the maximum period, with no window.
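For anyone who wants to check the logic outside SAS, here is a minimal sketch of the same loop in plain Python. It is an illustration only, not the code from the gist: the function names, the (group, period, cf) row layout, and the sample values are all hypothetical. The percentile uses simple linear interpolation between order statistics, which may not match the default percentile definition in PROC MEANS or PROC UNIVARIATE, so small differences from SAS output are possible.

```python
from collections import defaultdict


def percentile(values, k):
    """k-th percentile (0-100) using linear interpolation between order statistics.

    Returns None when there are no values to summarise.
    """
    if not values:
        return None
    ordered = sorted(values)
    pos = (len(ordered) - 1) * k / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def running_percentile(rows, k):
    """For each (group, period), take the k-th percentile of the CFs from
    strictly earlier periods in the same group.

    rows: iterable of (group, period, cf) tuples; periods must sort
    chronologically (for example 'YYYY-MM' strings).
    """
    by_group = defaultdict(list)
    for group, period, cf in rows:
        by_group[group].append((period, cf))

    result = {}
    for group, obs in by_group.items():
        obs.sort()
        for period, _ in obs:
            # Same idea as the WHERE filter: only periods before the current one.
            prior = [cf for p, cf in obs if p < period]
            result[(group, period)] = percentile(prior, k)
    return result


# Hypothetical sample data: (line of business, incurred month, Paid_0/UIC_11).
rows = [
    ("A", "2016-01", 0.10),
    ("A", "2016-02", 0.20),
    ("A", "2016-03", 0.40),
    ("B", "2016-01", 0.50),
]
medians = running_percentile(rows, 50)
```

The first period in each group has no history, so the sketch returns None there rather than guessing a value. Recomputing the filter for every period is quadratic in the number of periods, which is fine for ten years of monthly data.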

 

I can’t see a DATA step approach for a running percentile, which is what you’re computing, and PROC EXPAND doesn’t support percentiles as far as I could see.

 

https://gist.github.com/statgeek/e5e43ff45a4ba1f64d0873ff3bc35974

 

 

ballardw
Super User

Since this data has no component that shows how many claims were started, processed, or otherwise handled, I'm not sure the Excel approach is valid: the changes in values could well be highly correlated with the number of claims submitted. I would think any process should take that into consideration, for example by including the count of submitted claims as a variable and using it as a weight or frequency variable (depending on the type of analysis).
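To make the weighting idea concrete, here is a rough Python sketch of a frequency-weighted percentile, where each month's ratio counts as many times as it has claims. The function name and the sample numbers are hypothetical, and the rule used (the smallest value whose cumulative weight reaches k percent of the total) is only one of several possible definitions, so treat it as an illustration rather than a recommendation.

```python
def weighted_percentile(values, weights, k):
    """Smallest value whose cumulative weight reaches k percent of the total weight.

    values:  the ratios (for example Paid_0/UIC_11 per month)
    weights: a non-negative frequency for each value (for example claim counts)
    """
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    if total <= 0:
        return None
    threshold = total * k / 100.0
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= threshold:
            return value
    return pairs[-1][0]


# Hypothetical example: three months, the last one with far more claims.
ratios = [0.10, 0.20, 0.40]
claim_counts = [1, 1, 8]
weighted_median = weighted_percentile(ratios, claim_counts, 50)
unweighted_median = weighted_percentile(ratios, [1, 1, 1], 50)
```

With equal weights the median is the middle month's ratio, but once the third month carries most of the claims the weighted median moves to that month's ratio.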

