I calculated confidence intervals following the reference https://www.lexjansen.com/pharmasug/2003/Posters/P048.pdf and implemented the following code for my dataset:
PROC MEANS DATA=data NOPRINT;
  BY model reason;
  VAR default;
  OUTPUT OUT=xxtmp N=n MEAN=mean STDERR=stderr LCLM=lclm UCLM=uclm;
RUN;

DATA xxtmp15;
  SET xxtmp;
  /* TINV(0.9, n-1) is the 90th percentile of t, so lo/hi form a two-sided 80% interval */
  lo = mean - (TINV(0.9, n-1) * stderr);
  hi = mean + (TINV(0.9, n-1) * stderr);
RUN;
However, as a result, I get a negative lower limit for the confidence interval.
I understand that I should use the log-normal distribution to avoid the negative lower limit. However, I don't know exactly how to specify this in PROC MEANS.
Any help on this is highly appreciated.
Kind regards,
Mari
Negative values in a confidence interval are not impossible, so explain further why this is a problem.
Also, why do you compute the confidence intervals via a data step when they are computed by PROC MEANS and stored using the UCLM and LCLM option?
I believe the reason may be the small sample size.
I get the very same results with or without the data step.
I read in the literature that this should be done using the log-normal distribution, but I can't find a way to specify that.
Thanks
So, you still have not explained why you consider negative confidence intervals a problem that needs to be fixed.
You use Lognormal distribution confidence intervals only if your data has a lognormal distribution.
I consider the negative lower confidence limit a problem because the variable, which is a default rate, cannot be negative. The goal is to see what the minimum and maximum values are for the default % of a specific population.
May I know how I can check whether my data is log-normally distributed? (I am going to check this myself as well.)
Thanks
Lognormal and normal are not appropriate for Rates, which I assume are a percent.
You might be able to make use of binomial distribution confidence intervals, again depending on the distribution of your data. What is the distribution of your data?
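To see why a t-based interval can dip below zero for a 0/1 variable, here is a minimal sketch with made-up numbers (10 clients, 1 default; these values are an assumption, not from the poster's data). For 0/1 data with sample proportion p, the standard error of the mean reduces to sqrt(p*(1-p)/(n-1)), so with few observations and a low rate the margin easily exceeds p itself.

```sas
/* Hypothetical example: n = 10 clients, 1 default (p = 0.1). */
data _null_;
  n = 10;
  p = 0.1;
  /* Standard error of the mean of a 0/1 variable: std/sqrt(n) */
  stderr = sqrt(p * (1 - p) / (n - 1));
  /* 97.5th percentile of t gives a two-sided 95% interval, the PROC MEANS default */
  t = tinv(0.975, n - 1);
  lo = p - t * stderr;
  hi = p + t * stderr;
  put stderr= t= lo= hi=;
run;
```

With these numbers the standard error is 0.1 and the interval runs from roughly -0.13 to 0.33, so the negative lower limit comes from the normal-theory formula rather than from an error in the code.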
I think I am understanding better with your comments.
In my case, I calculated the default % for a population from a flag indicating whether a client has defaulted (1 if yes, 0 if not). So ideally I should be looking at the binomial distribution. I found this code helpful:
proc freq data=data;
by model reason;
tables default/ nocum norow binomial;
output out=results;
exact binomial;
run;
Do you think the above is the right thing to do?
And if this is calculated for the non-defaults (0), how can I tell SAS to estimate the confidence limits for the defaults (1)?
Thanks
Kind regards,
Mari
Yes, the binomial distribution and your code are appropriate here.
If the confidence interval for the non-defaults is (made up example) 6% to 19%, then the confidence interval for the defaults is that confidence interval subtracted from 100% (or 81% to 94%).
And you can't get negative confidence intervals using the binomial distribution.
In PROC FREQ, you can use the BINOMIAL LEVEL= option to specify the variable level for the binomial proportion. For example,
tables default / binomial(level='1');
You can use the BINOMIAL CL= option to specify the type(s) of binomial confidence limits to compute. Please see the doc for more info.
It's true that some asymptotic methods might produce an out-of-range confidence limit (e.g., negative) for particular data. PROC FREQ truncates the binomial confidence limits at 0 and 1.
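As a rough sketch of how the LEVEL= and CL= options might be combined, the example below requests Wilson (score) and exact (Clopper-Pearson) limits for the proportion of defaults within each BY group. The dataset LOANS and its rows are made up for illustration, and the ODS table name BinomialCLs is my understanding of what PROC FREQ produces when CL= is specified; running ODS TRACE ON is a safe way to confirm it in your release.

```sas
/* Hypothetical data: one row per client, default = 1 if defaulted, else 0. */
data loans;
  input model $ reason $ default;
  datalines;
A car 0
A car 0
A car 1
A car 0
A car 0
B home 0
B home 1
B home 0
B home 0
B home 1
;
run;

/* BY-group processing requires the data to be sorted by the BY variables. */
proc sort data=loans;
  by model reason;
run;

/* Capture the confidence limits in a dataset (table name assumed; check with ODS TRACE ON). */
ods output BinomialCLs=default_cls;

proc freq data=loans;
  by model reason;
  /* LEVEL='1' targets the proportion of defaults;
     CL= asks for Wilson and exact limits, both of which stay within 0 and 1. */
  tables default / nocum binomial(level='1' cl=(wilson exact));
run;
```

Wilson and exact limits are constructed to lie between 0 and 1, so no truncation is involved for them; which type suits a given analysis is a judgment call that depends on sample size and how conservative the interval should be.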
Thank you so much for your help and triggering comments.
Kind regards,
@ggfggrr wrote:
proc freq data=data;
by model reason;
tables default/ nocum norow binomial;
output out=results;
exact binomial;
run;
Is the above you think is right to do?
And in case if this is calculated for the non-defaults (0), exploring to know how can I inform SAS to estimate the confidence interval limits for the defaults (1).
Just to add to the good advice you've already received:
tables default / nocum binomial(level='1');
output out=results binomial;
Thanks so much and it helps me a lot in knowing these options.
Kind regards.
Mari