ggfggrr
Obsidian | Level 7

I calculated confidence intervals following the reference https://www.lexjansen.com/pharmasug/2003/Posters/P048.pdf and implemented the following code for my dataset:

 

PROC MEANS DATA=data NOPRINT;
  BY model reason;
  VAR default;
  OUTPUT OUT=xxtmp N=n MEAN=mean
         STDERR=stderr LCLM=lclm UCLM=uclm;
RUN;

DATA xxtmp15;
  SET xxtmp;
  lo = mean - (TINV(0.9, n-1) * stderr);
  hi = mean + (TINV(0.9, n-1) * stderr);
RUN;

However, the resulting lower confidence limit is negative.

I understand that I should use the log-normal distribution to avoid a negative lower confidence limit, but I don't know how to specify this in PROC MEANS.

Any help on this is highly appreciated.

 

Kind regards,

Mari

 

 

 

PaigeMiller
Diamond | Level 26

Negative values in a confidence interval are not impossible, so explain further why this is a problem.

 

Also, why do you compute the confidence intervals in a DATA step when PROC MEANS already computes them and stores them via the LCLM= and UCLM= options?
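For reference, a minimal sketch of letting PROC MEANS produce the limits directly. It assumes the intent of `TINV(0.9, n-1)` in the question was a two-sided 80% interval, which corresponds to `ALPHA=0.2`; the dataset and variable names come from the question, and `data_sorted` is a hypothetical name.

```sas
/* Sketch only: BY-group processing expects the input sorted by the BY variables. */
proc sort data=data out=data_sorted;   /* data_sorted is a hypothetical name */
  by model reason;
run;

/* ALPHA=0.2 requests 80% two-sided t-based limits for the mean,        */
/* i.e. mean -/+ TINV(0.9, n-1) * stderr, so no extra DATA step is needed. */
proc means data=data_sorted noprint alpha=0.2;
  by model reason;
  var default;
  output out=xxtmp n=n mean=mean stderr=stderr lclm=lclm uclm=uclm;
run;
```

Without `ALPHA=`, PROC MEANS uses its default of 0.05 (95% limits), so LCLM and UCLM would then be wider than the `lo` and `hi` values from the DATA step.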

--
Paige Miller
ggfggrr
Obsidian | Level 7

I believe the reason may be the small sample size.

I get the very same results with or without the DATA step.

I have read in the literature that this should be done using the log-normal distribution, but I can't find a way to specify that.

 

Thanks

PaigeMiller
Diamond | Level 26

So, you still have not explained why you consider negative confidence intervals a problem that needs to be fixed.

 

You use Lognormal distribution confidence intervals only if your data has a lognormal distribution.

--
Paige Miller
ggfggrr
Obsidian | Level 7

I consider the negative lower confidence limit a problem because the variable, a default rate, cannot be negative. The goal is to see what the minimum and maximum values of the default % are for a specific population.

How can I check whether my data is log-normally distributed? (I will also look into this myself.)

 

Thanks

PaigeMiller
Diamond | Level 26

Lognormal and normal distributions are not appropriate for rates, which I assume are percentages.

 

You might be able to make use of binomial distribution confidence intervals, again depending on the distribution of your data. What is the distribution of your data?

--
Paige Miller
ggfggrr
Obsidian | Level 7

I think I understand better now, thanks to your comments.

In my case, I calculated the default % for a population from an indicator of whether a client has defaulted (1 if yes, 0 if not). So ideally I should be looking at the binomial distribution. I found this code helpful:

 

proc freq data=data
by model reason;
tables default/ nocum norow binomial;
output out=results;
exact binomial;
run; 

Do you think the above is the right thing to do?

Also, in case this is calculated for the non-defaults (0), how can I tell SAS to estimate the confidence limits for the defaults (1)?

 

Thanks

 

Kind regards,

Mari

PaigeMiller
Diamond | Level 26

Yes, binomial distribution and your code is appropriate here.

 

If the confidence interval for the non-defaults is (made up example) 6% to 19%, then the confidence interval for the defaults is that confidence interval subtracted from 100% (or 81% to 94%).
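The arithmetic behind that statement can be written out as follows. This is only a worked restatement of the made-up example; the symbols $p_0$, $p_1$, $L$ and $U$ are introduced here for illustration.

```latex
% p_0 = proportion of non-defaults, p_1 = proportion of defaults
\[
  p_1 = 1 - p_0, \qquad
  L \le p_0 \le U \;\Longleftrightarrow\; 1 - U \le p_1 \le 1 - L .
\]
% With the made-up limits L = 0.06 and U = 0.19:
\[
  1 - 0.19 = 0.81 \;\le\; p_1 \;\le\; 1 - 0.06 = 0.94 .
\]
```

Note that the limits swap roles: the upper limit for the non-defaults gives the lower limit for the defaults.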

 

And you can't get negative confidence intervals using the binomial distribution.

--
Paige Miller
Watts
SAS Employee

In PROC FREQ, you can use the BINOMIAL LEVEL= option to specify the variable level for the binomial proportion. For example,

tables default / binomial(level='1');

You can use the BINOMIAL CL= option to specify the type(s) of binomial confidence limits to compute. Please see the doc for more info.

 

It's true that some asymptotic methods might produce an out-of-range confidence limit (e.g., negative) for particular data. PROC FREQ truncates the binomial confidence limits at 0 and 1. 
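As an illustration of the two options mentioned above, here is a hedged sketch that requests several limit types side by side. It assumes `default` is a numeric 0/1 variable, that the data are already sorted by `model` and `reason`, and that the installed SAS/STAT release supports the `CL=` suboption with these keywords; check the PROC FREQ documentation for your version.

```sas
/* Sketch: binomial proportion for default = 1, with three kinds of limits. */
/* WALD is the asymptotic interval, WILSON the score interval,              */
/* EXACT the Clopper-Pearson interval.                                      */
proc freq data=data;
  by model reason;
  tables default / binomial(level='1' cl=(wald wilson exact));
run;
```

Comparing the Wald limits with the Wilson and exact limits for small BY groups shows where the asymptotic method runs into the boundary at 0.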

 

ggfggrr
Obsidian | Level 7

Thank you so much for your help and thought-provoking comments.

 

Kind regards,

FreelanceReinh
Jade | Level 19

@ggfggrr wrote:
proc freq data=data
by model reason;
tables default/ nocum norow binomial;
output out=results;
exact binomial;
run; 

Do you think the above is the right thing to do?

Also, in case this is calculated for the non-defaults (0), how can I tell SAS to estimate the confidence limits for the defaults (1)?


Just to add to the good advice you've already received:

  • As an alternative to computing the "100%−x%" differences you can use the LEVEL= suboption of the BINOMIAL option of the TABLES statement (see documentation) -- EDIT: I hadn't seen @Watts's post, sorry:
    tables default / nocum binomial(level='1');
  • The NOROW option is redundant here (as no crosstabulation is produced).
  • You should add the BINOMIAL keyword to the OUTPUT statement in order to obtain the desired results in the output dataset:
    output out=results binomial;
  • The EXACT BINOMIAL statement is not needed for the exact confidence interval, but requests an exact test (in your case: of the default null hypothesis P=0.5). Do you really want this?
  • I assume the missing semicolon after your PROC FREQ statement is only a typo.
  • If you have downloaded the PDF file from the URL in your first post, you may want to change the file name to something like P048_FAULTY!.pdf or delete it.
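Putting the points above together, the PROC FREQ step might look like the sketch below. It uses only the statements and options named in this thread; the `EXACT BINOMIAL` statement is left out on the assumption that the exact test of P=0.5 is not wanted.

```sas
/* Sketch of the corrected step, combining the suggestions in this thread. */
proc freq data=data;                          /* semicolon restored */
  by model reason;                            /* input must be sorted by model and reason */
  tables default / nocum binomial(level='1'); /* proportion of defaults (1); NOROW dropped */
  output out=results binomial;                /* BINOMIAL keyword writes the statistics to RESULTS */
run;
```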
ggfggrr
Obsidian | Level 7

Thanks so much; knowing these options helps me a lot.

 

Kind regards.

Mari
