Hi,
I have two data points: in year 2012, there were 89 cars; in year 2022, there were 111 cars. This percentage increase is 24.7%. Is there a way to obtain a confidence interval around this percentage?
Thanks!
I assume that these are counts from large populations. You could fit a Poisson model to the counts and save the information needed by the NLMeans macro, which can then compute the percent change and its confidence interval, as shown below. Note that unless you have the latest SAS 9.4 release, SAS 9.4M9 (TS1M9), you will need to download the latest version of the NLMeans macro (version 3.2). Also note that the NLMeans macro requires version 2.1 of the NLEST macro. See the above link to the NLMeans macro for information on downloading and using the macro.
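/* Two observed counts: year=1 is 2012, year=2 is 2022 */
data a;
  year=1; count=89; output;
  year=2; count=111; output;
run;
/* Poisson model with log link; save the fit and the LS-means coefficients for NLMeans */
proc genmod data=a;
  class year;
  model count=year / dist=poisson;
  lsmeans year / ilink e plots=none;
  ods output coef=c;
  store mod;
run;
/* f= is the function of the year means (mu1, mu2) to estimate */
%nlmeans(instore=mod, coef=c, link=log, f=100*(mu2-mu1)/mu1, flabel=Pct Change, title=Percent change)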
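For readers who want to see where such an interval comes from, here is a rough sketch of the large-sample (delta-method) calculation for two independent Poisson counts. The symbols y_1, y_2, and f are introduced here for illustration only, and it is an assumption that the macro reports a Wald interval of this kind, so its output may differ slightly.

```latex
\begin{align*}
\hat f &= 100\,\frac{y_2 - y_1}{y_1} = 100\left(\frac{111}{89} - 1\right) \approx 24.7 \\
\widehat{\mathrm{SE}}(\hat f) &\approx 100\,\frac{y_2}{y_1}\sqrt{\frac{1}{y_1} + \frac{1}{y_2}}
  = 100 \times 1.2472 \times \sqrt{\frac{1}{89} + \frac{1}{111}} \approx 17.7 \\
\hat f \pm 1.96\,\widehat{\mathrm{SE}}(\hat f) &\approx 24.7 \pm 34.8 = (-10.1,\ 59.5)
\end{align*}
```

Under these assumptions the interval is wide and includes zero, which is expected when the two counts are this small.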
It doesn't sound to me like you have a situation in which confidence intervals are meaningful. Confidence intervals are not applied to actual data observations.
You could create confidence intervals around estimates of a statistic (like an average) or a predicted value from a model, but neither of these is what you are talking about.
These are confidence intervals with respect to what randomness? No population or sampling from that population has been discussed by the original poster.
The numbers that @SASUser67 describes might not be a sample from a population; they could be the entire population, in which case there is no randomness (and no confidence interval).
I think @SASUser67 needs to explain a whole lot more before we can say that a Poisson distribution applies here.
That's brilliant, @StatDave - thanks! I found something similar in a Stata thread: https://www.statalist.org/forums/forum/general-stata-discussion/general/1675933-confidence-intervals... but wasn't sure about the SAS equivalent. It still surprises me when we need to run a macro or two to compute a statistic like this 🙂
Actually, this can be done with a single run of PROC NLMIXED rather than with the macros. But people generally find it easier to work with a familiar, more specialized procedure like GENMOD than with NLMIXED. The following gives essentially the same result as GENMOD and NLMeans.
proc nlmixed data=a df=1e8;
mu=exp(b1*(year=1) + b2*(year=2));
model count ~ poisson(mu);
estimate 'pct chg' ( 100*(exp(b2)-exp(b1))/ exp(b1) );
run;
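The DF=1e8 option in the step above makes the t-based confidence limits effectively normal-based, since with only two observations the default degrees of freedom would be very small. As a hand check under the same Poisson assumption, a DATA step along these lines computes the large-sample (delta-method) interval directly. The variable names are illustrative only.

```sas
/* Hand check of the percent change and its Wald interval (illustrative names) */
data _null_;
  y1 = 89;                                    /* count in 2012 */
  y2 = 111;                                   /* count in 2022 */
  ratio = y2 / y1;
  pct   = 100 * (ratio - 1);                  /* percent change */
  se    = 100 * ratio * sqrt(1/y1 + 1/y2);    /* delta-method standard error */
  z     = quantile('normal', 0.975);          /* about 1.96 */
  lower = pct - z * se;
  upper = pct + z * se;
  put pct= se= lower= upper=;                 /* results are written to the log */
run;
```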
@StatDave, that's good to know! If I were to modify the situation a bit and have different denominators (and therefore rates) for each of the years, how would the code you provided change? For example, in year 2012 there were 89 cars out of a population of 3,000 (3.0%), and in year 2022 there were 111 cars out of a population of 5,000 (2.2%). I would like to know the % difference between the 3.0% and the 2.2% and its 95% confidence interval. I assume I can't use a Poisson model since I'm no longer using counts? Thanks again!
These are now proportions, which you could model with a logistic model and then similarly compute the percent change and its confidence interval using NLMeans or directly with NLMIXED. If you just need the difference and its confidence interval, this can easily be done in PROC FREQ without fitting a model.
data a;
input year ncars ntot;
y=1; count=ncars; output;
y=0; count=ntot-ncars; output;
datalines;
2012 89 3000
2022 111 5000
;
proc freq;
weight count;
table year*y / riskdiff;
run;
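/* Data set a holds two rows per year (y=1 and y=0); keep one so each year is counted once in the events/trials model */
proc genmod data=a;
where y=1;
class year;
model ncars/ntot=year/dist=bin;
lsmeans year / ilink e plots=none;
ods output coef=c;
store mod;
run;
%nlmeans(instore=mod, coef=c, link=logit, f=100*(mu2-mu1)/mu1, flabel=Pct Change, title=Percent change)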
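The reply mentions that the same quantity can be obtained directly with NLMIXED. A minimal sketch of what that might look like, assuming one row per year and a hypothetical data set name rates. The starting values in PARMS are also an assumption, chosen near the logit of a 3% proportion.

```sas
/* One row per year: cars observed out of the population total */
data rates;
  input year ncars ntot;
  datalines;
2012 89 3000
2022 111 5000
;
run;

/* Binomial model with a logit link; DF=1e8 gives normal-based limits */
proc nlmixed data=rates df=1e8;
  parms b1=-3.5 b2=-3.5;
  p = 1 / (1 + exp(-(b1*(year=2012) + b2*(year=2022))));
  model ncars ~ binomial(ntot, p);
  /* percent change in the proportion from 2012 to 2022 */
  estimate 'pct chg' 100 * (1/(1+exp(-b2)) - 1/(1+exp(-b1))) / (1/(1+exp(-b1)));
run;
```

Because each year has its own parameter, the point estimate is just the change between the two observed proportions, 100*((111/5000)/(89/3000) - 1), or about -25%.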
Hi @StatDave, sorry that the situation keeps shifting, but now I'm looking at cars per 1,000 people. To modify your code, would I use the following? Essentially, I make the count for each year the count per 1,000 and offset by log(population) for that year. If so, the modeled % change reverses direction from the crude % change, and its confidence interval excludes the crude value. For example, if 89 and 111 were the per-1,000 rates (a 24.7% increase), the modeled change is -10% with a 95% CI of (-31%, -3%). Am I doing something incorrectly? Thanks again for your guidance.
data a;
year=1; count=Cars/1000; ln=log(Denom); output;
year=2; count=Cars/1000; ln=log(Denom); output;
run;
proc genmod;
class year;
model count=year/dist=poisson offset=ln;
lsmeans year / ilink e plots=none;
ods output coef=c;
store mod;
run;
%nlmeans(instore=mod, coef=c, link=log, f=100*(mu2-mu1)/mu1, flabel=Pct Change, title=Percent change)
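One possible reading of the problem: in a Poisson rate model the response normally stays the raw count, and the population enters only through the offset. Dividing the count by 1,000 means the response is no longer a count, so the Poisson standard errors no longer reflect the data. Also, once denominators enter the model, the estimate is a change in rates, which can differ in sign from the change in raw counts when the population grows. As a sketch only, using the illustrative figures from earlier in the thread (89 of 3,000 and 111 of 5,000) and a hypothetical data set name rates, a rate per 1,000 could be modeled like this in PROC NLMIXED:

```sas
/* Raw counts plus the population for each year (illustrative figures) */
data rates;
  input year ncars ntot;
  datalines;
2012 89 3000
2022 111 5000
;
run;

proc nlmixed data=rates df=1e8;
  parms b1=3 b2=3;   /* assumed starting values, on the log rate-per-1,000 scale */
  /* expected count = (population in thousands) * (rate per 1,000) */
  mu = (ntot/1000) * exp(b1*(year=2012) + b2*(year=2022));
  model ncars ~ poisson(mu);
  estimate 'rate per 1000, 2012' exp(b1);
  estimate 'rate per 1000, 2022' exp(b2);
  estimate 'pct chg in rate' 100 * (exp(b2) - exp(b1)) / exp(b1);
run;
```

With these illustrative figures the rates are about 29.7 and 22.2 per 1,000, so the estimated change should be close to -25%. The percent change does not depend on whether the rate is expressed per person or per 1,000.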