Hello,
I am using NHIS complex survey weighted data and I need to produce Cohen's Kappa statistics and prevalence-adjusted bias-adjusted Kappa (PABAK) statistics for a variety of variables. As background, we chose to calculate a PABAK because the prevalence of our outcome is approximately 2% in the national population, so we want to adjust our statistics for any bias introduced by the low prevalence. In addition, I am comparing two different case ascertainment methods (claims data and complex survey data), and Cohen's Kappa does not account for the bias introduced by two different methodologies. In the past, I have used SAS-Callable SUDAAN to calculate Cohen's Kappa and am quite familiar with its functions, but unfortunately I have not found evidence that it can calculate a PABAK.
Currently I am having trouble understanding which SAS procedures can calculate these measures while also restricting the analysis to the appropriate subgroup in my dataset. I have several questions that I would like help with:
1) Should I use Proc Surveyfreq or Proc Freq to calculate Cohen's Kappa and PABAK? To my understanding, I need to use Proc Surveyfreq to handle the complex survey weights, but there is not a domain function that allows me to select the correct subgroup in my data AND calculate a Kappa statistic. Will my estimates be biased if I use Proc Freq with the domain function?
2) Is the weighted kappa function WTKAPPA similar to a PABAK? Is this a superior method?
3) Is there another statistical method that I can use in SAS or SAS-Callable SUDAAN that will allow me to account for the bias introduced by the low prevalence AND by the different case ascertainment methodologies used?
Thanks in advance!
You should be using SURVEYFREQ to properly account for the design effect.
You can use the AGREE(PABAK) option on the TABLES statement within Proc SURVEYFREQ to get the prevalence-adjusted bias-adjusted kappa coefficient. There is more information in the documentation here.
If you need to calculate it for only a single domain, then you should use the syntax indicated under the heading “Subsetting Multiway Tables” here. You may also want to take a look at the information about Domain analysis here.
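As a rough sketch of how the two suggestions above might fit together, the step below puts a domain variable first in a multiway TABLES request and asks for AGREE(PABAK) on the same statement. The data set name (nhis_linked) and all variable names (strat_var, psu_var, wt_var, subgroup, claims_case, survey_case) are placeholders I made up for illustration, so substitute the design and analysis variables from your own file. Availability of AGREE(PABAK) depends on your SAS/STAT release, so check the documentation for your version.

```sas
/* Sketch only: every data set and variable name here is hypothetical. */
proc surveyfreq data=nhis_linked;
   strata  strat_var;   /* design strata (placeholder name)           */
   cluster psu_var;     /* primary sampling units (placeholder name)  */
   weight  wt_var;      /* survey analysis weight (placeholder name)  */

   /* Listing the domain variable first requests a separate            */
   /* claims-by-survey table, with its own agreement statistics,       */
   /* for each level of subgroup. AGREE(PABAK) is the option named     */
   /* in the reply above.                                               */
   tables subgroup*claims_case*survey_case / agree(pabak);
run;
```

Keeping the full sample in the input data set and defining the subgroup through the TABLES request, rather than dropping the other records with a WHERE statement, lets the procedure use the complete design information when it estimates variances for the domain.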
Thank you. I figured out what my issue was.