I believe the Pearson chi-square item fit statistic described in the SAS IRT manual (v14.3) may be calculated differently within PROC IRT than the formula listed in the manual, and I would like to know what is actually being used.
I am having my students calculate Yen's Q for item fit (which I believe is what PROC IRT reports as the Pearson chi-square item fit statistic). However, we don't get the same values (for Yen's Q or G-squared) using the formulas presented in the manual.
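In case it helps, these are the forms of the two statistics we have been working from (our notation, based on our reading of the manual and Yen, 1981, not a quote from the SAS documentation): O_h is the observed proportion correct in interval h, E_h is the predicted probability for that interval, and N_h is the number of examinees in the interval.

$$Q_1 = \sum_{h=1}^{10} \frac{N_h\,(O_h - E_h)^2}{E_h\,(1 - E_h)}, \qquad G^2 = 2\sum_{h=1}^{10} N_h\left[\,O_h \ln\frac{O_h}{E_h} + (1 - O_h)\ln\frac{1 - O_h}{1 - E_h}\right]$$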
I wonder if it is related to recommendations by researchers to use values other than the estimated thetas when calculating P(U=1|theta) for each item (represented as E_ji in Yen's formula in the SAS IRT manual). One reason I think a value other than the estimated theta is being used is that the item fit statistics do not change when different estimators (ML, EAP, MAP) are used, even though those change the value of theta.
I cannot find any SAS documentation that indicates what adjustments are being made (if that is the case) or whether a formula other than the one presented in the manual is being used. If anyone has any suggestions, I would greatly appreciate it. Thank you!
I didn't mention it before, but I wondered whether something like the S-X2 statistic (Orlando & Thissen, 2000; Ames & Penfield, 2015) might be what SAS is using? But again, I cannot find any documentation.
Just to make sure I fully understand (I likely won't be able to answer your question myself): you're referring to these equations, where your manual calculation does not match the formulas shown?
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statu...
EDIT: It would help if you could show an example of your code, both the PROC IRT step and the manual calculation, to ensure we're using the same options in the testing.
I could be doing something wrong, but my doctoral students and I have been trying this with more than one dataset (over two semesters) and cannot figure out why our calculations do not match SAS's when the formula is fairly simple. We are at the stage where we think something other than the estimated thetas is being used for the E_jk in the formula.
Thank you for any help.
If you are comparing a "by hand" calculation with a SAS procedure, it is always a good idea to check how many observations SAS reports as "used" and how many your manual calculation uses. Sometimes a procedure option reduces the observations SAS uses in a way you might not expect to affect the manual calculation, but the procedure code excludes them.
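For example, something like the following (with placeholder data set and item names) counts the complete cases so you can compare against the "Number of Observations Used" that PROC IRT prints:

```sas
/* Count complete cases by hand and compare with the
   "Number of Observations Used" reported by PROC IRT.
   MYDATA and ITEM1-ITEM14 are placeholder names.      */
data _null_;
   set mydata end=eof;
   if cmiss(of item1-item14) = 0 then n_complete + 1;  /* sum statement */
   if eof then put "Complete cases: " n_complete;
run;
```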
If you post sample data, the PROC IRT code, and the program (presumably PROC IML) that you are using to validate the formula, we'll take a look at what you are doing and figure out why you are not getting the same result as the procedure.
My initial guess is it has something to do with the line in the documentation that says "partition them into 10 intervals such that the number of subjects in each interval is approximately equal." That line is ambiguous. For example, if N is not a multiple of 10, there are many ways to implement that statement. Another ambiguity involves tied values. But hopefully we can get agreement for the unambiguous cases.
Another thing to check: the doc says "These item fit statistics apply only to binary items that have one latent factor." So make sure your model has one factor, then sort by that factor and use that ordering to partition the data. I presume you are already doing that.
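For instance, one of several defensible ways to form the bins is PROC RANK with GROUPS=10 applied to the factor scores in the OUT= data set. This is only a sketch: the score variable name (_Factor1 here) is an assumption and should be checked against what your OUT= data set actually contains, and the TIES= choice is one of the places where implementations can legitimately differ:

```sas
/* One possible way to create 10 roughly equal-sized bins from the
   estimated factor scores. SCORES is the OUT= data set from PROC IRT;
   _Factor1 is assumed to be the name of the score variable there.   */
proc rank data=scores out=binned groups=10 ties=mean;
   var _Factor1;
   ranks bin;      /* bin = 0, 1, ..., 9 */
run;
```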
These are great questions, thank you.
Here is an example with 14 dichotomous attitudinal items and N = 684 participants with complete data. EAP estimates were used for the ability scores. Although not an ideal measure of unidimensionality, the first two eigenvalues are 12.374 and 0.937, indicating one prominent factor.
I've attached my PROC IRT code and how I created the 10 decile groups (bins) using SAS. The hand calculations are below, along with the Yen's Q test statistics from PROC IRT. Item parameter values are provided in the SAS program, and I can provide more detail if helpful. If it were a difference of 94.23 vs. 97.38, I would chalk it up to a difference in bin assignment, but the differences are huge (e.g., item 1: 133.42 vs. 17.36), and the formula is so simple. I can't figure out what we are doing wrong, but I didn't want to keep working on it once we realized that changing the estimation method for theta in PROC IRT (EAP vs. ML vs. MAP) did not change the item fit statistics in PROC IRT, which it would have to if the original version of Yen's Q (and the LR G-squared) were being used. So we knew we had to figure this out first.
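For reference, here is a stripped-down sketch of the kind of computation we are doing for a single item, assuming a data set BINNED that already holds the 0/1 response (item1), the model probability (phat) computed from the estimated 2PL parameters and the estimated theta, and the bin number (bin). The data set and variable names are ours, not anything produced by PROC IRT:

```sas
/* Per-bin observed proportion (O_h), mean model probability (P_h),
   and bin size (N_h), then Yen's Q1 as the sum over bins.          */
proc means data=binned noprint nway;
   class bin;
   var item1 phat;
   output out=binstats(drop=_type_ _freq_)
          n(item1)=N_h mean(item1)=O_h mean(phat)=P_h;
run;

data _null_;
   set binstats end=eof;
   Q1 + N_h * (O_h - P_h)**2 / (P_h * (1 - P_h));
   if eof then put "Yen's Q1 for item1: " Q1;
run;
```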
Item 1. P_hi = model probability P(u1 = 1 | theta) for the bin; O_hi = raw (observed) item p-value for the bin; N_hi = frequency per bin. Yen's Q is the sum of the last column.

| Bin | P_hi | O_hi | N_hi | r_hi = O_hi - P_hi | r_hi^2 | P_hi(1 - P_hi) | N_hi * r_hi^2 / (P_hi(1 - P_hi)) |
|---|---|---|---|---|---|---|---|
| Decile 1 | 0.0057 | 0.0147 | 68 | 0.009 | 0.000 | 0.006 | 0.972 |
| Decile 2 | 0.3463 | 0.2941 | 68 | -0.052 | 0.003 | 0.226 | 0.819 |
| Decile 3 | 0.7515 | 0.8116 | 69 | 0.060 | 0.004 | 0.187 | 1.335 |
| Decile 4 | 0.8974 | 0.7794 | 68 | -0.118 | 0.014 | 0.092 | 10.283 |
| Decile 5 | 0.9713 | 0.8696 | 69 | -0.102 | 0.010 | 0.028 | 25.601 |
| Decile 6 | 0.9560 | 0.9706 | 68 | 0.015 | 0.000 | 0.042 | 0.345 |
| Decile 7 | 0.9994 | 0.9706 | 68 | -0.029 | 0.001 | 0.001 | 94.060 |
| Decile 8 | 0.9999 | 1.0000 | 69 | 0.000 | 0.000 | 0.000 | 0.007 |
| Decile 9 | 1.0000 | 1.0000 | 68 | 0.000 | 0.000 | 0.000 | 0.000 |
| Decile 10 | 1.0000 | 1.0000 | 69 | 0.000 | 0.000 | 0.000 | 0.000 |
| Yen's Q | | | | | | | 133.421 |
Item 2. Same layout as above.

| Bin | P_hi | O_hi | N_hi | r_hi = O_hi - P_hi | r_hi^2 | P_hi(1 - P_hi) | N_hi * r_hi^2 / (P_hi(1 - P_hi)) |
|---|---|---|---|---|---|---|---|
| Decile 1 | 0.0000 | 0.0000 | 68 | 0.000 | 0.000 | 0.000 | 0.000 |
| Decile 2 | 0.0000 | 0.0000 | 68 | 0.000 | 0.000 | 0.000 | 0.000 |
| Decile 3 | 0.0000 | 0.0000 | 69 | 0.000 | 0.000 | 0.000 | 0.003 |
| Decile 4 | 0.0008 | 0.0147 | 68 | 0.014 | 0.000 | 0.001 | 15.989 |
| Decile 5 | 0.0689 | 0.1739 | 69 | 0.105 | 0.011 | 0.064 | 11.870 |
| Decile 6 | 0.7439 | 0.7059 | 68 | -0.038 | 0.001 | 0.191 | 0.515 |
| Decile 7 | 0.9982 | 0.9412 | 68 | -0.057 | 0.003 | 0.002 | 122.961 |
| Decile 8 | 1.0000 | 1.0000 | 69 | 0.000 | 0.000 | 0.000 | 0.000 |
| Decile 9 | 1.0000 | 1.0000 | 68 | 0.000 | 0.000 | 0.000 | 0.000 |
| Decile 10 | 1.0000 | 1.0000 | 69 | 0.000 | 0.000 | 0.000 | 0.000 |
| Yen's Q | | | | | | | 151.339 |
Although P_hi appears to be 0 or 1 for some bins, these values are asymptotic to the lower and upper limits of 0 and 1, so we made sure the actual values (e.g., 0.000021) are retained in the cells.
Item Fit Statistics (from PROC IRT):

| Item | DF | Pearson Chi-Square | Pr > ChiSq | LR Chi-Square | Pr > LR ChiSq |
|---|---|---|---|---|---|
| Item1 | 8 | 17.36219 | 0.0266 | 19.54714 | 0.0122 |
| Item2 | 8 | 2.6954 | 0.952 | 3.12093 | 0.9265 |
I didn't post the actual data because it falls under a MDUA data restriction. But if I need to create a new example, I will.