Re: Pearson's chi-square item fit statistic (Yen's Q) in PROC IRT

rcturner · Posted 02-17-2021 12:15 PM

I believe PROC IRT may calculate the Pearson chi-square item fit statistic differently from the formula given in the SAS IRT manual (v 14.3), and I would like to know what is actually being used.

I am having my students calculate Yen's Q for item fit statistics (which I believe is presented as the Pearson Chi-Square for item fit in PROC IRT). However, we don't get the same values (for Yen’s Q or G-squared) using the formulas presented in the manual.

I wonder if it is related to recommendations by researchers to use values other than the estimated thetas for calculating P(U=1|theta) for each item (represented as E_ji in Yen’s formula in the SAS IRT manual)? One reason I think a value other than the estimated theta is being used is that the item fit statistics do not change when the theta estimator is changed (ML, EAP, MAP).

I cannot find any SAS documentation that indicates what adjustments are being used (if that is the case) or if a different formula is being used other than what is presented in the manual. If anyone has any suggestions, I would greatly appreciate it. Thank you!

rcturner · Posted 02-17-2021 01:13 PM

I didn't mention this before, but I wondered whether something like S-χ² (Orlando & Thissen, 2000; see also Ames & Penfield, 2015) might be what SAS uses. Again, I cannot find any documentation.

Reeza · Posted 02-17-2021 01:41 PM

Just so I fully understand (I likely won't be able to answer your question): are you referring to these equations, where your manual calculation does not match the formulas shown?
https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statu...

EDIT: It helps if you can show an example of your code, both the PROC IRT call and the manual calculation, so we can be sure we're using the same options when testing.

rcturner · Posted 02-17-2021 03:44 PM

Yes - the 3rd formula with the text: Pearson's chi-square statistic, proposed by Yen (1981), has the form....

I could be doing something wrong, but my doctoral students and I have been trying this with more than one dataset (over two semesters) and cannot figure out why our calculations do not match SAS's when this formula is pretty simple. We are at the stage where we think something other than the estimated thetas is being used for the E_jk in the formula.

Thank you for any help.

ballardw · Posted 02-17-2021 03:59 PM

If you are comparing a "by hand" calculation with a SAS procedure, it is always a good idea to compare the number of observations SAS reports as "used" with the number your manual calculation uses. Sometimes a procedure option excludes observations in a way you would not expect to affect the manual calculation.

rcturner · Posted 03-03-2021 02:28 PM

Good thing to remember. Unfortunately the item fit statistics in PROC IRT do not provide information about analytical n for the procedure. It only provides a value of df which does not use n in its calculation. We are matching the sample size reported for the PROC IRT analysis.

Reeza · Posted 02-17-2021 04:35 PM

An example would really help here, otherwise I'd have to take the time to mock up a full problem and solution.

Rick_SAS · Posted 02-18-2021 01:21 PM

If you post sample data, the PROC IRT code, and the program (presumably PROC IML) that you are using to validate the formula, we'll take a look at what you are doing and figure out why you are not getting the same result as the procedure.

My initial guess is it has something to do with the line in the documentation that says "partition them into 10 intervals such that the number of subjects in each interval is approximately equal." That line is ambiguous. For example, if N is not a multiple of 10, there are many ways to implement that statement. Another ambiguity involves tied values. But hopefully we can get agreement for the unambiguous cases.
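To illustrate the ambiguity in "approximately equal" intervals, here is a minimal Python sketch (not SAS, and not a claim about what PROC IRT does internally). It compares two plausible rules for cutting N = 684 subjects, already sorted by theta, into 10 bins. Both rules and all function names are assumptions chosen for illustration.

```python
from collections import Counter

def sizes_front_loaded(n, k):
    """Rule A (assumed): the first (n mod k) bins each get one extra subject."""
    return [n // k + (1 if i < n % k else 0) for i in range(k)]

def sizes_rank_formula(n, k):
    """Rule B (assumed): bin = floor(rank * k / (n + 1)) for ranks 1..n."""
    counts = Counter((rank * k) // (n + 1) for rank in range(1, n + 1))
    return [counts[g] for g in range(k)]

def bins_from_sizes(sizes):
    """Expand bin sizes into one bin index per sorted subject."""
    return [g for g, size in enumerate(sizes) for _ in range(size)]

n, k = 684, 10
rule_a = sizes_front_loaded(n, k)
rule_b = sizes_rank_formula(n, k)

# Count subjects whose bin differs between the two rules.
moved = sum(a != b for a, b in zip(bins_from_sizes(rule_a), bins_from_sizes(rule_b)))

print("Rule A sizes:", rule_a)
print("Rule B sizes:", rule_b)
print("Subjects assigned to different bins:", moved)
```

Both rules give bins of 68 or 69 subjects, but in a different order, so a few subjects near the cut points change bins. Tied theta values would add further ambiguity that this sketch ignores.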

Another thing to check: the doc says "These item fit statistics apply only to binary items that have one latent factor." So make sure your model has one factor, then sort by that factor and use that ordering to partition the data. I presume you are already doing that.

rcturner · Posted 02-19-2021 07:05 PM

These are great questions, thank you.

Here is an example of 14 dichotomous attitudinal items with N = 684 participants with complete data. EAP estimates were used for the ability scores. Although not an ideal measure of unidimensionality, the first two eigenvalues are 12.374 and 0.937 indicating one prominent factor.

I've attached my PROC IRT code and how I created the 10 decile groups (bins) in SAS. The hand calculations are below, along with the Yen's Q test statistics from PROC IRT. Item parameter values are provided in the SAS program. I can provide more detail if helpful. If it were a difference of 94.23 vs 97.38, I would think it was just a difference in "bin assignment". But the differences are huge (e.g., item 1: 133.42 vs 17.36). And the formula is so simple... I can't figure out what we are doing wrong, but I didn't want to keep working on it after we realized that changing the estimation method for theta in PROC IRT (e.g., EAP vs ML vs MAP) did not change the item fit statistics, which it would have to if the original versions of Yen's Q and the LR G-square were being used. So we knew we had to figure this out first.

Item 1 hand calculation. P_hi = model-implied P(u1 = 1 | θ); O_hi = observed proportion (raw item p-value); N_hi = frequency per bin. Yen's Q is the sum of the last column.

Bin	P_hi	O_hi	N_hi	r_hi = O_hi - P_hi	r_hi^2	P_hi(1 - P_hi)	N_hi r_hi^2 / (P_hi(1 - P_hi))
Decile 1	0.0057	0.0147	68	0.009	0.000	0.006	0.972
Decile 2	0.3463	0.2941	68	-0.052	0.003	0.226	0.819
Decile 3	0.7515	0.8116	69	0.060	0.004	0.187	1.335
Decile 4	0.8974	0.7794	68	-0.118	0.014	0.092	10.283
Decile 5	0.9713	0.8696	69	-0.102	0.010	0.028	25.601
Decile 6	0.9560	0.9706	68	0.015	0.000	0.042	0.345
Decile 7	0.9994	0.9706	68	-0.029	0.001	0.001	94.060
Decile 8	0.9999	1.0000	69	0.000	0.000	0.000	0.007
Decile 9	1.0000	1.0000	68	0.000	0.000	0.000	0.000
Decile 10	1.0000	1.0000	69	0.000	0.000	0.000	0.000
Yen's Q							133.421

Item 2 hand calculation. P_hi = model-implied P(u2 = 1 | θ); O_hi = observed proportion (raw item p-value); N_hi = frequency per bin. Yen's Q is the sum of the last column.

Bin	P_hi	O_hi	N_hi	r_hi = O_hi - P_hi	r_hi^2	P_hi(1 - P_hi)	N_hi r_hi^2 / (P_hi(1 - P_hi))
Decile 1	0.0000	0.0000	68	0.000	0.000	0.000	0.000
Decile 2	0.0000	0.0000	68	0.000	0.000	0.000	0.000
Decile 3	0.0000	0.0000	69	0.000	0.000	0.000	0.003
Decile 4	0.0008	0.0147	68	0.014	0.000	0.001	15.989
Decile 5	0.0689	0.1739	69	0.105	0.011	0.064	11.870
Decile 6	0.7439	0.7059	68	-0.038	0.001	0.191	0.515
Decile 7	0.9982	0.9412	68	-0.057	0.003	0.002	122.961
Decile 8	1.0000	1.0000	69	0.000	0.000	0.000	0.000
Decile 9	1.0000	1.0000	68	0.000	0.000	0.000	0.000
Decile 10	1.0000	1.0000	69	0.000	0.000	0.000	0.000
Yen's Q							151.339

Although P_hi appears to be 0 or 1 for some bins, these are asymptotic to the lower and upper limits of 0 and 1, thus we made sure the values (e.g., .000021) are retained in the cells.
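As a cross-check on the arithmetic, the Item 1 hand calculation can be recomputed from the rounded values posted above. This is a minimal Python sketch (not SAS). The `EPS` clipping is an assumption standing in for the unrounded P_hi values, which the post says were retained but does not list.

```python
# Rounded Item 1 values copied from the posted table: (P_hi, O_hi, N_hi).
item1 = [
    (0.0057, 0.0147, 68),
    (0.3463, 0.2941, 68),
    (0.7515, 0.8116, 69),
    (0.8974, 0.7794, 68),
    (0.9713, 0.8696, 69),
    (0.9560, 0.9706, 68),
    (0.9994, 0.9706, 68),
    (0.9999, 1.0000, 69),
    (1.0000, 1.0000, 68),
    (1.0000, 1.0000, 69),
]

# Assumption: keep P_hi strictly inside (0, 1) so P_hi(1 - P_hi) is never exactly 0.
EPS = 1e-6

def yen_q_terms(rows, eps=EPS):
    """Per-bin contributions N_hi * (O_hi - P_hi)^2 / (P_hi * (1 - P_hi))."""
    terms = []
    for p, o, n in rows:
        p = min(max(p, eps), 1.0 - eps)
        terms.append(n * (o - p) ** 2 / (p * (1.0 - p)))
    return terms

terms = yen_q_terms(item1)
q = sum(terms)
share_decile7 = terms[6] / q  # fraction of Q contributed by decile 7

print(f"Yen's Q from rounded inputs: {q:.3f}")
print(f"Share contributed by decile 7: {share_decile7:.1%}")
```

With the rounded inputs the total comes out close to the posted 133.421, so the hand arithmetic is internally consistent. Most of the total comes from decile 7 alone, where P_hi is within 0.001 of 1, which shows how sensitive this form of the statistic is to bins where P_hi(1 - P_hi) is nearly zero.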

Item Fit Statistics
Item	DF	Pearson Chi-Square	Pr > P ChiSq	LR Chi-Square	Pr > LR ChiSq
Item1	8	17.36219	0.0266	19.54714	0.0122
Item2	8	2.6954	0.952	3.12093	0.9265
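One thing that can be checked from this output alone is whether the reported p-values correspond to a chi-square reference distribution with the listed DF of 8. The Python sketch below (not SAS) uses the closed-form upper-tail probability that holds for even degrees of freedom. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

p_item1 = chi2_sf_even_df(17.36219, 8)  # PROC IRT Pearson statistic, Item1
p_item2 = chi2_sf_even_df(2.6954, 8)    # PROC IRT Pearson statistic, Item2
p_hand = chi2_sf_even_df(133.421, 8)    # hand-calculated Yen's Q, Item1

print(f"Item1: {p_item1:.4f}  Item2: {p_item2:.4f}  hand-calculated Item1: {p_hand:.3g}")
```

The first two values agree approximately with the Pr > P ChiSq column, which suggests the discrepancy lies in how the statistic itself is computed rather than in the degrees of freedom or the reference distribution.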

I didn't post the actual data because it falls under an MDUA data restriction. If I need to create a new example, I will.
