Hi guys,
I have a strange issue with proc ttest, something I must be missing. I'm using the 'paired' ttest because I have two different variables that I am trying to compare, to show they have statistically equal means across by groups. The issue is that it seems to be showing very low p-values, indicating rejection of the null hypothesis that the means are equal, when the variables are equal down to the 6th-7th decimal place, but higher p-values, indicating failure to reject the null hypothesis, for by groups where the variables are actually nontrivially different. So it seems like my null hypothesis is reversed. I've tried using the default options (not setting the h0 option or anything else), and also specifying explicit options for the test and null hypothesis. Here is my code:
v1:
proc ttest data=forttest test=diff h0=0;
by loan_price_cat y9c;
paired aggr_roll_on_rate*roll_on_rate;
run;
v2:
proc ttest data=forttest test=ratio h0=1;
by loan_price_cat y9c;
paired aggr_roll_on_rate*roll_on_rate;
run;
v3:
proc ttest data=forttest;
by loan_price_cat y9c;
paired aggr_roll_on_rate*roll_on_rate;
run;
And for example, these are the manually calculated ratios 'aggr_roll_on_rate / roll_on_rate' for the first by group, for which all 3 versions of the code above give a p-value < .0001, rejecting h0:
1.000003427
1.0000059207
1.0000023787
1.0000021402
1.0000017226
1.0000029146
1.0000034323
1.0000064012
1.0000019977
1.0000026861
1.000001868
1.0000023016
1.0000015753
1.0000018216
1.0000020762
1.0000013303
1.0000018023
1.0000018581
1.00000197
So surely this null hypothesis should NOT be rejected here, right? Am I missing something??
-Thank you so much for your time.
What does the data actually look like? The ratios don't really indicate anything other than that the paired groups are similar, but there may be something that you are missing. If you could post even 10 lines of data for which this shows up incorrectly when you run it, that would help. Additionally, what does the output look like when you run
proc means data = forttest;
class loan_price_cat y9c;
var aggr_roll_on_rate roll_on_rate;
run;
This will at least give us an idea of what the means of the two groups are, and if the means in this case are drastically different, that may explain why you are being told to reject H0.
loan_price_cat | date | month_end | Aggr_Roll_on_Rate | Y9C | Roll_On_Rate | roll_on_ratio |
Commercial & Industrial Loans, SHAW Business IL Fixed | 12/31/2009 | 200912 | 4.749967444 | C&I Loans | 4.749983722 | 1.000003427 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 01/31/2010 | 201001 | 4.749943754 | C&I Loans | 4.749971877 | 1.000005921 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 02/28/2010 | 201002 | 4.749977403 | C&I Loans | 4.749988701 | 1.000002379 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 03/31/2010 | 201003 | 4.749979668 | C&I Loans | 4.749989834 | 1.00000214 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 04/30/2010 | 201004 | 4.760706983 | C&I Loans | 4.760715184 | 1.000001723 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 05/31/2010 | 201005 | 4.749972311 | C&I Loans | 4.749986156 | 1.000002915 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 06/30/2010 | 201006 | 4.701572491 | C&I Loans | 4.701588628 | 1.000003432 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 07/31/2010 | 201007 | 4.749939189 | C&I Loans | 4.749969594 | 1.000006401 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 08/31/2010 | 201008 | 4.749981022 | C&I Loans | 4.749990511 | 1.000001998 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 09/30/2010 | 201009 | 4.749974482 | C&I Loans | 4.749987241 | 1.000002686 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 10/31/2010 | 201010 | 4.749982254 | C&I Loans | 4.749991127 | 1.000001868 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 11/30/2010 | 201011 | 4.749978135 | C&I Loans | 4.749989068 | 1.000002302 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 12/31/2010 | 201012 | 4.749985035 | C&I Loans | 4.749992517 | 1.000001575 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 01/31/2011 | 201101 | 4.749982694 | C&I Loans | 4.749991347 | 1.000001822 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 02/28/2011 | 201102 | 4.749980277 | C&I Loans | 4.749990138 | 1.000002076 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 03/31/2011 | 201103 | 4.749987362 | C&I Loans | 4.749993681 | 1.00000133 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 04/30/2011 | 201104 | 4.717541853 | C&I Loans | 4.717550356 | 1.000001802 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 05/31/2011 | 201105 | 4.749982348 | C&I Loans | 4.749991174 | 1.000001858 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 06/30/2011 | 201106 | 4.749981285 | C&I Loans | 4.749990643 | 1.00000197 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 07/31/2011 | 201107 | 4.749985529 | C&I Loans | 4.749992765 | 1.000001523 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 08/31/2011 | 201108 | 4.742211513 | C&I Loans | 4.742218886 | 1.000001555 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 09/30/2011 | 201109 | 4.745776323 | C&I Loans | 4.745782374 | 1.000001275 |
Commercial & Industrial Loans, SHAW Business IL Fixed | 10/31/2011 | 201110 | 4.749986761 | C&I Loans | 4.749993381 | 1.000001394 |
overmar, thank you for your reply. This is the first by group of data. Please let me know if this helps indicate the problem.
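The result is correct. Do not look at the complete values, but at the differences after the third decimal place. A paired test compares the mean of the pairwise differences with the variability of those differences, and here the differences are tiny but very consistent. Remember a property of the variance of a variable: it remains invariant when a constant is added or subtracted. Analyze the original values minus 4.74 and the results must be the same. (Excuse my poor English.)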
This is the analysis with Excel:
t-Test: Paired Two Sample for Means
Statistic | Aggr_Roll_on_Rate | Roll_On_Rate |
Mean | 4.746407657 | 4.746419083 |
Variance | 0.000149777 | 0.000149767 |
Observations | 23 | 23 |
Pearson Correlation | 0.999999868 | |
Hypothesized Mean Difference | 0 | |
df | 22 | |
t Stat | -8.702547638 | |
P(T<=t) one-tail | 7.0983E-09 | |
t Critical one-tail | 1.717144335 | |
P(T<=t) two-tail | 1.41966E-08 | |
t Critical two-tail | 2.073873058 | |
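The point about subtracting a constant can be checked directly in SAS. The sketch below is only an illustration: the dataset name `shifted` and the variables `aggr_shift`, `roll_shift`, and `diff` are made up for this example, and it assumes `forttest` is already sorted by the BY variables, as the original code requires. Going by the Excel output, the mean difference is about -0.0000114 and the t statistic about -8.70, which implies a standard error of the mean difference of roughly 1.3E-6. The differences are tiny, but they vary even less, so the test rejects.

```sas
/* Sketch: the paired t-test depends only on the row-wise differences. */
data shifted;
  set forttest;
  aggr_shift = aggr_roll_on_rate - 4.74;   /* 4.74 is the arbitrary constant suggested above */
  roll_shift = roll_on_rate - 4.74;
  diff = aggr_roll_on_rate - roll_on_rate; /* the quantity the paired test actually analyzes */
run;

/* Paired test on the shifted values: t, DF and p-value should match */
/* the original run, up to floating-point rounding.                  */
proc ttest data=shifted;
  by loan_price_cat y9c;
  paired aggr_shift*roll_shift;
run;

/* One-sample test on the differences: equivalent to the paired test. */
proc ttest data=shifted h0=0;
  by loan_price_cat y9c;
  var diff;
run;
```

If the three runs (original, shifted, one-sample on `diff`) agree, that supports the reading that PROC TTEST is not losing precision; it is simply detecting a very small but very consistent difference.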
Thank you for your reply. I am interested in seeing the Excel work behind these conclusions. Would you mind sharing the Excel sheet with me?
Thanks!!
You appear to have time series data; I'm not sure a t-test is the appropriate statistical test for your data.
Hmmm... Is there a SAS procedure you would recommend for taking time series data and showing that the two series are essentially equal?
My answer was focused on the apparent lack of precision of PROC TTEST, and I overlooked the serial nature of the data. I used the Excel data analysis add-in (Analysis ToolPak: Tools -> Data Analysis in older versions, Data -> Data Analysis in Excel 2007/2010) for simplicity. I am not sure that time series analysis will solve your problem, given the paired nature of your data. I think repeated measures models with PROC GLM or PROC MIXED would serve you better. Or use the advanced features of PROC TTEST (http://www2.sas.com/proceedings/sugi31/208-31.pdf). But your first analysis is computed correctly.
Below is a simple example of a repeated measures analysis with PROC MIXED; the statements are approximate:
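*** create a new dataset with one row per series per date;
data newtest;
set forttest;
length state $4;
state="aggr"; rate=aggr_roll_on_rate; output;
state="end"; rate=roll_on_rate; output;
drop aggr_roll_on_rate roll_on_rate;
run;
*** perform the analysis (still approximate: adapt the model to your design);
proc mixed data=newtest;
class loan_price_cat y9c date state;
model rate=loan_price_cat*y9c|state;
lsmeans loan_price_cat*y9c*state / slice=loan_price_cat*y9c;
*** state*date gives each row within a subject a unique repeated level;
repeated state*date / type=cs subject=loan_price_cat*y9c;
*** and other covariance structures (AR(1), UN, etc.);
run;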
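As for the PROC TTEST route mentioned above: if the goal is to show that the two rates are "essentially equal" rather than to detect any difference at all, one option is an equivalence test (two one-sided tests, TOST). The sketch below assumes a SAS release in which PROC TTEST supports the TOST option (SAS 9.2 or later, as far as I know). The margin of +/- 0.001 is a purely hypothetical choice for illustration and must come from what counts as a negligible difference in your application. It also does not address the time series concern raised earlier, since it still treats the monthly differences as independent.

```sas
/* Sketch: equivalence (TOST) test on the paired differences.        */
/* The bounds -0.001 and 0.001 are hypothetical equivalence margins, */
/* not values taken from this thread.                                */
proc ttest data=forttest tost(-0.001, 0.001);
  by loan_price_cat y9c;
  paired aggr_roll_on_rate*roll_on_rate;
run;
```

Here the null hypothesis is that the mean difference lies outside the margins, so a small p-value supports equivalence. That matches the direction of the original question better than the ordinary paired test does.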