Hi all,
I am having trouble finding the right test for comparing accuracy trends over time. Is the Cochran-Armitage trend test the right choice? I have a 2x2 table comparing test results with the gold standard at each time point. I plotted the accuracy trends over time on a graph, but I need to know whether there is a significant trend. My data are not in an Excel sheet or in SAS. I had to calculate accuracy manually for each time point and then enter the values in an Excel sheet to generate graphs. When a 2x2 table had empty cells, I added 0.5 to each cell and then calculated accuracy. Any advice on calculating accuracy and testing its significance across time is appreciated. I am attaching the trend graph here.
My questions are:
1. How do I enter data in SAS from a manual 2x2 table, especially when I have 2x2 table data at 4 different time points and have to do accuracy calculations for 6 different conditions?
2. How do I calculate accuracy using SAS (and how do I adjust for missing cells)?
3. How do I compare accuracy trends (significance, p-value) across 4 different time points?
4. My sample size is small, as this is a pilot study involving 18 students.
Thank you for the help,
Sat
Can you show us what the 'gold standard' data you are using for comparison look like?
And can you describe the manual calculations you did?
In a generic sense, your data would likely need 4 columns: the time point value, indicators for which row and which column of the table, and the cell value.
Suppose the 2X2 looks something like:
     Col1 Col2
row1 10   15
row2 17   33
Then the data would look like
Time Row Col Value
10 1 1 10
10 2 1 17
10 1 2 15
10 2 2 33
If you want to have multiple topics, then add a topic or condition variable.
The actual order of the rows typically is not important, as Proc Sort can fix it if needed.
To get this into SAS, you could enter the following into the editor:
data want;
input Time Row Col Value;
datalines;
10 1 1 10
10 2 1 17
10 1 2 15
10 2 2 33
;
run;
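As a rough sketch of the "add a condition variable" idea, the same layout can carry several conditions and time points in one data set. The data set name cells, the Condition values, and every count other than the first table are made-up placeholders, not the poster's data.

```sas
/* Hypothetical layout: one row per cell of each 2x2 table,     */
/* identified by condition and time point. Counts are invented. */
data cells;
   length Condition $20;
   input Condition $ Time Row Col Value;
   datalines;
CondA 5 1 1 10
CondA 5 2 1 17
CondA 5 1 2 15
CondA 5 2 2 33
CondA 10 1 1 12
CondA 10 2 1 14
CondA 10 1 2 11
CondA 10 2 2 38
;
run;

/* Entry order does not matter; sort into a convenient order afterwards */
proc sort data=cells;
   by Condition Time Row Col;
run;
```

Each additional condition or time point simply adds four more rows, so later steps can process everything with a BY statement.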
If you are careful, you can use Excel to enter the data as above, save it as a CSV file, and use Proc Import to read it into a SAS data set.
Direct entry in SAS is possible but is often cumbersome.
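A minimal sketch of the CSV route might look like the following. The file path and the output data set name are assumptions for illustration, and the CSV is assumed to have a header row with the column names Time, Row, Col and Value.

```sas
/* Hypothetical path and data set name; adjust to your own file */
proc import datafile="C:\temp\cells.csv"
            out=cells
            dbms=csv
            replace;
   getnames=yes;   /* take variable names from the first row */
run;

/* Quick check that the values were read as expected */
proc print data=cells;
run;
```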
Suppose your Col1 and Col2 represent Male Female.
Then we are in effect saying that a value of 1 for Col indicates Male and 2 indicates Female.
A format could be used to get pretty text in that case.
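For instance, a format along these lines would label the column codes. The format name sexfmt is an arbitrary choice, and the data set want is the one created by the data step earlier in this reply.

```sas
/* Map the numeric column codes to readable labels */
proc format;
   value sexfmt 1 = "Male"
                2 = "Female";
run;

/* Rebuild the 2x2 table from the cell counts, showing the labels */
proc freq data=want;
   tables Row*Col;
   weight Value;
   format Col sexfmt.;
run;
```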
You may have to describe what you intend by "adjust missing cells". Do you mean to impute a missing value or to exclude it from the analysis?
Trends are often evaluated by the significance of the slope of a fitted line, or of the parameters in a nonlinear regression model. But with only 4 time points, such a test may not be very sensitive.
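Since the original question asked about the Cochran-Armitage trend test, here is one possible sketch of how it could be requested in Proc Freq. It assumes each reading has been collapsed to correct (TP or TN) versus incorrect (FP or FN) at each time point. The data set name acc_long, the variable names, and all counts are invented placeholders. Note also that this test treats the readings as independent, which may not hold when the same students are assessed at every time point.

```sas
/* Hypothetical counts: Correct=1 is TP+TN, Correct=0 is FP+FN */
data acc_long;
   input Time Correct Count;
   datalines;
5  1 40
5  0 35
10 1 43
10 0 32
15 1 50
15 0 25
20 1 55
20 0 20
;
run;

proc freq data=acc_long;
   tables Correct*Time / trend;   /* Cochran-Armitage test for trend */
   exact trend;                   /* exact p-values, useful for small samples */
   weight Count;
run;
```

The TREND option needs a table with two levels in one dimension, which is why accuracy is expressed here as a correct/incorrect indicator rather than as a percentage.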
Hi Ballard,
Thanks for the reply. I created the Excel sheet with the required data. The data show how trainee fellows report their findings compared with an expert (gold standard). I have these values for several variables (Variable). I need to find the accuracy at different time points (baseline, 5, 10, 15, 20- highlighted in RED). I want to know if there is any significant difference in accuracy % between the time points (baseline, 5, 10, 15, 20).
I also highlighted a set of 4 values where there were empty cells in the 2x2 table (missing counts). I added 0.5 to each cell.
Thanks for your help!
Sat
Graphically, you could look at the evolution of agreement statistics over time:
libname xl Excel "&sasforum\datasets\Data_2_by_2.xlsx" access=readonly;
proc sql;
create table two as
select coalesce(study, 0) as study,
variable, fellow, Gold_standard,
int(count) as n
from xl.'Sheet1$'n
order by variable, study;
quit;
libname xl clear;
proc freq data=two;
by variable study;
table fellow*gold_standard / agree;
weight n / zeros;
ods output Kappa=k;
run;
proc transpose data=k out=ktable;
by variable study;
var nValue1;
id name1;
run;
proc sgplot data=ktable;
where variable ne "Ach_Obs";
BAND x=study lower=L_KAPPA upper=U_KAPPA / group=variable transparency=0.5;
series x=study y=_KAPPA_ / markers lineattrs=(pattern=solid thickness=2) group=variable;
xaxis type=discrete;
run;
Accuracy was determined as the ratio of true predictions (the sum of true positives and true negatives) to all predictions (the sum of true and false positives and negatives): (TP+TN)/(TP+TN+FP+FN).
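As a small sketch, the same calculation could be done in a data step, including the adjustment described earlier in the thread of adding 0.5 to every cell when a table has an empty cell. The data set name acc, the variable names, the assignment of counts to TP, FP, FN and TN, and the second row of counts are all hypothetical.

```sas
/* One row per 2x2 table: counts of TP, FP, FN, TN at each time point */
data acc;
   input Time TP FP FN TN;
   /* If any cell is empty, add 0.5 to every cell (the adjustment described above) */
   if min(TP, FP, FN, TN) = 0 then do;
      TP = TP + 0.5;
      FP = FP + 0.5;
      FN = FN + 0.5;
      TN = TN + 0.5;
   end;
   Accuracy = (TP + TN) / (TP + TN + FP + FN);
   datalines;
10 10 15 17 33
20 12 0 3 40
;
run;

proc print data=acc;
run;
```

The accuracy ratio itself is defined whenever the table total is nonzero, so whether the 0.5 adjustment is wanted at this step is a choice for the analyst.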
So is it a repeated measure, and is it longitudinal data? Maybe you should post it at the Stat forum; Steve could give you PROC GLIMMIX code.
proc import datafile='/folders/myfolders/data_2_by_2.xlsx' out=have dbms=xlsx replace;
run;
data want;
set have;
if variable in ('Hiatal Hernia' 'Motor Pattern') and study in ('Baseline' '5' '10' '15' '20');
run;
proc gee data = want;
class study variable fellow gold_standard;
model Count = variable study / dist=poisson link=log;
lsmeans study / ilink cl diff;
repeated subject = variable*fellow*gold_standard / within = study type=unstr covb corrw;
run;
The PROC GEE code from @Ksharp should provide you with everything you need to get started. I might have included some other factors in the model statement, but I am too pressed for time this morning to think clearly about it.
Steve Denham
Steve, I received a notification that you have replied to the question I posed:
"I am interested in analyzing the generalized impulse response functions with PROC VARMAX. However, I am not sure how to program this with PROC VARMAX, since none of the examples outlined in SAS indicated 'generalized'. I was wondering if someone would kindly help me with the program". However, I could not find your response to my question. Could you let me know where you placed your response? Thanks.
Mahmud
Mahmud,
Since PROC VARMAX is in SAS/ETS, I moved the question to the Forecasting and Econometrics forum, where it should attract a more specialized audience. I didn't have any specific answer.
Steve Denham