I am running a logistic regression on 1714 variables (PheWAS). I followed this guide (https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html) to run the regressions the "BY way."

In my final table, I would like to have the number of cases for each level of the predictor (the predictor/exposure is a SNP (genetic variant), yes/no). In my final logistic table I have removed the reference row. Each row is one logistic regression and is unique on varname.

The table that I get:

Varname   p-value   odds ratio
_001      0.002     10.2
_002      0.6       1.0

The table that I want:

Varname   p-value   odds ratio   cases_SNP_yes   cases_SNP_no
_001      0.002     10.2         100             5
_002      0.6       1.0          30              30

The way I currently get the cases is to run a proc means step on the input data set (one row per patient (obs=264,000), one column per variable, and a column that indicates exposure) and then merge it with the logistic output by varname. I then repeat the step to get the number of cases for the other SNP level. However, this takes a long time, and I would think there is a better way to do this. I am wondering whether there is an option in the proc logistic statement that would give me these counts.

Sample code is below.

* code for how I get my logistic table;
proc logistic data=have alpha=0.00002927; *procedure options are listed without a slash;
by VarName; *this is the "by way" ;
class SNP ;
model value = SNP / rsq expb;
ods output ParameterEstimates=model ;
quit;
data model_formated;
set model (rename=(expest=odds_ratio));
where variable = 'SNP'; *keep the row that contains the p value;
run;
proc means data=have sum;
by varname ;
where SNP=1;
var value;
output out=cases
sum=count;
run;
data logistic_with_counts;
merge model_formated cases(keep=varname count);
by varname;
run;
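
For comparison, the two proc means passes described above could possibly be collapsed into one. The following is only a sketch, not a tested solution. It assumes that value is coded 0/1 (so its sum is the number of cases), that SNP is a numeric 0/1 variable, and that have is sorted by varname, as the BY steps above already require. The data set names cases_long and cases_wide are made up for illustration.

```sas
* one pass: sum of value for every varname by SNP combination;
proc means data=have noprint nway;
  by varname;
  class SNP;
  var value;
  output out=cases_long(drop=_type_ _freq_) sum=count;
run;

* one row per varname, one count column per SNP level;
* with SNP coded 0/1 the new columns are cases_SNP_0 and cases_SNP_1;
proc transpose data=cases_long out=cases_wide(drop=_name_) prefix=cases_SNP_;
  by varname;
  id SNP;
  var count;
run;

* attach both counts to the logistic output in a single merge;
data logistic_with_counts;
  merge model_formated cases_wide;
  by varname;
run;
```

If a SNP level has no observations for some varname, the corresponding count column would be missing for that row rather than 0.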