Background: I am using GENMOD to run a GEE on survey-weighted data. My code below accounts for the correlated aspect of my data, which is hospital # (hosp_nis). I am using a national dataset with ~54 million observations and ~50 variables (I cut out many already, but I realize I should have cut out all but the ones absolutely needed).

ods graphics off;
ods exclude all;
ods results off;
sasfile work.finalscabies open;
proc genmod data=finalscabies;
class hosp_nis Nis_stratum race homeless female agecat pay1 zipinc_qrtl / descending;
model scabies(event='1')= Race female AgeCat Pay1 homeless ZIPINC_QRTL / dist=bin link=logit maxiter=10;
weight discwt;
repeated subject=Hosp_Nis(nis_stratum) / TYPE=EXCH;
estimate 'Black' race 0 0 0 0 1 -1 /exp;
estimate 'Hispanic' race 0 0 0 1 0 -1 /exp;
estimate 'API' race 0 0 1 0 0 -1 /exp;
estimate 'NA' race 0 1 0 0 0 -1 /exp;
lsmeans homeless / OR cl;
lsmeans female / OR cl;
lsmeans agecat / OR cl;
lsmeans pay1 / OR diff=all cl;
lsmeans ZIPINC_QRTL / OR diff=all cl;
ods output Estimates=estimateESTscabies GEEEmpPEst=GEEest GEEFitCriteria=GEEFit LSMeans=OR1 Diffs=ORDiffs;
run;
ods exclude none;
sasfile work.finalscabies close;

Dilemma: I first ran the GEE without any of the estimate statements and without the SASFILE line that loads the dataset into memory. That run took 1 hour to finish and produce output. I then added the 4 estimate statements and reran the code, which took 33 hours. Unfortunately I still had tweaks to make. I added the lsmeans lines, changed the ODS settings to hopefully improve performance, and ran the code again. This ran for 48 hours, at which point my computer did an automatic update without me realizing and all was lost. Finally, I added the SASFILE line to load this massive file (24 GB) into memory; I have 40 GB of RAM. I ran the code for the 3rd time and it is currently at 36 hours elapsed.

Here's where the real question comes in. The log only shows:

NOTE: Writing HTML5(EGHTML) Body file: EGHTML
27
28 ods graphics off;
29 ods exclude all
30 ods results off;

In prior runs of the code, the log would show "NOTE: Algorithm Converged" after approximately 15-20 minutes. What is my log reflecting? Surely it must be running the GENMOD after 36 hours?

Additional information: Looking at my RAM usage, I saw an initial increase to 28 GB used early in the run. It is now down to 18 GB in use.

Thank you to anyone who can provide advice, suggestions, or an answer! (Yes, I will remove the extra variables if for some reason I have to run this again.)
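
For reference, the variable trimming mentioned above could look something like the following. This is only a minimal sketch: it assumes the ten variables referenced in the GENMOD step are the only ones needed, and work.scabies_slim is a made-up dataset name used for illustration.

```sas
/* Sketch: keep only the variables the GENMOD step references.      */
/* work.scabies_slim is a hypothetical name, not an existing table. */
data work.scabies_slim;
    set work.finalscabies
        (keep=scabies race female agecat pay1 homeless zipinc_qrtl
              hosp_nis nis_stratum discwt);
run;

/* Hold the smaller table in memory instead of the full 24 GB file. */
sasfile work.scabies_slim open;

/* ... proc genmod data=work.scabies_slim; (same statements as above) ... */

sasfile work.scabies_slim close;
```

Because KEEP= is applied on the input dataset, the dropped variables are never read into the DATA step, so the copy that is later held in memory should be considerably smaller than the original.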