I see clustering all over the place, and I imagine the reviewer did as well. It is not so much that the surgeons form clusters, but that the patients are clustered by surgeon. I would guess that, in all of your data, the proportion of patients who were operated on by more than one of the surgeons for whom you have information is so small as to be negligible. Given that sort of clustering, you might want to consider a hierarchical model in PROC MIXED that looks something like:
proc mixed data=<yourdata>;
class <all of your categorical variables, but make sure patient id is in here>;
model response = <all of your covariates>;
/* Be sure that surgeon id is in this list of covariates */
random surgeonid/subject=patientid solution;
<lsmeans, estimates, lsmestimates would go in here>
run;
By including surgeon id as both a random and a fixed effect, you obtain both a level estimate and an estimate of additional variability that is attributable to the patients within each surgeon.
SteveDenham
I will try to run the code below and read more about it. Thank you very much. I will post here again with my results.
Hi,
Below is the SAS code I am using to run my linear mixed model.
MRN is the medical record number for patients (N=19164) and surgeon is the surgeon's name (N=85).
proc mixed data=red.final1 ;
class GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 MRN surgeon;
model MMEDISPENSED = GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 MRN surgeon/SOLUTION;
random surgeon/subject=MRN ;
run;
My log looks like this:
120 proc mixed data=red.final1 ;
121 class GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender
121! surg_yop2 MRN surgeon;
122 model MMEDISPENSED = GENDER ETHNICITY year11 DRUGS1 SMOKECURR1
122! INSURENUM surg_gender surg_yop2 MRN surgeon/SOLUTION;
123 random surgeon/subject=MRN ;
124 run;
WARNING: Class levels for MRN are not printed because of excessive size.
ERROR: The SAS System stopped processing this step because of insufficient
memory.
NOTE: PROCEDURE MIXED used (Total process time):
real time 2.54 seconds
cpu time 0.12 seconds
Please suggest. Thanks.
Memory issues are hard to work around. First, make sure that you have set options to allocate the maximum memory (MEMSIZE option). Probably best to set this in your configuration file.
If you still run out of memory, remember that the more important factor for PROC MIXED memory is the size of the design matrix. Even if you are out of memory, the output should give some information on the number of columns in the X and Z matrices. It's just a guess, but I think the Z matrix is too large and too sparse to effectively invert. Here are some things to try. First, if MRN is numeric, try removing it from the CLASS statement (but make sure the data set is sorted by MRN).
Second, consider using PROC HPMIXED. It is specifically designed for (quoting from the documentation):
linear mixed models with thousands of levels for the fixed and/or random effects
linear mixed models with hierarchically nested fixed and/or random effects, possibly with hundreds or thousands of levels at each level of the hierarchy
To me, this describes your situation exactly. So you might try this to see if it solves some of the problems (which assumes that MRN is or can be made into a continuous variable). Note that MRN is not in the MODEL statement.
proc hpmixed data=red.final1 ;
class GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 surgeon;
model MMEDISPENSED = GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 surgeon/SOLUTION;
random surgeon/subject=MRN ;
run;
If this works, then you can add appropriate TEST statements for hypotheses of interest about the parameters and LSMEANS to get estimates.
SteveDenham
Hi,
I am trying to increase the SAS memory now.
I tried the code below:
proc options option=memsize;
run;
Below is the log:
NOTE: SAS initialization used:
real time 2.88 seconds
cpu time 0.85 seconds
1 proc options option=memsize;
2 run;
SAS (r) Proprietary Software Release 9.4 TS1M6
MEMSIZE=2147483648
Specifies the limit on the amount of virtual memory
that can be used during a SAS session.
NOTE: PROCEDURE OPTIONS used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
I am also attaching screen shot of my configuration file screen.
Please let me know how to increase the memsize there. Thanks in advance!
Method one:
Start SAS with the option on the command line (MEMSIZE can only be set at invocation or in the configuration file, not in an OPTIONS statement inside a program):
sas -memsize max
Method two (much better way):
Open the file sasv9.cfg in an editor, find the options line, and add -memsize max to that line.
Currently you are at the default 2G memory. I can't guarantee that increasing memsize will solve all of your memory issues though; I still think addressing the model and random statements is what you will need to do.
SteveDenham
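Once SAS has been restarted with the new setting, it is worth confirming that the change actually took effect. A minimal sketch, assuming nothing beyond the MEMSIZE option already discussed above; the wording of the message written to the log is arbitrary:

```sas
/* Run this after restarting SAS with the new -memsize setting.   */
/* Writes the current MEMSIZE value to the log.                    */
%put NOTE: MEMSIZE is currently %sysfunc(getoption(memsize));

/* Same information, with the option description, as in the       */
/* earlier post.                                                   */
proc options option=memsize;
run;
```

If the value in the log is still 2147483648, the configuration file that was edited is probably not the one SAS reads at startup.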
I tried the code below. However, SAS keeps running and never gave me results. I waited for around 2 hours.
proc hpmixed data=red.final1 ;
class GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 surgeon;
model MMEDISPENSED = GENDER ETHNICITY year11 DRUGS1 SMOKECURR1 INSURENUM surg_gender surg_yop2 surgeon/SOLUTION;
random surgeon/subject=MRN ;
run;
I would set it up to run overnight. It may take a while, but at least it seems to be doing something, as opposed to running out of memory in 2 seconds. Try inserting the option LOGNOTE in the PROC HPMIXED statement. You could then periodically check the log to see the current state of affairs.
SteveDenham
Update:
I used the two blocks of code below for a univariate analysis. Both give me the same results. Surgeon year of practice, which was highly significant with PROC GLM, is no longer significant. Surgeon gender is no longer significant either. This makes me wonder whether I should really adjust for the clustering of patients within surgeon. I want to study the association between medication dispensed and surgeon characteristics such as year of practice and gender. If I adjust for surgeon, can those characteristics logically be significant at all?
proc mixed data=red.merge10 ;
class surgeon surgeon_year_of_practice;
model MMEDISPENSE1= surgeon_year_of_practice/solution;
random surgeon;
run;
proc mixed data=red.merge10 covtest;
class surgeon surgeon_year_of_practice;
model MMEDISPENSE1 = surgeon_year_of_practice /SOLUTION;
random intercept/subject=surgeon;
run;
Thanks and cheers to this community!
These two are essentially identical: they both fit a random intercept model for surgeon, so it is not surprising that the results are the same. The only difference I see is that the second includes the COVTEST option, which does some bad things in PROC MIXED (it reports Wald-type Z tests for the variance components). Do not give much weight to the p values that come from this. If you truly want to look at the variance components, try running this in PROC GLIMMIX. There the COVTEST statement does valid ML tests for hypotheses of interest.
SteveDenham
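For what it is worth, a GLIMMIX version of the random intercept model might look like the sketch below. It reuses the data set and variable names from the PROC MIXED code above. The label text and the choice of ZEROG (a test that there is no random surgeon effect at all) are assumptions about the hypothesis of interest, not something stated in the thread.

```sas
/* Random intercept model for surgeon, refit in PROC GLIMMIX.      */
/* Data set and variable names are taken from the PROC MIXED code. */
proc glimmix data=red.merge10;
   class surgeon surgeon_year_of_practice;
   model MMEDISPENSE1 = surgeon_year_of_practice / solution;
   random intercept / subject=surgeon;
   /* Likelihood ratio test of H0: the G matrix is zero, i.e. no   */
   /* surgeon-to-surgeon variance. The quoted label is arbitrary.  */
   covtest 'No surgeon-to-surgeon variance' zerog;
run;
```

Note that COVTEST is a statement here, not an option in the PROC statement as it is in PROC MIXED.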