About Merdock

Merdock · 06-21-2022

@tarheel13, thank you, it works for me as well. Accepted it as the solution!

Merdock · 06-21-2022

Thank you so much, Tom, this works perfectly!

Merdock · 06-18-2022

Sorry, I forgot to mention that my dataset dat2 is sorted by visit, so I'm not sure why the visits for ID#011 are not showing up in the order they're supposed to. I couldn't find anything concrete in the documentation about how to get the x-axis to display all 4 visits for all participants, though, or whether there might be a better plot or way to visualize change in status over time by ID.

Merdock · 06-18-2022

Just posted it again but it looks the same. It's working and generating the dataset when I paste it into my SAS though, are you having issues with it?

Merdock · 06-17-2022

Suppose I have a dataset that looks like the one below, where each subject ID can have up to 4 visits (visit=00 is baseline) and disease status can vary across visits (Yes=1, No=0). This means that one subject might have one disease status at one visit but a different status at any of the follow-up visits. For example, say subject ID=002 has disease (status=1) at baseline, then no disease at visits 01 and 02, and then disease again at visit=03.

The issue: I need to create a graph that shows the change in disease status classification from visit to visit for each individual subject ID, but I am having some trouble visualizing what this might look like.

What I've done: One obvious way I was thinking of achieving this is to put the disease status categories (yes=1, no=0) on the Y-axis and visit on the X-axis, and then draw a line plot for each subject ID.

Questions:

1) I tried the code below but I cannot figure out what's causing the x-axis for ID=011 to not display the visit numbers in order. It shows 00, 02, 03 and 01, as displayed in the image below.

2) Is there any way to make the x-axis display all 4 visits for all participants, as opposed to only the visits that each participant has? For example, ID=003 only has visits 01 and 02, but instead of the x-axis for this ID showing only those two visits, I'd like it to show all 4 visits. Is that possible?

3) Finally, is this the best way of visualizing this data, especially given that my real dataset has over 100 participants, or are there any other options I should consider?

Any help would be appreciated. Thanks!

data dat;
  input ID $ Visit $ Status $;
  datalines;
001 00 0
001 01 0
001 02 0
001 03 0
002 00 1
002 01 0
002 02 0
002 03 1
003 01 1
003 02 1
004 00 1
004 02 1
004 03 0
005 00 1
005 01 0
005 02 0
005 03 0
006 02 0
007 00 0
007 01 0
008 00 1
008 02 1
008 03 0
009 00 0
009 01 1
009 02 0
009 03 1
010 00 1
011 01 0
011 03 1
;
run;

title1 "Graph of change in status over time";
proc sgpanel data=dat2;
  panelby id / uniscale=row;
  series x=visit y=status / markers markerattrs=(size=10pt);
run;
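For the axis questions in this post, the following is a rough sketch of one thing that might be worth trying, not a confirmed fix. It assumes dat2 is simply dat sorted by id and visit. With uniscale=row the column (x) axis is not shared across panels, which may be why each panel lists only its own visits. The heat map at the end is one possible alternative layout for a large number of participants.

```sas
/* Sketch only: assumes dat2 is dat sorted by id and visit */
proc sort data=dat out=dat2;
  by id visit;
run;

title1 "Graph of change in status over time";
proc sgpanel data=dat2;
  /* uniscale=all shares both axes across panels, so each panel
     shows every visit value that occurs anywhere in the data */
  panelby id / uniscale=all;
  series x=visit y=status / markers markerattrs=(size=10pt);
  /* visit is character, so the axis is discrete; order it by value */
  colaxis discreteorder=formatted;
run;

/* Possible alternative for 100+ participants:
   one row per ID, one column per visit, cell color = status */
proc sgplot data=dat2;
  heatmapparm x=visit y=id colorgroup=status;
run;
```

In the heat map, visits a participant did not attend simply appear as empty cells, so gaps stay visible without forcing a line through them.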

Merdock · 04-23-2022

Hi Steve, You're right - I tried various options and it converges if I comment out the random intercept statement, change the covariance structure to CS for the R matrix and add that NLOPTIONS statement at the end. I might also try playing around with a GEE to see the difference, but ultimately I think this is definitely progress. How meaningful the results might be or how well this model fits my data is a different story, but at least I got it to converge :). Thank you sooo much for all your step-by-step help, I truly appreciate it!

proc glimmix data=final2 noclprint;
  class id visit (ref="1");
  model outcome(event="1") = biom visit / dist=binary link=logit solution;
  random visit / subject=id residual type=cs;
  output out=fitglmm pred(ilink noblup)=pred;
  nloptions tech=NRRIDG maxiter=1000;
run;
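For the GEE comparison mentioned in this post, a minimal sketch with PROC GENMOD could look like the following. It assumes the same final2 dataset and variable names as the GLIMMIX call; the exchangeable working correlation (type=exch) is an assumption chosen to mirror type=cs, not something settled in the thread.

```sas
/* Sketch: population-averaged (GEE) counterpart of the GLIMMIX model.
   Assumes final2 with variables id, visit, outcome, biom as posted. */
proc genmod data=final2 descending;   /* descending: model P(outcome=1) */
  class id visit;
  model outcome = biom visit / dist=binomial link=logit;
  /* exchangeable working correlation within subject; corrw prints it */
  repeated subject=id / within=visit type=exch corrw;
run;
```

GEE estimates are population-averaged rather than subject-specific, so the coefficients are not expected to match the GLIMMIX ones exactly.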

Merdock · 04-21-2022

Hi Steve, I tried adding the options you suggested and it looks like that solved the convergence issue, though now I'm getting another message saying the G matrix is not positive definite. From what I've been reading, one of the possible causes for this might be that there is not enough variation in the response (which would be the case for my data). I also tried removing the random intercept statement and that took care of the message about the G matrix not being PD, but instead I started getting a message in the output table saying something about the convergence status being indeterminate. Any further ideas or suggestions? I'm starting to think that maybe a GLMM is not that well suited given the scarcity of my data and I should just use good old summary statistics instead of attempting to do any modeling.

Merdock · 04-20-2022

Hi Jiltao, thanks for your suggestions. I took out the interaction term (visit*biom), and also tried both the AR(1) covariance structure and removing the random intercept statement, but it still doesn't converge. Here is what I'm getting in each case:

When I delete the random intercept and use type=AR(1): [output screenshot]

Versus when I delete the random intercept but still use type=UN: [output screenshot]

Could this be due to a lack of variability in the response variable, which would result in a lack of predictive capability, since I only have 9 observations with outcome=1 vs 41 obs with outcome=0?

Merdock · 04-20-2022

Hi Steve, Thank you for the suggestions. I tried fitting the following model on the dataset above and unfortunately there seems to be an issue, as the model does not converge. Could this be due to small sample size (only 15 participants with visits 1 and 5, so a total of 30 observations, and only 5 observations with outcome=1 at either visit 1 or 5), or something else?

Additionally, I'm also getting a message that the model does not have an intercept. Is that because I included the noint option? That message does go away if I remove the noint. That said, does it even make sense, or is it necessary, to have a random intercept statement here? I know that this would probably not cause the issues mentioned, but wouldn't it also be better to use dist=binary instead of binomial, since my dependent variable is 0/1 as opposed to events/trials?

Sorry for all these questions, I'm completely new to longitudinal/repeated measures analysis and just trying to get a better understanding of things.

proc glimmix data=final2 method=RSPL noclprint;
  class id visit;
  model outcome = visit biom biom*visit / noint dist=binomial link=logit solution;
  random intercept / subject=id;
  random visit / subject=id type=UN residual;
  output out=fitglmm pred(ilink noblup)=pred;
run;

Merdock · 04-18-2022

Hi Steve, Thank you so much for your answers, that's really helpful!

To address your question about collapsing the multinomial, I was also planning on doing a ROC analysis to estimate the predictive ability of the biomarker for the detection of disease, where participants with "normal" and "mildly decreased" values of the ordinal outcome variable would be categorized as belonging to the "non-disease" group, while participants with "moderately decreased" and "severely decreased" values would be considered part of the "disease" group. So I would like to assess the diagnostic performance of the biomarker for several cut-offs by computing accuracy, sensitivity, specificity, positive predictive value, negative predictive value and likelihood ratios. This ROC analysis is the reason why I dichotomized the multinomial response variable.

I figured that I can use proc glimmix to explore the relationship between outcome and biomarker from baseline to visit 5, and then take the estimated probabilities of positivity for each observation based on the model and use them in PROC LOGISTIC to build the ROC curve, though I'm not sure how to programmatically obtain the positive predictive value, negative predictive value and likelihood ratios. Hopefully this makes sense.

Regarding your comment on my question 4 below, about imposing a covariance structure on visit, can I do that by adding another random visit statement, as shown below? I read that proc glimmix does not allow for a repeated statement like proc mixed does, but instead we can equivalently use a random statement with the residual option:

proc glimmix data=final2 method=quad(qpoints=50) noclprint;
  class id visit;
  model outcome = visit biom biom*visit / noint dist=binomial link=logit solution;
  random intercept / subject=id;
  random visit / subject=id type=UN residual;
  output out=fitglmm pred(ilink noblup)=pred;
run;
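On obtaining PPV, NPV and likelihood ratios programmatically, one possible approach is sketched below. It assumes the fitglmm dataset from the OUTPUT statement in this post contains outcome and the predicted probability pred; the names rocpts and rocstats are made up for illustration. The OUTROC= dataset from PROC LOGISTIC holds the classification counts at each cut-off, and the remaining measures follow from those counts.

```sas
/* Sketch: assumes fitglmm holds outcome and pred (from GLIMMIX OUTPUT) */
proc logistic data=fitglmm;
  model outcome(event='1') = pred / outroc=rocpts;
run;

data rocstats;
  set rocpts;
  sensitivity = _sensit_;
  specificity = 1 - _1mspec_;
  /* guard against empty cells before dividing */
  if (_pos_ + _falpos_) > 0 then ppv = _pos_ / (_pos_ + _falpos_);
  if (_neg_ + _falneg_) > 0 then npv = _neg_ / (_neg_ + _falneg_);
  if _1mspec_ > 0 then lr_pos = _sensit_ / _1mspec_;
  if specificity > 0 then lr_neg = (1 - _sensit_) / specificity;
run;

proc print data=rocstats;
  var _prob_ sensitivity specificity ppv npv lr_pos lr_neg;
run;
```

Two caveats: _prob_ here is the probability from the refitted logistic model rather than the original pred value, and this treats the repeated observations as independent, so the PPV and NPV reflect the prevalence in this particular sample.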

Merdock · 04-17-2022

I have a dataset (data1) with assessments taken at different visits: baseline (visit=1), Day 8 (visit=2), Day 30 (visit=3), Day 60 (visit=4) and Day 180 (visit=5) on an ordinal variable with 5 levels (1=normal, 2=mild, 3=moderate, 4=severe), and a separate dataset (data2) with repeated measures for a continuous variable (biomarker), taken at the same visits (1, 2, 3, 4, 5). One of the exploratory objectives of my exercise problem (I'm trying to teach myself GLMM) asks to investigate the correlation between the biomarker (as predictor) and the ordinal variable (as outcome). Since this is only an exploratory objective, i.e., the study was not specifically designed for this, not all participants in the study will have assessments performed at all those visits.

Since this is a longitudinal study, with repeated measurements taken on the same subjects, I am thinking of exploring the correlation between the continuous predictor and categorical outcome from baseline to Day 180 by using repeated measures logistic regression, implemented via PROC GLIMMIX.

In preparation for fitting the model, this is what I have done so far:

1) I merged data1 and data2 by id and visit to create one unified dataset (dataset "final" illustrated below) that contains both the outcome and predictor variables at the different visits.

2) For simplicity, I converted my 5-level categorical outcome into a binary one (if "normal or mild" then outcome=0; if "moderate or severe" then outcome=1).

3) I noticed that only 15 out of a total of 46 participants in the study had measurements on both the outcome and predictor variables at both baseline (visit 1) and Day 180 (visit 5). The rest of the participants either didn't have the baseline visit, or didn't have visit 5, or both. So for my model I used a smaller dataset that only contains these 15 participants who had both visit 1 and visit 5.
Outcome variable: Disease status (outcome=0 means no disease present; outcome=1 means disease)

Independent variables: Biomarker (biom), but also visit as a fixed-effect factor since we have repeated measurements

Dataset:

data final;
  input id $ visit outcome biom;
  datalines;
001 1 0 59.7
001 2 0 78.4
001 4 0 75.2
001 5 1 80.7
002 2 0 64.6
002 5 0 389
003 1 0 618
003 2 0 469
003 3 0 478
004 1 1 404
004 2 0 47.3
004 3 1 64.5
004 4 0 88.8
004 5 0 86.7
005 1 0 88.3
006 1 0 234
007 1 0 245
007 2 0 243
007 3 0 237
007 5 0 226
008 1 0 22.2
008 2 0 25.5
008 5 1 35.5
009 3 0 35.3
009 5 0 30.3
010 1 0 134
010 5 0 167
011 4 0 146
011 5 0 135
012 1 0 140
012 4 0 74.6
012 5 0 72.9
013 1 0 79.1
013 3 0 75.6
013 4 0 68.9
014 2 1 291
014 3 0 21.3
015 1 0 17
015 5 1 15.6
016 1 0 16.1
016 2 0 13.9
016 5 0 99.9
017 1 0 105
017 3 0 96.2
017 4 0 102
017 5 0 89.2
018 3 1 25.9
019 2 0 27.3
019 3 0 26.2
020 2 0 26.2
020 5 0 28.9
021 1 1 74.2
021 3 0 75.1
021 4 0 60.4
021 5 0 62.2
022 1 0 61.4
023 1 0 12.7
024 2 0 12.1
024 3 0 13.4
025 1 0 11.6
025 5 0 11.5
026 1 0 45.9
026 5 0 47.2
027 1 0 39
027 2 0 38.7
027 3 1 42.7
027 4 0 18.4
027 5 0 15.3
028 1 0 15.9
028 2 0 16.1
028 4 0 15.8
029 1 0 57.8
029 2 1 86.7
029 3 1 88.3
029 5 0 234
030 1 0 245
030 3 0 243
030 5 0 237
031 1 0 226
031 2 0 22.2
031 4 0 18.4
032 4 0 15.3
032 5 0 15.9
033 1 0 16.1
033 2 0 78.4
034 2 0 75.2
034 3 0 80.7
035 1 0 64.6
035 2 0 389
035 3 0 618
035 4 0 469
036 1 0 478
037 1 0 152
038 2 0 148
038 3 0 29.12
039 2 0 421
040 1 0 520
040 2 0 478
040 3 0 18.4
041 2 0 15.3
041 4 0 15.9
042 1 0 16.1
043 1 0 78.4
044 1 0 325
044 2 0 478
044 3 0 452
045 2 0 25.8
045 4 0 15.9
045 5 0 16.1
046 1 0 78.4
046 4 0 16.8
;
run;

data final2;
  input id $ visit outcome biom;
  datalines;
001 1 0 59.7
001 2 0 78.4
001 4 0 75.2
001 5 1 80.7
004 1 1 404
004 2 0 47.3
004 3 1 64.5
004 4 0 88.8
004 5 0 86.7
007 1 0 245
007 2 0 243
007 3 0 237
007 5 0 226
008 1 0 22.2
008 2 0 25.5
008 5 1 35.5
010 1 0 134
010 5 0 167
012 1 0 140
012 4 0 74.6
012 5 0 72.9
015 1 0 17
015 5 1 15.6
016 1 0 16.1
016 2 0 13.9
016 5 0 99.9
017 1 0 105
017 3 0 96.2
017 4 0 102
017 5 0 89.2
021 1 1 74.2
021 3 0 75.1
021 4 0 60.4
021 5 0 62.2
025 1 0 11.6
025 5 0 11.5
026 1 0 45.9
026 5 0 47.2
027 1 0 39
027 2 0 38.7
027 3 1 42.7
027 4 0 18.4
027 5 0 15.3
029 1 0 57.8
029 2 1 86.7
029 3 1 88.3
029 5 0 234
030 1 0 245
030 3 0 243
030 5 0 237
;
run;

Model:

proc glimmix data=final2 method=quad(qpoints=50) noclprint;
  class id visit;
  model outcome = visit biom biom*visit / noint dist=binomial link=logit solution;
  random intercept / subject=id;
  output out=fitglmm pred(ilink noblup)=pred;
run;

My questions are:

1) Does my action plan for addressing this exploratory question seem correct?

2) Am I correct in keeping only those participants that have both a baseline visit and a visit 5 assessment for the final analysis dataset, and excluding the rest? Or is this problematic? If yes, what are some correct alternatives?

3) Are there any pre-modeling visualization techniques that I can/should use to further explore my data? Is it ok to use boxplots to look at the distribution of my continuous variable at each level of the binary outcome? Should I maybe use point-biserial correlation first to see if there's any evidence of a relationship at all between my predictor and dependent variable before fitting the model? If yes, is there such a thing as point-biserial correlation for repeated measures data, or should I just use the baseline values of the variables?

4) Is my model setup correct/complete?

5) How can I check to see if my model fits the data well? I know that for regular linear models, there are residual plots, QQ plots, checks for outliers and influential points etc., but I'm not sure what kind of model diagnostics are best for GLMMs.

6) Any other suggestions/recommendations?

Thank you so much for any help and guidance.
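As a rough illustration of the pre-modeling exploration asked about in this post, the boxplot idea could be sketched as below. It assumes the dataset final as posted; paneling by visit is one option among several and is not something recommended in the thread itself.

```sas
/* Sketch: assumes dataset final with visit, outcome, biom as posted */
proc sgpanel data=final;
  panelby visit / columns=5;         /* one panel per visit */
  vbox biom / category=outcome;      /* biomarker by outcome level */
run;

/* Counts of outcome by visit, to see how sparse outcome=1 is */
proc freq data=final;
  tables visit*outcome / nocol nopercent;
run;
```

The frequency table makes the sparsity of outcome=1 at each visit explicit, which is relevant to the convergence concerns raised later.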

Merdock · 03-01-2022

1) Yes, a participant can receive more than 1 drug in 1 obs 2) Yes, the final dataset should only contain patid, evid, date and the drug indicators. Thank you for your follow-up!

Merdock · 02-28-2022

Hi everybody, I have the dataset have below, with participants who were given three meds: drug A, drug B and drug C. I want to get the dataset want below, where, if the participant was given a drug on a previous date, the corresponding indicator (DRUGA, DRUGB or DRUGC) turns from . to 2. If the drug was not used on that given date or a prior date, then the indicator is 0. Otherwise, if the drug was used on the given date but not on any prior date, then the indicator is 1. I have figured out a way to do this with the code below, but I want to know if there's a shorter, more efficient/elegant way of programming it. Here are my have and want datasets, as well as the code I've come up with. Thanks in advance for any suggestions!

data have;
  length patid $25;
  input patid $ event $ date:mmddyy. diff med DRUGA DRUGB DRUGC;
  format date mmddyy.;
  datalines;
P001 2 06/07/2017 1 1 . 1 .
P001 1 06/10/2017 . . . . .
P001 1 06/12/2017 . . . . .
P001 1 06/13/2017 . . . . .
P001 1 06/20/2017 . 1 . 1 .
P001 0 06/23/2017 . 1 . . 1
P001 0 06/24/2017 . 1 . . 1
P001 0 06/25/2017 . . . . .
P002 1 07/02/2019 . . . . .
P002 2 07/03/2019 1 1 . 1 .
P002 1 07/06/2019 . 1 1 . .
P002 1 07/10/2019 . . . . .
;
run;

data want;
  length patid $25;
  input patid $ event $ date:mmddyy. DRUGA DRUGB DRUGC;
  format date mmddyy.;
  datalines;
P001 1 06/10/2017 0 2 0
P001 1 06/12/2017 0 2 0
P001 1 06/13/2017 0 2 0
P001 1 06/20/2017 0 1 0
P001 0 06/23/2017 0 2 1
P001 0 06/24/2017 0 2 1
P001 0 06/25/2017 0 2 2
P002 1 07/02/2019 0 0 0
P002 1 07/06/2019 1 2 0
P002 1 07/10/2019 2 2 0
;
run;

data want;
  set have;
  by patid date;
  DRUGA_orig=DRUGA;
  DRUGB_orig=DRUGB;
  DRUGC_orig=DRUGC;
  retain DRUGA_imp DRUGB_imp DRUGC_imp;
  if first.patid then do;
    DRUGA=DRUGA_orig; DRUGA_imp=DRUGA_orig;
    DRUGB=DRUGB_orig; DRUGB_imp=DRUGB_orig;
    DRUGC=DRUGC_orig; DRUGC_imp=DRUGC_orig;
  end;
  array x1 DRUGA DRUGB DRUGC;
  array x2 DRUGA_imp DRUGB_imp DRUGC_imp;
  do i=1 to dim(x1);
    if x1[i]=. then x1[i]=x2[i];
    else x2[i]=x1[i];
  end;
  drop i;
  *recode med indicator from 1 to 2 when drug taken on prior date;
  *recode med indicator from . to 0 when no drug taken prior to, or on the same date;
  array xdo1 DRUGA_orig DRUGB_orig DRUGC_orig;
  array xdo2 DRUGA DRUGB DRUGC;
  do j=1 to dim(xdo1);
    if xdo2[j]=. then xdo2[j]=0;
    if xdo2[j]=1 & xdo1[j]=. then xdo2[j]=xdo2[j]+1;
  end;
  drop j;
  if diff=1 then delete; *delete EVID=2/diff=1 records;
run;
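One shorter variant, offered as a sketch rather than a definitive answer, keeps a per-patient "seen before" flag for each drug in a temporary array (temporary arrays are retained across observations automatically). It assumes have is grouped by patid and ordered by date within patient, as in the posted data, and it reads the rule off the want dataset: 1 if the drug is given on the current date, 2 if not given now but given earlier, 0 otherwise. The array name seen is made up for illustration.

```sas
/* Sketch: assumes have is ordered by patid and date as posted */
data want;
  set have;
  by patid;
  array drug[3] DRUGA DRUGB DRUGC;
  array seen[3] _temporary_;               /* retained automatically */
  if first.patid then call missing(of seen[*]);
  do i = 1 to dim(drug);
    if drug[i] = 1 then seen[i] = 1;       /* given today: keep 1, remember it */
    else drug[i] = ifn(seen[i] = 1, 2, 0); /* 2 if given earlier, else 0 */
  end;
  if diff = 1 then delete;                 /* drop diff=1 records after updating flags */
  keep patid event date DRUGA DRUGB DRUGC;
run;
```

The delete comes after the loop on purpose, so that a drug recorded on a deleted diff=1 row still counts as prior use for later dates.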

Merdock · 05-16-2021

@Kurt_Bremser, thank you! Not quite sure what the issue was but it looks like adding the order=internal statement fixed it and is now displaying what I expected.

Merdock · 05-11-2021

@Reeza, thank you for the link, very helpful! Ideally, I'd like to have a combination of both compact and easily updated to apply to other situations but I think that for this case, I'll settle for compactness.


Re: Need help figuring out why this code doesn't work as expected

Need help figuring out why this code doesn't work as expected

Re: Need some advice on Sensitivity, Specificity, PPV, NPV for Repeate...

Need some advice on Sensitivity, Specificity, PPV, NPV for Repeated Me...

Re: Need help generating output in Word

Re: how to set up dataset for survival analysis with time-dependent co...

how to set up dataset for survival analysis with time-dependent covari...

Re: proc sgplot for change in binary variable over time by ID

proc sgplot for change in binary variable over time by ID

Re: correlation between binary and continuous variable with PROC GLIMM...

correlation between binary and continuous variable with PROC GLIMMIX

Re: indicator if drug taken on same or prior date

indicator if drug taken on same or prior date

Re: manipulations with proc freq