Merdock
Quartz | Level 8

Hi everyone,

 

I have a dataset similar to the one provided below, with multiple measurements per patient taken roughly every 6 months. These patients are enrolled in a kidney disease registry where:

Reason = reason for termination from the registry (categorical variable with 1=transplant, 2=dialysis, 3=death).

Termdt = termination date.

eGFR = glomerular filtration rate value (continuous).

COVA = binary time-varying covariate.

CENSDATE = termination date, if available. Otherwise, it’s the last visit date.

SURVTIME = CENSDATE – ENROLLMENTDT.

Time = VISITDT – FIRSTVISITDT.
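For reference, the three derived variables defined above can be recomputed from the raw dates with a short DATA step. This is only a sketch: it assumes the mock_dataset posted further down, and the *_chk variable names are hypothetical, chosen so the recomputed values can be compared with the supplied CENSDATE, SURVTIME and time columns. Subtracting two SAS dates gives a difference in days.

```sas
/* Sketch: recompute the derived variables from the raw dates.       */
/* The *_chk names are hypothetical and exist only for comparison.   */
data derived_check;
   set mock_dataset;
   /* Termination date if present, otherwise the last visit date */
   censdate_chk = coalesce(Termdt, LastVisitdt);
   /* SAS dates are day counts, so these differences are in days */
   survtime_chk = censdate_chk - Enrollmentdt;
   time_chk     = Visitdt - FirstVisitdt;
   format censdate_chk mmddyy10.;
run;

proc print data=derived_check;
   var ID Visit CENSDATE censdate_chk SURVTIME survtime_chk time time_chk;
run;
```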

 

The primary outcome is reason for termination. I’m interested in modeling the association between reason for termination and race while controlling for rate of eGFR decline (slope) and for a time-varying covariate (COVA). The hypothesis is that a lower proportion of racial minorities will have transplant rather than dialysis at the time of end-stage renal disease onset (i.e., time of termination from the registry), adjusted for covariates.

 

My questions:

1. How do I get the rate of decline or slope for eGFR to then use as a covariate in my model?

I’ve read some studies that used a linear mixed effects model to get the individual-specific annual eGFR slope for each participant. I tried the code below with time from first visit until date of eGFR measurement (the “time” variable in my mock dataset) as a fixed effect and patient (“ID”) as a random effect. I obtained the slope estimates highlighted below in yellow for each participant, but I’m not sure this is correct: in the articles I’ve been reading, the eGFR slopes seemed to be much bigger, whereas mine are all < 1. Can someone please advise if this seems like the right approach?

2. For modeling the association between reason for termination from Registry (outcome) and race (predictor), controlling for eGFR slope (covariate) and COVA (covariate), would multinomial logistic regression be a good choice? If yes, how do I specify the model to account for the fact that one of my covariates (COVA) is a time-varying binary variable?

 

Thank you for any insights/guidance.

 

data mock_dataset;
infile datalines expandtabs; /* the data lines below are tab-separated */
input ID$ Visit$ Enrollmentdt:mmddyy. Visitdt:mmddyy. FirstVisitdt:mmddyy. LastVisitdt:mmddyy. reason$ Termdt:mmddyy. eGFR Race$ COVA$ CENSDATE:mmddyy. SURVTIME time;
format Visitdt mmddyy10. Enrollmentdt mmddyy10. FirstVisitdt mmddyy10. LastVisitdt mmddyy10. Termdt mmddyy10. CENSDATE mmddyy10.;
datalines;
001	0	1/1/2022	1/1/2022	1/1/2022	6/15/2023	.	.	25.15	0	0	6/15/2023	530	0
001	1	1/1/2022	7/5/2022	1/1/2022	6/15/2023	.	.	32.33	0	0	6/15/2023	530	185
001	2	1/1/2022	1/7/2023	1/1/2022	6/15/2023	.	.	28.77	0	0	6/15/2023	530	371
001	3	1/1/2022	6/15/2023	1/1/2022	6/15/2023	.	.	35.01	0	1	6/15/2023	530	530
002	0	2/10/2021	2/10/2021	2/10/2021	1/12/2023	2	10/25/2023	42.94	1	0	10/25/2023	987	0
002	1	2/10/2021	8/1/2021	2/10/2021	1/12/2023	2	10/25/2023	40.32	1	0	10/25/2023	987	172
002	2	2/10/2021	3/4/2022	2/10/2021	1/12/2023	2	10/25/2023	38.11	1	0	10/25/2023	987	387
002	3	2/10/2021	7/15/2022	2/10/2021	1/12/2023	2	10/25/2023	36.39	1	0	10/25/2023	987	520
002	4	2/10/2021	1/12/2023	2/10/2021	1/12/2023	2	10/25/2023	34.52	1	0	10/25/2023	987	701
003	0	10/12/2021	10/12/2021	10/12/2021	4/15/2023	1	5/6/2024	58.19	0	0	5/6/2024	937	0
003	1	10/12/2021	5/5/2022	10/12/2021	4/15/2023	1	5/6/2024	61.92	0	1	5/6/2024	937	205
003	2	10/12/2021	11/20/2022	10/12/2021	4/15/2023	1	5/6/2024	54.02	0	1	5/6/2024	937	404
003	3	10/12/2021	4/15/2023	10/12/2021	4/15/2023	1	5/6/2024	46.15	0	1	5/6/2024	937	550
004	0	9/25/2020	9/25/2020	9/25/2020	4/18/2022	.	.	20.2	1	0	4/18/2022	570	0
004	1	9/25/2020	3/5/2021	9/25/2020	4/18/2022	.	.	28.5	1	0	4/18/2022	570	161
004	2	9/25/2020	10/27/2021	9/25/2020	4/18/2022	.	.	26.61	1	0	4/18/2022	570	397
004	3	9/25/2020	4/18/2022	9/25/2020	4/18/2022	.	.	.	1	0	4/18/2022	570	570
005	0	2/9/2021	2/9/2021	2/9/2021	1/15/2022	3	6/5/2023	35.96	0	0	6/5/2023	340	0
005	1	2/9/2021	8/18/2021	2/9/2021	1/15/2022	3	6/5/2023	23.25	0	1	6/5/2023	340	190
005	2	2/9/2021	1/15/2022	2/9/2021	1/15/2022	3	6/5/2023	21.98	0	0	6/5/2023	340	340
006	0	12/23/2022	12/23/2022	12/23/2022	5/15/2023	.	.	30.33	1	0	5/15/2023	143	0
006	1	12/23/2022	5/15/2023	12/23/2022	5/15/2023	.	.	28.06	1	1	5/15/2023	143	143
007	0	9/25/2021	9/25/2021	9/25/2021	4/29/2022	2	2/15/2024	19.8	0	1	2/15/2024	873	0
007	1	9/25/2021	4/29/2022	9/25/2021	4/29/2022	2	2/15/2024	22.01	0	1	2/15/2024	873	216
008	0	11/16/2020	11/16/2020	11/16/2020	12/15/2021	1	6/30/2023	10.2	1	1	6/30/2023	956	0
008	1	11/16/2020	5/20/2021	11/16/2020	12/15/2021	1	6/30/2023	13.51	1	0	6/30/2023	956	185
008	2	11/16/2020	12/15/2021	11/16/2020	12/15/2021	1	6/30/2023	12.85	1	0	6/30/2023	956	394
009	0	9/17/2020	9/17/2020	9/17/2020	11/8/2022	1	1/25/2024	15.58	0	0	1/25/2024	1225	0
009	1	9/17/2020	4/15/2021	9/17/2020	11/8/2022	1	1/25/2024	20.81	0	.	1/25/2024	1225	210
009	2	9/17/2020	10/10/2021	9/17/2020	11/8/2022	1	1/25/2024	28.28	0	1	1/25/2024	1225	388
009	3	9/17/2020	5/25/2022	9/17/2020	11/8/2022	1	1/25/2024	25.4	0	.	1/25/2024	1225	615
009	4	9/17/2020	11/8/2022	9/17/2020	11/8/2022	1	1/25/2024	26.9	0	.	1/25/2024	1225	782
010	0	7/21/2020	7/21/2020	7/21/2020	8/8/2022	2	10/9/2023	49.8	1	0	10/9/2023	1175	0
010	1	7/21/2020	1/15/2021	7/21/2020	8/8/2022	2	10/9/2023	35.91	1	0	10/9/2023	1175	178
010	2	7/21/2020	8/25/2021	7/21/2020	8/8/2022	2	10/9/2023	.	1	.	10/9/2023	1175	400
010	3	7/21/2020	2/12/2022	7/21/2020	8/8/2022	2	10/9/2023	28.45	1	0	10/9/2023	1175	571
010	4	7/21/2020	8/8/2022	7/21/2020	8/8/2022	2	10/9/2023	25.36	1	1	10/9/2023	1175	748
011	0	12/14/2022	12/14/2022	12/14/2022	12/14/2022	.	.	52.4	0	0	12/14/2022	0	0
012	0	4/3/2021	4/3/2021	4/3/2021	10/18/2021	2	12/28/2023	14.3	1	0	12/28/2023	999	0
012	1	4/3/2021	10/18/2021	4/3/2021	10/18/2021	2	12/28/2023	10.82	1	0	12/28/2023	999	198
013	0	6/7/2019	6/7/2019	6/7/2019	7/18/2020	1	9/5/2023	28.2	0	0	9/5/2023	1551	0
013	1	6/7/2019	12/29/2019	6/7/2019	7/18/2020	1	9/5/2023	25.16	0	0	9/5/2023	1551	205
013	2	6/7/2019	7/18/2020	6/7/2019	7/18/2020	1	9/5/2023	18.74	0	0	9/5/2023	1551	407
014	0	5/18/2022	5/18/2022	5/18/2022	5/18/2022	.	.	13.1	1	0	5/18/2022	0	0
015	0	3/27/2019	3/27/2019	3/27/2019	4/20/2021	2	6/22/2023	22.01	0	0	6/22/2023	1548	0
015	1	3/27/2019	9/20/2019	3/27/2019	4/20/2021	2	6/22/2023	22.17	0	0	6/22/2023	1548	177
015	3	3/27/2019	10/12/2020	3/27/2019	4/20/2021	2	6/22/2023	25.6	0	1	6/22/2023	1548	565
015	4	3/27/2019	4/20/2021	3/27/2019	4/20/2021	2	6/22/2023	.	0	0	6/22/2023	1548	755
;
run;
proc print data=mock_dataset; run;

proc mixed data=mock_dataset;
	class ID;
	model eGFR=time/solution;
	random int time/subject=ID type=un solution;
	ods output solutionf=sf(keep=effect estimate rename=(estimate=overall));
	ods output solutionr=sr(keep=effect ID estimate /*rename=(estimate=ssdev)*/);
run;
proc print data=sr; run;

[Screenshot: PROC MIXED random-effects solution output, with the per-participant slope estimates highlighted in yellow]

 

StatsMan
SAS Super FREQ

With this mixed model:

 

proc mixed data=mock_dataset;
	class ID;
	model eGFR=time/solution;
	random int time/subject=ID type=un solution;
run;

TIME is a covariate and EGFR is the dependent variable. You can get the slope on TIME for each level of ID by combining the results from the two SOLUTIONs tables. SOLUTIONF, from the MODEL statement, gives you the overall slope on TIME while SOLUTIONR, from the RANDOM statement, gives you the adjustment to that overall slope for each level of ID. 

 

If REASON is a nominal or ordinal variable, then you can use PROC GLIMMIX to model it with a multinomial logistic regression. Put COVA on the MODEL statement as a predictor, and GLIMMIX will use the correct DF to test that effect. You do not need an option to specify COVA as a time-varying covariate. GLIMMIX will detect that.
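A minimal sketch of such a model follows. It assumes a hypothetical dataset named analysis that already holds REASON, Race, COVA and a per-subject eGFR slope (called slope here; that name is not from the original post). Dialysis (reason='2') is picked as the reference category only for illustration, since the hypothesis compares transplant against dialysis.

```sas
/* Sketch only: "analysis" and "slope" are hypothetical names.        */
/* Generalized logit (multinomial) model for the nominal outcome.     */
proc glimmix data=analysis;
   class Race COVA;
   /* ref='2' makes dialysis the reference outcome category */
   model reason(ref='2') = Race COVA slope
         / dist=multinomial link=glogit solution;
run;
```

Observations with a missing REASON are excluded from the fit, so it is worth checking how many records remain.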

Merdock
Quartz | Level 8

@StatsMan , so you're suggesting something like this, where sscoeff will give the individual specific slope estimate?

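proc mixed data=mock_dataset;
         class ID;
         model eGFR = time/solution;
         random int time/ type=un subject=ID solution;
         /* fixed-effect estimates: overall intercept and slope */
         ods output solutionf=sf(keep=effect estimate
                                 rename=(estimate=overall));
         /* random-effect estimates: per-ID deviations (keep ID, not variety) */
         ods output solutionr=sr(keep=effect ID estimate
                                 rename=(estimate=ssdev));
run;

proc sort data=sf;
         by effect;
run;

proc sort data=sr;
         by effect;
run;

/* one-to-many merge: each per-ID deviation gets the matching overall estimate */
data final;
         merge sf sr;
         by effect;
         sscoeff = overall + ssdev;
run;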

So when I include the eGFR slope as a covariate in PROC GLIMMIX, do I use sscoeff? The outcome variable is REASON for termination from the Registry, which is nominal with three levels: transplant, dialysis, or death. One thing to note, which isn't obvious in the mock dataset, is that the vast majority of participants have missing data for REASON.

StatsMan
SAS Super FREQ

You will need to be careful with the merging. If the merging is done correctly, then sscoeff will hold the slope for each subject. You might want to remove the intercept rows from the two SOLUTION tables before doing the merge, since they are not of interest to you.
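One way to do that is sketched below. It assumes sf contains the variables effect and overall, and sr contains effect, ID and ssdev; the output names slopes, slope_per_day and slope_per_year are hypothetical. Because time in the mock dataset is a difference between two SAS dates, it is measured in days, so the fitted slope is in eGFR units per day. Multiplying by 365.25 gives an annual slope, which is one likely reason the raw estimates look much smaller than published annual slopes.

```sas
/* Sketch: keep only the TIME (slope) rows and combine the overall    */
/* slope with each subject's deviation. Output names are hypothetical.*/
proc sql;
   create table slopes as
   select r.ID,
          f.overall + r.ssdev              as slope_per_day,
          calculated slope_per_day * 365.25 as slope_per_year
   from sr as r
        inner join sf as f
        on r.effect = f.effect
   /* drops the intercept rows from both tables */
   where upcase(r.effect) = 'TIME';
quit;

proc print data=slopes; run;
```

The result has one row per ID and can be merged back onto the patient-level data by ID.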

Merdock
Quartz | Level 8

Noted, thank you for the helpful suggestion! Another question: I assume the individual slopes obtained with this code are overall slopes across the entire 5 years of follow-up (from time=0 to time=5 years), right? If so, could I also obtain the individual slopes at each of the 6-month intervals from baseline to 5 years?

SteveDenham
Jade | Level 19

I suppose you could do that. You will need to segment the interval so that you have enough data points to do a regression with the 6 month point at the middle.

 

SteveDenham

Merdock
Quartz | Level 8
@SteveDenham, got it, thank you very much!
StatsMan
SAS Super FREQ

What @SteveDenham said. You will need a lot of data to segment the slopes into ten 6-month intervals.
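As a rough sketch of the segmenting idea, the code below fits the same random-slope model to a single window centred on one time point. The window midpoint and half-width are assumed values chosen only for illustration, and the dataset and variable names window and time_c are hypothetical. With few visits per window the model may not converge.

```sas
/* Sketch: local slope around one time point (values are assumptions) */
%let centre    = 365;   /* window midpoint in days, here about 12 months */
%let halfwidth = 200;   /* half-width of the window in days              */

data window;
   set mock_dataset;
   /* keep only non-missing eGFR values that fall inside the window */
   if abs(time - &centre) <= &halfwidth and not missing(eGFR);
   /* centre time so the intercept refers to the window midpoint */
   time_c = time - &centre;
run;

proc mixed data=window;
   class ID;
   model eGFR = time_c / solution;
   random int time_c / subject=ID type=un solution;
run;
```

Repeating this for each 6-month midpoint gives one set of local slopes per window.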

Merdock
Quartz | Level 8
@StatsMan, thank you!
