Solved: Re: Proc MIXED MODEL Warning: For covariance type Unstructured

cjacks21 · Posted 05-12-2021 03:46 PM

Hello,

I am having trouble getting the results I need from my PROC MIXED models. I am comparing 3 different covariance structures, but when I run the model with TYPE=UN (unstructured) and add the intercept to the RANDOM statement, I get a warning, the LSMEANS solution is missing, and there is no output of my predicted values. For the TYPE=CS model, everything is fine with the intercept added. I attached the output to this post, and the code and log are below. What do I need to change so that I can use intercept and year in my UN PROC MIXED model?

NOTE: 7898 observations are not included because of missing values.
NOTE: The data set WORK.PRED has 0 observations and 0 variables.

------>WARNING: Data set WORK.PRED was not replaced because new file is incomplete.<-----
NOTE: PROCEDURE MIXED used (Total process time):
real time 0.88 seconds
cpu time 0.68 seconds

SAS Code:

proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
by rfa_id year time avgvlyryouth; 
run;
*When I ran the procedure without adding intercept or int after random the output showed everything but with it some outputs where
cut out;
PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml PLOTS(MAXPOINTS=NONE);
    class   sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
    model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 
			agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4 
			ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined ctimeyr*hiv_risk 
			ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp = pred;
	random intercept / subject=rfa_id Type= un;
	lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup /diff;
title 'Mixed Model Test for Youth Unstructured';
RUN;

proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
by rfa_id year time avgvlyryouth; 
run;

PROC MIXED data= youthavgvl_yr_sorted order=data PLOTS(MAXPOINTS=NONE);
    class   sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
    model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 
			agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4 
			ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined ctimeyr*hiv_risk 
			ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp = pred_cs;
	random  intercept year / subject=rfa_id Type= cs;
	lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup /diff;
title 'Mixed Model Test for Youth Compound Symmetry';
RUN;

SAS log:

458  proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
459  by rfa_id year time avgvlyryouth;
460  run;

NOTE: There were 57534 observations read from the data set WORK.YOUTHAVGVL_YR.
NOTE: 10467 observations with duplicate key values were deleted.
NOTE: The data set WORK.YOUTHAVGVL_YR_SORTED has 47067 observations and 19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds


461  *When I ran the procedure without adding intercept or int after random the output showed
461! everything but with it some outputs where
462  cut out;
463  PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml PLOTS(MAXPOINTS=NONE);
464      class   sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black")
464! hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
465      model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk
465! baseline_cd4
466              agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk
466!  ccd4c*baseline_cd4
467              ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined
467! ctimeyr*hiv_risk
468              ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp = pred;
469      random intercept / subject=rfa_id Type= un;
470      lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup /diff;
471  title 'Mixed Model Test for Youth Unstructured';
472  RUN;

NOTE: 7898 observations are not included because of missing values.
NOTE: The data set WORK.PRED has 0 observations and 0 variables.
WARNING: Data set WORK.PRED was not replaced because new file is incomplete.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.88 seconds
      cpu time            0.68 seconds


473
474  proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
475  by rfa_id year time avgvlyryouth;
476  run;

NOTE: There were 57534 observations read from the data set WORK.YOUTHAVGVL_YR.
NOTE: 10467 observations with duplicate key values were deleted.
NOTE: The data set WORK.YOUTHAVGVL_YR_SORTED has 47067 observations and 19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds


477
478  PROC MIXED data= youthavgvl_yr_sorted order=data PLOTS(MAXPOINTS=NONE);
479      class   sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black")
479! hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
480      model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk
480! baseline_cd4
481              agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk
481!  ccd4c*baseline_cd4
482              ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined
482! ctimeyr*hiv_risk
483              ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp = pred_cs;
484      random  intercept year / subject=rfa_id Type= cs;
485      lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup /diff;
486  title 'Mixed Model Test for Youth Compound Symmetry';
487  RUN;

NOTE: 7898 observations are not included because of missing values.
NOTE: Convergence criteria met.
NOTE: Estimated G matrix is not positive definite.
NOTE: The data set WORK.PRED_CS has 47067 observations and 26 variables.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           50.90 seconds
      cpu time            50.49 seconds

Rick_SAS · Posted 06-24-2021 11:56 AM

I think your DATA step to compute group from year is wrong. You didn't set the LENGTH of group, so the first assignment sets the length. Depending on the data, that might be one character, which means that the group variable mostly has the value "1" because "10" through "19" get truncated to "1".

Use

LENGTH group $2;

or even better use a numeric variable:

group = year-1998;
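
To see the truncation described above, here is a minimal sketch. The original DATA step is not shown in this thread, so the data set name demo and the variables group_bad, group_ok, and group_num are hypothetical, and the year range 1999 to 2019 is an assumption.

```sas
/* Hypothetical illustration: the poster's real DATA step is not shown. */
data demo;
    length group_ok $2;                      /* explicit length, as suggested */
    do year = 1999 to 2019;                  /* assumed year range            */
        /* The first assignment to group_bad is a 1-character literal, so    */
        /* SAS fixes its length at $1 when the step is compiled.             */
        if year = 1999 then group_bad = "1";
        else group_bad = strip(put(year - 1998, 2.));  /* "10" is stored as "1" */
        group_ok  = strip(put(year - 1998, 2.));       /* keeps both digits     */
        group_num = year - 1998;                       /* numeric alternative   */
        output;
    end;
run;

/* Compare the number of distinct levels in each version. */
proc freq data=demo;
    tables group_bad group_ok group_num;
run;
```

In this sketch group_bad collapses two-digit values onto their first digit, while group_ok and group_num keep one level per year.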

SteveDenham · Posted 05-13-2021 02:22 PM

It is not obvious to me what the problem might be. At first I thought you may be trying to estimate too many parameters in the UN structure, but as you have it written you should get a single parameter estimate called UN(1,1). The mysterious part of all this is that there are no other relevant NOTEs, WARNINGs or ERRORs in the log. There may be something in the .lst file that would help explain what is happening.

I would like to push this to some folks who are really good with this sort of thing, so @jiltao , @STAT_Kathleen , you guys have helped me in the past, so please give @cjacks21 something to work with.

SteveDenham

cjacks21 · Posted 05-13-2021 02:29 PM

Thank you so much for your help in advance. I have been troubleshooting this for a couple of days.

jiltao · Posted 05-13-2021 02:53 PM

If you want to fit a random intercept model, you might want to use --

random intercept / subject=rfa_id;

If you want to fit a random intercept and slope model, you might want to use --

random intercept year / subject=rfa_id type=un;

I would not use type=CS for a random coefficients model.
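
As a rough sketch of the random intercept and slope model, with the fixed effects trimmed for readability: the data set and variable names are taken from the original post, the shortened MODEL statement is only an assumption for illustration, and year is treated as continuous (not listed in CLASS).

```sas
/* Sketch only: fixed effects are trimmed; this is not the poster's full model. */
proc mixed data=youthavgvl_yr_sorted method=reml covtest;
    class sex_hars (ref="F") agegroup (ref="20-24");
    model avgvlyryouth = year sex_hars agegroup / solution;
    /* Random intercept and random slope for continuous year.             */
    /* TYPE=UN estimates UN(1,1), UN(2,1), and UN(2,2): the two variances */
    /* and the intercept-slope covariance. G and GCORR print the          */
    /* estimated G matrix and its correlation form.                       */
    random intercept year / subject=rfa_id type=un g gcorr;
run;
```

With TYPE=UN the intercept and slope variances are free to differ and to covary, which is usually what a random coefficients model needs.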

Are you saying that the random intercept model (model 1 above) worked but the random intercept and slope model (model 2 above) did not? What exactly happened for the model 2 above?

cjacks21 · Posted 05-13-2021 08:28 PM

Hello, very close. I am sorry for the confusion. Model 1 is TYPE=UN and Model 2 is TYPE=CS. For both models I am trying to use a random intercept and time, which is year in my model. When I use:

random intercept year / subject=rfa_id type=un

I do not get any outputs. It seems as if the code stops. For the compound symmetry type (model 2) I get a result.

SteveDenham · Posted 05-14-2021 08:34 AM

For the UN structure situation, how many levels of year are in your data?

SteveDenham

cjacks21 · Posted 05-17-2021 11:16 AM

Hello,

I added the year to the class statement in the proc mixed model and for year there are 21 levels.

Courtney Jacks

jiltao · Posted 05-14-2021 09:25 AM

For this model --

random intercept year / subject=rfa_id type=un;

can you send in the Log (including the program and messages) and Output?

cjacks21 · Posted 05-17-2021 11:04 AM

Hello,

Yes. The log output for that structure is below.

268  proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey;
268! *remove duplicates;
269  by rfa_id year time avgvlyryouth;
270  run;

NOTE: There were 57534 observations read from the data set
      WORK.YOUTHAVGVL_YR.
NOTE: 10467 observations with duplicate key values were deleted.
NOTE: The data set WORK.YOUTHAVGVL_YR_SORTED has 47067 observations and
      19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.03 seconds


271  *When I ran the procedure without adding intercept or int after
271! random the output showed everything but with it some outputs where
272  cut out;
273  PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml
273! PLOTS(MAXPOINTS=NONE);
274      class   sex_hars (ref="F") res_at_hiv_dx (ref="Urban")
274! race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref=">
274!  500") agegroup(ref="20-24");
275      model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx
275! race_combined hiv_risk baseline_cd4
276              agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx
276! ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4
277              ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx
277! ctimeyr*race_combined ctimeyr*hiv_risk
278              ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp =
278! pred_un ;
279      random intercept year/  Type= un subject=rfa_id;
280      *random year / type= un subject= rfa_id;
281      lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk
281! baseline_cd4 agegroup /diff;
282  title 'Mixed Model Test for Youth Unstructured';
283  RUN;

NOTE: 7898 observations are not included because of missing values.
NOTE: The data set WORK.PRED_UN has 0 observations and 0 variables.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           1.21 seconds
      cpu time            0.88 seconds

jiltao · Posted 05-17-2021 03:15 PM

Please also send the Output from your PROC MIXED program. Thanks!

cjacks21 · Posted 05-18-2021 11:03 AM

Hello,

I am so sorry. I missed that part of the reply. My Output window was empty, but I did have the Results. I attached them to this reply as a Word document. The code and log are below again:

346  ************************************* 1st Proc Mixed Code
346! *****************************************************;
347  *************************************    Youth Part 1
347! *****************************************************;
348  *sort data set with average viral load by rfa_id time and average
348! viral load by year;
349  *proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey;
349! *remove duplicates;
350  *by rfa_id year time avgvlyryouth;
351  *run;
352
353  *ods rtf;
354  *Proc mix to look at youth mean viral load by residence, cd4, sex,
354! race, and gender ;
355  * multilevel (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref=">
355! 500") agegroup(ref="20-24");
356  *PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml
356! PLOTS(MAXPOINTS=NONE);
357      *class   rfa_id sex_hars (ref="F") res_at_hiv_dx (ref="Urban")
357! race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref=">
357!  500") agegroup(ref="20-24");
358      *model avgvlyryouth = year ccd4c ctimeyr sex_hars res_at_hiv_dx
358! race_combined hiv_risk baseline_cd4 agegroup/ solution outp = pred;
359     * random  int year / subject=rfa_id Type= un vcorr;
360    *title 'Mixed Model Results For Viral Load Trend Over Time For
360! Youth Visits (Reference 1)';
361  *RUN;
362  *ods rtf close;
363
364   *ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined
364! ccd4c*hiv_risk ccd4c*baseline_cd4 ccd4c*agegroup ctimeyr*sex_hars
364! ctimeyr*res_at_hiv_dx ctimeyr*race_combined ctimeyr*hiv_risk
364! ctimeyr*baseline_cd4 ctimeyr*agegroup
365
366
367  *************************************    Youth Proc Mixed Models
367! *****************************************************;
368  *CJacks notes;
369  *3 proc mixed models where used to determine which TYPE= was the best
369!  for the data analysis of mean vl over time;
370  *TYPE= UN, CS, and AR(1);
371  *sort data set with average viral load by rfa_id time and average
371! viral load by year;
372
373  proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey;
373! *remove duplicates;
374  by rfa_id year time avgvlyryouth;
375  run;

NOTE: There were 57534 observations read from the data set
      WORK.YOUTHAVGVL_YR.
NOTE: 10467 observations with duplicate key values were deleted.
NOTE: The data set WORK.YOUTHAVGVL_YR_SORTED has 47067 observations and
      19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.04 seconds
      cpu time            0.04 seconds


376  *When I ran the procedure without adding intercept or int after
376! random the output showed everything but with it some outputs where
377  cut out;
378  PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml
378! PLOTS(MAXPOINTS=NONE);
379      class sex_hars (ref="F") res_at_hiv_dx (ref="Urban")
379! race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref=">
379!  500") agegroup(ref="20-24");
380      model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx
380! race_combined hiv_risk baseline_cd4
381              agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx
381! ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4
382              ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx
382! ctimeyr*race_combined ctimeyr*hiv_risk
383              ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp =
383! pred_un ;
384      random intercept year/  Type= un subject=rfa_id;
385      *random year / type= un subject= rfa_id;
386      lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk
386! baseline_cd4 agegroup /diff;
387  title 'Mixed Model Test for Youth Unstructured';
388  RUN;

NOTE: 7898 observations are not included because of missing values.
NOTE: The data set WORK.PRED_UN has 0 observations and 0 variables.
WARNING: Data set WORK.PRED_UN was not replaced because new file is
         incomplete.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           1.21 seconds
      cpu time            0.92 seconds


389  ods rtf close;

jiltao · Posted 05-18-2021 01:11 PM

Your PROC MIXED program did not converge, and that is why your OUTP= data set is empty.

I wonder if your response variable has very large values. If so you might want to rescale it so the values are within a reasonable range. If you can send in your SAS data set I might take a look to see what I can recommend.
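
One way to confirm the convergence status directly is to capture it with ODS. This is a sketch under assumptions: the fixed effects are trimmed, conv_un is a hypothetical data set name, and NOITPRINT is dropped so the iteration history prints. If the procedure stops before iterating, the table may not be created at all.

```sas
/* Sketch: same RANDOM statement as in the log, fixed effects trimmed. */
ods output ConvergenceStatus=conv_un;        /* conv_un is a hypothetical name */
proc mixed data=youthavgvl_yr_sorted method=reml covtest;
    class sex_hars (ref="F") agegroup (ref="20-24");
    model avgvlyryouth = year sex_hars agegroup / solution;
    random intercept year / subject=rfa_id type=un;
run;

/* Inspect the captured table; a Status of 0 is expected to mean convergence. */
proc print data=conv_un;
run;
```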

cjacks21 · Posted 05-18-2021 03:29 PM

Hello,

Okay. I will look into that. The response variable is the viral load, which was converted to log10. I cannot send the data set, but I will take a look at what you suggested.

jiltao · Posted 05-18-2021 07:23 PM

Another thought: You might want to recode the values for year. I assume the year values are in the thousands? That can produce very small slope estimates and cause some convergence issues. See if you can recode the values to, for example, 1, 2, 3, 4, etc.
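
A minimal sketch of that recoding, assuming the first calendar year in the data is 1999 (so 1999 becomes 1). The names youth_recode and year_c are hypothetical.

```sas
/* Recode calendar year to 1, 2, 3, ... before fitting the model. */
data youth_recode;                     /* hypothetical output data set */
    set youthavgvl_yr_sorted;
    year_c = year - 1998;              /* assumes the first year is 1999 */
run;

/* Check the ranges of the time variables and the response. */
proc means data=youth_recode n min max;
    var year year_c avgvlyryouth;
run;
```

The recoded variable year_c would then replace year in both the MODEL and RANDOM statements, staying out of the CLASS statement so that it is treated as continuous.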

cjacks21 · Posted 06-23-2021 11:58 AM

@jiltao Hello,

So I am still working on this issue. I rounded the dependent variable to one decimal place. I also recoded the years (for example, 1999 became 1, then 2, 3, 4, etc.). It still did not work. The warning is gone, but the OUTP= data set is still empty, with no columns or rows.

Courtney Jacks