ZC
Calcite | Level 5

Hello,

I have a stacked data set, mydata, with the structure below:

1. some subjects are measured repeatedly and others are not;

2. sbp_change = sbp_v2 - sbp_v1;

id  group  sbp_v1  sbp_v2  sbp_change
1   1      113     116      3
1   1      110     119      9
2   1      124     115     -9
3   2      125     126      1
3   2      126     134      8
...

%macro mean (a, b, c, d, e, f, g, h);
PROC MIXED DATA = mydata METHOD=ML ABSOLUTE CONVF = 0.00001 NOCLPRINT;
   CLASS ID group;
   MODEL &a = group / NOINT;
   LSMEANS group / CL;
   REPEATED INT / TYPE = CS SUBJECT = ID;
RUN;
%mend;

%mean (sbp_v1)
%mean (sbp_v2)
%mean (sbp_change)

Theoretically, the lsmean of sbp_change should equal the difference between the lsmean of sbp_v2 and the lsmean of sbp_v1. But my analysis results show that they are not equal. What is the reason?

Thanks,

6 REPLIES
ballardw
Super User

Any chance that sbp_v1 and sbp_v2 have missing values? If so, is only one of them missing on some records?

Your sbp_change would be missing whenever either of the other two is missing, so each model could be using a different number of records.
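A minimal sketch of one way to check this, assuming the data set and variable names from the original post (mydata, sbp_v1, sbp_v2, sbp_change). The names miss_pattern and pattern are hypothetical and introduced only for this illustration.

```sas
/* N and NMISS per outcome: unequal N would mean the three models use different records */
proc means data=mydata n nmiss;
   var sbp_v1 sbp_v2 sbp_change;
run;

/* Build a three-character flag per record, e.g. 000 = all present, 010 = only sbp_v2 missing */
data miss_pattern;   /* hypothetical output data set name */
   set mydata;
   pattern = cats(missing(sbp_v1), missing(sbp_v2), missing(sbp_change));
run;

/* Tabulate how often each missingness pattern occurs */
proc freq data=miss_pattern;
   tables pattern;
run;
```

If only the patterns 000 and 111 appear, the three variables are always missing together and all three models are fitted to the same records.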

ZC
Calcite | Level 5

Thanks for your response. In my data, all three variables (sbp_v1, sbp_v2 and sbp_change) are either all missing or all non-missing on any given record.

SteveDenham
Jade | Level 19

Well, that shoots down my major idea as to what was causing the non-equality.

Can you share the lsmeans (and standard errors) for each of the three variables?

ZC
Calcite | Level 5

Results from the current SAS syntax:

                mean (SE)
sbp_v1:         110.18 (0.42)
sbp_v2:         111.68 (0.58)
sbp_change:       0.67 (0.47)

If I remove "subject = id" from the current SAS syntax, the lsmean of sbp_change equals the difference between the lsmeans of sbp_v2 and sbp_v1:

                mean (SE)
sbp_v1:         109.43 (0.48)
sbp_v2:         110.43 (0.63)
sbp_change:       1.00 (0.51)

SteveDenham
Jade | Level 19

Then the reason must lie in the REPEATED statement and the implied ordering within subject. The off-diagonal correlations aren't the same, but I was under the impression that this should affect only the standard errors. Evidently the marginal and conditional estimates are not the same: without SUBJECT=ID the estimates are marginal; with SUBJECT=ID they are conditional on the ordering. (I THINK.) To check this, you could reorder the observations within subject and see whether it has any effect.
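A minimal sketch of that reordering check, assuming the data set mydata and the subject variable id from the original post. The names shuffled and _u and the seed value are hypothetical choices made only for this illustration.

```sas
/* Attach a random sort key to every record */
data shuffled;               /* hypothetical output data set name */
   set mydata;
   call streaminit(12345);   /* arbitrary seed so the shuffle is reproducible */
   _u = rand('uniform');     /* hypothetical helper variable used as the sort key */
run;

/* Sorting by id and then by the random key keeps each subject's records
   together but permutes their order within the subject */
proc sort data=shuffled;
   by id _u;
run;
```

The PROC MIXED step would then be rerun with DATA = shuffled. The macro as posted hard-codes DATA = mydata, so that reference would need to be changed for the comparison.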

What happens if you replace the REPEATED with RANDOM?  The code would look like:

PROC MIXED DATA = mydata METHOD=ML ABSOLUTE CONVF = 0.00001 NOCLPRINT;
   CLASS ID group;
   MODEL &a = group / NOINT;
   LSMEANS group / CL;
   RANDOM INT / TYPE = CS SUBJECT = ID;
RUN;

Message was edited by: Steve Denham

deb193
Fluorite | Level 6

If the net result of missing data is that not all subjects have the same number of observations, I think this alone can produce different LSMEANS.
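A short derivation may help show why unequal numbers of observations matter. It is a sketch under stated assumptions: a single group, subjects i = 1, ..., m with n_i records each, and a compound-symmetry covariance with a common variance and intraclass correlation \rho. The symbols n_i, \bar{y}_i, w_i and \rho are notation introduced here, not taken from the posts above.

```latex
% Generalized least squares estimate of one group mean under compound symmetry
\hat{\mu}(\rho) \;=\; \frac{\sum_{i=1}^{m} w_i(\rho)\,\bar{y}_i}{\sum_{i=1}^{m} w_i(\rho)},
\qquad
w_i(\rho) \;=\; \frac{n_i}{1 + (n_i - 1)\,\rho}.

% With \rho = 0 the weights are w_i = n_i, so \hat{\mu} is the plain mean over records:
\hat{\mu}(0) \;=\; \frac{\sum_{i} n_i\,\bar{y}_i}{\sum_{i} n_i}.

% The three models estimate their own correlations \rho_1, \rho_2, \rho_c, so in general
\hat{\mu}_{\text{change}}(\rho_c)
\;=\; \frac{\sum_i w_i(\rho_c)\,\bigl(\bar{y}_{2,i} - \bar{y}_{1,i}\bigr)}{\sum_i w_i(\rho_c)}
\;\neq\;
\hat{\mu}_{2}(\rho_2) - \hat{\mu}_{1}(\rho_1).
```

If every subject had the same n_i, the weights would cancel and the equality would hold for any \rho. With unequal n_i it holds only when the three estimated correlations coincide. The plain-mean case \rho = 0 is linear in the data, which is consistent with the results reported above without SUBJECT=ID, where 110.43 - 109.43 = 1.00.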
