Hi, I am exploring the slopes of the z-scores of two measures (e.g., weight and height).
I obtained a slope for each measure, and I want to know whether the two coefficients differ significantly.
I can see that their 95% CIs do not overlap.
Measure | Estimate | Standard Error | DF | t Value | Pr > |t| | Alpha | Lower | Upper |
weight | -0.02 | 0.002 | 1440 | -12.79 | <.0001 | 0.05 | -0.023 | -0.017 |
height | -0.031 | 0.002 | 1546 | -17.22 | <.0001 | 0.05 | -0.034 | -0.027 |
Which procedure could I use?
Based on what you've told me, I don't think comparing your analyses is meaningful. If I understand you correctly, you have the same subjects in two different data sets. In each data set, you rescale the response variable before you perform a regression, based only on the values in that data set.
I think the standard way to analyze the longitudinal data is to have one data set with variables
SubjectID, Y, timepoint, weight, height, ...
Are these two different models you fit, or one model with two x-variables?
Did you do this in PROC REG, or PROC GLM, or elsewhere?
Assuming this is one model with two x-variables:
In PROC REG, you can use the TEST statement
In PROC GLM, there is a TEST statement, but it does different things than PROC REG. In PROC GLM, you want the CONTRAST statement: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statu...
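A minimal sketch of both options, assuming this really is one model with two x-variables. The data set name HAVE and the response y are hypothetical; weight and height stand in for the two predictors.

```sas
/* Hypothetical data set HAVE with response y and predictors weight and height */
proc reg data=have;
  model y = weight height;
  /* H0: the coefficients of weight and height are equal */
  EqualSlopes: test weight = height;
run;
quit;

proc glm data=have;
  model y = weight height / solution;
  /* Same hypothesis written as a contrast: weight - height = 0 */
  contrast 'weight vs height' weight 1 height -1;
run;
quit;
```

Both tests compare coefficients within a single fitted model, so they do not apply to slopes taken from two separately fitted models.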
Also, comparing confidence intervals to see if they overlap isn't the correct thing to do.
They are from different models. I got the results from the mixed-effect model.
If these are slopes from two different models, I am not aware of any way to statistically compare the slopes. In fact, the idea of statistically comparing slopes from two different variables from two different fitted models doesn't really make sense to me.
Thanks for the quick reply.
Since we used the Z score, we thought they might be comparable.
You say you "use the Z score." What does that mean? Did you standardize each variable by subtracting the mean and then divide by the standard deviation?
Please post your SAS code and we'll be able to answer your questions with more confidence.
data atl_two_bmi_3;
  set atl_two_bmi_3;
  if sex=0 then zscore=(col1-26.208)/3.320;
  if sex=1 then zscore=(col1-25.552)/4.143;
run;

proc mixed data=atl_two_bmi_3 order=formatted;
  class t;
  model zscore=timepoint / outp=predbmi solution ddfm=kr;
  repeated t / subject=l_pnr type=un;
  estimate "slope for the total population" timepoint 1 / cl;
run;

data atl_two_measure2_3;
  set atl_two_measure2_3;
  if sex=0 then zscore=(col1-37.226)/3.086;
  if sex=1 then zscore=(col1-35.878)/3.356;
run;

proc mixed data=atl_two_measure2_3 order=formatted;
  class t;
  model zscore=timepoint / outp=predmeasure2 solution ddfm=kr;
  repeated t / subject=l_pnr type=un;
  estimate "slope for the total population" timepoint 1 / cl;
run;
Hi, I did the z score using the mean and std of each measure.
Then I used the mixed effect model to get the slope.
Thanks.
So this is not what you described earlier. This is the same variable in two different models using two different data sets (earlier you had two different variables).
In that case, I would combine the data sets into one (if that makes sense from a subject matter expertise standpoint), create an indicator variable to indicate which data set it comes from, then test the slopes in one model.
Something like this:
proc mixed data=combined_data_set;
class t indicator_variable;
model zscore=indicator_variable timepoint
indicator_variable*timepoint/ outp=predbmi solution ddfm=kr;
repeated t /subject=l_pnr type=un;
run;
Then the interaction indicator_variable*timepoint tests whether or not the slopes of timepoint are the same across the two data sets.
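One way the combined data set might be built is sketched below. It assumes the two data sets from the posted code can be stacked as they are; the IN= flag name from_bmi and the values 'bmi' and 'measure2' are illustrative choices, not anything from the original post.

```sas
/* Stack the two data sets; the IN= flag records which source each row came from */
data combined_data_set;
  set atl_two_bmi_3(in=from_bmi)
      atl_two_measure2_3;
  length indicator_variable $ 8;
  if from_bmi then indicator_variable = 'bmi';
  else indicator_variable = 'measure2';
run;
```

If the same l_pnr appears in both sources, each subject would have two rows per level of t, so the SUBJECT= effect in the REPEATED statement may need to change. That depends on the data and is not settled here.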
According to your code, the data used for the first model is from a different data set than the second model.
Are the data from the same subjects at the same time points in the same order? In other words, does the i_th row in DataSet1 correspond to the i_th row of DataSet2 for every row?
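A rough way to check this is sketched below, assuming both data sets carry the l_pnr and timepoint variables used in the posted PROC MIXED code. The output names bmi_sorted, muscle_sorted, and match_check are hypothetical.

```sas
/* Sort copies of both data sets by subject and time point */
proc sort data=atl_two_bmi_3 out=bmi_sorted;
  by l_pnr timepoint;
run;

proc sort data=atl_two_measure2_3 out=muscle_sorted;
  by l_pnr timepoint;
run;

/* Merge on subject and time point; the IN= flags show where each pair exists */
data match_check;
  merge bmi_sorted(in=in_bmi keep=l_pnr timepoint)
        muscle_sorted(in=in_muscle keep=l_pnr timepoint);
  by l_pnr timepoint;
  length source $ 12;
  if in_bmi and in_muscle then source = 'both';
  else if in_bmi then source = 'BMI only';
  else source = 'muscle only';
run;

/* Count how many subject/time point pairs fall in each category */
proc freq data=match_check;
  tables source;
run;
```

If every pair lands in 'both', the two data sets cover the same subjects at the same time points; any 'BMI only' or 'muscle only' rows mean they do not line up row for row.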
Hi,
Thanks for the reply.
The datasets are from the same population. But for the same subject, one might have BMI at time1 but not have Muscle measure at time1.
I have provided the first 10 lines of the BMI and muscle datasets.
Dataset for BMI z score
ID | BMI | zscore | timepoint
1 | 24.7409 | -0.19577 | 0
1 | 24.7409 | -0.19577 | 6
1 | 25.7117 | 0.03854 | 12
3 | 24.8016 | -0.18113 | 0
3 | 21.6128 | -0.95081 | 12
4 | 22.8625 | -0.64916 | 0
4 | 23.6652 | -0.45541 | 6
4 | 24.2424 | -0.31609 | 12
6 | 26.5021 | 0.22933 | 0

Dataset for muscle mass z score
ID | Muscle mass | zscore | timepoint
1 | 36 | 0.03635 | 0
1 | 39 | 0.93027 | 6
1 | 27 | -2.64541 | 12
3 | 38 | 0.6323 | 0
3 | 36 | 0.03635 | 6
3 | 37 | 0.33433 | 12
4 | 37 | 0.33433 | 0
4 | 38 | 0.6323 | 12
6 | 35 | -0.26162 | 0
So again, this is different from anything you have described above. Now you have the same X variable each time (timepoint) but two different Y variables, despite the fact that you call them by the same variable name, ZSCORE.
As far as I know, you would have to do this with two different models, and I go back to my earlier statement: there is no statistically meaningful way to do this comparison.
@Jie111 wrote:
Hi, I did the z score using the mean and std of each measure.
Then I used the mixed effect model to get the slope.
Thanks.
Depending on exactly how you got the values to create the z-score in this code:
data atl_two_bmi_3;
  set atl_two_bmi_3;
  if sex=0 then zscore=(col1-26.208)/3.320;
  if sex=1 then zscore=(col1-25.552)/4.143;
run;
You may have been able to accomplish the same thing from your raw data by 1) sorting by sex (as a BY statement is involved) and 2) using something like:
proc stdize data=have out=standardized;
  by sex;
  var <variables of interest>;
run;
By default, PROC STDIZE uses METHOD=STD, which standardizes with the mean and standard deviation just as your data step does, but without needing another procedure to compute the mean and standard deviation followed by a data step. There may also be less "error" in the standardization, because the mean and std values are not rounded.
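Applied to the BMI data from the posted code, the suggestion might look like the sketch below. It assumes the constants in the data step were the by-sex mean and standard deviation of col1 over all rows of atl_two_bmi_3; if they came from somewhere else (a reference population or baseline visits only, say), the results will differ. The output names bmi_sorted and bmi_standardized are hypothetical.

```sas
/* BY-group processing needs the data sorted by sex */
proc sort data=atl_two_bmi_3 out=bmi_sorted;
  by sex;
run;

/* By default the VAR variables are replaced, so col1 in the output holds the z-scores */
proc stdize data=bmi_sorted out=bmi_standardized method=std;
  by sex;
  var col1;
run;

/* Check: within each sex the mean should be about 0 and the std about 1 */
proc means data=bmi_standardized mean std;
  class sex;
  var col1;
run;
```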
I also suggest that you do not use the
Data somesetname;
set somesetname;
until you are pretty experienced with SAS. Using the same set name as source and output completely overwrites the data set. So if you have an error, like misspelling a variable, or using the wrong variable because of similar names, you can effectively either destroy your values or create errors that may take hours if not days to trace down as to the exact cause.
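A safer version of the posted data step writes to a new data set and leaves the source untouched. The output name atl_two_bmi_z is only illustrative.

```sas
/* Read from the original data set and write to a NEW one (name is illustrative) */
data atl_two_bmi_z;
  set atl_two_bmi_3;
  if sex=0 then zscore=(col1-26.208)/3.320;
  else if sex=1 then zscore=(col1-25.552)/4.143;
run;
```

If the step contains a mistake, atl_two_bmi_3 is still intact and the step can simply be corrected and rerun.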