BenBine
Calcite | Level 5

Hello,

 

I am new to statistics and even newer to programming (I study health sciences). I am working on a project with longitudinal data: 7 repeated time measures (YEARS0-YEARS6). The outcome is the MWT (MWT0-MWT6), measured at each of the 7 time points for every person (ID). I also have multiple predictors that I will add to the model later to see whether they affect the outcome over time (I did not include the predictors in the sample).

For this, I wish to fit an HLM (hierarchical linear model). I fit one HLM with YEARS and another with YEARS^2 using PROC MIXED to compare which model fits my data best.

 

Here are the PROC MIXED models I have run so far:

proc mixed data=MWTDATA method=ML covtest noclprint;
   class ID;
   id ID YEARS MWT;   /* variables carried into the OUTP= data set */
   model MWT = YEARS / s ddfm=KR /* OUTP=PMWT (keep=ID YEARS MWT PRED) --> RUN LATER */;
   random int YEARS / subject=ID g type=UN;   /* random intercept and slope per ID */
run;

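/* YEARS2 must be created in a DATA step before PROC MIXED can use it */
data MWTDATA;
   set MWTDATA;
   YEARS2 = YEARS*YEARS;
run;

proc mixed data=MWTDATA method=ML covtest noclprint ic;   /* ML in both models so their fit statistics are comparable */
   class ID;
   id ID MWT YEARS;
   model MWT = YEARS YEARS2 / s ddfm=KR notest;
   random int YEARS YEARS2 / subject=ID g type=UN;
run;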

 

However, I expect my data to follow an exponential curve, since the outcome should gradually increase over time and reach a plateau (based on clinical experience). Hence, I would like to fit an exponential growth curve, which I think requires PROC NLMIXED or %NLINMIX. Unfortunately, I can't figure out how to fit my data with PROC NLMIXED using an exponential equation, which I suppose should look like this:

MWT_ij = µ_0i * exp(µ_1i * YEARS_ij) + ε_ij
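A minimal sketch of how this equation might be written in PROC NLMIXED, with a random baseline and a random rate for each ID. The parameter names (b0, b1, s2u0, cu01, s2u1, s2e) and every starting value are assumptions chosen for illustration, not estimates from this data, and convergence is not guaranteed. Note that this curve keeps growing whenever the rate is positive, so it does not by itself produce a plateau.

```sas
/* Sketch only: parameter names and starting values are assumptions. */
proc sort data=MWTDATA;      /* NLMIXED expects each subject's rows to be contiguous */
   by ID;
run;

proc nlmixed data=MWTDATA;
   parms b0=16 b1=0.01 s2u0=4 cu01=0 s2u1=0.001 s2e=2;   /* guessed starting values */
   bounds s2u0 >= 0, s2u1 >= 0, s2e > 0;                  /* keep variances non-negative */
   mu0  = b0 + u0;                       /* subject-specific level at YEARS = 0 */
   mu1  = b1 + u1;                       /* subject-specific rate */
   pred = mu0 * exp(mu1 * YEARS);        /* expected MWT for this row */
   model MWT ~ normal(pred, s2e);        /* s2e is the residual variance */
   random u0 u1 ~ normal([0, 0], [s2u0, cu01, s2u1]) subject=ID;
run;
```

Rows with a missing YEARS or MWT are dropped from the fit, so subjects with only a baseline visit contribute a single observation.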

 Here is a sample of my data:

data work.MWTDATA;
  infile datalines dsd truncover;
  input ID:BEST12. VISIT:32. MONTH:32. YEARS:32. MWT:32.;
  format ID BEST12.;
datalines4;
9000099,1,,0,15.12
9000099,2,,1.01,15.34
9000099,3,,2.09,14.75
9000099,4,,3.03,15.55
9000099,5,,4.06,15.51
9000099,6,,6.01,15
9000099,7,,7.88,15.87
9000622,1,,0,13.75
9000622,2,,1.01,12.65
9000622,3,,,
9000622,4,,,
9000622,5,,,
9000622,6,,,
9000622,7,,,
9000798,1,,0,18.47
9000798,2,,1.15,16.31
9000798,3,,2.3,16.73
9000798,4,,3.2,17.13
9000798,5,,4.12,16.09
9000798,6,,5.96,17.59
9000798,7,,8.02,18.16
9002116,1,,0,16.32
9002116,2,,1.06,14.12
9002116,3,,2.07,19.41
9002116,4,,2.99,22.01
9002116,5,,3.99,18.01
9002116,6,,6.02,16.6
9002116,7,,8.01,19.84
9003380,1,,0,13.26
9003380,2,,1.02,12.91
9003380,3,,2.04,12.54
9003380,4,,3.05,14.31
9003380,5,,4.05,13.53
9003380,6,,5.95,14.61
9003380,7,,7.84,15.65
9003406,1,,0,18.8
9003406,2,,1.36,16.78
9003406,3,,1.93,19.35
9003406,4,,3.22,17.18
9003406,5,,4.01,19.03
9003406,6,,5.98,16.94
9003406,7,,7.84,19.79
9004184,1,,0,28.52
9004184,2,,,
9004184,3,,,
9004184,4,,,
9004184,5,,,
9004184,6,,,
9004184,7,,,
9004905,1,,0,17.63
9004905,2,,1.33,18.39
9004905,3,,1.94,19.57
9004905,4,,2.94,19.81
9004905,5,,3.97,18.93
9004905,6,,5.99,14.93
9004905,7,,8.02,20.69
9005132,1,,0,16.56
9005132,2,,1.01,16.78
9005132,3,,2,15.37
9005132,4,,,
9005132,5,,,
9005132,6,,,
9005132,7,,,
9005656,1,,0,15.29
9005656,2,,1.45,14.15
9005656,3,,,
9005656,4,,,
9005656,5,,4.2,
9005656,6,,,
9005656,7,,,
9007827,1,,0,17.53
9007827,2,,1.2,16.54
9007827,3,,1.95,15.47
9007827,4,,2.95,15.89
9007827,5,,3.91,15.72
9007827,6,,6.08,17.93
9007827,7,,7.96,19.62
9008934,1,,0,13.5
9008934,2,,1.12,12.81
9008934,3,,2.13,14.69
9008934,4,,3.34,13
9008934,5,,4.15,14.11
9008934,6,,5.99,14.03
9008934,7,,8.02,13.32
9009623,1,,0,18.82
9009623,2,,0.95,17.29
9009623,3,,,
9009623,4,,,
9009623,5,,,
9009623,6,,,
9009623,7,,,
9011918,1,,0,14.82
9011918,2,,1.09,14.81
9011918,3,,,
9011918,4,,,
9011918,5,,,
9011918,6,,,
9011918,7,,,
9013161,1,,0,17.82
9013161,2,,1.42,
;;;;

I figured I'd ask for help since this might not be so hard for some of you experts :)

I am using SAS 9.4 for your reference!

Thank You!

ballardw
Super User

Many users here don't want to download Excel files because of the virus risk, and others have such files blocked by security software. Also, if you give us Excel, we have to create a SAS data set ourselves, and because Excel places no constraints on what a cell can hold, the result may not have the same variable types (numeric or character) or even the same values as yours.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

Rick_SAS
SAS Super FREQ

Welcome to the wonderful world of statistics and data analysis with SAS. An HLM Model will be a tough first project for someone who is new to statistics and SAS.

 

 

The first thing you should do is graph your data:

proc sgplot data=MWTDATA;
series x=years y=MWT / group=ID;
run;

I don't see any evidence of exponential behavior.
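With many IDs the individual lines overlap heavily. One possible variant of the plot above, offered as a sketch rather than the approach used in this reply, fades the per-subject lines and overlays a loess smoother so the overall trend is easier to see. The colors, thickness, and transparency value are arbitrary choices.

```sas
proc sgplot data=MWTDATA noautolegend;
   /* one faint gray line per subject */
   series x=YEARS y=MWT / group=ID transparency=0.7
                          lineattrs=(color=gray pattern=solid);
   /* smoothed overall trend across all subjects */
   loess  x=YEARS y=MWT / nomarkers lineattrs=(color=red thickness=3);
run;
```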

 

If you decide to pursue the exponential model, you might not need to use NLMIXED. The model you state is a generalized linear model with a log link function: 

model MWT = years / dist=normal link=log;

For a discussion of this model, see "Error distributions and exponential regression models."
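For context, here is a sketch of how that MODEL statement might sit inside a procedure that also accepts random effects. PROC GLIMMIX is an assumption here, since the statement above is shown without a procedure, and the RANDOM statement simply mirrors the one in the original PROC MIXED code.

```sas
proc glimmix data=MWTDATA;
   class ID;
   /* log link: E[MWT | ID] = exp(linear predictor) */
   model MWT = YEARS / dist=normal link=log solution;
   /* random intercept and slope per ID, on the log scale */
   random intercept YEARS / subject=ID type=un;
run;
```

Because the random effects enter on the log scale, the subject-specific multiplier in the exponential equation corresponds to exp(intercept + random intercept) rather than to the intercept itself.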

 

 

BenBine
Calcite | Level 5

Thank you for your response Rick,

I did graph my data in subsets of 10 IDs per graph and 100 IDs per graph to look for a trend. However, the data vary a lot (some subjects show a linear pattern, some increase over time, some decrease over time, some look exponential, etc.). I have 1390 subjects, so I am not sure how to choose a model just by looking at general trends when the data vary this much.

For this reason, I decided to fit different models (linear and quadratic HLM, as shown above) and see which one fits best based on the -2LL/DoF/Chi-Square and BIC. I also considered the fact that this is clinical data and that patients in this study should be getting worse over time (MWT increasing). However, I am not sure whether that will happen in a quadratic, exponential, negative exponential, or logistic pattern.

That being said, if I look at my data I would say that there is not much change, and that I will get significance either way because my sample size is so large (1390 IDs, or 4797 observations to be exact).

So here are my questions:

1. Should I keep a simpler model such as quadratic given that graphing my data does not seem to point towards a specific model?

2. How can I account for or prevent false significance due to large sample size?

3. It seems to me that the link you provided shows a way of plotting exponential curves but without using PROC MIXED, PROC GLIMMIX or PROC NLMIXED, which I think are designed for HLM models. Would the method described in the link still be considered an HLM?

 

Also, I cannot diverge from the HLM at my supervisor's request, so any statistical procedure needs to stay within the HLM framework.

 

Thank you very much!

 

 

Rick_SAS
SAS Super FREQ

> 1. Should I keep a simpler model such as quadratic given that graphing my data does not seem to point towards a specific model?

I prefer to start with a simple model.

 

> 2. How can I account for or prevent false significance due to large sample size?

I personally wouldn't worry about this at first. If you develop several competing models, some people use cross-validation.

 

> 3. It seems to me that the link you provided shows a way of plotting exponential curves but without using PROC MIXED, PROC GLIMMIX or PROC NLMIXED, which I think are designed for HLM models. Would the method described in the link still be considered an HLM?

No. It was a toy example to show the difference between modeling log(Y) and modeling Y with a log link.

 

> Also I cannot diverge from the HLM model as per request from my supervisor

OK, that's fine. Good luck.
