hellorc
Obsidian | Level 7

.

 

1 REPLY 1
ballardw
Super User

"1) Why are there sometimes two rows for the same subject id in same group and year?"

Documentation of the study's data collection is about the only way to know. It could be that "id" is not a per-person identifier but a location or some other shared factor. I won't guess about that. If you really don't have any documentation, I would be strongly tempted to say the data should not be used.

 

"2) How should we handle those subjects with "paired sample"?"

Without knowing exactly how that collection was done, I wouldn't make any recommendation at all. With a "paired sample" I would expect to see something else, like a "before/after" indicator of which side of the pair each value is from, as that is important for most paired-sample analyses. Repeated measures are not the same as paired samples, so this may actually be a repeated-measures problem.

 

"3) About the cytokine level, how do we deal with the "<4" for variable IL10 for example?"

If you want to use those records for much, I would impute some "reasonable" value. I do not know the field, the instrumentation used, or its sensitivity, so I have no idea whether it might be best to treat those as 4, 2 or 0.00001. Once you pick what that value should be, use a custom informat to read the data so that the assignment happens as the data is read.

Here is an example that reads the value "<4" as the numeric value 1. The other=[8.] means that all other values are read with the standard 8. numeric informat (a field up to 8 characters wide).

proc format;
/* Custom numeric informat: the text "<4" becomes 1, anything else is read with 8. */
invalue lessthanfour
'<4'=1
other=[8.]
;
run;

data have;
/* One observation per data line, so no trailing @@ is needed. */
input id group $ year $ IL10 :lessthanfour. IL11 diabetes;
datalines;
1 control 2018 <4 5.67 1
1 control 2018 <4 15.80 1
1 control 2019 6.11 4.43 1 
1 control 2019 7.21 4.43 1
2 treatment 2018 6.33 7.31 0
2 treatment 2018 6.33 10.32 0 
2 treatment 2019 11.55 7.32 0 
2 treatment 2019 12.34 7.32 0 
3 treatment 2018 6.33 10.32 0 
3 treatment 2019 <4 8.61 1 
3 treatment 2019 <4 6.77 1 
;
run;
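
Before going further, it may help to see which id/group/year combinations actually have more than one row, since that is what question 1 is about. The following is only a sketch that assumes the `have` data set created above; the data set name `row_counts` is made up for illustration, and the count says nothing about why the extra rows exist.

```sas
/* Count the rows for each id/group/year combination in HAVE. */
proc freq data=have noprint;
   tables id*group*year / out=row_counts;  /* OUT= data set gets a COUNT variable */
run;

/* List only the combinations that occur more than once. */
proc print data=row_counts noobs;
   where count > 1;
   var id group year count;
run;
```

In the example data every combination has two rows except id 3, treatment, 2018, which has one. That pattern is worth checking against whatever study documentation can be found.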

"4) It looks like a longitudinal study to me, so I believe we should be using Proc Mixed?"

Without documentation of how the data was collected I would remember the saying about "Assume makes an a$$ out of u and me".

 

Maybe I'm too conservative, but if you are doing real work that can affect people's lives, making too many assumptions about what the data means without documentation is dangerous.
