TZaihra
Fluorite | Level 6

Dear All,

I have longitudinal data, and in order to run multiple imputation on it I am following the UCLA tutorial, which first converts the data from long form to wide form, then imputes, and then converts it back to long form. However, they have 6 visits per patient, whereas in my case every patient has a different number of visits: severe patients were followed every month, while moderate patients were followed every 3 months for a year. I would really appreciate any help.

VISIT   Frequency   Percent   Cumulative Frequency   Cumulative Percent
-----------------------------------------------------------------------
  1        61        20.68             61                  20.68
  2        53        17.97            114                  38.64
  3        46        15.59            160                  54.24
  4        45        15.25            205                  69.49
  5        45        15.25            250                  84.75
  6        45        15.25            295                 100.00


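/* Sort so BY-group processing sees each subject's visits together */
proc sort data = dp_miss;
  by subj visit;
run;

/* Reshape long to wide: one row per subject with yt1-yt6 and xt1-xt6 */
data wide;
  set dp_miss;
  array yt(6);
  array xt(6);
  by subj;
  retain yt xt;                /* carry values across a subject's rows */
  if first.subj then do i = 1 to 6;
    yt(i) = .;                 /* reset so unobserved visits stay missing */
    xt(i) = .;
  end;
  yt(visit) = y;
  xt(visit) = x1;
  if last.subj;                /* output one completed row per subject */
  drop visit y x1 i;
run;

proc print data = wide (obs=10) noobs;
run;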

Also, I will be running a cluster analysis on a few selected variables, so is it better to impute just those variables or the entire dataset?

Thanks

Tasneem

Multiple Imputation in SAS, Part 2

2 REPLIES
1zmm
Quartz | Level 8

Read the documentation of the UCLA tutorial more carefully. Its longitudinal data set example also has a variable number of visits per individual. Although the imputation method that the tutorial describes yields the same number (maximum = 6) of visits per individual, the tutorial shows how to use PROC SQL to recover the original, varying number of visits per individual, as in the original data set.

In general, including more variables in your imputation model is preferable to including fewer (Ref.: Rubin DB. Multiple imputation after 18+ years. Journal of the American Statistical Association 1996 June;91(434):473-489 [cf. section 2.6]).
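For readers who want to see the whole round trip, here is a minimal sketch of the idea described above: impute in wide form, reshape back to long, and then keep only the subject-visit pairs that exist in the original data. It is not the tutorial's own code. The simulated dp_miss data, the output names mi_wide, mi_long, and mi_final, the seed, and the choice of 5 imputations are all assumptions made for illustration; the variable names subj, visit, y, and x1 follow the question.

```sas
/* Hypothetical data: 60 subjects; every 4th subject has 3 visits, the rest 6 */
data dp_miss;
  call streaminit(2024);
  do subj = 1 to 60;
    nvisit = ifn(mod(subj, 4) = 0, 3, 6);
    do visit = 1 to nvisit;
      x1 = rand('normal', 10, 2);
      y  = 2 + 0.5 * x1 + rand('normal');
      if rand('uniform') < 0.15 then y  = .;   /* sporadic missing values */
      if rand('uniform') < 0.10 then x1 = .;
      output;
    end;
  end;
  drop nvisit;
run;

/* Long to wide, same logic as in the question */
data wide;
  set dp_miss;
  array yt(6);
  array xt(6);
  by subj;
  retain yt xt;
  if first.subj then do i = 1 to 6;
    yt(i) = .;
    xt(i) = .;
  end;
  yt(visit) = y;
  xt(visit) = x1;
  if last.subj;
  drop visit y x1 i;
run;

/* Impute in wide form; this also fills in visits that never took place */
proc mi data = wide nimpute = 5 seed = 12345 out = mi_wide;
  var yt1-yt6 xt1-xt6;
run;

/* Wide back to long: 6 rows per subject within each imputation */
data mi_long;
  set mi_wide;
  array yt(6);
  array xt(6);
  do visit = 1 to 6;
    y  = yt(visit);
    x1 = xt(visit);
    output;
  end;
  keep _imputation_ subj visit y x1;
run;

/* Keep only the subject-visit pairs present in the original long data */
proc sql;
  create table mi_final as
  select m.*
  from mi_long as m
       inner join
       (select distinct subj, visit from dp_miss) as o
       on m.subj = o.subj and m.visit = o.visit
  order by m._imputation_, m.subj, m.visit;
quit;
```

The inner join is what restores the varying number of visits: values imputed for visits that never occurred are dropped, so each imputed copy in mi_final has the same rows as dp_miss.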

TZaihra
Fluorite | Level 6

Thanks a lot. I will read it more carefully.
