Fluorite | Level 6

Dear All,

I have longitudinal data and want to run multiple imputation on it. I am following the UCLA tutorial, which first converts the data from long form to wide form, imputes, and then converts the result back to long form. However, the tutorial has 6 visits per patient, whereas in my data each patient has a different number of visits: severe patients were followed every month, while moderate patients were followed every 3 months for a year. I would really appreciate any help.

VISIT   Frequency   Percent   Cumulative Frequency   Cumulative Percent
    1          61     20.68                     61                20.68
    2          53     17.97                    114                38.64
    3          46     15.59                    160                54.24
    4          45     15.25                    205                69.49
    5          45     15.25                    250                84.75
    6          45     15.25                    295               100.00

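/* Sort so that BY-group processing (first.subj / last.subj) works */
proc sort data = dp_miss;
  by subj visit;
run;

/* Reshape long to wide: one row per subject, yt1-yt6 and xt1-xt6 */
data wide;
  set dp_miss;
  by subj;
  array yt(6);
  array xt(6);
  retain yt xt;
  /* Reset all slots to missing at the start of each subject */
  if first.subj then do i = 1 to 6;
    yt(i) = .;
    xt(i) = .;
  end;
  /* Store this visit's values in the slot for that visit number */
  yt(visit) = y;
  xt(visit) = x1;
  /* Output only once per subject, after the last visit is read */
  if last.subj;
  drop visit y x1 i;
run;

proc print data = wide (obs=10) noobs;
run;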

Also, I will be running a cluster analysis on a few selected variables, so is it better to impute just those variables or the entire dataset?



Multiple Imputation in SAS, Part 2

Quartz | Level 8

Read the UCLA tutorial more carefully. Its longitudinal example data set also has a varying number of visits per individual. Although the imputation method the tutorial describes yields the same number of visits (the maximum, 6) for every individual, the tutorial shows how to use PROC SQL to recover the varying number of visits per individual found in the original data set. In general, including more variables in your imputation model is preferable to including fewer (Ref.: Rubin DB. Multiple imputation after 18+ years. Journal of the American Statistical Association 1996 June;91(434):473-489 [cf. section 2.6]).
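As a rough sketch of that idea (not necessarily the tutorial's exact code), something like the following might work. It assumes the wide data set from your DATA step above; the data set names wide_imp, long_imp and long_final, and the NIMPUTE= and SEED= values, are placeholders I made up for illustration.

```sas
/* Impute in wide form. nimpute= and seed= are arbitrary illustrative values. */
proc mi data = wide out = wide_imp nimpute = 5 seed = 12345;
  var yt1-yt6 xt1-xt6;
run;

/* Back to long form: 6 rows per subject per imputation */
data long_imp;
  set wide_imp;
  array yt(6);
  array xt(6);
  do visit = 1 to 6;
    y  = yt(visit);
    x1 = xt(visit);
    output;
  end;
  drop yt1-yt6 xt1-xt6;
run;

/* Keep only the subject-visit pairs that exist in the original long data */
proc sql;
  create table long_final as
  select i.*
  from long_imp as i
       inner join
       (select distinct subj, visit from dp_miss) as o
       on i.subj = o.subj and i.visit = o.visit
  order by i._Imputation_, i.subj, i.visit;
quit;
```

The inner join drops the rows for visits that a patient never had, so each imputed data set ends up with the same subject-visit structure as dp_miss, while values that were missing at visits that did occur stay imputed.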

Fluorite | Level 6

Thanks a lot. I will read it more carefully.

