Missing data on repeat measurements

Frequent Contributor
Posts: 86

I would like to follow up on missing data in repeated measurements.

Any help is much appreciated; thanks in advance.

 

 

Assume I measured blood pressure for 6 subjects at 3 time points: week 1, week 2, and week 3.
How can I deal with the missing values?

data bp;
input id age bp1 bp2 bp3;
datalines;
1 45 110 165 90
2 56 124 . .
3 60 142 . 137
4 61 . 94 120
5 39 130 130 .
6 55 157 130 124
;
run;

Phan S.  

Super User
Posts: 23,937

Re: Missing data on repeat measurements

Note that I've moved your post to the Base Programming forum; you've been posting in Community Matters.

 

To answer your question:

How can I deal with the missing value?

 

There are two components to this: one is how to program it, and the other is methodological. The appropriate method for filling in data often depends on the subject matter. Sometimes it is very inappropriate to impute missing data because it leads to misleading results, and sometimes it makes perfect sense, such as carrying stock prices forward over the weekend.

 

Once you have the answer to the methodological component, the technical aspects are easier to answer. Typing code is easy; figuring out what to type is hard :)

 

Some methods that come to mind are:

  • Nothing - leave missing as missing and use all available data
  • LOCF - last observation carried forward
  • MEAN/MEDIAN - replace with a statistical summary of the non-missing values
  • Regression - if you have enough data, you can run a single regression model and calculate predicted values
  • Modelling - build a model with just the available data, use it to predict the missing data, and then re-validate the model
  • Random - randomly assign values based on some general rules and then run a bunch of simulations to see the effect on your analysis

Which approach are you looking to implement?
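
As a rough illustration of the MEAN option above, here is a minimal sketch against the bp data set from the question. It shows two different readings of "mean of the non-missing values": the mean across subjects for each week, and the mean of each subject's own readings. The output data set names (bp_colmean, bp_rowmean) and the helper variables (row_mean, i) are made up for this example, and the first step assumes SAS/STAT is licensed, since PROC STDIZE lives there. Whether either kind of imputation is appropriate is still the methodological question.

```sas
/* Option A: replace each missing value with that week's mean across subjects. */
/* REPONLY replaces missing values only; nothing is standardized.             */
proc stdize data=bp out=bp_colmean reponly method=mean;
    var bp1-bp3;
run;

/* Option B: replace each missing value with the mean of the same subject's   */
/* non-missing readings. A subject with no readings at all stays missing.     */
data bp_rowmean;
    set bp;
    array bp_vals{3} bp1-bp3;
    row_mean = mean(of bp_vals{*});   /* mean() ignores missing arguments */
    do i = 1 to dim(bp_vals);
        if missing(bp_vals{i}) then bp_vals{i} = row_mean;
    end;
    drop i row_mean;
run;
```

With METHOD=MEDIAN in place of METHOD=MEAN, Option A would use the weekly median instead.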

 

PS. BP has two components, systolic and diastolic, and BP as a single value isn't meaningful as far as I understand it, but that's not particularly relevant here.

 




 

 

Frequent Contributor
Posts: 86

Re: Missing data on repeat measurements

Hello Reeza,

 

Thank you for your suggestions. 

For the methodology, I will either use only the available data, replace missing values with the mean/median, or use imputation (which I know a little about).

I am not familiar with LOCF, the random approach, or calculating predicted values.

 

Regarding programming, I would like to ask for your help. Assume I have a data set with data collected over 3 cycles on the same subjects.

I would like to know how many subjects participated in the study in each cycle. In other words, I would like to know how many subjects did not return in cycle 2 and cycle 3.

 

Thank you very much in advance, Reeza.

 

PS: BP -- you are absolutely correct, systolic and diastolic BP. It was a fake number. 

 

Phan S.  

 

Frequent Contributor
Posts: 112

Re: Missing data on repeat measurements

I have created a sample macro; I hope it serves the purpose. It carries the previous measurement forward (LOCF) whenever a later one is missing.

 

I have assumed that the bp1 column is never blank. Please let me know if that is not correct.

 

data bp;
input id age bp1 bp2 bp3;
datalines;
1 45 110 165 90
2 56 124 . .
3 60 142 . 137
4 61 . 94 120
5 39 130 130 .
6 55 157 130 124
;
run;

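%macro bp_fix(number_of_bp=);
%local i prev;
data final_bp;
set bp;
/* LOCF: start at the second measurement and, when it is missing, */
/* copy the value from the measurement just before it.            */
%do i=2 %to &number_of_bp.;
    %let prev=%eval(&i - 1);
    if bp&i = . then bp&i = bp&prev;
%end;
run;
%mend bp_fix;

/*In this case number_of_bp=3 so*/
%bp_fix(number_of_bp=3);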

This can be used for more than 3 columns.
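
On the follow-up question about how many subjects took part in each cycle, a minimal sketch is below. It assumes the data stay in the wide layout of the bp data set (one column per cycle) and that a missing value means the subject did not return for that cycle; neither assumption has been confirmed in this thread.

```sas
/* N     = number of subjects with a recorded value in each cycle */
/* NMISS = number of subjects with no value in that cycle         */
proc means data=bp n nmiss;
    var bp1-bp3;
run;
```

For the posted data this should be run against bp rather than final_bp, since final_bp has the missing values filled in.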

Frequent Contributor
Posts: 86

Re: Missing data on repeat measurements

Posted in reply to Satish_Parida

Hello Satish,

 

Your code does not appear to work, and I am not sure why. Could you retest it with a mock data set?

 

Thank you.

 

Phan S. 

 
