Hi there,
I need your kind help to replace the missing value of a column with the previous date's value where as in case of missing first observation, the value will be replaced by next available value.
data have ;
format date date9.;
input id $ LHIN 2. date date9. pulse 3. ;
datalines;
101 1 01JAN2017 90
101 1 03JAN2017 80
101 1 05JAN2017
102 2 01JAN2017 90
102 2 03JAN2017
102 3 05JAN2017 70
103 2 03JAN2017
103 3 05JAN2017 70
103 3 06JAN2017 70
104 2 03JAN2017
104 3 05JAN2017
104 3 06JAN2017 70
;
run;
data want;
format date date9.;
input id $ LHIN 2. date date9. pulse 3. ;
datalines;
101 1 01JAN2017 90
101 1 03JAN2017 80
101 1 05JAN2017 80
102 2 01JAN2017 90
102 2 03JAN2017 90
102 3 05JAN2017 70
103 2 03JAN2017 80
103 3 05JAN2017 80
103 3 06JAN2017 70
104 2 03JAN2017 70
104 3 05JAN2017 70
104 3 06JAN2017 70
;
run;
Thank you in advance for your kind reply.
Regards,
There is quite a few posts on this subject. There are various methods, some guys like the update or hashtable. Me I would just go with retained value:
data want (drop=p); set have (rename=(pulse=p)); retain pulse; if p ne . then pulse=p; run;
Hi RW9,
This code seems to be very simple and great. Is it possible to do little modification so that the value carried forward wiil be depenedednt on ID.
Regards,
Yep:
data want (drop=p); set have (rename=(pulse=p));
by id; retain pulse;
if first.id then pulse=p; if p ne . then pulse=p; run;
Hi RW9,
Unfortunately for ID 101 ,
I am getting:
date id LHIN pulse
01-Jan-17 101 1 90
03-Jan-17 101 1 80
05-Jan-17 101 1 70
Expected:
101 1 01JAN2017 90
101 1 03JAN2017 80
101 1 05JAN2017 80
Regards,
Please provide test data exactly as it is. I have run the code:
data have ; format date date9.; input id $ LHIN 2. date date9. pulse 3. ; datalines; 101 1 01JAN2017 90 101 1 03JAN2017 80 101 1 05JAN2017 102 2 01JAN2017 90 102 2 03JAN2017 102 3 05JAN2017 70 103 2 03JAN2017 103 3 05JAN2017 70 103 3 06JAN2017 70 104 2 03JAN2017 104 3 05JAN2017 104 3 06JAN2017 70 ; run; data want (drop=p); set have (rename=(pulse=p)); by id; retain pulse; if first.id then pulse=p; if p ne . then pulse=p; run;
And it shows the correct value of 80, I suspect your data is not sorted correctly or something.
Hi RW9,
It is not working for ID 103 and 104:
03-Jan-17 103 2 .
05-Jan-17 103 3 70
06-Jan-17 103 3 70
03-Jan-17 104 2 .
05-Jan-17 104 3 .
06-Jan-17 104 3 70
Still the missing values are there. The missing values are expected to be filled with latest value, if older value is not available.
Regards,
In which case you need to do a couple of steps:
proc sort data=have out=temp; by id date; where val ne .; run; data temp (rename=(val=base)); set temp; by id; if first.id; run; data want (drop=base); merge have want; by id; if val=. then val=base; run;
This gets the first non-missing value, if there is one, and merges that back to your original data and replaces that value with the missing in original data.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.