Hi Team,
I need help programming a date imputation. Here is the raw data:
data vy;
  input id date $10.;
  cards;
101 21aug2020
102 ukfeb2016
103 ukaug2019
104 ukunk2020
105 07aug2018
106 ukdec2020
107 ununkunkk
108 ukfeb2019
;
I need to impute the dates according to these two rules:
1. If both month and day are missing, set the date to December 31.
2. If only the day is missing, set it to the last day of the month.
Here is my attempt so far:
data vy1;
  set vy;
  /* Separate date into day, month, year */
  dayc = substr(date, 1, 2);
  monthc = substr(date, 3, 3);
  yearc = substr(date, 6, 4);
  if yearc ne "unkk" then do; /* One row has the year missing and cannot be imputed */
    /* both month and day missing */
    if dayc = "uk" and monthc = "unk" then do;
      dayi = "31";
      monthi = "DEC";
    end;
    /* only day missing */
    if dayc = "uk" and monthc ne "unk" then do;
      myrc = cats(monthc, yearc);
      myn = input(myrc, anydtdte7.);
      lastday = intnx('month', myn, 0, 'E');
    end;
    /* get imputed date */
    if dayi ne "" and monthi ne "" then
      date_impc = cats(dayi, monthi, yearc);
    if lastday ne . then date_impc = put(lastday, date9.);
    format date_imp date9.;
    date_imp = input(date_impc, date9.);
  end;
run;
I already posted an answer to this in a thread that has since been deleted:
data want;
  set have (rename=(date=_date));
  if substr(_date, 6) = "unkk" then date = .;
  else do;
    if substr(_date, 3, 3) = "unk" then substr(_date, 3, 3) = "dec";
    if substr(_date, 1, 2) = "uk" then do;
      substr(_date, 1, 2) = "01";
      date = input(_date, date9.);
      date = intnx('month', date, 0, 'e');
    end;
    else date = input(_date, date9.);
  end;
  format date yymmdd10.; * always use the ISO format for clarity;
  drop _date;
run;
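The step above relies on INTNX with the 'e' alignment to move a date to the end of its month. As a minimal sketch of just that call, here is the ukfeb2016 value from the sample data after its day has been set to 01 (the variable names d and last are only for illustration):

```sas
data _null_;
  /* first day of the month, as built by the substr() replacement */
  d = input("01feb2016", date9.);
  /* interval 0 stays in the same month; 'e' aligns to its last day */
  last = intnx('month', d, 0, 'e');
  put last= yymmdd10.;   /* writes last=2016-02-29 to the log */
run;
```

Because the day is derived from the calendar rather than hard-coded, leap years and 30-day months need no special handling.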