Hi Team,
I need help with the programming part of a date imputation. Here is the raw data:
data vy;
input id date $10.;
cards;
101 21aug2020
102 ukfeb2016
103 ukaug2019
104 ukunk2020
105 07aug2018
106 ukdec2020
107 ununkunkk
108 ukfeb2019
;
I need to impute the dates according to the following two rules:
1. If both month and day are missing, then set to December 31.
2. If only day is missing, then set to last day of the month.
data vy1;
set vy;
/* Separate date into day, month, year */
dayc=substr(date,1,2);
monthc=substr(date,3,3);
yearc=substr(date,6,4);
if yearc ne "unkk" then do; /*One row has year missing, not able to impute */
/*both month and day missing*/
if dayc="uk" and monthc="unk" then do;
dayi="31";
monthi="DEC";
end;
/*only day missing*/
if dayc="uk" and monthc ne "unk" then do;
myrc=cats(monthc)||cats(yearc);
myn=input(myrc, anydtdte7.);
lastday=intnx('month',myn,0,'E');
end;
/*get imputed date*/
if dayi ne "" and monthi ne "" then
date_impc=cats(dayi)||cats(monthi)||cats(yearc);
if lastday ne . then date_impc=put(lastday, date9.);
format date_imp date9.;
date_imp=input(date_impc, date9.);
end;
run;
I already posted an answer to this in the thread that has been deleted in the meantime:
data want;
  set have (rename=(date=_date)); /* "have" is the input data set, vy in the question */
  if substr(_date,6) = "unkk"
  then date = .; /* year unknown: cannot impute */
  else do;
    /* month unknown: default to December */
    if substr(_date,3,3) = "unk" then substr(_date,3,3) = "dec";
    if substr(_date,1,2) = "uk"
    then do;
      /* day unknown: take the 1st, then shift to the end of the month */
      substr(_date,1,2) = "01";
      date = input(_date,date9.);
      date = intnx('month',date,0,'e');
    end;
    else date = input(_date,date9.);
  end;
  format date yymmdd10.; * always use the ISO format for clarity;
  drop _date;
run;
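If you would rather not overwrite parts of the original string, the same two rules can be sketched as below. This is only a sketch under assumptions: the input data set is vy from the question, every value is laid out as ddmmmyyyy in positions 1 to 9, and the names check, dayc, monthc, yearc and date_imp are placeholders of mine, not anything required by SAS.

```sas
/* Sketch only: assumes data set vy from the question and a fixed ddmmmyyyy layout */
data check;
  set vy;
  length dayc $2 monthc $3 yearc $4;
  dayc   = substr(date, 1, 2);
  monthc = substr(date, 3, 3);
  yearc  = substr(date, 6, 4);

  /* notdigit() returns 0 when the string consists of digits only */
  if notdigit(yearc) > 0 then date_imp = .;        /* year unknown: leave missing */
  else if notdigit(dayc) = 0 then
    date_imp = input(date, date9.);                /* day present: read the date as is */
  else do;
    if monthc = "unk" then monthc = "dec";         /* rule 1: month defaults to December */
    /* rule 2: build the 1st of the month, then move to the end of that month */
    date_imp = intnx('month', input(cats("01", monthc, yearc), date9.), 0, 'e');
  end;

  format date_imp yymmdd10.;
  drop dayc monthc yearc;
run;
```

With the sample data, ids 101 and 105 keep their dates (2020-08-21 and 2018-08-07), 102 becomes 2016-02-29 (leap year), 103 becomes 2019-08-31, 104 and 106 become 2020-12-31, 108 becomes 2019-02-28, and 107 stays missing because the year is unknown. A value with a known day but an unknown month is not covered by either rule, so the sketch does not handle it.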