190119
Calcite | Level 5

Hi Team,

I need to impute partial dates according to two rules:

    1. If both month and day are missing, set the date to December 31.
    2. If only the day is missing, set it to the last day of the month.

I need help with the programming for this imputation. The raw data is in the attachment below.

Kurt_Bremser
Super User
data want;
set have (rename=(date=_date));
/* Year unknown (e.g. "ununkunkk"): nothing to impute from */
if upcase(substr(_date,6)) = "UNKK"
then date = .;
else do;
  /* Month unknown: impute December */
  if upcase(substr(_date,3,3)) = "UNK"
  then substr(_date,3,3) = "DEC";
  /* Day unknown: read the 1st of the month, then shift to the month's last day */
  if upcase(substr(_date,1,2)) = "UK"
  then do;
    substr(_date,1,2) = "01";
    date = input(_date,date9.);
    date = intnx('month',date,0,'e');
  end;
  else date = input(_date,date9.);
end;
format date yymmdd10.;
drop _date;
run;

Untested, posted from my tablet.

Kurt_Bremser
Super User

Please do not use attachments for code. Copy/paste the code into a window opened with the "little running man" button. That way we can immediately see and use the code (e.g. copy/paste into SAS Studio) without having to download first.

ballardw
Super User

Here's the attached code:

data vy;
input id date  $  10.;
cards;
101 21aug2020
102 ukfeb2016
103 ukaug2019
104 ukunk2020
105 07aug2018
106 ukdec2020
107 ununkunkk
108 ukfeb2019
;

The real question in my mind is the "uk"/"un", "unk" and "unkk" placeholders. Where did they come from? Someone made them. Perhaps address that at the source so that you get values like

101 21aug2020
102 feb2016
103 aug2019
104 2020
105 07aug2018
106 dec2020
107 .
108 feb2019
;

Testing the length of each value would then allow selecting the proper handling...
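A minimal sketch of that length-based approach, assuming the placeholders have already been removed as in the listing above. The data set name clean and the variable name sasdate are made up for illustration, and the step has not been run against the poster's real data:

```sas
/* Hypothetical cleaned input: placeholders already removed at the source */
data clean;
  infile datalines truncover;
  input id date :$9.;   /* a lone period is read as a blank character value */
  datalines;
101 21aug2020
102 feb2016
103 aug2019
104 2020
105 07aug2018
106 dec2020
107 .
108 feb2019
;
run;

data want;
  set clean;
  /* LENGTHN returns 0 for a blank value (LENGTH would return 1) */
  select (lengthn(date));
    when (9) sasdate = input(date, date9.);                             /* complete date */
    when (7) sasdate = intnx('month', input(date, monyy7.), 0, 'end');  /* day missing: last day of month */
    when (4) sasdate = mdy(12, 31, input(date, 4.));                    /* month and day missing: December 31 */
    otherwise sasdate = .;                                              /* nothing usable */
  end;
  format sasdate date9.;
run;
```

The SELECT block keeps one branch per pattern, so any value with an unexpected length falls through to OTHERWISE and stays missing instead of being misread.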

mkeintz
PROC Star

 

You can:

 

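data want;
  set have;
  /* Nothing known, including the year: leave the date missing */
  if date='ununkunkk'   then sasdate=.;
  /* Day and month unknown (=: is a prefix match): impute December 31 */
  else if date=:'ukunk' then sasdate=input(cats('31dec',substr(date,6,4)),date9.);
  /* Only the day unknown: read month and year, then move to the month's last day */
  else if date=:'uk'    then sasdate=intnx('month',input(substr(date,3,7),monyy7.),0,'end');
  /* Complete date */
  else sasdate=input(date,date9.);
  format sasdate date9.;
run;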

You did provide your sample data inside a working data step - thanks.  BUT ...

Please make it viewable by using the insert code icon ("running man") rather than as an attachment.
