BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TL93
Obsidian | Level 7

Hi all! I know there's a thread on working with incomplete character dates but my situation is slightly different.

 

I have a date of diagnosis variable formatted as YYYYMMDD, and it's a character variable (length=8). I would like to change it to a SAS date variable; however some dates contain missing values. For example, the DD portion will be coded as "99" if the day of diagnosis is unknown. So my data look like:

 

ID        Date of Diagnosis
1          20000314
2          20120523
3          19981299
4          19930118
5          20090899
6          19990799

*fictitious date values for exemplary purposes; my data contain over thousands of people*

 

This date conversion code would return the dates 3, 5 and 6 as missing (".")

date_new=input(date_old, yymmdd8.);
format date_new yymmdd8.;

 

Is there a way I can tell SAS to only read the first 6 characters of the variable (YYYYMM portion) and convert it to a SAS date formatted as MONYY. or YYMON.?

 

Alternatively, before converting, is there a way I can tell SAS to replace any YYYYMM99 dates for the day of diagnosis variable with YYYYMM15 (set to the middle of that month)? I am asking this second question because there will be scenarios where MM(month) is 99 too (i.e. 20059914) and if I can learn how to deal with the DD(day) portion then I can apply it to missing month dates.

 

I am using SAS Enterprise 7.1.

 

Thank you for your help!

1 ACCEPTED SOLUTION
5 REPLIES 5
PGStats
Opal | Level 21

Lots of flexibility with regular expressions:

 

data test;
input ID DateStr $;
datalines;
1          20000314
2          20120523
3          19981299
4          19930118
5          20090899
6          19990799
7          19999907
;

data want;
set test;
dateStr = prxchange("s/(\d{6})99/\1\Q15/",1,dateStr);
dateStr = prxchange("s/(\d{4})99\d\d/\1\Q0701/",1,dateStr);
date = input(dateStr, yymmdd8.);
format date yymmdd10.;
run;

proc print; run;


PG
sks521
Quartz | Level 8

Can I ask how you would write the syntax if there are multiple date variables please?

 

Thanks

S

PGStats
Opal | Level 21

Just repeat the 4 lines of code for every variable. Or use arrays and a do loop if that's more convenient.

PG
s_lassen
Meteorite | Level 14

You can't set a format conditionally, or to put it another way: you cannot show different formats for the same variable in different rows.

 

I can see two solutions. You can either repair the date, and set the day to 15 if it is out of range:

Data want;
  set have;
  date_new=input(date_old,?? yymmdd8.); /* ?? Means do not raise a log error */
  if missing(date_new) then do;
    date_new=input(substr(date_old,1,6)!!'15',yymmdd8.);
    date_corrected=1; 
    end;
  else
    date_corrected=0;
run;

Or you can have two variables, one that shows the date (when valid), and one that shows the month (which is presumably always valid):

data want;
  set have;
  date_new=input(date_old,?? yymmdd8.);
  month_date=input(date_old,yymmn6.);
  format month_date yymon7.;
run;

The month_date will then always be the first of the month, if date_new is missing it means that the day was invalid.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 6190 views
  • 6 likes
  • 5 in conversation