BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
TL93
Obsidian | Level 7

Hi all! I know there's a thread on working with incomplete character dates but my situation is slightly different.

 

I have a date of diagnosis variable formatted as YYYYMMDD, and it's a character variable (length=8). I would like to change it to a SAS date variable; however some dates contain missing values. For example, the DD portion will be coded as "99" if the day of diagnosis is unknown. So my data look like:

 

ID        Date of Diagnosis
1          20000314
2          20120523
3          19981299
4          19930118
5          20090899
6          19990799

*fictitious date values for exemplary purposes; my data contain over thousands of people*

 

This date conversion code would return the dates 3, 5 and 6 as missing (".")

date_new=input(date_old, yymmdd8.);
format date_new yymmdd8.;

 

Is there a way I can tell SAS to only read the first 6 characters of the variable (YYYYMM portion) and convert it to a SAS date formatted as MONYY. or YYMON.?

 

Alternatively, before converting, is there a way I can tell SAS to replace any YYYYMM99 dates for the day of diagnosis variable with YYYYMM15 (set to the middle of that month)? I am asking this second question because there will be scenarios where MM(month) is 99 too (i.e. 20059914) and if I can learn how to deal with the DD(day) portion then I can apply it to missing month dates.

 

I am using SAS Enterprise 7.1.

 

Thank you for your help!

1 ACCEPTED SOLUTION
5 REPLIES 5
PGStats
Opal | Level 21

Lots of flexibility with regular expressions:

 

data test;
input ID DateStr $;
datalines;
1          20000314
2          20120523
3          19981299
4          19930118
5          20090899
6          19990799
7          19999907
;

data want;
set test;
dateStr = prxchange("s/(\d{6})99/\1\Q15/",1,dateStr);
dateStr = prxchange("s/(\d{4})99\d\d/\1\Q0701/",1,dateStr);
date = input(dateStr, yymmdd8.);
format date yymmdd10.;
run;

proc print; run;


PG
sks521
Quartz | Level 8

Can I ask how you would write the syntax if there are multiple date variables please?

 

Thanks

S

PGStats
Opal | Level 21

Just repeat the 4 lines of code for every variable. Or use arrays and a do loop if that's more convenient.

PG
s_lassen
Meteorite | Level 14

You can't set a format conditionally, or to put it another way: you cannot show different formats for the same variable in different rows.

 

I can see two solutions. You can either repair the date, and set the day to 15 if it is out of range:

Data want;
  set have;
  date_new=input(date_old,?? yymmdd8.); /* ?? Means do not raise a log error */
  if missing(date_new) then do;
    date_new=input(substr(date_old,1,6)!!'15',yymmdd8.);
    date_corrected=1; 
    end;
  else
    date_corrected=0;
run;

Or you can have two variables, one that shows the date (when valid), and one that shows the month (which is presumably always valid):

data want;
  set have;
  date_new=input(date_old,?? yymmdd8.);
  month_date=input(date_old,yymmn6.);
  format month_date yymon7.;
run;

The month_date will then always be the first of the month, if date_new is missing it means that the day was invalid.

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 4977 views
  • 6 likes
  • 5 in conversation