BookmarkSubscribeRSS Feed
di_niu0
Obsidian | Level 7

Hello All, 

 

I have been cleaning a dataset with a very messy date variable. Having trouble in converting the variable into a uniform date format. Here is what I have:

 

ID      date

1       SEPT01/2001

2       9-Apr-10

3       20-JAN-18

4       Jan 7-2010

5       oct1/2018

6       NOV18 2010

7       FEB. 2, 2016

8       6/1/2010

9       2013/02/01 lost to fup

 

Many thanks! 

 

 

 

 

 

 

 

7 REPLIES 7
SASKiwi
PROC Star

Is 6/1/2010  - 6 Jan or 1 Jun? 

Is 2013/02/01 - 1 Feb or 2 Jan?

 

Let me guess this has come from a spreadsheet which will accept any old rubbish...

di_niu0
Obsidian | Level 7
I looked at other values. It should be June 1, 2010.
Patrick
Opal | Level 21

I guess you will have to go through some "trial and error" process to read the data into SAS as desired. 

Below some approach you could take. Any string that you can't convert into a SAS Date value will end up in table investigate. You then need to further add to your code or fix the data directly in your source.

data have;
  infile datalines dsd dlm='|' truncover;
  input id string $30.;
  datalines;
1|SEPT01/2001
2|9-Apr-10
3|20-JAN-18
4|Jan 7-2010
5|oct1/2018
6|NOV18 2010
7|FEB. 2, 2016
8|6/1/2010
9|2013/02/01 lost to fup
;

%let sv_datestye=%sysfunc(getoption(datestyle,,keyexpand));
options datestyle=mdy;

data want investigate;
  set have;
  length sas_date 8;
  format sas_date date9.;

  if _n_=0 then _string2=string;

  if missing(string) then 
    do;
      output want;
      return;
    end;

  sas_date=input(string, ?? anydtdte.);
  if not missing(sas_date) then
    do;
      output want;
      return;
    end;

  _string2=prxchange('s/^([a-z]+)(\d+)[^\d](\d+)/\2\1\3/oi',-1,strip(string));
  sas_date=input(compress(_string2), ?? anydtdte.);
  if not missing(sas_date) then
    do;
      output want;
      return;
    end;

  /*** and here more string manipulations until no rows remain in Investigate ***/

  else output investigate;

run;

options &sv_datestye;
Kurt_Bremser
Super User

Your task is logically impossible.

At least a date like this

20-JAN-18

is ambiguous (could be 2018-01-20 or 2020-01-18)

 

<RANT>Anybody who still uses 2-digit years is an idiot with terminal brain damage. Such poor creatures have no place near a computer except under adult supervision.</RANT>

di_niu0
Obsidian | Level 7
That's what I get. It's quit difficult to interpret the exact time. However, when I look at other values in the dataset, it should be interpreted as Jan 20, 2018.
andreas_lds
Jade | Level 19

Summing up what others already said: 4 of 9 strings can't translated into dates, because the available information is not sufficient to decide which number is day and which is year (IDs 2, 3) or day and month (IDs 8, 9). So either you have the knowledge, so that it can be coded, or the mission will fail.

di_niu0
Obsidian | Level 7

Hi guys. I really appreciated that you guys are trying to help me with this problem. It's quite messy but it's what I have to deal with. When I posted this question, I didn't realize I provided less information than my dataset had given to me. Anyways, I wrote some code that worked fine for me. For anyone who's interested, that's what I used:

 

reformat_date = input(compress(date, '.'), anydtdte13.);

reformat_date_2 = input(date, yymmdd10,);

format reformat_date reformat_date_2 date9.;

 

It didn't convert all dates but it did most of the work. I did other modifications too. 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1475 views
  • 5 likes
  • 5 in conversation