BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
camfarrell25
Quartz | Level 8

Hello,

I am trying to convert a variable that contains multiple format (LASTPAY) to a single/ standardized format (NEWEXPDATE).

 

Any idea how one would go about that?

 

Thanks!


CF

 

sasquestion.JPG

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Here is one way:

data have;
  input havedate $10.;
  cards;
42916
6/23/2017
20170630
2017-06-21
;

data want;
  set have;
  format wantdate date9.;
  wantdate=input(havedate,anydtdte10.);
  if missing(wantdate) then wantdate=input(havedate,8.)-21916;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

15 REPLIES 15
collinelliot
Barite | Level 11

How many variations of the date do you have? I see just two in your example. But you can conditionally convert using the approprirate format based on a pattern match. That is, if it looks like it's already a SAS date value, just input(lastpay, best.). If it's a date format, input(lastpay, mmddyy10.), for example.

camfarrell25
Quartz | Level 8

Hello!

There is four (the data wasn't assembled by me and it the process of aggregating the dataset, the conversion wasn't done at the source unfortunately):

42916
6/23/2017
20170630

2017-06-21

 

I've attached a sample.

Let me know!

 

Thank!

Jagadishkatam
Amethyst | Level 16
data have;
input lastpay$10.;
newexpdate=input(lastpay,mmddyy10.);
format newexpdate mmddyy10.;
cards;
42917
6/27/2017
 ;
run;
Thanks,
Jag
Reeza
Super User

Are the values you've shown correct, ie the is the correct new variable in that data set?

Can you provide data as text to test?

camfarrell25
Quartz | Level 8

I've attached a sample, there are at least 4 different formats (that I could find).

 

Thanks for the help!

ballardw
Super User

Most of your "dates" look like the Excel days from their offset date of 1 Jan 1900 except for a few that were entered as text in a manner that Excel did not convert to its internal date.

 

One way: Make sure that the entire column in the Excel sheet has the cell format set to a date.

Then export to CSV and import into SAS.

art297
Opal | Level 21

Here is one way:

data have;
  input havedate $10.;
  cards;
42916
6/23/2017
20170630
2017-06-21
;

data want;
  set have;
  format wantdate date9.;
  wantdate=input(havedate,anydtdte10.);
  if missing(wantdate) then wantdate=input(havedate,8.)-21916;
run;

Art, CEO, AnalystFinder.com

 

camfarrell25
Quartz | Level 8

This code I found much more accurate:

data work.want;

  set work.have;

  format datenew date9.;

  datenew =input(datehave,8.)-21916;

  if missing(datenew ) then datenew =input(datehave,anydtdte10.);

run;

ballardw
Super User

There is another complication in your example data:

You are showing a Lastpay value of 42909 that, depending on the row, apparently is supposed to transform to 09/01/2020, 11/22/2019, 05/01/2021. Is that actually the case?

 

Against my better judgement I opened that Excel file.

I copied the data column to another column to manipulate in Excel. Highlight the new column, go to the data tab, select the Text to columns and convert everything to numeric. Then applied the DATE format to the cells.

This is what i get for the first few rows (tab delimited text with a blank column to help readability)

LASTPAY		After Text to column and apply date format.
42887		6/1/2017
6/24/2017		6/24/2017
42887		6/1/2017
42917		7/1/2017
42906		6/20/2017
2017-06-30		6/30/2017
42911		6/25/2017
6/30/2017		6/30/2017
42917		7/1/2017
42916		6/30/2017
42907		6/21/2017
42887		6/1/2017
42916		6/30/2017
42901		6/15/2017
42905		6/19/2017
42908		6/22/2017
42916		6/30/2017
42917		7/1/2017
42922		7/6/2017
42907		6/21/2017
42923		7/7/2017
42915		6/29/2017

I suspect that your process somewhere along the line turned most of your dates into character values after removing the date format for some reason.

 

art297
Opal | Level 21

@ballardw: You have more courage than I do. I attempted to open the file using Chrome's fairly safe viewer, but it was unable to open the file so I decided not to open it in Excel.

 

Art, CEO, AnalystFinder.com

 

camfarrell25
Quartz | Level 8

Ya I think that's the problem - unfortunately, all I have is a serveral million observations with that glitch in it and I can't really go back to the source and ask them to fix the raw variables (the dataset is an amalgation of multiple sources, all with different formats) - so I was hoping that I could somehow just fix it within SAS.

Tom
Super User Tom
Super User

This question has come up before.  Basically you have a column in Excel that has mixed dates and character strings. So when you convert it into a SAS dataset using PROC IMPORT or LIBNAME engine SAS will set the variable type to character. The dates come over as the text version of the integer value that EXCEL uses for dates.  You just need to convert them.  I would test if the value is an integer and then convert the integer by correcting for the difference in offset dates. And then use the ANYDTDTE informat to convert the other values.

 

data want ;
  set have ;
  date = input(chardate,??11.);
  if missing(date) then date=input(chardate,anydtdte11.);
  else date=date + '01JAN1900'd -2 ;
  format date date9.;
run;

The -2 in the conversion formula is because SAS starts counting from zero and Excel starts counting from one and Excel mistakenly thinks that 1900 was a leap year.

 

Note you could still have trouble if your multiple sources come from locations where people use MDY ordering for dates and also from places where people use DMY ordering for dates. In that case there really is no way to tell which is the right date without some additional knowledge.  For example if you know the dates should always be the first of the month then you would know the '1/7/2017' meant 01JUL2017 and not 07JAN2017.  Or if you know the source location of the record you could use that.  Otherwise there will be no way to convert a value like '10/12/2016' with 100% convidence.

camfarrell25
Quartz | Level 8
I found this one to be quite helpful:
data work.want;

set work.have;

format datenew date9.;

datenew =input(datehave,8.)-21916;

if missing(datenew ) then datenew =input(datehave,anydtdte10.);

run;
art297
Opal | Level 21

I prefer the order the way I proposed as, otherwise, a date like 20170630 doesn't translate correctly and you end up with all kinds of notes in your log.

 

Art, CEO, AnalystFinder.com

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 15 replies
  • 7202 views
  • 2 likes
  • 7 in conversation