BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PharmDoc
Fluorite | Level 6

I have a dataset called impact with a column called contact_date .

The contact_date column is in a character format  [MM/DD/YY:HH:MM:SS AM/PM], that is, for example, one of the observations literally reads "[10/29/20:04:47:26 PM]". 

 

I would like to use proc sql to convert the contact_date column from character to datetime format. 

 

I'v tried the following code, but I want the date to read as 10/29/20 and not 29OCT20:

 

proc sql;
select *,input(contact_date, anydtdtm.) as sas_contact_date format=datetime20.
from impact;
quit;

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

@PharmDoc wrote:

The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:

[10/29/20:04:47:26 PM]     10/29/2020 4:47 AM

 

proc sql;
select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. 
FROM impact;
quit;

 


The reason it prints an AM time instead of a PM time is not because of the format, but due to the informat used in the INPUT functions, which ignored the "PM" when reading the character value.  Replace ANYDTDTM. with ANYDTDTM21 in the INPUT function.  This will not cause problems if all of your character values have exactly the number of characters as in your example.

 

However, it will still not print seconds.  I don't see a SAS format that would print

  10/20/2020 04:47:26 PM

so you'll have to define your own format, as in:

 

proc format;
  picture mdyampms (default=22)
    low-high=  '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime);
run;

then assign the format mdyampms in your proc sql statement.

 

The picture statement allows you to implement the ("datatype=datetime)" option, which in turn supports the use of the format directives  (%0m for month, %0d for day-of-month, etc.).  You can find more about the possible datatypes and corresponding directives in PICTURE Statement 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

7 REPLIES 7
PharmDoc
Fluorite | Level 6

The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:

[10/29/20:04:47:26 PM]     10/29/2020 4:47 AM

 

proc sql;
select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. 
FROM impact;
quit;

 

This code includes the seconds, but the date reads as DDMMMYYYY and the time is still treated as 24hr time. 

[10/29/20:04:47:26 PM]   →   29OCT2020:04:47:26 AM

proc sql;
select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=DATEAMPM22. 
FROM impact;
quit;

The following code produces the following notes in the log and outputs missing values:

NOTE: Invalid date value
NOTE: Invalid argument to function INPUT. Missing values may be generated.
 
[10/29/20:04:47:26 PM]   →  . 
proc sql;
select *,input(contact_date,MDYAMPM.) as sas_contact_date format=DATEAMPM22. 
FROM impact;
quit;

 

mkeintz
PROC Star

@PharmDoc wrote:

The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:

[10/29/20:04:47:26 PM]     10/29/2020 4:47 AM

 

proc sql;
select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. 
FROM impact;
quit;

 


The reason it prints an AM time instead of a PM time is not because of the format, but due to the informat used in the INPUT functions, which ignored the "PM" when reading the character value.  Replace ANYDTDTM. with ANYDTDTM21 in the INPUT function.  This will not cause problems if all of your character values have exactly the number of characters as in your example.

 

However, it will still not print seconds.  I don't see a SAS format that would print

  10/20/2020 04:47:26 PM

so you'll have to define your own format, as in:

 

proc format;
  picture mdyampms (default=22)
    low-high=  '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime);
run;

then assign the format mdyampms in your proc sql statement.

 

The picture statement allows you to implement the ("datatype=datetime)" option, which in turn supports the use of the format directives  (%0m for month, %0d for day-of-month, etc.).  You can find more about the possible datatypes and corresponding directives in PICTURE Statement 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PharmDoc
Fluorite | Level 6

Thank you mkeintz, that was very helpful and it worked. 

 
proc format; 
  picture mdyampms (default=22) 
    low-high=  '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime); 
run; 
 
 
proc sql; 
create table impact2 as  
select *,input(contact_date,ANYDTDTM21.) as sas_contact_date format=MDYAMPM22.  
FROM impact; 
quit; 
 
proc contents data=impact2; 
run; 
proc print data=impact2; 
run; 

 

However, when I run an analysis of the minimum and maximum date, SAS treats the year 1923 as 2023

[01/26/23:12:46:17 PM]  1/26/2023 12:46 PM

 

proc sql;
	select "sas_contact_date" label="Date variable", min(sas_contact_date) 
		format=MDYAMPM22. label="Minimum date" , max(sas_contact_date) 
		format=MDYAMPM22. label="Maximum date" from impact2;
quit;

 

 

PharmDoc
Fluorite | Level 6

Also, how would I code if instead the date was in the form [YYYY/MM/DD:HH:MM:SS AM/PM] ? 

With attributes of type=Char ; len=24 format=$24. ; informat = $24. 

 

The following code generates a table called impact2 with column sas_contact_date containing values of "."  

 

proc format;

  picture mdyampms (default=24)

    low-high=  'G/%0m/%0d %0I:%0M:%0S %p' (datatype=datetime);

run;

 

proc sql;

create table impact2 as 

select *,input(contact_date,ANYDTDTM21.) as sas_contact_date format=MDYAMPM24. 

FROM impact;

quit;

 

Tom
Super User Tom
Super User

Why are you only reading 21 of the 24 characters in your string?  In does not really hurt to tell INPUT() to use more characters than the string you are giving it actually has, as long as the width is within the range supported by the informat.

 

Does you string really have the square brackets?  You will want to remove those first.

data test;
  string='[2021/05/11:00:10:30 PM]' ;
  dt = input(compress(string,'[]'),anydtdtm24.);
  format dt datetime19.;
  put dt=;
run;
929   data test;
930     string='[2021/05/11:00:10:30 PM]' ;
931     dt = input(compress(string,'[]'),anydtdtm24.);
932     format dt datetime19.;
933     put dt=;
934   run;

dt=11MAY2021:12:10:30
NOTE: The data set WORK.TEST has 1 observations and 2 variables.
PharmDoc
Fluorite | Level 6

Okay, so again following as @mkeintz suggested earlier. I created a custom format:

 

proc format;
picture mdyampms (default=24)
low-high= '%G/%0m/%0d %0I:%0M:%0S %p' (datatype=datetime);
run;

But there's a caveat, when I specify format=mdyampms22 in proc sql below, the datetime correctly reads as YYYY/MM/SS HH:MM:SS AM/PM in a new column, which is great, but when I try to create a frequency plot (see last snippets of code) the dates do not show up on the x-axis, whereas they do show up on the xaxis of a frequency plot when I type in the non-customized format= mdyampm22 in the proc sql statement (illustrated at the end of this reply)

 

 

Following along the lines as what @Tom  suggests and incorporating into proc sql:

(I noticed that in proc contents the Informat is blank):

proc sql;
create table impact2 as
select *,input(compress(contact_date,'[]'),ANYDTDTM24.) as fcontact_date format=mdyampms22.
FROM impact;
quit;

proc contents data=impact2;
run;
proc print data=impact2;
run;

 

Or alternatively I can forgo proc sql and again follow more along what @Tom suggested but use a data step:

/* mdyampms22. is from the custom format created earlier   */
data
impact2;
set impact;
fcontact_date = input(compress(contact_date,'[]'),anydtdtm24.);
format contact_date mdyampms22.;
put fcontact_date=;
run;
proc contents data=impact2;
run;
proc print data=impact2;
run;

 

 

 

But even using the datastep method, the values do not show up on the xaxis of a frequency plot (see below) 

 

/*** Analyze date variables ***/
title "Minimum and Maximum Dates";

proc sql;
select "fcontact_date" label="Date variable", min(fcontact_date)
format=MDYAMPMS22. label="Minimum date" , max(fcontact_date)
format=MDYAMPMS22. label="Maximum date" from impact2;
quit;

title "Date Frequencies";

proc freq data=impact2 noprint;
tables fcontact_date / out=_tmpfreq1;
run;

proc sgplot data=_tmpfreq1;
yaxis min=0 minor ;
xaxis minor type=time;
series x=fcontact_date y=count;
run;

proc delete data=_tmpfreq1;
run;

 Date variable Minimum date Maximum date

fcontact_date2010/12/28 01:53:57 PM2026/09/19 10:44:42 AM

 

with format=mdyampms22.with format=mdyampms22.

 

Again, specifying mdyampm22. instead of the custom mdyampms22. in proc sql ultimately inputs the year labels on the xaxis of the frequency plot

(BUT, again, years 1923 and 1926 are being treated as 2023 and 2026):

This begs the question of if using the picture statement to create the custom format mdyampms22. is necessary? 

proc sql;
create table impact2 as
select *,input(compress(contact_date,'[]'),ANYDTDTM24.) as fcontact_date format=mdyampm22.
FROM impact;
quit;

title "Minimum and Maximum Dates";

proc sql;
select "fcontact_date" label="Date variable", min(fcontact_date)
format=MDYAMPM22. label="Minimum date" , max(fcontact_date)
format=MDYAMPM22. label="Maximum date" from impact2;
quit;

title "Date Frequencies";

proc freq data=impact2 noprint;
tables fcontact_date / out=_tmpfreq1;
run;

proc sgplot data=_tmpfreq1;
yaxis min=0 minor ;
xaxis minor type=time ;
series x=fcontact_date y=count;
run;

proc delete data=_tmpfreq1;
run;

 Date variable Minimum date Maximum date

fcontact_date12/28/2010 1:53 PM9/19/2026 10:44 AM

with format=mdyampm22.with format=mdyampm22.

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 9648 views
  • 2 likes
  • 4 in conversation