I have a dataset called impact with a column called contact_date .
The contact_date column is in a character format [MM/DD/YY:HH:MM:SS AM/PM], that is, for example, one of the observations literally reads "[10/29/20:04:47:26 PM]".
I would like to use proc sql to convert the contact_date column from character to datetime format.
I'v tried the following code, but I want the date to read as 10/29/20 and not 29OCT20:
proc sql; select *,input(contact_date, anydtdtm.) as sas_contact_date format=datetime20. from impact; quit;
@PharmDoc wrote:
The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:
[10/29/20:04:47:26 PM] → 10/29/2020 4:47 AM
proc sql; select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. FROM impact; quit;
The reason it prints an AM time instead of a PM time is not because of the format, but due to the informat used in the INPUT functions, which ignored the "PM" when reading the character value. Replace ANYDTDTM. with ANYDTDTM21 in the INPUT function. This will not cause problems if all of your character values have exactly the number of characters as in your example.
However, it will still not print seconds. I don't see a SAS format that would print
10/20/2020 04:47:26 PM
so you'll have to define your own format, as in:
proc format;
picture mdyampms (default=22)
low-high= '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime);
run;
then assign the format mdyampms in your proc sql statement.
The picture statement allows you to implement the ("datatype=datetime)" option, which in turn supports the use of the format directives (%0m for month, %0d for day-of-month, etc.). You can find more about the possible datatypes and corresponding directives in PICTURE Statement
The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:
[10/29/20:04:47:26 PM] → 10/29/2020 4:47 AM
proc sql; select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. FROM impact; quit;
This code includes the seconds, but the date reads as DDMMMYYYY and the time is still treated as 24hr time.
[10/29/20:04:47:26 PM] → 29OCT2020:04:47:26 AM
proc sql; select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=DATEAMPM22. FROM impact; quit;
The following code produces the following notes in the log and outputs missing values:
proc sql; select *,input(contact_date,MDYAMPM.) as sas_contact_date format=DATEAMPM22. FROM impact; quit;
@PharmDoc wrote:
The following code somewhat works, except it doesn't include the seconds and treats the time as 24hr time:
[10/29/20:04:47:26 PM] → 10/29/2020 4:47 AM
proc sql; select *,input(contact_date,ANYDTDTM.) as sas_contact_date format=MDYAMPM22. FROM impact; quit;
The reason it prints an AM time instead of a PM time is not because of the format, but due to the informat used in the INPUT functions, which ignored the "PM" when reading the character value. Replace ANYDTDTM. with ANYDTDTM21 in the INPUT function. This will not cause problems if all of your character values have exactly the number of characters as in your example.
However, it will still not print seconds. I don't see a SAS format that would print
10/20/2020 04:47:26 PM
so you'll have to define your own format, as in:
proc format;
picture mdyampms (default=22)
low-high= '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime);
run;
then assign the format mdyampms in your proc sql statement.
The picture statement allows you to implement the ("datatype=datetime)" option, which in turn supports the use of the format directives (%0m for month, %0d for day-of-month, etc.). You can find more about the possible datatypes and corresponding directives in PICTURE Statement
Thank you mkeintz, that was very helpful and it worked.
proc format; picture mdyampms (default=22) low-high= '%0m/%0d/%G %0I:%0M:%0S %p' (datatype=datetime); run; proc sql; create table impact2 as select *,input(contact_date,ANYDTDTM21.) as sas_contact_date format=MDYAMPM22. FROM impact; quit; proc contents data=impact2; run; proc print data=impact2; run;
However, when I run an analysis of the minimum and maximum date, SAS treats the year 1923 as 2023
[01/26/23:12:46:17 PM] → 1/26/2023 12:46 PM
proc sql; select "sas_contact_date" label="Date variable", min(sas_contact_date) format=MDYAMPM22. label="Minimum date" , max(sas_contact_date) format=MDYAMPM22. label="Maximum date" from impact2; quit;
Also, how would I code if instead the date was in the form [YYYY/MM/DD:HH:MM:SS AM/PM] ?
With attributes of type=Char ; len=24 format=$24. ; informat = $24.
The following code generates a table called impact2 with column sas_contact_date containing values of "."
proc format;
picture mdyampms (default=24)
low-high= 'G/%0m/%0d %0I:%0M:%0S %p' (datatype=datetime);
run;
proc sql;
create table impact2 as
select *,input(contact_date,ANYDTDTM21.) as sas_contact_date format=MDYAMPM24.
FROM impact;
quit;
Why are you only reading 21 of the 24 characters in your string? In does not really hurt to tell INPUT() to use more characters than the string you are giving it actually has, as long as the width is within the range supported by the informat.
Does you string really have the square brackets? You will want to remove those first.
data test;
string='[2021/05/11:00:10:30 PM]' ;
dt = input(compress(string,'[]'),anydtdtm24.);
format dt datetime19.;
put dt=;
run;
929 data test; 930 string='[2021/05/11:00:10:30 PM]' ; 931 dt = input(compress(string,'[]'),anydtdtm24.); 932 format dt datetime19.; 933 put dt=; 934 run; dt=11MAY2021:12:10:30 NOTE: The data set WORK.TEST has 1 observations and 2 variables.
Okay, so again following as @mkeintz suggested earlier. I created a custom format:
proc format;
picture mdyampms (default=24)
low-high= '%G/%0m/%0d %0I:%0M:%0S %p' (datatype=datetime);
run;
But there's a caveat, when I specify format=mdyampms22 in proc sql below, the datetime correctly reads as YYYY/MM/SS HH:MM:SS AM/PM in a new column, which is great, but when I try to create a frequency plot (see last snippets of code) the dates do not show up on the x-axis, whereas they do show up on the xaxis of a frequency plot when I type in the non-customized format= mdyampm22 in the proc sql statement (illustrated at the end of this reply)
Following along the lines as what @Tom suggests and incorporating into proc sql:
(I noticed that in proc contents the Informat is blank):
proc sql;
create table impact2 as
select *,input(compress(contact_date,'[]'),ANYDTDTM24.) as fcontact_date format=mdyampms22.
FROM impact;
quit;
proc contents data=impact2;
run;
proc print data=impact2;
run;
Or alternatively I can forgo proc sql and again follow more along what @Tom suggested but use a data step:
/* mdyampms22. is from the custom format created earlier */
data impact2;
set impact;
fcontact_date = input(compress(contact_date,'[]'),anydtdtm24.);
format contact_date mdyampms22.;
put fcontact_date=;
run;
proc contents data=impact2;
run;
proc print data=impact2;
run;
But even using the datastep method, the values do not show up on the xaxis of a frequency plot (see below)
/*** Analyze date variables ***/
title "Minimum and Maximum Dates";
proc sql;
select "fcontact_date" label="Date variable", min(fcontact_date)
format=MDYAMPMS22. label="Minimum date" , max(fcontact_date)
format=MDYAMPMS22. label="Maximum date" from impact2;
quit;
title "Date Frequencies";
proc freq data=impact2 noprint;
tables fcontact_date / out=_tmpfreq1;
run;
proc sgplot data=_tmpfreq1;
yaxis min=0 minor ;
xaxis minor type=time;
series x=fcontact_date y=count;
run;
proc delete data=_tmpfreq1;
run;
Date variable Minimum date Maximum date
fcontact_date | 2010/12/28 01:53:57 PM | 2026/09/19 10:44:42 AM |
Again, specifying mdyampm22. instead of the custom mdyampms22. in proc sql ultimately inputs the year labels on the xaxis of the frequency plot
(BUT, again, years 1923 and 1926 are being treated as 2023 and 2026):
This begs the question of if using the picture statement to create the custom format mdyampms22. is necessary?
proc sql;
create table impact2 as
select *,input(compress(contact_date,'[]'),ANYDTDTM24.) as fcontact_date format=mdyampm22.
FROM impact;
quit;
title "Minimum and Maximum Dates";
proc sql;
select "fcontact_date" label="Date variable", min(fcontact_date)
format=MDYAMPM22. label="Minimum date" , max(fcontact_date)
format=MDYAMPM22. label="Maximum date" from impact2;
quit;
title "Date Frequencies";
proc freq data=impact2 noprint;
tables fcontact_date / out=_tmpfreq1;
run;
proc sgplot data=_tmpfreq1;
yaxis min=0 minor ;
xaxis minor type=time ;
series x=fcontact_date y=count;
run;
proc delete data=_tmpfreq1;
run;
Date variable Minimum date Maximum date
fcontact_date | 12/28/2010 1:53 PM | 9/19/2026 10:44 AM |
You NEVER, EVER want your date to read xx/xx/xx. Always use 4-digit years. Always.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.