Esteemed Advisers;
I'm trying to import a csv file with datetime data in the following format:
yyyy-mm-dd hh:mm:ss UTC
I can't find an informat that works and need some advice on how to proceed.
Below is a sample of what I've tried but isn't working--(thanks in advance!):
data have;
/*INFORMAT*/
informat created_at $19.;
/*FORMAT*/
format created_UTC datetime16. ;
/**/
input
created_at
;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
data want
created_UTC
01APR22:00:01:20
01APR22:00:03:20
01APR22:00:04:20
You cannot use a numeric format with a character variable. In particular to use the DATETIME format to display a variable it needs to be a numeric variable with datetime values (number of seconds since 1960).
So if you have this data:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
;
You can try to convert it to a datetime value by ignoring the timezone. You could then use the TZONEOFF() function to try to adjust to a different timezone if you want.
data want;
set have;
dt = input(created_at,anydtdtm19.);
timezone = scan(created_at,-1,' ');
offset = tzoneoff(timezone);
local_offset = tzoneoff();
local_dt = dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
Results:
data have;
format created_at datetime21. ;
input
created_at ANYDTDTM19.
;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
proc print;
run;
You cannot use a numeric format with a character variable. In particular to use the DATETIME format to display a variable it needs to be a numeric variable with datetime values (number of seconds since 1960).
So if you have this data:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
;
You can try to convert it to a datetime value by ignoring the timezone. You could then use the TZONEOFF() function to try to adjust to a different timezone if you want.
data want;
set have;
dt = input(created_at,anydtdtm19.);
timezone = scan(created_at,-1,' ');
offset = tzoneoff(timezone);
local_offset = tzoneoff();
local_dt = dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
Results:
Tom:
Thanks for the quick response. I tried your code but am not getting the result that you show. Namely, the dt and local_dt variable columns are empty. My code is below:
data have;
input created_at $23.;
datalines;
2022-04-01 00:01:20 UTC
2022-04-01 00:03:20 UTC
2022-04-01 00:05:20 UTC
run;
data want;
set have;
dt=input(created_at,anytdtdtm19.);
timezone=scan(created_at,-1,' ');
offset=tzoneoff(timezone);
local_offset=tzoneoff();
local_dt=dt - offset + local_offset;
format dt local_dt datetime19. offset local_offset time6.;
run;
proc print;
run;
An extra T got inserted into the informat name in the INPUT() function call.
1421 set have; 1422 dt=input(created_at,anytdtdtm19.); ------------ 48 ERROR 48-59: The informat ANYTDTDTM was not found or could not be loaded.
ANYDTDTM is the name of the informat.
If the extra T is not in your code then change that the spaces in CREATED_AT are actually spaces and not tabs ('09'x) or non-breaking spaces ('A0'x) or some other invisible character.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.